[PYTHON] I wrote the basic operation of Seaborn in Jupyter Lab

24th

Seaborn

import seaborn as sns
%matplotlib inline
import pandas as pd
df = pd.read_csv('train.csv')
df = df.dropna(subset=['Age'])
sns.distplot(df['Age'])
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcdd7e3d0>

png

sns.set()
sns.distplot(df['Age'],bins=50)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcdd05850>

png

See the distribution of two variables with sns.jointplot ()

sns.jointplot(x='Age', y='Fare', data=df)
<seaborn.axisgrid.JointGrid at 0x7fcbcdbe31d0>

png

sns.jointplot(x='Age', y='Fare', data=df, kind='hex')
<seaborn.axisgrid.JointGrid at 0x7fcbcdb2b890>

png

(Important) Display the distribution of multiple columns in one shot with sns.pairplot ()

sns.pairplot(df[['Age', 'Fare', 'Pclass', 'Survived']], hue='Survived', kind='scatter', plot_kws={'alpha': 0.5})
/opt/anaconda3/lib/python3.7/site-packages/statsmodels/nonparametric/kde.py:487: RuntimeWarning: invalid value encountered in true_divide
  binned = fast_linbin(X, a, b, gridsize) / (delta * nobs)
/opt/anaconda3/lib/python3.7/site-packages/statsmodels/nonparametric/kdetools.py:34: RuntimeWarning: invalid value encountered in double_scalars
  FAC1 = 2*(np.pi*bw/RANGE)**2





<seaborn.axisgrid.PairGrid at 0x7fcbcda34990>

png

25th

sns.barplot () Make a "bar chart"

import numpy as np
import pandas as pd
import seaborn as sns
%matplotlib inline
df = pd.read_csv('train.csv')
df.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S
sns.barplot(x='Survived', y='Age', data=df) #Average value
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcdd7e390>

png

sns.barplot(x='Survived', y='Age', data=df, estimator=np.median) #Median
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcce3a190>

png

sns.countplot () Compare the number of data

sns.countplot(x='Sex', data=df, hue='Survived')
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbccfb0c10>

png

sns.boxplot () Compare values by category

sns.boxplot(x='Pclass', y='Age', data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbccd82750>

png

sns.boxplot(x='Pclass', y='Age', data=df, hue='Survived')
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcc2a4ed0>

png

sns.violonplot () Visualize data analysis

sns.violinplot(x='Pclass', y='Age', data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcc1e3ad0>

png

sns.violinplot(x='Pclass', y='Age', data=df, hue=('Survived'))
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcc107210>

png

sns.swarmplot () You can see the true distribution

sns.swarmplot(x='Pclass', y='Age', data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcc19a710>

png

sns.swarmplot(x='Pclass', y='Age', data=df, size=4, hue='Survived')
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc6e9fe50>

png

26th

Draw a Heatmap

Make a correlation with df.corr ()

import pandas as pd
df = pd.read_csv('train.csv')
corr = df.corr()
corr
PassengerId Survived Pclass Age SibSp Parch Fare
PassengerId 1.000000 -0.005007 -0.035144 0.036847 -0.057527 -0.001652 0.012658
Survived -0.005007 1.000000 -0.338481 -0.077221 -0.035322 0.081629 0.257307
Pclass -0.035144 -0.338481 1.000000 -0.369226 0.083081 0.018443 -0.549500
Age 0.036847 -0.077221 -0.369226 1.000000 -0.308247 -0.189119 0.096067
SibSp -0.057527 -0.035322 0.083081 -0.308247 1.000000 0.414838 0.159651
Parch -0.001652 0.081629 0.018443 -0.189119 0.414838 1.000000 0.216225
Fare 0.012658 0.257307 -0.549500 0.096067 0.159651 0.216225 1.000000

plot Heatmap with sns.heatmap ()

sns.heatmap(corr)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc6df9850>

png

sns.heatmap(corr, cmap='coolwarm', annot=True)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbcd0b3290>

png

Sns.heatmap () to help you get a bird's eye view of your data

flights = sns.load_dataset('flights')
print(len(flights))
flights.head()
144
year month passengers
0 1949 January 112
1 1949 February 118
2 1949 March 132
3 1949 April 129
4 1949 May 121
# pivot_Create table
flights_pivot = flights.pivot_table(index='month', columns='year', values='passengers') 
flights_pivot
year 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
month
January 112 115 145 171 196 204 242 284 315 340 360 417
February 118 126 150 180 196 188 233 277 301 318 342 391
March 132 141 178 193 236 235 267 317 356 362 406 419
April 129 135 163 181 235 227 269 313 348 348 396 461
May 121 125 172 183 229 234 270 318 355 363 420 472
June 135 149 178 218 243 264 315 374 422 435 472 535
July 148 170 199 230 264 302 364 413 465 491 548 622
August 148 170 199 242 272 293 347 405 467 505 559 606
September 136 158 184 209 237 259 312 355 404 404 463 508
October 119 133 162 191 211 229 274 306 347 359 407 461
November 104 114 146 172 180 203 237 271 305 310 362 390
December 118 140 166 194 201 229 278 306 336 337 405 432
sns.heatmap(flights_pivot)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc5baabd0>

png

27th

Change basic style with sns.set ()

Specify the usage with the context argument

import pandas as pd
import seaborn as sns
%matplotlib inline

df = pd.read_csv('train.csv')

sns.set(context=('poster'))
df = df.dropna(subset=['Age'])
sns.distplot(df['Age'])
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc568f690>

png

Specify the style of the entire graph with the style argument

sns.set_style(style='whitegrid') #Change background color
sns.distplot(df['Age'])
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc563d850>

png

Specify color with palette argument

sns.set(palette='bright')
sns.violinplot(x='Pclass', y='Age', data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc5472850>

png

Drop axes and frames with sns.despine ()

sns.set(palette='bright' ,style='ticks') ##style='ticks'With axis to plot
sns.violinplot(x='Pclass', y='Age', data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc53ed810>

png

sns.set(palette='bright' )
sns.violinplot(x='Pclass', y='Age', data=df)
sns.despine()

png

Like matplotlib, you can do many things with the plt module

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 5))
sns.distplot(df['Age'])
<matplotlib.axes._subplots.AxesSubplot at 0x7fcbc52d8750>

png

sns.distplot(df['Age'])
plt.savefig('seaborn_sample.png')

png


Recommended Posts

I wrote the basic operation of Seaborn in Jupyter Lab
I wrote the basic operation of Numpy in Jupyter Lab.
I wrote the basic operation of matplotlib with Jupyter Lab
I wrote the basic operation of Pandas with Jupyter Lab (Part 1)
I wrote the basic operation of Pandas with Jupyter Lab (Part 2)
I wrote the basic grammar of Python with Jupyter Lab
I wrote the queue in Python
I wrote the stack in Python
I wrote the code to write the code of Brainf * ck in python
Build the execution environment of Jupyter Lab
I wrote the selection sort in C
I wrote the sliding wing in creation.
I checked the list of shortcut keys of Jupyter
I wrote the hexagonal architecture in go language
Basic operation of pandas
Basic operation of Pandas
I tried to summarize the basic form of GPLVM
[Tips] Save / copy the graph displayed in Jupyter Lab
[Note] I can't call the installed module in jupyter
I wrote python in Japanese
I participated in the translation activity of Django official documents
I tried the accuracy of three Stirling's approximations in python
Check the operation of Python for .NET in each environment
About the garbled Japanese part of pandas-profiling in Jupyter notebook
I want to get the operation information of yahoo route
I wrote it in Go to understand the SOLID principle
Change the theme of Jupyter
I wrote a doctest in "I tried to simulate the probability of a bingo game with Python"
I wrote a script that splits the image in two
Part 1 I wrote the answer to the reference problem of how to write offline in real time in Python
I compared the calculation time of the moving average written in Python
Feel free to change the label of the legend in Seaborn in python
LINEbot development, I want to check the operation in the local environment
I want to know the population of each country in the world.
I got lost in the maze
The story of participating in AtCoder
I investigated the mechanism of flask-login!
I participated in the ISUCON10 qualifying!
I wrote Fizz Buzz in Python
About the basic type of Go
The story of the "hole" in the file
I wrote Gray Scale in Pytorch
The meaning of ".object" in Django
Note that I understand the algorithm of the machine learning naive Bayes classifier. And I wrote it in Python.
[Fundamental Information Technology Engineer Examination] I wrote an algorithm for the maximum value of an array in Python.
[I touched the Raspberry Pi (1)] I summarized the basic operations of Minecraft Pi Edition (2015.5.23 pre-release)
Comparing the basic grammar of Python and Go in an easy-to-understand manner
I want to batch convert the result of "string" .split () in Python
I want to explain the abstract class (ABCmeta) of Python in detail.
I want to sort a list in the order of other lists
I wrote a corpus reader that reads the results of MeCab analysis
I want to leave an arbitrary command in the command history of Shell
I made a program to check the size of a file in Python
I made a mistake in fetching the hierarchy with MultiIndex of pandas
I tried to display the altitude value of DTM in a graph
I tried touching touch related methods in the scene module of pythonista
What I investigated in the process of expressing (schematicizing) containers in a nested frame with Jupyter and making it
Part 1 I wrote an example of the answer to the reference problem of how to write offline in real time in Python
I checked the contents of docker volume
I tried the asynchronous server of Django 3.0
[Understanding in 3 minutes] The beginning of Linux