Data science companion in Python: how to specify elements in pandas

Background

Every so often I want to write code properly. I lost a lot of time being confused about how to specify elements of a DataFrame in pandas, so I'd like to organize what I learned. This was compiled by a pandas beginner, so if I have made a mistake or you know a better summary, please give me some advice.

Click here for the blog that we run: Effort 1mm

The versions used are as follows.

pandas (0.18.1)
numpy (1.11.0)
Python 2.7.10

Three ways of specifying elements

There are three ways to specify elements in a pandas DataFrame: the df[a] form, label-based selection with loc, and position-based selection with iloc.

The DataFrame used throughout was created with the following code, based on 10 Minutes to pandas.

import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

The resulting DataFrame looks like this.

                   A         B         C         D
2013-01-01 -0.682002  1.977886  0.348623  0.405755
2013-01-02  0.085698  2.067378 -0.356269  1.349520
2013-01-03  0.058207 -0.539280  0.023205  1.154293
2013-01-04 -0.319075  1.174168 -1.282305  0.359333
2013-01-05 -2.557677  0.922672  0.202042  0.171645
2013-01-06  1.039422  0.300340  0.701594 -0.229087

The df[a] form

Depending on what is passed, this form works either in the column direction or in the row direction: a column name selects a column, while a slice selects rows.

# Specify a single column
df['A']  # Select the column named A
df.A     # Same as above (attribute access)

# Row-direction slice: df[0:3]
df[0:3]                    # Rows at positions 0 through 2 (the end position 3 is excluded)
df['20130102':'20130104']  # Rows whose index runs from 2013-01-02 through 2013-01-04 (both ends included)
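The column-versus-row behaviour of the df[a] form can be checked with a small sketch. It assumes deterministic values from np.arange instead of the random data used above, purely so that the results are easy to verify; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Assumption: fixed integer values (0..23) replace the random data for easy checking
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

col = df['A']                          # a column name selects one column (a Series)
rows_pos = df[0:3]                     # an integer slice selects rows at positions 0, 1, 2
rows_lab = df['20130102':'20130104']   # a label slice selects rows, end label included

print(type(col).__name__)   # Series
print(len(rows_pos))        # 3
print(len(rows_lab))        # 3
```

Note that the integer slice excludes its end position while the label slice includes its end label, which is an easy thing to trip over.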

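Specify by the labels of the index and columns -> loc method

The first of the specified arguments (if I may call them that) operates on the index, and the second operates on the columns. The output shown in the comments comes from a different random draw than the table above, so the numbers do not match.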

# Get the row with the given index label
# A    0.469112
# B   -0.282863
# C   -1.509059
# D   -1.135632
# Name: 2013-01-01 00:00:00, dtype: float64
df.loc[dates[0]]

# Specify index and columns at the same time (here: all rows, columns A and B)
#                    A         B
# 2013-01-01  0.469112 -0.282863
# 2013-01-02  1.212112 -0.173215
# 2013-01-03 -0.861849 -2.104569
# 2013-01-04  0.721555 -0.706771
# 2013-01-05 -0.424972  0.567020
# 2013-01-06 -0.673690  0.113648
df.loc[:, ['A', 'B']]

# Select rows by a range of index labels (both ends included)
#                    A         B
# 2013-01-02  1.212112 -0.173215
# 2013-01-03 -0.861849 -2.104569
# 2013-01-04  0.721555 -0.706771
df.loc['20130102':'20130104', ['A', 'B']]  # When specifying multiple columns, pass them as a list

# To get a single value, at is faster than loc
# 0.46911229990718628
df.at[dates[0], 'A']
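The loc and at calls above can be tried on data with known values. This is a minimal sketch, again assuming np.arange values rather than the random data from the article, so that each selection can be checked by eye.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Assumption: fixed integer values (0..23) replace the random data for easy checking
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

row = df.loc[dates[0]]                            # one row, returned as a Series
two_cols = df.loc[:, ['A', 'B']]                  # all rows, columns A and B
sub = df.loc['20130102':'20130104', ['A', 'B']]   # label slice includes both end labels
val_loc = df.loc[dates[0], 'A']                   # single value via loc
val_at = df.at[dates[0], 'A']                     # same single value via at

print(sub.shape)           # (3, 2)
print(val_loc == val_at)   # True
```

Both loc and at return the same scalar here; at only accepts a single row label and a single column label, which is what allows it to skip the more general lookup that loc performs.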

Specify by the position of the index and columns -> iloc method

This specifies elements by their integer position, as if the DataFrame were a matrix. Of course, you can select more than one.

# Specify by row position (here position 3, i.e. the fourth row counting from 0)
# A    0.721555
# B   -0.706771
# C   -1.039575
# D    0.271860
# Name: 2013-01-04 00:00:00, dtype: float64
df.iloc[3]

# Specify index and columns at the same time (here, row positions 3 to 4 and column positions 0 to 1; the end of each slice is excluded)
#                    A         B
# 2013-01-04  0.721555 -0.706771
# 2013-01-05 -0.424972  0.567020
df.iloc[3:5, 0:2]

# Specify non-contiguous rows and columns by passing lists of positions
#                    A         C
# 2013-01-02  1.212112  0.119209
# 2013-01-03 -0.861849 -0.494929
# 2013-01-05 -0.424972  0.276232
df.iloc[[1, 2, 4], [0, 2]]
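The positional selections above, and the difference between iloc and loc slices, can be verified with a short sketch. As before, it assumes np.arange values in place of the article's random data; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Assumption: fixed integer values (0..23) replace the random data for easy checking
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

row3 = df.iloc[3]                      # row at position 3 (the fourth row)
block = df.iloc[3:5, 0:2]              # rows 3-4, columns 0-1 (slice ends excluded)
picked = df.iloc[[1, 2, 4], [0, 2]]    # rows 1, 2, 4 and columns 0, 2

# Same start and end, different slicing rules:
by_pos = df.iloc[1:3]                  # positions 1 and 2 only
by_label = df.loc[dates[1]:dates[3]]   # labels from dates[1] through dates[3]

print(block.values.tolist())           # [[12, 13], [16, 17]]
print(len(by_pos), len(by_label))      # 2 3
```

The last two lines show the main thing to remember: iloc slices follow the usual Python rule and exclude the end, whereas loc slices include the end label.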

In the end, the only way is to get used to it!
