[PYTHON] Using .loc and .iloc together in a pandas DataFrame

.ix deprecated

.ix is convenient for indexing into a pandas DataFrame, but if you do not understand how it resolves its arguments, it is easy to fall into a trap.

For example, suppose you have the following DataFrame.

from pandas import DataFrame

df = DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])
    a   b
2  a0  b0
4  a1  b1
6  a2  b2

What if you want the value in column a of the row at position 2 (counting from 0)? The expected result is a2. .ix accepts both positions and index labels, so you might write it like this:

df.ix[2, 'a']

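The result is a0, not a2. The index here is integer-typed, so .ix treats the number 2 as an index label and returns the row labeled 2. .ix falls back to positional lookup only when the axis is not integer-typed, so the same call can mean different things depending on what the index contains.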

Because of this ambiguity, .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0.
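Since .ix is no longer available in recent pandas versions, the ambiguity can be illustrated with .loc and .iloc instead. The following is a minimal sketch using the DataFrame from above; the variable names by_label and by_position are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])

# The same number 2 means different rows depending on the accessor.
by_label = df.loc[2, 'a']      # row whose index label is 2 (the first row)
by_position = df.iloc[2, 0]    # row at position 2 (the third row)

print(by_label, by_position)
```

With an integer index, the label lookup and the positional lookup return different cells, which is exactly the choice that .ix used to make implicitly.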

Referring to a position and a label at the same time

It may not be the best solution, but one option is .iloc. Since .iloc only accepts positions, combine it with pandas.Index.get_loc, a method that looks up a row label (or column name) and returns its integer position.

df.iloc[2, df.columns.get_loc('a')]

This returns the expected result, a2.
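The conversion can also go the other way: turn the position into a label with df.index[...] and pass that to .loc. The sketch below is not part of the original approach. It also uses Index.get_indexer to convert several column names at once, and the names value, col_positions, and subset are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [['a0', 'b0'], ['a1', 'b1'], ['a2', 'b2']],
    index=[2, 4, 6],
    columns=['a', 'b'])

# Position -> label: df.index[2] is the label of the third row (6),
# which .loc can then combine with a column name.
value = df.loc[df.index[2], 'a']

# Several rows by position and several columns by name:
# get_indexer returns one integer position per requested label.
col_positions = df.columns.get_indexer(['a', 'b'])
subset = df.iloc[[0, 2], col_positions]

print(value)
print(subset)
```

Note that get_indexer returns -1 for a label it cannot find, and .iloc would interpret -1 as the last column, so it may be worth checking the result when the column names come from user input.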

The example above looks up a column by name. To look up a row by its label instead, do the following.

df.iloc[df.index.get_loc(6), 0]
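One caveat when relying on get_loc: it returns a single integer only when the label is unique. The following sketch uses three small hypothetical indexes (not from the original example) to show the other return types.

```python
import pandas as pd

unique = pd.Index([2, 4, 6])
sorted_dupes = pd.Index([2, 2, 6])      # duplicates, monotonic
unsorted_dupes = pd.Index([2, 6, 2])    # duplicates, not monotonic

pos_unique = unique.get_loc(6)            # a plain integer position
pos_sorted = sorted_dupes.get_loc(2)      # a slice covering the duplicates
pos_unsorted = unsorted_dupes.get_loc(2)  # a boolean mask

print(pos_unique, pos_sorted, pos_unsorted)
```

.iloc accepts all three forms, but with duplicate labels the result is several rows rather than a single value.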

If you know the row and column names in advance, you can just use .loc normally.

df.loc[6, 'a']

Similarly, .iloc is fine if you know the row and column order in advance.

df.iloc[2, 0]

Summary

If you do not understand how .ix behaves, it will produce results you did not intend. Even if you do understand it, you need to know what the index contains, so it is better to avoid .ix and use .loc or .iloc explicitly.
