Most Japanese-language material on sorting a DataFrame covers only the basics, so I have summarized it here. I also touch on a few cases that are not needed very often.
Confirmed to work with pandas 0.17.1.
This is the data used in this article.
sort.py
import numpy as np
import pandas as pd

if __name__ == "__main__":
    # Two numeric columns and one string column
    df = pd.DataFrame([[1, 3, "Hokkaido"], [4, 5, "Tokyo"], [3, 5, "Saitama"], [6, 9, "Osaka"], [1, 1, "Aomori Prefecture"]])
    # Row labels and column names
    df.index = ["Suzuki", "Tanaka", "Kimura", "Endo", "Yoshida"]
    df.columns = ["Item 1", "Item 2", "Item 3"]
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| Suzuki | 1 | 3 | Hokkaido |
| Tanaka | 4 | 5 | Tokyo |
| Kimura | 3 | 5 | Saitama |
| Endo | 6 | 9 | Osaka |
| Yoshida | 1 | 1 | Aomori Prefecture |
To sort this table by Item 1, use
df.sort_values(by=["Item 1"], ascending=True)
which gives
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| Suzuki | 1 | 3 | Hokkaido |
| Yoshida | 1 | 1 | Aomori Prefecture |
| Kimura | 3 | 5 | Saitama |
| Tanaka | 4 | 5 | Tokyo |
| Endo | 6 | 9 | Osaka |
The rows are arranged in ascending order of Item 1 (smallest value first). Rows with the same value of Item 1 appear here in the order they had in the original table; note that the default sorting algorithm (quicksort) does not guarantee this, so pass kind="mergesort" if you rely on it. If you change ascending=True to ascending=False, the rows are sorted in descending order (largest value first).
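As a small sketch of the two points above (descending order and the handling of ties), the following rebuilds the same table and sorts it by Item 1. The variable names `desc` and `stable` are only for illustration, and `kind="mergesort"` is chosen here because it is a stable algorithm.

```python
import pandas as pd

# Same data as in sort.py
df = pd.DataFrame(
    [[1, 3, "Hokkaido"], [4, 5, "Tokyo"], [3, 5, "Saitama"],
     [6, 9, "Osaka"], [1, 1, "Aomori Prefecture"]],
    index=["Suzuki", "Tanaka", "Kimura", "Endo", "Yoshida"],
    columns=["Item 1", "Item 2", "Item 3"],
)

# Descending order: the largest value of Item 1 comes first
desc = df.sort_values(by=["Item 1"], ascending=False)
print(desc)

# Stable sort: rows that tie on Item 1 (Suzuki and Yoshida)
# keep the order they had in the original table
stable = df.sort_values(by=["Item 1"], ascending=True, kind="mergesort")
print(stable)
```

Note that `sort_values` returns a new DataFrame and leaves `df` itself unchanged.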
If you want ties to be broken by Item 2 instead of by the original order, use
df.sort_values(by=["Item 1", "Item 2"], ascending=True)
which gives
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| Yoshida | 1 | 1 | Aomori Prefecture |
| Suzuki | 1 | 3 | Hokkaido |
| Kimura | 3 | 5 | Saitama |
| Tanaka | 4 | 5 | Tokyo |
| Endo | 6 | 9 | Osaka |
This is a slightly unusual case, but if you want to sort Item 1 in ascending order and Item 2 in descending order, pass a list to ascending, one value per column:
df.sort_values(by=["Item 1", "Item 2"], ascending=[True, False])
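To make the mixed-direction case concrete, here is a minimal sketch on the same data. The name `mixed` is only for illustration.

```python
import pandas as pd

# Same data as in sort.py
df = pd.DataFrame(
    [[1, 3, "Hokkaido"], [4, 5, "Tokyo"], [3, 5, "Saitama"],
     [6, 9, "Osaka"], [1, 1, "Aomori Prefecture"]],
    index=["Suzuki", "Tanaka", "Kimura", "Endo", "Yoshida"],
    columns=["Item 1", "Item 2", "Item 3"],
)

# Item 1 ascending; within equal Item 1 values, Item 2 descending
mixed = df.sort_values(by=["Item 1", "Item 2"], ascending=[True, False])
print(mixed[["Item 1", "Item 2"]])
# Suzuki (1, 3) now comes before Yoshida (1, 1),
# because ties on Item 1 are ordered by the larger Item 2 first
```

The list given to `ascending` must have the same length as the list given to `by`.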
Next, sort by Item 3. As before,
df.sort_values(by=["Item 3"], ascending=True)
gives
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| Yoshida | 1 | 1 | Aomori Prefecture |
| Suzuki | 1 | 3 | Hokkaido |
| Endo | 6 | 9 | Osaka |
| Kimura | 3 | 5 | Saitama |
| Tanaka | 4 | 5 | Tokyo |
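The rows are sorted by the string order of Item 3, which is probably not the order you want. In that case, prepare a list in the order you want, for example
tdhk = ["Hokkaido", "Aomori Prefecture", "Saitama", "Tokyo", "Osaka"]
and convert the column to a categorical that uses this list as its categories:
df["Item 3"] = pd.Categorical(df["Item 3"], tdhk)
The list must contain every value exactly as it appears in the column; a value that is missing from the list becomes NaN. After this conversion, the same sort_values call on Item 3 sorts the rows in the order of the list.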
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| Suzuki | 1 | 3 | Hokkaido |
| Yoshida | 1 | 1 | Aomori Prefecture |
| Kimura | 3 | 5 | Saitama |
| Tanaka | 4 | 5 | Tokyo |
| Endo | 6 | 9 | Osaka |
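Putting the categorical approach together, a minimal end-to-end sketch might look like the following. The names `order`, `by_region`, and `missing` are only for illustration, and `ordered=True` is an addition made here to state the intent explicitly; the call shown earlier does not pass it.

```python
import pandas as pd

# Same data as in sort.py
df = pd.DataFrame(
    [[1, 3, "Hokkaido"], [4, 5, "Tokyo"], [3, 5, "Saitama"],
     [6, 9, "Osaka"], [1, 1, "Aomori Prefecture"]],
    index=["Suzuki", "Tanaka", "Kimura", "Endo", "Yoshida"],
    columns=["Item 1", "Item 2", "Item 3"],
)

# The desired order; every value in the column must appear here
order = ["Hokkaido", "Aomori Prefecture", "Saitama", "Tokyo", "Osaka"]
df["Item 3"] = pd.Categorical(df["Item 3"], categories=order, ordered=True)

# Sorting now follows the position of each value in `order`
by_region = df.sort_values(by=["Item 3"], ascending=True)
print(by_region)

# A value that is not among the categories is turned into NaN
missing = pd.Series(pd.Categorical(["Aomori"], categories=order))
print(missing.isna())
```

The last two lines show why the spelling in the list matters: "Aomori" does not match "Aomori Prefecture", so it would be treated as a missing value.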