[PYTHON] A memo of pandas idioms I tend to forget
Create a sample DataFrame to work with
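import pandas as pd
# Note: pd.util.testing is deprecated since pandas 1.0; recent releases may not ship it.
df = pd.util.testing.makeMixedDataFrame()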
df
|   | A | B | C | D |
|---|---|---|---|---|
| 0 | 0.0 | 0.0 | foo1 | 2009-01-01 |
| 1 | 1.0 | 1.0 | foo2 | 2009-01-02 |
| 2 | 2.0 | 0.0 | foo3 | 2009-01-05 |
| 3 | 3.0 | 1.0 | foo4 | 2009-01-06 |
| 4 | 4.0 | 0.0 | foo5 | 2009-01-07 |
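If the testing helper is not available in your pandas version, the same frame can be built by hand. This is only a sketch that reproduces the values shown in the table above; using `pd.bdate_range` for column D is my assumption about how to get the same business-day dates, not necessarily what the helper does internally.

```python
import pandas as pd

# Hand-built equivalent of the sample frame shown in the table above.
df = pd.DataFrame(
    {
        "A": [0.0, 1.0, 2.0, 3.0, 4.0],
        "B": [0.0, 1.0, 0.0, 1.0, 0.0],
        "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
        # Business days only, so the weekend of 2009-01-03/04 is skipped.
        "D": pd.bdate_range("2009-01-01", periods=5),
    }
)
print(df)
```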
Change pandas display options
pd.set_option('display.max_rows', 2)
pd.set_option('display.max_columns', 3)
df
|   | A | ... | D |
|---|---|---|---|
| 0 | 0.0 | ... | 2009-01-01 |
| ... | ... | ... | ... |
| 4 | 4.0 | ... | 2009-01-07 |
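`set_option` changes the setting for the rest of the session. When the truncated view is only needed once, `pd.option_context` may be handier, since the previous values come back when the block ends. A minimal sketch, using a throwaway frame made up for the example:

```python
import pandas as pd

# Throwaway frame, only for illustration.
df = pd.DataFrame({"A": range(5), "B": range(5), "C": range(5), "D": range(5)})

rows_before = pd.get_option("display.max_rows")

# Options are changed only inside the with-block.
with pd.option_context("display.max_rows", 2, "display.max_columns", 3):
    rows_inside = pd.get_option("display.max_rows")
    print(df)

# After the block the earlier value is back in effect.
rows_after = pd.get_option("display.max_rows")
```

To undo an earlier `set_option` call, `pd.reset_option("display.max_rows")` restores the default for that option.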
Change the type of multiple columns
df = df.astype({"A":"int64", "B":"int64"})
df.dtypes
A int64
B int64
C object
D datetime64[ns]
dtype: object
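One thing I also forget: casting to `"int64"` fails when the column contains missing values. The sketch below assumes a small made-up frame with a `None` in column A and uses the nullable `"Int64"` dtype (capital I) as one possible workaround.

```python
import pandas as pd

# Made-up frame with a missing value in column A.
df = pd.DataFrame({"A": [0.0, 1.0, None], "B": [0.0, 1.0, 0.0]})

# Plain int64 cannot represent NaN, so this cast raises.
try:
    df.astype({"A": "int64"})
    cast_failed = False
except ValueError:
    cast_failed = True

# The nullable Int64 dtype keeps the missing value as <NA>.
converted = df.astype({"A": "Int64", "B": "int64"})
print(converted.dtypes)
```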
Change the format of multiple columns
(
    df.assign(E=10000 * df["A"])
    .assign(F=100 * df["B"])
    .style.format(
        {
            "A": "{:.2f}",
            "B": "{:.4f}",
            "D": "{:%Y-%m-%d}",
            "E": "{:,}",
            "F": "{:}%",
        }
    )
)
|   | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| 0 | 0.00 | 0.0000 | foo1 | 2009-01-01 | 0 | 0% |
| 1 | 1.00 | 1.0000 | foo2 | 2009-01-02 | 10,000 | 100% |
| 2 | 2.00 | 0.0000 | foo3 | 2009-01-05 | 20,000 | 0% |
| 3 | 3.00 | 1.0000 | foo4 | 2009-01-06 | 30,000 | 100% |
| 4 | 4.00 | 0.0000 | foo5 | 2009-01-07 | 40,000 | 0% |
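The format strings follow Python's ordinary format spec, so they can be tried out with plain `str.format` before handing them to the Styler. A small sketch; the `row` values are made up to mirror row 1 of the table above.

```python
import datetime

formats = {
    "A": "{:.2f}",       # two decimal places
    "B": "{:.4f}",       # four decimal places
    "D": "{:%Y-%m-%d}",  # date only, no time part
    "E": "{:,}",         # thousands separator
    "F": "{:}%",         # value followed by a literal percent sign
}

# Hypothetical values mirroring row 1 of the table.
row = {"A": 1, "B": 1, "D": datetime.datetime(2009, 1, 2), "E": 10000, "F": 100}

rendered = {col: fmt.format(row[col]) for col, fmt in formats.items()}
print(rendered)
```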
Keep the groupby key as a column instead of the index
# A dict cannot repeat the key "A", so pass both functions as a list.
df.groupby("B", as_index=False).agg({"A": ["sum", "mean"]})
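One thing to watch in a call like this: a Python dict keeps only one value per key, so a spec that lists "A" twice would compute only the last function. Named aggregation is one way to get several statistics for the same column with flat column names. A sketch on a small made-up frame; the output names `A_sum` and `A_mean` are my own choice.

```python
import pandas as pd

# Made-up frame with the same A and B values as the sample above (as integers).
df = pd.DataFrame({"A": [0, 1, 2, 3, 4], "B": [0, 1, 0, 1, 0]})

# Duplicate keys in a dict literal: the last one silently wins.
spec = {"A": "sum", "A": "mean"}

# Named aggregation: output_name=(column, function).
result = df.groupby("B", as_index=False).agg(
    A_sum=("A", "sum"),
    A_mean=("A", "mean"),
)
print(result)
```

With `as_index=False`, B stays an ordinary column next to `A_sum` and `A_mean`, so no `reset_index()` is needed afterwards.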