-- I want to flatten a nested dictionary by joining its keys with underscores.
-- Input: d1 = {'A': {'i': 0,'j': 1},'B': {'i': 2,'j': 3}}, a dictionary that contains dictionaries
-- Output: d2 = {'A_i': 0,'A_j': 1,'B_i': 2,'B_j': 3}, the same data laid out in one dimension
-- A nested dictionary is ultimately a tree structure.
-- Conceptually, the process takes the tree apart and connects each root directly to its leaves.
-- Implementing it as a depth-first search makes a recursive function a natural fit.
So I wrote the following code.
# Rename key k1 to k2 in place
def changekey(d, k1, k2):
    d[k2] = d[k1]
    del d[k1]

# Prefix every key with "<key>_"
def addkeyheader(d, key):
    ks = list(d.keys())
    for k in ks:
        changekey(d, k, key + '_' + k)

# Main function: walk the dict depth-first, then join the key names and return the result
def dict_flatten(dict_, depth=2):
    newdict = {}
    for key in list(dict_.keys()):
        # If the value is also a dict, recurse into it
        if isinstance(dict_[key], dict):
            if depth > 0:
                dic = dict_flatten(dict_[key], depth - 1)
                addkeyheader(dic, key)
                newdict.update(dic)
            else:
                # Depth limit reached: keep the nested dict as it is
                newdict[key] = dict_[key]
        # If the value is not a dict, keep it as it is
        else:
            newdict[key] = dict_[key]
    return newdict
In the comments, I was shown a smarter implementation.
def _unwrap(dct, prefix):
    for key, value in dct.items():
        new_key = f'{prefix}_{key}'
        if isinstance(value, dict):
            yield from _unwrap(value, new_key)
        else:
            yield new_key[1:], value

def unwrap(dct):
    return dict(_unwrap(dct, ''))

d2 = unwrap(d1)
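As a variation on the generator approach above, the following is a minimal sketch that carries the key path as a tuple instead of a string, so the leading underscore never has to be sliced off. The names `iter_paths` and `flatten` and the parameters `sep` and `max_depth` are hypothetical and not from the original post. It also assumes that an empty inner dict should be kept as a value rather than dropped, which is a design choice, not something the original code does.

```python
def iter_paths(dct, max_depth=None, _path=()):
    """Yield (key_path, value) pairs in depth-first order.

    max_depth limits how many levels of nesting are expanded
    (None means no limit, 0 means no expansion at all).
    """
    for key, value in dct.items():
        path = _path + (key,)
        can_descend = max_depth is None or len(path) <= max_depth
        # Assumption: empty dicts are treated as leaf values
        if isinstance(value, dict) and value and can_descend:
            yield from iter_paths(value, max_depth, path)
        else:
            yield path, value


def flatten(dct, sep='_', max_depth=None):
    """Join each key path with sep; non-string keys are converted with str()."""
    return {sep.join(map(str, path)): value
            for path, value in iter_paths(dct, max_depth)}


d1 = {'A': {'i': 0, 'j': 1}, 'B': {'i': 2, 'j': 3}}
d2 = flatten(d1)
print(d2)  # {'A_i': 0, 'A_j': 1, 'B_i': 2, 'B_j': 3}
```

Here `max_depth` plays the same role as `depth` in dict_flatten: with `max_depth=1`, a value two levels down stays a dict.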
-- pandas arranges a nested dict in two dimensions, as shown below (Reference).
-- In contrast, there are times when you just want to line the data up in a single row.
Converting the input d1 gives the following.
>>> pd.DataFrame.from_dict(d1)
   A  B
i  0  2
j  1  3
Converting the output d2 gives the following.
>>> pd.DataFrame(d2,index=[""])
  A_i  A_j  B_i  B_j
    0    1    2    3
I thought flattening was a good idea, but it turned out that I just didn't understand pandas very well.
In practice, you can work with the data through pandas' MultiIndex feature without having to take the dictionary apart. https://qiita.com/Morinikiz/items/40faa91e7a83807c0552
-- The df can be collapsed into a single column with stack() or unstack().
-- According to the explanation, stack puts the column labels at the outer level and unstack puts the row labels at the outer level.
-- However, why does the behavior below look like the opposite of that explanation?
>>> df = pd.DataFrame(d1)
>>> df
   A  B
i  0  2
j  1  3
>>> df.unstack()
A  i    0
   j    1
B  i    2
   j    3
dtype: int64
>>> df.stack()
i  A    0
   B    2
j  A    1
   B    3
dtype: int64
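One way to see which level ends up where is to name the two axes and inspect the level names of each result. The sketch below assumes pandas is installed; the axis names 'row' and 'col' are hypothetical labels added only for illustration. For a DataFrame whose index has a single level, stack() moves the column labels into the innermost level of the row index, while unstack() moves the row labels out and, with nothing left to index by, returns a Series keyed by column label first.

```python
import pandas as pd

d1 = {'A': {'i': 0, 'j': 1}, 'B': {'i': 2, 'j': 3}}

# 'row' and 'col' are illustrative axis names, not part of the original data
df = pd.DataFrame(d1).rename_axis(index='row', columns='col')

stacked = df.stack()      # column labels become the inner level
unstacked = df.unstack()  # column labels become the outer level

print(list(stacked.index.names))    # ['row', 'col']
print(list(unstacked.index.names))  # ['col', 'row']
```

This matches the output above: after unstack() the outer level is A/B, and after stack() it is i/j.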
To turn the result back into a DataFrame, call to_frame() at the end.
>>> df.unstack().to_frame("name")
     name
A i     0
  j     1
B i     2
  j     3
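If the underscore-joined dictionary is still what you want in the end, the unstacked Series can be converted into it directly. This is only a sketch, assuming pandas is installed, that all keys are strings, and that every inner dict has the same keys. If the inner keys differ, pandas fills the gaps with NaN and those entries would show up in the result.

```python
import pandas as pd

d1 = {'A': {'i': 0, 'j': 1}, 'B': {'i': 2, 'j': 3}}

s = pd.DataFrame(d1).unstack()

# Each index entry is a tuple such as ('A', 'i'); join it into 'A_i'
s.index = ['_'.join(keys) for keys in s.index]

d2 = s.to_dict()
print(d2)
```

The resulting keys come out in the same order as the unstack() output: A_i, A_j, B_i, B_j.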