Reputation: 188
What is an efficient way to build a grouped DataFrame from a nested dictionary?
Code snippet
import pandas as pd

# Encoding description
encoding_dict = {"Age": {"Middle": 0,
                         "Senior": 1,
                         "Young": 2},
                 "Sex": {"F": 0,
                         "M": 1},
                 "BP": {"High": 0,
                        "Low": 1,
                        "Normal": 2},
                 "Cholesterol": {"High": 0,
                                 "Normal": 1}}
# Step 1: Create DataFrame
df_1 = pd.DataFrame({"Features": ["Age"]*3 + ["Sex"]*2 + ["BP"]*3 + ["Cholesterol"]*2,
                     "Categories": ["Middle", "Senior", "Young", "F", "M", "High", "Low", "Normal", "High", "Normal"],
                     "Encoding": [0, 1, 2, 0, 1, 0, 1, 2, 0, 1]})
# Step 2: Grouped DataFrame
grouped = df_1.groupby(["Features", "Categories"]).sum()
print(grouped)
Output
                        Encoding
Features    Categories
Age         Middle             0
            Senior             1
            Young              2
BP          High               0
            Low                1
            Normal             2
Cholesterol High               0
            Normal             1
Sex         F                  0
            M                  1
What's an efficient way to create the desired grouped DataFrame from the nested dictionary without performing Step 1 manually?
Upvotes: 0
Views: 55
Reputation: 35626
A dictionary comprehension that builds the input for the DataFrame constructor, keyed by (feature, category) tuples, followed by naming the index levels, could work:
df = pd.DataFrame(
    {'encoding': {(k, sub_k): v
                  for k, sub_d in encoding_dict.items()
                  for sub_k, v in sub_d.items()}}
).rename_axis(index=['Features', 'Categories'])
df:
                        encoding
Features    Categories
Age         Middle             0
            Senior             1
            Young              2
BP          High               0
            Low                1
            Normal             2
Cholesterol High               0
            Normal             1
Sex         F                  0
            M                  1
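For anyone who finds the one-liner dense, here is a rough step-by-step sketch of the same idea. It is not the only way to do it: it goes through a Series instead of the nested-dict constructor, and the names `flat` and `result` are just illustrative. The explicit `sort_index()` is an assumption about what is wanted, namely the alphabetical ordering that `groupby` produces in the question.

```python
import pandas as pd

encoding_dict = {
    "Age": {"Middle": 0, "Senior": 1, "Young": 2},
    "Sex": {"F": 0, "M": 1},
    "BP": {"High": 0, "Low": 1, "Normal": 2},
    "Cholesterol": {"High": 0, "Normal": 1},
}

# Flatten the nested dict: each (feature, category) tuple becomes one key.
flat = {
    (feature, category): code
    for feature, categories in encoding_dict.items()
    for category, code in categories.items()
}

# Tuple keys turn into a two-level MultiIndex. sort_index() is added here
# (an assumption) to mirror the sorted order of the groupby output.
result = (
    pd.Series(flat, name="Encoding")
    .rename_axis(index=["Features", "Categories"])
    .sort_index()
    .to_frame()
)
print(result)
```

Individual encodings can then be looked up by tuple, e.g. `result.loc[("BP", "Normal"), "Encoding"]`.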
Upvotes: 1