Reputation: 71
I am trying to transform a dictionary of lists (it looks like a dictionary of dictionaries, but is unfortunately a dictionary of lists) into a DataFrame. I want the column names to come from the list items. So far I have found a way to turn the dictionary into a DataFrame, but the columns don't have the appropriate names and the values still contain the column names.
import pandas as pd

user_dict = {'Category 1': ['att_1: 1', 'att_2: whatever'],
             'Category 2': ['att_1 : 23', 'att_2 : another']}
res = pd.DataFrame.from_dict(user_dict, orient='index')
res.columns = [f'SYN{i+1}' for i in res]
Example Output:
att_1 | att_2
Category 1 1 | whatever
Category 2 23 | another
I was thinking of using unlist or regex, but I am not sure where to apply that. Any help is much appreciated! Thank you
Edit: my unlist attempt ended here:
pd.DataFrame.from_dict({(i,j): to_dict(unlist(user_dict[i][j]))
for i in user_dict.keys()
for j in user_dict[i].keys()},
orient='index')
Upvotes: 1
Views: 1666
Reputation: 164673
You can use a dictionary comprehension to restructure your input into a dictionary of dictionaries. Then use from_dict with orient='index':
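import pandas as pd

user_dict = {'Category 1': ['att_1: 1', 'att_2: whatever'],
             'Category 2': ['att_1 : 23', 'att_2 : another']}
# Split each 'name: value' string on ':' and strip whitespace from both parts
d = {k: dict(map(str.strip, x.split(':')) for x in v) for k, v in user_dict.items()}
df = pd.DataFrame.from_dict(d, orient='index')
# The parsed values are strings, so convert numeric columns explicitly
df['att_1'] = pd.to_numeric(df['att_1'])
print(df)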
att_1 att_2
Category 1 1 whatever
Category 2 23 another
All values come out of the split as strings, so, as with att_1 above, you will need to convert each series to numeric as appropriate.
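If you have many columns and don't want to convert them one by one, something along these lines might work. This is only a sketch: to_numeric_if_possible is a hypothetical helper name, not part of pandas, and it assumes a column should become numeric only when every value in it parses. It also splits on the first ':' only, which is my assumption about how values containing a colon should be handled.

```python
import pandas as pd

user_dict = {'Category 1': ['att_1: 1', 'att_2: whatever'],
             'Category 2': ['att_1 : 23', 'att_2 : another']}

# Split on the first ':' only, so a value that itself contains ':' stays intact.
d = {k: dict(map(str.strip, x.split(':', 1)) for x in v)
     for k, v in user_dict.items()}
df = pd.DataFrame.from_dict(d, orient='index')


def to_numeric_if_possible(s):
    # Hypothetical helper: return a numeric series when every value parses,
    # otherwise return the series unchanged.
    try:
        return pd.to_numeric(s)
    except (ValueError, TypeError):
        return s


# Apply the helper column by column.
df = df.apply(to_numeric_if_possible)
print(df.dtypes)
```

Columns with mixed content (some numbers, some text) are left as strings here; if you would rather get NaN for the unparseable entries, pd.to_numeric(s, errors='coerce') is the usual alternative.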
Upvotes: 1