Reputation: 4842
I have a dictionary like this:
my_dict
{'metric_1-metric2': [0.062034245713139154,
0.7711299537484807,
0.9999999999999999,
['US', 'mobile', 'google'],
['UK', 'desktop', 'facebook']],
'metric_1-metric_3': [-0.9607689228305227,
-0.12803370313903312,
0.778375882191523,
['CAN', 'tablet', 'google'],
['UK', 'desktop', 'yahoo']],
'metric_1-metric_4': [-0.4678967355247944,
0.6600255030070277,
0.9999999999999999,
['PT', 'desktop', 'gmail'],
['UK', 'desktop', 'apple']]}
I am trying to achieve the following result:
df
A B C D E F G
metric_1 metric_2 0.062 0.771 0.999 ['US', 'mobile', 'google'] ['UK', 'desktop', 'facebook']
metric_1 metric_3 -0.960 -0.128 0.778 ['CAN', 'tablet', 'google'] ['UK', 'desktop', 'yahoo']
metric_1 metric_4 -0.467 0.660 0.999 ['PT', 'desktop', 'gmail'] ['UK', 'desktop', 'apple']
It's clear that I'll need to split the key names in my_dict:
index_names = []
column_names = []
for x in my_dict.keys():
    index_names.append(x.split('-')[0])
    column_names.append(x.split('-')[1])
How could I create such a structure in a pandas dataframe?
Upvotes: 2
Views: 69
Reputation: 803
A simpler and more intuitive way to get the exact result (though less pandas-focused):
import pandas as pd

# One list per output column.
a = {"A": [], "B": [], "C": [], "D": [], "E": [], "F": [], "G": []}
for key, value in d.items():
    key = key.split('-')  # e.g. 'metric_1-metric_3' -> ['metric_1', 'metric_3']
    a['A'].append(key[0])
    a['B'].append(key[1])
    a['C'].append(value[0])
    a['D'].append(value[1])
    a['E'].append(value[2])
    a['F'].append(value[3])
    a['G'].append(value[4])
df = pd.DataFrame(data=a)
Here, d is the original dict from the question.
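The same idea can be written more compactly by building one row per key instead of one list per column. This is only a sketch: the sample dict is a two-entry subset of the data in the question, and the names rows and df are chosen here for illustration.

```python
import pandas as pd

# Two entries copied from the question's dict, enough to show the shape.
d = {
    'metric_1-metric2': [0.062034245713139154, 0.7711299537484807, 0.9999999999999999,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.9607689228305227, -0.12803370313903312, 0.778375882191523,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
}

# Each row is the two halves of the key followed by the five values.
rows = [key.split('-') + value for key, value in d.items()]
df = pd.DataFrame(rows, columns=list('ABCDEFG'))
print(df)
```

This assumes every key contains exactly one '-' and every value list has exactly five items; otherwise the rows will not line up with the seven column names.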
Upvotes: 0
Reputation: 323386
Use from_dict, then split the index into a MultiIndex and call reset_index at the end:
s = pd.DataFrame.from_dict(d,'index')
s.index=pd.MultiIndex.from_tuples(s.index.str.split('-').map(tuple))
s.reset_index(inplace=True)
s
Out[210]:
level_0 level_1 ... 3 4
0 metric_1 metric2 ... [US, mobile, google] [UK, desktop, facebook]
1 metric_1 metric_3 ... [CAN, tablet, google] [UK, desktop, yahoo]
2 metric_1 metric_4 ... [PT, desktop, gmail] [UK, desktop, apple]
[3 rows x 7 columns]
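The output above keeps the default names level_0, level_1 and 0 to 4. If the A to G headers from the question are wanted, one possible variation is to pass column names to from_dict and level names to the MultiIndex. This is a sketch rather than part of the answer: it builds the tuples with a plain list comprehension instead of .str.split, and uses a two-entry subset of the question's data.

```python
import pandas as pd

# Two entries copied from the question's dict.
d = {
    'metric_1-metric2': [0.062034245713139154, 0.7711299537484807, 0.9999999999999999,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.9607689228305227, -0.12803370313903312, 0.778375882191523,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
}

# `columns` is accepted by from_dict when orient='index'.
s = pd.DataFrame.from_dict(d, orient='index', columns=list('CDEFG'))

# Name the two index levels so that reset_index produces columns A and B.
s.index = pd.MultiIndex.from_tuples(
    [tuple(key.split('-')) for key in s.index], names=['A', 'B']
)
s = s.reset_index()
print(s.columns.tolist())
```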
Upvotes: 2
Reputation: 150815
As suggested in the comments, with a little extra:
df = pd.DataFrame(my_dict).T
(df.index.to_series()                  # get the metrics from index
   .str.split('-', expand=True)        # split by `-`
   .rename(columns={0:'A',1:'B'})      # rename the metric
   .join(df)                           # join as usual
   .reset_index(drop=True)             # remove the metric in index
)
Output:
A B 0 1 2 3 4
-- -------- -------- ---------- --------- -------- --------------------------- -----------------------------
0 metric_1 metric2 0.0620342 0.77113 1 ['US', 'mobile', 'google'] ['UK', 'desktop', 'facebook']
1 metric_1 metric_3 -0.960769 -0.128034 0.778376 ['CAN', 'tablet', 'google'] ['UK', 'desktop', 'yahoo']
2 metric_1 metric_4 -0.467897 0.660026 1 ['PT', 'desktop', 'gmail'] ['UK', 'desktop', 'apple']
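Note that the chained expression is not assigned to anything, and the value columns are still labelled 0 to 4. A possible follow-up, sketched here with the hypothetical name result and a two-entry subset of the question's data, stores the result, renames the remaining columns to C through G, and converts the three numeric columns from object to float (the transpose leaves every column with object dtype).

```python
import pandas as pd

# Two entries copied from the question's dict.
my_dict = {
    'metric_1-metric2': [0.062034245713139154, 0.7711299537484807, 0.9999999999999999,
                         ['US', 'mobile', 'google'], ['UK', 'desktop', 'facebook']],
    'metric_1-metric_3': [-0.9607689228305227, -0.12803370313903312, 0.778375882191523,
                          ['CAN', 'tablet', 'google'], ['UK', 'desktop', 'yahoo']],
}

df = pd.DataFrame(my_dict).T
result = (
    df.index.to_series()
      .str.split('-', expand=True)                   # two columns: 0 and 1
      .rename(columns={0: 'A', 1: 'B'})
      .join(df)                                      # adds value columns 0..4
      .reset_index(drop=True)
      .rename(columns=dict(zip(range(5), 'CDEFG')))  # 0..4 -> C..G
      .astype({'C': float, 'D': float, 'E': float})  # object -> float64
)
print(result.dtypes)
```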
Upvotes: 2