Reputation: 263
I have a Python list of lists that I want to convert into a pandas DataFrame. I want the DataFrame in the following format:
table_id created Mb (etc.)
1 NetworkClicks 2018-10-26 0.22
2 NetworkImpressions 2018-10-26 1519.24
(total 6 rows based on list sample below)
The column names are inside each list, e.g. Mb, created, modified, table_id.
List sample:
ls_all = [
[(u'Mb', u'928.11'), (u'created', datetime.date(2018, 10, 25)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'4,378'), (u'table_id', u'NetworkActiveViews'), (u'Tb', u'0.91')],
[(u'Mb', u'800.67'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'3,577'), (u'table_id', u'NetworkBackfillActiveViews'), (u'Tb', u'0.78')],
[(u'Mb', u'2.44'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'11'), (u'table_id', u'NetworkBackfillClicks'), (u'Tb', u'0.00')],
[(u'Mb', u'1190.52'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'5,269'), (u'table_id', u'NetworkBackfillImpressions'), (u'Tb', u'1.16')],
[(u'Mb', u'0.22'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'1'), (u'table_id', u'NetworkClicks'), (u'Tb', u'0.00')],
[(u'Mb', u'1519.24'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'7,089'), (u'table_id', u'NetworkImpressions'), (u'Tb', u'1.48')]
]
I tried
df = pd.DataFrame(ls_all, columns=ls_all[0])
but it's giving me this dataframe:
(Mb, 928.11) ... (Tb, 0.91)
0 (Mb, 928.11) ... (Tb, 0.91)
1 (Mb, 800.67) ... (Tb, 0.78)
2 (Mb, 2.44) ... (Tb, 0.00)
3 (Mb, 1190.52) ... (Tb, 1.16)
4 (Mb, 0.22) ... (Tb, 0.00)
5 (Mb, 1519.24) ... (Tb, 1.48)
Upvotes: 0
Views: 10443
Reputation: 26
I like the list-of-dictionaries approach from the other answer, but here's another way:
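rows = []
for row in ls_all:
    # keep only the value from each (column, value) pair
    rows.append([value for _, value in row])
# take the column names from the pairs in the first row
columns = [name for name, _ in ls_all[0]]
df = pd.DataFrame(rows, columns=columns)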
Mb created modified Rows_Mil table_id Tb
0 928.11 2018-10-25 2019-04-18 4,378 NetworkActiveViews 0.91
1 800.67 2018-10-26 2019-04-18 3,577 NetworkBackfillActiveViews 0.78
2 2.44 2018-10-26 2019-04-18 11 NetworkBackfillClicks 0.00
3 1190.52 2018-10-26 2019-04-18 5,269 NetworkBackfillImpressions 1.16
4 0.22 2018-10-26 2019-04-18 1 NetworkClicks 0.00
5 1519.24 2018-10-26 2019-04-18 7,089 NetworkImpressions 1.48
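For a version that runs on its own, here is a minimal sketch of the same idea. It uses only the first two rows of the sample, trimmed to three columns for brevity. The helper name `pairs_to_frame` is made up for illustration, and it assumes every inner list has its (name, value) pairs in the same order as the first one.

```python
import datetime
import pandas as pd

# Two rows from the question's sample, trimmed to three columns.
ls_all = [
    [('Mb', '928.11'), ('created', datetime.date(2018, 10, 25)), ('table_id', 'NetworkActiveViews')],
    [('Mb', '800.67'), ('created', datetime.date(2018, 10, 26)), ('table_id', 'NetworkBackfillActiveViews')],
]

def pairs_to_frame(rows):
    """Hypothetical helper: build a DataFrame from lists of (name, value) pairs.

    Assumes every row lists its pairs in the same order as the first row.
    """
    columns = [name for name, _ in rows[0]]
    values = [[value for _, value in row] for row in rows]
    return pd.DataFrame(values, columns=columns)

df = pairs_to_frame(ls_all)
print(df)
```

If the pairs might come in a different order from row to row, positional extraction like this would mix up columns, and converting each row with dict() is probably the safer choice.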
Upvotes: 1
Reputation: 455
Use a list of dictionaries rather than a list of lists of tuples.
list_of_dicts = [dict(x) for x in ls_all]
df = pd.DataFrame(list_of_dicts)
Mb Rows_Mil Tb created modified table_id
0 928.11 4,378 0.91 2018-10-25 2019-04-18 NetworkActiveViews
1 800.67 3,577 0.78 2018-10-26 2019-04-18 NetworkBackfillActiveViews
2 2.44 11 0.00 2018-10-26 2019-04-18 NetworkBackfillClicks
3 1190.52 5,269 1.16 2018-10-26 2019-04-18 NetworkBackfillImpressions
4 0.22 1 0.00 2018-10-26 2019-04-18 NetworkClicks
5 1519.24 7,089 1.48 2018-10-26 2019-04-18 NetworkImpressions
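To get the column order from the question, and numbers instead of strings, something along these lines might work. This is only a sketch on the last two sample rows, trimmed to four columns. Converting `Mb` and `Rows_Mil` to numeric types is my assumption about what is wanted, since the question doesn't ask for it explicitly.

```python
import datetime
import pandas as pd

# Last two rows from the question's sample, trimmed to four columns.
ls_all = [
    [('Mb', '0.22'), ('created', datetime.date(2018, 10, 26)), ('Rows_Mil', '1'), ('table_id', 'NetworkClicks')],
    [('Mb', '1519.24'), ('created', datetime.date(2018, 10, 26)), ('Rows_Mil', '7,089'), ('table_id', 'NetworkImpressions')],
]

# dict(row) turns each list of (name, value) pairs into one record.
df = pd.DataFrame([dict(row) for row in ls_all])

# Put the columns in the order shown in the question.
df = df[['table_id', 'created', 'Mb', 'Rows_Mil']]

# The sample stores numbers as strings; convert them (assumption: numeric columns are wanted).
df['Mb'] = pd.to_numeric(df['Mb'])
# Strip the thousands separator before converting, e.g. '7,089' -> 7089.
df['Rows_Mil'] = pd.to_numeric(df['Rows_Mil'].str.replace(',', '', regex=False))

print(df)
print(df.dtypes)
```

Because each row becomes a dictionary keyed by column name, this doesn't depend on the pairs appearing in the same order in every inner list.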
Upvotes: 3