Sema196

Reputation: 263

How to assign column names in Pandas from python list

I have a Python list of lists that I want to convert into a pandas DataFrame. I want the DataFrame to have the following format:

   table_id            created     Mb       (etc.)
1  NetworkClicks       2018-10-26  0.22
2  NetworkImpressions  2018-10-26  1519.24

(total 6 rows based on list sample below)

The column names are inside each list as the first element of every tuple, e.g. Mb, created, modified, table_id.

List sample:

ls_all = [
    [(u'Mb', u'928.11'), (u'created', datetime.date(2018, 10, 25)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'4,378'), (u'table_id', u'NetworkActiveViews'), (u'Tb', u'0.91')],
    [(u'Mb', u'800.67'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'3,577'), (u'table_id', u'NetworkBackfillActiveViews'), (u'Tb', u'0.78')],
    [(u'Mb', u'2.44'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'11'), (u'table_id', u'NetworkBackfillClicks'), (u'Tb', u'0.00')],
    [(u'Mb', u'1190.52'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'5,269'), (u'table_id', u'NetworkBackfillImpressions'), (u'Tb', u'1.16')],
    [(u'Mb', u'0.22'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'1'), (u'table_id', u'NetworkClicks'), (u'Tb', u'0.00')],
    [(u'Mb', u'1519.24'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'7,089'), (u'table_id', u'NetworkImpressions'), (u'Tb', u'1.48')]
]

I tried df = pd.DataFrame(ls_all, columns=ls_all[0])

but it's giving me this dataframe:

    (Mb, 928.11)  ...  (Tb, 0.91)
0   (Mb, 928.11)  ...  (Tb, 0.91)
1   (Mb, 800.67)  ...  (Tb, 0.78)
2     (Mb, 2.44)  ...  (Tb, 0.00)
3  (Mb, 1190.52)  ...  (Tb, 1.16)
4     (Mb, 0.22)  ...  (Tb, 0.00)
5  (Mb, 1519.24)  ...  (Tb, 1.48)

Upvotes: 0

Views: 10443

Answers (2)

Ando21

Reputation: 26

I like the list-of-dictionaries approach in the other answer; here’s another way:

Get data from lists

lists = []

# Collect the value (second element) of each (name, value) pair, row by row
for row in ls_all:
    temp = [x[1] for x in row]
    lists.append(temp)

Get column names

columns = [x[0] for x in ls_all[0]]

Load into DataFrame

df = pd.DataFrame(lists, columns=columns)

Result

        Mb     created    modified Rows_Mil                    table_id    Tb
0   928.11  2018-10-25  2019-04-18    4,378          NetworkActiveViews  0.91
1   800.67  2018-10-26  2019-04-18    3,577  NetworkBackfillActiveViews  0.78
2     2.44  2018-10-26  2019-04-18       11       NetworkBackfillClicks  0.00
3  1190.52  2018-10-26  2019-04-18    5,269  NetworkBackfillImpressions  1.16
4     0.22  2018-10-26  2019-04-18        1               NetworkClicks  0.00
5  1519.24  2018-10-26  2019-04-18    7,089          NetworkImpressions  1.48
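
The steps above can be condensed into a short, self-contained sketch. It uses two rows from the question's sample, trimmed to three fields for brevity, and assumes every row lists its pairs in the same order as the first row (this positional approach depends on that). The final reordering to the layout asked for in the question is an addition, not part of the steps above.

```python
import datetime
import pandas as pd

# Two rows from the question's sample, trimmed to three fields for brevity.
ls_all = [
    [('Mb', '0.22'), ('created', datetime.date(2018, 10, 26)), ('table_id', 'NetworkClicks')],
    [('Mb', '1519.24'), ('created', datetime.date(2018, 10, 26)), ('table_id', 'NetworkImpressions')],
]

# Column names: the first element of each pair in the first row.
columns = [name for name, _ in ls_all[0]]
# Values: the second element of each pair, row by row.
# Assumption: every row uses the same pair order as the first row.
values = [[value for _, value in row] for row in ls_all]

df = pd.DataFrame(values, columns=columns)

# Reorder the columns to match the layout requested in the question.
df = df[['table_id', 'created', 'Mb']]
print(df)
```

If the rows could list their pairs in different orders, matching values by position would silently mix up columns, so the dictionary-based conversion is the safer choice in that case.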

Upvotes: 1

danielR9

Reputation: 455

Use a list of dictionaries rather than a list of lists of tuples.

list_of_dicts = [dict(x) for x in ls_all]

df = pd.DataFrame(list_of_dicts)

        Mb Rows_Mil    Tb     created    modified                    table_id
0   928.11    4,378  0.91  2018-10-25  2019-04-18          NetworkActiveViews
1   800.67    3,577  0.78  2018-10-26  2019-04-18  NetworkBackfillActiveViews
2     2.44       11  0.00  2018-10-26  2019-04-18       NetworkBackfillClicks
3  1190.52    5,269  1.16  2018-10-26  2019-04-18  NetworkBackfillImpressions
4     0.22        1  0.00  2018-10-26  2019-04-18               NetworkClicks
5  1519.24    7,089  1.48  2018-10-26  2019-04-18          NetworkImpressions
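
As a minimal, self-contained sketch of the conversion above: because each row becomes a dictionary, values are matched to columns by key rather than by position. The second row below is deliberately reordered (a hypothetical variation, not in the original data) to illustrate this. Passing `columns=` to the constructor is one way to get the column order asked for in the question; without it, the resulting order may depend on the pandas version.

```python
import datetime
import pandas as pd

ls_all = [
    [('Mb', '0.22'), ('created', datetime.date(2018, 10, 26)), ('table_id', 'NetworkClicks')],
    # Hypothetical reordering of the same keys, to show that dicts align by key.
    [('table_id', 'NetworkImpressions'), ('Mb', '1519.24'), ('created', datetime.date(2018, 10, 26))],
]

# Each inner list of (name, value) pairs converts directly to a dict.
list_of_dicts = [dict(row) for row in ls_all]

# columns= selects the columns and fixes their order.
df = pd.DataFrame(list_of_dicts, columns=['table_id', 'created', 'Mb'])
print(df)
```

Note that the values stay as they were in the input, so `Mb` still holds strings here.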

Upvotes: 3
