Reputation: 12451
I have the following dataframe:
In [137]: counts
Out[137]:
SourceColumnID 3029903181 3029903182 3029903183 3029903184 ResponseCount
ColID QuestionID RowID
3029903193 316923119 3029903189 773 788 778 803 3142
3029903194 316923119 3029903189 766 799 782 773 3120
[2 rows x 5 columns]
that works well for what I want when I access it via iloc:
In [138]: counts.iloc[0][3029903181]
Out[138]: 773
but when I convert it to a dict, the result can no longer be accessed in the same way:
In [139]: counts.to_dict()
Out[139]:
{3029903181: {(3029903193, 316923119, 3029903189): 773,
(3029903194, 316923119, 3029903189): 766},
3029903182: {(3029903193, 316923119, 3029903189): 788,
(3029903194, 316923119, 3029903189): 799},
3029903183: {(3029903193, 316923119, 3029903189): 778,
(3029903194, 316923119, 3029903189): 782},
3029903184: {(3029903193, 316923119, 3029903189): 803,
(3029903194, 316923119, 3029903189): 773},
'ResponseCount': {(3029903193, 316923119, 3029903189): 3142,
(3029903194, 316923119, 3029903189): 3120}}
In [140]: counts.to_dict('list')
Out[140]:
{3029903181: [773, 766],
3029903182: [788, 799],
3029903183: [778, 782],
3029903184: [803, 773],
'ResponseCount': [3142, 3120]}
I need to convert this data structure to a standard Python object so that an API can consume it.
Should I have created the table in a different format?
I started with this DataFrame:
In [141]: df
Out[141]:
ColID QuestionID ResponseCount RowID SourceColumnID
0 3029903193 316923119 773 3029903189 3029903181
1 3029903193 316923119 788 3029903189 3029903182
2 3029903193 316923119 778 3029903189 3029903183
3 3029903193 316923119 803 3029903189 3029903184
4 3029903194 316923119 766 3029903189 3029903181
5 3029903194 316923119 799 3029903189 3029903182
6 3029903194 316923119 782 3029903189 3029903183
7 3029903194 316923119 773 3029903189 3029903184
[8 rows x 5 columns]
and converted it to a pivot table like this:
counts = df.pivot_table(values='ResponseCount', index=['ColID', 'QuestionID', 'RowID'], columns='SourceColumnID', aggfunc='sum')
I'm really looking for the data structure to come out looking like this:
[
{
'QuestionID': 316923119,
'RowID': 3029903189,
'ColID': 3029903193,
'3029903181': 773,
'3029903182': 788,
'3029903183': 778,
'3029903184': 803,
'ResponseCount': 3142
},
{
'QuestionID': 316923119,
'RowID': 3029903189,
'ColID': 3029903194,
'3029903181': 766,
'3029903182': 799,
'3029903183': 782,
'3029903184': 773,
'ResponseCount': 3120
},
]
Upvotes: 0
Views: 665
Reputation: 251568
I believe you want counts.reset_index().to_dict('records').
Using 'records' with to_dict makes it give you a list of dicts, one dict per row, which is what you want. You need to use reset_index() to get the index information in as columns (because 'records' throws away the index). Conceptually, the dicts you say you want don't distinguish between what's in the index of your pivot table and what's in the columns (you just want all index and column labels as keys in the dict), so you need reset_index() to remove the index/column distinction.
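A minimal end-to-end sketch of the above, under a few assumptions that are not in the question: a pandas version whose pivot_table accepts index= and columns=, a ResponseCount total computed as a row-wise sum (the question doesn't show how that column was built), and an extra step that casts the column labels to str so the keys match the string keys in the desired output.

```python
import pandas as pd

# Rebuild the long-format frame from the question.
df = pd.DataFrame({
    "ColID": [3029903193] * 4 + [3029903194] * 4,
    "QuestionID": [316923119] * 8,
    "ResponseCount": [773, 788, 778, 803, 766, 799, 782, 773],
    "RowID": [3029903189] * 8,
    "SourceColumnID": [3029903181, 3029903182, 3029903183, 3029903184] * 2,
})

# Pivot: one row per (ColID, QuestionID, RowID), one column per SourceColumnID.
counts = df.pivot_table(
    values="ResponseCount",
    index=["ColID", "QuestionID", "RowID"],
    columns="SourceColumnID",
    aggfunc="sum",
)

# Assumption: the total column is the sum across the pivoted columns.
counts["ResponseCount"] = counts.sum(axis=1)

# Move the index levels into ordinary columns so 'records' keeps them.
flat = counts.reset_index()

# Optional: turn the integer column labels into strings.
flat.columns = [str(c) for c in flat.columns]

# One dict per row.
records = flat.to_dict("records")
print(records)
```

Depending on the pandas version, the values in each dict may be NumPy integers rather than plain Python ints, so a cast may still be needed before serializing to JSON.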
Upvotes: 2