I'm trying to create a DataFrame from a nested list (a list of lists of dictionaries) with the following structure:
df = [[{'date_time': Timestamp('2015-05-22 05:37:59'),
'name': 'Tom',
'value': '129'},
{'date_time': Timestamp('2015-05-22 05:37:59'),
'name': 'Kate',
'value': '0'},
{'date_time': Timestamp('2015-05-22 05:37:59'),
'name': 'GroupeId',
'value': '0'}, {...}, {...}, {...}],[another list of dictionaries like the first one],[and another one]]
using this code:
def create_from_arr():
    baby_array = pd.MultiIndex.from_tuples(df, names=['sessions', 'behaves'])
    return baby_array
I get the following error, which I don't understand:
TypeError: unhashable type: 'dict'
My desired output looks like this:
list
          date_time            name      value
1    0    2015-05-22 05:37:59  Tom       129
     1    2015-05-22 05:37:59  Kate      0
     2    2015-05-22 05:37:59  GroupeId  0
2    3    2015-05-26 05:56:59  Hence     129
     4    2015-05-26 05:56:59  Kate      0
     5    2015-05-26 05:56:59  Julie     0
3    ......................    ......    ......
Upvotes: 4
Views: 4910
Reputation: 20563
I am still not sure exactly what you want to do with the MultiIndex, but here is one way to "flatten" the dictionaries in your nested lists and load the data into a DataFrame properly:
Updated with "list" and "index" as MultiIndex
In [100]: data = [[{'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'Tom',
.....: 'value': '129'},
.....: {'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'Kate',
.....: 'value': '0'},
.....: {'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'GroupeId',
.....: 'value': '0'}], [{'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'Tom',
.....: 'value': '129'},
.....: {'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'Kate',
.....: 'value': '0'},
.....: {'date_time': Timestamp('2015-05-22 05:37:59'),
.....: 'name': 'GroupeId',
.....: 'value': '0'}]]
In [101]: df = pd.DataFrame(columns=['list', 'date_time', 'name', 'value'])
In [102]: for idx, each in enumerate(data, 1):
.....:     temp = pd.DataFrame(each)
.....:     temp['list'] = idx  # manually assign "list" index
.....:     # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
.....:     df = pd.concat([df, temp], ignore_index=True)
.....:
In [103]: df = df.reset_index()
In [104]: df.set_index(['list', 'index'])
Out[104]:
                      date_time      name value
list index
1    0      2015-05-22 05:37:59       Tom   129
     1      2015-05-22 05:37:59      Kate     0
     2      2015-05-22 05:37:59  GroupeId     0
2    3      2015-05-22 05:37:59       Tom   129
     4      2015-05-22 05:37:59      Kate     0
     5      2015-05-22 05:37:59  GroupeId     0
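Growing a DataFrame inside the loop copies the accumulated data on every iteration. A minimal sketch of the same idea, assuming the same nested structure as above, collects the per-list frames in a plain Python list and concatenates them once with pd.concat (this also runs on pandas 2.0+, where DataFrame.append is no longer available). The names frames and result are only illustrative:

```python
import pandas as pd

# Sample data with the same shape as in the question (two inner lists).
data = [
    [{'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'Tom', 'value': '129'},
     {'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'Kate', 'value': '0'},
     {'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'GroupeId', 'value': '0'}],
    [{'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'Tom', 'value': '129'},
     {'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'Kate', 'value': '0'},
     {'date_time': pd.Timestamp('2015-05-22 05:37:59'), 'name': 'GroupeId', 'value': '0'}],
]

frames = []
for idx, each in enumerate(data, 1):
    temp = pd.DataFrame(each)   # one frame per inner list
    temp['list'] = idx          # remember which inner list the rows came from
    frames.append(temp)         # list.append, not DataFrame.append

# Concatenate once, then turn the running row number into the "index" column.
df = pd.concat(frames, ignore_index=True).reset_index()
result = df.set_index(['list', 'index'])
print(result)
```

The resulting frame should have the same two-level index ("list", "index") as the output above.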
Upvotes: 3
Reputation: 21574
IIUC, let d be an extract of your array:
d = [[{'date_time': '2015-05-22 05:37:59',
'name': 'Tom',
'value': '129'},
{'date_time': '2015-05-22 05:37:59',
'name': 'Kate',
'value': '0'}]]
I would extract the dataframe with:
df = pd.DataFrame.from_dict(d[0])
which returns:
             date_time  name value
0  2015-05-22 05:37:59   Tom   129
1  2015-05-22 05:37:59  Kate     0
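To cover every inner list rather than only d[0], one possible sketch is below. The TypeError in the question most likely arises because MultiIndex.from_tuples expects tuples of hashable labels, and dictionaries are not hashable. Here the tuples are (list number, running row number) pairs instead, and the dictionaries are passed as row data. The second inner list and the names records, pairs and df_all are assumptions added for illustration:

```python
import pandas as pd

# Sample data: the extract from above plus a hypothetical second inner list.
d = [
    [{'date_time': '2015-05-22 05:37:59', 'name': 'Tom', 'value': '129'},
     {'date_time': '2015-05-22 05:37:59', 'name': 'Kate', 'value': '0'}],
    [{'date_time': '2015-05-26 05:56:59', 'name': 'Hence', 'value': '129'}],
]

# Flatten the nested lists into one list of row dictionaries.
records = [rec for sub in d for rec in sub]

# Build one hashable (list number, running row number) tuple per row.
pairs = []
row = 0
for list_no, sub in enumerate(d, 1):
    for _ in sub:
        pairs.append((list_no, row))
        row += 1

index = pd.MultiIndex.from_tuples(pairs, names=['list', 'index'])
df_all = pd.DataFrame(records, index=index)
print(df_all)
```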
Hope that helps.
Upvotes: 0