Reputation: 3208
I'm using pandas to handle data frames. One data frame I create has rows of the form [id, vector],
where id is a string and vector is a dictionary.
When I write it to a CSV file, a row looks like this (in the CSV file):
25377bc2-d3b6-4699-a466-6b9f544e8ba3 {u'sport>sports event>world championship': 0.5058, u'sport>sports event': 0.7032, u'sport>soccer': 0.6377, u'lifestyle and leisure>game': 0.4673, u'sport>sports event>world cup': 0.6614, u'sport>sports event>international tournament': 0.454, u'sport>sports event>national tournament': 0.541, u'sport': 0.9069, u'sport>sports organisations>international federation': 0.5046, u'sport>sports organisations': 0.6982}
I've tried to read it back from the CSV into a pandas data frame, but when I check the type of the vector, which was once a dict,
it is now <type 'str'>.
I know I can solve this by saving the data frame to a pickle file instead. But is there a way to read the CSV correctly, so that the vector comes back as a dictionary?
Upvotes: 1
Views: 373
Reputation: 863531
I think you can use JSON, which is a better format than CSV for saving dicts.
To write, use to_json, and to read, use read_json, both with the parameter orient='records' (thanks to piRSquared for the comment):
import pandas as pd
df = pd.DataFrame({'vector':[{'a':1, 'b':3}, {'a':4, 'b':6}], 'ID':[2,3]})
print (df)
ID vector
0 2 {'b': 3, 'a': 1}
1 3 {'b': 6, 'a': 4}
df.to_json('file.json', orient='records')
df = pd.read_json('file.json', orient='records')
print (df)
ID vector
0 2 {'b': 3, 'a': 1}
1 3 {'b': 6, 'a': 4}
print (df.applymap(type))
ID vector
0 <class 'int'> <class 'dict'>
1 <class 'int'> <class 'dict'>
EDIT1:
If you need to preserve the column order and index values, use:
df.to_json('file.json', orient='split')
df = pd.read_json('file.json', orient='split')
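If you have to stay with CSV, one possible approach (not part of the JSON solution above) is to parse the vector column while reading, by passing a converter to read_csv. This is a minimal sketch, assuming the column was written as the plain string form of a Python dict, as in the question. The sample ids and values are made up for illustration, and an in-memory buffer stands in for the real file.

```python
import ast
import io

import pandas as pd

# Made-up sample data shaped like the question's rows: [id, vector]
df = pd.DataFrame({
    "id": ["25377bc2", "a1b2c3d4"],
    "vector": [{"sport": 0.9069, "sport>soccer": 0.6377}, {"sport": 0.5}],
})

# Write to an in-memory buffer instead of a file on disk
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)

# ast.literal_eval safely turns the dict's string form back into a dict
restored = pd.read_csv(buf, converters={"vector": ast.literal_eval})
print(restored["vector"].map(type))
```

ast.literal_eval only evaluates Python literals, so it is safer than eval here. It should also accept the u'...' prefixes shown in the question's CSV row.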
Upvotes: 2