Reputation: 1165
My data look like this :
{u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}', u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
I want to convert it to a pandas Dataframe. But when I try
df = pd.DataFrame(response.items())
I get a DataFrame with two columns, the first with the first key, and the second with the values of the key:
0 1
0 "57e01311817bc367c030b390" {"ad_since": 2016, "indoor_swimming_pool": "No...
1 "57e01311817bc367c030b3a8" {"ad_since": 2012, "indoor_swimming_pool": "No...
How can I get a single column for each key : "ad_since"
, "indoor_swimming_pool"
, "indoor_swimming_pool"
? And keep the first column, or get the id as index.
Upvotes: 3
Views: 2225
Reputation: 76297
As the values are strings, you can use the json
module and list comprehension:
In [20]: d = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}', u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
In [21]: import json
In [22]: pd.DataFrame(dict([(k, [json.loads(e)[k] for e in d.values()]) for k in json.loads(d.values()[0])]), index=d.keys())Out[22]:
ad_since handicapped_access indoor_swimming_pool \
"57e01311817bc367c030b390" 2016 Yes No
"57e01311817bc367c030b3a8" 2012 Yes No
seaside
"57e01311817bc367c030b390" No
"57e01311817bc367c030b3a8" No
Upvotes: 1
Reputation: 862511
You need convert column of type
str
to dict
by .apply(literal_eval)
or .apply(json.loads)
and then use DataFrame.from_records
:
import pandas as pd
from ast import literal_eval
response = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
df = pd.DataFrame.from_dict(response, orient='index')
print (type(df.iloc[0,0]))
<class 'str'>
df.iloc[:,0] = df.iloc[:,0].apply(literal_eval)
print (pd.DataFrame.from_records(df.iloc[:,0].values.tolist(), index=df.index))
ad_since handicapped_access indoor_swimming_pool \
"57e01311817bc367c030b3a8" 2012 Yes No
"57e01311817bc367c030b390" 2016 Yes No
seaside
"57e01311817bc367c030b3a8" No
"57e01311817bc367c030b390" No
import pandas as pd
import json
response = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
df = pd.DataFrame.from_dict(response, orient='index')
df.iloc[:,0] = df.iloc[:,0].apply(json.loads)
print (pd.DataFrame.from_records(df.iloc[:,0].values.tolist(), index=df.index))
ad_since handicapped_access indoor_swimming_pool \
"57e01311817bc367c030b3a8" 2012 Yes No
"57e01311817bc367c030b390" 2016 Yes No
seaside
"57e01311817bc367c030b3a8" No
"57e01311817bc367c030b390" No
Upvotes: 2