Jay Dhanki

Reputation: 23

Error converting a list of dictionaries to a DataFrame in Python

I have the list of dictionaries below.

content = ['{"a": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E; 360SE)", "c": "US", "nk": 0, "tz": "America/Los_Angeles", "g": "1lj67KQ", "h": "1xupVE6", "mc": 807, "u": "https://cdn.adf.ly/js/display.js", "t": 1427288399, "cy": "Mountain View"}\n',
 '{"a": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E; 360SE)", "c": "US", "nk": 0, "tz": "America/New_York", "g": "1lj67KQ", "h": "1xupVE6", "mc": 514, "u": "https://cdn.adf.ly/js/display.js", "t": 1427288399, "cy": "Buffalo"}\n']

When I try to convert the list of dictionaries to a DataFrame, or to create columns from the keys with the values in rows, I get the error message 'TypeError: string indices must be integers'.

Method 1:

for x in content:
    print(x["a"], x["nk"])

Method 2:

result = []

sumlist = ["a", "nk"]
for d in content:
    result.append({"col1": d["a"],
                   "col2": d["nk"]})

print(result)

Upvotes: 0

Views: 74

Answers (1)

cs95

Reputation: 402603

Option 1
Each element of the list is actually a JSON string, not a dictionary, so parse each one with json.loads and pass the result to json_normalize.

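import json
import pandas as pd

# pd.json_normalize needs pandas 1.0+; older versions expose it as pd.io.json.json_normalize
df = pd.json_normalize([json.loads(x) for x in content])
print(df)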
                                                   a   c             cy  \
0  Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...  US  Mountain View   
1  Mozilla/4.0 (compatible; MSIE 6.0; Windows NT ...  US        Buffalo   

         g        h   mc  nk           t                   tz  \
0  1lj67KQ  1xupVE6  807   0  1427288399  America/Los_Angeles   
1  1lj67KQ  1xupVE6  514   0  1427288399     America/New_York   

                                  u  
0  https://cdn.adf.ly/js/display.js  
1  https://cdn.adf.ly/js/display.js  

If all you want are a and nk, parse the strings first and then select those two columns:

df = pd.DataFrame([json.loads(x) for x in content])[['a', 'nk']]
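
To make the cause of the error concrete, here is a minimal sketch. It uses an abbreviated stand-in for the data in the question (the shortened user-agent values are an assumption for readability) and shows that indexing a string with a string key raises the TypeError, while the same lookup works once each element has been parsed with json.loads.

```python
import json

# Abbreviated stand-in for the question's data: each element is a JSON *string*,
# not a dict. The shortened values are for illustration only.
content = [
    '{"a": "Mozilla/5.0", "nk": 0, "cy": "Mountain View"}\n',
    '{"a": "Mozilla/4.0", "nk": 0, "cy": "Buffalo"}\n',
]

# Indexing a str with a str key reproduces the error from the question.
try:
    content[0]["a"]
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

# Parsing first turns each string into a real dict, so key lookups work.
records = [json.loads(x) for x in content]
pairs = [(d["a"], d["nk"]) for d in records]

print(error_type)
print(pairs)
```

The same parsed list can then be handed to pd.DataFrame or json_normalize as in Option 1.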

Option 2
Use ast.literal_eval. This works here because each string also happens to be a valid Python dict literal.

import ast
import pandas as pd

# Evaluate each string as a Python literal to get a list of real dicts
content = [ast.literal_eval(x) for x in content]
df = pd.DataFrame(content)

print(df)
                                                   a   c             cy  \
0  Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ...  US  Mountain View   
1  Mozilla/4.0 (compatible; MSIE 6.0; Windows NT ...  US        Buffalo   

         g        h   mc  nk           t                   tz  \
0  1lj67KQ  1xupVE6  807   0  1427288399  America/Los_Angeles   
1  1lj67KQ  1xupVE6  514   0  1427288399     America/New_York   

                                  u  
0  https://cdn.adf.ly/js/display.js  
1  https://cdn.adf.ly/js/display.js  
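
One design consideration when choosing between the two options: ast.literal_eval only accepts Python literals, so it may fail on records that use JSON-only literals such as true, false, or null. The sketch below illustrates this with a hypothetical record (the "known" and "ref" keys are invented for the example and do not appear in the question's data).

```python
import ast
import json

# Hypothetical record, not from the question, that uses JSON-only literals.
raw = '{"a": "Mozilla/5.0", "nk": 0, "known": true, "ref": null}'

# json.loads maps JSON true/null to Python True/None.
parsed = json.loads(raw)

# ast.literal_eval sees `true` and `null` as bare names, not literals, and rejects them.
try:
    ast.literal_eval(raw)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print(parsed["known"], parsed["ref"], literal_eval_ok)
```

For the data in the question either option works, since it only contains strings and integers.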

Upvotes: 3
