Reputation: 53
I am trying to load a JSON file. Here is the code and file structure:
import json
import pandas as pd

df = pd.DataFrame(None, columns=columns)
for i, line in enumerate(open('json_dimName.json')):
    j = json.loads(line)
    print j
Output:
{u'dimensionalFacts': [{u'dimensions': [{u'dimName': u'us-gaap_BusinessAcquisitionAxis'}]}], u'stockSymbol': u'pfe', u'_id': {u'$oid': u'55400c1ae44f9e094c5833b2'}}
I then try to read this into a pandas dataframe:
df.loc[i] = [j['dimensionalFacts']['dimensions'], j['stockSymbol']]
This is the error message that I get:
list indices must be integers, not str
I am new to python and programming so would greatly appreciate any help. Thanks so much!
Upvotes: 0
Views: 192
Reputation: 1077
Apparently j['dimensionalFacts'] is a list, so I guess what you want to do is:
df.loc[i] = j['dimensionalFacts'][0]['dimensions'], j['stockSymbol']
Upvotes: 0
Reputation: 14644
It's because in each of your j values, the 'dimensionalFacts' key holds a list (note the square brackets), not a dictionary:
{u'dimensionalFacts': [{u'dimensions': [{u'dimName': u'us-gaap_BusinessAcquisitionAxis'}]}], ...}
What you want in this case is:
df.loc[i] = [j['dimensionalFacts'][0]['dimensions'], j['stockSymbol']]
This takes the first dictionary in the list for each j value, since it seems there is only one dictionary per entry.
The error message describes the problem exactly: j['dimensionalFacts'] is a list, and you are asking for the element at position "dimensions", a string, where only an integer position is valid.
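To see the cause in isolation, here is a minimal sketch that parses the sample record from the question (rewritten as plain JSON text, which is an assumption about how the file stores it) and indexes the list both ways. The variable names is_list, raised and dimensions are only for illustration.

```python
import json

# One record from the question, written out as plain JSON text.
line = ('{"dimensionalFacts": [{"dimensions": '
        '[{"dimName": "us-gaap_BusinessAcquisitionAxis"}]}], '
        '"stockSymbol": "pfe", '
        '"_id": {"$oid": "55400c1ae44f9e094c5833b2"}}')
j = json.loads(line)

facts = j['dimensionalFacts']
is_list = isinstance(facts, list)  # the value is a list, not a dict

# Indexing a list with a string key raises TypeError.
try:
    facts['dimensions']
    raised = False
except TypeError:
    raised = True

# Indexing with an integer first reaches the dict inside the list.
dimensions = facts[0]['dimensions']
```

The string index fails with a TypeError, while facts[0]['dimensions'] returns the inner list of dimension dictionaries.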
Upvotes: 1
Reputation: 799430
j['dimensionalFacts'] is a list; that's what the square brackets mean. You will need to index it using a number if you want to get to the dict inside it:
j['dimensionalFacts'][0]['dimensions']
Note that this will work for the example given, but more complex structures may require you to iterate over the list rather than assuming the first element.
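If a record can hold more than one entry, a rough sketch of iterating instead of taking element 0 might look like this. The helper name extract_dim_names is hypothetical, and the assumption that every record follows the same nesting as the sample in the question is mine, not something the question confirms.

```python
import json

def extract_dim_names(record):
    """Collect every dimName from all entries of record['dimensionalFacts'].

    Hypothetical helper: assumes the nesting shown in the question's sample.
    """
    names = []
    for fact in record.get('dimensionalFacts', []):
        for dim in fact.get('dimensions', []):
            if 'dimName' in dim:
                names.append(dim['dimName'])
    return names

# The sample record from the question, written out as plain JSON text.
line = ('{"dimensionalFacts": [{"dimensions": '
        '[{"dimName": "us-gaap_BusinessAcquisitionAxis"}]}], '
        '"stockSymbol": "pfe", '
        '"_id": {"$oid": "55400c1ae44f9e094c5833b2"}}')
record = json.loads(line)

# One row per record: all dimension names, then the stock symbol.
row = [extract_dim_names(record), record['stockSymbol']]
```

Inside the question's loop, row could be assigned to df.loc[i] in place of the hard-coded [0] lookup, provided the DataFrame has two columns.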
Upvotes: 0