Reputation: 5
I want to load a JSON file with pandas but it is not working as I expect it to! I have referred to this stackoverflow answers but my problem is not that. The JSON file looks like this:
Code to load the File:-
import pandas as pd
df = pd.read_json("BrowserHistory.json")
print(df)
Output:-
But I don't want only 1 Column containing each json element. I want 6 Columns namely 'favicon_url', 'page_transition', 'title', 'url', 'client_id' and 'time_usec' as describle in above photo of 'json file' and then each column should contain the value it has in every element.
Like this:
favicon url page_transition title url client_id time_user
. . . . . .
. . . . . .
. . . . . .
. . . . . .
JSON File:
{
"Browser History": [
{
"favicon_url": "https://www.google.com/favicon.ico",
"page_transition": "LINK",
"title": "Google Takeout",
"url": "https://takeout.google.com/",
"client_id": "cliendid",
"time_usec": 1620386529857946
},
{
"favicon_url": "https://www.google.com/favicon.ico",
"page_transition": "LINK",
"title": "Google Takeout",
"url": "https://takeout.google.com/",
"client_id": "cliendid",
"time_usec": 1620386514845201
},
{
"favicon_url": "https://www.google.com/favicon.ico",
"page_transition": "LINK",
"title": "Google Takeout",
"url": "https://takeout.google.com/",
"client_id": "cliendid",
"time_usec": 1620386499014063
},
{
"favicon_url": "https://ssl.gstatic.com/ui/v1/icons/mail/rfr/gmail.ico",
"page_transition": "LINK",
"title": "Gmail",
"url": "https://mail.google.com/mail/u/0/#inbox",
"client_id": "cliendid",
"time_usec": 1620386492788783
}
]
}
Upvotes: 0
Views: 2636
Reputation: 147216
Because your objects are in a list at the second level down of your JSON, you can't read it directly into a dataframe using read_json
. Instead, you could read the json into a variable, and then create the dataframe from that:
import pandas as pd
import json
f = open("BrowserHistory.json")
js = json.load(f)
df = pd.DataFrame(js['Browser History'])
df
# favicon_url page_transition ... client_id time_usec
# 0 https://www.google.com/favicon.ico LINK ... cliendid 1620386529857946
# 1 https://www.google.com/favicon.ico LINK ... cliendid 1620386514845201
# 2 https://www.google.com/favicon.ico LINK ... cliendid 1620386499014063
# 3 https://ssl.gstatic.com/ui/v1/icons/mail/rfr/g... LINK ... cliendid 1620386492788783
Note you may need to specify the file encoding on the open
call e.g.
f = open("BrowserHistory.json", encoding="utf8")
Upvotes: 2
Reputation: 4618
The problem is because of the {}
that are around your file, pandas thinks that the first level of the JSON are the columns and thus it uses just Browser History as a column. You can use this code to solve your problem:
import pandas as pd
df = pd.DataFrame(json.load(open('BrowserHistory.json', encoding='cp850'))['Browser History'])
print(df)
Upvotes: 3