Reputation: 453
I am extracting some tweets and getting JSON back (json_response), which looks something like this (I've added dummy IDs):
{
    "data": [
        {
            "author_id": "123456",
            "conversation_id": "7890",
            "created_at": "2020-03-01T23:59:58.000Z",
            "id": "12345678",
            "lang": "en",
            "public_metrics": {
                "like_count": 1,
                "quote_count": 2,
                "reply_count": 3,
                "retweet_count": 4
            },
            "referenced_tweets": [
                {
                    "id": "13664100",
                    "type": "retweeted"
                }
            ],
            "reply_settings": "everyone",
            "source": "Twitter for Android",
            "text": "This is a sample."
        }
    ],
    "includes": {
        "users": [
            {
                "created_at": "2018-08-29T23:45:37.000Z",
                "description": "",
                "id": "7890123",
                "name": "Twitter user",
                "public_metrics": {
                    "followers_count": 1199,
                    "following_count": 1351,
                    "listed_count": 0,
                    "tweet_count": 52607
                },
                "username": "user_123",
                "verified": false
            }
        ]
    }
}
I am trying to convert it into a pandas DataFrame using the following code:
import pandas as pd
# pd.json_normalize already returns a DataFrame, so no from_dict wrapper is needed
df = pd.json_normalize(json_response['data'])
It gives me output whose header is as follows:
conversation_id | text | source | reply_settings | referenced_tweets | id | created_at | lang | author_id | public_metrics.retweet_count | public_metrics.reply_count | public_metrics.like_count | public_metrics.quote_count | in_reply_to_user_id
However, I also want username (from the users list under includes) as a column in df alongside these columns, and I don't know how to add it.
Any guidance please?
Upvotes: 0
Views: 92
Reputation: 12711
IIUC you have lists of dictionaries in json_response['data'] and json_response['includes']['users']. Why not build your own list of dictionaries from those two?
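import json
import pandas as pd

json_response = json.loads(response_raw)  # response_raw: the raw JSON string from the API
your_dict_list = json_response['data']
# Pair each tweet with the user at the same position (assumes the two lists line up one-to-one)
for i, user in enumerate(json_response['includes']['users']):
    your_dict_list[i]['username'] = user['username']
df = pd.json_normalize(your_dict_list)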
Output:
   author_id  conversation_id                created_at        id  lang  ...  username  public_metrics.like_count  public_metrics.quote_count  public_metrics.reply_count  public_metrics.retweet_count
0     123456             7890  2020-03-01T23:59:58.000Z  12345678    en  ...  user_123                          1                           2                           3                             4
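Note that the loop above matches tweets and users by position, which only holds if the two lists line up one-to-one. If that is not guaranteed (for example, several tweets by the same author), a lookup by ID may be safer. Below is a minimal sketch of that idea, assuming each tweet's author_id corresponds to the id of an entry in includes.users. The sample data is hypothetical and made up so that the IDs match (the dummy IDs in the question do not), and the names tweets, users and username_by_id are only illustrative.

```python
import pandas as pd

# Hypothetical sample shaped like the response in the question; the IDs are
# invented so that every author_id matches the id of one user.
json_response = {
    "data": [
        {"id": "1", "author_id": "100", "text": "first", "public_metrics": {"like_count": 1}},
        {"id": "2", "author_id": "200", "text": "second", "public_metrics": {"like_count": 0}},
        {"id": "3", "author_id": "100", "text": "third", "public_metrics": {"like_count": 5}},
    ],
    "includes": {
        "users": [
            {"id": "200", "username": "user_b"},
            {"id": "100", "username": "user_a"},
        ]
    },
}

tweets = pd.json_normalize(json_response["data"])
users = pd.json_normalize(json_response["includes"]["users"])

# Build a Series indexed by user id, then look up each tweet's author_id in it.
username_by_id = users.set_index("id")["username"]
tweets["username"] = tweets["author_id"].map(username_by_id)

print(tweets[["id", "author_id", "username"]])
```

With this approach, a tweet whose author_id has no matching user simply gets NaN in the username column instead of being paired with the wrong user.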
Upvotes: 2