Reputation: 66
I have a big database of several JSON files. They contain tweets, and each tweet looks like this:
{"object":"Message",
"action":"create",
"data":{"id":152376374,"body":"@PeakySellers right there with you brother. 368$","created_at":"2019-01-31T23:59:56Z",
"user":{"id":971815,"username":"mtothe5thpower",}'
}
One file has 3 million rows and its size is more than 5 GB. I use pandas to read the file, and that works well:
data2 = pd.read_table('file', sep="\n", header=None)
Now each row of the DataFrame holds one element (a tweet like the one above) as a string. To access the individual fields, I convert each element to a dictionary with the code below:
data2["dic"] = ''                              # column to hold the decoded dicts
for i, row in data2.itertuples():
    data2["dic"][i] = json.loads(data2[0][i])  # decode the JSON string in each row
While this code successfully converts each string to a dictionary, it is very slow. I think there should be a faster way for this task. Thank you in advance for any help or suggestions.
Upvotes: 1
Views: 419
Reputation: 195573
You could load the whole file at once using readlines(), join the lines into one big JSON array string, and decode it with a single call to json.loads().
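For example (a minimal sketch, assuming the path 'file' from the question and that every line is one complete JSON object, as in the sample above):

import json
import pandas as pd

# Read all lines of the newline-delimited JSON file into a list of strings
with open('file', 'r') as r:
    lines = r.readlines()

# Wrap the lines in brackets to form one JSON array and decode it in one call
records = json.loads('[' + ','.join(lines) + ']')

# Each element of `records` is now a dict, e.g. records[0]['data']['body']
data2 = pd.DataFrame({'dic': records})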
Benchmark (using a file with 100k JSON rows):
import json
import pandas as pd
from timeit import timeit

def f1():
    # original approach: decode the JSON string row by row
    data2 = pd.read_table('huge.txt', sep="\n", header=None)
    data2['dic'] = ''
    for i in range(len(data2[0])):
        data2["dic"][i] = json.loads(data2[0][i])
    return data2

def f2():
    # proposed approach: read all lines, join into one JSON array, decode once
    with open('huge.txt', 'r') as r:
        l = r.readlines()
    s = '[' + ','.join(l) + ']'
    data = json.loads(s)
    return pd.DataFrame({'dic': data})

t1 = timeit(lambda: f1(), number=1)
t2 = timeit(lambda: f2(), number=1)

print(t1)
print(t2)
Prints on my machine (AMD 2400G, Python 3.8):
102.88191884600383
0.30581573898962233
So this method is a lot faster (roughly 300x on this test).
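As a follow-up usage sketch (assuming the DataFrame returned by f2() above and the tweet structure from the question), the decoded dicts can then be accessed directly, or flattened into columns with pandas.json_normalize(), which is available as a top-level function in recent pandas versions:

df = f2()                          # one dict per row in the 'dic' column

# Access a single nested field of the first tweet
print(df['dic'][0]['data']['body'])

# Flatten the nested dicts into columns such as 'data.body' and 'data.user.username'
flat = pd.json_normalize(df['dic'].tolist())
print(flat.head())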
Upvotes: 1