Reputation: 103
I am trying to import a file with the structure shown below (a dump of tweets, with Unicode strings). The goal is to convert it to a DataFrame using the pandas module. I assume the first step is to load the file into a JSON object and then convert that to a DataFrame (per p. 166 of McKinney's Python for Data Analysis), but I am unsure and could use some pointers.
import sys, tailer
tweet_sample = tailer.head(open(r'<MyFilePath>\usTweets0.json'), 3)
tweet_sample # returns
['{u\'contributors\': None, u\'truncated\': False, u\'text\': u\'@KREAYSHAWN is...
Upvotes: 4
Views: 1805
Reputation: 375375
Just use the DataFrame constructor...
In [5]: import pandas as pd
In [6]: tweet_sample = [{'contributors': None, 'truncated': False, 'text': 'foo'}, {'contributors': None, 'truncated': True, 'text': 'bar'}]
In [7]: df = pd.DataFrame(tweet_sample)
In [8]: df
Out[8]:
  contributors text truncated
0         None  foo     False
1         None  bar      True
If you have the file as JSON, you can load it using json.load:
import json
# Raw string, so the backslash in the Windows path is not treated as an escape
with open(r'<MyFilePath>\usTweets0.json', 'r') as f:
    tweet_sample = json.load(f)
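One caveat, judging only from the sample shown in the question: each line looks like the repr of a Python dict (u'...' keys, None, False) rather than strict JSON, in which case json.load would likely raise an error. A minimal sketch for that case, assuming one dict literal per line. The helper name parse_repr_lines and the two sample lines are made up for illustration:

```python
import ast
import io


def parse_repr_lines(lines):
    """Parse an iterable of lines, each holding one Python dict literal."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only evaluates literals, so it is safer than eval
        records.append(ast.literal_eval(line))
    return records


# Made-up stand-in for the tweet file: one dict repr per line
sample = io.StringIO(
    "{u'contributors': None, u'truncated': False, u'text': u'foo'}\n"
    "{u'contributors': None, u'truncated': True, u'text': u'bar'}\n"
)
records = parse_repr_lines(sample)
```

The resulting list of dicts can then be passed to the DataFrame constructor exactly as in the first example.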
There will be a from_json coming soon to pandas...
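In current pandas releases this reader is exposed as pd.read_json rather than from_json. A short sketch, assuming the data is line-delimited JSON (one object per line); the sample text here is made up and stands in for the real file:

```python
import io

import pandas as pd

# Made-up line-delimited JSON standing in for the tweet dump
text = (
    '{"contributors": null, "truncated": false, "text": "foo"}\n'
    '{"contributors": null, "truncated": true, "text": "bar"}\n'
)

# lines=True tells pandas to parse one JSON object per line
df = pd.read_json(io.StringIO(text), lines=True)
```

For a file on disk, a path can be passed in place of the StringIO object.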
Upvotes: 2