Reputation: 2132
I have a weather API JSON data. I got from a website then converted python dictionary
markit_dict = json.loads(response.content)
markit_dict
then I turned a Dataframe
but as you can see the weather column needs to separate 3 different columns
when I choose each column to turn data frame I can
wh = pd.DataFrame(openwet.iloc[1,6])
wh
description icon id main
0 broken clouds 04d 803 Clouds
Last time I tried to put in a for loop to make dataframe but I could not
EDIT:
openwet = pd.DataFrame(markit_dict)
openwet['weather'].values
output :
array([ [{u'main': u'Clouds', u'id': 803, u'icon': u'04d', u'description': u'broken clouds'}],
[{u'main': u'Clouds', u'id': 803, u'icon': u'04d', u'description': u'broken clouds'}],
[{u'main': u'Clouds', u'id': 804, u'icon': u'04d', u'description': u'overcast clouds'}],
[{u'main': u'Clouds', u'id': 804, u'icon': u'04d', u'description': u'overcast clouds'}],
I need to make a Dataframe the weather column. Also I put my json data might be someone can find different way.
url = "http://history.openweathermap.org//storage/debd7a72617dd61b0fc871a2c83fcabf.json"
response = requests.get(url)
response.content
Upvotes: 2
Views: 1040
Reputation: 863791
I think you need json_normalize
for create DataFrame
and column weather
is first select lists with str[0]
, then convert to numpy array
by values
and then to DataFrame
. (If need rename column names is possible add add_prefix
) Last concat
to original:
import urllib.request, json
url = "http://history.openweathermap.org//storage/debd7a72617dd61b0fc871a2c83fcabf.json"
#http://stackoverflow.com/a/12965254/2901002
with urllib.request.urlopen(url) as url:
data = json.loads(url.read().decode())
from pandas.io.json import json_normalize
df = json_normalize(data)
df1 = pd.DataFrame(df['weather'].str[0].values.tolist()).add_prefix('weather.')
print (df1.head())
weather.description weather.icon weather.id weather.main
0 broken clouds 04d 803 Clouds
1 broken clouds 04d 803 Clouds
2 overcast clouds 04d 804 Clouds
3 overcast clouds 04d 804 Clouds
4 overcast clouds 04n 804 Clouds
df = pd.concat([df.drop('weather', 1), df1], axis=1)
print (df.head(10))
city_id clouds.all dt dt_iso \
0 2193733 76 1447462800 2015-11-14 01:00:00 +0000 UTC
1 2193733 76 1447470000 2015-11-14 03:00:00 +0000 UTC
2 2193733 88 1447477200 2015-11-14 05:00:00 +0000 UTC
3 2193733 88 1447480800 2015-11-14 06:00:00 +0000 UTC
4 2193733 88 1447488000 2015-11-14 08:00:00 +0000 UTC
5 2193733 88 1447491600 2015-11-14 09:00:00 +0000 UTC
6 2193733 36 1447495200 2015-11-14 10:00:00 +0000 UTC
7 2193733 36 1447498800 2015-11-14 11:00:00 +0000 UTC
8 2193733 88 1447506000 2015-11-14 13:00:00 +0000 UTC
9 2193733 88 1447513200 2015-11-14 15:00:00 +0000 UTC
main.humidity main.pressure main.temp main.temp_max main.temp_min \
0 52 1020 291.15 291.15 291.15
1 45 1018 291.15 291.15 291.15
2 48 1017 290.15 290.15 290.15
3 55 1017 289.15 289.15 289.15
4 58 1017 287.15 287.15 287.15
5 62 1017 286.15 286.15 286.15
6 71 1017 286.15 286.15 286.15
7 71 1016 286.15 286.15 286.15
8 76 1015 286.15 286.15 286.15
9 87 1014 287.15 287.15 287.15
rain.3h wind.deg wind.speed weather.description weather.icon weather.id \
0 NaN 250 6 broken clouds 04d 803
1 NaN 240 7 broken clouds 04d 803
2 NaN 270 6 overcast clouds 04d 804
3 NaN 250 4 overcast clouds 04d 804
4 NaN 310 2 overcast clouds 04n 804
5 NaN 310 2 overcast clouds 04n 804
6 NaN 350 1 scattered clouds 03n 802
7 NaN 10 2 scattered clouds 03n 802
8 NaN 350 2 overcast clouds 04n 804
9 NaN 340 3 overcast clouds 04n 804
weather.main
0 Clouds
1 Clouds
2 Clouds
3 Clouds
4 Clouds
5 Clouds
6 Clouds
7 Clouds
8 Clouds
9 Clouds
Upvotes: 4
Reputation: 467
If I understood your question correctly, I think your are almost there. :)
weatherArray = [ [{u'main': u'Clouds', u'id': 803, u'icon': u'04d', u'description': u'broken clouds'}],
[{u'main': u'Clouds', u'id': 803, u'icon': u'04d', u'description': u'broken clouds'}],
[{u'main': u'Clouds', u'id': 804, u'icon': u'04d', u'description': u'overcast clouds'}],
[{u'main': u'Clouds', u'id': 804, u'icon': u'04d', u'description': u'overcast clouds'}] ]
for weather in weatherArray:
for i in weather:
print(i['main'])
print(i['id'])
print(i['icon'])
print(i['description'])
print('\n')
Use the above code to loop through the contents of weatherArray and add them in different columns.
Upvotes: -1