Reputation: 408
An api returns this kind of data I need Lat & Long as separate columns in pandas DF:
{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}}
but I'm getting 'success','error' & 'data' as three columns after reading json through
pd.read_json(requests.get('url').contents)
it has some 1000 rows, I will have to work on data column to get the lat long. I have done this
def LatRegex(x):
Lat = re.findall('"Lat":(.{,10}),',str(x))
def LongRegex(x):
Long = re.findall('"Lon":(.{,10}),',str(x))
df['Lat'] = df['data'].apply(LatRegex)
df['Long'] = df['data'].apply(LongRegex)
But it's not working & taking too much time & I need the data in a split of a second. Any suggestions??
Upvotes: 0
Views: 44
Reputation: 82765
Using .apply
Ex:
import json
import pandas as pd
data = json.loads('''{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},{"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}]}''')
df = pd.DataFrame(data)
df["Lat"] = df["data"].apply(lambda x: x["Lat"])
df["Lon"] = df["data"].apply(lambda x: x["Lon"])
df.drop("data", inplace=True, axis=1)
print(df)
Output:
error success Lat Lon
0 False True 49.277126 -122.909874
1 False True 49.276950 -122.910110
2 False True 49.277190 -122.910339
Or using json_normalize
Ex:
import json
from pandas.io.json import json_normalize
data = json.loads('''{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},{"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}]}''')
df = json_normalize(data, 'data', ['success', 'error'])
print(df)
Output:
Address Fuel Id Lat Lon Name Plate \
0 0 75 JTDKDTB38E1570707 49.277126 -122.909874 AE2 42K None
1 0 85 JTDKDTB36G1138197 49.276950 -122.910110 DN706K None
2 0 79 JTDKDTB39G1117652 49.277190 -122.910339 CH633A None
success error
0 True False
1 True False
2 True False
Upvotes: 2