Rohan Kumar
Rohan Kumar

Reputation: 408

how to get trimmed output of an api call into a pandas dataframe in a split of a second?

An api returns this kind of data I need Lat & Long as separate columns in pandas DF:

{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}}

but I'm getting 'success','error' & 'data' as three columns after reading json through

pd.read_json(requests.get('url').contents)

it has some 1000 rows, I will have to work on data column to get the lat long. I have done this

def LatRegex(x):
 Lat = re.findall('"Lat":(.{,10}),',str(x))
def LongRegex(x):
 Long = re.findall('"Lon":(.{,10}),',str(x))
df['Lat'] = df['data'].apply(LatRegex)
df['Long'] = df['data'].apply(LongRegex)

But it's not working & taking too much time & I need the data in a split of a second. Any suggestions??

Upvotes: 0

Views: 44

Answers (1)

Rakesh
Rakesh

Reputation: 82765

Using .apply

Ex:

import json
import pandas as pd


data = json.loads('''{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},{"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}]}''')
df = pd.DataFrame(data)

df["Lat"] = df["data"].apply(lambda x: x["Lat"])
df["Lon"] = df["data"].apply(lambda x: x["Lon"])
df.drop("data", inplace=True, axis=1)
print(df)

Output:

   error  success        Lat         Lon
0  False     True  49.277126 -122.909874
1  False     True  49.276950 -122.910110
2  False     True  49.277190 -122.910339

Or using json_normalize

Ex:

import json
from pandas.io.json import json_normalize

data = json.loads('''{"success":true,"error":false,"data":[{"Id":"JTDKDTB38E1570707","Name":"AE2 42K", "Plate" :null, "Lat": 49.2771263,"Lon": -122.909874,"Fuel": 75,"Address": "0 "},{"Id" :"JTDKDTB36G1138197", "Name": "DN706K", "Plate" :null,"Lat" : 49.27695,"Lon": -122.91011,"Fuel": 85,"Address":"0 "},{"Id": "JTDKDTB39G1117652", "Name": "CH633A", "Plate" :null,"Lat":49.27719,"Lon":-122.910339,"Fuel":79,"Address":"0 "}]}''')
df = json_normalize(data, 'data', ['success', 'error'])
print(df)

Output:

  Address  Fuel                 Id        Lat         Lon     Name Plate  \
0      0     75  JTDKDTB38E1570707  49.277126 -122.909874  AE2 42K  None   
1      0     85  JTDKDTB36G1138197  49.276950 -122.910110   DN706K  None   
2      0     79  JTDKDTB39G1117652  49.277190 -122.910339   CH633A  None   

   success  error  
0     True  False  
1     True  False  
2     True  False  

Upvotes: 2

Related Questions