Reputation: 2384
The API I am connecting to has the following JSON reponse (as per their docs):
{
"total_results": 0,
"page_size": 0,
"page_number": 0,
"offers": [
{
"tsin_id": 0,
"offer_id": 0,
"sku": "string",
"barcode": "string",
"product_label_number": "string",
"selling_price": 0,
"rrp": 0,
"leadtime_days": 0,
"leadtime_stock": [
{
"merchant_warehouse": {
"warehouse_id": 0,
"name": "string"
},
"quantity_available": 0
}
],
"status": "string",
"title": "string",
"offer_url": "string",
"stock_cover": [
{
"warehouse_id": 0,
"stock_cover_days": 0
}
],
"sales_units": [
{
"warehouse_id": 0,
"sales_units": 0
}
],
"discount": "string",
"discount_shown": true,
"storage_fee_eligible": true,
"date_created": "string",
"stock_at_takealot_total": 0,
"stock_at_takealot": [
{
"warehouse": {
"warehouse_id": 0,
"name": "string"
},
"quantity_available": 0
}
],
"stock_on_way": [
{
"warehouse": {
"warehouse_id": 0,
"name": "string"
},
"quantity_available": 0
}
]
}
]
}
My current code is below:
import pandas as pd
from pandas import json_normalize
import requests as rq
import json
from datetime import datetime
# API information
url = "https://seller-api.takealot.com/v2"
endpoint = "/offers?"
api_key = "Key xyz"
header = {
'Authorization': api_key
}
full_url = url + endpoint
response = rq.get(full_url, headers=header)
# convert to dataframe
data = response.text
info = json.loads(data)
df = json_normalize(info["offers"])
# end time
print(datetime.now().strftime('%H:%M:%S'))
This returns mostly correct, but columns like leadtime_stock
don't return correctly. For example, if I print out tsin_id
I get:
print(df['tsin_id'].head())
0 74215970
1 74759079
2 44290949
3 48140749
4 74824209
Name: tsin_id, dtype: int64
But for leadtime_stock
I get:
print(df['leadtime_stock'].head())
0 [{'merchant_warehouse': {'warehouse_id': 13167...
1 [{'merchant_warehouse': {'warehouse_id': 13167...
2 [{'merchant_warehouse': {'warehouse_id': 13167...
3 [{'merchant_warehouse': {'warehouse_id': 13167...
4 [{'merchant_warehouse': {'warehouse_id': 13167...
Name: leadtime_stock, dtype: object
Is there an easy way to convert the columns like leadtime_stock
, stock_at_takealot
, stock_on_way
into the same df
at the same time, or after the first normalize?
EDIT
Required dataframe (as an example) I am after will be in the format:
"tsin_id" | "offer_id" | "sku" | "barcode" | "product_label_number" | "selling_price" | "rrp" | "leadtime_days" | "leadtime_stock" | "merchant_warehouse" | "warehouse_id" | "name" | "quantity available"
Upvotes: 1
Views: 105
Reputation: 28649
Using json_normalize
, pass leadtime_stock
as record_path
, and the keys surrounding leadtime_stock
to meta
(these parameters are noted in the docs) :
pd.json_normalize(data['offers'], 'leadtime_stock', ['tsin_id','offer_id',
'sku', 'barcode',
'product_label_number',
'selling_price', 'rrp',
'leadtime_days'])
Upvotes: 2