Reputation: 3504
I am learning APIs, but I have been using Pandas for data analysis for some time. Can I send data to an API from a Pandas dataframe?
For example, I make up some time series data in a Pandas df and attempt to send it with df.to_json(). The ultimate goal is to make a Flask API that returns the median of the Value column in the Pandas df.
import requests
import pandas as pd
import numpy as np
from numpy.random import randint
np.random.seed(11)
rows,cols = 50000,1
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='min')  # 'min' = minutely; the 'T' alias is deprecated in newer pandas
df = pd.DataFrame(data, columns=['Value'], index=tidx)
median_val = df.Value.median()
print('[INFO]')
print(median_val)
print('[INFO]')
print(df.head())
json_data = df.to_json()
print('[Sending to API!]')
url = "http://127.0.0.1:5000/api/v1.0/median_val"
print(requests.post(url, json_data).text)
Is it possible (or bad practice) to send a year's worth of time series data to an API to get processed? And how much data can be sent as form data in an HTTP POST request?
Here is something simple in Flask on a local route shown below which errors out. This is just something I made up on the fly trying to figure it out.
import numpy as np
import pandas as pd
import time, datetime
from datetime import datetime
import json
from flask import Flask, request, jsonify

#start flask app
app = Flask(__name__)

#Simple flask route to return Value average
@app.route("/api/v1.0/median_val", methods=['POST'])
def med_val():
    r = request.form.to_dict()
    print(r.keys())
    df = pd.json_normalize(r)
    print(df)
    if r.keys() == {'Date','Value'}:
        try:
            df = pd.json_normalize(r)
            df['Date'] = datetime.fromtimestamp(df['Date'].astype(float))
            df = pd.DataFrame(df,index=[0])
            df = df.set_index('Date')
            df['Value'] = df['Value'].astype(float)
            median_val = df.Value.median()
        except Exception as error:
            print("Internal Sever Error {}".format(error))
            error_str = str(error)
            return error_str, 500
        return json.dumps(median_val)
    else:
        print("Error on api route, rejected unable to process keys")
        print("rejected unable to process keys")
        return 'Bad Request', 400

if __name__ == '__main__':
    print("Starting main loop")
    app.run(debug=True,port=5000,host="127.0.0.1")
I don't get why the prints on the Flask side (the lines below) come out empty. Any tips are greatly appreciated; I don't have a lot of wisdom when it comes to web server processes/design.
r = request.form.to_dict()
print(r.keys())
df = pd.json_normalize(r)
print(df)
Full traceback on the Flask side:
dict_keys([])
Empty DataFrame
Columns: []
Index: [0]
Error on api route, rejected unable to process keys
rejected unable to process keys
127.0.0.1 - - [10/Feb/2021 07:50:44] "POST /api/v1.0/median_val HTTP/1.1" 400 -
Upvotes: 0
Views: 2571
Reputation: 3504
I got the code to work :) by not using df.to_json() and instead populating an empty Python dictionary, baggage_handler = {}, with the data to send to the Flask app API route for processing.
I am also not super sure about best practices for how much data can be sent in an HTTP POST body, but this appears to work on localhost :)
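On the question of how much data a POST body can carry: HTTP itself does not fix a limit, so in practice the server and its configuration decide. Flask has a MAX_CONTENT_LENGTH config key for this. The following is only a minimal sketch, assuming a 1 KiB limit and a hypothetical /echo_size route (neither is part of the app in this answer), run through Flask's test client so no server has to be started:

```python
from flask import Flask, request

app = Flask(__name__)
# Assumed limit, for illustration only: reject request bodies larger than 1 KiB.
app.config["MAX_CONTENT_LENGTH"] = 1024


@app.route("/echo_size", methods=["POST"])  # hypothetical route
def echo_size():
    # Reading request.form makes Flask parse the body; an oversized body
    # is rejected at this point with 413 Request Entity Too Large.
    values = request.form.getlist("Value")
    return {"count": len(values)}


client = app.test_client()

# Two short form fields: well under the limit.
small = client.post("/echo_size", data={"Value": ["0.1", "0.2"]})

# 500 repeated fields: several kilobytes once form-encoded, over the limit.
large = client.post("/echo_size", data={"Value": ["0.123456789"] * 500})

print(small.status_code, large.status_code)
```

If a limit like this is set, the value would have to be chosen with the expected size of a year of data in mind; the number above is arbitrary.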
Flask APP:
import json
import pandas as pd
from flask import Flask, request

# start flask app
app = Flask(__name__)

# Simple flask route to return the median of Value
@app.route("/api/v1.0/median_val", methods=['POST'])
def med_val():
    # flat=False keeps every repeated 'Value' form field as a list;
    # the default to_dict() would keep only the first value per key
    r = request.form.to_dict(flat=False)
    print('incoming keys')
    print(r.keys())
    if r.keys() == {'Value'}:
        print('keys are good')
        try:
            # one row per submitted value
            df = pd.DataFrame(r)
            df['Value'] = df['Value'].astype(float)
            median_val = df.Value.median()
            print('median value == ', median_val)
        except Exception as error:
            print("Internal Server Error {}".format(error))
            error_str = str(error)
            return error_str, 500
        return json.dumps(median_val)
    else:
        print("Error on api route, rejected unable to process keys")
        print("rejected unable to process keys")
        return 'Bad Request', 400

if __name__ == '__main__':
    print("Starting main loop")
    app.run(debug=True, port=5000, host="127.0.0.1")
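Because requests encodes a list under one key as repeated Value=... form fields, how the form gets turned into a dict on the Flask side matters for the result. request.form is a Werkzeug multi-dict, and the small sketch below (with three made-up values standing in for the real request) shows the difference between the default to_dict(), to_dict(flat=False) and getlist():

```python
from werkzeug.datastructures import MultiDict

# Stand-in for request.form after a POST with three repeated "Value" fields.
form = MultiDict([("Value", "0.1"), ("Value", "0.5"), ("Value", "0.9")])

# Default (flat=True): only the first value for each key is kept.
flat = form.to_dict()

# flat=False: every value is kept, as a list per key.
full = form.to_dict(flat=False)

# getlist() returns all values for one key and can convert them on the way.
values = form.getlist("Value", type=float)

print(flat)
print(full)
print(values)
```

Comparing the median returned by the API with the one printed by the client script is a quick way to confirm that all values arrived, not only the first.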
HTTP Request script:
import requests
import pandas as pd
import numpy as np
from numpy.random import randint
np.random.seed(11)
rows,cols = 50000,1
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='min')  # 'min' = minutely; the 'T' alias is deprecated in newer pandas
df = pd.DataFrame(data, columns=['Value'], index=tidx)
median_val = df.Value.median()
print('[INFO]')
print(median_val)
print('[INFO]')
print(df.head())
#create an empty dictionary
baggage_handler = {}
print('[packaging some data!!]')
values_to_send = df.Value.tolist()
baggage_handler['Value'] = values_to_send
print('[Sending to API!]')
response = requests.post('http://127.0.0.1:5000/api/v1.0/median_val', data=baggage_handler)
print("RESPONSE TXT", response.json())
data = response.json()
print(data)
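For completeness, df.to_json() can probably be made to work too. In the original attempt the JSON string was posted as a raw body, so Flask had no form fields to parse and request.form came back empty. A sketch of the JSON route, assuming the body is sent with Content-Type application/json and read with request.get_json(); it uses a three-row stand-in dataframe and Flask's test client instead of a running server:

```python
import pandas as pd
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route("/api/v1.0/median_val", methods=["POST"])
def med_val():
    # get_json() parses the body when the Content-Type is application/json;
    # silent=True returns None instead of raising on a missing or invalid body.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict) or "Value" not in payload:
        return "Bad Request", 400
    # With the default orient, df.to_json() yields {"Value": {timestamp: value, ...}}.
    series = pd.Series(payload["Value"], dtype=float)
    return jsonify(median=float(series.median()))


# Small stand-in for the 50000-row dataframe used above.
df = pd.DataFrame(
    {"Value": [0.1, 0.5, 0.9]},
    index=pd.date_range("2019-01-01", periods=3, freq="min"),
)

client = app.test_client()
response = client.post(
    "/api/v1.0/median_val",
    data=df.to_json(),
    content_type="application/json",
)
print(response.get_json())
```

Against a running server the equivalent call with requests would be requests.post(url, data=df.to_json(), headers={"Content-Type": "application/json"}).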
Upvotes: 1