Reputation: 645
I have written a Python script that calls a National Oceanic and Atmospheric Administration (NOAA) endpoint with a zip code and gets a list of weather stations in response. The script then converts the response to a Pandas dataframe.
I believe I have it working correctly based on this Replit. The dataframe appears to print to the console correctly, and I can inspect it using breakpoints.
Using this blog tutorial as my guide, my real goal is to leverage this Python script in a Tableau Prep flow. Tableau Prep is basically a desktop ETL tool, similar to PowerQuery, but different :).
I have a local working instance of a TabPy server, whose logs also appear to show proper construction of the dataframe (image below). However, I'm getting TypeError: 'DataFrame' object is not callable. I've also provided an image of the same error surfaced in the Tableau Prep interface.
Any help is sincerely appreciated.
Here's the syntax of the actual script running on my TabPy server - with minimal modifications from what's on Replit.
import requests
import pandas as pd
import json

zip = '97034'
userToken = 'foobar123'
headerCreds = dict(token = userToken)
url = 'https://www.ncei.noaa.gov/cdo-web/api/v2/stations?&locationid=ZIP:' + zip
global dfWorking

def get_stations_for_zip():
    r = requests.get(url, headers = headerCreds)
    data = json.loads(r.text)
    if 'results' in data:
        data = data.get('results')
        dfWorking = pd.DataFrame(data)
        # Column datatypes as received
        # elevation float64
        # mindate object
        # maxdate object
        # latitude float64
        # name int64
        # datacoverage float64
        # id object
        # elevationUnit object
        # longitude float64
        dfWorking = dfWorking.astype({'name': 'str'})
        # dfWorking['name'] = dfWorking.index
        # defining an index converts back to float64
        print(dfWorking)
    else:
        print('no results object in response')
    return dfWorking

# Note: the below prep functions are undefined until they are on a TabPy server
def get_output_schema():
    return pd.DataFrame({
        'elevation' : prep_decimal(),
        'mindate' : prep_string(),
        'maxdate' : prep_decimal(),
        'latitude' : prep_date(),
        'name' : prep_string(),
        'datacoverage' : prep_decimal(),
        'id' : prep_decimal(),
        'name' : prep_string(),
        'elevationUnit' : prep_decimal(),
        'longitude' : prep_decimal()
    })

get_stations_for_zip()
Upvotes: 0
Views: 253
Reputation: 645
The solution required two changes:
1. The function was referenced as get_stations_for_zip(), but it needed to be get_stations_for_zip, without parentheses.
2. The get_stations_for_zip function needed to take "df" (for dataframe) as an argument, so: def get_stations_for_zip(df): Strangely, this argument is never used within the function, but it's necessary, and the blog I was referencing shows the same.

Here's a quote from help.tableau.com's article Use Python scripts in your flow:
When you create your script, include a function that specifies a pandas (pd.DataFrame) as an argument of the function. This will call your data from Tableau Prep Builder.
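As a rough sketch of what the two changes look like together, here is a trimmed-down version of the function that runs offline. The SAMPLE_RESPONSE payload and the fetch_stations helper are hypothetical stand-ins for the real requests.get call to NOAA, and the station values are made up. The get_output_schema function is left out because the prep_* helpers only exist on a TabPy server.

```python
import pandas as pd

# Hypothetical payload shaped like the NOAA stations response.
# The values are invented for illustration only.
SAMPLE_RESPONSE = {
    "results": [
        {"id": "STATION:A", "name": "Example Station A", "elevation": 10.0},
        {"id": "STATION:B", "name": "Example Station B", "elevation": 25.5},
    ]
}


def fetch_stations():
    """Stand-in for the requests.get(...) call so the sketch runs offline."""
    return SAMPLE_RESPONSE


def get_stations_for_zip(df):
    # Tableau Prep passes the flow's input as `df`. It is unused here,
    # but the parameter has to be part of the signature.
    data = fetch_stations()
    if "results" not in data:
        # Return an empty frame instead of an unbound variable.
        return pd.DataFrame()
    stations = pd.DataFrame(data["results"])
    return stations.astype({"name": "str"})


# Called by hand here only to show the result. In Tableau Prep, give the
# function name as get_stations_for_zip (no parentheses) and let Prep call it.
print(get_stations_for_zip(pd.DataFrame()))
```

The key point is that Tableau Prep receives the function itself and calls it with a DataFrame, so the script only has to define the function with a df parameter.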
Upvotes: 1
Reputation: 1032
This line is wrong:
execution_result = get_stations_for_zip()(pd.DataFrame(_arg1))
because get_stations_for_zip returns a DataFrame, and you are treating that return value as a Python function. In effect, you are doing this:
df = get_stations_for_zip()
df(pd.DataFrame(_arg1)) # and error is right here
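The same mechanism can be reproduced without pandas or TabPy. In this minimal sketch, get_stations is a hypothetical stand-in for get_stations_for_zip, and a plain dict stands in for the DataFrame it returns, so the error message names dict rather than DataFrame.

```python
def get_stations():
    # Hypothetical stand-in: returns data, not a function.
    # A plain dict plays the role of the DataFrame here.
    return {"id": ["STATION:A"]}


try:
    # Equivalent in shape to get_stations_for_zip()(pd.DataFrame(_arg1)):
    # the first pair of parentheses calls the function, and the second
    # pair tries to call whatever it returned.
    get_stations()(["input rows"])
except TypeError as exc:
    message = str(exc)

print(message)
```

Passing the bare name (get_stations) instead of the call (get_stations()) hands over the function itself, which is what the caller expects to invoke.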
Upvotes: 0