Reputation: 20342
Is anyone here familiar with Google Cloud Functions? I read their documentation and based on that, I customized my script to try to work in their hosted environment.
https://cloud.google.com/functions/docs/concepts/python-runtime
So, my Python script looks like this.
def main():
    import requests
    import numpy
    import pandas
    import datetime
    import pandas_gbq
    import xml.etree.ElementTree
    # authentication: working....
    login = 'my_email'
    password = 'my_password'
    AsOfDate = datetime.datetime.today().strftime('%m-%d-%Y')
    # step into URL
    REQUEST_URL = 'https://www.business.com/report-api/device=779142&rdate=Yesterday'
    response = requests.get(REQUEST_URL, auth=(login, password))
    xml_data = response.text.encode('utf-8', 'ignore')
    #tree = etree.parse(xml_data)
    root = xml.etree.ElementTree.fromstring(xml_data)
    # start collecting root elements and headers for data frame 1
    desc = root.get("Description")
    frm = root.get("From")
    thru = root.get("Thru")
    loc = root.get("locations")
    loc = loc[:-1]
    df1 = pandas.DataFrame([['From:',frm],['Through:',thru],['Location:',loc]])
    df1.columns = ['S','Analytics']
    #print(df1)
    # start getting the analytics for data frame 2
    data=[['Goal:',root[0][0].text],['Actual:',root[0][1].text],['Compliant:',root[0][2].text],['Errors:',root[0][3].text],['Checks:',root[0][4].text]]
    df2 = pandas.DataFrame(data)
    df2.columns = ['S','Analytics']
    #print(df2)
    # merge data frame 1 with data frame 2
    df3 = df1.append(df2, ignore_index=True)
    #print(df3)
    # append description and today's date onto data frame
    df3['Description'] = desc
    df3['AsOfDate'] = AsOfDate
    # push from data frame, where data has been transformed, into Google BQ
    pandas_gbq.to_gbq(df3, 'Metrics', 'analytics', chunksize=None, reauth=False, if_exists='append', private_key=None, auth_local_webserver=False, table_schema=None, location=None, progress_bar=True, verbose=None)
    print('Execute Query, Done!!')

if __name__ == '__main__':
    main()
Also, my requirements.txt looks like this.
requests
numpy
pandas
datetime
requests
pandas_gbq
xml.etree.ElementTree
My script has been working fine for the past 2+ months, but I need to run it on my laptop each day. To get away from this manual process, I am trying to get this running on the cloud. The problem is that I keep getting an error message that reads: TypeError: main() takes 0 positional arguments but 1 was given
To me, it looks like no arguments are given and no arguments are expected, but somehow Google is saying 1 argument is given. Can I modify my code slightly to get this to work, or somehow bypass this seemingly benign error? Thanks.
Upvotes: 2
Views: 1971
Reputation: 81454
The following takes your code and changes it to run in Google Cloud Functions using an HTTP trigger. You can then use Google Cloud Scheduler to call your function on a schedule. You will also need to create a requirements.txt with the modules that you need to import. See this document for more information.
def handler(request):
    import requests
    import numpy
    import pandas
    import datetime
    import pandas_gbq
    import xml.etree.ElementTree
    # authentication: working....
    login = 'my_email'
    password = 'my_password'
    AsOfDate = datetime.datetime.today().strftime('%m-%d-%Y')
    # step into URL
    REQUEST_URL = 'https://www.business.com/report-api/device=779142&rdate=Yesterday'
    response = requests.get(REQUEST_URL, auth=(login, password))
    xml_data = response.text.encode('utf-8', 'ignore')
    #tree = etree.parse(xml_data)
    root = xml.etree.ElementTree.fromstring(xml_data)
    # start collecting root elements and headers for data frame 1
    desc = root.get("Description")
    frm = root.get("From")
    thru = root.get("Thru")
    loc = root.get("locations")
    loc = loc[:-1]
    df1 = pandas.DataFrame([['From:',frm],['Through:',thru],['Location:',loc]])
    df1.columns = ['S','Analytics']
    #print(df1)
    # start getting the analytics for data frame 2
    data=[['Goal:',root[0][0].text],['Actual:',root[0][1].text],['Compliant:',root[0][2].text],['Errors:',root[0][3].text],['Checks:',root[0][4].text]]
    df2 = pandas.DataFrame(data)
    df2.columns = ['S','Analytics']
    #print(df2)
    # merge data frame 1 with data frame 2
    df3 = df1.append(df2, ignore_index=True)
    #print(df3)
    # append description and today's date onto data frame
    df3['Description'] = desc
    df3['AsOfDate'] = AsOfDate
    # push from data frame, where data has been transformed, into Google BQ
    pandas_gbq.to_gbq(df3, 'Metrics', 'analytics', chunksize=None, reauth=False, if_exists='append', private_key=None, auth_local_webserver=False, table_schema=None, location=None, progress_bar=True, verbose=None)
    # print('Execute Query, Done!!')
    # Normally for an HTTP trigger you would return a full HTML page here
    # <html><head></head><body>you get the idea</body></html>
    return 'Execute Query, Done!!'
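As a sketch of what that requirements.txt might contain: only pip-installable packages belong in it. datetime and xml.etree.ElementTree are part of the Python standard library and should not be listed (pip has no such packages to install), and each package only needs to appear once:

```
numpy
pandas
pandas-gbq
requests
```

Note the pip distribution name is pandas-gbq, even though the module is imported as pandas_gbq.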
Upvotes: 2
Reputation: 317808
You're misunderstanding how Cloud Functions works. It doesn't let you simply run arbitrary scripts. You write triggers that respond to HTTP requests, or when something changes in your Cloud project. That doesn't seem to be what you're doing here. Cloud Functions deployments don't use main().
You might want to read the overview documentation to get an understanding of what Cloud Functions is used for.
If you're trying to run something periodically, consider writing an HTTP trigger and have that invoked by some cron-like service at the rate you want.
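To make the error itself concrete: a Python HTTP-triggered Cloud Function is always invoked with one argument (a Flask request object), so a zero-argument main() raises exactly the TypeError you saw. A minimal sketch of a valid entry point (the name handler is arbitrary; you tell Cloud Functions which function to call when you deploy):

```python
def handler(request):
    # Cloud Functions passes a Flask request object here; if the function
    # takes no input, the argument can simply be ignored. A cron-like
    # service (e.g. Cloud Scheduler) can then hit the trigger URL daily.
    return 'OK'
```
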
Upvotes: 1