Reputation: 27
I have been on a quest to automate reports by using Windows Task Manager + Python and recently wrote a script that I could use to run the .py file. However, in practice, it does not work. I've spent a good two days looking for an answer and trying different things, but cannot for the life of me figure this out.
The Scenario: I am trying to write a standalone executable python code that connects to a server, pulls data using SQL, and then makes a dataframe/pivot table that saves in a folder. This process is what I want to automate.
My current code sample is below
def module():
sql = "SELECT * FROM TABLE"
conn = connect(host="url.com", port=xxxxxx)
cursor = conn.cursor()
cursor.execute(sql)
data = as_pandas(cursor)
todays_date = datetime.datetime.now().date().strftime("%Y%m%d")
df = data.loc[;,'date','stuff_and_thangs','column1','column2','column3']
earliest_date = df['date'].min()
latest_date = df['date'].max()
df['stuff_and_thangs'] = df['stuff_and_thangs'].astype(float)
df['stuff_and_thangs'] = df['stuff_and_thangs'].round(decimals = 2)
df.sort_values(by = 'date', ascending = 'True')
df.groupby(['date'], sort = True)['stuff_and_thangs','Column1','Column3']
table = pd.pivot_table(df,index=['Column', 'Column1', 'Column3'], columns =['Column'], values =['Column3'], fill_value = 0)
writer = pd.ExcelWriter('C:\User\blahblahblah/ %s to %s.xlsx' % (earliest_date, latest_date))
#Convert the dataframe pivot table to an XlsxWriter Excel object.
table.to_excel(writer, sheet_name='stuff_and_thangs')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
if __name__ == "__main__":
import pandas as pd
import numpy as np
from impala.dbapi import connect
import datetime
from impala.util import as_pandas
module()
Extra details:
Let's say that my file name for the module code is called file.py and I have a batch file for the task scheduler to run set up like so (let's call the batch file "batch.bat"):
start C:\Python27\python.exe file.py
I've coded in Jupyter Notebook and have only now started to try and convert these into executable standalone modules. Are there any best practices I may be missing? I have tried looking into environment variables but, they seem to be set up perfectly fine. The code seems to work fine within the jupyter notebook IDE, but does not work once I make it into a python file.
Upvotes: 0
Views: 1267
Reputation: 51
I have the absolute same problem. My programs starts in via Jupyter or Spyder but does not work properly on the Windows Task Scheduler.
As printing to stdout did not work, I made Python create files which helped me narrowing down the problem. Try this, it helped me figuring out that it was pandas that could not be imported:
t = open("text11.txt", "wb")
t.close()
import re
t = open("text12.txt", "wb")
t.close()
import pandas as pd
t = open("text19.txt", "wb")
t.close()
...
Hope that helps!
Upvotes: 1
Reputation: 566
There are several things that may be hampering your project. Task scheduler may not be running in the user security context you think it is. This depends on whether you selected "Run whether user is logged on or not", whether you select "Run with highest privileges", whether the user is actually logged on, and/or whether the user has the local policy set to allow the user to run batch files.
If you select "Run whether user is logged on or not" or you select "Run with highest privileges", the user context under which it runs will not have access to network resources. I'm only aware of two ways around this. Do not select either of those options and run the task on a machine with a user logged on that has access to the needed network resources or run it on the server supplying those resources to the network and reference them with local references, such as C:, D:, etc for Windows, rather than UNC's which may not be an option in the case of cloud storage.
Scheduled Task Security Context
Upvotes: 0