Reputation: 1016
I'm trying to make this script run every 30 minutes. At the moment it only runs once, and I'm a bit confused about why it isn't running more times.
Any idea where I'm going wrong with my code? Basically, this script takes data from an API, as XML, and puts it into a CSV file.
I'm trying to use threading to make it run every so many seconds (running it on PythonAnywhere at the moment), and it will only run once. A little bit confused why that would be!
I have also tried using a while loop; I've put an example of what I tried near the threading code.
from lxml import etree
import urllib.request
import csv
import threading
#Pickle is not needed
#append to list next
def handleLeg(leg):
    # print this leg as text, or save it to file maybe...
    text = etree.tostring(leg, pretty_print=True)
    # also process individual elements of interest here if we want
    tagsOfInterest = ["noTrafficTravelTimeInSeconds", "lengthInMeters", "departureTime", "trafficDelayInSeconds"]  # whatever
    # list to use for data analysis
    global data
    data = []
    # create header dictionary that includes the data to be appended within it. IE, Header = {TrafficDelay[data(0)]...etc
    for child in leg:
        if 'summary' in child.tag:
            for elem in child:
                for item in tagsOfInterest:
                    if item in elem.tag:
                        data.append(elem.text)
def parseXML(xmlFile):
    """While option
    lastTime = time.time() - 600
    while time.time() >= lastTime + 600:
        lastTime += 600"""
    # Parse the xml
    threading.Timer(5.0, parseXML).start()
    with urllib.request.urlopen("https://api.tomtom.com/routing/1/calculateRoute/-37.79205923474775,145.03010268799338:-37.798883995180496,145.03040309540322:-37.807106781970354,145.02895470253526:-37.80320743019992,145.01021142594075:-37.7999012967757,144.99318476311566:?routeType=shortest&key=xxx&computeTravelTimeFor=all") as fobj:
        xml = fobj.read()
    root = etree.fromstring(xml)
    for child in root:
        if 'route' in child.tag:
            handleLeg(child)
    # Write CSV file
    with open('datafile.csv', 'w') as fp:
        writer = csv.writer(fp, delimiter=' ')
        # writer.writerow(["your", "header", "foo"])  # write header
        writer.writerows(data)
    """for elem in child:
        if 'leg' in elem.tag:
            handleLeg(elem)
    """
if __name__ == "__main__":
    parseXML("xmlFile")
    with open('datafile.csv', 'r') as fp:
        reader = csv.reader(fp, quotechar='"')
        # next(reader, None)  # skip the headers
        data_read = [row for row in reader]
    print(data_read)
Upvotes: 2
Views: 877
Reputation: 12205
How do you know it runs only once? Have you debugged it, or do you just expect to have the correct result when the code reaches this part?

with open('datafile.csv', 'r') as fp:
    ....

And in general, what do you expect to happen, and when is your program supposed to enter this part? I cannot tell you how to fix this without knowing what you want it to do, but I think I know where your problem is.
This is what your program does. I will call the main thread M:

- M: if __main__ matches and parseXML is called
- M: parseXML launches a new thread, which we call T1, with threading.Timer()
- M: parseXML finishes and with open... is reached
- T1: sleep(5), then parseXML runs and launches T2 with a new timer
- T2: sleep(5), then parseXML runs and launches T3, and so on
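The hand-off above can be reproduced with a minimal self-contained sketch (the names and the three-call cutoff are mine, just so the example terminates; the real script has no cutoff, which is exactly why its timer threads never stop):

```python
import threading
import time

results = []

def tick(n=0):
    # each call schedules the next call in a brand-new thread,
    # just like threading.Timer(5.0, parseXML) in the question
    if n < 2:
        threading.Timer(0.1, tick, args=(n + 1,)).start()
    results.append(n)

tick()
# the "main program" is already finished here, but the timer
# threads keep running and appending in the background
time.sleep(1)
print(results)
```

By the time the later ticks run, the main flow of the program has long since moved on, which is what happens to the CSV-reading code in the question's __main__ block.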
The way your program is built, parseXML (probably; I'm not able to run your code, but it looks about right) does launch a delayed copy of itself in a new thread, but your main program that handles the results has already exited and does not read your datafile.csv anymore after a new timed thread has modified it.
You can verify this by setting daemon=True on your threads (meaning the threads will exit as soon as your main program exits). Now your program does not "hang": it displays results after the first iteration of parseXML and immediately kills the timed thread:
#Parse the xml
_t = threading.Timer(5.0, parseXML)
_t.daemon = True
_t.start()
with urllib.request.urlopen(....)
Do you really need threads here at all? Or could you just move the datafile.csv processing and display into parseXML, put a while True loop there, and sleep 5 seconds between iterations?
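A sketch of that loop-based version (the function names and dummy CSV contents are mine, and the loop is bounded to three runs here so the example terminates; the real script would use while True and a 1800-second sleep for every 30 minutes):

```python
import csv
import time

def parse_and_store(run):
    # stand-in for parseXML(): fetch the API, extract the fields,
    # and write them to datafile.csv (dummy data here)
    with open('datafile.csv', 'w', newline='') as fp:
        csv.writer(fp, delimiter=' ').writerow(['run', run])

def show_results():
    # the display step that used to live in the __main__ block
    with open('datafile.csv', 'r') as fp:
        return [row for row in csv.reader(fp, delimiter=' ')]

for run in range(3):      # 'while True:' in the real script
    parse_and_store(run)
    print(show_results())
    time.sleep(0.1)       # time.sleep(1800) to run every 30 minutes
```

Everything happens in one thread, so no locks are needed and the results are displayed after every iteration.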
Another possibility is to move the data reader part to another thread that would sleep N seconds and then run the reader. BUT in this case you will need locks: if you process the same file in different threads, eventually the unexpected will happen, and your writer will have written only part of the file when the reader decides to read it. Your parser will then most likely crash with a syntax error. To avoid this, create a global lock and use it to protect your file read and write operations:
foo = threading.Lock()
....
....
with foo:
    with open(...) as fp:
        ....
Now your file operations stay atomic.
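As a concrete sketch of that pattern (the names csv_lock, write_rows and read_rows are mine; the answer's snippet calls the lock foo):

```python
import csv
import threading

csv_lock = threading.Lock()  # one lock shared by every thread touching the file

def write_rows(rows):
    # writer side: the lock is held for the whole write...
    with csv_lock:
        with open('datafile.csv', 'w', newline='') as fp:
            csv.writer(fp, delimiter=' ').writerows(rows)

def read_rows():
    # ...so the reader can never observe a half-written file
    with csv_lock:
        with open('datafile.csv', 'r') as fp:
            return [row for row in csv.reader(fp, delimiter=' ')]
```

The important part is that both the writer thread and the reader thread go through the same lock object; two separate locks would protect nothing.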
Sorry about the lengthy explanation, hope this helps.
Upvotes: 3