Yuandong

Reputation: 67

Python script stops for no reason after one month (no error message)

I am running a script that extracts data from a JSON file which is updated every 1 to 2 minutes. The basic concept is that the script executes the extraction procedure, sleeps for 1 minute, then executes the extraction procedure again, in an infinite loop.

It worked fine for more than one month, then stopped suddenly one day without any error message. I restarted it and it worked fine again, but after some days it stopped once more for no apparent reason.

I have no idea what the problem is, so I can only provide my script. Below is the Python file I wrote.

    from requests.auth import HTTPBasicAuth
    import sys
    import requests
    import re
    import time
    import datetime
    import json

    from CSVFileGen1 import csv_files_generator1
    from CSVFileGen2 import csv_files_generator2
    from CSVFileGen3 import csv_files_generator3
    from CSVFileGen4 import csv_files_generator4

    def passpara():
            current_time = datetime.datetime.now()
            current_time_string = current_time.strftime('%Y-%m-%d %H:%M:%S')
            sys.path.append('C:\\semester3\\data_copy\\WAZE\\output_scripts\\TNtool')
            FileLocation1 = 'C:\\semester3\\data_copy\\www\\output\\test1'
            FileLocation2 = 'C:\\semester3\\data_copy\\www\\output\\test2'
            FileLocation3 = 'C:\\semester3\\data_copy\\www\\output\\test3'
            FileLocation4 = 'C:\\semester3\\data_copy\\www\\output\\test4'
            try:
                    r1 = requests.get('https://www...=JSON')
                    json_text_no_lines1 = r1.text
                    csv_files_generator1(current_time, json_text_no_lines1, FileLocation1)
            except requests.exceptions.RequestException as e:
                    print 'request1 error'
                    print e
            try:
                    r2 = requests.get('https://www...=JSON')
                    json_text_no_lines2 = r2.text
                    csv_files_generator2(current_time, json_text_no_lines2, FileLocation2)
            except requests.exceptions.RequestException as e:
                    print 'request2 error'
                    print e
            try:
                    r3 = requests.get('https://www...=JSON')
                    json_text_no_lines3 = r3.text
                    csv_files_generator3(current_time, json_text_no_lines3, FileLocation3)
            except requests.exceptions.RequestException as e:
                    print 'request3 error'
                    print e
            try:
                    r4 = requests.get('https://www...JSON')
                    json_text_no_lines4 = r4.text
                    csv_files_generator4(current_time, json_text_no_lines4, FileLocation4)
            except requests.exceptions.RequestException as e:
                    print 'request4 error'
                    print e
            print current_time_string + ' Data Operated. '   
    while True:
        passpara()
        time.sleep(60)

Here is CSVFileGen1, which the first script calls. It parses the JSON text and saves the information to a CSV file.

import json
import datetime
import time
import os.path
import sys
from datetime import datetime
from dateutil import tz


def meter_per_second_2_mile_per_hour(input_meter_per_second):
    return input_meter_per_second * 2.23694

def csv_files_generator1(input_datetime, input_string, target_directory):

        try:
                real_json = json.loads(input_string)
                #get updatetime string
                updatetime_epoch = real_json['updateTime']
                update_time = datetime.fromtimestamp(updatetime_epoch/1000)
                updatetime_string = update_time.strftime('%Y%m%d%H%M%S')
                file_name = update_time.strftime('%Y%m%d%H%M')
                dir_name = update_time.strftime('%Y%m%d')
                if not os.path.exists(target_directory + '\\' + dir_name):
                    os.makedirs(target_directory + '\\' + dir_name)
                if not os.path.isfile(target_directory + '\\' + dir_name + '\\' + file_name):
                        # ... (detailed code omitted for simplicity)
        except ValueError, e:
                print e
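The `updateTime` handling above converts a millisecond epoch into a local datetime before formatting it. In isolation, with a hypothetical example value, the conversion looks like this:

```python
from datetime import datetime

# Hypothetical millisecond epoch, as the feed's 'updateTime' field would return
updatetime_epoch = 1498867200000
update_time = datetime.fromtimestamp(updatetime_epoch / 1000)  # ms -> seconds
print(update_time.strftime('%Y%m%d%H%M%S'))  # 14-digit timestamp, local time
```

Note that `fromtimestamp` uses the local timezone, so the exact digits depend on where the machine runs.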

Upvotes: 1

Views: 1458

Answers (2)

cddt

Reputation: 549

I believe your question has already been answered regarding why your script may fail, so I won't duplicate that answer.

However, I will offer an alternative solution. Instead of having your script run for days on end, remove the infinite loop and set it up to run every minute with Task Scheduler (Windows) or cron (Linux). This has a couple of immediate benefits:

  1. memory is cleared after each run;
  2. recovery from an unexpected error can happen in 60 seconds, rather than when you see the script has stopped running.
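For example, on Windows a scheduled task can be created from the command line; the task name and script path below are placeholders, not from the question:

```shell
:: Windows: create a Task Scheduler job that runs the script every minute
schtasks /create /sc minute /mo 1 /tn "JsonExtract" /tr "python C:\path\to\script.py"
```

On Linux, the equivalent crontab entry (added via `crontab -e`) would be `* * * * * python /path/to/script.py`.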

Upvotes: 0

Advait

Reputation: 191

At first glance, I think the problem is sys.path growing without bound (as litelite mentioned). You can safely move this block of code outside the function so that it runs only once (appending to sys.path a single time):

sys.path.append('C:\\semester3\\data_copy\\WAZE\\output_scripts\\TNtool')
FileLocation1 = 'C:\\semester3\\data_copy\\www\\output\\test1'
FileLocation2 = 'C:\\semester3\\data_copy\\www\\output\\test2'
FileLocation3 = 'C:\\semester3\\data_copy\\www\\output\\test3'
FileLocation4 = 'C:\\semester3\\data_copy\\www\\output\\test4'

So, your code would look like:

sys.path.append('C:\\semester3\\data_copy\\WAZE\\output_scripts\\TNtool')
FileLocation1 = 'C:\\semester3\\data_copy\\www\\output\\test1'
FileLocation2 = 'C:\\semester3\\data_copy\\www\\output\\test2'
FileLocation3 = 'C:\\semester3\\data_copy\\www\\output\\test3'
FileLocation4 = 'C:\\semester3\\data_copy\\www\\output\\test4'
while True:
    passpara()
    time.sleep(60)

When I tried a program that appends to sys.path in an infinite loop, my RAM was consumed very quickly. You may want to look into the memory usage of your script, as Python may be hanging because it runs out of memory. After a few minutes of running this script, my Chrome window crashed because Python was using around 10 GB of RAM (all the RAM available).

Please note that I did not use a time.sleep(). The results from running it without any pause for a few minutes may reflect what happens when running it every 60 seconds for a month.

My program is as follows:

import sys
while True:
    sys.path.append("C:\\semester3\\data_copy\\WAZE\\output_scripts\\TNtool")

Interesting note: simply incrementing a variable in a while loop does not rapidly consume large amounts of RAM, mainly because the variable is overwritten on each iteration and takes no extra memory. In your case, sys.path is a list, and appending to it indefinitely uses ever more RAM. Example program:

count = 0
while True:
    count += 1

On the other hand, appending to a list in an infinite loop consumes RAM heavily, which is to be expected:

count = []
while True:
    count.append(1)
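To observe the growth without exhausting RAM, you can bound the loop and measure the list's own footprint with sys.getsizeof. This is a minimal sketch; the path string is a stand-in, and sys.getsizeof counts only the list object (its array of pointers), not the strings it references:

```python
import sys

paths = ['C:\\example\\path']          # stand-in for sys.path
size_before = sys.getsizeof(paths)     # bytes used by the list object itself
for _ in range(100000):
    paths.append('C:\\example\\path')  # the same append the looping script performs
size_after = sys.getsizeof(paths)
print(size_after > size_before)        # -> True: the list's footprint keeps growing
```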

Upvotes: 1
