Allison Wilson

Reputation: 250

Python script not iterating through array

I recently started learning Python. At work we wanted an easier way to find specific keywords in our log files, so that we can tell which IPs to add to our block list.

I decided to write a Python script that takes in a log file and a file with a list of key terms, looks for those key terms in the log file, and then writes every line whose session ID matches a line where a key term was found to a new file.

import sys
import time
import linecache
from datetime import datetime

def timeStamped(fname, fmt='%Y-%m-%d-%H-%M-%S_{fname}'):
    return datetime.now().strftime(fmt).format(fname=fname)

importFile = open('rawLog.txt', 'r') #pulling in log file
importFile2 = open('keyWords.txt', 'r') #pulling in keywords
exportFile = open(timeStamped('ParsedLog.txt'), 'w') #writing the parsed log

FILE = importFile.readlines()
keyFILE = importFile2.readlines()

logLine = 1  #for debugging purposes when testing
parseString = '' 
holderString = ''
sessionID = []
keyWords= []
j = 0

for line in keyFILE: #go through each line in the keyFile 
        keyWords = line.split(',') #add each word to the array

print(keyWords)#for debugging purposes when testing, this DOES give all the correct results


for line in FILE:
        if keyWords[j] in line:
                parseString = line[29:35] #pulling in session ID
                sessionID.append(parseString) #saving session IDs to a list
        elif importFile == '' and j < len(keyWords):  #if importFile is at end of file and we are not at the end of the array
                importFile.seek(0) #goes back to the start of the file
                j+=1        #advance the keyWords array

        logLine +=1 #for debugging purposes when testing
importFile2.close()              
print(sessionID) #for debugging purposes when testing



importFile.seek(0) #goes back to the start of the file


i = 0
for line in FILE:
        if sessionID[i] in line[29:35]: #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs
                holderString = line #pulling the line of log file
                exportFile.write(holderString)#writing the log file line to a new text file
                print(holderString) #for debugging purposes when testing
                if i < len(sessionID):
                    i+=1

importFile.close()
exportFile.close()

It is not iterating across my keyWords list. I probably made some rookie mistake, but I am not experienced enough to spot what I messed up. When I check the output, it only searches rawLog.txt for the first item in the keyWords list.

The third loop does return the lines for the session IDs that the second loop pulls, and it does attempt to iterate (this gives an out-of-bounds exception because sessionID only has one value, so i runs past the end of the list).

The program does name and write the new log file successfully, with a DateTime followed by ParsedLog.txt.

Upvotes: 1

Views: 85

Answers (2)

Padraic Cunningham

Reputation: 180391

If the elif is never True, you never increase j, so you either need to increment it on every pass or check whether the elif condition ever actually evaluates to True:

for line in FILE:
    if keyWords[j] in line:
        parseString = line[29:35] #pulling in session ID
        sessionID.append(parseString) #saving session IDs to a list
    elif importFile == '' and j < len(keyWords):  #if importFile is at end of file and we are not at the end of the array
        importFile.seek(0) #goes back to the start of the file
    j+=1     # always increase

Looking at the above loop: you create the file object earlier in your code with importFile = open('rawLog.txt', 'r'), so the comparison importFile == '' will never be True, because importFile is a file object, not a string.
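A quick way to check this for yourself, as a minimal sketch: io.StringIO stands in for the opened log file here (an assumption, so the snippet runs without rawLog.txt), and the sample text is made up.

```python
import io

# io.StringIO stands in for open('rawLog.txt'); the content is made-up sample text.
import_file = io.StringIO("first line\nsecond line\n")

# A file object does not compare equal to a string, not even an empty one.
file_equals_empty = (import_file == '')

import_file.read()  # consume the whole file

# At end of file it is read() that returns '', not the file object itself.
at_eof = (import_file.read() == '')

print(file_equals_empty, at_eof)
```

So if you ever do need an end-of-file check, test what read() or readline() returns rather than the file object.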

You assign FILE = importFile.readlines(), which exhausts the file iterator while building the FILE list. You then call importFile.seek(0), but you never actually use the file object again.
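The difference between the file object and the list it produced can be sketched like this. Again io.StringIO is only a stand-in for the real file, and the three lines are invented.

```python
import io

# Hypothetical stand-in for the log file.
log_file = io.StringIO("a\nb\nc\n")

lines = log_file.readlines()     # reads up to the end of the file
leftover = log_file.readlines()  # nothing left to read, so this is an empty list

log_file.seek(0)                 # rewinds the file object only
reread = log_file.readlines()    # the file can now be read again

# The list itself needs no rewinding: it can be looped over as often as you like.
passes = [len(lines) for _ in range(2)]
```

In other words, seek(0) only matters if you read from the file object again; looping over the list a second time works without it.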

So basically you loop over FILE once, j never increases, and your code then moves on to the next block.

What you actually need are nested loops: use any to check whether any word from keyWords is in each line, and forget about your elif:

for line in FILE:
    if any(word in line for word in keyWords):
        parseString = line[29:35] #pulling in session ID
        sessionID.append(parseString) #saving session IDs to a list

The same logic applies to your next loop:

for line in FILE:
    if any(sess in line[29:35] for sess in sessionID): #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs)
        exportFile.write(line) #writing the log file line to a new text file

holderString = line does nothing except create another reference to the same object as line, so you can simply call exportFile.write(line) and forget the assignment.

On a side note, use lowercase and underscores for variable names, e.g. holderString -> holder_string. Using with to open your files would also be best, as it closes them for you:

with open('rawLog.txt') as import_file:
    log_lines = import_file.readlines()

I also changed FILE to log_lines; more descriptive names make your code easier to follow.
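Putting the pieces together, one possible shape for the whole thing is sketched below. The function names, the make_line helper and the sample data are all hypothetical. The sketch also makes two changes that are my assumptions rather than something your script does: keywords are accumulated across every line of the keyword file and stripped of whitespace (your loop reassigns keyWords on each line and leaves the trailing newline on the last word), and session IDs are kept in a set and compared for equality instead of with a substring test.

```python
# Columns holding the session ID, taken from line[29:35] in the question.
SESSION_SLICE = slice(29, 35)

def parse_keywords(key_lines):
    """Collect comma-separated keywords from every line, skipping blanks."""
    words = []
    for line in key_lines:
        words.extend(w.strip() for w in line.split(',') if w.strip())
    return words

def collect_session_ids(log_lines, key_words):
    """Return the set of session IDs of lines that contain any keyword."""
    return {line[SESSION_SLICE] for line in log_lines
            if any(word in line for word in key_words)}

def select_lines(log_lines, session_ids):
    """Return every line whose session ID is in session_ids (exact match)."""
    return [line for line in log_lines if line[SESSION_SLICE] in session_ids]

def make_line(stamp, sid, msg):
    """Hypothetical helper: pad the prefix so the ID starts at column 29."""
    return f"{stamp:<29}{sid} {msg}\n"

# Made-up sample log; in the real script these come from readlines().
log_lines = [
    make_line("2015-06-01 12:00:00 host01", "AAA111", "login failed"),
    make_line("2015-06-01 12:00:01 host01", "BBB222", "session opened"),
    make_line("2015-06-01 12:00:02 host01", "AAA111", "src=10.0.0.5"),
    make_line("2015-06-01 12:00:03 host01", "CCC333", "access denied"),
]

key_words = parse_keywords(["failed, denied\n"])
session_ids = collect_session_ids(log_lines, key_words)
parsed = select_lines(log_lines, session_ids)
```

With real files you would fill log_lines and the keyword lines from with open(...) blocks and write parsed out with writelines. The set lookup avoids rescanning the whole ID list for every log line, but it only matches when the six-character slice equals the ID exactly.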

Upvotes: 2

Beth Crane

Reputation: 623

It looks to me like your second loop needs an inner loop instead of an inner if statement, e.g.:

for line in FILE:
    for word in keyWords:
        if word in line:
            parseString = line[29:35] #pulling in session ID
            sessionID.append(parseString) #saving session IDs to a list
            break # Assuming there will only be one keyword per line, else remove this
    logLine +=1 #for debugging purposes when testing
importFile2.close()
print(sessionID) #for debugging purposes when testing

Assuming I have understood correctly, that is.
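For what the break changes in practice, here is a small sketch with an invented line that happens to contain two keywords. The line[:5] slice is only a placeholder for the real line[29:35] session ID.

```python
line = "login failed, access denied"  # made-up line containing two keywords
key_words = ["failed", "denied"]

# With break: stop at the first keyword found, so the ID is stored once.
with_break = []
for word in key_words:
    if word in line:
        with_break.append(line[:5])  # placeholder for line[29:35]
        break

# Without break: the ID is appended once per matching keyword.
without_break = []
for word in key_words:
    if word in line:
        without_break.append(line[:5])

print(with_break, without_break)
```

So keeping the break (or collecting the IDs in a set) avoids storing the same session ID twice for one line.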

Upvotes: 2
