Reputation: 250
So, I recently got into learning python and at work we wanted some way to make the process of finding specific keywords in our log files easier, to make it easier to tell what IPs to add to our block list.
I decided to go about writing a python script that would take in a logfile, take in a file with a list of key terms, and then look for those key terms in the log file and then write the lines that matched the session IDs where that key term was found; to a new file.
import sys
import time
import linecache
from datetime import datetime
def timeStamped(fname, fmt='%Y-%m-%d-%H-%M-%S_{fname}'):
return datetime.now().strftime(fmt).format(fname=fname)
importFile = open('rawLog.txt', 'r') #pulling in log file
importFile2 = open('keyWords.txt', 'r') #pulling in keywords
exportFile = open(timeStamped('ParsedLog.txt'), 'w') #writing the parsed log
FILE = importFile.readlines()
keyFILE = importFile2.readlines()
logLine = 1 #for debugging purposes when testing
parseString = ''
holderString = ''
sessionID = []
keyWords= []
j = 0
for line in keyFILE: #go through each line in the keyFile
keyWords = line.split(',') #add each word to the array
print(keyWords)#for debugging purposes when testing, this DOES give all the correct results
for line in FILE:
if keyWords[j] in line:
parseString = line[29:35] #pulling in session ID
sessionID.append(parseString) #saving session IDs to a list
elif importFile == '' and j < len(keyWords): #if importFile is at end of file and we are not at the end of the array
importFile.seek(0) #goes back to the start of the file
j+=1 #advance the keyWords array
logLine +=1 #for debugging purposes when testing
importFile2.close()
print(sessionID) #for debugging purposes when testing
importFile.seek(0) #goes back to the start of the file
i = 0
for line in FILE:
if sessionID[i] in line[29:35]: #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs
holderString = line #pulling the line of log file
exportFile.write(holderString)#writing the log file line to a new text file
print(holderString) #for debugging purposes when testing
if i < len(sessionID):
i+=1
importFile.close()
exportFile.close()
It is not iterating across my keyWords list, I probably made some stupid rookie mistake but I am not experienced enough to realize what I messed up. When I check the output it is only searching for the first item in the keyWords list in the rawLog.txt file.
The third loop does return the results that appear based on the sessionIDs that the second list pulls and does attempt to iterate (this gives an out of bounds exception due to i never being less than the length of the sessionID list, due to sessionID only having 1 value).
The program does write to and name the new logfile sucessfully, with a DateTime followed by ParsedLog.txt.
Upvotes: 1
Views: 85
Reputation: 180391
If the elif is never True you never increase j
so you either need to increment always or check that the elif
statement is actually ever evaluating to True
for line in FILE:
if keyWords[j] in line:
parseString = line[29:35] #pulling in session ID
sessionID.append(parseString) #saving session IDs to a list
elif importFile == '' and j < len(keyWords): #if importFile is at end of file and we are not at the end of the array
importFile.seek(0) #goes back to the start of the file
j+=1 # always increase
Looking at the above loop, you create the file object with importFile = open('rawLog.txt', 'r')
earlier in your code so comparing elif importFile == ''
will never be True
as importFile
is a file object not a string.
You assign FILE = importFile.readlines()
so that does exhaust the iterator creating the FILE list, you importFile.seek(0)
but don't actually use the file object anywhere again.
So basically you loop one time over FILE
, j
never increases and your code then moves to the next block.
What you actually need are nested loops, using any
to see if any word from keyWords is in each line and forget about your elif :
for line in FILE:
if any(word in line for word in keyWords):
parseString = line[29:35] #pulling in session ID
sessionID.append(parseString) #saving session IDs to a list
The same logic applies to your next loop:
for line in FILE:
if any(sess in line[29:35] for sess in sessionID ): #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs
exportFile.write(line)#writing the log file line to a new text file
holderString = line
does nothing bar refer to the same object line so you can simply exportFile.write(line)
and forget the assignment.
On a sidenote use lowercase and underscores for variables etc.. holderString -> holder_string
and using with
to open your files would be best as it also closes them for.
with open('rawLog.txt') as import_file:
log_lines = import_file.readlines()
I also changed FILE
to log_lines
, using more descriptive names makes your code easier to follow.
Upvotes: 2
Reputation: 623
It looks to me like your second loop needs an inner loop instead of an inner if statement. E.g.
for line in FILE:
for word in keyWords:
if word in line:
parseString = line[29:35] #pulling in session ID
sessionID.append(parseString) #saving session IDs to a list
break # Assuming there will only be one keyword per line, else remove this
logLine +=1 #for debugging purposes when testing
importFile2.close()
print(sessionID) #for debugging purposes when testing
Assuming I have understood correctly, that is.
Upvotes: 2