Reputation: 85
My objective is to read multiple small .txt source files in a folder, then copy the lines selected by my criteria into one output .txt file. I can do this with one source file, but the output is empty when I try to read multiple files and do the same.
Based on my research on SO, I wrote the following code (it produces no output):
import glob
# import re --- taken out as 'overkill'
path = 'C:/Doc/version 1/Input*.txt'  # read source files in this folder with this name format
list_of_files = glob.glob(path)
criteria = ['AB', 'CD', 'EF']  # select lines that start with criteria
# list_of_files = glob.glob('./Input*.txt')
with open("P_out.txt", "a") as f_out:
    for fileName in list_of_files:
        data_list = open(fileName, "r").readlines()
        for line in data_list:
            for letter in criteria:
                if line.startswith(letter):
                    f_out.write('{}\n'.format(line))
Thank you for your help.
@abe and @ppperry: I'd like to particularly thank you for your earlier input.
Upvotes: 1
Views: 1042
Reputation: 504
The errors:
Here is your code fixed, with comments:
import glob
import re

# path = r'C:\Doc\version 1\Output*.txt'  # read all source files with this name format (raw string, so \v is not an escape)
# files = glob.glob(path)
criteria = ['AB', 'CD', 'EF']  # select lines that start with criteria
list_of_files = glob.glob('./Output*.txt')
with open("P_out.txt", "a") as f_out:  # use "a" so you can keep the data from the last Output.txt
    for fileName in list_of_files:
        with open(fileName, "r") as f_in:  # 'with' closes each source file too
            data_list = f_in.readlines()
        # indenting the below will allow you to search through all files.
        for line in data_list:  # Search data_list, not fileName
            for letter in criteria:
                if re.match(letter, line):  # re.match only matches at the start of the line
                    # I recommend the \n so that the text does not get concatenated when moving from file to file.
                    # Strip the line's own newline first so the output has no blank lines.
                    f_out.write('{}\n'.format(line.rstrip('\n')))
# f_out.close() is not needed: the 'with' construction closes the file.
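On the \n comment in the code: readlines() keeps the line terminator on every line that has one, so whether you need to add \n on write depends on the line. Below is a minimal sketch of that behaviour. It uses io.StringIO as a stand-in for a real file, and the sample text is made up for illustration.

```python
import io

# io.StringIO stands in for a real file so the sketch runs anywhere.
# The sample text is hypothetical; its last line has no trailing newline.
source = io.StringIO("AB first\nCD second")

# readlines() keeps the "\n" on each line that had one.
lines = source.readlines()
print(lines)

# Adding "\n" unconditionally would double the newline on "AB first",
# so strip any existing terminator and then add exactly one.
normalised = [line.rstrip("\n") + "\n" for line in lines]
print(normalised)
```

Only the last line of a file can be missing its newline, which is the case where text from two files would otherwise run together.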
For those who downvoted, why not include a comment to justify your judgment? Everything the OP requested has been satisfied. If you think you can further improve the answer, suggest an edit. Thank you.
Upvotes: -1
Reputation: 3814
Problems with your code:
- You define both files and list_of_files, but only use the latter.
- Each pass through the loop overwrites data_list, which erases the contents of the previous file read.
- You search through fileName instead of data_list!
Places that could use simplification:
- The re module is overkill for just finding out whether a string starts with another string. You can use line.startswith(letter).
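To illustrate the startswith simplification, here is a minimal sketch rather than a definitive fix. It assumes the prefixes and file pattern from the question; the function name copy_matching_lines and its parameters are made up for this example. str.startswith also accepts a tuple of prefixes, so the inner loop over the criteria can go away.

```python
import glob


def copy_matching_lines(pattern, out_path, prefixes):
    """Append each line that starts with one of `prefixes` to `out_path`.

    `pattern` is a glob pattern for the source files.
    Returns the number of lines written.
    """
    written = 0
    with open(out_path, "a") as f_out:
        # sorted() gives a predictable file order; glob itself promises none.
        for file_name in sorted(glob.glob(pattern)):
            with open(file_name, "r") as f_in:
                for line in f_in:
                    # startswith accepts a tuple, so one call checks every prefix.
                    if line.startswith(tuple(prefixes)):
                        # Write exactly one newline per line, even for a
                        # final line that had none in the source file.
                        f_out.write(line.rstrip("\n") + "\n")
                        written += 1
    return written
```

It could be called as copy_matching_lines('./Input*.txt', 'P_out.txt', ['AB', 'CD', 'EF']). If it returns 0, check whether glob.glob(pattern) found any files at all, since an empty list also gives empty output without any error.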
Upvotes: 2