searching for an array of keywords in an array of files

Question

inputs=[]
def pickinputs():
    search_chars=['WaveDir', 'WaveHs', 'WaveTp', 'WtrDpth', 'Number of mooring lines', 'Reference height', 'Reference wind speed', 'Grid height', 'Grid width', 'Analysis time']
    files=[file_platform, file_wind, file_primary]
    m=0
    while True:
        inputfile=open(files[m],'r')
        for i in range(len(search_chars)):
            j=1
            for lines in inputfile:
                if search_chars[i] in lines:
                    line= linecache.getline(files[m], j)
                    line_split = line.split(' ')
                    #print (line_split)
                    for k in range(len(line_split)):
                        if line_split[k]!= "":
                            break
                        val=line_split[k+1]
                    inputs.append(val)
                j=j+1
        m=m+1

The aim is to search the files for each string in search_chars, find the line it appears on in that file, split that line to read the first non-space value (it is a number), and append that value to inputs. I could write this out in a longer way, but I would like to do it efficiently. Each string in search_chars may be present in any one of the files.

Could anyone suggest modifications to the code I wrote to make it work efficiently? Thanks.

Martin Evans · Accepted Answer

You could do something like the following:

inputs = []
search_strings = ['WaveDir', 'WaveHs', 'WaveTp', 'WtrDpth', 'Number of mooring lines', 'Reference height', 'Reference wind speed', 'Grid height', 'Grid width', 'Analysis time']
files = ['input.txt', 'input2.txt']

for filename in files:
    with open(filename) as f_input:
        # Number lines from 1 so they match what a text editor shows
        for line_number, line in enumerate(f_input, start=1):
            for search in search_strings:
                if search in line:
                    # split() with no argument also copes with tabs and repeated spaces
                    first_non_space = line.split()[0]
                    inputs.append((filename, line_number, search, first_non_space))

for filename, line_number, search_matched, first_non_space in inputs:
    print(filename, line_number, search_matched, first_non_space)

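This builds up the inputs list with every match across all of the files you search. Each entry is a tuple holding the filename, the line number, the search string that matched, and the first non-space value on that line. Each file is read only once, and every line is checked against all of the search strings as it is read.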
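If only the values are needed, keyed by keyword, the same loop can be wrapped in a function. The following is a minimal sketch under some assumptions: the function name pick_inputs, the sample file name and the sample lines are hypothetical, and it assumes that the number comes first on each line and that only the first match per keyword should be kept.

```python
import os
import tempfile


def pick_inputs(files, search_strings):
    """Map each keyword to the first whitespace-separated token
    of the first line (across all files) that contains it."""
    found = {}
    for filename in files:
        with open(filename) as f_input:
            for line in f_input:
                for search in search_strings:
                    # Keep only the first match for each keyword
                    if search in line and search not in found:
                        found[search] = line.split()[0]
    return found


# Hypothetical sample data, written to a temporary file for the demo
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "platform.dat")
    with open(path, "w") as f:
        f.write("   0.0   WaveDir  - wave heading (deg)\n")
        f.write("   6.5   WaveHs   - significant wave height (m)\n")
    values = pick_inputs([path], ["WaveDir", "WaveHs", "WtrDpth"])

print(values)
```

Keywords that never appear (WtrDpth in this sample) are simply absent from the result, so the caller can check for missing inputs. The values are returned as strings; convert them with float() if numbers are needed.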
