uhurulol

Reputation: 345

Using Python to search multiple text files for matches to a list of strings

I am starting from scratch on a program that I haven't really seen replicated anywhere else. I'll describe exactly what I want it to do:

I have a list of strings that looks like this:

12482-2958
02274+2482
23381-3857
..........

I want to take each of these strings and search through a few dozen files (all named wds000.dat, wds005.dat, wds010.dat, etc.) for matches. If a string is found in any of the files, I want to write that string to a new file, so that in the end I have a list of the strings that had matches.

If I need to be more clear about something, please let me know. Any help on where to start with this would be much appreciated. Thanks guys and gals!

Upvotes: 3

Views: 6264

Answers (3)

HackerShark

Reputation: 211

Something like this should work:

import os

# The strings to search for
search_strings = ["12482-2958", "02274+2482", "23381-3857"]

path = os.path.expanduser("path/to/myfile")
newpath = os.path.expanduser("path/to/myResultsFile")
filename = 'matches.data'

with open(os.path.join(newpath, filename), "w") as newf:
    # Loop through every string in the list above
    for element in search_strings:
        # List the directory that holds all of your .dat files
        for f in os.listdir(path):
            if f.strip().endswith(".dat"):
                # Text mode, so each line is a str and can be compared on Python 3
                with open(os.path.join(path, f), 'r') as openfile:
                    # Loop through every line in the file comparing the strings
                    for line in openfile:
                        if element in line:
                            # Writes the whole matching line; use
                            # newf.write(element + "\n") to record only the string
                            newf.write(line)

Upvotes: 5

user5507598

Reputation:

Not so Pythonic, and it probably needs some straightening out, but this is pretty much the logic to follow:

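from glob import glob

strings = ['12482-2958',...] # your strings (replace the ... with the rest of your list)
output = []
# The files in the question are named wds000.dat, wds005.dat, ...
for file in glob('wds*.dat'):
    # Text mode, so each line is a str and can be compared with the strings
    with open(file, 'r') as f:
        for line in f:
            for subs in strings:
                if subs in line:
                    output.append(line)
print(output)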

Upvotes: 0

Neo

Reputation: 3786

Define a function that takes a path and a string and checks for a match.
You can use open(), find(), and close(). Then build all the paths in a for loop and, for every path, check all the strings with the function and write to the output file if needed.

That's not much explanation. Do you need more detail?
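
For what it's worth, here is a minimal sketch of the steps described above, not a definitive implementation. The function names (file_contains, find_matches, write_matches), the "wds*.dat" glob pattern, and the errors="replace" setting are my own assumptions for illustration; a with block stands in for the explicit close().

```python
import glob
import os


def file_contains(path, needle):
    """Return True if the text of the file at `path` contains `needle`."""
    # errors="replace" is an assumption, so stray bytes do not abort the scan
    with open(path, "r", errors="replace") as handle:
        return handle.read().find(needle) != -1


def find_matches(strings, data_dir, pattern="wds*.dat"):
    """Return the strings (in input order) found in at least one data file."""
    # Build all the paths once; the pattern is assumed from the file names
    # given in the question (wds000.dat, wds005.dat, ...)
    paths = sorted(glob.glob(os.path.join(data_dir, pattern)))
    return [s for s in strings if any(file_contains(p, s) for p in paths)]


def write_matches(strings, data_dir, out_path):
    """Write each matched string on its own line to `out_path`."""
    matched = find_matches(strings, data_dir)
    with open(out_path, "w") as out:
        for s in matched:
            out.write(s + "\n")
    return matched
```

You would call it with something like write_matches(["12482-2958", "02274+2482"], "path/to/myfile", "matches.txt"). Note that this rereads each file once per string, which should be fine for a few dozen files but may be worth restructuring if the list of strings is very long.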

Upvotes: 1
