E.Cross

Reputation: 2137

Find one file out of many containing a desired string in Python

I have a string like 'apples'. I know that it appears in exactly one out of hundreds of files, e.g.:

file1
file2
file3
file4
file5
file6
...
file200

All of these files are in the same directory. What is the best way to find which file contains this string using Python, given that exactly one file contains it?

I have come up with this:

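import os

for file in os.listdir(directory):
    f = open(os.path.join(directory, file))  # listdir returns bare names, so join with the directory
    for line in f:
        if 'apple' in line:  # test the current line, not the file object
            print("FOUND")
    f.close()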

and this:

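import glob, subprocess

# The 'file*' pattern is only expanded by a shell, so expand it with glob first
grep = subprocess.Popen(['grep', '-m1', 'apple'] + glob.glob(directory + '/file*'), stdout=subprocess.PIPE)
found = grep.communicate()[0]
print(found)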

Upvotes: 5

Views: 38467

Answers (5)

Aurelie Giraud

Reputation: 93

Open your terminal and write this:

  • Case-insensitive search (the /* makes grep look at each file in the folder)
grep -i 'apple' /path/to/files/*
  • Recursive search (through all subfolders)
grep -r 'apple' /path/to/files
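
If you would rather stay in Python, as the question asks, the two variations above can be sketched roughly like this. It is only a sketch, not a grep replacement: the function name find_files_containing and its parameters are made up for illustration, it assumes Python 3, and it assumes the files are text (bytes that cannot be decoded are ignored).

```python
import os


def find_files_containing(needle, top, ignore_case=False, recursive=False):
    """Return paths of files under `top` whose text contains `needle`.

    Hypothetical helper: ignore_case mimics grep -i, recursive mimics grep -r.
    """
    if ignore_case:
        needle = needle.lower()
    matches = []
    for root, dirs, files in os.walk(top):
        if not recursive:
            dirs[:] = []  # prune subdirectories so os.walk stays at the top level
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, errors="ignore") as f:
                    for line in f:
                        if ignore_case:
                            line = line.lower()
                        if needle in line:
                            matches.append(path)
                            break  # one hit per file is enough
            except OSError:
                continue  # unreadable file: skip it
    return matches
```

For example, find_files_containing('apple', '/path/to/files', ignore_case=True) would cover the first case, and adding recursive=True the second.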

Upvotes: 0

iruvar

Reputation: 23364

A lazy-evaluation, itertools-based approach (Python 2, since it relies on izip and the print statement):

import os
from itertools import repeat, izip, chain

gen = (file for file in os.listdir("."))
gen = (file for file in gen if os.path.isfile(file) and os.access(file, os.R_OK))
gen = (izip(repeat(file), open(file)) for file in gen)
gen = chain.from_iterable(gen)
gen = (file for file, line in gen if "apple" in line)
gen = set(gen)
for file in gen:
  print file
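
Note that itertools.izip only exists in Python 2. On Python 3 a similar lazy pipeline can be sketched with a generator function instead. The names lines_with_names and first_file_containing are just illustrative, and this version assumes that stopping at the first hit is acceptable because only one file contains the string.

```python
import os


def lines_with_names(directory):
    """Lazily yield (filename, line) pairs for every readable file in directory."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.R_OK):
            # errors="ignore" is an assumption: skip bytes that do not decode
            with open(path, errors="ignore") as f:
                for line in f:
                    yield name, line


def first_file_containing(needle, directory="."):
    """Return the first filename with a line containing needle, or None."""
    hits = (name for name, line in lines_with_names(directory) if needle in line)
    return next(hits, None)  # nothing is read beyond the first matching line
```

Because everything is a generator, no file after the matching one is opened at all.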

Upvotes: 0

Levon

Reputation: 143047

Given that the files are all in the same directory, we can simply list the current directory.

import os

for fname in os.listdir('.'):    # change directory as needed
    if os.path.isfile(fname):    # make sure it's a file, not a subdirectory
        with open(fname) as f:   # open file
            for line in f:       # process line by line
                if 'apples' in line:    # search for string
                    print('found string in file %s' % fname)
                    break

This gets the current directory listing and checks that each entry is a file (not a directory).

It then opens the file and reads it line by line, so the whole file is never held in memory at once, and looks for the target string in each line.

When it finds the target string it prints the name of the file and stops reading that file.

Also, since the files are opened with a with statement, they are closed automatically when we are done (or if an exception occurs).
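
One thing to be aware of: the break above only leaves the inner loop, so the remaining files are still scanned. Since the question says exactly one file contains the string, a variation that returns as soon as it finds a match might look like the sketch below. It assumes Python 3; find_file_with and its directory parameter are names introduced here for illustration, and errors="ignore" is an assumption so that odd bytes do not abort the search.

```python
import os


def find_file_with(target, directory="."):
    """Return the name of the first file in directory containing target, else None."""
    for fname in os.listdir(directory):
        path = os.path.join(directory, fname)  # works for any directory, not just '.'
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, errors="ignore") as f:
            # any() stops reading at the first matching line
            if any(target in line for line in f):
                return fname
    return None
```

Calling find_file_with('apples') searches the current directory, as in the loop above.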

Upvotes: 11

Ben

Reputation: 5087

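import os

for x in os.listdir(path):
    with open(os.path.join(path, x)) as f:  # listdir returns bare names, so join with path
        if 'Apple' in f.read():
            # your work
            break  # stop only once the string has been found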

Upvotes: 2

ninjagecko

Reputation: 91094

For simplicity, this assumes your files are in the current directory:

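import os

def whichFile(query):
    for root, dirs, files in os.walk('.'):
        for file in files:
            # join with root so files found in subdirectories open correctly
            path = os.path.join(root, file)
            with open(path) as f:
                if query in f.read():
                    return path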

Upvotes: 2
