Reputation: 137
First I tried searching a file for a single word with this code:
import re
shakes = open("tt.txt", "r")
for line in shakes:
    if re.match("(.*)(H|h)appy(.*)", line):
        print line,
But what if I need to check for multiple words? I was thinking that a for loop could work,
searching the file once for each word in the list.
Would that be a practical approach?
Upvotes: 1
Views: 6435
Reputation: 876
Another idea is to use a set.
The code below assumes that all words in your file are separated by spaces and that word_list
is the list of words to look for.
shakes = open("tt.txt", "r")
words = set(word_list)
for line in shakes:
    if words & set(line.split()):
        print line,
If you want to do a case-insensitive search, you can convert each string to lowercase:
shakes = open("tt.txt", "r")
words = set(w.lower() for w in word_list)
for line in shakes:
    if words & set(line.lower().split()):
        print line,
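One limitation of splitting on spaces is that a token such as "happy," will not equal "happy". A minimal sketch of one way around that, assuming it is acceptable to strip leading and trailing punctuation from each token (the helper name lines_with_words and the sample lines are made up for illustration):

```python
import string

def lines_with_words(lines, word_list):
    """Yield each line that shares at least one whole word with word_list."""
    wanted = {w.lower() for w in word_list}
    for line in lines:
        # Strip punctuation from both ends of each token so "happy," matches "happy".
        tokens = {tok.strip(string.punctuation).lower() for tok in line.split()}
        if wanted & tokens:
            yield line

# Hypothetical sample data standing in for the lines of tt.txt.
sample = ["So happy, so sad.\n", "Nothing here\n", "HAPPY days\n"]
matches = list(lines_with_words(sample, ["happy", "joy"]))
```

Since a file object iterates over its lines, passing open("tt.txt") as the first argument should work the same way.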
Upvotes: 0
Reputation: 428
I think using regex here is not Pythonic, as regex is a bit implicit. So I'd use loops if speed doesn't matter too much:
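def find_word(word_list, line):
    # Return the line if it contains any of the words, otherwise None.
    for word in word_list:
        if word in line:
            return line

with open('/path/to/file.txt') as f:
    # Lines are lowercased before matching, so word_list should be lowercase too.
    # Keep only the matching lines; non-matching ones would otherwise appear as None.
    result = [line for line in f if find_word(word_list, line.lower())]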
Upvotes: 0
Reputation: 174696
Just join the word_list with | as the delimiter. The (?i) case-insensitive modifier
makes the match case-insensitive.
for line in shakes:
    if re.search(r"(?i)"+'|'.join(word_list), line):
        print line,
Example:
>>> import re
>>> f = ['hello','foo','bar']
>>> s = '''hello
... hai
... Foo
... Bar'''.splitlines()
>>> for line in s:
...     if re.search(r"(?i)"+'|'.join(f), line):
...         print(line)
...
hello
Foo
Bar
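The joined pattern treats every word as raw regex syntax and also matches inside longer words (foo would match Food). A hedged sketch of a stricter variant, assuming whole-word matches are wanted and that every word starts and ends with a letter, digit, or underscore so that \b behaves as expected (build_pattern is a hypothetical helper name):

```python
import re

def build_pattern(word_list):
    """Compile one case-insensitive pattern matching any listed word as a whole word."""
    # re.escape stops characters such as "." in a word from acting as regex syntax.
    alternatives = "|".join(re.escape(w) for w in word_list)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

pattern = build_pattern(["hello", "foo", "a.b"])
lines = ["hello", "hai", "Foo", "Food", "axb", "a.b"]
matched = [line for line in lines if pattern.search(line)]
```

Compiling once outside the loop also avoids rebuilding the pattern string for every line.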
Without regex:
>>> f = ['hello','foo','bar']
>>> s = '''hello
... hai
... Foo
... Bar'''.splitlines()
>>> for line in s:
...     if any(i.lower() in line.lower() for i in f):
...         print(line)
...
hello
Foo
Bar
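If it is also useful to know which words were found on each line, the same substring test can return the hits instead of a yes/no answer. A small sketch along those lines (matched_words is a made-up helper name, and the sample lines are illustrative):

```python
def matched_words(line, word_list):
    """Return the words from word_list that occur in line as case-insensitive substrings."""
    lowered = line.lower()
    return [w for w in word_list if w.lower() in lowered]

words = ['hello', 'foo', 'bar']
# Map each sample line to the words found in it; an empty list means no match.
report = {line: matched_words(line, words) for line in ['hello', 'hai', 'Foo Bar']}
```

As with the any() version, this is a substring test, so foo would also be reported for a line containing Food.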
Upvotes: 4