Reputation: 193
I'm looking for an easy way to check whether all the strings in a list occur in a huge text file (>35,000 words).
self.vierkant = ['BIT', 'ICE', 'TEN']

def geldig(self, file):
    self.file = file
    file = open(self.file, 'r')
    line = file.readline()
    self.file = ''
    while line:
        line = line.strip('\n')
        self.file += line
        line = file.readline()
    return len([woord for woord in self.vierkant if woord.lower() not in self.file]) == 0
I just copy the text file into self.file, then check if all words from self.vierkant are in self.file.
The main problem is that it takes a very long time to read in the text file. Is there an easier/faster way to do this?
Upvotes: 2
Views: 2533
Reputation: 12178
with open('a.txt') as f:
    # splitlines() strips the trailing '\n' and returns a list of lines
    s = set(f.read().splitlines())

# test_lines holds the strings to look up
for line in test_lines:
    print(line in s)  # O(1) average-case check whether the line is in the set of lines
Upvotes: 0
Reputation: 150225
You can read the entire contents of a file with file.read() instead of calling readline() repeatedly and concatenating the result:
with open(self.file) as f:
    self.file = f.read()
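As a rough sketch of how this could slot into the question's check, the function below reads the file in one call and then tests each word. The standalone function name all_words_in_text and its parameters are hypothetical, not part of the original class. It keeps the question's substring semantics, so a word also counts as found when it appears inside a longer word, and it assumes the file text is already lowercase, as the question's code does.

```python
def all_words_in_text(words, path):
    """Return True if every word occurs as a substring of the file at `path`."""
    with open(path) as f:
        text = f.read()  # one read instead of a readline() loop
    # Only the words are lowercased, mirroring the question's code;
    # the file contents are assumed to be lowercase already.
    return all(word.lower() in text for word in words)
```

Unlike the readline() loop in the question, read() keeps the newline characters, so a word can no longer match across the boundary between two lines.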
If you need to check a lot of words, you could also build a set from the file's contents for O(1) containment checks.
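A minimal sketch of the set-based idea, assuming the file holds whitespace-separated words. The helper names load_words and all_words_present are made up for illustration. Note that this matches whole words only, which differs from the substring test in the question.

```python
def load_words(path):
    """Build a set of lowercase words from a whitespace-separated text file."""
    with open(path) as f:
        return set(f.read().lower().split())


def all_words_present(words, word_set):
    """Return True if every word (lowercased) is a member of word_set."""
    # Subset test: each lookup is an O(1) average-case hash check.
    return {word.lower() for word in words} <= word_set
```

Building the set costs one pass over the file, so it pays off mainly when the same file is checked against many words, for example by calling load_words once and reusing the result for every all_words_present call.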
Upvotes: 2