Kenneth De Coster

Reputation: 57

Python - Best way to find data in a very large text file (8GB)

I would like to scan an 8GB text file (it's a log file) to find specific words. These words are stored in a dataframe with over 3400 rows.

I've tried the solution below, which avoids having to load the entire document into memory:

with open(filename) as f:
    for line in f:
        do_stuff(line)

However, this is taking a very long time to process. It takes over 2 minutes to scan the entire document for one word, and 2 minutes × 3400 words comes to about 113 hours for the whole script.

Is there any way to improve this process?

Upvotes: 0

Views: 47

Answers (1)

John Coleman

Reputation: 51998

Create a set of the words: words = set(column_of_words)
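
If the words are in a pandas dataframe, the set can be built directly from the relevant column. A minimal sketch, assuming the dataframe is named df and the column is named "word" (neither name appears in the question):

import pandas as pd

df = pd.read_csv("words.csv")        # hypothetical source of the 3400 words
words = set(df["word"].astype(str))  # set membership tests are O(1) on average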

Then do something like:

with open(filename) as f:
    for line in f:
        words_in_line = set(line.split())
        matches = words & words_in_line  # the intersection
        if matches:
            do_stuff(line)  # do something with the matching words
Whatever you do, don't scan the same file 3400 times. Find a way to scan it just once.
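
Putting it together, a single pass over the log could also tally how often each target word appears. This is only a sketch under assumptions: "words.csv", the "word" column, and "big.log" are placeholder names, not anything taken from the question.

from collections import Counter

import pandas as pd

words = set(pd.read_csv("words.csv")["word"].astype(str))  # hypothetical word list

counts = Counter()
with open("big.log") as f:           # placeholder path for the 8GB log
    for line in f:
        matches = words & set(line.split())
        counts.update(matches)       # increment the count of each word found

for word, n in counts.most_common():
    print(word, n)

Reading line by line keeps memory use flat regardless of the file size, and each line costs one set intersection instead of 3400 separate searches.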

Upvotes: 2
