Reputation: 315
I want to use Python to take each keyword from the search_list file, search source_file for the lines that contain it, and write those lines from source_file to export_file, until search_file is exhausted.
The contents of source_file are plain text. Sample:
Act of Heroism Instant 1W Common Magali Villeneuve
Adorned Pouncer Creature — Cat 1/1 1W Rare Slawomir Maniak
Angel of Condemnation Creature — Angel 3/3 2WW Rare Slawomir Maniak
The search_list file is also plain text, with keywords one per line, as in the following example:
Condemnation
Heroism
After spending some time on Stack Overflow, I have the following code, which is unusable at the moment:
with open('list.txt', 'r') as search_list, \
     open('source_file.txt', 'r', encoding="utf8") as source_file:
    for line in search_list:
        searchquery = search_list.readlines()
        for line in source_file:
            current_line = line.split()
            if searchquery in current_line:
                print (line)
It returns nothing.
I have tried to figure out what's wrong, but so far I can't find it.
I took a step back and tried searching for a literal string, and it worked!
with open('list.txt', 'r') as search_list, \
     open('source_file.txt', 'r', encoding="utf8") as source_file:
    for line in source_file:
        if "Heroism" in line:
            print (line)
The result is:
Act of Heroism Instant 1W Common Magali Villeneuve
Could anyone point out what's wrong with my first snippet?
Thank you very much.
Upvotes: 0
Views: 2685
Reputation: 2545
I interpreted your question as meaning that you want to output each line of a file source_file.txt that contains a certain substring, where the substrings are listed in another file, search_list.txt. If that is correct, the following code should work for you:
import sys

with open('search_list.txt', 'r') as search_list:
    targets = [line.strip() for line in search_list]

with open('source_file.txt', 'r') as source_file:
    for line in source_file:
        if any(target in line for target in targets):
            sys.stdout.write(line)
where search_list.txt is
Condemnation
Heroism
and source_file.txt is
Act of Heroism Instant 1W Common Magali Villeneuve
Adorned Pouncer Creature — Cat 1/1 1W Rare Slawomir Maniak
Angel of Condemnation Creature — Angel 3/3 2WW Rare Slawomir Maniak
this will correctly output
Act of Heroism Instant 1W Common Magali Villeneuve
Angel of Condemnation Creature — Angel 3/3 2WW Rare Slawomir Maniak
which is each line that contains either 'Condemnation' or 'Heroism'.
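The question also mentions writing the matches to an export file rather than printing them. A minimal sketch of that variant follows, assuming the file names search_list.txt, source_file.txt and export_file.txt (the last one is my guess at what you meant) and a hypothetical helper called export_matches. It also skips blank keyword lines, because an empty string is a substring of every line, and it passes encoding="utf8" as in your own code.

```python
import os
import tempfile


def export_matches(search_path, source_path, export_path, encoding="utf8"):
    """Copy every line of source_path that contains a keyword to export_path.

    Returns the number of lines written. The function name and its
    parameters are illustrative, not part of the original answer.
    """
    with open(search_path, "r", encoding=encoding) as search_list:
        # Skip blank lines: an empty string is a substring of every line.
        targets = [line.strip() for line in search_list if line.strip()]

    written = 0
    with open(source_path, "r", encoding=encoding) as source_file, \
            open(export_path, "w", encoding=encoding) as export_file:
        for line in source_file:
            if any(target in line for target in targets):
                export_file.write(line)
                written += 1
    return written


# Small demonstration in a temporary directory, using the sample data.
with tempfile.TemporaryDirectory() as folder:
    search_path = os.path.join(folder, "search_list.txt")
    source_path = os.path.join(folder, "source_file.txt")
    export_path = os.path.join(folder, "export_file.txt")

    with open(search_path, "w", encoding="utf8") as f:
        f.write("Condemnation\n\nHeroism\n")  # note the blank line
    with open(source_path, "w", encoding="utf8") as f:
        f.write(
            "Act of Heroism Instant 1W Common Magali Villeneuve\n"
            "Adorned Pouncer Creature — Cat 1/1 1W Rare Slawomir Maniak\n"
            "Angel of Condemnation Creature — Angel 3/3 2WW Rare Slawomir Maniak\n"
        )

    written = export_matches(search_path, source_path, export_path)
    with open(export_path, "r", encoding="utf8") as f:
        exported = f.read()

print(written)
print(exported, end="")
```

With your real files you would just call export_matches('search_list.txt', 'source_file.txt', 'export_file.txt').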
The first snippet works by building up a list of all the targets, and then, for each line in source_file.txt, checking whether any target is a substring of the line. You have to build the list of targets up front because when you iterate over a file in Python each line is 'consumed', so you can't go back to the start again in another for loop without reopening or rewinding the file.
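To see the 'consumed' behaviour for yourself, here is a small sketch that uses io.StringIO as a stand-in for an open text file (an assumption made so the example runs without any files on disk). It also shows the second reason I think your original code prints nothing: readlines() returns a list, and a list is never an element of the list of words produced by line.split().

```python
import io

# Stand-in for an open text file (hypothetical two-line keyword list).
search_list = io.StringIO("Condemnation\nHeroism\n")

first_pass = [line.strip() for line in search_list]
second_pass = [line.strip() for line in search_list]  # nothing left to read

print(first_pass)   # ['Condemnation', 'Heroism']
print(second_pass)  # []

# readlines() gives a list of strings; testing whether that list is
# an element of a list of words is always False here.
searchquery = ["Heroism\n"]
current_line = "Act of Heroism Instant 1W Common Magali Villeneuve".split()
list_in_words = searchquery in current_line
print(list_in_words)  # False
```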
The way the line if any(target in line for target in targets) works is broadly like this:
First, it uses the generator expression target in line for target in targets. This yields the value of target in line (which checks whether target is a substring of line) for each target in targets; it could also effectively be written as
with open('source_file.txt', 'r') as source_file:
    for line in source_file:
        matches = []
        for target in targets:
            matches.append(target in line)
        if any(matches):
            sys.stdout.write(line)
Now, the any function takes an iterable (something like a list) and returns True if any of the values are True (or equivalent to True). It also short-circuits: it actually stops as soon as it meets a True value, if there is one. This means the code could be rewritten pretty accurately as
with open('source_file.txt', 'r') as source_file:
    for line in source_file:
        for target in targets:
            if target in line:
                sys.stdout.write(line)
                break
(This has to do with the fact that there is a generator expression, which does not evaluate the whole thing at once, but lazily gives one value at a time, meaning no more work will be done than needed.)
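If you want to watch the short-circuiting happen, the sketch below logs every substring test through a hypothetical helper called contains (my own name, not a built-in). With a generator expression, any stops at the first match; with a list comprehension, every target is tested before any even starts.

```python
checked = []  # records which targets were actually tested


def contains(target, line):
    """Hypothetical helper: log the check, then do the substring test."""
    checked.append(target)
    return target in line


targets = ["Condemnation", "Heroism", "Pouncer"]
line = "Act of Heroism Instant 1W Common Magali Villeneuve"

# Generator expression: values are produced one at a time, on demand.
found = any(contains(target, line) for target in targets)
lazy_checked = list(checked)

# List comprehension: the whole list is built before any() looks at it.
checked.clear()
any([contains(target, line) for target in targets])
eager_checked = list(checked)

print(found)          # True
print(lazy_checked)   # ['Condemnation', 'Heroism']
print(eager_checked)  # ['Condemnation', 'Heroism', 'Pouncer']
```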
By the way, [line.strip() for line in search_list] is a list comprehension. It returns a list containing line.strip() for each line in search_list. This could be rewritten as
targets = []
for line in search_list:
    targets.append(line.strip())
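As a quick check that the two forms agree, here is a small sketch that uses a plain list of strings in place of the open file (an assumption, so that it runs on its own), followed by a simpler comprehension that is easy to verify by hand.

```python
# What iterating over the keyword file would yield, newline included.
lines = ["Condemnation\n", "Heroism\n"]

from_comprehension = [line.strip() for line in lines]

from_loop = []
for line in lines:
    from_loop.append(line.strip())

print(from_comprehension == from_loop)  # True
print(from_comprehension)               # ['Condemnation', 'Heroism']

# A simpler comprehension: the squares of 0 through 9.
squares = [i ** 2 for i in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```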
Hopefully that's helped. Here is some useful documentation on how list comprehensions work. I find it can often be useful to start with simpler examples like [i ** 2 for i in range(10)]. Let me know if you'd like any more clarification.
Upvotes: 3