Reputation: 13
I'm trying to open two txt files and find any lines that appear in both (they don't need to be in the same order).
For example:
txt file 1:
1
2
8
4
3
9
txt file 2:
2
0
10
9
8
So 2, 9 and 8 would be a match here.
Below is the code I've tried so far, but it isn't printing the lines that appear in both txt files. It currently outputs nothing, which suggests it's only ever going down the else path.
with open('file1.txt') as file1, open('file2.txt') as file2:
    for domain in file2:
        for domain_2 in file1:
            if domain == domain_2:
                print(domain + "Match found")
            else:
                continue
Upvotes: 1
Views: 485
Reputation: 13197
The simplest approach would be to convert these lists of numbers to sets and then take their intersection. This is what set() was born to do :-).
Assuming you can read your file contents into lists already then a solution becomes:
file1_data = ["2", "8", "4", "3", "9"]
file2_data = ["2", "0", "10", "9", "8"]
print(list(set(file1_data).intersection(file2_data)))
Giving us (sets are unordered, so the order of the items may vary):
['2', '8', '9']
If you would like a more manual approach based on a strategy similar to what you have now, I would do:
file1_data = ["2", "8", "4", "3", "9"]
file2_data = ["2", "0", "10", "9", "8"]
results = []
for value_1 in file1_data:
    if value_1 in file2_data:
        results.append(value_1)
print(results)
Or you might use a list comprehension:
results = [value_1 for value_1 in file1_data if value_1 in file2_data]
print(results)
The key piece in the last two versions is the test of whether the current value (value_1) is in your second list via:
value_1 in file2_data
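As a possible refinement of the membership test above, here is a minimal sketch that keeps the order of the first list while doing each lookup against a set instead of a list. The name file2_lookup is introduced here purely for illustration; the data is the same sample used above.

```python
file1_data = ["2", "8", "4", "3", "9"]
file2_data = ["2", "0", "10", "9", "8"]

# Build the set once so each membership test is a hash lookup
# rather than a scan of the whole list. (file2_lookup is an
# illustrative name, not part of the original code.)
file2_lookup = set(file2_data)

# Iterating over the list keeps the order of file1_data.
results = [value for value in file1_data if value in file2_lookup]
print(results)  # ['2', '8', '9']
```

For a handful of lines the difference is negligible, but with large files the set lookup avoids rescanning file2_data for every value in file1_data.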
If part of the issue is that you cannot "re-read" from file1, that is because a file object is an iterator: once the inner loop has reached the end of file1, you would have to rewind it to get back to the beginning, and until you do, every later pass of the inner loop sees no lines at all.
For this reason, I recommend reading the data from the files into lists, which can be iterated as many times as you like.
with open("file1.txt", "r") as file1:
    file1_data = [n.strip() for n in file1]
with open("file2.txt", "r") as file2:
    file2_data = [n.strip() for n in file2]
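Putting the two pieces together, reading each file into a list and then intersecting the sets, might look like the sketch below. The helper names read_lines and common_lines are hypothetical, and the temporary files only stand in for your real file1.txt and file2.txt so that the example runs on its own.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a text file with surrounding whitespace removed."""
    with open(path, "r") as handle:
        return [line.strip() for line in handle]


def common_lines(path_a, path_b):
    """Return the lines that appear in both files, sorted for stable output."""
    return sorted(set(read_lines(path_a)) & set(read_lines(path_b)))


# Demo with temporary files that mirror the example in the question.
with tempfile.TemporaryDirectory() as tmp:
    path_1 = os.path.join(tmp, "file1.txt")
    path_2 = os.path.join(tmp, "file2.txt")
    with open(path_1, "w") as f:
        f.write("1\n2\n8\n4\n3\n9\n")
    with open(path_2, "w") as f:
        f.write("2\n0\n10\n9\n8\n")
    matches = common_lines(path_1, path_2)

print(matches)  # ['2', '8', '9']
```

Note that sorted() here compares the lines as strings, not as numbers, so "10" would sort before "2" if both matched.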
Although I am not keen on it, you could also do the following, which is very close to your current solution:
with open('file1.txt') as file1:
    with open('file2.txt') as file2:
        for domain_2 in file2:
            file1.seek(0)
            for domain_1 in file1:
                if domain_1.strip() == domain_2.strip():
                    print(f"Match Found: {domain_1.strip()}")
                else:
                    continue
This gives us:
Match Found: 2
Match Found: 9
Match Found: 8
Note that the main deviations from your current solution are the addition of file1.seek(0) to reset file1 back to the starting point and the use of .strip() to get rid of the trailing newline characters in the raw data. Note as well that the else: continue is technically redundant and could be omitted.
Upvotes: 1