Reputation: 9
I have two .txt files. The first contains a list of domains (google.com, facebook.com, apple.com, amazon.com), one per line. The second contains a shorter list of domains (facebook.com, amazon.com), also one per line.
I want to remove from the first text file every domain that appears in the second text file.
So, for example, the first text file would go from:
google.com
facebook.com
apple.com
amazon.com
To:
google.com
apple.com
How could this be done with a Python script?
Upvotes: 0
Views: 94
Reputation: 167
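A quick answer, as a stub:
# Read both files completely before reopening the first one for writing.
with open('filename1.txt', 'r') as f:
    text1 = f.read()
with open('filename2.txt', 'r') as f:
    text2 = f.read()
domains = text1.splitlines()
remove_domains = set(text2.splitlines())
# Keep only the domains that are NOT listed in the second file.
new_domains = [d for d in domains if d not in remove_domains]
with open('filename1.txt', 'w') as f:
    f.write('\n'.join(new_domains))
This rewrites the first file so that it contains only the domains that are not listed in the second file.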
Note: I dislike the file object's .readlines() method and prefer the string method .splitlines(), because the former keeps the newline characters. I also dislike using 'r+' mode to read and write a file in one go; I prefer to do the two steps separately.
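To illustrate the difference between the two methods mentioned in the note, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file, and the sample text is made up for the example:

```python
import io

# Sample text standing in for a file's contents (an assumption for this demo).
sample = "google.com\nfacebook.com\n"

# readlines() on a file-like object keeps the trailing newline on each line.
with_newlines = io.StringIO(sample).readlines()

# str.splitlines() drops the line endings.
without_newlines = sample.splitlines()

print(with_newlines)
print(without_newlines)
```

With readlines(), each entry would need a strip() before it could be compared against entries from another file.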
Upvotes: 0
Reputation: 656
Python Diff
To get a diff using the difflib library, you can simply call its unified_diff function. For example, say you have two files, file1 and file2; the code under "Compare Two Text Files" below takes their diff. For two plain lists, a set-based difference is enough:
def Diff(li1, li2):
    # Keep the items of li1 that are not in li2, preserving li1's order.
    exclude = set(li2)
    return [item for item in li1 if item not in exclude]

li1 = ['google.com', 'facebook.com', 'apple.com', 'amazon.com']
li2 = ['facebook.com', 'amazon.com']
print(Diff(li1, li2))
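A note on which set operation fits the question: the one-way difference set(a) - set(b) drops everything listed in the second list, whereas the symmetric difference also pulls in items that appear only in the second list. A minimal sketch, where 'example.org' is a hypothetical extra entry that is not in the original lists:

```python
first = ['google.com', 'facebook.com', 'apple.com', 'amazon.com']
# 'example.org' is a made-up entry that appears only in the second list.
second = ['facebook.com', 'amazon.com', 'example.org']

# One-way difference: items of `first` that are not in `second`.
one_way = set(first) - set(second)

# Symmetric difference: items that are in exactly one of the two lists.
symmetric = set(first) ^ set(second)

print(sorted(one_way))    # ['apple.com', 'google.com']
print(sorted(symmetric))  # ['apple.com', 'example.org', 'google.com']
```

When the second list is a strict subset of the first, as in the question, both operations give the same result; they only diverge once the second list contains something the first does not.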
Compare Two Text Files
import difflib

file1 = "path to file"
file2 = "path to file"
# unified_diff expects sequences of lines, so split each file's text into lines.
with open(file1) as f1:
    f1_lines = f1.read().splitlines()
with open(file2) as f2:
    f2_lines = f2.read().splitlines()
# Find and print the diff:
for line in difflib.unified_diff(f1_lines, f2_lines, fromfile='file1', tofile='file2', lineterm=''):
    print(line)
Upvotes: 0
Reputation: 426
Try this one.
# Read both files into lists, stripping the trailing newlines.
with open("text1.txt") as f:
    list1 = [i.strip() for i in f]
with open("text2.txt") as f:
    list2 = [i.strip() for i in f]

def Diff(list1, list2):
    # Keep the items of list1 that are not in list2, in their original order.
    exclude = set(list2)
    return [i for i in list1 if i not in exclude]

list1 = Diff(list1, list2)
# Overwrite the first file with the remaining domains, one per line.
with open("text1.txt", "w") as f:
    for i in list1:
        f.write(f"{i}\n")
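The same idea can be wrapped in a reusable function. The sketch below is one possible shape, not a definitive implementation: the name remove_listed_domains is hypothetical, blank lines are skipped as an assumption, and the demo writes to a temporary directory so that no real files are touched.

```python
import tempfile
from pathlib import Path


def remove_listed_domains(source_path, exclude_path):
    """Rewrite source_path without the lines that also appear in exclude_path."""
    exclude = {line.strip() for line in Path(exclude_path).read_text().splitlines()}
    kept = [
        line.strip()
        for line in Path(source_path).read_text().splitlines()
        # Skip blank lines (an assumption) and anything listed in the exclude file.
        if line.strip() and line.strip() not in exclude
    ]
    Path(source_path).write_text("".join(f"{d}\n" for d in kept))
    return kept


# Demo with the data from the question, using throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "text1.txt"
    second = Path(tmp) / "text2.txt"
    first.write_text("google.com\nfacebook.com\napple.com\namazon.com\n")
    second.write_text("facebook.com\namazon.com\n")
    result = remove_listed_domains(first, second)
    rewritten = first.read_text()

print(result)
```

Using a set for the excluded domains keeps each membership check cheap, while the list comprehension preserves the order of the first file.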
Upvotes: 1