Reputation: 1
How to delete common words from two documents that are extracted from two websites?
I have already extracted the news from two sites, and now I want to delete the words that the two documents have in common. I used the following code to extract the news from the two websites:
from __future__ import unicode_literals
import feedparser
import re

# Fetch the BBC News feed and print each headline with its index.
d = feedparser.parse('http://feeds.bbci.co.uk/news/rss.xml')
for i, post in enumerate(d.entries):
    titl = post.title
    desc = post.description
    titl2 = titl.replace('\\', " ")
    desc1 = desc.replace('/', " ")
    print(str(i) + ' ' + titl2)

print("indian Express")
# Fetch the second feed and print its headlines the same way.
g = feedparser.parse('http://www.rssmicro.com/rss.web?q=Android')
for i, post in enumerate(g.entries):
    tit = post.title
    # desc = post.description
    tit4 = tit.replace('\\', " ")
    print(str(i) + ' ' + tit4)
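To make the goal concrete, here is a minimal sketch of what "deleting the common words" could look like, assuming each document is a plain string and that words are compared case-insensitively. The helper names `tokenize` and `remove_common_words` and the two sample sentences are hypothetical, made up for illustration; they are not part of feedparser or of the code above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed word definition)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_common_words(doc_a, doc_b):
    """Drop every word that occurs in both documents; return the remainders and the shared set."""
    tokens_a = tokenize(doc_a)
    tokens_b = tokenize(doc_b)
    # Words present in both documents.
    common = set(tokens_a) & set(tokens_b)
    # Keep the original word order, skipping the shared words.
    unique_a = [w for w in tokens_a if w not in common]
    unique_b = [w for w in tokens_b if w not in common]
    return " ".join(unique_a), " ".join(unique_b), common

# Hypothetical sample documents standing in for the extracted news text.
doc_a = "Android phone sales rise in India"
doc_b = "New Android update arrives in Europe"
only_a, only_b, common = remove_common_words(doc_a, doc_b)
print(only_a)          # phone sales rise india
print(only_b)          # new update arrives europe
print(sorted(common))  # ['android', 'in']
```

With the feeds from the question, the two input strings could be built by joining the titles, for example `" ".join(post.title for post in d.entries)` for the first feed and the same expression over `g.entries` for the second.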
Upvotes: 0
Views: 43