Reputation: 21
I used the Python Newspaper library to write a web scraping script. For each article I need to extract the URL, title, summary, authors and date of publication. I get everything except the date of publication. Has anyone used Newspaper to capture the publication date? This is the part of the script that writes out the fields I already have:
# hn is the open output file, x is the zero-based index of the current article
hn.write("***********Article no" + str(x + 1) + "************\r\n")
hn.write("URL: " + article.url + "\r\n")
hn.write("Title: " + article.title + "\r\n")
hn.write("Authors: " + ' '.join(map(str, article.authors)))
hn.write("\r\n")
hn.write("Summary: " + article.summary + "\r\n")
hn.write("Key words: ")
hn.write(str(article.keywords).strip('[]'))
Is there a way to get the date of publication using Newspaper lib?
Thanks
Mukesh
Upvotes: 1
Views: 963
Reputation: 142641
Line 195 of newspaper/article.py contains this commented-out code:
# TODO self.publish_date = self.config.publishDateExtractor.extract(self.doc)
It seems publication date extraction is not finished yet, but you can try uncommenting this line to see whether it works.
Source: https://github.com/codelucas/newspaper/blob/master/newspaper/article.py#L195
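If uncommenting that line does not help, one possible workaround is to read the date from the page's own meta tags. The following is only a minimal sketch using the standard library, not part of Newspaper: the helper names `MetaDateParser` and `find_publish_date` are made up for illustration, and the list of meta keys is an assumption about what news sites commonly emit, so some sites will match none of them.

```python
from html.parser import HTMLParser

# Assumption: meta names/properties that sites often use for the publication
# date. This list is illustrative, not exhaustive.
DATE_META_KEYS = ("article:published_time", "og:published_time", "pubdate", "date")


class MetaDateParser(HTMLParser):
    """Remembers the first date-like <meta> value seen in a page."""

    def __init__(self):
        super().__init__()
        self.date = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop once a date has been found.
        if tag != "meta" or self.date is not None:
            return
        attr = dict(attrs)
        key = (attr.get("property") or attr.get("name") or "").lower()
        if key in DATE_META_KEYS and attr.get("content"):
            self.date = attr["content"].strip()


def find_publish_date(html):
    """Return the raw date string from the page's meta tags, or None."""
    parser = MetaDateParser()
    parser.feed(html)
    return parser.date
```

In the script from the question this could be called as find_publish_date(article.html) after the article has been downloaded, assuming the raw page source is available there. The result is the unparsed string from the page, so it may still need converting to a date.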
Upvotes: 3