Reputation: 113
I'm currently working on fetching data from the GDELT project using Google BigQuery. While exploring various tables, I haven't come across any that include the complete text or title of news articles. It appears that the tables primarily contain URLs linking to the news content. My question is whether there is a specific table within the GDELT project that includes the headline, title, or description of news articles, or if the tables solely consist of URLs.
For instance, I've gone through
gdelt-bq.full.events
but it doesn't contain the headlines or text of the news articles.
Upvotes: 1
Views: 828
Reputation: 1
As of September 2019 you can extract news headlines from the GDELT GKG XMLExtras field: https://blog.gdeltproject.org/gkg-2-0-now-includes-page-titles/
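As a rough sketch of what pulling the title out of that field could look like once you have selected it in BigQuery: the snippet below assumes the field arrives as a string column (called `Extras` in the GKG 2.0 table, as far as I know) and that the title is wrapped in a `<PAGE_TITLE>` tag as described in the linked post. The function name `extract_page_title` and the sample string are made up for illustration, so check both assumptions against your own rows.

```python
import html
import re
from typing import Optional

# Assumption: the title sits inside <PAGE_TITLE>...</PAGE_TITLE> in the Extras string.
_PAGE_TITLE_RE = re.compile(r"<PAGE_TITLE>(.*?)</PAGE_TITLE>", re.DOTALL)


def extract_page_title(extras: Optional[str]) -> Optional[str]:
    """Return the page title found in a GKG Extras value, or None if absent."""
    if not extras:
        return None
    match = _PAGE_TITLE_RE.search(extras)
    if match is None:
        return None
    # Titles may carry HTML entities such as &amp;, so unescape them.
    return html.unescape(match.group(1)).strip()


if __name__ == "__main__":
    # Hypothetical Extras value, not taken from real GDELT data.
    sample = "<PAGE_LINKS>http://example.com/a</PAGE_LINKS><PAGE_TITLE>Talks resume &amp; markets rally</PAGE_TITLE>"
    print(extract_page_title(sample))
```

Rows published before the titles were added will simply return `None`, so it is probably worth filtering those out before any aggregation.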
You can also search the full text of news articles and return article titles using the GDELT DOC 2.0 or Summary APIs.
(There is still no option to return the full text of an article, but for summarising, e.g., the relations between countries, the titles are generally a good starting point.)
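For the DOC 2.0 route, a minimal sketch might look like the following. It assumes the endpoint `https://api.gdeltproject.org/api/v2/doc/doc` with the `query`, `mode=artlist`, `format=json` and `maxrecords` parameters, and that the JSON response has an `articles` list whose items carry `title` and `url` keys. The helper names `build_doc_url` and `titles_from_response` are my own, and the response shape should be verified against the API documentation before relying on it.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode

# Assumed DOC 2.0 endpoint; confirm against the GDELT API documentation.
DOC_API = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_doc_url(query: str, max_records: int = 25) -> str:
    """Build an article-list request URL that asks for JSON output."""
    params = {
        "query": query,
        "mode": "artlist",   # list of matching articles
        "format": "json",
        "maxrecords": max_records,
    }
    return DOC_API + "?" + urlencode(params)


def titles_from_response(body: str) -> List[Tuple[str, str]]:
    """Extract (title, url) pairs from a JSON response body.

    Assumes a top-level "articles" list; missing keys fall back to "".
    """
    data = json.loads(body)
    return [(a.get("title", ""), a.get("url", "")) for a in data.get("articles", [])]


if __name__ == "__main__":
    print(build_doc_url("trade talks"))
    # Hypothetical response body, used here instead of a live request.
    fake = '{"articles": [{"title": "Example headline", "url": "http://example.com/a"}]}'
    print(titles_from_response(fake))
```

To run it against the live API, you could fetch the URL with `urllib.request.urlopen(build_doc_url("trade talks")).read().decode("utf-8")` and pass the result to `titles_from_response`.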
Upvotes: 0
Reputation: 605
The GDELT project does not contain the full text of news articles. The tables in the GDELT project only contain the URLs of the news articles. If you want to get the full text of news articles, you will need to use a different data source.
Upvotes: 2