Reputation: 537
I am searching for sample .txt files for information retrieval. It would be nice to have sets of documents (around 20 documents each) on a single topic, e.g., sports or music.
Thanks
Upvotes: 1
Views: 272
Reputation: 3134
There are many datasets available, for instance:
Datasets used to evaluate IR systems: http://www.daviddlewis.com/resources/testcollections/
More IR datasets: http://boston.lti.cs.cmu.edu/callan/Data/
A comprehensive list of several datasets: http://zitnik.si/mediawiki/index.php?title=Datasets
The classic 20 Newsgroups dataset: http://scikit-learn.org/stable/datasets/twenty_newsgroups.html
A much bigger collection of news articles: http://research.signalmedia.co/newsir16/signal-dataset.html
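If you want roughly 20 plain .txt files per topic, one option is to pull a couple of categories from the 20 Newsgroups collection through scikit-learn and write them to disk. This is only a rough sketch: the helper names write_topic_samples and export_newsgroups, the output folder ir_samples, and the choice of the two categories are my own assumptions, not part of any of the datasets above. The download step needs scikit-learn installed and network access on first use.

```python
from collections import defaultdict
from pathlib import Path


def write_topic_samples(texts, labels, out_dir, per_topic=20):
    """Write up to `per_topic` documents per label to <out_dir>/<label>/<nn>.txt.

    Hypothetical helper for illustration; returns how many files were
    written for each label.
    """
    out_dir = Path(out_dir)
    written = defaultdict(int)
    for text, label in zip(texts, labels):
        if written[label] >= per_topic:
            continue  # this topic already has enough documents
        topic_dir = out_dir / label
        topic_dir.mkdir(parents=True, exist_ok=True)
        path = topic_dir / f"{written[label]:02d}.txt"
        path.write_text(text, encoding="utf-8")
        written[label] += 1
    return dict(written)


def export_newsgroups(out_dir="ir_samples", per_topic=20):
    """Fetch two 20 Newsgroups categories and dump them as .txt files."""
    # Imported here so the helper above works without scikit-learn installed.
    from sklearn.datasets import fetch_20newsgroups

    # Downloads the corpus on first use; the categories are an arbitrary pick.
    data = fetch_20newsgroups(
        subset="train",
        categories=["rec.sport.hockey", "sci.space"],
        remove=("headers", "footers", "quotes"),
    )
    # Map integer targets back to their newsgroup names.
    labels = [data.target_names[i] for i in data.target]
    return write_topic_samples(data.data, labels, out_dir, per_topic)
```

Calling export_newsgroups() should leave you with one folder per newsgroup, each holding up to 20 numbered .txt files. Swap the category names for any others from the collection if you want different topics.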
Upvotes: 3