Reputation: 513
I have built a semantic text search engine. However, I cannot find a labeled dataset with which to evaluate my system's retrieval quality.
Is there a publicly available collection of text documents with relevance labels? I need one to compute retrieval metrics for my results (recall, precision, F1, ...).
Thanks.
Upvotes: 3
Views: 91
Reputation: 37741
I do research in this direction. In all my research, I have used the AOL dataset, which consists of ~20M web queries collected from ~650k users over three months (March 01, 2006 to May 31, 2006). The data is sorted by anonymous user ID and arranged sequentially.
The dataset includes the fields {AnonID, Query, QueryTime, ItemRank, ClickURL}. More details can be found in the link mentioned above. I would be interested to know how you implemented your engine and, if possible, to see its code. I am also interested in how your search engine performs on the AOL dataset.
You can find the dataset in my git repository. Thanks!
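In case it helps, here is a minimal sketch of how the fields above could be turned into precision, recall and F1 numbers. It rests on several assumptions: that the log is tab-separated with a header row naming those five fields, and that a clicked URL can be treated as a "relevant" result for its query. The second one is only a rough proxy, since the data has no explicit relevance labels. The function names and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def load_click_judgments(lines):
    """Map each query to the set of URLs clicked for it.

    Assumption: tab-separated rows with a header line containing
    AnonID, Query, QueryTime, ItemRank, ClickURL.
    Assumption: a click counts as a relevance judgment (a proxy only).
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    judgments = defaultdict(set)
    for row in reader:
        url = (row.get("ClickURL") or "").strip()
        if url:  # rows without a click carry no relevance signal
            judgments[row["Query"].strip().lower()].add(url)
    return judgments


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F1 for one query."""
    retrieved = set(retrieved)
    hits = len(retrieved & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up sample rows in the assumed layout (not real AOL data).
sample = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tpython csv\t2006-03-01 10:00:00\t1\thttp://a.example\n"
    "1\tpython csv\t2006-03-01 10:01:00\t3\thttp://b.example\n"
    "2\tweather\t2006-03-02 09:00:00\t\t\n"
)

judgments = load_click_judgments(io.StringIO(sample))

# Hypothetical output of the engine under test for the query "python csv".
engine_results = ["http://a.example", "http://c.example"]
p, r, f1 = precision_recall_f1(engine_results, judgments["python csv"])
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

For a real run you would pass an open file handle instead of the `io.StringIO` sample and average the three values over all queries that have at least one click. Keep in mind that the engine must index the same URLs for the comparison to be meaningful.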
Upvotes: 2