Reputation: 925
I have around 80,000 text files and I want to be able to do an advanced search on them. Let's say I have two lists of keywords, and I want to return all the files that contain at least one keyword from the first list and at least one from the second list. Is there already a library that does this? I don't want to rewrite it if it exists.
Upvotes: 6
Views: 6925
Reputation: 426
I get the feeling you want MapReduce-style processing for this search. It should scale very well, and Python should have MapReduce packages.
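To make the idea concrete, here is a minimal single-process sketch of the map and reduce steps using only the standard library. It is not a real MapReduce framework: the names `map_file` and `reduce_matches`, the in-memory `files` dict, and the whole-word, case-insensitive matching are all assumptions made for illustration.

```python
import re
from functools import reduce

def map_file(item, first_list, second_list):
    """Map step: decide, for one (name, text) pair, whether the file matches."""
    name, text = item
    words = set(re.findall(r"\w+", text.lower()))
    # A file matches if it shares at least one word with each keyword set.
    matched = bool(words & first_list) and bool(words & second_list)
    return name, matched

def reduce_matches(acc, pair):
    """Reduce step: collect the names of the files that matched."""
    name, matched = pair
    if matched:
        acc.append(name)
    return acc

# Hypothetical stand-in for the real files (name -> contents).
files = {
    "a.txt": "The quick brown fox",
    "b.txt": "A lazy dog sleeps",
    "c.txt": "The fox and the dog",
}
first = {"fox", "cat"}
second = {"dog", "bird"}

mapped = (map_file(item, first, second) for item in files.items())
matches = reduce(reduce_matches, mapped, [])
print(matches)
```

Each file is handled independently in the map step, so that part could be spread over several processes, for example with `multiprocessing.Pool.map`, without changing the reduce step.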
Upvotes: 1
Reputation: 940
As you need to search the documents multiple times, you most likely want to index the text files to make such searches as fast as possible.
Implementing a reasonable index yourself is certainly possible, but a quick search led me to:
Take a look at the documentation. It should hopefully be rather trivial to achieve the desired behaviour.
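For reference, if you did want to roll your own, a minimal sketch of an inverted index might look like the following. This is not the API of any particular library: `tokenize`, `build_index`, `search`, and the in-memory `documents` dict are hypothetical names, and the sketch assumes case-insensitive, whole-word matching.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"\w+", text.lower()))

def build_index(documents):
    """Build an inverted index: word -> set of file names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, first_list, second_list):
    """Return files with at least one keyword from each list."""
    def any_of(keywords):
        hits = set()
        for keyword in keywords:
            # .get avoids creating empty entries for unknown words.
            hits |= index.get(keyword.lower(), set())
        return hits
    return any_of(first_list) & any_of(second_list)

# Hypothetical stand-in for the real files (name -> contents).
documents = {
    "a.txt": "The quick brown fox",
    "b.txt": "A lazy dog sleeps",
    "c.txt": "The fox and the dog",
}
index = build_index(documents)
print(sorted(search(index, ["fox", "cat"], ["dog", "bird"])))
```

The index is built once; after that, each query is just a few set unions and one intersection, so repeated searches do not need to reread the files.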
Upvotes: 4