David Batista

Reputation: 3124

NEAR type query in whoosh

The slides from PyCon 2013 mention NEAR-type queries, but I've looked through the documentation and found no mention of a NEAR keyword in the query language. The closest thing I could find is this:

"whoosh library"~5

which matches if a document has 'library' within 5 words after 'whoosh'.

I was wondering whether there is a way to make this kind of query:

'whoosh' NEAR:X 'python' NEAR:X 'retrieval'

where X is the maximum number of words allowed between the query words (here 'whoosh', 'python' and 'retrieval')?

Upvotes: 0

Views: 470

Answers (1)

David Batista

Reputation: 3124

I went through the documentation again and found the SpanNear2 class, which seems to do the job. An example with three terms:

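   from whoosh import query
   from whoosh.query import spans

   # Term text is not analyzed: use the indexed form (lowercase with the default analyzer)
   t1 = query.Term("sentence", "whoosh")
   t2 = query.Term("sentence", "python")
   t3 = query.Term("sentence", "retrieval")
   q = spans.SpanNear2([t1, t2, t3], slop=5, ordered=True)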

This would match a document containing a sentence like:

  "The Whoosh project is a python library for information retrieval."

but not this sentence:

  "Whoosh is a great open source project is a python for information retrieval."

since there are 8 tokens between 'Whoosh' and 'python', which exceeds slop=5.
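
To make the token counting above concrete, here is a minimal plain-Python sketch of an ordered NEAR check. It is not Whoosh's implementation and does not use Whoosh at all: `tokenize`, `near_in_order` and `max_gap` are names made up for this illustration, and `max_gap` is defined as the number of tokens allowed between consecutive terms, which may not correspond exactly to how SpanNear2 interprets `slop`.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def near_in_order(text, terms, max_gap):
    """Return True if `terms` occur in the given order with at most
    `max_gap` tokens between each consecutive pair of terms."""
    tokens = tokenize(text)
    terms = [t.lower() for t in terms]

    def search(term_index, prev_pos):
        # All terms have been placed: the whole chain matched.
        if term_index == len(terms):
            return True
        for pos, tok in enumerate(tokens):
            if tok != terms[term_index]:
                continue
            if prev_pos is not None:
                gap = pos - prev_pos - 1  # tokens strictly in between
                if gap < 0 or gap > max_gap:
                    continue
            # Try to place the remaining terms after this position.
            if search(term_index + 1, pos):
                return True
        return False

    return search(0, None)


s1 = "The Whoosh project is a python library for information retrieval."
s2 = "Whoosh is a great open source project is a python for information retrieval."
terms = ["whoosh", "python", "retrieval"]

print(near_in_order(s1, terms, max_gap=5))  # True: gaps are 3 and 3
print(near_in_order(s2, terms, max_gap=5))  # False: 8 tokens before 'python'
```

Raising `max_gap` to 8 makes the second sentence match as well, which is the same count used in the explanation above.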

Upvotes: 2
