Reputation: 3043
As I understand it, Apache Lucene and Google (GSA or GCS) are completely different search engines/frameworks, and their parsers have varying logic, but their query languages seem extremely similar, or even the same. If they are the same, what is this query language called? If not, what is each one called, and what are the differences?
Example: with the query
field1:foo "some text"
and this item in the dataset
{
"field1": "foo",
"somefield": "bla bal some text"
}
the item would be in the result.
Upvotes: 0
Views: 73
Reputation: 668
You might call it "search syntax". It's kind of a mashup of the information retrieval research of the 80s and 90s and what the suddenly dominant web search engines settled on in the late 90s.
Modern customer-oriented search engines by default match all the words in the query against all fields, although some allow partial matches. Most let you override the default behavior with query syntax: Boolean operators such as AND (sometimes "+"), OR (sometimes "||") and NOT (sometimes "-"), quote marks to indicate phrase matches, and field filters like "department:".
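To make that concrete, here is a minimal Python sketch of how a query like the one in the question could be read: a field filter plus a quoted phrase, combined with an implicit AND. The names `parse_query` and `matches` are made up for illustration, and plain substring matching stands in for real tokenization; this is not how Lucene or the Google appliances actually parse queries.

```python
import shlex

def parse_query(query):
    """Split a query into (field, value) clauses; field is None for unfielded terms."""
    clauses = []
    # shlex.split keeps a "quoted phrase" together as one token
    for token in shlex.split(query):
        field, sep, value = token.partition(":")
        if sep and field and value:
            clauses.append((field, value.lower()))   # field filter, e.g. field1:foo
        else:
            clauses.append((None, token.lower()))    # bare term or phrase
    return clauses

def matches(doc, clauses):
    """Return True only if every clause matches (implicit AND)."""
    for field, value in clauses:
        if field is not None:
            haystacks = [str(doc.get(field, ""))]     # look in the named field only
        else:
            haystacks = [str(v) for v in doc.values()]  # look in all fields
        # Simplification: substring match instead of token/position matching
        if not any(value in h.lower() for h in haystacks):
            return False
    return True

doc = {"field1": "foo", "somefield": "bla bal some text"}
clauses = parse_query('field1:foo "some text"')
print(clauses)
print(matches(doc, clauses))
```

With the item from the question, both clauses match, so the item is returned even though no single field equals the whole query.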
After all that, it occurs to me that you may come from a database background and be asking why the result doesn't exactly match the query. If that's the case, it's because the search engine uses an inverted index that can match parts of fields, and then sorts results with a relevance algorithm, usually TF-IDF (or a close relative such as BM25).
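If it helps, here is a toy sketch of that idea in Python: an inverted index mapping each term to the documents that contain it, and a simple TF-IDF score used to sort the hits. The sample `docs`, the function names, and the exact weighting formula (raw term count times `log(N / df) + 1`) are assumptions for illustration; real engines use more refined scoring.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sample data
docs = {
    1: "foo bar some text",
    2: "some other text about text search",
    3: "nothing relevant here",
}

def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index

def search(index, n_docs, query):
    """Score every doc containing at least one query term; highest score first."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        # Rarer terms get a higher weight
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

index = build_index(docs)
results = search(index, len(docs), "some text")
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Document 2 ranks above document 1 because "text" occurs twice in it, and document 3 is not returned at all. That is the kind of partial, ranked matching you would not get from an exact-match database query.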
Upvotes: 1