Miguel Freire

Reputation: 1

Find all documents with exact expression

How can I find an exact expression in all documents in Vespa?

I am trying to find citations of a specific document by searching for an exact expression, but I get 0 results. Several of my documents have text containing the expression "Document 23/2010".

I tried running the following query:

vespa query 'yql=select title from Documents where text contains "\"Document 23/2010\"" LIMIT 10'

I also tried using grammar: phrase.

Upvotes: 0

Views: 163

Answers (1)

Jo Kristian Bergum

Reputation: 3184

The query above should just work for string fields that are indexed (indexing: index in the schema):

vespa query 'yql=select * from doc where text contains "\"Document 23/2010\""'
{
    "root": {
        "id": "toplevel",
        "relevance": 1.0,
        "fields": {
            "totalCount": 1
        },
        "coverage": {
            "coverage": 100,
            "documents": 1,
            "full": true,
            "nodes": 1,
            "results": 1,
            "resultsFull": 1
        },
        "children": [
            {
                "id": "id:doc:doc::1",
                "relevance": 0.15974580091895013,
                "source": "text",
                "fields": {
                    "sddocname": "doc",
                    "documentid": "id:doc:doc::1",
                    "text": [
                        "Foo Bar \"Document 23/2010\" Bar"
                    ]
                }
            }
        ]
    }
}

If you add trace.level=3 to the request, you can see how the query is parsed and executed against the backend:

 {
                                "message": "sc0.num0 dispatch response: Result (1 of total 1 hits)"
                            },
                            {
                                "message": "sc0.num0 fill to dispatch: query=[text:'document 23 2010'] timeout=9998ms offset=0 hits=10 groupingSessionCache=true sessionId=5304f5d0-6cd3-4dc4-be2e-666829413231.1708027994706.5.default grouping=0 :  restrict=[doc] summary=[null]"
                            },
                            {
                                "message": "Current state of query tree: SPHRASE[explicit=false index=\"text\" isFromQuery=true isFromUser=true locked=true rawWord=\"\\\"Document 23/2010\\\"\" stemmed=true uniqueID=1]{\n  WORD[fromSegmented=false index=\"text\" origin=null segmentIndex=0 stemmed=true words=true]{\n    \"document\"\n  }\n  WORD[fromSegmented=false index=\"text\" origin=null segmentIndex=0 stemmed=true words=true]{\n    \"23\"\n  }\n  WORD[fromSegmented=false index=\"text\" origin=null segmentIndex=0 stemmed=true words=true]{\n    \"2010\"\n  }\n}\n"
                            },
                            {
                                "message": "YQL+ representation: select * from doc where text contains ({origin: {original: \"\\\"Document 23\\/2010\\\"\", offset: 0, length: 18}, id: 1}phrase(\"document\", \"23\", \"2010\")) timeout 9998"
                            },

Here we can see that the query uses phrase search.
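For readers who query over HTTP rather than through the CLI, here is a minimal sketch of building a request URL that carries trace.level=3. The endpoint (a local container on port 8080) and the helper name build_query_url are assumptions for illustration; adjust them to your deployment.

```python
from urllib.parse import urlencode

# Assumption: a local Vespa container exposing the query API on port 8080.
ENDPOINT = "http://localhost:8080/search/"

# The same YQL as in the CLI example, including the escaped inner quotes.
YQL = 'select * from doc where text contains "\\"Document 23/2010\\""'


def build_query_url(yql: str, trace_level: int = 3) -> str:
    """Return a query URL that asks for trace output (hypothetical helper)."""
    params = {"yql": yql, "trace.level": trace_level}
    # urlencode percent-escapes the quotes, backslashes, and slash in the YQL.
    return ENDPOINT + "?" + urlencode(params)


url = build_query_url(YQL)
print(url)
```

Fetching that URL with any HTTP client should return the result with a trace section like the excerpt above, assuming the endpoint is reachable.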

With indexing: index, characters such as " and / are not searchable; the tokenizer removes them, which is why the query above becomes the phrase document 23 2010.
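To make the effect of tokenization concrete, here is a simplified sketch in Python. It is not Vespa's linguistics implementation (which also applies stemming, as the trace shows); the functions tokenize and contains_phrase are hypothetical stand-ins that only lowercase the text and drop everything except ASCII letters and digits.

```python
import re


def tokenize(text: str) -> list[str]:
    # Keep runs of ASCII letters and digits, lowercase them, drop the rest.
    # Simplification: real tokenizers also handle Unicode and stemming.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def contains_phrase(doc_text: str, query_text: str) -> bool:
    # A phrase matches when the query tokens appear consecutively, in order.
    doc, phrase = tokenize(doc_text), tokenize(query_text)
    n = len(phrase)
    return n > 0 and any(doc[i:i + n] == phrase for i in range(len(doc) - n + 1))


print(tokenize('"Document 23/2010"'))
# ['document', '23', '2010']
print(contains_phrase('Foo Bar "Document 23/2010" Bar', '"Document 23/2010"'))
# True
```

Under this model, text written as Document 23 2010 or document 23-2010 would also match, because the punctuation never reaches the index. If the quotes or the slash must be significant, an indexed string field is probably not the right tool for that part of the match.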

Upvotes: 2
