igor tarashchuk

Reputation: 11

Vespa counts the number of times a word appears in a string

I am trying to work out how to count the number of times a searched word appears in a string. For example, I have a schema like this:

schema search {
    document search {
        field Id type string {
            indexing: summary | attribute
            attribute: fast-search
        }
        field Name type string {
            indexing: summary | attribute
        }
        field NameArray type array<string> {
            indexing: summary | attribute
        }
        field NameLength type int {
            indexing: summary | attribute
        }
    }

    fieldset default {
        fields: Id, Name
    }

    rank-profile default {
        first-phase {
            expression: nativeRank(Id, Name)
        }
    }

    rank-profile searchByName {
        first-phase {
            expression: matchCount(Name)
        }
    }

    rank-profile searchByName1 {
        first-phase {
            expression: matchCount(NameArray)
        }
    }
}

Example document:

{
    "id": "id:search:search::AjSjRtrcoklrHHb",
    "relevance": 1,
    "source": "search",
    "fields": {
        "sddocname": "search",
        "documentid": "id:search:search::AjSjRtrcoklrHHb",
        "Id": "AjSjRtrcoklrHHb",
        "Name": "Test Cat сat Cat",
        "NameArray": [
            "Test",
            "Cat",
            "сat",
            "Cat"
        ],
        "NameLength": 16
    }
}

Example request:

{
    "hits": 150,
    "ranking": {
        "profile": "searchByName"
    },
    "offset": 0,
    "yql": "select * from search where Name matches '(?i).*сat.*'"
}

When I send this request to Vespa, matchCount (returned as the relevance) is always 1. The result is the same for both the searchByName and searchByName1 rank profiles. How can I count the number of occurrences of the keyword "cat" in Name, or in the NameArray array? I would also like to count occurrences when the keyword is only part of a word, such as "ca".

I also tried using an index on the Name field, together with textSimilarity(Name).queryCoverage and textSimilarity(Name).fieldCoverage, for this.

Upvotes: 1

Views: 121

Answers (1)

Jon

Reputation: 2339

Rank features such as matchCount operate on tokens. Here you make the fields attributes, not (text) indexes, so each field holds a single value which is not split into tokens.

You are doing regex matching on that value rather than an exact match, but Vespa will still count it as a single match if the regex matches.

Looks like you could just use a text index instead and just query by "contains 'cat'"?
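
As a rough sketch of that suggestion (not a tested configuration): the fragment below switches Name to a text index so it is tokenized, and ranks by the fieldTermMatch(Name, 0).occurrences rank feature, which reports how often the first query term occurs in the field. The profile name countOccurrences is made up for illustration, and the other fields from the question's schema are left out for brevity.

```
schema search {
    document search {
        field Name type string {
            # "index" tokenizes the text; "attribute" keeps it as one unsplit value
            indexing: summary | index
        }
    }

    # Hypothetical profile name, not from the question
    rank-profile countOccurrences {
        first-phase {
            # Occurrences of the first query term (term index 0) in Name
            expression: fieldTermMatch(Name, 0).occurrences
        }
    }
}
```

The query would then be something like "select * from search where Name contains 'cat'" with ranking.profile set to countOccurrences. Since matching is per token, a partial keyword such as "ca" would not match the token "cat" with a plain contains.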

Upvotes: 1
