Reputation: 15141
Is it possible to get the ttf (total term frequency) for all the tokens from a field, across all the shards of a given index?
e.g. I have:
PUT /index/type/1
{
  "sentence": "delicious cake"
}
PUT /index/type/2
{
  "sentence": "horrible cake"
}
I want to get:
cake 2
horrible 1
delicious 1
Also, is it possible to do this for multiple fields? Say I have sentence1 and sentence2, and I'd like to run such a count on the concatenation of them.
I know termvectors gives the ttf and that mtermvectors can do it for multiple documents, but then I'd have to go through all the documents and aggregate the results myself.
Actually, only the top K terms would be sufficient for me, as long as I can control K.
Upvotes: 1
Views: 200
Reputation: 185
If your field 'sentence' is analyzed, you can get the TTF with a terms facet:
POST /index/type/_search
{
  "query": {
    "match_all": {}
  },
  "facets": {
    "sentence": {
      "terms": {
        "field": "sentence",
        "size": 10
      }
    }
  }
}
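The per-term counts will be in the facets part of the response. Note that a terms facet counts the documents containing each term, so the count equals the TTF only when a term occurs at most once per document, as in the example above.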
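As a rough illustration of reading those counts client-side, the sketch below assumes the response has the usual terms-facet layout: a "facets" object keyed by facet name, holding a "terms" list of term/count entries. The sample response is hand-written for the two example documents rather than captured from a cluster, and facet_counts is a hypothetical helper name.

```python
def facet_counts(response, facet_name):
    """Return (term, count) pairs from a terms-facet search response.

    Assumes the layout response["facets"][facet_name]["terms"], where each
    entry is a dict with "term" and "count" keys. Missing keys yield [].
    """
    entries = response.get("facets", {}).get(facet_name, {}).get("terms", [])
    return [(entry["term"], entry["count"]) for entry in entries]


# Hand-written sample shaped like a terms-facet response (an assumption,
# not real cluster output).
sample_response = {
    "facets": {
        "sentence": {
            "_type": "terms",
            "terms": [
                {"term": "cake", "count": 2},
                {"term": "delicious", "count": 1},
                {"term": "horrible", "count": 1},
            ],
        }
    }
}

for term, count in facet_counts(sample_response, "sentence"):
    print(term, count)
```

The "size" parameter in the request plays the role of K here, so the helper only has to unpack whatever the server returned.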
You can also pass an array of fields, ["sentence", "sentence2"], to count terms across multiple fields:
POST /index/type/_search
{
  "query": {
    "match_all": {}
  },
  "facets": {
    "multiple_sentence": {
      "terms": {
        "fields": ["sentence", "sentence2"],
        "size": 10
      }
    }
  }
}
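If exact total term frequencies are needed and walking the documents is acceptable, the per-document term vectors mentioned in the question can be summed on the client. This is only a minimal sketch: it assumes each document entry follows the term vectors response layout ("term_vectors", then the field name, then "terms", then each term with a "term_freq"), top_k_terms is a hypothetical helper, and the sample data is hand-written for the two example documents.

```python
from collections import Counter


def top_k_terms(docs, fields, k):
    """Sum term_freq over all docs and the given fields; return the top k.

    Ties are broken alphabetically so the result is deterministic.
    """
    totals = Counter()
    for doc in docs:
        vectors = doc.get("term_vectors", {})
        for field in fields:
            terms = vectors.get(field, {}).get("terms", {})
            for term, stats in terms.items():
                totals[term] += stats.get("term_freq", 0)
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hand-written entries shaped like per-document term vector results
# (an assumption, not real cluster output).
docs = [
    {"_id": "1", "term_vectors": {"sentence": {"terms": {
        "delicious": {"term_freq": 1}, "cake": {"term_freq": 1}}}}},
    {"_id": "2", "term_vectors": {"sentence": {"terms": {
        "horrible": {"term_freq": 1}, "cake": {"term_freq": 1}}}}},
]

for term, total in top_k_terms(docs, ["sentence"], 3):
    print(term, total)
```

Passing several field names in fields sums the counts as if the fields were concatenated, at the cost of fetching term vectors for every document.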
Upvotes: 3