Reputation: 391
I have the following record set in elastic search
POST /books/book/1
{
"title" : "JavaScript: The Good Parts",
"author" : "Douglas Crockford",
"language" : "JavaScript",
"publishYear" : 2009,
"soldCopy" : "50"
}
POST /books/book/2
{
"title" : "JavaScript: The Good Parts",
"author" : "Douglas Crockford",
"language" : "JavaScript",
"publishYear" : 2009,
"soldCopy" : "110"
}
POST /books/book/3
{
"title" : "JavaScript: The Good Parts",
"author" : "Douglas Crockford1",
"language" : "JavaScript",
"publishYear" : 2011,
"soldCopy" : "2"
}
POST /books/book/4
{
"title" : "JavaScript: The Good Parts",
"author" : "Douglas Crockford2",
"language" : "JavaScript",
"publishYear" : 2012,
"soldCopy" : "5"
}
I am using the following elastic search query to get the distinct title and author based on a given year 2009.The output for the query that i am expecting is
JavaScript: The Good Parts Douglas Crockford
but in the response i get back 2 records with same ouput like :
JavaScript: The Good Parts Douglas Crockford
JavaScript: The Good Parts Douglas Crockford
Query used for elastic search was :
{
"query": {
"match": {
"publishYear": "2009" }
}
}
equivalent select query that i am try to create in database terms is :
select distinct title,author from book where publishYear = '2009'
How could i get the same output from elastic search as my sql query ? Thanks
Upvotes: 1
Views: 2206
Reputation: 12672
Distinct in sql is equivalent to terms aggregation in elasticsearch.
{
"query": {
"match": {
"publishYear": "2009"
}
},
"aggs": {
"unique_author": {
"terms": {
"field": "author",
"size": 10
}
},
"unique_book": {
"terms": {
"field": "title",
"size": 10
}
}
},
"size": 0
}
For this to work you would have to make title and author field as not_analyzed or you could also use keyword tokenizer
with lowercase
token filter. The better option would be to make them multi fields
You could create index like this
PUT books
{
"mappings": {
"book":{
"properties": {
"title":{
"type": "string",
"fields": {
"raw":{
"type": "string",
"index": "not_analyzed"
}
}
},
"author":{
"type": "string",
"fields": {
"raw":{
"type": "string",
"index": "not_analyzed"
}
}
},
"language":{
"type": "string"
},
"publishYear":{
"type": "integer"
},
"soldCopy":{
"type": "string"
}
}
}
}
}
Then use .raw in aggregation.
Upvotes: 1