Reputation: 1299
EDIT: Added my current query to the end
I have a large database of human names and am using elastic search (via symfony2's FOSElasticaBundle and Elastica) to do smarter searching of the names.
I have a full name field, and I want to index the people's names with standard, ngram, and phonetic analyzers.
I've got the analyzers set up in elastic search, and I can begin dumping data into the index. I'm wondering if the way I'm doing it here is the best way, or if I can apply the analyzers to a single field...the reason I ask is because when I do a get /website/person/:id, I see all three fields in plain text...I was expecting to see the analyzed data here, although I guess it must only exist in an inverted index rather than on the document. Examples I've seen use multiple fields, but is it possible to add multiple analyzers to a single field?
My config.yml:
fos_elastica:
clients:
default: { host: %elastica_host%, port: %elastica_port% }
indexes:
website:
settings:
index:
analysis:
analyzer:
phonetic_analyzer:
type: "custom"
tokenizer: "lowercase"
filter: ["name_metaphone", "lowercase", "standard"]
ngram_analyzer:
type: "custom"
tokenizer: "lowercase"
filter : [ "name_ngram" ]
filter:
name_metaphone:
encoder: "metaphone"
replace: false
type: "phonetic"
name_ngram:
type: "nGram"
min_gram: 2
max_gram: 4
client: default
finder: ~
types:
person:
mappings:
name: ~
nameNGram:
analyzer: ngram_analyzer
namePhonetic:
analyzer: phonetic_analyzer
When I check the mapping it looks good:
{
"website" : {
"mappings" : {
"person" : {
"_meta" : {
"model" : "acme\\websiteBundle\\Entity\\Person"
},
"properties" : {
"name" : {
"type" : "string",
"store" : true
},
"nameNGram" : {
"type" : "string",
"store" : true,
"analyzer" : "ngram_analyzer"
},
"namePhonetic" : {
"type" : "string",
"store" : true,
"analyzer" : "phonetic_analyzer"
}
}
}
}
}
}
When I GET the document, I see that all three fields are stored in plain text... maybe i need to set STORE: FALSE for these extra fields, or, is it not being analyzed properly?
{
"_index" : "website",
"_type" : "person",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source":{
"name":"John Doe",
"namePhonetic":"John Doe",
"nameNGram":"John Doe"
}
}
EDIT: The solution I'm currently using, which still requires some refinement but tests well for most names
//Create the query object
$boolQuery = new \Elastica\Query\Bool();
//Boost exact name matches
$exactMatchQuery = new \Elastica\Query\Match();
$exactMatchQuery->setFieldParam('name', 'query', $name);
$exactMatchQuery->setFieldParam('name', 'boost', 10);
$boolQuery->addShould($exactMatchQuery);
//Create a basic Levenshtein distance query
$levenshteinMatchQuery = new \Elastica\Query\Match();
$levenshteinMatchQuery->setFieldParam('name', 'query', $name);
$levenshteinMatchQuery->setFieldParam('name', 'fuzziness', 1);
$boolQuery->addShould($levenshteinMatchQuery);
//Create a phonetic query, seeing if the name SOUNDS LIKE the name that was searched
$phoneticMatchQuery = new \Elastica\Query\Match();
$phoneticMatchQuery->setFieldParam('namePhonetic', 'query', $name);
$boolQuery->addShould($phoneticMatchQuery);
//Create an NGRAM query
$nGramMatchQuery = new \Elastica\Query\Match();
$nGramMatchQuery->setFieldParam('nameNGram', 'query', $name);
$nGramMatchQuery->setFieldParam('nameNGram', 'boost', 2);
$boolQuery->addMust($nGramMatchQuery);
return $boolQuery;
Upvotes: 0
Views: 1714
Reputation: 3573
No, you can't have multiple analyzers on a single field. The way you are doing is correct way of applying multiple analyzers by having different field names for same field.
The reason you are getting namePhonetic and nameNGram also in _source field is use of
"store" : true
It tells the ElasticSearch that you need those extra fields also in response. Use
"store" : false
that will solve your problem.
If you want to see the analyzed data on a field you can use _analyze api of elasticsearch.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-analyze.html
Yes, these fields are stored in inverted index after analysis.
I hope I have answered all your doubts. Please let me know if you need more help on this.
Thanks
Upvotes: 1