bigerock

Reputation: 713

How to add a full phrase tokenizer in NEST for Elasticsearch?

When I create a search using facets, I want the facet results to be on the whole phrase, not the individual words. I also want it NOT to be case sensitive, which 'not_analyzed' would be.

For example, if I have a music JSON object and want to organize the facet results by genre, I want to see each genre as the whole genre term ('rhythm and blues'), not one facet for 'rhythm' and one for 'blues'. I also want to be able to search on 'rhythm and blues' and have it match 'Rhythm and Blues' (notice the case).

The Elasticsearch documentation seems to suggest using a custom analyzer made of a tokenizer and a lowercase filter.

Here's the suggestion from Elasticsearch I mentioned (mid-page): http://www.elasticsearch.org/blog/starts-with-phrase-matching/

I want to be able to say something like this in my POCO (pseudocode):

[ElasticProperty(Analyzer = "tokenizer, lowercase")]
public string Genre { get; set; }

Upvotes: 4

Views: 3949

Answers (1)

Greg Marzouka

Reputation: 3325

Use the multi_field type in your mapping. Doing so lets you index the Genre field in two ways: analyzed (using the standard or lowercase analyzer) for conducting searches, and not_analyzed for faceting.

For more advanced mappings like this, the attribute-based mapping in NEST won't cut it. You'll have to use the fluent API, for example:

client.CreateIndex("songs", c => c
    .AddMapping<Song>(m => m
        .MapFromAttributes()
        .Properties(props => props
            .MultiField(mf => mf
                .Name(s => s.Genre)
                .Fields(f => f
                    // Analyzed sub-field, used for searching
                    .String(s => s.Name(o => o.Genre).Analyzer("standard"))
                    // not_analyzed "raw" sub-field, used for faceting on the whole phrase
                    .String(s => s.Name(o => o.Genre.Suffix("raw")).Index(FieldIndexOption.not_analyzed)))))));
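
One caveat: a not_analyzed field keeps the original casing, so 'Rhythm and Blues' and 'rhythm and blues' would end up as separate facet terms. If the facet values also need to be case-insensitive, the custom analyzer mentioned in the question (keyword tokenizer plus lowercase filter) could be applied to the raw sub-field instead. Below is a rough sketch of what the index settings and mapping might look like as raw JSON, not a tested setup. The analyzer name "lowercase_keyword", the type name "song", and the field name "genre" are assumptions made for illustration; the names NEST actually generates depend on its naming conventions.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "song": {
      "properties": {
        "genre": {
          "type": "multi_field",
          "fields": {
            "genre": { "type": "string", "analyzer": "standard" },
            "raw": { "type": "string", "analyzer": "lowercase_keyword" }
          }
        }
      }
    }
  }
}
```

With something like this in place, faceting on genre.raw should return whole phrases in lowercase, since the keyword tokenizer emits the entire value as a single token and the lowercase filter normalizes its case.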

Hope this helps!

Upvotes: 1
