Andrei

Reputation: 115

Elasticsearch case-insensitive search

I have the following annotation-based Elasticsearch configuration. I've set some fields to not_analyzed because I don't want them to be tokenized:

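    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.Document;
    import org.springframework.data.elasticsearch.annotations.Field;
    import org.springframework.data.elasticsearch.annotations.FieldIndex;
    import org.springframework.data.elasticsearch.annotations.FieldType;

    @Document(indexName = "abc", type = "efg")
    public class ResourceElasticSearch {
        @Id
        private String id;
        // not_analyzed: the value is indexed as a single, unmodified term
        @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
        private String name;
        @Field(type = FieldType.String, store = true)
        private List<String> tags = new ArrayList<>();
        @Field(type = FieldType.String)
        private String clientId;
        // not_analyzed: the value is indexed as a single, unmodified term
        @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
        private String virtualPath;
        @Field(type = FieldType.Date)
        private Date lastModifiedTime;
        @Field(type = FieldType.Date)
        private Date lastQueryTime;
        @Field(type = FieldType.String)
        private String modificationId;
        @Field(type = FieldType.String)
        private String realPath;
        @Field(type = FieldType.String)
        private String extension;
        @Field(type = FieldType.String)
        private ResourceType type;
    }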

Is it possible, using annotations, to make searches on name, virtualPath and tags case-insensitive? The search looks like this (wildcard search is required):

    private QueryBuilder getQueryBuilderForSearch(SearchCriteria criteria) {
        String virtualPath = criteria.getPath();

        return boolQuery()
                .must(wildcardQuery("virtualPath", virtualPath))
                .must(wildcardQuery("name", criteria.getName()));
    }

Upvotes: 2

Views: 3459

Answers (3)

Alex

Reputation: 1

You can add @Setting, which takes the path to a settings file, after @Document. The settings file should contain JSON like this:

    {
      "analysis": {
        "analyzer": {
          "case_insensitive": {
            "type": "custom",
            "tokenizer": "whitespace",
            "char_filter": ["html_strip"],
            "filter": ["lowercase", "asciifolding"]
          }
        }
      }
    }

Then reference the analyzer in the field annotation:

    @Field(type = FieldType.Keyword, analyzer = "case_insensitive")
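One thing worth checking with this analyzer is how its whitespace tokenizer treats values that contain spaces, since the question wants name and virtualPath kept untokenized. The following is only a plain-Java approximation, not Elasticsearch code: the class and method names are hypothetical, and the html_strip and asciifolding steps are left out. It contrasts the terms a whitespace tokenizer would roughly produce with the single term a keyword tokenizer would keep.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java approximation; these helpers are hypothetical and are not Elasticsearch APIs.
class TokenizerSketch {

    // Roughly what a whitespace tokenizer plus lowercase filter emits: one term per word.
    static List<String> whitespaceLowercase(String value) {
        return Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Roughly what a keyword tokenizer plus lowercase filter emits: the whole value as one term.
    static List<String> keywordLowercase(String value) {
        return Collections.singletonList(value.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String name = "Annual Report.PDF";
        System.out.println(whitespaceLowercase(name)); // [annual, report.pdf]
        System.out.println(keywordLowercase(name));    // [annual report.pdf]
    }
}
```

With the whitespace variant, a wildcard pattern that spans a space, such as `annual rep*`, would have no single term to match, which may or may not matter depending on the data.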

Upvotes: 0

Andrei Stefan

Reputation: 52366

What you want is not really possible, and the reason is not the Spring Data configuration but Elasticsearch itself: you indexed the data as not_analyzed, so it is stored exactly as provided and will stay that way.

If you want case-insensitive data, I suggest indexing with a custom analyzer that combines the keyword tokenizer with a lowercase token filter.
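To illustrate why lowercasing at index time helps, here is a minimal plain-Java simulation. It does not call Elasticsearch, and the helper names are hypothetical. It also assumes that the wildcard pattern itself is not run through the analyzer, which I believe holds for wildcard queries in the Elasticsearch versions of this era, so the caller has to lowercase the pattern as well.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Plain-Java simulation; the helper names are hypothetical and nothing here calls Elasticsearch.
class WildcardCaseSketch {

    // Index side: keyword tokenizer + lowercase filter yields a single lowercased term.
    static String indexedTerm(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Query side: translate a wildcard pattern (* = any run of characters, ? = one character)
    // into a regex and match it against the whole term, leaving the pattern's case untouched.
    static boolean wildcardMatches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(term).matches();
    }

    public static void main(String[] args) {
        String term = indexedTerm("/Docs/Reports/Q1.PDF");
        System.out.println(term);                                                      // /docs/reports/q1.pdf
        System.out.println(wildcardMatches("/Docs/*", term));                          // false
        System.out.println(wildcardMatches("/Docs/*".toLowerCase(Locale.ROOT), term)); // true
    }
}
```

Under that assumption, the question's getQueryBuilderForSearch would need to lowercase criteria.getPath() and criteria.getName() before passing them to wildcardQuery.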

Upvotes: 6

Andrei

Reputation: 115

Based on Andrei Stefan's suggestion, I found an approach that gives a result similar to using the annotations:

    @Bean
    public Client client() throws IOException {
        // env is the Spring Environment available in the configuration class
        TransportClient client = new TransportClient();
        TransportAddress address = new InetSocketTransportAddress(
                env.getProperty("elasticsearch.host"),
                Integer.parseInt(env.getProperty("elasticsearch.port")));
        client.addTransportAddress(address);

        // Analyzer named "keyword": keyword tokenizer (whole value as one token)
        // followed by a lowercase token filter
        XContentBuilder settingsBuilder = XContentFactory.jsonBuilder()
                .startObject()
                    .startObject("analysis")
                        .startObject("analyzer")
                            .startObject("keyword")
                                .field("tokenizer", "keyword")
                                .array("filter", "lowercase")
                            .endObject()
                        .endObject()
                    .endObject()
                .endObject();

        // Create the index with these settings only if it does not exist yet
        if (!client.admin().indices().prepareExists("abc").execute().actionGet().isExists()) {
            client.admin().indices().prepareCreate("abc").setSettings(settingsBuilder).get();
        }
        return client;
    }

Upvotes: 1
