Sachin

Reputation: 3544

Why does sorting in Elasticsearch not sort data properly?

First question: I have around 45000 records that I want to sort on the chrom and pos keys. I wrote the query shown below to do the sorting.

   // The script below sorts the chromosomes
   SortBuilder builder = new ScriptSortBuilder(
                    "s = doc['chrom'].value; s=s.substring(3); s.indexOf('X')!=-1?23:s.indexOf('Y')!=-1?24:s.indexOf('MT')!=-1?25:s.indexOf('M')!=-1?25:s;" +
                    "n = org.elasticsearch.common.primitives.Ints.tryParse(s); if (n != null) { String.format(\"%010d\",n)} else { s }",
                    String.class.getSimpleName().toLowerCase());
   SearchRequestBuilder setQuery = this.getClient().prepareSearch(this.getIndex())
                    .setTypes(this.getType())
                    .addSort(builder)
                    .addSort(Keys.POS.toLowerCase(), SortOrder.ASC)
                    .setQuery(QueryBuilders.matchQuery(Keys.SAMPLE_ID_DB_KEY, entityID.toLowerCase()))
                    .setSize(100)
                    .setSearchType(SearchType.QUERY_AND_FETCH)
                    .setScroll(new TimeValue(60000000));

However, when I run the query, the results come back in several batches. Each batch is sorted internally, but not relative to the other batches (i.e. if the entry 1:11111 is in the first batch, the second batch can still contain an entry with a value less than 1:11111).

Am I missing something?

Second question: When I do not specify the size in the query, it does not return all 45000 entries. Why is that?

Edit
Data in JSON format

{
  "chrom": "chr1",
  "pos": 762273,
  "isIndel": false,
  "interpretation": "",
  "sampleID": "xyz",
  "isSignedOff": false,
  "ownerID": null,
  "entityType": 0
}
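
For reference, the ordering the sort script above is aiming for can be sketched in plain Java, outside Elasticsearch. This is only an illustration: the class and method names (ChromSortKey, rank, sorted) are made up for the example, it matches chromosome names exactly instead of using indexOf, and sending unrecognized names to the end is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ChromSortKey {

    // Hypothetical helper: maps "chr1".."chr22" to 1..22, "chrX" to 23,
    // "chrY" to 24 and "chrM"/"chrMT" to 25.
    static int rank(String chrom) {
        String s = chrom.startsWith("chr") ? chrom.substring(3) : chrom;
        switch (s) {
            case "X":
                return 23;
            case "Y":
                return 24;
            case "M":
            case "MT":
                return 25;
            default:
                try {
                    return Integer.parseInt(s);
                } catch (NumberFormatException e) {
                    // Assumption: names that are neither numeric nor X/Y/M/MT sort last.
                    return Integer.MAX_VALUE;
                }
        }
    }

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sorted(List<String> chroms) {
        List<String> copy = new ArrayList<>(chroms);
        copy.sort(Comparator.comparingInt(ChromSortKey::rank));
        return copy;
    }

    public static void main(String[] args) {
        // prints [chr1, chr2, chr10, chrX, chrY, chrM]
        System.out.println(sorted(Arrays.asList("chrX", "chr10", "chr2", "chrM", "chr1", "chrY")));
    }
}
```

Comparing on an integer rank avoids the zero-padded string that the script builds with String.format, but the resulting order is the same for numeric chromosome names.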

Upvotes: 0

Views: 1618

Answers (1)

imotov

Reputation: 30163

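Switch to SearchType.QUERY_THEN_FETCH instead of SearchType.QUERY_AND_FETCH. With QUERY_AND_FETCH each shard returns its own top `size` hits, so a page can hold up to `size` times the number of shards hits, and consecutive scroll pages are not ordered relative to one another. QUERY_THEN_FETCH merges the sorted results from all shards before fetching the documents, so the sort order holds across pages.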
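
The batch effect described in the question can be reproduced with a toy model in plain Java. This is not Elasticsearch code: the class ShardPagingSketch and its methods are hypothetical, and the model assumes that per-shard paging hands back the next `size` hits from every shard on each page, while a merged search sorts across all shards before cutting pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ShardPagingSketch {

    // Each inner list stands in for one shard's hits, already sorted on that shard.

    // Simplified model of per-shard paging: every page takes the next `size`
    // hits from each shard and sorts only within that page.
    static List<List<Integer>> perShardPages(List<List<Integer>> shards, int size) {
        List<List<Integer>> pages = new ArrayList<>();
        for (int from = 0; ; from += size) {
            List<Integer> page = new ArrayList<>();
            for (List<Integer> shard : shards) {
                int end = Math.min(from + size, shard.size());
                if (from < end) {
                    page.addAll(shard.subList(from, end));
                }
            }
            if (page.isEmpty()) {
                return pages; // every shard is exhausted
            }
            Collections.sort(page);
            pages.add(page);
        }
    }

    // Simplified model of a global merge: sort all hits, then cut pages of `size`.
    static List<List<Integer>> mergedPages(List<List<Integer>> shards, int size) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> shard : shards) {
            all.addAll(shard);
        }
        Collections.sort(all);
        List<List<Integer>> pages = new ArrayList<>();
        for (int from = 0; from < all.size(); from += size) {
            pages.add(new ArrayList<>(all.subList(from, Math.min(from + size, all.size()))));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<List<Integer>> shards = Arrays.asList(
                Arrays.asList(10, 20, 30, 40),
                Arrays.asList(1, 2, 3, 4));
        // prints [[1, 2, 10, 20], [3, 4, 30, 40]]
        System.out.println(perShardPages(shards, 2));
        // prints [[1, 2], [3, 4], [10, 20], [30, 40]]
        System.out.println(mergedPages(shards, 2));
    }
}
```

In the first output each page is sorted, yet the second page starts with 3, which is smaller than the 20 that ended the first page. In the second output the order holds from one page to the next.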

Upvotes: 1
