Sorting a string field doesn't sort correctly

Question

If I sort a string field called code I get the following results:

    {code:ABC-120GB}
    {code:ABC-120GBY}
    {code:ABC-120GY}
    {code:ABC-120G}
    {code:ABC-120GB}
    {code:ABC-120GBY}
    {code:ABC-120GY}

This is the configuration for that field in the schema.xml file:

frances · Accepted Answer

It looks like the first-level sort on code_length is working. Does the sort on code work if it's the only sort specified? My suspicion is that you would see the same problem if code were the only sort field.

It seems quite likely that the problem is caused by variations in the data that we can't see, because you haven't included the real data. It would be interesting to know whether you can reproduce the problem with the data you actually posted, or with other non-sensitive values.

For one, I would suspect invisible variations in the character encoding. If that's the case, you could try changing the code field to a single-token text-based field rather than an unmodified string field. You then have a choice of filters to add to that fieldType to normalize encoding variations. A good one to consider is the ICU Folding Filter, which handles a lot of normalization and can be added to your fieldType definition with this line:
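
A minimal sketch of that line, assuming the ICU analysis jars (shipped in Solr's analysis-extras contrib) are available on the classpath:

```xml
<!-- Goes inside the <analyzer> element of the fieldType, after the tokenizer. -->
<filter class="solr.ICUFoldingFilterFactory"/>
```

Keep in mind that ICU folding also folds case and strips accents, so values that differ only in case or diacritics will sort and match as equal.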

You could consider this definition for a fieldType called "exact" that might work for you. By "tokenizing" the value into one large token, it preserves the exact-match searching you have now with the code field having type="string", and it will make Solr happy by only having one token on a sort field as well.
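
A sketch of what such a definition might look like, assuming the KeywordTokenizer is used to emit the whole value as a single token. The attribute values here are illustrative assumptions, not necessarily the settings the answer originally showed:

```xml
<!-- Hypothetical "exact" type: one token per value, normalized by ICU folding. -->
<fieldType name="exact" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer treats the entire field value as one token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Normalizes case, accents, and other Unicode variations. -->
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```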

Then you'd change your code field definition to:
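
A sketch of the changed field, assuming it is indexed and stored as a typical string field would be (the original attributes aren't shown in the question, so adjust them to match yours):

```xml
<!-- type switched from "string" to the "exact" fieldType; indexed/stored values are assumptions. -->
<field name="code" type="exact" indexed="true" stored="true" multiValued="false"/>
```

After changing a field's type you need to reindex your documents for the new analysis to take effect.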

Of course, this is somewhat speculative since I can't know what the data I'm not seeing might show me, and the ICU Folding Filter cannot adjust for everything that might be causing your trouble. But I hope this helps.

Note: because sorting doesn't work on multiValued fields, I recommend explicitly specifying multiValued="false" for any fields you plan to use for sorting. It will be less ambiguous.
