Pedro Dusso
Pedro Dusso

Reputation: 2140

Mongo unique index case insensitive

@CompoundIndexes({
    @CompoundIndex(name = "fertilizer_idx",
        unique = true,
        def = "{'name': 1, 'formula': 1, 'type': 1}")
})
public class Fertilizer extends Element implements Serializable {
//class stuff
}

Is it possible to create the index case insensitive? Right now it is differentiating from NAME to NAMe. Saving a second field lowercase (or uppercase) is not a possibility for me.

Thanks, Pedro

Upvotes: 15

Views: 16594

Answers (5)

Dhaval Simaria
Dhaval Simaria

Reputation: 1964

Spring Data Mongo2.2 provides 'Annotation-based Collation support through @Document and @Query.'

Ref. What's new in Spring Data Mongo2.2

@Document(collection = 'fertilizer', collation = "{'locale':'en', 'strength':2}")
public class Fertilizer extends Element implements Serializable {

    @Indexed(unique = true)
    private String name;
    //class stuff
}

When the application is started, it will create the indexes along with respective collation for every document.

Collation changes at DB level based on Java code

Upvotes: 6

Ivan Osipov
Ivan Osipov

Reputation: 792

As mentioned above by Shaishab Roy you should use collation.strength

There are no way to define that with annotations of spring data

But you can implement it manually. To implement this behavior with your spring application you should create event listener to listen that your application is ready, inject MongoOperations bean and define index like in example below:

@Configuration
public class MongoConfig {
  @Autowired
  private MongoOperations mongoOperations;

  @EventListener(ApplicationReadyEvent.class)
  public void initMongo() {
    mongoOperations
    .indexOps(YourCollectionClass.class)
    .ensureIndex(
        new Index()
            .on("indexing_field_name", Sort.Direction.ASC)
            .unique()
            .collation(Collation.of("en").strength(2)));
  }
}

Upvotes: 0

Shaishab Roy
Shaishab Roy

Reputation: 16805

Prior of MongoDB version 3.4 we were unable to create index with case insensitive.

In version 3.4 has collation option that allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

The collation option has the following syntax:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

where the locale field is mandatory; all other fields are optional.

To create index with case insensitive we need to use mandatory field locale and strength field for string comparison level. strength allows value rage 1 - 5. read more about collation

The strength attribute determines whether accents or case are taken into account when collating or matching text

Example:

if strength=1 then role = Role = rôle

if strength=2 then role = Role < rôle

if strength=3 then role < Role < rôle

Comparison level doc

So we need to use strength=2 to create index. like:

db.collectionName.createIndex(
  { name: 1, formula: 1, type: 1 },
  { 
    name: "fertilizer_idx",
    collation: {locale: "en", strength: 2},
    unique: true
  }
)

N.B: collation option is not available for text indexes.

Upvotes: 24

Joe Stogner
Joe Stogner

Reputation: 23

db.collection.createIndex(
{ name: 1, formula: 1, type: 1 },
{ name: "fertilizer_idx", unique: true, collation:{ locale: "en", strength: 2 } }
)

Use collation as an option for db.collection.createIndex()

more info here: https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/

here for locale/language information: https://docs.mongodb.com/manual/reference/collation-locales-defaults/#collation-languages-locales

strength: integer

Optional. The level of comparison to perform. Possible values are:

1: Primary level of comparison. Collation performs comparisons of the base characters only, ignoring other differences such as diacritics and case.

2: Secondary level of comparison. Collation performs comparisons up to secondary differences, such as diacritics. That is, collation performs comparisons of base characters (primary differences) and diacritics (secondary differences). Differences between base characters takes precedence over secondary differences.

3: Tertiary level of comparison. Collation performs comparisons up to tertiary differences, such as case and letter variants. That is, collation performs comparisons of base characters (primary differences), diacritics (secondary differences), and case and variants (tertiary differences). Differences between base characters takes precedence over secondary differences, which takes precedence over tertiary differences. This is the default level.

4: Quaternary Level. Limited for specific use case to consider punctuation when levels 1-3 ignore punctuation or for processing Japanese text.

5: Identical Level. Limited for specific use case of tie breaker.

Mongo 3.4 has collation, which allows users to specify language-specific rules for string comparison

Collation includes:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

Upvotes: 1

felix
felix

Reputation: 9285

Yes, it is now available in MongoDB 3.4 with the new collation feature.

you can create a case insensitive index like this:

db.collection.createIndex({
   name:1,
   formula:1,
   type:1
},
{
   collation:{
      locale:"en",
      strength:2
   }
});

where the strength attribute is the comparaison level

you can then get case insensitive match with this query:

db.collection.find({name: "name"}).collation({locale: "en", strength: 2});

see collation for details

if you upgraded to mongodb 3.4 from a previous version, you may need to set compatibility before creating the index like this

db.adminCommand( { setFeatureCompatibilityVersion: "3.4" } )

Upvotes: 9

Related Questions