Kaman Wu
Kaman Wu

Reputation: 199

$lookup with case sensitive fields

I know how to do the case insensitive search.

example :

collection member looks like:

{
"_id" : ObjectId("5b439c02a5439e00568291fe"),
"members" : [ 
    {
        "email" : "[email protected]"
    }, 
    {
        "email" : "[email protected]"
    }, 
    {
        "email" : "[email protected]"
    }
],
"groupname" : "abuse",
"active" : true,
"createdAt" : ISODate("2018-07-09T17:31:46.098Z"),
"updatedAt" : ISODate("2018-07-09T17:31:46.098Z"),
"__v" : 0
}

collection member_user looks like:

{
"_id" : ObjectId("5a7dfce545e13f0233b2e451"),
"email" : "[email protected]",
"__v" : 0
}

{
"_id" : ObjectId("5b43ac89bfd90aa9cc4a68c2"),
"email" : "[email protected]",
"__v" : 0
}

This is code I use:

    db.getCollection('member').aggregate([
    {$unwind: "$members"},
    {$lookup: {'from': 'member_user', 'localField': 'members.email', 'foreignField': 'email', 'as': 'users'}},
    {$unwind: {path: '$users', preserveNullAndEmptyArrays: true}},
    {$match : { users: { $exists: false } }},
], 
{
    collation: {
        locale: 'en_US',
        strength: 1
    }
}
    );

when the dataset is small, the speed always very fast. BUT I have 24K members and 7K users. As I said, when {$match : { users: { $exists: true } }}, speed is OK, but when {$match : { users: { $exists: false } }}, speed almost 60 times. (1 sec vs 1 min)

Without collation, the speed is always fast

Upvotes: 3

Views: 2416

Answers (1)

Ashh
Ashh

Reputation: 46451

You need to convert the localField and foreignField into lower case using $toLower aggregation then you can easily match your fields.

db.collection.aggregate([
  { "$unwind": "$members" },
  { "$addFields": { "members.email": { "$toLower": "$members.email" }}},
  { "$lookup": {
    "from": "users",
    "let": { "membersEmail": "$members.email" },
    "pipeline": [
      { "$addFields": { "email": { "$toLower": "$email" }}},
      { "$match": { "$expr": { "$eq": [ "$email", "$$membersEmail" ] } } }
    ],
    "as": "members.email"
  }},
  { "$unwind": "$members.email" },
  {  "$group": {
    "_id": "$_id",
    "groupname": { "$first": "$groupname" },
    "active": { "$first": "$active" },
    "members": { "$push": "$members.email" }
  }}
])

Upvotes: 3

Related Questions