kishore
kishore

Reputation: 31

How to Find Duplicate Elements in a array (One Document)?

How do I return duplicate elements in an array (ignoring case) present in MongoDB?

Input (document in MongoDB)

 {
    "userID" : "USER001",
    "userName" : "manish",
    "collegeIDs" : [ 
        "COL_HARY", 
        "COL_MARY", 
        "COL_JOHNS", 
        "COL_CAS", 
        "COL_JAMES", 
        "col_mary", 
        "COL_JOHNS", 
        "COL_JOHNS"
    ]
}

Expected Output:

{ "collegeIDs" : ["COL_MARY", "col_mary", "COL_JOHNS"] }

Note:

  1. It should only check the content and not the case.
  2. Return all the possibilities in the list

What I have tried

db.myList.aggregate([
  {"$project": {"collegeIDs":1}},
  {"$unwind":"$collegeIDs"},
  {"$project": {"collegeIDs": {"$toLower": "$collegeIDs"}}},
  {"$group": {"_id":{"_id":"$_id", "cid":"$collegeIDs"}, "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
  {"$group": {"_id": "$_id._id", "collegeIDs": {"$addToSet":"$_id.cid"}}}
]) 

Upvotes: 3

Views: 73

Answers (1)

Himanshu Sharma
Himanshu Sharma

Reputation: 3010

The following query can get you the expected output:

db.check.aggregate([
    {
        $unwind:"$collegeIDs"
    },
    {
        $addFields:{
            "collegeIdToLower":{
                $toLower:"$collegeIDs"
            }
        }
    },
    {
        $group:{
            "_id":"$collegeIdToLower",
            "collegeIDs":{
                $addToSet:"$collegeIDs"
            },
            "count":{
                $sum:1
            }
        }
    },
    {
        $match:{
            "count":{
                $gt:1
            }
        }
    },
    {
        $unwind:"$collegeIDs"
    },  
    {
        $group:{
            "_id":"NO_ID",
            "collegeIDs":{
                $push: "$collegeIDs"
            }
        }
    },
    {
        $project:{
            "_id":0
        }
    }
]).pretty()

Output:

{
    "collegeIDs" : [
        "col_mary",
        "COL_MARY",
        "COL_JOHNS"
    ]
}

Instead of converting the actual college ID to lower string, we would add one more field for that and use for grouping.

Upvotes: 1

Related Questions