tylerl

Reputation: 1180

MongoDB - Get Names of All Keys Matching Criteria in a Collection

As the title says, I need to retrieve the names of all the keys in my MongoDB collection, but grouped by the value of a key/value pair that every document has. Here's my clunky analogy: if you imagine the original collection is a zoo, I need a new collection that contains all the keys Zebras have, all the keys Lions have, and all the keys Giraffes have. The different animal types share many of the same keys, but the keys need to be listed per animal type, because the user needs to be able to search, for example, for zebras taller than 3ft and giraffes shorter than 10ft.

Here's a bit of example code that I ran which worked well - it grabbed all the unique keys in my entire collection and threw them into their own collection:

db.runCommand({
  "mapreduce" : "MyZoo",
  "map" : function() {
    for (var key in this) { emit(key, null); }
  },
  "reduce" : function(key, stuff) { return null; },
  "out": "MyZoo" + "_keys"
})

I'd like a version of this command that would look through the MyZoo collection for animals with "type":"zebra", find all the unique keys, and place them in a new collection (MyZoo_keys) - then do the same thing for "type":"lion" & "type":"giraffe", giving each "type" its own array of keys.

Here's the collection I'm starting with:

{
    "name": "Zebra1",
    "height": "300",
    "weight": "900",
    "type": "zebra",
    "zebraSpecific1": "somevalue"
},
{
    "name": "Lion1",
    "height": "325",
    "weight": "1200",
    "type": "lion"
},
{
    "name": "Zebra2",
    "height": "500",
    "weight": "2100",
    "type": "zebra",
    "zebraSpecific2": "somevalue"
},
{
    "name": "Giraffe",
    "height": "4800",
    "weight": "2400",
    "type": "giraffe",
    "giraffeSpecific1": "somevalue",
    "giraffeSpecific2": "someothervalue"
}

And here's what I'd like the MyZoo_keys collection to look like:

{
    "zebra": [
        {
            "name": null,
            "height": null,
            "weight": null,
            "type": null,
            "zebraSpecific1": null,
            "zebraSpecific2": null
        }
    ],
    "lion": [
        {
            "name": null,
            "height": null,
            "weight": null,
            "type": null
        }
    ],
    "giraffe": [
        {
            "name": null,
            "height": null,
            "weight": null,
            "type": null,
            "giraffeSpecific1": null,
            "giraffeSpecific2": null
        }
    ]
}

That's probably imperfect JSON, but you get the idea...

Thanks!

Upvotes: 0

Views: 356

Answers (1)

BatScream

Reputation: 19700

You can modify your code to dump the results in a more readable and organized format.

The map function:

  • Emit the type of animal as the key, and an array of field names for each animal (document) as the value. Leave out the _id field.

Code:

var map = function(){
    var keys = [];
    // Collect every field name of this document except _id.
    Object.keys(this).forEach(function(k){
        if(k != "_id"){
            keys.push(k);
        }
    });
    emit(this.type, {"keys": keys});
};

The reduce function:

  • For each type of animal, consolidate and return the unique keys.
  • Use an object (uniqueKeys) to check for duplicates. This reduces the running time at the cost of some extra memory, since each lookup is O(1).

Code:

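var reduce = function(key, values){
    var uniqueKeys = {};  // field names already seen for this animal type
    var result = [];
    values.forEach(function(value){
        value.keys.forEach(function(k){
            // Own-property check, so a field named e.g. "constructor" is not skipped.
            if(!Object.prototype.hasOwnProperty.call(uniqueKeys, k)){
                uniqueKeys[k] = 1;
                result.push(k);
            }
        });
    });
    // Same shape as the emitted value, so reduce can safely be re-applied.
    return {"keys": result};
};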
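
To see how the two functions cooperate, here is a minimal sketch that replays the same logic in plain JavaScript against the sample documents from the question, with no MongoDB involved. The names runMapReduce, mapDoc and reduceKeys are hypothetical helpers (not MongoDB APIs), and a Set stands in for the lookup object:

```javascript
// Plain-JavaScript stand-in for the map-reduce run; nothing here talks to MongoDB.
const docs = [
  { name: "Zebra1", height: "300", weight: "900", type: "zebra", zebraSpecific1: "somevalue" },
  { name: "Lion1", height: "325", weight: "1200", type: "lion" },
  { name: "Zebra2", height: "500", weight: "2100", type: "zebra", zebraSpecific2: "somevalue" },
  { name: "Giraffe", height: "4800", weight: "2400", type: "giraffe",
    giraffeSpecific1: "somevalue", giraffeSpecific2: "someothervalue" }
];

// Map step: emit the animal type with the document's field names (minus _id).
function mapDoc(doc, emit) {
  const keys = Object.keys(doc).filter(k => k !== "_id");
  emit(doc.type, { keys: keys });
}

// Reduce step: merge the key lists, keeping first-seen order without duplicates.
function reduceKeys(key, values) {
  const seen = new Set();
  values.forEach(v => v.keys.forEach(k => seen.add(k)));
  return { keys: Array.from(seen) };
}

// Hypothetical driver: group emitted values by key, then reduce each group.
function runMapReduce(documents) {
  const groups = new Map();
  documents.forEach(doc => mapDoc(doc, (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }));
  const out = {};
  groups.forEach((values, key) => {
    // A key with a single emitted value is stored as-is, without calling reduce.
    out[key] = values.length === 1 ? values[0] : reduceKeys(key, values);
  });
  return out;
}

const keysByType = runMapReduce(docs);
console.log(JSON.stringify(keysByType, null, 2));
```

MongoDB may call reduce more than once for the same key, feeding earlier reduce output back in as one of the values. That is why the reduce function returns the same {"keys": [...]} shape that map emits.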

Invoking Map-Reduce:

db.MyZoo.mapReduce(map, reduce, {out: "t1"});

Aggregating the result:

db.t1.aggregate([
    {$project: {"_id": 0, "animal": "$_id", "keys": "$value.keys"}}
])

Sample output:

{
        "animal" : "lion",
        "keys" : [
                "name",
                "height",
                "weight",
                "type"
        ]
}
{
        "animal" : "zebra",
        "keys" : [
                "name",
                "height",
                "weight",
                "type",
                "zebraSpecific1",
                "zebraSpecific2"
        ]
}
{
        "animal" : "giraffe",
        "keys" : [
                "name",
                "height",
                "weight",
                "type",
                "giraffeSpecific1",
                "giraffeSpecific2"
        ]
}   
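
If the single-document layout from the question is really needed, the rows above could be reshaped on the client. A minimal sketch in plain JavaScript, assuming the aggregation results have already been loaded into an array; rows and toKeyTemplate are hypothetical names used only for illustration:

```javascript
// `rows` mirrors the sample output above; in practice it would come from the aggregation.
const rows = [
  { animal: "lion", keys: ["name", "height", "weight", "type"] },
  { animal: "zebra", keys: ["name", "height", "weight", "type", "zebraSpecific1", "zebraSpecific2"] },
  { animal: "giraffe", keys: ["name", "height", "weight", "type", "giraffeSpecific1", "giraffeSpecific2"] }
];

// Build { <animal>: [ { <key>: null, ... } ], ... } as sketched in the question.
function toKeyTemplate(resultRows) {
  const template = {};
  resultRows.forEach(row => {
    const fields = {};
    row.keys.forEach(k => { fields[k] = null; });
    template[row.animal] = [fields];
  });
  return template;
}

const zooKeys = toKeyTemplate(rows);
console.log(JSON.stringify(zooKeys, null, 4));
```

The resulting object could then be inserted into MyZoo_keys as a single document, although one document per animal type (as in the sample output) is usually easier to query.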

Upvotes: 2
