Jura Khrapunov

Reputation: 1024

How to MapReduce a collection to get a combined array of all values from an array field

I have the following documents in the database:

{ 
    id: 12345, 
    friends: [123,345,678,908]
},
{ 
    id: 908, 
    friends: [123,345]
}

Is there a way to get an array of all unique friend IDs from the entire collection?

Upvotes: 0

Views: 79

Answers (2)

Asya Kamsky

Reputation: 42352

To get the distinct friends values, you do not need to write a map/reduce job.

Just run:

> db.collection.distinct("friends")
[ 123, 345, 678, 908 ]
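
For readers wondering why this works on an array field: distinct treats each element of an array as a separate value. The sketch below emulates that behaviour in plain JavaScript on an in-memory copy of the sample documents. The helper name distinctValues is hypothetical and this is not how MongoDB implements it. The result is sorted only to make the output predictable.

```javascript
// In-memory stand-in for the collection (assumption: same data as the question).
const docs = [
  { id: 12345, friends: [123, 345, 678, 908] },
  { id: 908, friends: [123, 345] },
];

// Hypothetical helper that mimics distinct() on a possibly array-valued field.
function distinctValues(documents, field) {
  const seen = new Set();
  for (const doc of documents) {
    const value = doc[field];
    if (value === undefined) continue; // skip documents without the field
    // Array elements count as separate values; scalars are used as-is.
    const items = Array.isArray(value) ? value : [value];
    for (const item of items) seen.add(item);
  }
  // Sorted numerically here only for a stable, readable result.
  return [...seen].sort((a, b) => a - b);
}

const uniqueFriends = distinctValues(docs, "friends");
console.log(uniqueFriends); // [ 123, 345, 678, 908 ]
```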

Upvotes: 1

Quetzalcoatl

Reputation: 3067

I'm not too familiar with MongoDB's MapReduce implementation, but I imagine you could have your mappers emit each value passed to them as a key, with null as the value.

That way each reducer receives a given key (a friend ID) only once, along with all of its values, and can write the key out a single time without iterating over the values. Since the values are null anyway, there is no point in iterating over them; worse, iterating and writing the key on each pass would output it several times, when you want it written exactly once so the result is distinct.

However, bear in mind that your keys will be spread across the reducers' output files (for example, reducer 1 might output 123 and reducer 2 might output 345), so you may have to consolidate the contents of those files afterwards to build your array.
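
A minimal sketch of this idea in plain JavaScript, run in memory rather than against MongoDB or a real MapReduce cluster. The functions mapDoc, shuffle and reduceKey, and the two-reducer partitioning by key parity, are all assumptions made for illustration.

```javascript
// In-memory stand-in for the collection (assumption: same data as the question).
const docs = [
  { id: 12345, friends: [123, 345, 678, 908] },
  { id: 908, friends: [123, 345] },
];

// Map phase: emit every friend ID as a key with a null value.
function mapDoc(doc, emit) {
  for (const friendId of doc.friends) emit(friendId, null);
}

// Shuffle phase: group the emitted values by key, so each key is reduced once.
function shuffle(documents) {
  const groups = new Map();
  for (const doc of documents) {
    mapDoc(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  return groups;
}

// Reduce phase: write the key out once and ignore the null values.
function reduceKey(key, _values) {
  return key;
}

// Simulate two reducers; partitioning by key parity is an arbitrary choice.
const reducerCount = 2;
const reducerOutputs = Array.from({ length: reducerCount }, () => []);
for (const [key, values] of shuffle(docs)) {
  reducerOutputs[key % reducerCount].push(reduceKey(key, values));
}

// Consolidation step: merge the per-reducer outputs into a single array.
const friends = reducerOutputs.flat().sort((a, b) => a - b);
console.log(reducerOutputs); // [ [ 678, 908 ], [ 123, 345 ] ]
console.log(friends); // [ 123, 345, 678, 908 ]
```

Each key appears in exactly one reducer's output, so the final merge needs no further de-duplication.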

Upvotes: 0
