Reputation: 3511
Map:
function () {
    emit(this.thread, {
        max_year: this.date.getFullYear(),
        min_year: this.date.getFullYear(),
        max_month: this.date.getMonth(),
        min_month: this.date.getMonth(),
        count: 1
    });
}
Reduce:
function (key, values) {
    max_year = values[0].max_year;
    min_year = values[0].min_year;
    max_month = values[0].max_month;
    min_month = values[0].min_month;
    var sum = 0;
    if (values.length > 1) {
        for (i in values) {
            if (values[i].max_year > max_year) {
                max_year = values[i].max_year;
            }
            if (values[i].min_year < min_year) {
                min_year = values[i].min_year;
            }
            if (values[i].max_month > max_month) {
                max_month = values[i].max_month;
            }
            if (values[i].min_month < min_month) {
                min_month = values[i].min_month;
            }
            sum += values[i].count;
        }
    }
    return {"max year": max_year, "min year": min_year, "max month": max_month, "min month": min_month, "No of posts": sum};
}
Output:
{u'_id': u'Sujet Top 5 TED POST', u'value': {u'No of posts': 8.0, u'min month': 0.0, u'max month': 6.0, u'max year': 2011.0, u'min year': 2010.0}}
{u'_id': u'Sujet Top 5 des meilleurs guitaristes de lhistoire du Rock', u'value': {u'No of posts': 42.0, u'min month': 2.0, u'max month': 10.0, u'max year': 2011.0, u'min year': 2009.0}}
{u'_id': u'Sujet Top ALEJANDRO GONZALEZ INARRITU', u'value': {u'No of posts': 29.0, u'min month': 0.0, u'max month': 9.0, u'max year': 2011.0, u'min year': 2008.0}}
{u'_id': u'Sujet Top ANDY et LARRY WACHOWSKY', u'value': {u'No of posts': 40.0, u'min month': 0.0, u'max month': 11.0, u'max year': 2011.0, u'min year': 2008.0}}
{u'_id': u'Sujet Top BRYAN SINGER', u'value': {u'No of posts': 50.0, u'min month': 0.0, u'max month': 11.0, u'max year': 2011.0, u'min year': 2006.0}}
{u'_id': u'Sujet Top Cinma 2010', u'value': {u'No of posts': nan, u'min month': None, u'max month': None, u'max year': None, u'min year': None}}
{u'_id': u'Sujet Top Cinma 2011', u'value': {u'No of posts': nan, u'min month': None, u'max month': None, u'max year': None, u'min year': None}}
As you can see, for some threads the "No of posts" field prints as 'nan' and all the other fields print as 'None'. This doesn't occur when I Map Reduce just to count the number of posts without trying to work on the timestamps. I also notice that nan is printed when "No of posts" is large (around 1000 or so). Also, without the 'count' and 'sum', all the manipulations on max year, min year and month work fine. Thank you.
Upvotes: 1
Views: 342
Reputation: 26258
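Your reduce function needs to return a value in the same format as the second argument to emit(), because of the way MongoDB Map-Reduce works: the results of a reduce function may be passed in to reduce again. I suspect that this is where the nan and None are coming from. On such a second pass, values[i].count and values[i].max_year are undefined for any value that came out of an earlier reduce call, since that call stored them under "No of posts" and "max year"; adding undefined to the sum gives NaN. That would also explain why you only see it on threads with many posts, where the values are more likely to be reduced in several passes. Specifically here, you just need to adjust the key names in the object you return from your reduce: for instance, rather than "max year" (in reduce) you should use max_year, and likewise return the sum as count rather than "No of posts".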
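As a rough sketch of what that could look like (assuming the map function from the question stays unchanged; the sample values a, b, c and the key "t" are made up for illustration and are not from your data), reduce returns the same field names that map emits, so its output can safely be fed back into reduce:

```javascript
// Sketch of a reduce function that is safe to re-run on its own output:
// it returns exactly the field names that map() passes to emit().
function reduce(key, values) {
    // Start from the first value, then fold in every value (including it).
    var result = {
        max_year: values[0].max_year,
        min_year: values[0].min_year,
        max_month: values[0].max_month,
        min_month: values[0].min_month,
        count: 0
    };
    for (var i = 0; i < values.length; i++) {
        var v = values[i];
        if (v.max_year > result.max_year) { result.max_year = v.max_year; }
        if (v.min_year < result.min_year) { result.min_year = v.min_year; }
        if (v.max_month > result.max_month) { result.max_month = v.max_month; }
        if (v.min_month < result.min_month) { result.min_month = v.min_month; }
        result.count += v.count;
    }
    return result;
}

// Hypothetical emitted values for one thread (illustration only).
var a = {max_year: 2010, min_year: 2010, max_month: 0, min_month: 0, count: 1};
var b = {max_year: 2011, min_year: 2011, max_month: 6, min_month: 6, count: 1};
var c = {max_year: 2010, min_year: 2010, max_month: 3, min_month: 3, count: 1};

// Simulate a re-reduce: the output of one call is an input to the next.
var partial = reduce("t", [a, b]);
var combined = reduce("t", [partial, c]);
```

Here combined should match what a single call reduce("t", [a, b, c]) returns, which is the property the original reduce was missing. If you still want the "No of posts" style labels in the final result, a finalize function is one place to rename the fields after all reducing is done.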
For more on writing correct map and reduce functions, see the MongoDB Map-Reduce documentation.
Upvotes: 2