Reputation: 1771
I've successfully used MongoDB and MapReduce to aggregate a sales statement of over 5 million lines down to around 60k. I'm very happy with the results, but I'm puzzled by one line in the output where it seems to have concatenated the values instead of summing them, so I've ended up with a value field containing " 0.00590100000000.0059010.133150000000.0053100.002960.043208000.00189"
Has anyone experienced this before?
On close analysis of the lines involved in the raw statement, I can't see anything that would have caused it as they appear to be exactly the same. There are even values for the same identifier that have been summed.
My code is as follows. Can anyone spot anything that might be causing it? As I said, it's only 7 lines from a raw statement of 5.2 million, so accuracy is pretty good. It's just not spot on, and I know it will bug me.
mongoimport -d test -c sales --type csv --file sales_rawdata.csv --headerline
var mapFunction1 = function() {
    emit({video_id: this.video_id, isrc: this.isrc, country: this.country}, this.amount_payable);
};
var reduceFunction1 = function(keyIsrc, valuesAmountPayable) {
    return Array.sum(valuesAmountPayable);
};
db.sales.mapReduce(
    mapFunction1,
    reduceFunction1,
    { out: "sales_total_by_country_and_isrc" }
)
db.sales_total_by_country_and_isrc.find()
mongoexport --csv -d test -c sales_total_by_country_and_isrc -q '{value: {$ne: 0}}' -f "_id.video_id","_id.isrc","_id.country","value" -o sales_total_by_country_and_isrc.csv
Upvotes: 0
Views: 120
Reputation: 3012
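It may be that one of your amount_payable values was stored as a string. If so, Array.sum will concatenate the values rather than add them, because JavaScript's + operator switches to string concatenation as soon as one operand is a string.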
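To illustrate the effect, here is a minimal sketch in plain JavaScript. `naiveSum` is a hypothetical stand-in for a helper that adds array elements with `+` (I am assuming the shell's `Array.sum` behaves this way), and the sample values are only picked from the concatenated output in the question for illustration:

```javascript
// Hypothetical stand-in for a "+"-based sum helper (assumption, not the
// shell's actual source): it adds the elements left to right with +=.
function naiveSum(values) {
    var total = values[0];
    for (var i = 1; i < values.length; i++) {
        total += values[i];
    }
    return total;
}

// All elements are numbers: + performs numeric addition.
var allNumbers = naiveSum([0.005901, 0, 0.13315]);

// One element is a string (note the leading space): from that point on,
// + concatenates, so the result is a string rather than a total.
var oneString = naiveSum([" 0.005901", 0, 0.13315]);
// oneString is " 0.00590100.13315"
```

A single string in the group is enough to turn the whole result into a string, which matches the pattern of the odd value in the output.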
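You could test for this in the source collection using:
db.sales.find( { video_id: <the video_id in question>,
                 isrc: <the isrc in question>,
                 country: <the country in question>,
                 amount_payable: {$type: 2 }
               } )
where $type: 2 matches values stored as the String type. Run the query against sales, the imported collection, because the map-reduce output collection only contains the _id and value fields.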
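If the query does turn up string values, one possible workaround is to coerce each amount to a number in both the map and the reduce function. This is only a sketch under that assumption: the function and field names are the ones from the question, and using `parseFloat` is my choice rather than anything the original code does. A value that cannot be parsed becomes NaN, which makes the bad input visible in the total instead of hiding it.

```javascript
// Sketch: same key as in the question, but the emitted value is coerced
// to a number so that single-row groups are numeric too.
var mapFunction1 = function() {
    emit(
        {video_id: this.video_id, isrc: this.isrc, country: this.country},
        parseFloat(this.amount_payable)
    );
};

// Sketch: add the values numerically. parseFloat accepts numbers and
// numeric strings (leading whitespace is ignored); anything else gives NaN.
var reduceFunction1 = function(keyIsrc, valuesAmountPayable) {
    var total = 0;
    for (var i = 0; i < valuesAmountPayable.length; i++) {
        total += parseFloat(valuesAmountPayable[i]);
    }
    return total;
};
```

These would be passed to db.sales.mapReduce exactly as before. Cleaning the offending rows in the CSV and re-importing would address the cause rather than the symptom.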
Upvotes: 1