Raoot

Reputation: 1771

MongoDB - MapReduce concatenated values instead of summing them

I've successfully used MongoDB and MapReduce to aggregate a sales statement of over 5 million lines down to around 60k. I'm very happy with the results, but I'm puzzled by one line in the output where it seems to have concatenated the values instead of summing them, so I've ended up with a value field containing " 0.00590100000000.0059010.133150000000.0053100.002960.043208000.00189"

Has anyone experienced this before?

On close analysis of the lines involved in the raw statement, I can't see anything that would have caused it as they appear to be exactly the same. There are even values for the same identifier that have been summed.

My code is as follows, can anyone spot anything that might be causing it? Like I say, it's only 7 lines from a raw statement of 5.2 million so accuracy is pretty good, it's just not spot on and I know it will bug me.

mongoimport -d test -c sales --type csv --file sales_rawdata.csv --headerline

var mapFunction1 = function() {
    emit({video_id: this.video_id, isrc: this.isrc, country: this.country}, this.amount_payable);
};

var reduceFunction1 = function(keyIsrc, valuesAmountPayable) {
    return Array.sum(valuesAmountPayable);
};

db.sales.mapReduce(
    mapFunction1,
    reduceFunction1,
    { out: "sales_total_by_country_and_isrc" }
)

db.sales_total_by_country_and_isrc.find()           

mongoexport --csv -d test -c sales_total_by_country_and_isrc -q '{value: {$ne: 0}}' -f "_id.video_id","_id.isrc","_id.country","value" -o sales_total_by_country_and_isrc.csv

Upvotes: 0

Views: 120

Answers (1)

Kay

Reputation: 3012

It may be that one of your amount_payable values was stored as a string. If so, Array.sum will concatenate the values instead of adding them.
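To illustrate why a single string would do this, here is a minimal sketch in plain JavaScript. The function sumLikeArraySum is a hypothetical stand-in written for this example, on the assumption that the shell's Array.sum simply folds the array with the + operator; it is not the shell's actual source.

```javascript
// Hypothetical stand-in for the shell's Array.sum helper.
// Assumption: the helper folds the array with the + operator.
function sumLikeArraySum(values) {
    if (values.length === 0) return null;
    var total = values[0];
    for (var i = 1; i < values.length; i++) {
        // + adds two numbers, but concatenates as soon as either side is a string
        total += values[i];
    }
    return total;
}

var allNumbers = sumLikeArraySum([0.5, 0.25, 1]);
var oneString = sumLikeArraySum([0.5, "0.25", 1]);

console.log(typeof allNumbers, allNumbers); // a number: the values were added
console.log(typeof oneString, oneString);   // a string: the values were glued together
```

Once the running total has become a string, every later value is appended to it as text, which would match the long run of digits in the question.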

You could test this against the source collection (sales), since the MapReduce output only has _id and value fields and no amount_payable:

db.sales.find( { video_id: <the video_id in question>,
                 isrc: <the isrc in question>,
                 country: <the country in question>,
                 amount_payable: { $type: 2 }
             } )

where $type: 2 matches values stored as strings.
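If the check does turn up string values, one possible workaround is to coerce the amount to a number inside the map function before emitting it. The sketch below follows the shape of the map function in the question; the name mapFunctionCoerced is made up for this example, and treating non-numeric text as 0 is an assumption you may not want for real sales data.

```javascript
// Variant of the question's map function that coerces amount_payable to a number.
// Assumption: text that cannot be parsed as a number should count as 0.
var mapFunctionCoerced = function() {
    // parseFloat skips leading whitespace, so " 0.0059" parses as 0.0059
    var amount = parseFloat(this.amount_payable);
    if (isNaN(amount)) {
        amount = 0;
    }
    emit({video_id: this.video_id, isrc: this.isrc, country: this.country}, amount);
};
```

It would be passed to db.sales.mapReduce in place of mapFunction1, with the reduce function left unchanged. Cleaning the offending rows in the CSV before running mongoimport is the other option, and it avoids silently turning bad data into zeros.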

Upvotes: 1
