Reputation: 1920
I have a collection of my own tweets stored in MongoDB, fetched with the Twitter API endpoint 'statuses/user_timeline'. I am trying to get the total retweet count for the tweets I posted using MongoDB's mapReduce method, but I can't get it to work. Can anyone please help me out?
Sample data: this is the format of the documents stored in MongoDB:
{
    "_id" : ObjectId("570664d7a9c29761168b4587"),
    "created_at" : "Thu Sep 17 01:17:28 +0000 2015",
    "id" : NumberLong("644319222886039556"),
    "id_str" : "644319222886039556",
    "text" : "Be silent or let your words be worth more than you silence.",
    "entities" : {
        "hashtags" : [ ],
        "symbols" : [ ],
        "user_mentions" : [ ],
        "urls" : [ ]
    },
    "truncated" : false,
    "source" : "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
    "in_reply_to_status_id" : null,
    "in_reply_to_status_id_str" : null,
    "in_reply_to_user_id" : null,
    "in_reply_to_user_id_str" : null,
    "in_reply_to_screen_name" : null,
    "user" : {
        // Here is the user information who tweeted
        "id" : NumberLong(xxxxxxxxxxxxxxxxx),
        "id_str" : "xxxxxxxxx",
        "name" : "Haridarshan Gorana",
        "screen_name" : "haridarshan2901"
    },
    "geo" : null,
    "coordinates" : null,
    "place" : null,
    "contributors" : null,
    "is_quote_status" : false,
    "retweet_count" : NumberLong(1),
    "favorite_count" : NumberLong(0),
    "favorited" : false,
    "retweeted" : false,
    "lang" : "en"
}
Code:
$map = new \MongoCode("function() { emit(this.id_str, this.retweet_count); }");
$out = "retweets";
$reduce = new \MongoCode('function(key, values) {
    var retweets = 0;
    for(i=0;i<values.length;i++){
        if( values[i].retweet_count > 0 ){
            retweets += values[i].retweet_count;
        }
    }
    return retweets;
}');
$verbose = true;
$cmd = array(
    "map" => $map,
    "reduce" => $reduce,
    "query" => $query,
    "out" => "retweets",
    "verbose" => true
);
$result = $db->command($cmd);
print_r($result);
This gives me the following error:
Fatal error: Call to a member function command() on null
I also tried running the same code in the mongo shell:
var mapFunction1 = function() {
    emit(this.id_str, this.retweet_count);
}
var reduceFunction1 = function(id, values) {
    var retweet = 0;
    for(i=0;i<values.length;i++){
        if(values[i].retweet_count > 0) {
            retweet += values[i].retweet_count;
        }
    }
    return retweet;
}
db.tweets.mapReduce(
    mapFunction1,
    reduceFunction1,
    {
        query: {
            user: { id: xxxxxxxxx }
        },
        out: "retweets",
        verbose: true
    }
)
Output from the console:
{
    "result" : "retweets",
    "timeMillis" : 12,
    "timing" : {
        "mapTime" : 0,
        "emitLoop" : 8,
        "reduceTime" : 0,
        "mode" : "mixed",
        "total" : 12
    },
    "counts" : {
        "input" : 0,
        "emit" : 0,
        "reduce" : 0,
        "output" : 0
    },
    "ok" : 1
}
Upvotes: 2
Views: 632
Reputation: 151112
Your reducer is trying to read a property retweet_count from each element of values, but each element is just a plain value with no properties. The mapper already extracted this.retweet_count when it called emit, so the reducer receives the numbers themselves.
In fact, your reducer can simply be:
function(key, values) {
    return Array.sum(values);
}
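To see the difference outside the database, here is a minimal sketch in plain JavaScript. The sample documents and the runMapReduce helper are hypothetical stand-ins for the server's map/reduce cycle, not MongoDB APIs. The key is user.id_str rather than the tweet's id_str, an assumption made so that more than one value reaches the reducer. Array.sum is a mongo shell helper, so the sketch uses Array.prototype.reduce instead.

```javascript
// Hypothetical sample: two tweets by the same user (values are made up).
const tweets = [
  { id_str: "1", user: { id_str: "42" }, retweet_count: 1 },
  { id_str: "2", user: { id_str: "42" }, retweet_count: 3 },
];

// Hypothetical stand-in for the server's map/reduce cycle: collect the
// emitted values per key, then hand each key's values to the reducer.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit);
  const out = {};
  for (const [key, values] of groups) out[key] = reduce(key, values);
  return out;
}

// Mapper: emits the bare retweet count, as in the question's mapper.
function mapByUser(emit) {
  emit(this.user.id_str, this.retweet_count);
}

// Reducer from the question: values[i] is a number, so
// values[i].retweet_count is undefined and nothing is ever added.
function brokenReduce(key, values) {
  let retweets = 0;
  for (let i = 0; i < values.length; i++) {
    if (values[i].retweet_count > 0) {
      retweets += values[i].retweet_count;
    }
  }
  return retweets;
}

// Simplified reducer: sums the emitted numbers directly.
function fixedReduce(key, values) {
  return values.reduce((total, n) => total + n, 0);
}

const broken = runMapReduce(tweets, mapByUser, brokenReduce);
const fixed = runMapReduce(tweets, mapByUser, fixedReduce);
console.log(broken, fixed);
```

With the question's original key (the tweet's own id_str), every key would carry a single value, so the per-user total would not come out of the reducer either way; grouping by the user is what the aggregate example does.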
But you would be better off using .aggregate() for this. Not only is it simpler, it will also run a lot faster:
db.tweets.aggregate([
    { "$group": {
        "_id": "$user.id_str",
        "retweets": { "$sum": "$retweet_count" }
    }}
])
Or in PHP:
$collection->aggregate(
    array(
        '$group' => array(
            '_id' => '$user.id_str',
            'retweets' => array( '$sum' => '$retweet_count' )
        )
    )
);
If you want to add a "query" to that, add a $match pipeline stage at the beginning, for example:
$collection->aggregate(
    array(
        '$match' => array(
            'user.id_str' => 'xxxxxxxxx'
        )
    ),
    array(
        '$group' => array(
            '_id' => '$user.id_str',
            'retweets' => array( '$sum' => '$retweet_count' )
        )
    )
);
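As a rough illustration of what the $match and $group stages compute, here is a plain JavaScript sketch over an in-memory array. The sample documents and the helper names matchUser and groupRetweets are hypothetical and only mirror the pipeline's logic; they are not part of any MongoDB driver. Note that the filter uses the dotted path user.id_str. A query written as { user: { id: ... } } only matches documents whose whole user subdocument equals that value, which is probably why the mapReduce run in the question reported "input" : 0.

```javascript
// Hypothetical sample documents (ids and counts are made up).
const tweets = [
  { id_str: "1", user: { id_str: "42" }, retweet_count: 1 },
  { id_str: "2", user: { id_str: "42" }, retweet_count: 3 },
  { id_str: "3", user: { id_str: "99" }, retweet_count: 5 },
];

// Mirrors { $match: { "user.id_str": userId } }: keep one user's tweets.
function matchUser(docs, userId) {
  return docs.filter((doc) => doc.user.id_str === userId);
}

// Mirrors { $group: { _id: "$user.id_str",
//                     retweets: { $sum: "$retweet_count" } } }.
function groupRetweets(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc.user.id_str;
    totals.set(key, (totals.get(key) || 0) + doc.retweet_count);
  }
  return Array.from(totals, ([_id, retweets]) => ({ _id, retweets }));
}

const result = groupRetweets(matchUser(tweets, "42"));
console.log(result);
```

Running the two steps in this order matches the pipeline: filtering first means the grouping step only ever sees the one user's documents.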
You should really only use mapReduce when the processing actually needs JavaScript control.
Upvotes: 3