y0ft88

Reputation: 73

Pentaho Kettle (Data Integration) MongoDB Aggregation

I'm using Kettle v5.2, which supports the MongoDB aggregation pipeline in the MongoDB Input step. The query works for a small data set, but I need to pass the allowDiskUse option with the query and can't figure out how to add it in Pentaho. I tested this option in the mongo shell and it works as expected.

http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/

http://wiki.pentaho.com/display/EAI/MongoDB+Input#MongoDBInput-queryaggpipeline

This works:

[ {$unwind: "$friends"}, {$group : { '_id' : '$friends.id', name: {'$first': '$friends.name'} ,count: {$sum:1} } } ,{$sort: {count: -1}}, {$limit: 100} ]

This doesn't:

[ {$unwind: "$friends"}, {$group : { '_id' : '$friends.id', name: {'$first': '$friends.name'} ,count: {$sum:1} } } ,{$sort: {count: -1}}, {$limit: 100} ] , {allowDiskUse: true}

Upvotes: 1

Views: 1370

Answers (3)

Juan Salinas

Reputation: 1

Check the option "Query is aggregation pipeline".


Upvotes: 0

Starfight

Reputation: 186

If you look at the class that parses the pipeline and trace the call upward, you can see that Pentaho uses the MongoDB Java class DBCollection with a deprecated method instead of this aggregate overload:

public Cursor aggregate(List<DBObject> pipeline,
                        AggregationOptions options)

So unfortunately options are not available in Pentaho Mongo Input.
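
For comparison, here is a minimal sketch of how the shell call from the question is structured: the pipeline is one array, and allowDiskUse lives in a separate options object passed as the second argument to aggregate(). That separation is likely why appending `, {allowDiskUse: true}` after the array in the step's query field does not work. `runAggregation` is a hypothetical helper introduced only for illustration, not part of MongoDB or Pentaho.

```javascript
// Pipeline from the question: an array of stages and nothing else.
var pipeline = [
  { $unwind: "$friends" },
  { $group: { _id: "$friends.id", name: { $first: "$friends.name" }, count: { $sum: 1 } } },
  { $sort: { count: -1 } },
  { $limit: 100 }
];

// Options are a separate object, not an extra element of the pipeline.
var options = { allowDiskUse: true };

// Hypothetical helper: `collection` is any object exposing
// aggregate(pipeline, options), such as a collection in the mongo shell.
function runAggregation(collection) {
  return collection.aggregate(pipeline, options);
}
```

In the mongo shell this would be called as `runAggregation(db.people)`, where `people` is a placeholder collection name.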

Upvotes: 1

user3652621

Reputation: 3634

Have you tried checking the "Query is aggregation pipeline" box in the Query tab of the MongoDB Input step?

Upvotes: 0
