Toma
Toma

Reputation: 97

Merge two collections in MongoDB

I use MongoDB for several months but with basic usage, and it's maybe the first time that I try to use more complicated functions.

I have two collections :

collection 'trades' :

{ "_id" : ObjectId("5878ec66ad251f4fb4d2aacc"), "clubCode" : "43W0K", "date" : ISODate("2017-01-13T15:03:52.410Z"), "symbol" : "ENGI.PA", "name" : "ENGIE SA", "buyOrSell" : "buy", "orderDone" : true, "quantity" : 12, "price" : 11.99, "fees" : 0.99, "total" : 144.87, "__v" : 0 }
{ "_id" : ObjectId("5878f8c339b47f0ee4a3b80b"), "clubCode" : "43W0K", "date" : ISODate("2017-01-13T15:56:32.088Z"), "symbol" : "PRIO.PA", "name" : "Lyxor ETF PEA Brazil (Ibovespa) C-EUR", "buyOrSell" : "buy", "orderDone" : true, "quantity" : 56, "price" : 8.92, "fees" : 0.99, "total" : 500.51, "__v" : 0 }
{ "_id" : ObjectId("5878fadf39b47f0ee4a3b80c"), "clubCode" : "43W0K", "date" : ISODate("2017-01-13T16:05:35.849Z"), "symbol" : "ALVIV.PA", "name" : "Visiativ SA", "buyOrSell" : "buy", "orderDone" : true, "quantity" : 10, "price" : 18.15, "fees" : 0.99, "total" : 182.49, "__v" : 0 }
{ "_id" : ObjectId("587a03319e3fe23138119937"), "clubCode" : "43W0K", "date" : ISODate("2017-01-14T10:52:56.208Z"), "symbol" : "BIG.PA", "name" : "BigBen Interactive", "buyOrSell" : "buy", "orderDone" : false, "orderType" : "ACL", "quantity" : 83, "price" : 6.01, "fees" : 0.99, "total" : 499.82, "__v" : 0 }

collection 'subscriptions' :

{ "_id" : ObjectId("587c946f3aa3f229a0922761"), "email" : "[email protected]", "clubCode" : "43W0K", "period" : ISODate("2016-09-01T22:00:00Z"), "amount" : 100, "type" : "recurrent", "__v" : 0 }
{ "_id" : ObjectId("587c946f3aa3f229a0922762"), "email" : "[email protected]", "clubCode" : "43W0K", "period" : ISODate("2016-09-01T22:00:00Z"), "amount" : 100, "type" : "recurrent", "__v" : 0 }
{ "_id" : ObjectId("587c946f3aa3f229a0922763"), "email" : "[email protected]", "clubCode" : "43W0K", "period" : ISODate("2016-09-01T22:00:00Z"), "amount" : 100, "type" : "recurrent", "__v" : 0 }

Each are related to a clubCode (here '43W0K').

I would like to have this result (in another collection or just with an aggregate) :

{
    clubCode: "43W0K",
    treasury_moves: [
        {
            date: ISODate("2017-01-13T15:03:52.410Z"),
            wording: "Achat ENGIE SA",
            amount: 144.87
        }, {
            date: ISODate("2017-01-13T15:56:32.088Z"),
            wording: "Achat Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
            amount: 500.51
        }, {
            date: ISODate("2017-01-13T16:05:35.849Z"),
            wording: "Achat Visiativ SA",
            amount: 182.49
        }, {
            date: ISODate("2017-01-14T10:52:56.208Z"),
            wording: "Achat BigBen Interactive",
            amount: 499.82
        }, {
            date: ISODate("2016-09-01T22:00:00Z"),
            wording: "Subscription [email protected]",
            amount: 100
        }, {
            date: ISODate("2016-09-01T22:00:00Z"),
            wording: "Subscription [email protected]",
            amount: 100
        }, {
            date: ISODate("2016-09-01T22:00:00Z"),
            wording: "Subscription [email protected]",
            amount: 100
        }
    ]
}

So a merge between 'trades' and 'subscriptions' collections,

First of all, is it possible to do this ? With mapreduce ? aggregate ?

I've looking to this topic which seems similar but I don't succeed to have the result that I want.

If you can help me or just indicate me the way to follow.

Upvotes: 1

Views: 4937

Answers (1)

chridam
chridam

Reputation: 103365

Possible through the aggregate() function in the aggregation framework. You need to run the following pipeline to get the desired result:

db.trades.aggregate([
    {
        "$lookup": {
            "from": "subscriptions",
            "localField": "clubCode",
            "foreignField": "clubCode",
            "as": "subs"
        }
    },
    { "$unwind": "$subs" },
    {
        "$group": {
            "_id": "$clubCode",
            "trades": {
                "$push": {
                    "date": "$date",
                    "wording": { "$concat": ["Achat ", "$name"] },
                    "amount": "$total"
                }
            },
            "subs": {
                "$push": {
                    "date": "$subs.period",
                    "wording": { "$concat": ["Subscription ", "$subs.email"] },
                    "amount": "$subs.amount"
                }
            }
        }
    },
    {
        "$project": {
            "clubCode": "$_id",
            "_id": 0,
            "treasury_moves": { "$setUnion": ["$subs", "$trades"] }
        }
    }
])

Sample Output

/* 1 */
{
    "clubCode" : "43W0K",
    "treasury_moves" : [ 
        {
            "date" : ISODate("2017-01-13T15:03:52.410Z"),
            "wording" : "Achat ENGIE SA",
            "amount" : 144.87
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2017-01-14T10:52:56.208Z"),
            "wording" : "Achat BigBen Interactive",
            "amount" : 499.82
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2017-01-13T15:56:32.088Z"),
            "wording" : "Achat Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
            "amount" : 500.51
        }, 
        {
            "date" : ISODate("2017-01-13T16:05:35.849Z"),
            "wording" : "Achat Visiativ SA",
            "amount" : 182.49
        }
    ]
}

In the above pipeline, the first stage involves the $lookup operator. This allows you to do a "left outer join" to the other collection in the same database to filter in documents from the "joined" collection for processing. When you run a pipeline against the trades collection with just this step:

db.trades.aggregate([
    {
        "$lookup": {
            "from": "subscriptions",
            "localField": "clubCode",
            "foreignField": "clubCode",
            "as": "subs"
        }
    }
])

you will get the following result:

/* 1 */
{
    "_id" : ObjectId("5878ec66ad251f4fb4d2aacc"),
    "clubCode" : "43W0K",
    "date" : ISODate("2017-01-13T15:03:52.410Z"),
    "symbol" : "ENGI.PA",
    "name" : "ENGIE SA",
    "buyOrSell" : "buy",
    "orderDone" : true,
    "quantity" : 12,
    "price" : 11.99,
    "fees" : 0.99,
    "total" : 144.87,
    "__v" : 0,
    "subs" : [ 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922761"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922762"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922763"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("5878f8c339b47f0ee4a3b80b"),
    "clubCode" : "43W0K",
    "date" : ISODate("2017-01-13T15:56:32.088Z"),
    "symbol" : "PRIO.PA",
    "name" : "Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
    "buyOrSell" : "buy",
    "orderDone" : true,
    "quantity" : 56,
    "price" : 8.92,
    "fees" : 0.99,
    "total" : 500.51,
    "__v" : 0,
    "subs" : [ 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922761"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922762"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922763"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }
    ]
}

/* 3 */
{
    "_id" : ObjectId("5878fadf39b47f0ee4a3b80c"),
    "clubCode" : "43W0K",
    "date" : ISODate("2017-01-13T16:05:35.849Z"),
    "symbol" : "ALVIV.PA",
    "name" : "Visiativ SA",
    "buyOrSell" : "buy",
    "orderDone" : true,
    "quantity" : 10,
    "price" : 18.15,
    "fees" : 0.99,
    "total" : 182.49,
    "__v" : 0,
    "subs" : [ 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922761"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922762"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922763"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }
    ]
}

/* 4 */
{
    "_id" : ObjectId("587a03319e3fe23138119937"),
    "clubCode" : "43W0K",
    "date" : ISODate("2017-01-14T10:52:56.208Z"),
    "symbol" : "BIG.PA",
    "name" : "BigBen Interactive",
    "buyOrSell" : "buy",
    "orderDone" : false,
    "orderType" : "ACL",
    "quantity" : 83,
    "price" : 6.01,
    "fees" : 0.99,
    "total" : 499.82,
    "__v" : 0,
    "subs" : [ 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922761"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922762"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }, 
        {
            "_id" : ObjectId("587c946f3aa3f229a0922763"),
            "email" : "[email protected]",
            "clubCode" : "43W0K",
            "period" : ISODate("2016-09-01T22:00:00.000Z"),
            "amount" : 100,
            "type" : "recurrent",
            "__v" : 0
        }
    ]
}

For the next step, you will need to flatten the subs array with $unwind for processing as your next pipeline stage:

db.trades.aggregate([
    {
        "$lookup": {
            "from": "subscriptions",
            "localField": "clubCode",
            "foreignField": "clubCode",
            "as": "subs"
        }
    },
    { "$unwind": "$subs" }
])

This will produce i x j documents where i is the number of documents in the original collection (4) and j is the number of array elements (3).


The next step, $group, will then group the 12 documents by the clubCode field and create two arrays with subdocuments from the fields specified:

db.trades.aggregate([
    {
        "$lookup": {
            "from": "subscriptions",
            "localField": "clubCode",
            "foreignField": "clubCode",
            "as": "subs"
        }
    },
    { "$unwind": "$subs" }, 
    {
        "$group": {
            "_id": "$clubCode",
            "trades": {
                "$push": {
                    "date": "$date",
                    "wording": { "$concat": ["Achat ", "$name"] },
                    "amount": "$total"
                }
            },
            "subs": {
                "$push": {
                    "date": "$subs.period",
                    "wording": { "$concat": ["Subscription ", "$subs.email"] },
                    "amount": "$subs.amount"
                }
            }
        }
    }
])

Output at the pipeline

/* 1 */
{
    "_id" : "43W0K",
    "trades" : [ 
        {
            "date" : ISODate("2017-01-13T15:03:52.410Z"),
            "wording" : "Achat ENGIE SA",
            "amount" : 144.87
        }, 
        {
            "date" : ISODate("2017-01-13T15:03:52.410Z"),
            "wording" : "Achat ENGIE SA",
            "amount" : 144.87
        }, 
        {
            "date" : ISODate("2017-01-13T15:03:52.410Z"),
            "wording" : "Achat ENGIE SA",
            "amount" : 144.87
        }, 
        {
            "date" : ISODate("2017-01-13T15:56:32.088Z"),
            "wording" : "Achat Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
            "amount" : 500.51
        }, 
        {
            "date" : ISODate("2017-01-13T15:56:32.088Z"),
            "wording" : "Achat Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
            "amount" : 500.51
        }, 
        {
            "date" : ISODate("2017-01-13T15:56:32.088Z"),
            "wording" : "Achat Lyxor ETF PEA Brazil (Ibovespa) C-EUR",
            "amount" : 500.51
        }, 
        {
            "date" : ISODate("2017-01-13T16:05:35.849Z"),
            "wording" : "Achat Visiativ SA",
            "amount" : 182.49
        }, 
        {
            "date" : ISODate("2017-01-13T16:05:35.849Z"),
            "wording" : "Achat Visiativ SA",
            "amount" : 182.49
        }, 
        {
            "date" : ISODate("2017-01-13T16:05:35.849Z"),
            "wording" : "Achat Visiativ SA",
            "amount" : 182.49
        }, 
        {
            "date" : ISODate("2017-01-14T10:52:56.208Z"),
            "wording" : "Achat BigBen Interactive",
            "amount" : 499.82
        }, 
        {
            "date" : ISODate("2017-01-14T10:52:56.208Z"),
            "wording" : "Achat BigBen Interactive",
            "amount" : 499.82
        }, 
        {
            "date" : ISODate("2017-01-14T10:52:56.208Z"),
            "wording" : "Achat BigBen Interactive",
            "amount" : 499.82
        }
    ],
    "subs" : [ 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }, 
        {
            "date" : ISODate("2016-09-01T22:00:00.000Z"),
            "wording" : "Subscription [email protected]",
            "amount" : 100
        }
    ]
}

You would need to join the two arrays and ignore duplicates with the $setUnion operator in your final aggregation pipeline, $project. This will also reshape the documents by replacing the _id key from the previous pipeline with the clubCode field. Running the final pipeline with those stages will give you the desired result.

Upvotes: 3

Related Questions