Soni007

Reputation: 103

MongoDB: how to aggregate data from different collections based on id

I have documents whose data looks like this:
collection A

{
    "_id" : 69.0,
    "values" : [
        {
            "date_data" : "2016-12-16 10:00:00",
            "valueA" : 8,
            "valuB" : 9
        },
        {
            "date_data" : "2016-12-16 11:00:00",
            "valueA" : 8,
            "valuB" : 9
        },
        .......
    ]
}

collection B

{
    "_id" : 69.0,
    "values" : [
        {
            "date_data" : "2017-12-16 10:00:00",
            "valueA" : 8,
            "valuB" : 9
        },
        {
            "date_data" : "2017-12-16 11:00:00",
            "valueA" : 8,
            "valuB" : 9
        },
        .......
    ]
}

Data is stored every hour. Because it is all stored in one document, that document may reach the 16 MB size limit at some point. That's why I'm thinking of spreading the data across years, meaning each collection would hold one year of data for every id. But when we want to show the combined data, how can we use the aggregate function?

For example, collectionA has data from 7 Dec '16 to 7 Dec '17 and collectionB has data from 6 Dec '15 to 6 Dec '16. How can I show the data between 1 Dec '16 and 1 Jan '17, which is spread across different collections?

Upvotes: 0

Views: 2614

Answers (1)

PirateApp

Reputation: 6240

Very simple: use MongoDB's $lookup stage, which is the equivalent of a left outer join. Every document in the left (local) collection is scanned for the value of a given field, and documents from the right (foreign) collection are matched on that value. For your case, here is the parent collection:

Parent A

[screenshot: parent collection]

Child collection B

[screenshot: child collection]

Now all we have to do is run a query from collection A.

The following very simple $lookup aggregation query returns the combined result:

db.getCollection('uniques').aggregate([
    {
        "$lookup": {
            "from": "values",             // Get data from the values collection
            "localField": "_id",          // The field _id of the current collection, uniques
            "foreignField": "parent_id",  // The foreign field containing a matching value
            "as": "related"               // An array containing all items under 69
        }
    },
    {
        "$unwind": "$related"             // Unwind that array
    },
    {
        "$project": {
            "value_id": "$related._id",   // Project only what you need
            "date": "$related.date_data",
            "a": "$related.valueA",
            "b": "$related.valueB"
        }
    }
], {"allowDiskUse": true})
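To cover the date-range part of the question (1 Dec '16 to 1 Jan '17), one option is to add a $match stage after the $unwind. The following is a minimal sketch, assuming the child documents store date_data as "YYYY-MM-DD HH:MM:SS" strings as in the question, and reusing the collection and field names from the query above. buildRangePipeline is a hypothetical helper name, not something MongoDB provides.

```javascript
// Hypothetical helper: returns the lookup pipeline plus a date-range filter.
// "values", "parent_id" and "date_data" follow the example query; the
// function name and its parameters are illustrative assumptions.
function buildRangePipeline(from, to) {
  return [
    {
      $lookup: {
        from: "values",
        localField: "_id",
        foreignField: "parent_id",
        as: "related"
      }
    },
    { $unwind: "$related" },
    // Keep only entries with from <= date_data < to.
    // Zero-padded "YYYY-MM-DD HH:MM:SS" strings compare in chronological order.
    { $match: { "related.date_data": { $gte: from, $lt: to } } },
    {
      $project: {
        value_id: "$related._id",
        date: "$related.date_data",
        a: "$related.valueA",
        b: "$related.valueB"
      }
    }
  ];
}

const pipeline = buildRangePipeline("2016-12-01 00:00:00", "2017-01-01 00:00:00");
// In the mongo shell this would be run as:
// db.getCollection('uniques').aggregate(pipeline, { allowDiskUse: true })
```

The string comparison only works because the timestamps are zero-padded and ordered from year down to second. If the dates were stored as Date values instead, the same $gte/$lt bounds would take Date objects.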

Remember a few things:

  1. The local field of the lookup gains nothing from an index, so run the aggregation from the collection with the fewest documents.
  2. The foreign field works best when it is indexed or is the _id itself.
  3. $lookup can also take a pipeline for custom filtering while matching. I wouldn't recommend it, as such pipelines tend to be very slow.
  4. Don't forget "allowDiskUse" if you are going to aggregate large amounts of data.
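
For point 2, an index on the foreign field can be created once before running the aggregation. This is a small sketch, assuming the child collection is named values and the foreign field is parent_id as in the query above. ensureLookupIndex is a hypothetical helper name, and db stands for the database handle the mongo shell provides.

```javascript
// Hypothetical helper: creates an ascending index on the field that
// $lookup matches against ("parent_id" in the example query).
function ensureLookupIndex(db) {
  return db.getCollection("values").createIndex({ parent_id: 1 });
}

// In the mongo shell:
// ensureLookupIndex(db)
```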

Upvotes: 1
