Reputation: 989
I have some trade data like this
{
"_id" : 1498290900.0,
"trade" : {
"type" : "Modify",
"data" : {
"type" : "bid",
"rate" : "0.00658714",
"amount" : "3.82354427"
},
"date" : 1498290930291.0,
"name" : "TLX"
}
},{
"_id" : 1498290900.0,
"trade" : {
"type" : "Modify",
"data" : {
"type" : "ask",
"rate" : "0.00658714",
"amount" : "3.82354427"
},
"date" : 1498290930291.0,
"name" : "TLX"
}
},{
"_id" : 1498290900.0,
"trade" : {
"type" : "Remove",
"data" : {
"type" : "ask",
"rate" : "0.00680891"
},
"date" : 1498290931349.0,
"name" : "TLX"
}
}
These come from $rewind
hence the _id
being same. Want i want to do next is group them on _id
so i try
{
$group: {
_id: {_id: "$_id", name: "$trade.type",dtype: "$trade.data.type"},
count : {$sum: 1}
},
},{$project: { _id: "$_id._id", type: "$_id.name", count: 1, dtype: "$_id.dtype" } },
{
$group: {
_id: "$_id",
results: { $push : "$$ROOT" }
}
}
Which is quite good, give me the below
{
"_id" : 1498276800.0,
"results" : [
{
"count" : 16.0,
"_id" : 1498276800.0,
"type" : "Modify",
"dtype" : "bid"
},
{
"count" : 15.0,
"_id" : 1498276800.0,
"type" : "Remove",
"dtype" : "bid"
},
{
"count" : 3.0,
"_id" : 1498276800.0,
"type" : "Remove",
"dtype" : "ask"
},
{
"count" : 1.0,
"_id" : 1498276800.0,
"type" : "Modify",
"dtype" : "ask"
}
]
}
But i was trying to make the output more like this
{
"_id" : 1498276800.0,
"Modify": {
"bid":{
"count": 16.0
},
"ask": {
"count": 1.0
}
},
"Remove": {
"bid":{
"count": 15.0
},
"ask": {
"count": 3.0
}
}
}
But no amount of playing with $projections
has got me close.
Can anyone point me in the right direction please?
thanks.
UPDATE
Excluding last pipeline stage, this is example documents with nice bid/ask per type ready to be grouped by _id.
{
"_id" : {
"_id" : 1498276800.0,
"type" : "orderBookRemove"
},
"results" : [
{
"k" : "bid",
"v" : {
"count" : 15.0
}
},
{
"k" : "ask",
"v" : {
"count" : 3.0
}
}
]
},
{
"_id" : {
"_id" : 1498276800.0,
"type" : "orderBookModify"
},
"results" : [
{
"k" : "bid",
"v" : {
"count" : 16.0
}
},
{
"k" : "ask",
"v" : {
"count" : 1.0
}
}
]
}
When last part of pipeline is applied i.e
{ "$group": {
"_id": "$_id._id",
"results": {
"$push": {
"k": "$_id.type",
"v": "$results"
}
}
}}
I get this, only first 'bid' element of results array. 2nd item 'ask' goes AWOL ?
{
"_id" : 1498280700.0,
"results" : [
{
"k" : "orderBookRemove",
"v" : [
{
"k" : "bid",
"v" : {
"count" : 9.0
}
}
]
},
{
"k" : "orderBookModify",
"v" : [
{
"k" : "bid",
"v" : {
"count" : 6.0
}
}
]
}
]
}
Upvotes: 1
Views: 78
Reputation: 151190
This entirely depends on the MongoDB version you have available, or not really, depending on how you look at it. As you say the data is actually from an array originally so I am going to start in that format, and process each option from there.
The source being considered then being:
{
"_id" : ObjectId("594f3a530320738061df3eea"),
"data" : [
{
"_id" : 1498290900,
"trade" : {
"type" : "Modify",
"data" : {
"type" : "bid",
"rate" : "0.00658714",
"amount" : "3.82354427"
},
"date" : 1498290930291,
"name" : "TLX"
}
},
{
"_id" : 1498290900,
"trade" : {
"type" : "Modify",
"data" : {
"type" : "ask",
"rate" : "0.00658714",
"amount" : "3.82354427"
},
"date" : 1498290930291,
"name" : "TLX"
}
},
{
"_id" : 1498290900,
"trade" : {
"type" : "Remove",
"data" : {
"type" : "ask",
"rate" : "0.00680891"
},
"date" : 1498290931349,
"name" : "TLX"
}
}
]
}
Simply use $replaceRoot
and $arrayToObject
with some careful placement of results:
db.dtest.aggregate([
{ "$unwind": "$data" },
{ "$group": {
"_id": {
"_id": "$data._id",
"type": "$data.trade.type",
"dtype": "$data.trade.data.type"
},
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": {
"_id": "$_id._id",
"type": "$_id.type"
},
"results": {
"$push": {
"k": "$_id.dtype",
"v": {
"count": "$count"
}
}
}
}},
{ "$group": {
"_id": "$_id._id",
"results": {
"$push": {
"k": "$_id.type",
"v": "$results"
}
}
}},
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$concatArrays": [
[{ "k": "_id", "v": "$_id" }],
{ "$map": {
"input": "$results",
"as": "r",
"in": {
"k": "$$r.k",
"v": { "$arrayToObject": "$$r.v" }
}
}}
]
}
}
}}
])
Which is what probably makes the most sense in most cases where we simply do the aggregated array form and do the transformation in the client instead. We don't really need additional aggregation since that part is already done, so it's not like we are reducing the data any further.
Simple to do in most languages, but as a basic JavaScript concept that works in the shell:
db.dtest.aggregate([
{ "$unwind": "$data" },
{ "$group": {
"_id": {
"_id": "$data._id",
"type": "$data.trade.type",
"dtype": "$data.trade.data.type"
},
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": {
"_id": "$_id._id",
"type": "$_id.type"
},
"results": {
"$push": {
"k": "$_id.dtype",
"v": {
"count": "$count"
}
}
}
}},
{ "$group": {
"_id": "$_id._id",
"results": {
"$push": {
"k": "$_id.type",
"v": "$results"
}
}
}}
]).map(doc =>
doc.results.map(r =>
({ k: r.k, v: r.v.reduce((acc,curr) =>
Object.assign(acc, { [curr.k]: curr.v }),{})
})
).reduce((acc,curr) =>
Object.assign(acc, { [curr.k]: curr.v }),{ _id: doc._id })
)
Which is essentially doing the same thing whilst processing the cursor as out new fancy pipeline stage is doing for each document.
So it really just goes to show that unless you intend to aggregate the results further along than from even this result, it simply is not necessary to use the fancy new operators. The same thing is achieved in what is considerably less lines of code and is far less terse to express.
Both output the same thing:
{
"_id" : 1498290900,
"Modify" : {
"ask" : {
"count" : 1
},
"bid" : {
"count" : 1
}
},
"Remove" : {
"ask" : {
"count" : 1
}
}
}
Taking the data from your update I apply this:
db.test.aggregate([
{ "$group": {
"_id": "$_id._id",
"results": {
"$push": {
"k": "$_id.type",
"v": "$results"
}
}
}},
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$concatArrays": [
[{ "k": "_id", "v": "$_id" }],
{ "$map": {
"input": "$results",
"as": "r",
"in": {
"k": "$$r.k",
"v": { "$arrayToObject": "$$r.v" }
}
}}
]
}
}
}}
])
And get the expected output:
{
"_id" : 1498276800.0,
"orderBookRemove" : {
"bid" : {
"count" : 15.0
},
"ask" : {
"count" : 3.0
}
},
"orderBookModify" : {
"bid" : {
"count" : 16.0
},
"ask" : {
"count" : 1.0
}
}
}
So your claimed output is false, and you are not following the example.
Upvotes: 1