Reputation: 1967
Let's say we have two collections:
devices
: the objects from this collection have (among others) the fields name
(string) and cards
(array); each part from that array has the fields model
and slot
. The cards are not another collection, it's just some nested data.interfaces
: the objects from this collection have (among others) the fields name
and owner
.Extra info:
cards
, I'm only interested in the ones where slot
is a numberpart
of a device
that matches the previous condition, there is an interface
object in the other collection where the owner
fields has as value the name
of the device
in cause and the name is s[slot]p1
(the character 's' + the slot of that part + 'p1')My job is to create a query to generate a summary of all the existing cards in all of those devices, each entry being enriched with information from the interfaces
collection. I also need to be able to parametrize the query (in case I'm interested only in a certain device with a certain name, only a certain model for cards etc.)
So far, I have this:
mongo_client.devices.aggregate([
# Retrieve all the devices having the cards field
{
"$match": {
# "name": "<device-name>",
"cards": {
"$exists": "true"
}
}
},
# Group current content with every cards object
{
"$unwind": "$cards"
},
# Only take the ones having "slot" a number
{
"$match": {
"cards.slot": {
"$regex": "^\d+$"
}
}
},
# Retrieve the device's interfaces
{
"$lookup": {
"from": "interfaces",
"let": {
"owner": "$name",
},
"as": "interfaces",
"pipeline": [{
"$match": {
"$expr": {
"$eq": ["$owner", "$$owner"]
},
},
}]
}
},
{
"$unwind": "$interfaces"
},
{
"$match": {
"$expr": {
"$eq": ["$interfaces.name", {
"$concat": ["s", "$cards.slot", "p1"]
}]
}
}
},
# Build the final object
{
"$project": {
# Card related fields
"slot": "$cards.slot",
"model": "$cards.model",
# Device related fields
"device_name": "$name",
# Fields from interfaces
"interface_field_x": "$interfaces.interface_field_x",
"interface_field_y": "$interfaces.interface_field_y",
}
},
])
The query works and it's quite fast, but I have a question:
Is there any way I can avoid the 2nd $unwind
? If for every device
there are 50-150 interface
objects where owner
is the name of that device, I feel that I'm slowing it down. Every device has a unique interface named s[slot]p1
. How can I get that specific object in a better way? I tried to use two $eq
expressions in the $match
inside the $lookup
or even $regex
or $regexMatch
, but I couldn't use the outside slot
fields, even if I put it inside let
.
If I want to parametrize my query to filter the data if needed, would you add match expressions as intermediary steps or just filter at the end?
Any other improvements to the query are welcome. I'm also interested in how to make it errors proof (if by mistake cards
is missing or that s1p1
interface is not found.
Thanks!
Upvotes: 0
Views: 588
Reputation: 14530
Your question is missing sample data for the query, but:
The number of unwinds in the query should correspond to what objects you want in the result set:
To match the desired conditions no unwinds are needed.
Upvotes: 1