Reputation: 33
I have a MongoDB collection that looks something like this:
{ "_id" : "piramidales", "LiciList" : [ "318081", "318157" ] }
{ "_id" : "pyramidalis", "LiciList" : [ "318081", "318157" ] }
{
  "_id" : "toneis",
  "LiciList" : [
    "318077",
    "318151",
    "318288",
    "318360",
    "318666"
  ]
}
I want to count the LiciList items shared by each pair of words, for all combinations.
How can I get the relationship between the words based on their LiciList items? Like this:
{item1:'piramidales',item2:'pyramidalis',count:2},
{item1:'piramidales',item2:'toneis',count:0},
{item1:'pyramidalis',item2:'toneis',count:0}
Upvotes: 0
Views: 99
Reputation: 49985
You can try the following aggregation:
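db.col.aggregate([
    {
        // Collect every document twice into a single grouped document
        $group: {
            _id: null,
            item1: { $push: "$$ROOT" },
            item2: { $push: "$$ROOT" }
        }
    },
    // Unwinding both arrays yields every (item1, item2) combination
    { $unwind: "$item1" },
    { $unwind: "$item2" },
    {
        $project: {
            _id: 0,
            item1: "$item1._id",
            item2: "$item2._id",
            // Number of values present in both LiciList arrays
            count: { $size: { $setIntersection: [ "$item1.LiciList", "$item2.LiciList" ] } }
        }
    },
    {
        // Keep each pair once (item2 > item1) and only when count > 0
        $redact: {
            $cond: {
                if: { $and: [ { $gt: [ "$item2", "$item1" ] }, { $gt: [ "$count", 0 ] } ] },
                then: "$$KEEP",
                else: "$$PRUNE"
            }
        }
    }
],
{ allowDiskUse: true })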
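Basically, you have to generate documents that hold every pair (item1, item2). That is why the pipeline groups everything into one document with two array fields and then unwinds twice, which produces the full cross product. To count the matching elements we can use $setIntersection together with $size. Then we have to filter out the duplicates using $redact: simply comparing the _id strings with $gt eliminates pairs like (toneis, toneis) or (toneis, pyramidalis) while keeping (pyramidalis, toneis). Note that the second condition, $gt: [ "$count", 0 ], also drops pairs that share no items, so remove it if you want the count: 0 pairs from your expected output.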
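To check the pairing and filtering logic without a running MongoDB instance, here is a minimal in-memory sketch in plain JavaScript. It is not part of the aggregation itself: pairCounts and its keepZero flag are hypothetical names introduced only for illustration, and the sample documents are copied from the question. It also assumes plain ASCII _id values, where JavaScript's string comparison orders them the same way as the pipeline's $gt.

```javascript
// Sample documents copied from the question.
const docs = [
  { _id: "piramidales", LiciList: ["318081", "318157"] },
  { _id: "pyramidalis", LiciList: ["318081", "318157"] },
  { _id: "toneis", LiciList: ["318077", "318151", "318288", "318360", "318666"] }
];

// pairCounts is a hypothetical helper, not a MongoDB API.
// keepZero = true keeps pairs that share no items (count: 0).
function pairCounts(documents, keepZero = false) {
  const result = [];
  // Two nested loops play the role of the double $unwind (cross product).
  for (const a of documents) {
    for (const b of documents) {
      // Same role as { $gt: ["$item2", "$item1"] }:
      // skips (x, x) and keeps only one ordering of each pair.
      if (!(b._id > a._id)) continue;

      // Same role as $size of $setIntersection: distinct shared values.
      const inB = new Set(b.LiciList);
      const count = new Set(a.LiciList.filter(x => inB.has(x))).size;

      // Same role as { $gt: ["$count", 0] }, unless keepZero is set.
      if (count > 0 || keepZero) {
        result.push({ item1: a._id, item2: b._id, count: count });
      }
    }
  }
  return result;
}

console.log(pairCounts(docs));       // pairs with at least one shared item
console.log(pairCounts(docs, true)); // all pairs, including count: 0
```

With the sample data, the first call should return only the (piramidales, pyramidalis) pair, and the second should return all three pairs listed in the question.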
Upvotes: 1