nightrayy
nightrayy

Reputation: 41

MongoDB Convert sql to mongo query (join with id-1)

I would like to change below sql query:

select a1.opis_ruchu as poprzednik, a2.opis_ruchu as nastepnik,
count(a1.opis_ruchu) as ilosc_wystapien
from public.informacje a1 join public.informacje a2
on a1.id = a2.id-1
group by a1.opis_ruchu, a2.opis_ruchu;

To Mongo At the moment I have:

db.informacje.aggregate(
    [
           { 
            "$project" : { 
                "_id" : NumberInt(0), 
                "a1" : "$$ROOT"
            }
        },
        { 
            "$lookup" : { 
                "localField" : "id", 
                "from" : "informacje", 
                "foreignField" : "id-1", 
                "as" : "a2"
                        }
        }
        ,
        
       { 
            "$group" : { 
                "_id" : { 
                    "poprzednik" : "$a1.opis_ruchu",
                    "nastepnik"  : "$a2.opis_ruchu"
                }, 
                "COUNT(a1᎐opis_ruchu)" : { 
                    "$sum" : NumberInt(1)
                }
            }
        }
            ])

Problem is: field poprzednik exists once when nastepnik multiple one.

For example in SQL I have:

poprzednik nastepnik ilosc_wystapien

left left 500

In Mongo there is: poprzednik nastepnik

left "left","left","left"...

Example of collection:

> db.informacje.find().pretty()
{
        "_id" : ObjectId("5fa7f372fdd610505c57d5d6"),
        "id" : 1,
        "opis_ruchu" : "left"
}
{
        "_id" : ObjectId("5fa7f372fdd610505c57d5d7"),
        "id" : 2,
        "opis_ruchu" : "left"
}
{
        "_id" : ObjectId("5fa7f372fdd610505c57d5d8"),
        "id" : 3,
        "opis_ruchu" : "left"
}
{
        "_id" : ObjectId("5fa7f372fdd610505c57d5d9"),
        "id" : 4,
        "opis_ruchu" : "left"
}

from id=501 I have "opis_ruchu" : "right"

Upvotes: 0

Views: 479

Answers (4)

Sheeri
Sheeri

Reputation: 608

One thing I like to do is use MongoDB compass - it's a free GUI tool but you can visualize each stage of the pipeline. For example:

enter image description here

You can see after the first stage (the $lookup) that a2 is an empty array. And you're right, it's because of the "id-1". With $lookup you can't manipulate the fields, but you can use $addFields before the $lookup:

addFields stage to add the prevId field which is id-1

So you can see that the lookup has been successful - id=1 doesn't have anything in a2, because there's no id=0, but id=2 has an array in a2 that's the full record of id=1.

(Using Compass and the aggregation pipeline you can see the stages as you go, and it makes debugging tons easier).

From there, you want a2 not to be an array (you can't avoid that, because lookup always results in an array). So you $unwind, which takes the values in an array and creates documents for each item in the array....and then do the group:

enter image description here

As you can see from the group, you don't need any projection, because your data is already in the format you want.

So the pipeline is:

db.informacje.aggregate[{$addFields: {
  prevId: {$subtract: ["$id",1]}
}}, {$lookup: {
  localField: 'prevId',
  from: 'informacje',
  foreignField: 'id',
  as: 'a2'
}}, {$unwind: {
  path: '$a2',
  preserveNullAndEmptyArrays: true
}}, {$group: {
  _id: {
    poprzednik: '$opis_ruchu',
    nastepnik: '$a2.opis_ruchu'
  },
  'sum': {
    $sum: 1
  }
}}]

Upvotes: 1

Sheeri
Sheeri

Reputation: 608

A better way to do this would be to use $group and then $match, as in the example here:

https://developer.mongodb.com/community/forums/t/how-to-transform-a-sql-self-join-to-a-mongodb-query/5860/5

And again only $project at the end.

Upvotes: 0

nightrayy
nightrayy

Reputation: 41

@Sheeri Thank you for your help. After running script have folowinng data:

{ "_id" : { "poprzednik" : "right", "nastepnik" : [ ] }, "sum" : "COUNT(opis_ruchu)" }
{ "_id" : { "poprzednik" : "left", "nastepnik" : [ ] }, "sum" : "COUNT(opis_ruchu)" }

I think that main problem is field id-1.

Upvotes: 0

Sheeri
Sheeri

Reputation: 608

Project should only be used once and in the LAST stage - it doesn't change the information that is processed, just what's displayed.

db.informacje.aggregate(
    [
        { 
            "$lookup" : { 
                "localField" : "id", 
                "from" : "informacje", 
                "foreignField" : "id-1", 
                "as" : "a2"
                        }
        }
        ,
        
       { 
            "$group" : { 
                "_id" : { 
                    "poprzednik" : "$a1.opis_ruchu",
                    "nastepnik"  : "$a2.opis_ruchu"
                }, 
                "COUNT(a1᎐opis_ruchu)" : { 
                    "$sum" : NumberInt(1)
                }
            }
        },
           { 
            "$project" : { poprzednik: 1, nastepnik: 1,
                "sum" : "COUNT(a1᎐opis_ruchu)",
            }
        },

            ])

This looks very complicated to get what you're getting, though - I find it helpful to 'translate' the SQL into what I want to do, and then 'translate' that into mongo. In SQL, you're grouping records by the values of their own opis_ruchu plus the value of the opis_ruchu of the previous record, and displaying those groups plus the count.

If you had 500 entries with increasing ids, and 'left' as opis_ruchu, and then ids 501-1000 are 'right', I would expect the outcome of your SQL to be:

left left 499 (because id 1 has no match, there's no id 0)
right left 1 (this is id 501 compared to id 500)
right right 499 (this is for id's 502-1000)

It seems like a lot of work to get that result for two big blocks of ids, though. There might be a more efficient way in Mongo to do what you're trying to do, if you can explain it by what your app is doing.

Upvotes: 0

Related Questions