JSBach
JSBach

Reputation: 4747

Group documents using substring of a field

I am working with MongoDB and I am enjoying a lot! There is one query I am having problems to work with: I have this set of data that represents an hierarchy (a tree, where 1 is the root, 1.1 and 1.2 are children of 1, and so on)

db.test.insert({id:1, hierarchy:"1"})
db.test.insert({id:2, hierarchy:"1.1"})
db.test.insert({id:3, hierarchy:"1.2"})
db.test.insert({id:4, hierarchy:"1.1.1"})
db.test.insert({id:5, hierarchy:"1.1.2"})
db.test.insert({id:6, hierarchy:"1.2.1"})
db.test.insert({id:7, hierarchy:"1.2.2"})
db.test.insert({id:8, hierarchy:"1.2.3"})

So if I make a query:

> db.test.find()
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4760"), "id" : 1, "hierarchy" : "1" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4761"), "id" : 2, "hierarchy" : "1.1" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4762"), "id" : 3, "hierarchy" : "1.2" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4763"), "id" : 4, "hierarchy" : "1.1.1" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4764"), "id" : 5, "hierarchy" : "1.1.2" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4765"), "id" : 6, "hierarchy" : "1.2.1" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4766"), "id" : 7, "hierarchy" : "1.2.2" }
{ "_id" : ObjectId("546a6095cafd2fa3ff8e4767"), "id" : 8, "hierarchy" : "1.2.3" }

the document with id 1 represents the CEO and I would like to gather information about the teams under the VPs (1.1 and 1.2).

I would like to have an output like this

{
    id: null,
    teams:
    [
        {
             manager: 2,
             hierarchy: "1.1",
             subordinates: 2
        },
        {
             manager: 3,
             hierarchy: "1.2",
             subordinates: 3
        }
    ]
}

I am having problems to aggregate the documents in the right "slot". I tried to use a regex to aggregate using the substring, and project before grouping and create a new field which would be "manager_hierarchy", so I could group using this field. But with none of these approaches I had any kind of success, so I am stuck now.

Is there anyway I could accomplish this task?

EDIT: sorry, I forgot to make one thing explicit: This query is to get information about subordinate teams of an employee. I've used an example as the CEO, but if I was the employee 1.2.3 in the hierarchy, I would like to see the teams 1.2.3.1, 1.2.3.2, ..., 1.2.3.xx

There is also the possibility (rare, but possible) that someone has more than 9 subordinates, so making a "hardcoded" substring would not work, since

$substr:["$hierarchy",0,3]} 

would work for 1.2 but not for 1.10 and $substr:["$hierarchy",0,4]}

would work for 1.10, but not for 1.2

Upvotes: 2

Views: 76

Answers (1)

BatScream
BatScream

Reputation: 19700

You can get your results using the below aggregate pipeline operations.

  • Sort the rows based on their hierarchy, so that the manager comes on top.
  • Group together records that start with similar ancestors.(i.e 1.1 or 1.2,...). The manager record will be on top for each group due to the above sort operation.
  • Take the count of each group, so the number of subordinates will be count-1(the manager record).
  • Again group the records to get a single array.

The code:

db.test.aggregate([
{$match:{"id":{$gt:1}}},
{$sort:{"hierarchy":1}},
{$group:{"_id":{"grp":{$substr:["$hierarchy",0,3]}},
         "manHeir":{$first:"$hierarchy"},
         "count":{$sum:1},"manager":{$first:"$id"}}},
{$project:{"manager":1,
           "hierarchy":"$manHeir",
            "subordinates":{$subtract:["$count",1]},"_id":0}},
{$group:{"_id":null,"teams":{$push:"$$ROOT"}}},
{$project:{"_id":0,"teams":1}}
])

Upvotes: 3

Related Questions