Reputation: 33695
I am new to MongoDB and NoSQL. I actually have different follow up questions depending on how this question gets answered. And I'll post my follow up questions as a separate question. Here we go...
I am trying to model a database to help me answer a question such as "Find all departments where 2 (TWO) or more teams each have 2 (TWO) or more employees known to have caused accidents greater than their team's max_accidents." If I were allowed to use a relational database with MySQL, I would solve the problem by making these tables:
department:department_id, location_id (FK to a location table not described here), unit_type
team: team_id, department_id, max_accidents
employee: employee_id, team_id, accidents
And then I'd use this query (untested, but hopefully you get the idea):
SELECT department_id FROM team
WHERE EXISTS (
SELECT 1 FROM department
WHERE department.department_id = team.team_id
AND team.team_id IN (
SELECT team_id FROM employee
WHERE EXISTS (
SELECT 1 FROM team
WHERE team.team_id = employee.team_id
AND employee.accidents > team.max_accidents
) GROUP BY team_id HAVING COUNT(*) >=2
)
) GROUP BY department_id HAVING COUNT(*) >= 2
From what I understand about NoSQL databases, I can see two ways to model my collections. First, I could model each collection in exactly the same way I set out my tables above, meaning foreign keys would exist. The second possible way is this:
department = {_id,teams:[]team};
team = {_id,max_accidents,employees:[]employee};
employee = {_id,accidents};
My guess is that I should use the second approach where I embed arrays of documents. Then to perform my query, I would need to learn how to use the MongoDB aggregate framework as demonstrated in this question here:
Compare embedded document to parent field with mongoDB
I can build upon the aggregate approach to implement my HAVING COUNT(*)
behaviour by using the $match
feature as demonstrated in this question here:
What is the correct way to do a HAVING in a MongoDB GROUP BY?
I'd like confirmation if I'm approaching this problem correctly? If not, would be great if someone could explain why I might be approaching it the wrong way or what I might need to concern myself with.
Upvotes: 0
Views: 418
Reputation: 3424
From MongoDB Documentation
In general, use embedded data models when:
- you have “contains” relationships between entities. See Model One-to-One Relationships with Embedded Documents.
- you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents. See Model One-to-Many Relationships with Embedded Documents.
In general, embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation. Embedded data models make it possible to update related data in a single atomic write operation.
This is a fair enough guideline. However you can take you call depending on your case.
Ask questions:
If answer is yes, would not think about embedded documents.
Consider a scenario where one employee is part of multiple teams. That means the employee object exists in multiple documents.
Which can cause: data duplication, need more storage, make updates redundant.
Upvotes: 1