Jon

Reputation: 43

Efficient MongoDB database structure

I'm building a simple database with MongoDB and I have some questions regarding an efficient document structure. I essentially have three different parts: users, events and costs. See my current approach below.

{
  "first_name": "John",
  "last_name": "Doe",
  "email": "[email protected]",
  "phone": "123456",
  "events": [
    {
      "_id": "*MongoId reference to event*",
      "status": 0,
      "owner": 1,
      "costs": [
        {
          "id": 1,
          "name": "Test",
          "amount": 59.99,
          "created": "27/12/16 16:47:34 UTC",
          "updated": "27/12/16 16:47:34 UTC"
        }
      ],
      "created": "27/12/16 16:47:34 UTC",
      "updated": "27/12/16 16:47:34 UTC"
    }
  ],
  "created": "27/12/16 16:47:34 UTC",
  "updated": "27/12/16 16:47:34 UTC"
}

Multiple users will be connected to the same event, hence the MongoID reference, but a cost obviously only belongs to one user-event combination. I have some sample use cases:

  1. List a user's events by user_id (fast)
  2. List an event's users by event_id (speed?)
  3. List an event's costs by event_id (speed?)
  4. Find a cost by user_id, event_id and cost_id (fast)

Would use cases 2 and 3 perform within acceptable limits, and is this an efficient structure for my needs?

Upvotes: 0

Views: 89

Answers (1)

Rahul Kumar

Reputation: 2831

IMHO your approach is correct as far as denormalization goes, since you want to keep the data and its relations together.

The problem I see is that you have an array of objects inside another array of objects (costs inside events). MongoDB queries are usually great, but they do not handle nested arrays very efficiently. Nested objects are easier to handle, IMO.

Additionally, indexing a field inside a nested array would be messy and might not give you the desired results.
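
To illustrate why the nested layout is awkward: the plain positional `$` operator cannot traverse more than one array, so changing a single cost inside `events[].costs[]` needs filtered positional operators. A minimal sketch, assuming MongoDB 3.6 or later (for `arrayFilters`) and PyMongo, and assuming the user document's `_id` is what the question calls user_id; the helper name `nested_cost_update` and the identifiers `e` and `c` are hypothetical:

```python
def nested_cost_update(user_id, event_id, cost_id, amount):
    """Build the arguments for updating one cost in the nested layout.

    Returns (query, update, array_filters), meant to be passed as
    collection.update_one(query, update, array_filters=array_filters).
    """
    # Assumption: the user document's _id is the user_id.
    query = {"_id": user_id}
    # $[e] selects the matching event, $[c] the matching cost inside it.
    update = {"$set": {"events.$[e].costs.$[c].amount": amount}}
    array_filters = [{"e._id": event_id}, {"c.id": cost_id}]
    return query, update, array_filters
```

This only builds the query documents; it does not measure performance, so treat it as an illustration of the extra machinery rather than a benchmark.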

It depends entirely on your requirements, but if I had to model it, it would look like this:

{
  "first_name": "John",
  "last_name": "Doe",
  "email": "[email protected]",
  "phone": "123456",
  "events": [
    {
      "_id": "*MongoId reference to event*",
      "status": 0,
      "owner": 1,
      "created": "27/12/16 16:47:34 UTC",
      "updated": "27/12/16 16:47:34 UTC"
    }
  ],
  "costs": [
    {
      "id": 1,
      "event_id": "*Appropriate event id*",
      "name": "Test",
      "amount": 59.99,
      "created": "27/12/16 16:47:34 UTC",
      "updated": "27/12/16 16:47:34 UTC"
    }
  ],
  "created": "27/12/16 16:47:34 UTC",
  "updated": "27/12/16 16:47:34 UTC"
}
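
As a rough sketch of how the four use cases from the question could be expressed against the model above, here are the query documents built as plain Python dicts in the shape PyMongo accepts. The function names are hypothetical, and treating the user document's `_id` as the user_id is an assumption:

```python
def user_events(user_id):
    """Use case 1: (filter, projection) for collection.find_one(...)."""
    return {"_id": user_id}, {"events": 1}


def users_of_event(event_id):
    """Use case 2: filter for collection.find(...)."""
    return {"events._id": event_id}


def costs_of_event(event_id):
    """Use case 3: pipeline for collection.aggregate(...)."""
    return [
        # Keep only users that have at least one cost for this event.
        {"$match": {"costs.event_id": event_id}},
        # One output document per cost entry.
        {"$unwind": "$costs"},
        # Drop the costs that belong to other events.
        {"$match": {"costs.event_id": event_id}},
        # Return the cost together with the user it belongs to.
        {"$project": {"_id": 0, "user_id": "$_id", "cost": "$costs"}},
    ]


def one_cost(user_id, event_id, cost_id):
    """Use case 4: (filter, projection) for collection.find_one(...)."""
    match = {"$elemMatch": {"id": cost_id, "event_id": event_id}}
    return {"_id": user_id, "costs": match}, {"costs": match}
```

Use cases 2 and 3 both filter on a field inside an array, which is where an index on that field matters most.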

Additionally, I would put indexes on both events and costs for performance reasons.
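
One way to read "indexes on both events and costs" is an index on the event reference inside each array; MongoDB builds a multikey index when the indexed field lives in an array. A minimal sketch under that assumption, where the chosen fields, `INDEX_SPECS`, and `ensure_indexes` are all hypothetical and `collection` is expected to be a PyMongo collection:

```python
# Key specifications in the (field, direction) form PyMongo's create_index
# accepts; 1 means ascending. The field choice is an assumption.
INDEX_SPECS = [
    [("events._id", 1)],      # supports "list an event's users"
    [("costs.event_id", 1)],  # supports "list an event's costs"
]


def ensure_indexes(collection):
    """Create each index and return whatever create_index returns for it."""
    return [collection.create_index(spec) for spec in INDEX_SPECS]
```

Which fields are worth indexing depends on the real query mix, so it is worth checking the query plans before settling on these.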

The advantage, I think, is that it fits all of your use cases performance-wise, and it is easier to update cost data by user and event ID.
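
For instance, with costs in a single top-level array, one cost can be updated with the ordinary positional `$` operator. A minimal sketch, assuming PyMongo and that the user document's `_id` is the user ID; `flat_cost_update` is a hypothetical helper name:

```python
def flat_cost_update(user_id, event_id, cost_id, changes):
    """Build (query, update) for collection.update_one(query, update).

    `changes` maps cost field names to new values, e.g. {"amount": 10.0}.
    """
    query = {
        "_id": user_id,
        # $elemMatch makes both conditions apply to the same array element.
        "costs": {"$elemMatch": {"id": cost_id, "event_id": event_id}},
    }
    # "costs.$" refers to the first element matched by the query above.
    update = {"$set": {f"costs.$.{field}": value for field, value in changes.items()}}
    return query, update
```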

The drawback is that you may have to map events to costs at the application level. Also, if you delete an event, you have to write an update that removes the corresponding costs. Thankfully, the event and its costs can be removed in a single update, which is atomic for a single document.
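
Both points can be sketched briefly. The first helper builds a single `$pull` update that removes an event and its costs from one user document; the second groups a fetched user document's costs by event in application code. This assumes PyMongo and that `_id` is the user ID, and the names `remove_event_update` and `costs_by_event` are hypothetical:

```python
from collections import defaultdict


def remove_event_update(user_id, event_id):
    """Build (query, update) for collection.update_one(query, update)."""
    query = {"_id": user_id}
    # One $pull touching both arrays, applied to a single document.
    update = {
        "$pull": {
            "events": {"_id": event_id},
            "costs": {"event_id": event_id},
        }
    }
    return query, update


def costs_by_event(user_doc):
    """Group a user document's costs by their event_id."""
    grouped = defaultdict(list)
    for cost in user_doc.get("costs", []):
        grouped[cost["event_id"]].append(cost)
    return dict(grouped)
```

Note that the update covers one user document; removing an event for every attached user would take one such update per user, or an `update_many` with a filter on `events._id`.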

There are other possible approaches, but at some point you have to settle on a trade-off.

Upvotes: 1
