Denormalizing JSON for MongoDB

Question

I think that's the word I'm looking for. I'm trying to get parent info into each of the cards. I think that's what I need to do, but chime in if you have any other ideas.

{
  "LEA": {
    "name": "Limited Edition Alpha",
    "code": "LEA",
    "releaseDate": "1993-08-05",
    "border": "black",
    "type": "core",
    "cards": [
      {"name": "Air Elemental"},
      {"name": "Earth Elemental"},
      {"name": "Fire Elemental"},
      {"name": "Water Elemental"}
    ]
  },
  "LEB": {
    "name": "Limited Edition Beta",
    "code": "LEB",
    "releaseDate": "1993-10-01",
    "border": "black",
    "type": "core",
    "cards": [
      {"name": "Armageddon"},
      {"name": "Fireball"},
      {"name": "Swords to Plowshares"},
      {"name": "Wrath of God"}
    ]
  }
}

This is a tiny subset of the data, obviously. LEA and LEB are sets of cards, and inside each set there are a bunch of cards. I'm thinking of denormalizing this into just the cards, with the set info added to each card. Something like this...

[
  {
    "name": "Air Elemental",
    "set": {
      "name": "Limited Edition Alpha",
      "code": "LEA",
      "releaseDate": "1993-08-05",
      "border": "black",
      "type": "core"
    }
  },
  {
    "name": "Earth Elemental",
    "set": {
      "name": "Limited Edition Alpha",
      "code": "LEA",
      "releaseDate": "1993-08-05",
      "border": "black",
      "type": "core"
    }
  },
  {
    "name": "Armageddon",
    "set": {
      "name": "Limited Edition Beta",
      "code": "LEB",
      "releaseDate": "1993-10-01",
      "border": "black",
      "type": "core"
    }
  },
  {
    "name": "Fireball",
    "set": {
      "name": "Limited Edition Beta",
      "code": "LEB",
      "releaseDate": "1993-10-01",
      "border": "black",
      "type": "core"
    }
  }
]

First and foremost, is my thinking right? Would I want a giant collection of cards, with the set information flattened into each card? In SQL, I'd have a table for the sets, and each card would belong_to a set. I'm trying to wrap my head around 'document thinking'.

Second, if my thinking is correct, any ideas on how I could achieve this denormalizing?

hjc1710 · Accepted Answer

Here you go =).

OK, here is where I would start. Since we've said that cards will never change (they're based on physical MTG cards), create one collection with all of your cards in it. This will be used for easily populating a user's deck later on. You can search it by card name or by some sort of card ID (like a physical one, stored on the card).

For the user's array of card objects, you shouldn't just store the _id field for a card, because that forces you to join. Since cards will never change, completely denormalize them and just shove them in that card array, so a user object, so far, resembles:

{
  name: "Tom Hanks",
  skill_level: 0,
  decks: [
    [
      { 
        card_name: "Balance", 
        card_description: "LONG_BLOCK_OF_DESCRIP_TEXT", 
        card_creator: "Sugargirl14", 
        type: "Normal",
        _id: $SOME_MONGO_ID_HERE,
        ... rest of card data...
      }, {
         ...card 2 complete data...
      }
    ],
    [
      { ...another deck here... }
    ]
  ]
}

OK, back to set info. I will also assume set info is constant (based on your SO post, I can't see how it would physically change). So, if that set info is always relevant to the card, I would denormalize and include it, changing our card object to:

      { 
        card_name: "Balance", 
        card_description: "LONG_BLOCK_OF_DESCRIP_TEXT", 
        card_creator: "Sugargirl14", 
        type: "Normal",
        _id: $SOME_MONGO_ID_HERE,
        set: {
          "name": "Limited Edition Alpha",
          "code": "LEA",
          "releaseDate": "1993-08-05",
          "border": "black",
          "type": "core",
          "_id": $SOME_MONGO_ID_HERE
        },
        ... rest of card data...
      }

I imagine that storing a set's other cards in the denormalized object for a given card isn't relevant; if it is, add them. Note that the outer key from your SO example ("LEA", "LEB") is dropped, since it always seems to == the "code" field.
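As for how to actually do the flattening, here is a minimal sketch in plain JavaScript. It assumes the source JSON is already parsed into an object keyed by set code, as in the question. The name `denormalizeSets` is made up for illustration, and Mongo IDs are left out because they would be assigned on insert.

```javascript
// Flatten { CODE: { ...setFields, cards: [...] } } into a list of card
// documents, each carrying a copy of its parent set's fields under `set`.
// denormalizeSets is a hypothetical helper name, not part of any library.
function denormalizeSets(setsByCode) {
  const cardDocs = [];
  // The outer key (e.g. "LEA") is ignored; it duplicates the `code` field.
  for (const setData of Object.values(setsByCode)) {
    // Split the set into its card list and everything else.
    const { cards = [], ...setInfo } = setData;
    for (const card of cards) {
      // Copy setInfo per card so the documents do not share one object.
      cardDocs.push({ ...card, set: { ...setInfo } });
    }
  }
  return cardDocs;
}

const exampleSets = {
  LEA: {
    name: "Limited Edition Alpha",
    code: "LEA",
    releaseDate: "1993-08-05",
    border: "black",
    type: "core",
    cards: [{ name: "Air Elemental" }, { name: "Earth Elemental" }],
  },
};

console.log(JSON.stringify(denormalizeSets(exampleSets), null, 2));
```

The resulting array is in the shape you could hand to a bulk insert such as `db.cards.insertMany(...)`, assuming a collection named `cards`.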

OK, now to properly answer your SO question about whether you should embed sets in cards, or vice versa. First off, both collections are relevant. So, even if we embed sets into cards, you'll want those sets in a collection so they can be fetched later and inserted into new cards.

Which gets embedded in which is really determined by business logic: how the data is used and which gets pulled more often. Are you frequently displaying sets and pulling cards from them (like for users to search)? You could embed all of the card data, or any relevant data, in each set's cards array. But with the above data model, each card stores its set's ID in its set object. I assume cards belong to only one set, so to get all cards for a set you can query your card collection where set._id == the Mongo ID of the set you want. Now sets need minimal updates (hopefully none at all, given the business logic), and your queries are still fast (and you get complete card objects). I'd honestly do the latter and keep my sets clean of cards. As such, a card owns the set it belongs to, as opposed to a set owning a card. That's a more SQLy way to think that actually can work fine in Mongo (you'll never join).
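To make that lookup concrete: in the mongo shell it would be roughly `db.cards.find({ "set._id": setId })`, with `cards` as the assumed collection name. The sketch below mimics that filter over an in-memory array so the shape of the query is easy to see. `cardsInSet` and the string IDs are illustrative stand-ins, not real Mongo IDs.

```javascript
// In-memory stand-in for: db.cards.find({ "set._id": setId })
// cardsInSet is a hypothetical helper. Plain strings are used as IDs here;
// real ObjectId values would be compared with .equals() instead of ===.
function cardsInSet(cards, setId) {
  return cards.filter((card) => card.set && card.set._id === setId);
}

const sampleCards = [
  { card_name: "Balance", set: { _id: "set-lea", code: "LEA" } },
  { card_name: "Armageddon", set: { _id: "set-leb", code: "LEB" } },
  { card_name: "Air Elemental", set: { _id: "set-lea", code: "LEA" } },
];

const leaCards = cardsInSet(sampleCards, "set-lea");
console.log(leaCards.map((card) => card.card_name));
```

If this query is frequent, an index on `set._id` (for example `db.cards.createIndex({ "set._id": 1 })`) is the usual way to keep it fast.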

So our final data model resembles:

Collection 1, sets:

//data model
{
    "name": "Limited Edition Alpha",
    "code": "LEA",
    "releaseDate": "1993-08-05",
    "border": "black",
    "type": "core",
    "_id": $SOME_MONGO_ID_HERE
}

Collection 2, cards:

//data model
{ 
  _id: $SOME_MONGO_ID_HERE,
  card_name: "Balance", 
  card_description: "LONG_BLOCK_OF_DESCRIP_TEXT", 
  card_creator: "Sugargirl14", 
  type: "Normal",
  set: {
    "name": "Limited Edition Alpha",
    "code": "LEA",
    "releaseDate": "1993-08-05",
    "border": "black",
    "type": "core",
    "_id": $SOME_MONGO_ID_HERE
  },
  ... rest of card data...
}

Collection 3, users:

{
  _id: $SOME_MONGO_ID_HERE,
  name: "Tom Hanks",
  skill_level: 0,
  decks: [
    [
      { 
        card_name: "Balance", 
        card_description: "LONG_BLOCK_OF_DESCRIP_TEXT", 
        card_creator: "Sugargirl14", 
        type: "Normal",
        _id: $SOME_MONGO_ID_HERE,
        set: {
          "name": "Limited Edition Alpha",
          "code": "LEA",
          "releaseDate": "1993-08-05",
          "border": "black",
          "type": "core",
          "_id": $SOME_MONGO_ID_HERE
        },
      }, {
         ...card 2 complete data...
      }
    ],
    [
      { ...another deck here... }
    ]
  ]
}

This obviously assumes set data for each card is relevant to the user. Now your data is denormalized, and since sets and cards rarely need updates (according to business logic), you shouldn't need cascading updates or deletes. Manipulating users is easy. To remove a card from a user's deck, do a $pull (that is the operator's name) on the relevant deck array, matching the item whose _id field == the Mongo ID of the card you want to remove. Other updates are just as simple.
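The removal described above can be sketched as an update document using MongoDB's `$pull` operator. This assumes the array-of-arrays layout shown earlier, with the deck addressed by its position in `decks`. `buildRemoveCardUpdate` is a made-up helper, and the commented `updateOne` call assumes a `users` collection.

```javascript
// Build the update that removes a card from one deck of a user document.
// Hypothetical helper: deckIndex is the deck's position in the decks array.
function buildRemoveCardUpdate(deckIndex, cardId) {
  // "decks.0" addresses the first inner array via dot notation.
  return { $pull: { [`decks.${deckIndex}`]: { _id: cardId } } };
}

const removeUpdate = buildRemoveCardUpdate(0, "card-123");
console.log(JSON.stringify(removeUpdate));
// Would be applied with something like:
//   db.users.updateOne({ _id: userId }, removeUpdate)
```

One thing to watch: `$pull` removes every element that matches the condition, so a deck holding several copies of the same card under one `_id` would lose all of them at once.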

In retrospect, you might want to make the user's decks like so:

decks: {
  "SOME_ID_HERE": [
    { ...card 1... },
    { ...card 2... }
  ] 
}

This makes identifying the decks MUCH easier and will make your pulls easier (you'll have more data on the frontend and the pull query will be more precise). The key can be a number, a random string, anything really, since it gets passed back to the frontend. Or just use a Mongo ID: when looking at a deck, a user will have its Mongo ID. Then when they pull a card out of it, or add one in, you have a direct identifier to easily grab the deck needed.
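With decks keyed by an ID, the field path can name the deck directly. A rough sketch of the add and remove updates under that layout follows. `addCardUpdate`, `removeCardUpdate`, and the `deck-a` key are hypothetical names chosen for the example.

```javascript
// Update builders for the keyed layout: decks: { "<deckId>": [ ...cards ] }.
// Both helpers are illustrative names, not driver APIs.
function addCardUpdate(deckId, card) {
  // $push appends the full (denormalized) card document to the deck.
  return { $push: { [`decks.${deckId}`]: card } };
}

function removeCardUpdate(deckId, cardId) {
  // $pull removes the card(s) in that deck whose _id matches.
  return { $pull: { [`decks.${deckId}`]: { _id: cardId } } };
}

const deckCard = { _id: "card-123", card_name: "Balance" };
console.log(JSON.stringify(addCardUpdate("deck-a", deckCard)));
console.log(JSON.stringify(removeCardUpdate("deck-a", "card-123")));
```

Because the deck ID becomes part of a field path, it is safest to pick keys without dots or a leading `$`.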

Obviously, all placeholder values like $SOME_MONGO_ID_HERE should really be Mongo ID objects (ObjectId in the shell, MongoId in the legacy PHP driver).

Whew, that was intense, 6800 characters. Hope it makes sense to you and I apologize if any verbiage is confusing or if any of my JSON objects' formatting is fucked up (just let me know if any prose is confusing, I'll reword). Does this make sense/solve your problem?
