user2572790

Reputation: 474

How to index a MongoDB collection on embedded properties?

Example of data structure:

{
    "result": {
        "status": 1,
        "num_results": 1,
        "total_results": 500,
        "results_remaining": 499,
        "matches": [
            {
                "match_id": 792680045,
                "match_seq_num": 712015697,
                "start_time": 1406113521,
                "lobby_type": 8,
                "radiant_team_id": 0,
                "dire_team_id": 0,
                "players": [
                    {
                        "account_id": 4294967295,
                        "player_slot": 0,
                        "hero_id": 0
                    },
                    {
                        "account_id": 137113820,
                        "player_slot": 128,
                        "hero_id": 11
                    }
                ]

            }
        ]

    }
}

This is only a small part of the data: the original has 100 matches in each list and 10 players in every match, and about 10 million new matches come in per month.

This is Dota 2 match history. I want two types of search to be fast:

  1. Search by match params (match_id, start_time and lobby_type)
  2. Search matches by player (account_id in the embedded players data)

How should I organize this in MongoDB?

Upvotes: 0

Views: 49

Answers (1)

Philipp

Reputation: 69663

Your database schema seems fine for the two queries you mentioned. Just create the following indexes:

// createIndex replaces the deprecated ensureIndex alias, which newer shells no longer provide
db.collection.createIndex({ "result.matches.match_id" : 1 });
db.collection.createIndex({ "result.matches.start_time" : 1 });
db.collection.createIndex({ "result.matches.lobby_type" : 1 });
db.collection.createIndex({ "result.matches.players.account_id" : 1 });
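
As a rough sketch of how the two searches could look against these indexed fields, the helpers below only build the filter documents. The function names and the example time range are my own placeholders, not anything from your application. Keep in mind that find() returns whole documents, so every hit carries its complete result.matches array, not only the match that satisfied the filter.

```javascript
// Search 1: match parameters. $elemMatch requires all conditions to hold
// for the same element of the result.matches array.
function matchParamsFilter(lobbyType, startFrom, startTo) {
  return {
    "result.matches": {
      $elemMatch: {
        lobby_type: lobbyType,
        start_time: { $gte: startFrom, $lt: startTo }
      }
    }
  };
}

// Search 1, single match: look up one match by its id.
function matchIdFilter(matchId) {
  return { "result.matches.match_id": matchId };
}

// Search 2: documents containing at least one match played by this account.
function playerFilter(accountId) {
  return { "result.matches.players.account_id": accountId };
}

// In the mongo shell these would be passed to find(), for example:
//   db.collection.find(matchIdFilter(792680045))
//   db.collection.find(playerFilter(137113820))
//   db.collection.find(matchParamsFilter(8, 1406100000, 1406200000)).explain()
```

Running explain() on a query like the last one shows whether one of the indexes is actually being used.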

If you need more than the account_id for these use cases, for example the players' names, you should duplicate that information in the player subdocument so that you don't need a subsequent query to your collection of players.
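
A minimal sketch of that duplication, assuming you already have the names available when you store a match. The helper withPlayerNames and the Map of names are hypothetical, and the sample name is made up for illustration:

```javascript
// Returns a copy of the match in which every player subdocument also
// carries a "name" field (null when the account is unknown).
// namesByAccountId is a hypothetical Map: account_id -> display name.
function withPlayerNames(match, namesByAccountId) {
  var players = match.players.map(function (player) {
    var known = namesByAccountId.has(player.account_id);
    return Object.assign({}, player, {
      name: known ? namesByAccountId.get(player.account_id) : null
    });
  });
  return Object.assign({}, match, { players: players });
}

var exampleMatch = {
  match_id: 792680045,
  players: [
    { account_id: 4294967295, player_slot: 0, hero_id: 0 },
    { account_id: 137113820, player_slot: 128, hero_id: 11 }
  ]
};
var exampleNames = new Map([[137113820, "some_player"]]); // made-up name
var enrichedMatch = withPlayerNames(exampleMatch, exampleNames);
```

The price of this is that a renamed player has to be updated in every match that embeds the old name, so it only pays off when reads clearly dominate.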

But I have a concern which might affect your write performance: MongoDB doesn't like documents which grow over time. MongoDB always tries to keep each document in a contiguous section of the database file to improve read performance. That means that when an update increases the size of a document, the document needs to be moved to the end of the file, which is an expensive operation. So if your document starts with one match and then receives more and more each day, your update performance could suffer. As a tradeoff, you could create a separate collection for matches.
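
To illustrate the separate-collection option, here is a rough sketch that turns one response of the shape shown in the question into one document per match. The function name toMatchDocuments and the collection name "matches" are placeholders, and using match_id as _id assumes match ids are unique:

```javascript
// Splits one API response into standalone match documents.
// Assumption: match_id is unique, so it can double as _id
// (MongoDB always keeps a unique index on _id).
function toMatchDocuments(response) {
  return response.result.matches.map(function (match) {
    return Object.assign({ _id: match.match_id }, match);
  });
}

var sampleResponse = {
  result: {
    status: 1,
    matches: [
      {
        match_id: 792680045,
        start_time: 1406113521,
        lobby_type: 8,
        players: [
          { account_id: 4294967295, player_slot: 0, hero_id: 0 },
          { account_id: 137113820, player_slot: 128, hero_id: 11 }
        ]
      }
    ]
  }
};
var matchDocs = toMatchDocuments(sampleResponse);

// In the mongo shell the result could then be stored and indexed like this:
//   db.matches.insertMany(matchDocs)
//   db.matches.createIndex({ start_time: 1 })
//   db.matches.createIndex({ lobby_type: 1 })
//   db.matches.createIndex({ "players.account_id": 1 })
```

With this layout each match document is written once and never grows, and a query by account_id returns only the matching matches instead of a whole batch.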

Upvotes: 2
