ECMAScript

Reputation: 4649

Storing data efficiently in MongoLab and in general

I have an app that listens to a websocket and stores usernames and user IDs (usernames are 1-20 bytes, user IDs are 17 bytes). This is not a big deal because it's only one document. However, for every round a user participates in, it pushes the round ID (24 bytes) and a 'score' decimal value (e.g. 1190.0015239999999).

The thing is, there is no limit to how many rounds there are, and I can't afford to pay that much per month for MongoLab. What's the best way to handle this data?

My thoughts: if there is a way to replace the _id field in MongoDB, I will replace it with the userID, which is 17 bytes long. Not sure if I can do that, though.

tl;dr

Thanks in advance!

Edit:

Document Schema:

userID: {type: String},
userName: {type: String},
rounds: [{roundID: String, score: String}]

Upvotes: 0

Views: 240

Answers (2)

Markus W Mahlberg

Reputation: 20703

Modelling 1:n relationships as embedded documents is not the best choice except in very rare cases. This is because there is a 16MB size limit for BSON documents at the time of this writing, and an array that grows without bound will eventually hit it.
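To get a feeling for the numbers, here is a rough sketch (Node.js assumed, for Buffer) that estimates the BSON size of one embedded round using the sizes from the question. The helper names bsonSize and valueSize are made up for this illustration, it only handles strings, doubles, arrays and plain objects, and it ignores the small userID/userName part of the document.

```javascript
// Rough BSON size estimator for a small subset of types:
// strings, numbers (stored as 8-byte doubles), arrays and plain objects.
function bsonSize(doc) {
  // int32 length prefix + elements + trailing 0x00
  let size = 4 + 1;
  for (const [key, value] of Object.entries(doc)) {
    // type byte + key as a NUL-terminated string + the value itself
    size += 1 + Buffer.byteLength(key, "utf8") + 1 + valueSize(value);
  }
  return size;
}

function valueSize(value) {
  if (typeof value === "string") {
    // int32 length + UTF-8 bytes + trailing 0x00
    return 4 + Buffer.byteLength(value, "utf8") + 1;
  }
  if (typeof value === "number") return 8; // assumed to be stored as a double
  // Arrays are encoded like documents with "0", "1", ... as keys.
  if (typeof value === "object" && value !== null) return bsonSize(value);
  throw new TypeError("type not supported by this sketch");
}

// One embedded round, with a placeholder 24-byte round ID.
const roundAsStrings = { roundID: "a".repeat(24), score: "1190.0015239999999" };
const roundAsNumber = { roundID: "a".repeat(24), score: 1190.0015239999999 };

const stringBytes = bsonSize(roundAsStrings); // 73 bytes
const numberBytes = bsonSize(roundAsNumber);  // 58 bytes

// Each array element also costs a type byte plus its index as a key;
// 8 bytes covers indexes of up to 6 digits.
const MAX_DOC_BYTES = 16 * 1024 * 1024;
const roughMaxRounds = Math.floor(MAX_DOC_BYTES / (stringBytes + 8));

console.log({ stringBytes, numberBytes, roughMaxRounds });
```

So under these assumptions a single player document tops out at roughly two hundred thousand rounds, and storing the score as a number instead of a string already saves 15 bytes per round.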

A better (read: more scalable and efficient) approach is to use document references.

First, you need your player data, of course. Here is an example:

{
  _id: "SomeUserId",
  name: "SomeName"
}

There is no need for an extra userId field, since each document needs to have an _id field with unique values anyway. Contrary to popular belief, this field's value does not have to be an ObjectId. So we have already reduced the size you need for your player data by 1/3, if I am not mistaken.

Next, the results of each round:

{
  _id: {
    round: "SomeString",
    player: "SomeUserId"
  },
  score: 5,
  createdAt: ISODate("2015-04-13T01:03:04.0002Z")
}

A few things are worth noting here. First and foremost: do NOT use strings to record numeric values. Even grades should rather be stored as corresponding numerical values. Otherwise you cannot get averages and the like. I'll show more of that later. We are using a compound value for _id here, which is perfectly valid. Furthermore, it gives us a free unique index on the (round, player) pair. One caveat: that index covers the _id value as a whole, so dot-notation queries on "_id.round" or "_id.player" may need their own indexes to stay fast. The most likely queries are things like "How did player X score in round Y?"

db.results.find({"_id.player":"X","_id.round":"Y"})

or "What were the results of round Y?"

db.results.find({"_id.round":"Y"})

or "What were the scores of player X in all rounds?"

db.results.find({"_id.player":"X"})
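Since the scores arrive as strings in the question's schema, they have to be converted before being stored this way. A minimal sketch of that step could look like the following; toResultDoc and its parameter names are hypothetical, and how the values come out of the websocket message is left out.

```javascript
// Build a document for the results collection from raw websocket values.
// toResultDoc and its parameters are hypothetical names for this sketch.
function toResultDoc(roundID, userID, rawScore, now = new Date()) {
  // Number("") and Number(null) are 0, so reject blank input explicitly.
  if (rawScore === null || rawScore === undefined || String(rawScore).trim() === "") {
    throw new TypeError("score is missing");
  }
  const score = Number(rawScore);
  if (!Number.isFinite(score)) {
    throw new TypeError(`score is not numeric: ${rawScore}`);
  }
  return {
    _id: { round: roundID, player: userID }, // compound _id as shown above
    score,                                   // stored as a number, not a string
    createdAt: now,
  };
}

const example = toResultDoc("SomeRoundId", "SomeUserId", "1190.0015239999999");
console.log(example);
```

The returned object is what you would hand to an insert on the results collection.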

However, by not using a string to save the score, even some nifty stats become rather cheap, for example "What was the average score of round Y?"

db.results.aggregate([
  { $match: { "_id.round": "Y" } },
  { $group: { _id: "$_id.round", averageScore: { $avg: "$score" } } }
])

or "What is the average score of each player in all rounds?"

db.results.aggregate([
  { $group: { _id: "$_id.player", averageAll: { $avg: "$score" } } }
])

While you could do these calculations in your application, MongoDB can do them much more efficiently, since the data does not have to be sent to your app prior to processing it.
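For comparison, the application-side version would look roughly like this. It is plain JavaScript over made-up sample data, and averageBy is just a name chosen for this sketch. Note that every matching document has to be transferred to the app first, which is exactly the cost the aggregation pipeline avoids.

```javascript
// Made-up sample data shaped like the result documents above.
const results = [
  { _id: { round: "Y", player: "A" }, score: 10 },
  { _id: { round: "Y", player: "B" }, score: 20 },
  { _id: { round: "Z", player: "A" }, score: 30 },
];

// Group documents by a key and average their scores,
// similar in spirit to a $group stage with $avg.
function averageBy(docs, keyOf) {
  const totals = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    const entry = totals.get(key) || { sum: 0, count: 0 };
    entry.sum += doc.score;
    entry.count += 1;
    totals.set(key, entry);
  }
  return [...totals].map(([key, t]) => ({ _id: key, average: t.sum / t.count }));
}

// "What was the average score of round Y?"
const roundY = averageBy(
  results.filter((doc) => doc._id.round === "Y"),
  (doc) => doc._id.round
);

// "What is the average score of each player in all rounds?"
const perPlayer = averageBy(results, (doc) => doc._id.player);

console.log(roundY, perPlayer);
```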

Next, data expiration. We have a createdAt field of type ISODate. Now we let MongoDB take care of the rest by creating a TTL index:

db.results.createIndex(
  { "createdAt": 1 },
  { expireAfterSeconds: 60*60*24*30 }
)
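In case the expiry rule is unclear: a document becomes eligible for deletion once createdAt plus expireAfterSeconds lies in the past. Here is a small sketch of that check in plain JavaScript; isExpired is a hypothetical helper for illustration only, since MongoDB does this for you in a background task, so the actual removal can lag a bit behind.

```javascript
const EXPIRE_AFTER_SECONDS = 60 * 60 * 24 * 30; // 30 days, as in the index above

// True once createdAt + expireAfterSeconds lies in the past.
// isExpired is a hypothetical helper, not part of any MongoDB API.
function isExpired(createdAt, now = new Date(), expireAfterSeconds = EXPIRE_AFTER_SECONDS) {
  return createdAt.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const createdAt = new Date("2015-04-13T01:03:04.000Z");

// 18 days later: still kept.
const keptAfter18Days = isExpired(createdAt, new Date("2015-05-01T01:03:04.000Z"));
// 31 days later: eligible for removal.
const goneAfter31Days = isExpired(createdAt, new Date("2015-05-14T01:03:04.000Z"));

console.log({ keptAfter18Days, goneAfter31Days });
```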

So all in all, this should be pretty much the most efficient way of storing and expiring your data, while improving scalability at the same time.

Upvotes: 1

snewcomer

Reputation: 2135

So currently you are storing three data points in the array for each record.

Setting _id: false prevents Mongoose from automatically creating an _id for each subdocument in the array. If you don't need roundID, you can use the following, which stores only one data point per array element:

rounds: [{_id: false, score: String}]

Otherwise, if roundID actually has meaning, use the following, which stores two data points per array element:

rounds: [{_id: false, roundID: String, score: String}]

Lastly, if you just need an ID for reference purposes, use the following, which stores two data points per array element: an automatically generated _id and the score:

rounds: [{score: String}]

Upvotes: 0
