Reputation: 9448
I am using MongoDB as my backend. I have data for movies, music, books and more, all stored in one single collection. The compulsory fields for every BSON document are "_id", "name" and "category". The rest of the fields depend on the category the entry belongs to. For example, a movie record is stored like this:
{
"_id": <some_id>,
"name": <movie_name>,
"category": "movie",
"director": <director_name>,
"actors": <list_of_actors>,
"genre": <list_of_genre>
}
For music, I have:
{
"_id": <some_id>,
"name": <music_name>,
"category": "music",
"record_label": <label_name>,
"length": <length>,
"lyrics": <lyrics>
}
I have 12 different categories, and _id, name and category are the only fields they share. The rest of the fields differ from category to category. Is my decision to store all the data in one single collection fine, or should I create a separate collection per category?
Upvotes: 1
Views: 755
Reputation: 39264
MongoDB lets you store any field structure in a document, even if every document is different, so that isn't a concern. Because those three fields are present in every document, you can index them and build your queries around them. This is a good example of where a schemaless database helps, because you can store everything in a single collection.
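To illustrate indexing and querying on the shared fields, here is a minimal sketch. It assumes the pymongo driver and a collection handle called `items`; both are my assumptions, as are the helper names, since the question doesn't say which driver or collection name is in use.

```python
from typing import Any, Dict, List, Optional

# The three fields the question says every document carries.
REQUIRED_FIELDS = ("_id", "name", "category")


def missing_required(doc: Dict[str, Any]) -> List[str]:
    """Return the shared fields that a document lacks (hypothetical helper)."""
    return [field for field in REQUIRED_FIELDS if field not in doc]


def ensure_common_index(collection: Any) -> Any:
    """Create a compound index on category and name.

    `collection` is assumed to be a pymongo Collection. MongoDB indexes
    _id automatically, so it is not listed here.
    """
    return collection.create_index([("category", 1), ("name", 1)])


def by_category(category: str, name: Optional[str] = None) -> Dict[str, Any]:
    """Build a query filter that the compound index above can serve."""
    query: Dict[str, Any] = {"category": category}
    if name is not None:
        query["name"] = name
    return query
```

With a real connection this would be used as `ensure_common_index(items)` once, then `items.find(by_category("movie"))` for each lookup.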
There is no performance hit for using a single collection in this way. Indeed, there is a benefit, because you can shard the collection later as a scaling strategy. Sharding is done at the collection level, so you could shard on the _id field to distribute documents evenly (hashed, if your ids increase monotonically, as default ObjectIds do), on the category field to keep certain categories on particular shards, or on a combination of the two.
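The shard key options could look roughly like this. It is only a sketch: the namespace `media.items` is made up, `shard_command` is a hypothetical helper, and I'm assuming pymongo talking to a mongos router of an already configured sharded cluster.

```python
from typing import Any, Dict

# Option 1: hashed _id, which spreads documents evenly across shards.
HASHED_ID: Dict[str, Any] = {"_id": "hashed"}

# Option 2: category first, then _id, so each category stays together
# while still being splittable into several chunks.
BY_CATEGORY: Dict[str, Any] = {"category": 1, "_id": 1}


def shard_command(namespace: str, key: Dict[str, Any]) -> Dict[str, Any]:
    """Build the shardCollection admin command document.

    `namespace` is "<database>.<collection>". The command name must be
    the first key, which dict insertion order guarantees here.
    """
    return {"shardCollection": namespace, "key": key}


# With a real cluster (assumption: `client` is a pymongo MongoClient
# connected to mongos), the command would be sent as:
#   client.admin.command(shard_command("media.items", HASHED_ID))
```

Choose the key carefully, because changing it after the collection has been sharded is much harder than picking it up front.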
One thing to be aware of is future query requirements. If you need to index the other fields, you can use sparse indexes: documents without the indexed field are left out of the index, so they take up no space in it. A handy optimisation.
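A small sketch of the sparse index idea, again assuming pymongo. The sample documents are invented for illustration, and `sparse_index_members` only imitates in plain Python which documents such an index would hold, for top-level fields.

```python
from typing import Any, Dict, List

# Invented sample data in the shape described in the question.
SAMPLE: List[Dict[str, Any]] = [
    {"_id": 1, "name": "Example Movie", "category": "movie",
     "director": "Some Director"},
    {"_id": 2, "name": "Example Song", "category": "music",
     "record_label": "Some Label"},
]


def ensure_sparse_index(collection: Any, field: str) -> Any:
    """Create a sparse ascending index on one category-specific field.

    `collection` is assumed to be a pymongo Collection.
    """
    return collection.create_index([(field, 1)], sparse=True)


def sparse_index_members(docs: List[Dict[str, Any]],
                         field: str) -> List[Dict[str, Any]]:
    """Return the documents a sparse index on `field` would contain.

    A sparse index holds every document that has the field (even with
    a null value) and skips documents where the field is absent.
    """
    return [doc for doc in docs if field in doc]
```

Here `sparse_index_members(SAMPLE, "director")` keeps only the movie document, which is why an index on `director` costs nothing for the music, book and other records.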
You should also be aware of documents growing when you update them. Depending on the storage engine, a document that outgrows its allocated space has to be moved, and that can have a major performance impact.
Upvotes: 1
Reputation: 371
A single collection is best if you're searching across categories. Having a single collection might slow down inserts, but if you don't have a high write load, that shouldn't matter.
Upvotes: 1