Reputation: 737
MongoDB's documentation suggests putting as much data as possible in a single document. It also suggests NOT using ObjectId-ref-based sub-documents unless the data in those sub-documents must be referenced from more than one document.
In my case I have a one-to-many relationship like this:
Log schema:
const model = (mongoose) => {
  const LogSchema = new mongoose.Schema({
    result: { type: String, required: true },
    operation: { type: Date, required: true },
    x: { type: Number, required: true },
    y: { type: Number, required: true },
    z: { type: Number, required: true }
  });
  const model = mongoose.model("Log", LogSchema);
  return model;
};
Machine schema:
const model = (mongoose) => {
  const MachineSchema = new mongoose.Schema({
    model: { type: String, required: true },
    description: { type: String, required: true },
    logs: [ mongoose.model("Log").schema ]
  });
  const model = mongoose.model("Machine", MachineSchema);
  return model;
};
module.exports = model;
Each Machine will have many Production_Log documents (more than one million). Using embedded documents, I hit the 16 MB per-document limit very quickly during my tests and couldn't add any more Production_Log documents to the Machine documents.
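To get a feel for how quickly that limit is reached, here is a rough back-of-the-envelope sketch. The 16 MB cap is 16 × 1024 × 1024 bytes; the figure of about 100 bytes per embedded log is an assumption for illustration, not a measurement, and `maxEmbeddedLogs` is a hypothetical helper.

```javascript
// MongoDB's maximum BSON document size: 16 MB (16 * 1024 * 1024 bytes).
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: how many embedded logs fit in one parent document,
// given an assumed average size per log. The parent's own fields are ignored.
function maxEmbeddedLogs(bytesPerLog) {
  if (!(bytesPerLog > 0)) {
    throw new RangeError("bytesPerLog must be a positive number");
  }
  return Math.floor(MAX_BSON_BYTES / bytesPerLog);
}

// Assumption: roughly 100 bytes per embedded log.
console.log(maxEmbeddedLogs(100));
```

Under that assumption a single Machine document holds well under 200,000 logs, far short of the one million expected per machine.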
My questions
Is this a case where one should be using sub-documents as ObjectId references instead of embedded documents?
Is there any other solution I could evaluate?
I will be accessing Production_Log documents to generate stats for each Machine using the aggregation framework. Are there any extra considerations I should take into account in the schema design?
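For concreteness, the ObjectId-reference variant asked about in the first question might look like the sketch below. It follows the same factory pattern as the schemas above; the `machine` field name, the `logModel` name, and the compound index are assumptions for illustration, not a tested design.

```javascript
// Sketch: each Log stores only the _id of its Machine instead of being
// embedded in it, so the Machine document no longer grows with every log.
const logModel = (mongoose) => {
  const LogSchema = new mongoose.Schema({
    // Assumed field name: reference to the owning Machine document.
    machine: {
      type: mongoose.Schema.Types.ObjectId,
      ref: "Machine",
      required: true
    },
    result: { type: String, required: true },
    operation: { type: Date, required: true },
    x: { type: Number, required: true },
    y: { type: Number, required: true },
    z: { type: Number, required: true }
  });

  // Assumed index: supports per-machine queries filtered or sorted by time.
  LogSchema.index({ machine: 1, operation: 1 });

  return mongoose.model("Log", LogSchema);
};
```

With this layout each Log is its own small document, so the 16 MB limit applies per log rather than per machine.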
Thank you very much in advance for your advice!
Upvotes: 1
Views: 420
Reputation: 13113
MongoDB scales better if you store the full information in a single document, accepting some data redundancy. Database normalization forces you to split data across different collections, and once your data grows, that causes bottlenecks.
Use only a Log collection
Schema:
const model = (mongoose) => {
  const LogSchema = new mongoose.Schema({
    model: { type: String, required: true },
    description: { type: String, required: true },
    result: { type: String, required: true },
    operation: { type: Date, required: true },
    x: { type: Number, required: true },
    y: { type: Number, required: true },
    z: { type: Number, required: true }
  });
  const model = mongoose.model("Log", LogSchema);
  return model;
};
Read and write operations scale well this way.
With the aggregation framework you can process the data and compute the desired results.
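As an illustration of the aggregation step, the sketch below builds a pipeline that computes per-model statistics over the flattened Log collection. The choice of statistics (a count plus the averages of x, y, and z), the time-window filter, and the helper name `buildStatsPipeline` are assumptions; `$match`, `$group`, and `$sort` are standard aggregation stages.

```javascript
// Hypothetical helper: builds an aggregation pipeline for logs whose
// `operation` date falls in [from, to), grouped by machine model.
function buildStatsPipeline(from, to) {
  return [
    // Keep only logs inside the requested time window.
    { $match: { operation: { $gte: from, $lt: to } } },
    // One result per machine model, with a count and per-axis averages.
    {
      $group: {
        _id: "$model",
        count: { $sum: 1 },
        avgX: { $avg: "$x" },
        avgY: { $avg: "$y" },
        avgZ: { $avg: "$z" }
      }
    },
    // Stable ordering of the output by model name.
    { $sort: { _id: 1 } }
  ];
}
```

With the Log model from the schema above, this could be run as `Log.aggregate(buildStatsPipeline(from, to))`.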
Upvotes: 2
Reputation: 5466
Please see if this approach suits your needs.
The Log collection will keep growing as data is generated, whereas a Machine document will never exceed 16 MB. Instead of embedding Log documents in the Machine collection, try the reverse.
Your modified schema would look like this:
Machine schema:
const model = (mongoose) => {
  const MachineSchema = new mongoose.Schema({
    model: { type: String, required: true },
    description: { type: String, required: true }
  });
  const model = mongoose.model("Machine", MachineSchema);
  return model;
};
module.exports = model;
Log schema:
const model = (mongoose) => {
  const LogSchema = new mongoose.Schema({
    result: { type: String, required: true },
    operation: { type: Date, required: true },
    x: { type: Number, required: true },
    y: { type: Number, required: true },
    z: { type: Number, required: true },
    machine: [ mongoose.model("Machine").schema ]
  });
  const model = mongoose.model("Log", LogSchema);
  return model;
};
If we still exceed the 16 MB document size limit, then in the Log schema we can create a new document for every hour, day, or week, depending on the volume of logs we are generating.
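One way to read the per-hour/day/week idea is as time bucketing: logs for one machine and one time window share a single document. The sketch below only shows how a bucket key could be derived; `bucketKey`, the key format, and the use of UTC are assumptions, and weekly buckets are left out for brevity.

```javascript
// Hypothetical helper: derives the key of the bucket document that a log
// belongs to, from the machine id and the log's timestamp (in UTC).
function bucketKey(machineId, date, granularity = "day") {
  const iso = date.toISOString(); // e.g. "2020-01-02T03:04:05.000Z"
  if (granularity === "day") {
    return `${machineId}:${iso.slice(0, 10)}`; // YYYY-MM-DD
  }
  if (granularity === "hour") {
    return `${machineId}:${iso.slice(0, 13)}`; // YYYY-MM-DDTHH
  }
  throw new RangeError(`unsupported granularity: ${granularity}`);
}
```

A finer granularity yields smaller bucket documents, so it can be chosen to match how many logs a machine produces per window.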
Upvotes: -1