Reputation: 637
I have the following time series data in MongoDB.
The channels collection holds one document per channel. Each channel document has a real-time data field, rtData, which is a JSON array holding the time series data for that channel.
db.channels.find({}).pretty() returns a structure similar to this:
{
channelName:"ABC",
rtData:[
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
.
.
.
]
},
{
channelName:"NBC",
rtData:[
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
{
ts:ISO_DATE(timestamp),
data:[12, 14]
},
.
.
.
]
},
Now, every 4 seconds I get an updated record for one or more channels:
{
ts: ISO_DATE(timestamp),
data: [14,15]
}
I need to push this record into the rtData array of that channel.
So what I did was something similar to this:
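channels.findOne({query}, function (err, channel) {
  if (err) throw err;
  // Append the new record to the in-memory copy of the array
  channel.rtData.push(newData);
  // Write the entire array back to the document
  channels.findAndModify({query}, {$set: {rtData: channel.rtData}},
    function (err, updated) {});
});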
Find the channel, push the data into the rtData array, and do a find-and-modify.
This seems to work when the data volume is low. But when there are close to 50K elements in the rtData array of a single channel, the app is not able to handle it.
Is there an efficient way to update the time series data?
Upvotes: 0
Views: 1831
Reputation: 20712
You overembedded your model, imho. Keep in mind that there is a 16MB size limit on BSON documents in MongoDB. Furthermore, modifying a document takes a lot more time than simply inserting one.
Since an ObjectID contains a timestamp already and additionally gives you the uniqueness for each entered value, something like
{
_id: new ObjectID(),
channelName: "NBC",
value: [14,15]
}
makes inserting your data extremely efficient. For querying based on date, you can use to your advantage that an ObjectID starts with a hex representation of a 4-byte value containing the seconds since the epoch and is otherwise monotonically increasing:
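// Seconds since the epoch, which form the 4-byte prefix of an ObjectId
secs = Math.floor(Date.now()/1000)
hexSecs = secs.toString(16)
// Pad with 16 zeros to get a full 24-character ObjectId
id = ObjectId(hexSecs+"0000000000000000")
db.values.find({"_id":{$lt:id}})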
This example would give you all entries older than the time Date.now() was called. You can query ranges, of course, using the same method to convert your beginning and end dates. For a more detailed explanation, see the corresponding entry "Popping time stamps into ObjectIds" on Kristina Chodorow's blog.
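A range query along those lines could be sketched as follows. The helper names objectIdHexFromDate and objectIdRangeFilter are made up for illustration, and the toId parameter is an assumption that lets you pass in whatever ObjectId constructor your shell or driver provides.

```javascript
// Hypothetical helper: smallest ObjectId hex string for a given date.
function objectIdHexFromDate(date) {
  const secs = Math.floor(date.getTime() / 1000);
  // 8 hex characters of seconds, then 16 zeros for the remaining 8 bytes
  return secs.toString(16).padStart(8, "0") + "0000000000000000";
}

// Hypothetical helper: filter for entries created in [start, end).
// toId converts the hex string into an ObjectId; by default the hex
// string is returned unchanged, which is only useful for inspection.
function objectIdRangeFilter(start, end, toId = (hex) => hex) {
  return {
    _id: {
      $gte: toId(objectIdHexFromDate(start)),
      $lt: toId(objectIdHexFromDate(end)),
    },
  };
}
```

In a shell where ObjectId is callable, something like db.values.find(objectIdRangeFilter(start, end, ObjectId)) should work; with a driver that exposes ObjectId as a class, pass (hex) => new ObjectId(hex) instead.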
The rest of the queries and aggregations should be obvious.
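On the write side, the one-document-per-sample model above turns each 4-second update into a plain insert. A minimal sketch, assuming the incoming batch is an array of { channelName, data } objects (that shape and the helper name toSampleDocuments are hypothetical):

```javascript
// Hypothetical helper: turn one batch of updates into one document per sample.
// The { channelName, data } input shape is an assumption for illustration.
function toSampleDocuments(updates) {
  return updates.map(function (u) {
    // No _id is set here; an ObjectID is assigned when the document is inserted.
    return { channelName: u.channelName, value: u.data };
  });
}
```

The result can then be handed to a single bulk insert, for example db.values.insertMany(toSampleDocuments(updates)). Note that the timestamp inside the ObjectID reflects when the ID was generated, not when the sample was measured, so this only fits if records are inserted as they arrive.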
Upvotes: 1