Serbin

Reputation: 823

MongoDB decreasing size of document with huge arrays

I have a device that accumulates temperatures from different sides and saves them to the database every second. For each measurement I have a document like this:

{
    "_id" : ISODate("2017-05-05T22:07:37.924Z"),
    "north_side" : [  2660 elements * Int32 ],
    "east_side" : [  1330 elements * Int32 ],
    "south_side" : [ 2660 elements * Int32 ],
    "west_side" : [ 1330 elements * Int32 ]
}

Here _id is the timestamp of when the measurement was taken, and there is an array of temperatures for each side. In total, the device measures 7980 temperatures every second (in uint16_t format). But storing all these measurements for one month takes too much space.

I got statistics from db.getCollection('temperatures').stats(), and it shows avgObjSize = 75445 bytes. That is about 6.5 GB per month.

Storing 7980 temperatures as 32-bit values (am I forced to use 32 bits because MongoDB doesn't have a 16-bit type?) should take 31920 bytes. What does MongoDB use the other 43525 bytes for, and how can I decrease this value?
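As an aside, the gap can be estimated from how BSON encodes arrays: an array is stored as a document whose keys are the decimal index strings "0", "1", "2", ..., so each int32 element carries a type byte and a key on top of its 4 payload bytes. A rough back-of-the-envelope sketch in Python (sizes taken from the BSON spec):

```python
# Rough BSON size estimate for the document above.
# In BSON, arrays are documents keyed by the decimal string indices,
# so every element costs extra bytes beyond the 4-byte int32 payload.

def int32_array_bson_size(n):
    # 4-byte length prefix + 1 terminating null for the array document
    size = 4 + 1
    for i in range(n):
        # 1 type byte + index key string + null terminator + 4-byte int32
        size += 1 + len(str(i)) + 1 + 4
    return size

def field_overhead(name):
    # 1 type byte + field name + null terminator
    return 1 + len(name) + 1

sides = {"north_side": 2660, "east_side": 1330,
         "south_side": 2660, "west_side": 1330}

total = 4 + 1                        # outer document length prefix + null
total += field_overhead("_id") + 8   # ISODate is a 64-bit UTC datetime
for name, n in sides.items():
    total += field_overhead(name) + int32_array_bson_size(n)

print(total)  # 75444 -- essentially the reported avgObjSize of 75445
```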

Upvotes: 1

Views: 245

Answers (1)

Las Ten

Reputation: 1175

I assume the temperatures have digits after the decimal point, so they are not integers. In any case, MongoDB "treats all numbers as 64-bit floating-point double values by default."

So that's 8 bytes per number, not 4, which is much closer to the average object size you're referring to; the rest, I guess, are control values, array sizes, and so on.

You could decrease the size with reasonable simplifications: for example, store just one double-precision baseline value per side, then store the differences from it in tenths or hundredths (1/10, 1/100), combined into a single string, like

99|101|67|-13|-23|9|17 ...
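A minimal sketch of that idea (pack_side/unpack_side are hypothetical helper names), keeping one baseline per side and a string of integer differences in hundredths of a degree:

```python
# Hypothetical helpers illustrating the suggestion above: one
# double-precision baseline per side plus a "|"-joined string of
# integer differences in hundredths of a degree.

def pack_side(temps, precision=100):
    base = temps[0]
    diffs = (round((t - base) * precision) for t in temps)
    return base, "|".join(str(d) for d in diffs)

def unpack_side(base, packed, precision=100):
    return [base + int(d) / precision for d in packed.split("|")]

base, packed = pack_side([20.5, 20.53, 20.47, 21.1])
print(base, packed)   # 20.5 0|3|-3|60
```

The trade-off is that you give up the ability to query individual readings, and rounding to hundredths is lossy.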

Update: Or even better, if your language supports marshalling, create your strongly typed arrays in memory and store the marshalled object as binary. Storing just the differences from a baseline as integers could still help.
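A sketch of the marshalling idea in Python, using struct to serialize readings as raw little-endian uint16 values; in MongoDB the resulting bytes would go into a single binary (BinData) field rather than a BSON array (the field layout here is an assumption):

```python
import struct

# Sketch: serialize all readings for one side as raw little-endian
# uint16 values -- 2 bytes per reading -- so the blob can be stored
# in a single binary field instead of a per-element BSON array.

def pack_u16(values):
    return struct.pack(f"<{len(values)}H", *values)

def unpack_u16(blob):
    return list(struct.unpack(f"<{len(blob) // 2}H", blob))

readings = [2725, 2731, 2698, 2704]   # e.g. hundredths of a degree
blob = pack_u16(readings)
print(len(blob))                      # 8 bytes for 4 readings
assert unpack_u16(blob) == readings
```

Packed this way, 7980 uint16 readings take 15960 bytes of payload per document, at the cost of not being able to query individual elements server-side.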

Upvotes: 2
