Reputation: 1307
I am new to MongoDB. I have a collection with the following fields:
> db.TestTable.findOne()
{
"_id" : ObjectId("527c48e99000cf10bc2a1d82"),
"ID" : "16587",
"Name" : "N15247",
"Serial1" : "11",
"Serial2" : "727",
"DateTime" : ISODate("1998-12-15T18:30:00Z"),
"CompID" : "ID465",
"CompName" : "F1460"
}
I have inserted around 300,000,000 documents into the collection via the C# driver, using BsonDocument. The collection's stats are:
> db.TestTable.stats()
{
"ns" : "FeatureParser.LogsTable",
"count" : 300000000,
"size" : 62399477600,
"avgObjSize" : 207.99825866666666,
"storageSize" : 68783787568,
"numExtents" : 54,
"nindexes" : 2,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 14878186064,
"indexSizes" : {
"_id_" : 9746789472,
"dateTime_1" : 5131396592
},
"ok" : 1
}
Does MongoDB really take this much space for the documents inserted? Is there any way the size of the DB can be reduced?
Thanks in advance.
Upvotes: 1
Views: 3386
Reputation: 222461
Others have already explained why the collection is so big, so instead of rephrasing their words, I will address the second question: how to decrease the size of the collection.
There is one nice way to reduce it. Because MongoDB stores the field names inside every single document, you can substantially reduce the size of the collection by shortening those names. You would then have documents like this:
{
"_id" : ObjectId("527c48e99000cf10bc2a1d82"),
"ID" : "16587",
"n" : "N15247",
"s1" : "11",
"s2" : "727",
"d" : ISODate("1998-12-15T18:30:00Z"),
"ci" : "ID465",
"cn" : "F1460"
}
and in your application layer you can create a mapping from these cryptic names back to the readable ones.
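With the official C# driver (which the question already uses), that mapping can live in attribute-based class maps, so your code keeps readable property names while the driver writes the short keys to disk. A minimal sketch, assuming the driver's [BsonElement] attribute; the LogEntry class and the particular short names are illustrative:

using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Readable property names in code; short field names on disk.
public class LogEntry
{
    public ObjectId Id { get; set; } // mapped to "_id" by convention

    public string ID { get; set; }   // already short, left as-is

    [BsonElement("n")]
    public string Name { get; set; }

    [BsonElement("s1")]
    public string Serial1 { get; set; }

    [BsonElement("s2")]
    public string Serial2 { get; set; }

    [BsonElement("d")]
    public DateTime DateTime { get; set; }

    [BsonElement("ci")]
    public string CompID { get; set; }

    [BsonElement("cn")]
    public string CompName { get; set; }
}

Inserting LogEntry instances instead of hand-built BsonDocuments then produces the compact documents shown above automatically.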
Upvotes: 1
Reputation: 66170
It's not clear in what way the stored size is considered huge: what size was expected?
I have inserted around [300M] documents
Each row is approximately 200 bytes:
{"_id" : ObjectId("527c48e99000cf10bc2a1d82"),"ID" : "16587","Name" : "N15247","Serial1" : "11","Serial2" : "727","DateTime" : ISODate("1998-12-15T18:30:00Z"),"CompID" : "ID465","CompName" : "F1460"}
^199 chars
Which is reported/confirmed as:
"avgObjSize" : 207.99825866666666 [bytes]
with a total data size of:
"size" : 62399477600 [bytes]
Therefore:
300,000,000 rows
× 200 bytes per row
= 60,000,000,000 bytes
This simply confirms that the estimate of the data inserted is very close to the actual size of the data in the collection (62.4 GB vs 60 GB).
The actual storage size is 68,783,787,568 bytes (~69 GB), which is also pretty close to the data size; the difference is overhead from record padding and pre-allocation of storage space (indexes are accounted for separately, under totalIndexSize).
As such, the results observed are exactly what should be expected. If the above isn't what's meant, please clarify by editing the question.
Upvotes: 3
Reputation: 42440
From http://docs.mongodb.org/manual/faq/storage/
Preallocated data files.
In the data directory, MongoDB preallocates data files to a particular size, in part to prevent file system fragmentation. MongoDB names the first data file <dbname>.0, the next <dbname>.1, and so on. The first file mongod allocates is 64 megabytes, the next 128 megabytes, and so on, up to 2 gigabytes, at which point all subsequent files are 2 gigabytes. The data files include files with allocated space but that hold no data: mongod may allocate a 1 gigabyte data file that is 90% empty. For most larger databases, unused allocated space is small compared to the database.
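To see how much of that preallocated space is actually in use, you can compare dataSize with fileSize from the dbStats command. A minimal sketch using the modern C# driver's RunCommand; the connection string and database name are illustrative, and fileSize is reported by the MMAPv1 storage engine this question is running on:

using System;
using MongoDB.Bson;
using MongoDB.Driver;

class StorageCheck
{
    static void Main()
    {
        // Illustrative connection details.
        var client = new MongoClient("mongodb://localhost:27017");
        var db = client.GetDatabase("FeatureParser");

        // dbStats reports dataSize (bytes of actual data) and, on MMAPv1,
        // fileSize (total size of the preallocated data files on disk).
        var stats = db.RunCommand<BsonDocument>(new BsonDocument("dbStats", 1));
        long dataSize = stats["dataSize"].ToInt64();
        long fileSize = stats["fileSize"].ToInt64();

        Console.WriteLine($"Data:    {dataSize} bytes");
        Console.WriteLine($"On disk: {fileSize} bytes");
        Console.WriteLine($"Unused:  {fileSize - dataSize} bytes (preallocation, padding)");
    }
}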
Upvotes: 3