Reputation: 307
I have a populated MongoDB database.
Now I need to add large amounts of additional data (log file contents) to my documents. This data exceeds the BSON document size limit, so saving fails with:
Document too large: This BSON document is limited to 16777216 bytes. (BSON::InvalidDocument)
A simplified example of my situation would look like this:
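require 'mongo'
include Mongo

cli = MongoClient.new("localhost", MongoClient::DEFAULT_PORT)
db = cli.db("testdb")
coll = db.collection("test")
# The :log_file value alone is 17,000,000 bytes, above the 16,777,216-byte limit
data = {:name => "Customer1", :data1 => "some value", :log_file => "A" * 17_000_000}
coll.save data  # raises BSON::InvalidDocument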
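The size problem can be checked without a database connection. The following is a minimal sketch in plain Ruby; the helper name `exceeds_bson_limit?` is hypothetical, and it only compares the payload size against the limit quoted in the error message (the real document is slightly larger because of its other fields).

```ruby
# Maximum BSON document size, as quoted in the error message.
MAX_BSON_BYTES = 16 * 1024 * 1024  # 16_777_216

# Hypothetical helper: true when a string alone already exceeds the limit.
def exceeds_bson_limit?(payload, limit = MAX_BSON_BYTES)
  payload.bytesize > limit
end

log_file = "A" * 17_000_000
puts exceeds_bson_limit?(log_file)        # the payload is over the limit
puts log_file.bytesize - MAX_BSON_BYTES   # number of bytes over the limit
```

A check like this could be used to decide whether a value can stay inline or has to be stored elsewhere.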
Upvotes: 0
Views: 220
Reputation: 307
The paragraph about document growth finally answered my question. (Found by following Konrad's link.)
http://docs.mongodb.org/manual/core/data-model-operations/#data-model-document-growth
What I am doing now is essentially this: the log data goes into GridFS, and the document only stores the id of the stored file.
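require 'mongo'
include Mongo

cli = MongoClient.new("localhost", MongoClient::DEFAULT_PORT)
db = cli.db("testdb")
coll = db.collection("test")
grid = Grid.new(db)

# Store data: put the large payload into GridFS and keep only its id in the document
id = grid.put("A" * 17_000_000)
data = {:name => "Customer1", :data1 => "some value", :log_file => id}
coll.save(data)

# Access data: look up the document, then fetch the payload from GridFS by id
cust = coll.find({:name => "Customer1"})
id = cust.first["log_file"]
data = grid.get(id).read  # grid.get returns a GridIO object; read returns its contents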
Upvotes: 1
Reputation: 4479
Maybe you can split your document into several smaller documents and reference them from each other. See this SO post: syntax for linking documents in mongodb
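As a rough illustration of splitting and referencing, here is a minimal sketch in plain Ruby that builds one parent document and several chunk documents pointing back at it. Everything in it is an assumption made for illustration: the 4 MB chunk size, the field names `:customer_id` and `:n`, the helper names, and the random hex string standing in for an ObjectId. It builds plain hashes only; saving them to collections is left out.

```ruby
require 'securerandom'

# Assumed chunk size, well below the 16,777,216-byte document limit.
CHUNK_BYTES = 4 * 1024 * 1024

# Hypothetical helper: split a string into pieces of at most chunk_bytes bytes.
# byteslice cuts on byte boundaries, so this suits ASCII or binary payloads.
def split_payload(payload, chunk_bytes = CHUNK_BYTES)
  pieces = []
  offset = 0
  while offset < payload.bytesize
    pieces << payload.byteslice(offset, chunk_bytes)
    offset += chunk_bytes
  end
  pieces
end

# Build one parent document plus chunk documents that reference it.
def build_documents(name, payload)
  parent_id = SecureRandom.hex(12)  # stand-in for an ObjectId
  parent = {:_id => parent_id, :name => name}
  chunks = split_payload(payload).each_with_index.map do |piece, n|
    {:customer_id => parent_id, :n => n, :data => piece}
  end
  [parent, chunks]
end

# Reassemble the payload from its chunk documents, in order.
def join_chunks(chunks)
  chunks.sort_by { |c| c[:n] }.map { |c| c[:data] }.join
end

parent, chunks = build_documents("Customer1", "A" * 17_000_000)
puts chunks.length
puts join_chunks(chunks).bytesize
```

Each chunk document stays under the size limit, and the chunks belonging to one customer can be found again through the shared `:customer_id` value.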
Upvotes: 1
Reputation: 679
I would suggest two approaches:
GridFS with instructions here https://github.com/mongodb/mongo-ruby-driver/wiki/GridFS
Advantages: It uses an already existing service (MongoDB) to store the files, so it is presumably the easiest to implement and the cheapest, since you already have the infrastructure.
Disadvantage: Not necessarily the best use of a DB that depends heavily on RAM, especially if it is used for other storage as well.
S3: Store links to a hosted data service (such as Amazon S3) that is designed for file storage (redundant, replicated and highly available). In this case you just upload the files and store a pointer to their S3 location in your DB.
Advantage: Keeps your DB leaner, and is probably cheaper, since you keep your Mongo machines optimised for doing Mongo things (i.e. high-memory) and take advantage of the really cheap file storage on S3 as well as its near-infinite scalability.
Disadvantage: Harder to implement, since you need to write your own code to do this, though there may be off-the-shelf solutions somewhere.
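The "store a pointer" idea can be sketched as follows. This is only an illustration under loud assumptions: `FakeBucket` is an in-memory stand-in and not a real S3 client, and the bucket name, the key scheme, and the shape of the `:log_file` field are all made up for the example. With a real storage client, only the upload and download calls would change.

```ruby
require 'digest'

# In-memory stand-in for a bucket in a hosted file store; NOT a real S3 client.
class FakeBucket
  attr_reader :name

  def initialize(name)
    @name = name
    @objects = {}
  end

  def upload(key, body)
    @objects[key] = body
  end

  def download(key)
    @objects.fetch(key)
  end
end

# Upload the payload and return a small document holding only a pointer to it.
def document_with_pointer(bucket, name, payload)
  key = "logs/#{name}/#{Digest::SHA256.hexdigest(payload)}"  # hypothetical key scheme
  bucket.upload(key, payload)
  {:name => name, :data1 => "some value", :log_file => {:bucket => bucket.name, :key => key}}
end

bucket = FakeBucket.new("example-log-bucket")
doc = document_with_pointer(bucket, "Customer1", "A" * 17_000_000)
puts doc[:log_file][:key]
puts bucket.download(doc[:log_file][:key]).bytesize
```

The document saved to MongoDB then stays small, because it carries only the bucket name and key instead of the log data itself.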
There is some more useful discussion in this SO post.
Upvotes: 1