Reputation: 1458
I am hesitating between MongoDB and Google Cloud Datastore for one of my microservices. The microservice is rather easy to set up, and no other limitation in either database is a problem.
All of the documents stored will contain a trimmed-down version of a web page, many of which are over 1MB by themselves. That's before the properties and results we'll compute and add to the document. Therefore, Datastore's limit of 1MB per entity (document) is problematic (see here).
On the flip side, I have several microservices and I like to start as simply as possible. Datastore is ideal as a hosted database: it scales automatically and the API is great. So aside from this entity size limit, it is my first choice.
For Google Datastore users: is the entity size limit actually enforced, and if so, are you aware of any plans to raise it?
Upvotes: 3
Views: 2159
Reputation: 16563
Yes, Google does enforce the entity size limit. I am not aware of any plans to increase it.
One feature of the datastore that you can take advantage of is automatic compression of the data stored in an entity: use a compressed BlobProperty or PickleProperty, as described here. Depending on the data, you might be able to store 3MB in an entity this way.
I'll give you more details of my implementation for doing something similar. A BlobProperty stores encoded bytes and can't store unicode, so I made my own property to automate encoding and decoding:
from google.appengine.ext import ndb

class UTF8BlobProperty(ndb.BlobProperty):
    """A BlobProperty that transparently encodes/decodes UTF-8 text."""

    def __init__(self):
        # compressed=True makes NDB gzip the value before storing it
        super(UTF8BlobProperty, self).__init__(default="", compressed=True)

    def _validate(self, text):
        if not isinstance(text, basestring):
            raise TypeError("Expected a basestring, got %s" % text)

    def _to_base_type(self, text):
        return text.encode("utf-8")

    def _from_base_type(self, text):
        return text.decode("utf-8")
An entity then uses it like this:
class MyEntity(ndb.Model):
    data = UTF8BlobProperty()
After that, you just use it like any other property. I've been meaning to modify this so that it will automatically store the data in Google Cloud Storage when the compressed data is too large, but haven't gotten around to needing that yet.
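For what it's worth, a rough sketch of that Cloud Storage fallback idea might look like the following. This is untested and assumes the legacy GoogleAppEngineCloudStorageClient library; the bucket name, size threshold, and helper names are all made up:

import cloudstorage as gcs

MAX_INLINE = 900 * 1024   # leave headroom under the ~1MB entity limit
GCS_MARKER = "gcs://"     # distinguishes a pointer from inline data

def store_blob(key_name, data):
    # Return the value to put in the entity: the raw bytes if they fit,
    # otherwise a pointer to an object written to Cloud Storage.
    if len(data) <= MAX_INLINE:
        return data
    path = "/my-bucket/%s" % key_name   # hypothetical bucket
    with gcs.open(path, "w") as f:
        f.write(data)
    return GCS_MARKER + path

def load_blob(value):
    # Inverse of store_blob: fetch from Cloud Storage if value is a pointer.
    if value.startswith(GCS_MARKER):
        with gcs.open(value[len(GCS_MARKER):]) as f:
            return f.read()
    return value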
Upvotes: 5
Reputation: 8964
You may want to look into Google's Cloud Firestore. Like Cloud Datastore, it is a NoSQL database, but it follows a document model. Where Cloud Datastore has entities, Cloud Firestore has documents. Both Firestore documents and Datastore entities have the same 1MB limit, and both databases support a hierarchical model: Firestore groups documents into collections, whereas Datastore groups its entities into kinds.
BUT with Firestore's hierarchical model you can nest a collection inside a document without counting against the 1MB limit. (Conversely, you cannot embed a kind in a Datastore entity; the best you can do is embed an entity within another entity, and that does count toward the 1MB limit.) Using that mechanism, you might be able to work around the 1MB limitation, depending on your use case.
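To make that concrete, here is a hedged sketch of the subcollection work-around using the google-cloud-firestore Python client. The collection names, field names, and chunk size are illustrative; chunk IDs are zero-padded so they sort back into order:

from google.cloud import firestore

CHUNK_SIZE = 900 * 1024   # stay safely under the 1MB document limit

def save_page(db, page_id, html):
    page_ref = db.collection("pages").document(page_id)
    page_ref.set({"url": page_id})   # small metadata document
    for n in range(0, len(html), CHUNK_SIZE):
        # each chunk lives in its own document in a nested collection,
        # so it does not count toward the parent document's 1MB limit
        chunk_id = "%06d" % (n // CHUNK_SIZE)
        page_ref.collection("chunks").document(chunk_id).set(
            {"data": html[n:n + CHUNK_SIZE]})

def load_page(db, page_id):
    chunk_ref = db.collection("pages").document(page_id).collection("chunks")
    docs = sorted(chunk_ref.stream(), key=lambda d: d.id)
    return "".join(d.to_dict()["data"] for d in docs)

You would call these with something like db = firestore.Client(); save_page(db, "some-page-id", html_text).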
Upvotes: 2
Reputation: 4303
Datastore may not be the best option for you in this case; on the other hand, look at Google Cloud Storage, which allows you to store much bigger objects. By default, the simplest upload method lets you store files up to 5 MB, and with the resumable upload approach you can go well beyond that, up to 5 TB per object.
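As a rough illustration, storing a page with the google-cloud-storage Python client could look like this; the bucket and object names are made up, and the client itself chooses between a simple and a resumable upload based on the payload size:

from google.cloud import storage

def upload_page(html_text, object_name):
    client = storage.Client()
    bucket = client.bucket("my-page-archive")   # hypothetical bucket
    blob = bucket.blob(object_name)
    # the client transparently switches to a resumable upload for large
    # payloads, so objects far beyond the 5 MB simple-upload size work too
    blob.upload_from_string(html_text, content_type="text/html")
    return "gs://%s/%s" % (bucket.name, object_name)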
Datastore, like MongoDB, is great for JSON-style data, but in your case a web page is more likely to be HTML, or a set of files, unless you meant to archive or preprocess the pages before saving them. In any case, if your data is not pure JSON, I think neither Datastore nor MongoDB will be an ideal fit for your needs.
Upvotes: 0