Reputation: 2251
I'm trying to export an entity from Google Cloud Datastore to BigQuery (and then to CSV).
When I create the dataset, everything works fine except for one missing field that is supposed to be JSON (an ndb.JsonProperty()).
Looking at this field in the Datastore viewer, it appears to be an encoded JSON blob (e.g. ...0NzIyMDUyODkiLCAidXNlcl9uYW1lIjogIlZpbmNlbnQgR...).
My only goal is to export this entity from the Datastore using BigQuery, Python, or whatever else is needed, so that I can explore the data.
Upvotes: 1
Views: 333
Reputation: 55933
ndb JsonProperty values are stored in the datastore as blobs:
JsonProperty: Value is a Python object (such as a list or a dict or a string) that is serializable using Python's json module; Cloud Datastore stores the JSON serialization as a blob.
BigQuery discards blob data:
Blob: BigQuery discards these values when loading the data.
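Incidentally, the encoded string you see in the Datastore viewer looks like a base64 rendering of that JSON blob. You can check this in plain Python; the payload below is illustrative, since the real value is truncated in the question:

```python
import base64
import json

# Illustrative payload -- the field names here are assumptions,
# since the question only shows a truncated encoded fragment.
payload = {"user_id": "4722052890", "user_name": "Vincent"}

blob = json.dumps(payload).encode("utf-8")        # what Datastore stores
encoded = base64.b64encode(blob).decode("ascii")  # what the viewer displays

# Reversing the encoding recovers the original dict:
decoded = json.loads(base64.b64decode(encoded))
print(decoded == payload)  # True
```

So the data is intact in the datastore; the problem is only that BigQuery drops the blob on load, which is what the workaround below addresses.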
One possible workaround is to create Computed Properties on your model to extract the data that you're interested in in a format that BigQuery will accept.
For example, say you are storing a dict like this in your JsonProperty:
data = {'foo': 'bar', 'baz': 'quux'}
Let's say you are interested in the value corresponding to the key foo. You can create a ComputedProperty that returns that value, and it will be picked up by your BigQuery export (note that you must re-save all your model instances after the ComputedProperty has been added in order to populate the new property).
from google.appengine.ext import ndb

class MyModel(ndb.Model):
    blob = ndb.JsonProperty()
    # Extract the value under the 'foo' key into an indexed property
    # that the BigQuery export will keep
    foo = ndb.ComputedProperty(lambda self: self.blob.get('foo'))

obj = MyModel(blob=data)
obj.put()
obj.foo  # 'bar'
Upvotes: 3