Reputation: 309831
I'm trying to query against app-engine datastore backup data. In python, the entities are described as something like this:
class Bar(ndb.Model):
property1 = ndb.StringProperty()
property2 = ndb.StringProperty()
class Foo(ndb.Model):
bar = ndb.StructuredProperty(Bar, repeated=True)
baz = ndb.StringProperty()
Unfortunately when Foo
gets backed up and loaded into bigquery, the table schema gets loaded as:
bar | RECORD | NULLABLE
bar.property1 | STRING | REPEATED
bar.property2 | STRING | REPEATED
baz | STRING | NULLABLE
What I would like to do is to get a table of all bar.property1
and associated bar.property2
where baz = 'baz'
.
Is there a simple way to flatten Foo
so that the bar records are "zipped" together? If that's not possible, is there another solution?
Upvotes: 0
Views: 681
Reputation: 309831
As hinted in a comment by @Mosha, it seems that big query supports User Defined Functions (UDF). You can input it in the UDF Editor
tab on the web UI. In this case, I used something like:
function flattenTogether(row, emit) {
if (row.bar && row.bar.property1) {
for (var i=0; i < row.bar.property1.length; i++) {
emit({property1: row.bar.property1[i],
name: row.bar.property2[i]});
}
}
};
bigquery.defineFunction(
'flattenBar',
['bar.property1', 'bar.property2'],
[{'name': 'property1', 'type': 'string'},
{'name': 'property2', 'type': 'string'}],
flattenTogether);
And then the query looked like:
SELECT
property1,
property2,
FROM
flattenBar(
SELECT
bar.property1,
bar.property2,
FROM
[dataset.foo]
WHERE
baz = 'baz')
Upvotes: 2
Reputation: 13994
Since baz is not repeated, you can simply filter on it in WHERE clause without any flattening:
SELECT bar.property1, bar.property2 FROM t WHERE baz = 'baz'
Upvotes: 1