Reputation: 41
I am trying to run below query -
select prd_cat, product_category from
(
select split( product_category,".") as prd_cat,product_category from
test_dataset.cosme_raw_table
where product_link = "XXX"
) as a
group by prd_cat,product_category;
when I ran it using BigQuery Web interface it ran successfully but when I tried to run it using BigQuery Query API it failed with error message "Grouping by expressions of type ARRAY is not allowed at [6:10]" Below is my code -
String query = "select prd_cat, product_category" +
" from\n" +
"(\n" +
"select split( product_category,\".\") as prd_cat," +
"product_category " +
"from test_dataset.cosme_raw_table \n" +
"where product_link = \"XXX\"\n" +
") as a\n" +
"group by prd_cat,product_category";
QueryJobConfiguration queryJobConfiguration =
QueryJobConfiguration.newBuilder(query)
.setDestinationTable(tableId1)
.setWriteDisposition(JobInfo.WriteDisposition.WRITE_TRUNCATE)
.build();
Job loadJob1 = bigquery.create(JobInfo.of(queryJobConfiguration));
Below are the logs -
com.google.cloud.bigquery.BigQueryException: Grouping by expressions of type ARRAY is not allowed at [6:10]
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:99)
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:401)
at com.google.cloud.bigquery.BigQueryImpl$23.call(BigQueryImpl.java:688)
at com.google.cloud.bigquery.BigQueryImpl$23.call(BigQueryImpl.java:683)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89)
at com.google.cloud.RetryHelper.run(RetryHelper.java:74)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51)
at com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:682)
at com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:674)
at com.google.cloud.bigquery.Job$1.call(Job.java:329)
at com.google.cloud.bigquery.Job$1.call(Job.java:326)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89)
at com.google.cloud.RetryHelper.run(RetryHelper.java:74)
at com.google.cloud.RetryHelper.poll(RetryHelper.java:63)
at com.google.cloud.bigquery.Job.waitForQueryResults(Job.java:325)
at com.google.cloud.bigquery.Job.waitFor(Job.java:240)
at TestBigQuery.explicit(TestBigQuery.java:190)
at TestBigQuery.main(TestBigQuery.java:32)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"location" : "parameters.q",
"message" : "Grouping by expressions of type ARRAY is not allowed at [6:10]",
"reason" : "invalidQuery"
} ],
"message" : "Grouping by expressions of type ARRAY is not allowed at [6:10]",
"status" : "INVALID_ARGUMENT"
}
Can some one please help. Thanks!!
Upvotes: 1
Views: 1257
Reputation: 33705
If you use legacy SQL, the GROUP BY operator will implicitly flatten any arrays that you are grouping. If you use standard SQL, you need to flatten the array explicitly. Note that:
You can fix your query to work using standard SQL by flattening the array, e.g.:
select prd_cat, product_category
from test_dataset.cosme_raw_table,
UNNEST(split( product_category,".")) as prd_cat
where product_link = "XXX"
group by prd_cat,product_category;
It's not clear to me what result you are hoping to get with the query, but at least it should run.
Upvotes: 0
Reputation: 2883
This is happening because you're using Legacy SQL. You need to set it in your QueryJobConfiguration. For instance:
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Dataset;
import com.google.cloud.bigquery.DatasetInfo;
import com.google.cloud.bigquery.FieldValue;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
public class QuickstartSample {
public static void main(String... args) throws Exception {
BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
String query = "Your-Query";
//setUseLegacySql(true) below
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).setUseLegacySql(true).build();
for (FieldValueList row : bigquery.query(queryConfig).iterateAll()) {
for (FieldValue val : row) {
System.out.printf("%s,", val.toString());
}
System.out.printf("\n");
}
}
}
Otherwise, you could use the TO_JSON_STRING with Standard SQL. For instance:
String query = "WITH sample AS (SELECT 1 id, ['a,b', 'c'] a UNION ALL SELECT 1, ['a','b,c']) SELECT TO_JSON_STRING(a) arr,COUNT(DISTINCT id) cnt FROM sample GROUP BY arr";
QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).build();
In your case, you could try:
WITH a AS (select split(product_category,".") as prd_cat,product_category from test_dataset.cosme_raw_table where product_link = "XXX") select TO_JSON_STRING(prd_cat) arr, product_category from a GROUP BY arr,product_category
Hope it helps.
Upvotes: 1