Reputation: 45
I'm really appreciate for the answers in my previous questions: Query table with unpredictable number of columns by BigQuery.
However, I tried to put them in my official task but with me, it's still quite tricky. According to this task, my table looks like this, and there will not only 7 but maybe 10, 1000, N value columns and N time columns:
My final result needs to be
"value" column got from columns with names contains "value_" and "time" column from columns with names contains "time_" and the most difficult thing: "position this value appeared" will be the position where this value appeared with the corresponding id.
Is there any possible way to create this result table? Thank you all in advanced.
Code to create this sample table:
WITH my_dataset AS
(SELECT '001' as id, 1 as value_1, 'a1' as time_1, 2 as value_2, 'a2' as time_2,3 as value_3, 'a3' as time_3, 4 as value_4,
'a4' as time_4, 5 as value_5, 'a5' as time_5, 6 as value_6, 'a6' as time_6, 7 as value_7, 'a7' as time_7
UNION ALL
SELECT '002', 8, 'a8', 9, 'a9', 10, 'a10', 11, 'a11', 12, 'a12', 13, 'a13', 14, 'a14'
UNION ALL
SELECT '003', 15, 'a15', 16, 'a16', 17, 'a17', 18, 'a18', 19, 'a19', 20, 'a20', 21, 'a21'
UNION ALL
SELECT '004', 22, 'a22', 23, 'a23', 24, 'a24', 25, 'a25', 26, 'a26', 27, 'a27', 28, 'a28'
UNION ALL
SELECT '005', 29, 'a29', 30, 'a30', 31, 'a31', 32, 'a32', 33, 'a33', 34, 'a34', 35, 'a35'
UNION ALL
SELECT '006', 36, 'a36', 37, 'a37', 38, 'a38', 39, 'a39', 40, 'a40', 41, 'a41', 42, 'a42'
UNION ALL
SELECT '007', 43, 'a43', 44, 'a44', 45, 'a45', 46, 'a46', 47, 'a47', 48, 'a48', 49, 'a49')
SELECT * FROM my_dataset
Upvotes: 2
Views: 67
Reputation: 1955
Here is a script approach for you to compare the performance when querying a big table:
declare n int64;
declare query_str string;
set n = (select cast((array_length(regexp_extract_all(to_json_string(`<project>.<dataset>.<table>`),"\":"))-1)/2 as int64) total_columns from `<project>.<dataset>.<table>` limit 1);
set query_str = '''SELECT id,value_1 as value, time_1 as time, 1 as position FROM `<project>.<dataset>.<table>`''';
while n > 1 do
set query_str = concat(query_str, "union all SELECT id,value_", n, " as value, time_", n ," as time, ", n ," as position FROM `<project>.<dataset>.<table>`");
set n = (n-1);
end while;
EXECUTE IMMEDIATE query_str;
Upvotes: 1
Reputation: 172994
Consider below approach
select * from (
select id, val, split(col, '_')[offset(0)] as col, split(col, '_')[offset(1)] as pos
from my_dataset t, unnest([to_json_string(t)]) json,
unnest(`bqutil.fn.json_extract_keys`(json)) col with offset
join unnest(`bqutil.fn.json_extract_values`(json)) val with offset
using(offset)
where starts_with(col, 'value_') or starts_with(col, 'time_')
)
pivot (min(val) for col in ('value', 'time'))
if applied to sample data in your question - output is
Upvotes: 1