Reputation:
In Python I can select a range of columns to create a new dataset. Can this be done in BigQuery?
For example, I have a table with many columns (57), and I want to query columns 6-40.
In Python I can do df = df.iloc[:, 6:41].
As far as I can tell, all I can do in BigQuery is use a SELECT statement that lists columns 6-40 by name, or use SELECT * EXCEPT and then list by name the 22 columns I don't want.
So for example, my table has date, userid, address, followed by several demographic variables (income, location, etc.), followed by several metrics about their online behavior (last login, etc.). There are over fifty columns, and I only want to include about half in my query.
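Right now the best I have is spelling out the unwanted columns, something like the sketch below (only date, userid, and address are real column names from my table; the metric_* names are placeholders for the trailing columns I don't want):
-- skip the leading id columns and the trailing metric columns by name
select * except (date, userid, address, metric_41, metric_42, metric_43)
from my_table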
Is there a more efficient way of querying a range of columns in a table that has a large number of columns?
Upvotes: 0
Views: 483
Reputation: 173036
Consider the below - this is the closest I can think of:
execute immediate format('select %s from your_table', (
  -- serialize one sample row to JSON, strip the {, } and " characters,
  -- then split on commas into name:value pairs, one per column
  select string_agg(split(kv, ':')[offset(0)], ',' order by offset) list
  from (select * from your_table limit 1) t,
  unnest([translate(to_json_string(t), '{}"', '')]) rec,
  unnest(split(rec)) kv with offset
  -- offset is zero-based, so 5-39 selects columns 6-40
  where offset between 5 and 39
))
Or you can simply generate your SQL statement using the below and run it yourself:
select format('select %s \nfrom your_table', (
  select string_agg(split(kv, ':')[offset(0)], ', ' order by offset) list
  from (select * from your_table limit 1) t,
  unnest([translate(to_json_string(t), '{}"', '')]) rec,
  unnest(split(rec)) kv with offset
  where offset between 5 and 39
))
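Note that the JSON-splitting trick can misparse if a value in the sample row contains a comma or a colon (free-text strings, timestamps, etc.). A variant sketch that reads the column names from table metadata instead - assuming the table lives in a dataset named your_dataset (a made-up name here):
execute immediate (
  -- build the column list from INFORMATION_SCHEMA rather than a sample row;
  -- ordinal_position is 1-based, so 6-40 picks exactly columns 6 through 40
  select format('select %s from your_dataset.your_table',
    string_agg(column_name, ', ' order by ordinal_position))
  from your_dataset.INFORMATION_SCHEMA.COLUMNS
  where table_name = 'your_table'
    and ordinal_position between 6 and 40
)
This avoids depending on how a sample row serializes to JSON, at the cost of hard-coding the dataset name.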
Upvotes: 0