Danny
Danny

Reputation: 65

Bigquery: Compare the columns in two tables to see if they match

I have two tables (t1 and t2) that both includes 100 of columns with the exact same names like the following:

Table 1

id A B
123 1 0
234 0 1
345 1 1

Table 2

id A B
123 1 0
234 0 1
345 1 0

Both of them have an id column and the columns that only include 1 and 0 values. I need to see if these columns match each other. Is there any way to build a table like the following that shows what percentage of each column with the same name match:

t1 t2 matched?
A A 100%
B B 66%

Thanks in advance!

Upvotes: 2

Views: 3232

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

Consider below approach

create temp function get_keys(input string) returns array<string> language js as """
  return Object.keys(JSON.parse(input));
""";
create temp function  get_values(input string) returns array<string> language js as """
  return Object.values(JSON.parse(input));
""";
with temp1 as (
  select id, key, value
  from table1 t, 
  unnest(get_keys(to_json_string(t))) key with offset 
  join unnest(get_values(to_json_string(t))) value with offset 
  using(offset)
  where key != 'id'
), temp2 as (
  select id, key, value
  from table2 t, 
  unnest(get_keys(to_json_string(t))) key with offset 
  join unnest(get_values(to_json_string(t))) value with offset 
  using(offset)
  where key != 'id'
)
select key, round(100 * countif(t1.value = t2.value) / count(t1), 2) matched
from temp1 t1
left join temp2 t2
using(key, id)
group by key    

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

Related Questions