user19329
user19329

Reputation: 57

sql pivot with repeated variable

I'm new to SQL and using Google BigQuery. I have a table with one record that looks like the following:

publication_number | assignee

US-6044964-A|Sony Corporation

|Digital Audio Disc Corporation

The identifier, publication_number, is listed only once; the first assignee appears on the same row as the publication_number and the second just shows up as an additional row with no identifier.

What I want to do is create a table like the following:

publication_number | assignee1 | assignee2

US-6044964-A | Sony Corporation |Digital Audio Disc Corporation

Where the additional assignee appears as another column.

I have what Google's patents-public-data calls a "repeated" variable. I've tried the following query in BigQuery:

SELECT pvt.publication_number, pvt.[1] as assignee1, pvt.[2] as assignee2
FROM `main_tables.main_table5` 
PIVTO (
   MAX(assignee)
   FOR publication_number IN([1],[2])
) as pvt

I receive the following error:

Syntax error: Unexpected "[" at [2:36]. If this is a table identifier, escape the name with , e.g.table.name` rather than [table.name]

I have found the following question/reply that does something similar to what I want using pivot here. However, I don't have an identifier for each row as in that example.

How can I create another column for the second assignee?

Upvotes: 3

Views: 672

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173171

Below is for BigQuery Standard SQL and should give you an idea

#standardSQL
SELECT 
  publication_number, 
  assignee[SAFE_OFFSET(0)] assignee1,
  assignee[SAFE_OFFSET(1)] assignee2,
  assignee[SAFE_OFFSET(2)] assignee3,
  assignee[SAFE_OFFSET(3)] assignee4
FROM `yourproject.yourdataset.yourtable`   

You can test / play with it with below dummy data

#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
  SELECT 'US-6044964-A' publication_number , ['Sony Corporation', 'Digital Audio Disc Corporation'] assignee UNION ALL
  SELECT 'ABC', ['xyz', 'abc', 'uvw']
)
SELECT 
  publication_number, 
  assignee[SAFE_OFFSET(0)] assignee1,
  assignee[SAFE_OFFSET(1)] assignee2,
  assignee[SAFE_OFFSET(2)] assignee3,
  assignee[SAFE_OFFSET(3)] assignee4
FROM `yourproject.yourdataset.yourtable`   

Result is

publication_number  assignee1           assignee2                       assignee3   assignee4    
US-6044964-A        Sony Corporation    Digital Audio Disc Corporation  null        null     
ABC                 xyz                 abc                             uvw null     

Upvotes: 2

Related Questions