Reputation: 141

BigQuery - Select only first row in BigQuery

I have a table with data where in Column A I have groups of repeating Data (one after another).

I want to select only first row of each group based on values in column A only (no other criteria). Mind you, I want all corresponding columns selected also for the mentioned new found row (I don't want to exclude them).

Can someone help me with a proper query.

Here is a sample: SAMPLE

Thanks!

Upvotes: 12

Answers (5)

David Masip

Reputation: 2491

You can now use qualify for a more concise solution:

select
   *
from
   your_dataset.your_table
where true
qualify ROW_NUMBER() OVER(PARTITION BY columnA ORDER BY columnA) = 1

Upvotes: 5

Mahesh Uligade

Reputation: 627

Add LIMIT 1 at the end of the query

something like

SELECT name, year FROM person_table ORDER BY year LIMIT 1

Upvotes: 7

Yurci

Reputation: 576

In BigQuery the physical sequence of rows is not significant. “BigQuery does not guarantee a stable ordering of rows in a table. Only the result of a query with an explicit ORDER BY clause has well-defined ordering.”[1].

First, you need to define which property will determine the first row of your group, then you can run Vasily Bronsky’s query by changing ORDER BY with that property. Which means either you should add another column to the table to store the order of the rows or select one from the columns you have.

Upvotes: 1

Mikhail Berlyant

Reputation: 172993

#standardSQL
SELECT row.*
FROM (
  SELECT ARRAY_AGG(t LIMIT 1)[OFFSET(0)] row
  FROM `project.dataset.table` t
  GROUP BY columnA
)

Upvotes: 11

Vasily Bronsky

Reputation: 433

you can try smth like this:

#standardSQL
SELECT
   * EXCEPT(rn)
FROM (
   SELECT
   *,
   ROW_NUMBER() OVER(PARTITION BY columnA ORDER BY columnA) AS rn
FROM
   your_dataset.your_table)
WHERE rn = 1

that will return:

Row   columnA       col2       ...   
1     AC1001        Z_Creation   
2     ACO112BISPIC  QN
...

Upvotes: 8

BigQuery - Select only first row in BigQuery

Answers (5)

Related Questions