Paul Clarke
Paul Clarke

Reputation: 338

Update one row based off distinct values of another column

I've got a data set with post codes, suburbs and their longitude and latitude.

For each postcode there are multiple rows with the corresponding suburbs within that postcode, so when I match it with another table which has sales by postcode in Power BI I end up with multiple rows returned for each post code.

What I'd like to do is insert a column called unique_postcode as a boolean marking one line of each post code as True. I don't mind which one. I tried the below as well as a few other options, it didn't give any errors but didn't have any affect.

UPDATE postcodes
SET post_codes.unique_postcode = 1
FROM (
    SELECT DISTINCT(postcode)
    FROM postcodes
);

Upvotes: 1

Views: 1281

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1271231

You can use row_number() to define a particular one to assign the flag to. In an update, this looks like:

WITH toupdate AS (
      SELECT p.*, ROW_NUMBER() OVER (PARTITION BY postcode ORDER BY postcode) as seqnum
      FROM postcodes p
     )
UPDATE toupdate
    SET unique_postcode = (CASE WHEN seqnum = 1 THEN 1 ELSE 0 END);

Note: This sets one value to "1" and the rest to "0". It is also safe to run multiple times on the table.

Upvotes: 0

SteveC
SteveC

Reputation: 6015

If the row doesn't matter then the simplest way would be to select TOP 1.

with cte as (select top 1 * from postcodes)
update cte
set unique_postcode = 1;

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522817

You could use an updatable CTE which targets a random row:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY postcode ORDER BY postcode) rn
    FROM postcodes
)

UPDATE cte
SET unique_postcode = 1
WHERE rn = 1;

Note that because the ordering used in ROW_NUMBER uses the postal code itself, the "first" row number value could be any of the rows, in the case that a postal code have more than one record associated with it.

Upvotes: 2

Related Questions