Reputation: 1708
I have this BigQuery
table with three columns and every row can have same values of the previous one.
For example:
| col_a | col_b | col_c
+-------+-------+------------
| 123 | 3 | 2019-12-12
| 123 | 3 | 2019-12-12
| 234 | 11 | 2019-10-12
Now I want to add new column named col_d with a UUID in it.
The problem is that when I try to execute UPDATE
command I have no way to update only one row at a time (because some rows have the same values and I want different UUID in each one of them).
Things I tried with no luck :(
LIMIT
UPDATE table
SET col_d = GENERATE_UUID()
LIMIT 1
I thought to get all rows and then traverse them with an update command. But there's not LIMIT on UPDATE commands in BigQuery.
ROW_NUMBER
UPDATE table
SET col_d = ROW_NUMBER() OVER()
But BigQuery doesn't allow to use analytic function in Update
command
INSERT
I can query all rows, and insert them with a UUID and then delete all the old ones that has no UUID. that approach will work and it will be my final gateaway but I believe there's a better way so I'm asking here.
Any other idea or advice will be welcome.
Upvotes: 1
Views: 2872
Reputation: 173056
Below is for BigQuery Standard SQL and produces different UUID for each and every row no matter how duplicate they are
UPDATE `project.dataset.table`
SET col_d = GENERATE_UUID()
WHERE TRUE
Note: based on your "Insert and then Delete" option - I assume that col_d already exists in your table - otherwise you would not be able to do DELETE FROM table WHERE col_d IS NULL
as you mentioned in your comments
Upvotes: 4
Reputation: 1270573
because some rows have the same values and I want different UUID in each one of them
This should do what you want:
UPDATE table
SET col_d = GENERATE_UUID()
I don't understand why you would be using limit
, if you want to update all rows.
That said, BigQuery has restrictions on UPDATE
s, so the CREATE TABLE
approach suggested by fromthehills seems more appropriate.
Upvotes: 1
Reputation: 162
You can SELECT the data with a UUID as a fourth column (col_d) and then save that data as a new table.
SELECT col_ac, col_b, col_c, GENERATE_UUID() AS col_d
FROM table
This will generate the output you desire:
| col_a | col_b | col_c | col_d
+-------+-------+-------------+------------------------------------------
| 123 | 3 | 2019-12-12 | e3784e4d-59bb-433b-a9ac-3df318e0f675
| 123 | 3 | 2019-12-12 | 430d034a-6292-4f5e-b1b0-0ee5550af3f6
| 234 | 11 | 2019-10-12 | 3e7e14d2-3077-4030-a704-5a2b7fc3c11e
Since BigQuery does not allow adding a column with data like traditional SQL the following should create a new table with the UUID values added.
CREATE OR REPLACE TABLE table AS
SELECT *, GENERATE_UUID() AS col_d
FROM table
Be warned that the table history may be deleted so back it up first. One should always backup data prior to doing such updates in all cases as undesired outcomes do arise.
Upvotes: 1