nyptop
nyptop

Reputation: 59

SQL: Engineering new features using UPDATE SET

I'm trying to create a new feature, user_count, which displays the number of times a particular user appears in a dataset. Here is a snapshot of the original dataset:

dataset

Here is the SQL code that I have written to try to achieve this (used in GoogleCloud's BigQuery):

UPDATE `project-feat-eng.ednetv1.smalltrain` AS t
SET t.user_count = target.total_questions_answered
FROM (
SELECT user_id, count(user_id) AS total_questions_answered
FROM `project-feat-eng.ednetv1.smalltrain` 
GROUP BY user_id) AS target
WHERE target.user_id = t.user_id

The problem is that user_count is not a feature in the original table, so cannot be updated. Can the above code be adapted to create the user_count feature successfully?

Upvotes: 0

Views: 45

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

You cannot introduce new column using UPDATE syntax. Rather you should use CREATE OR REPLACE TABLE syntax (with below query) to create new table or replace existing

SELECT *, COUNT(1) OVER(PARTITION BY user_id) AS user_count
FROM `project-feat-eng.ednetv1.smalltrain`    

Or, as an option, you can use ALTER TABLE to add new column first and then use your UPDATE statement

Upvotes: 1

Related Questions