Reputation: 59
I'm trying to create a new feature, user_count, which displays the number of times a particular user appears in a dataset. Here is a snapshot of the original dataset:
Here is the SQL code that I have written to try to achieve this (used in GoogleCloud's BigQuery):
UPDATE `project-feat-eng.ednetv1.smalltrain` AS t
SET t.user_count = target.total_questions_answered
FROM (
SELECT user_id, count(user_id) AS total_questions_answered
FROM `project-feat-eng.ednetv1.smalltrain`
GROUP BY user_id) AS target
WHERE target.user_id = t.user_id
The problem is that user_count is not a feature in the original table, so cannot be updated. Can the above code be adapted to create the user_count feature successfully?
Upvotes: 0
Views: 45
Reputation: 173190
You cannot introduce new column using UPDATE
syntax. Rather you should use CREATE OR REPLACE TABLE
syntax (with below query) to create new table or replace existing
SELECT *, COUNT(1) OVER(PARTITION BY user_id) AS user_count
FROM `project-feat-eng.ednetv1.smalltrain`
Or, as an option, you can use ALTER TABLE
to add new column first and then use your UPDATE
statement
Upvotes: 1