Ben

Reputation: 35

Counting CASE in SQL Google BigQuery

SELECT
SUM(CASE WHEN TipPercentage < 0 THEN 1 ELSE 0 END) AS 'No Tip'
SUM(CASE WHEN TipPercentage BETWEEN 0 AND 5 THEN 1 ELSE 0 END) AS 'Less but still a Tip'
SUM(CASE WHEN TipPercentage BETWEEN 5 AND 10 THEN 1 ELSE 0 END) AS 'Decent Tip'
SUM(CASE WHEN TipPercentage > 10 THEN 1 ELSE 0 END) AS 'Good Tip'
SUM(ELSE ) AS 'Something different'
END AS TipRange,
TipPercentage,
Tipbin)
FROM
(SELECT
case when tip_amount=0 then 'No Tip'
when (tip_amount > 0 and tip_amount <=5) then '0-5'
when (tip_amount > 5 and tip_amount <=10) then '5-10'
when (tip_amount > 10 and tip_amount <=20) then '10-20'
when tip_amount > 20 then '> 20'
else 'other'
end as Tipbin,
SUM(tip_amount) as Tips,
ROUND(avg((tip_amount)/(total_amount-tip_amount))*100,3) as TipPercentage
FROM `bigquery-public-data.new_york.tlc_yellow_trips_2015`
WHERE trip_distance >0
AND fare_amount/trip_distance BETWEEN 2 AND 10
AND dropoff_datetime > pickup_datetime
group by 1,2,3,tip_amount,tipbin)

I am trying to query Google BigQuery so that it returns the number of rows that fall into each of the categories 'No Tip', 'Less but still a Tip', 'Decent Tip', 'Good Tip' and 'Something different'. However, I get a syntax error saying that the string 'No Tip' was unexpected.

Could someone guide me on this?

Thank you!

Edit:

The error message I got using Standard SQL was:

Error: Syntax error: Unexpected string literal 'No Tip' at [2:55]

When I tried running it with Legacy SQL I got this:

Error: Encountered " "AS" "AS "" at line 2, column 52. Was expecting: <EOF>

Upvotes: 0

Views: 13111

Answers (2)

Gordon Linoff

Reputation: 1270401

Based on your description, what you want is essentially your subquery with a row count per bin added. Here it is, cleaned up a bit and with the syntax errors fixed:

SELECT (case when tip_amount = 0 then 'No Tip'
             when tip_amount > 0 and tip_amount <= 5 then '0-5'
             when tip_amount > 5 and tip_amount <= 10 then '5-10'
             when tip_amount > 10 and tip_amount <= 20 then '10-20'
             when tip_amount > 20 then '> 20'
             else 'other'
        end) as Tipbin,
       COUNT(*) as num,
       SUM(tip_amount) as Tips,
       ROUND(avg((tip_amount)/(total_amount-tip_amount))*100,3) as TipPercentage
FROM `bigquery-public-data.new_york.tlc_yellow_trips_2015`
WHERE trip_distance > 0 AND
      fare_amount/trip_distance BETWEEN 2 AND 10 AND
      dropoff_datetime > pickup_datetime
GROUP BY Tipbin
ORDER BY MIN(tip_amount);

Upvotes: 2

chinloyal

Reputation: 1141

You're missing a comma after each SUM. In addition, a column alias must be an identifier rather than a string literal (this is what the Standard SQL error at [2:55] points at), the stray END AS TipRange has no matching CASE, and TipPercentage and Tipbin cannot be selected next to the SUMs without a GROUP BY. You need something like:

SELECT
-- Aliases are identifiers, not quoted strings
SUM(CASE WHEN TipPercentage < 0 THEN 1 ELSE 0 END) AS No_Tip,
SUM(CASE WHEN TipPercentage BETWEEN 0 AND 5 THEN 1 ELSE 0 END) AS Less_but_still_a_Tip,
SUM(CASE WHEN TipPercentage BETWEEN 5 AND 10 THEN 1 ELSE 0 END) AS Decent_Tip,
SUM(CASE WHEN TipPercentage > 10 THEN 1 ELSE 0 END) AS Good_Tip
-- SUM(ELSE ) AS 'Something different' is not valid: SUM needs a complete
-- CASE expression. The dangling END AS TipRange and the bare
-- TipPercentage, Tipbin columns are removed.
FROM
(SELECT
case when tip_amount=0 then 'No Tip'
when (tip_amount > 0 and tip_amount <=5) then '0-5'
when (tip_amount > 5 and tip_amount <=10) then '5-10'
when (tip_amount > 10 and tip_amount <=20) then '10-20'
when tip_amount > 20 then '> 20'
else 'other'
end as Tipbin,
SUM(tip_amount) as Tips,
ROUND(avg((tip_amount)/(total_amount-tip_amount))*100,3) as TipPercentage
FROM `bigquery-public-data.new_york.tlc_yellow_trips_2015`
WHERE trip_distance >0
AND fare_amount/trip_distance BETWEEN 2 AND 10
AND dropoff_datetime > pickup_datetime
-- Group only by non-aggregated expressions (positions 2 and 3 are aggregates)
group by Tipbin, tip_amount) T --All derived tables must have an alias
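
As a rough sketch of the counting described in the question, assuming the tip percentage should be evaluated per trip rather than per group, the conditional counts can also be written with BigQuery's COUNTIF. The alias names, the bucket boundaries and the use of SAFE_DIVIDE (so that a trip where total_amount equals tip_amount yields NULL instead of a division-by-zero error) are assumptions for illustration, not part of the original query:

```sql
-- Sketch: a single row with one count per tip-percentage bucket.
-- Alias names and bucket boundaries are illustrative assumptions.
SELECT
  COUNTIF(tip_pct = 0) AS no_tip,
  COUNTIF(tip_pct > 0 AND tip_pct <= 5) AS less_but_still_a_tip,
  COUNTIF(tip_pct > 5 AND tip_pct <= 10) AS decent_tip,
  COUNTIF(tip_pct > 10) AS good_tip,
  -- NULL (undefined) or negative percentages are counted here
  COUNTIF(tip_pct IS NULL OR tip_pct < 0) AS something_different
FROM (
  SELECT
    -- SAFE_DIVIDE returns NULL instead of raising an error on division by zero
    SAFE_DIVIDE(tip_amount, total_amount - tip_amount) * 100 AS tip_pct
  FROM `bigquery-public-data.new_york.tlc_yellow_trips_2015`
  WHERE trip_distance > 0
    AND fare_amount / trip_distance BETWEEN 2 AND 10
    AND dropoff_datetime > pickup_datetime
) AS t;
```

The buckets here do not overlap, unlike BETWEEN 0 AND 5 and BETWEEN 5 AND 10, which both include exactly 5, so every trip is counted in exactly one column.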

Upvotes: 1
