Reputation: 23
I am trying to do an SQL query in snowflake to count up values from a different table based on a value from the first table. For example, I find a value that matches in table1 and table2, and I count the rows that match it from table 2.
When I attempt to do this, the query runs but the resulting data is incorrectly counting. It looks as if it is doing some sort of odd breakout/cumulative counting. I've tried reading examples and googling for examples like this but what's interesting is in non-snowflake SQL, the examples I've seen do not need to "group by" the count I'm trying to select.
In snowflake it gives me an error unless I group the (select count() from... But doing this seems to return a result that is over counting/summing the value by a lot, when I remove that (select count()... row and the group by, it does everything else correctly so I'm guessing there's something specifically wrong there.
select
"naming_db"."public"."username".name as "Name",
"naming_db"."public"."addresses".address as "Address",
count(*) as "Shoe Count",
(select count(*) from "data_db"."records"."orderID" where
"data_db"."records"."orderID".ID = "data_db"."records"."userinfo".ID) as "Orders",
sum("data_db"."records"."userinfo".revenue) as "Spend"
from "data_db"."records"."userinfo"
join "naming_db"."public"."username" on "naming_db"."public"."username".name = "data_db"."records"."userinfo".nameid
join "naming_db"."public"."addresses" on "naming_db"."public"."addresses".address = "data_db"."records"."userinfo".addressid
full join "data_db"."records"."orderID" on "data_db"."records"."userinfo".ID = "data_db"."records"."orderID".ID
group by 1,2,4
The (select count(*) from... line I think is the issue but I can't seem to figure out how to write it in a way that would make the query run and return the results I want.
Just edited the query to make more sense. Was trying to obfuscate some of the info there but I realized it's too confusing sorry!
Here's the results that I would like. I get this when I remove the "Orders" line - the one that is (select count(*) from...
Here's the results that I'm getting when I add in that line for "Orders": Incorrect summing and breaks out more rows
Upvotes: 1
Views: 3403
Reputation: 2880
After you rearranged the query, it looks like you have:
userinfo
is the User table that contains one line per user and also has an accumulated revenue amount per userorderID
is the Order table with quantities per customer per productusername
and addresses
are only used to display the user contact info (unimportant).It is still hard to understand what your aggregations should show, so I've tried to guess:
The use of a FULL JOIN
is confusing. It indicates that some orders are for non-registered users - not existing in the userinfo
table. This doesn't make sense to me. Your examples indicates that all orders are from users in userinfo
, so there should be no need for a FULL JOIN
.
Why do you need it?
The solution should be really simple. Please advise me if I didn't understand your requirements:
SELECT
ANY_VALUE(n.name) AS "Name",
ANY_VALUE(a.address) AS "Address",
COUNT(*) AS "Shoe Count",
COUNT(o.ID) AS "Orders",
ANY_VALUE(u.revenue) AS "Spend"
FROM "data_db"."records"."userinfo" u
INNER JOIN "naming_db"."public"."username" n ON n.name = u.nameid
INNER JOIN "naming_db"."public"."addresses" a ON a.address = u.addressid
INNER JOIN "data_db"."records"."orderID" o ON o.ID = u.ID
GROUP BY u.ID
Upvotes: 1
Reputation: 26140
Your sub-select is a correlated sub-query which snow has limited support for. So if you pivot to a CTE which does the count for all ID
's then join to that you should get what you are asking for..
WITH order_counts AS (
SELECT ID, count(*) as count data_db.records.orderID group by 1
)
select
un.name as Name,
a.address as Address,
count(*) as Shoe Count,
oc.count as Orders,
sum(ui.revenue) as Spend
from data_db.records.userinfo as ui
join naming_db.public.username as un
on un.name = ui.nameid
join naming_db.public.addresses as a
on a.address = ui.addressid
left join order_counts as oc
on ui.ID = oc.ID
group by 1,2,4
I also inserted some aliases so the SQL was more readable.
Upvotes: 0
Reputation: 13026
you can use sum()
instead of using subquery
on your column
Shoes.
select
"table1".name as "Name",
"table1".address as "Address",
count(*) as "Counts",
sum(case when coalesce("table3".shoeID, 0) != 0 then 1 else 0 end) as "Shoes",
sum("table2".shoecount) as "Number of Shoes"
from "table2"
join "table1" on "table1".name = "table2".name
join "table1" on "table1".Address = "table2".Address
full join "table3" on "table3".shoeID = "table2".shoeID
group by 1,2,4
Upvotes: 0