Reputation: 12484
I'm struggling to understand what this query is doing:
SELECT branch_name, count(distinct customer_name)
FROM depositor, account
WHERE depositor.account_number = account.account_number
GROUP BY branch_name
What's the need of GROUP BY?
Upvotes: 0
Views: 1126
Reputation: 498904
You must use GROUP BY
in order to use an aggregate function like COUNT
in this manner (using an aggregate function to aggregate data corresponding to one or more values within the table).
The query essentially selects distinct branch_name
s using that column as the grouping column, then within the group it counts the distinct customer_name
s.
You couldn't use COUNT
to get the number of distinct customer_name
s per branch_name
without the GROUP BY
clause (at least not with a simple query specification - you can use other means, joins, subqueries etc...).
Upvotes: 6
Reputation: 57023
Let's take a step away from SQL for a moment at look at the relational trainging language Tutorial D.
Because the two relations (tables) are joined on the common attribute (column) name account_number
, we can use a natural join:
depositor JOIN account
(Because the result is a relation, which by definition has only distinct tuples (rows), we don't need a DISTINCT
keyword.)
Now we just need to aggregate using SUMMARIZE..BY
:
SUMMARIZE (depositor JOIN account)
BY { branch_name }
ADD ( COUNT ( customer_name ) AS customer_tally )
Back in SQLland, the GROUP BY branch_name
is doing the same as SUMMARIZE..BY { branch_name }
. Because SQL has a very rigid structure, the branch_name
column must be repeated in the SELECT
clause.
Upvotes: 2
Reputation: 16304
If you want to COUNT
something (see SELECT
-Part of the statement), you have to use GROUP BY
in order to tell the query what to aggregate. The GROUP BY
statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.
Neglecting it will lead to SQL errors in most RDBMS, or senseless results in others.
Useful link: http://www.w3schools.com/sql/sql_groupby.asp
Upvotes: 2
Reputation: 57573
It's giving you the total distinct customers for each branch; GROUP BY
is used for grouping COUNT
function.
It could be written also as:
SELECT branch_name, count(distinct customer_name)
FROM depositor INNER JOIN account
ON depositor.account_number = account.account_number
GROUP BY branch_name
Upvotes: 3