Caffeinated
Caffeinated

Reputation: 12484

I'm not sure what is the purpose of "group by" here

I'm struggling to understand what this query is doing:

SELECT branch_name, count(distinct customer_name)
FROM depositor, account
WHERE depositor.account_number = account.account_number
GROUP BY branch_name

What's the need of GROUP BY?

Upvotes: 0

Views: 1126

Answers (4)

Oded
Oded

Reputation: 498904

You must use GROUP BY in order to use an aggregate function like COUNT in this manner (using an aggregate function to aggregate data corresponding to one or more values within the table).

The query essentially selects distinct branch_names using that column as the grouping column, then within the group it counts the distinct customer_names.

You couldn't use COUNT to get the number of distinct customer_names per branch_name without the GROUP BY clause (at least not with a simple query specification - you can use other means, joins, subqueries etc...).

Upvotes: 6

onedaywhen
onedaywhen

Reputation: 57023

Let's take a step away from SQL for a moment at look at the relational trainging language Tutorial D.

Because the two relations (tables) are joined on the common attribute (column) name account_number, we can use a natural join:

depositor JOIN account

(Because the result is a relation, which by definition has only distinct tuples (rows), we don't need a DISTINCT keyword.)

Now we just need to aggregate using SUMMARIZE..BY:

SUMMARIZE (depositor JOIN account) 
   BY { branch_name } 
      ADD ( COUNT ( customer_name ) AS customer_tally )

Back in SQLland, the GROUP BY branch_name is doing the same as SUMMARIZE..BY { branch_name }. Because SQL has a very rigid structure, the branch_name column must be repeated in the SELECT clause.

Upvotes: 2

Bjoern
Bjoern

Reputation: 16304

If you want to COUNT something (see SELECT-Part of the statement), you have to use GROUP BY in order to tell the query what to aggregate. The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.

Neglecting it will lead to SQL errors in most RDBMS, or senseless results in others.

Useful link: http://www.w3schools.com/sql/sql_groupby.asp

Upvotes: 2

Marco
Marco

Reputation: 57573

It's giving you the total distinct customers for each branch; GROUP BY is used for grouping COUNT function.
It could be written also as:

SELECT branch_name, count(distinct customer_name)
FROM depositor INNER JOIN account
ON depositor.account_number = account.account_number
GROUP BY branch_name

Upvotes: 3

Related Questions