A.B.
A.B.

Reputation: 2470

mysql world database: select country with max population

I want to select the country with the highest population. This is my query:

SELECT continent, name, population FROM country HAVING population = MAX(population);

Somehow it returns 0 rows.

If i use a nested query it works:

SELECT continent, name, population FROM country WHERE population = (SELECT MAX(population) FROM country);

So my question is: What is wrong with the first query?

PS: You can download the database here: http://dev.mysql.com/doc/index-other.html


Alright i think finally i figured out the whole process. Here how it works and why the alternative suggestions/solutions didn't work:

(First of all, as "Dan Bracuk" said in his answer, we have to combine GROUP BY with HAVING and add the aggregate function at the begininng in SELECT statement)

So let's go step by step and try this:

SELECT continent, name, population, MAX(population) FROM country;

This yields us the first row appended MAX(pop) at the end:
"North America", "Aruba", "103000", "1277558000"

So MAX(population) is just one entry so it restricts the row number to 1 and because i added the continent, name, population columns, mysql just selects the first row from the table.

So if i now write down:

SELECT continent, name, population, MAX(population) FROM country HAVING population = MAX(population);

I get 0 rows, because 103000 is not equal to 1277558000.

If i use:

SELECT continent, name, population, MAX(population) FROM country GROUP BY name;

for instance, i get a list of all countries where on each row MAX(population) = population.

So additionally adding "HAVING population = MAX(population)" has no effect since it's already true.

I hope i understood it correctly and could clarify others who wondered why the other solutions didnt work.

Upvotes: 3

Views: 24483

Answers (7)

seun
seun

Reputation: 11

The HAVING clause is used in filtering data while aggregating at the same time, if the table already contains each continent with its total population, then you do not need to use the max function in the main query cos there will be no need to aggregate the data. Therefore the WHERE will suffice in this case

Also, where you have Having population = Max(population) will return nothing because you did not specify the table from which the max(population) will get the data.

Alternatively, in sqlserver you can use the clause Where >= ALL to pull the continent with the highest population in the entire table

select  continent, name
from country
where population >= all ( select population
                            from country
                           )

Upvotes: 0

newtothis
newtothis

Reputation: 11

select population, country from world where population = (select max(population) from world)

Upvotes: 1

user11951743
user11951743

Reputation: 1

If you want to find the name of the city with the max and counting city from number of city given in table so you can find it through fire given below query.

select max(city),
       max(count(city))
from emp
group by city
having city=max(city);
MAX(city)      |   MAX(COUNT(city))
-------------------- -------------------
udhna          |           4

Upvotes: -1

unique2
unique2

Reputation: 2302

While the WHERE clause is applied to the source table, the HAVING clause is applied to the aggregated data as a last step before returning the result.

In the case of MySQL this is what happens with your query:

  • Since you are using the aggregation function MAX, MySQL concludes that the query should be an aggregation
  • Since there is no GROUP BY clause, the aggregation is over the complete table returning a single line
  • You are using the fields continent, name, and population which are not in the GROUP BY without an aggregation function, while other SQL databases would throw an error, MySQL actually just returns the values of an indeterminate row
  • The HAVING clause checks if the returned row has the maximum population

Usually this should return 0 rows, but depending on the setup of the actual table, you may actually get a correct answer as you can see in this sql fiddle: http://sqlfiddle.com/#!2/79835/7, but this is misleading and breaks down for the minimum.

If you are using a GROUP BY clause or an aggregation function like MAX, COUNT, or SUM your query is an aggregation. In that case any field you are using should usually either be included in the GROUP BY or enclosed in an aggregation function. Not doing this can lead to subtle bugs which are hard to track down.

See also: GROUP BY behavior when no aggregate functions are present in the SELECT clause

Upvotes: 1

WGS
WGS

Reputation: 14179

Even if I already posted my answer in the comments area, I feel obliged to point this out. Mind you, this might not even be correct, but for Q&A purposes, I must at the very least attempt to point out what is wrong with the first query.

First point: It's not entirely wrong. It just happens to be "non-conformist". There are instances where people have forced similar queries to work, but it's almost always an action that begs a long explanation. Suffice to say, "hack-free", HAVING works (almost) only with GROUP BY.

Second is the evaluation flow of SQL. I might be off the mark here as well, and since I haven't had any experience with SQL for quite a while, but the order I think should stay the same. FROM-WHERE-GROUP BY-HAVING-SELECT-ORDER BY was one of the first things I've learned about it, and it acts like a filter of sorts. Going for a HAVING MAX(population) when it hasn't pulled out the population yet is a bit of pain to execute. This is partly where subqueries exist: so you can identify a "target" value that can be referenced before SELECT ever comes around.

Let's step through your second line, the one with a subquery: FROM country gets all the countries. Then you said to SQL on the WHERE part, "Hey, let's set a filter. But the filter needs another query. So let's do that." It returns a value, which is the MAX(population of the table. Now, he returns that value to WHERE so WHERE filters out all rows with a population not equal to the condition. Then it returns the continent, name, and population of the country. Simple enough.

Now, the second one: Again, FROM country gets all the countries. You have no WHERE so you're working with the base table. You didn't put in a GROUP BY either, no grouping happened. But then you threw in a HAVING, so SQL stops and asks, "Where's the qualifying groups I should throw out? But even if there are no groups and by some chance this query is slightly valid, where's my condition?" So you provided a condition, but this condition rests on one simple fact: Who is MAX(population) referring to? You haven't picked it out of the list anyway, since SELECT still has to happen.

In essence, this is partly the reason why you can't do a SELECT name FROM country WHERE population = MAX(population). As far as the flow goes, it doesn't know the max population because it still needs to be evaluated. It's like asking an apple picker to select the best apple from an orchard when he hasn't seen all the apples yet.

Upvotes: 1

James
James

Reputation: 3805

Select continent, name, sum(population) As sumPop
From country
Group By continent
Order By sumPop Desc
Limit 1;

Wow, after lots of trial and error, I see what the issue is now. First, your initial query seems wrong. With the database table you are using, you can have multiple regions within the continent. Therefore, your query will only return (if working correctly) the continent with the single largest region instead of a sum of all regions. Secondly, you absolutely can use the max function within the having clause, such as in this example:

Select continent, name, max(population) As sumPop
From country
Group By continent
Having sumPop = max(population)

(If you do not group by Continent, you will only get one row back.)

However, this is really of no use to you because at this point, the max population is whatever the population is for that row. Not for every row. This is because the having clause is only looking at the values for that one row (for however many rows you have). No matter what, you will need a sub query if you want to use the having clause in your case (or you can use my initial query), such as below.

Select continent, name, sum(population) As sumPop, maxPop
From country
Left Join (Select sum(population) As maxPop
          From country
          Group By continent
          Order By maxPop Desc Limit 1) as TmpTable On (maxPop > 0)
Group By continent
Having sumPop = maxPop

http://sqlfiddle.com/#!2/3ab95/26 (This sqlfiddle does not contain all table rows, so the result may be different from your own.)

Upvotes: 2

Dan Bracuk
Dan Bracuk

Reputation: 20804

There are a couple of things wrong with it. In fact, in some db engines it would throw an error.

First, the having keyword is used with aggregate results, not fields. Here is an example.

select field1, count(*) records
from sometable
group by field1
having records > 10

Note that some databases will not allow you use alias names in the having clause.

Next, you say this:

having population = max(population)

max(population) hasn't really been defined. You are not selecting it from anywhere.

Upvotes: 1

Related Questions