kkun
kkun

Reputation: 45

error using "order by" in a select statement (error: column is not contained in "either an aggregate function or the GROUP BY clause")

I have a table below and want to count the number of consecutive occurrences each letter appears. The code to reproduce the table I am using is listed for those helping to save time.

CREATE TABLE table1 (id integer, names varchar(50));

  INSERT INTO table1 VALUES (1,'A');
  INSERT INTO table1 VALUES (2,'A');
  INSERT INTO table1 VALUES (3,'B');
  INSERT INTO table1 VALUES (4,'B');
  INSERT INTO table1 VALUES (5,'B');
  INSERT INTO table1 VALUES (6,'B');
  INSERT INTO table1 VALUES (7,'C');
  INSERT INTO table1 VALUES (8,'B');
  INSERT INTO table1 VALUES (9,'B');

  select * from table1; 

I found code already written to accomplish this online, which I've tested and can confirm it runs successfully. It's shown here.

select names, count(*) as count
from (select id, names, (row_number() over (order by id) - row_number() over (partition by names order by id)) as grp
      from table1
     ) as temp
group by grp, names

I am trying to add in the ORDER BY clause at the end, like so:

select names, count(*) as count
from (select id, names, (row_number() over (order by id) - row_number() over (partition by names order by id)) as grp
      from table1
     ) as temp
group by grp, names
order by id -- added this here, but it creates an error.

but kept getting the error "Column "temp.id" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause." However, I am able to order by "names." What is the difference here?

Also, why can't I add in the "order by id" in the subquery? If I run this subquery on its own (see below), then the "order by id" is fine, but all together it cannot run. Why is this?

select names, count(*) as count
from (select id, names, (row_number() over (order by id) - row_number() over (partition by names order by id)) as grp
      from table1
      order by id -- added this in here, but it creates an error. 
     ) as temp
group by grp, names
order by names

Upvotes: 0

Views: 603

Answers (2)

DancingFool
DancingFool

Reputation: 1267

kept getting the error "Column "temp.id" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause." However, I am able to order by "names." What is the difference here?

SQL does things in a certain order. If your query has a GROUP BY (which yours does), that is done first. After grouping, the only thing SQL has is the columns that are selected and grouped by, so those are the only columns that can be used in the order by clause.

As an example, think of houses in a street. If you did a query on houses, returning colour & count, you might get something like Red 2, White 10, Green 3. But asking to sort that by address number makes no sense, because that information is not in data we've returned. In your case you are returning names, count, and you used grp in the group by clause, so those are the only things you can use to sort the final data, because they are all you have, and all that makes sense.

Also, why can't I add in the "order by id" in the subquery? If I run this subquery on its own (see below), then the "order by id" is fine, but all together it cannot run. Why is this?

When you have a subquery, the results are used as if they were a table. You can join on it, or query from it like you are, but the point is the order of that table has no effect on any thing else. The entry order of the underlying table is no guarantee that your query will come out in that order, unless you use an order by clause. And because you are doing a group by, that order means nothing anyway. Because the order of the subquery has no effect, SQL won't let you put it in.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269443

A select statement returns rows in an arbitrary order -- unless it has an order by. This is an extension of the fact that SQL operators on unordered sets.

Your select has no order by, so you should not assume the data would come back in any particular ordering. To get the results order by id, add order by id to the select.

Upvotes: 1

Related Questions