user1893354
user1893354

Reputation: 5938

MySQL Query - grouping issue

So this is my query

select 
A.search_parameters,
reseller_count,
reseller,
num

from 

(select 
 search_parameters,
 count(distinct(reseller)) as reseller_count 
 from all_lookups 
 group by search_parameters)A

 join

 (select
  search_parameters,
  reseller,
  count(*) as num
  from all_lookups
  group by reseller,search_parameters)B

  on A.search_parameters=B.search_parameters

  where a.search_parameters not like '%addsomekeywords%' 
  and a.search_parameters not like '%add-some%' 
  order by reseller_count DESC,num 
  DESC limit 1000

For some reason, the results are grouped by reseller_count, but I want them to be grouped by search_parameters. I've tried adding a group by A.search_paramters after the where clause but this creates the error 'select list expression not produced by aggregation output (missing from GROUP BY clause?)' It might also be important to note that I am using this query in Cloudera Impala but I suspect the same MySQL rules/syntax still apply

Here is an example of what I am getting

A.search_parameter|reseller_count|reseller|num
hello             | 2            | abc    |556
hi                | 2            | tre    |54
hello             | 2            | xyz    |9
hi                | 2            | ytu    |4

and what I want is

A.search_parameter|reseller_count|reseller|num
hello             | 2            | abc    | 556
hello             | 2            | xyz    | 9
hi                | 2            | tre    | 54
hi                | 2            | ytu    | 4

So basically, all of the search_parameters' with the same reseller_count are randomly put together, but I want them grouped by search_parameter

Thanks

Upvotes: 0

Views: 1836

Answers (2)

DRapp
DRapp

Reputation: 48139

Aside from making your existing query a bit more readable, the only thing you should have to change is the outermost order by clause to just be on the search_parameters, then the num descending. The aggregates are already "grouped" from your JOINED query results.

select 
      A.search_parameters,
      A.reseller_count,
      B.reseller,
      B.num
   from 
      ( select 
              search_parameters,
              count(distinct(reseller)) as reseller_count 
           from 
              all_lookups 
           group by 
              search_parameters ) A
      join
      ( select
              search_parameters,
              reseller,
              count(*) as num
           from 
              all_lookups
           where 
                  search_parameters not like '%addsomekeywords%' 
              and search_parameters not like '%add-some%' 
           group by 
              search_parameters,
              reseller ) B

         on A.search_parameters = B.search_parameters
   order by 
      A.search_parameters,
      B.num DESC 
   limit 
      1000

Per feedback/comment. If you want based on reseller count too, then adjust your order by like I have below.

   order by 
      A.reseller_count DESC,
      A.search_parameters,
      B.num DESC 

So, what this will do is for all that have the highest resellers will be at the top of the list... Then, if you have multiple "search_parameters" that have the same high reseller counts they will be grouped together, and finally within each search parameter, which resellers with the highest number will be listed at the top. Take the following sample data counts..

I've change the column name headings here specifically for sample clarification purposes.

DistResell   SearchParm   ActualReseller   NumPerReseller
   4         X            Mary             405
   4         X            Bob              108
   4         X            George           107
   4         X            Jane              98
   4         Y            Jim              290
   4         Y            Jill             287
   4         Y            Mary             243
   4         Y            Sam              164
   3         A            Sam              201
   3         A            Mary             187
   3         A            Joe              146
   2         D            Jim              73
   2         D            Kathy            67
   2         G            Mary             140
   2         G            Jeff             125

So, as you can see in this example there are two search parameters found "X" and "Y" and each have 4 resellers so they are pushed to the top of the list even though alphabetically greater than search parm "A" (with count of 3). Now, within the group of 4 resellers, they are alpha so "X" is before "Y". Good at this level. Now, regardless of WHO the reseller is, you want it sorted by their "num" count per each distinct reseller.

"A" is next with 3 resellers, no confusion there.

Then down to 2 resellers with search parms of "D" and "G"... follows similar hierarchy as the 4 where we already know 2 is in the last group, then alpha by "D" and "G", and finally within each respective "D" and "G", the highest "num".

Does this better clarify what you are trying to accomplish?

Upvotes: 1

M Khalid Junaid
M Khalid Junaid

Reputation: 64476

You can do it like this wrap the whole query as the tabel

SELECT q.* FROM (
SELECT 
A.search_parameters,
reseller_count,
reseller,
num

FROM 

(SELECT 
 search_parameters,
 COUNT(DISTINCT(reseller)) AS reseller_count 
 FROM all_lookups 
 GROUP BY search_parameters)A

 JOIN

 (SELECT
  search_parameters,
  reseller,
  COUNT(*) AS num
  FROM all_lookups
  GROUP BY reseller,search_parameters)B

  ON A.search_parameters=B.search_parameters

  WHERE A.search_parameters NOT LIKE '%addsomekeywords%' 
  AND A.search_parameters NOT LIKE '%add-some%' 
  ORDER BY reseller_count DESC,num 
  DESC LIMIT 1000
  ) q GROUP BY q.search_parameters

Here is the example

SELECT q.* FROM (your whole query) q GROUP BY q.any_column_in_query

Upvotes: 0

Related Questions