齐天大圣
齐天大圣

Reputation: 1189

Get the average from a column, then filter out all the rows whose number is smaller than the average, and then count the number of row

This is sql question i have been working on for a while with no result.

The schema is

CREATE TABLE states (
    statecode text primary key,
    population_2010 integer,
    population_2000 integer,
    population_1950 integer,
    population_1900 integer,
    landarea real,
    name text,
    admitted_to_union text
);
CREATE TABLE counties(
    name text,
    statecode text references states(statecode),
    population_1950 integer,
    population_2010 integer
);
CREATE TABLE senators(
    statecode text references states(statecode),
    name text primary key,
    affiliation text,
    took_office integer,
    born integer);
CREATE TABLE committees(
    id text primary key,
    parent_committee text references committees(id),
    name text,
    chairman text references senators(name),
    ranking_member text references senators(name)
);

The question is: return a count of how many states have more than the average number of counties

I know how to calculate the average:

select avg(state_count) 
from
    (select count(*) as state_count
     from counties C
     group by C.statecode)

But i'm wondering if i can reuse the subquery.

like this(the following one does not work ):

select count(*) as state_count
from counties C
group by C.statecode
having count(*) > avg(state_count)

This query still does not count the number of rows, is there an elegant way?

EDIT: Just found a sample answer

SELECT COUNT(*)
FROM (SELECT statecode, COUNT(*) AS counts
      FROM counties
      GROUP BY statecode) s
WHERE s.counts > (SELECT AVG(t.counts) 
                  FROM  (SELECT COUNT(*) AS counts
                         FROM counties 
                         GROUP BY statecode) t);

But i am still curious if there is a better way?

Upvotes: 3

Views: 1892

Answers (3)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521599

I think you can simplify to the following query. I used a simple (uncorrelated) subquery to compute the average number of counties per state, by taking the total number of counties divided by the total number of states.

SELECT COUNT(*) AS states_above_average_count
FROM
(
    SELECT statecode
    FROM counties
    GROUP BY statecode
    HAVING COUNT(*) > (SELECT COUNT(*) FROM counties) / (SELECT COUNT(*) FROM states)
) t

Upvotes: 0

Dipendu Paul
Dipendu Paul

Reputation: 2753

The uneconomical part of sample answer that you have provided is this

SELECT COUNT(*) AS counts
FROM counties 
GROUP BY statecode

This part appears twice, once for the purpose of calculating average and once for finding counts which are greater than average. So here is my attempt using CTE which reuses the above piece of code for both the purposes:

WITH c AS
(
   SELECT COUNT(*) AS counts
   FROM counties 
   GROUP BY statecode
)
SELECT COUNT(*) FROM c WHERE counts > (SELECT  Avg(CAST(counts AS decimal)) 
                                       FROM c) 

Upvotes: 2

Raja ram
Raja ram

Reputation: 59

HI you can do by using CTE

    WITH States AS (
   SELECT COUNT(*) as state_count
     FROM counties t
 GROUP BY t.statecode)
  select count(*) from( SELECT count(*) as state_count from 
    counties t
 GROUP BY t.statecode
        having  count(*)> (select avg(x.state_count) from States x)) as g

Upvotes: 0

Related Questions