Need to find average and number of repetitions of column

I have an SQL sentence :

SELECT application.id,title,url,company.name AS company_name,package_name,ranking,date,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name FROM application
JOIN application_history ON application_history.application_id = application.id
JOIN company ON application.company_id = company.id
JOIN country ON application_history.country_id = country.id
JOIN collection ON application_history.collection_id = collection.id
JOIN category ON application_history.category_id = category.id
WHERE application.platform=0
AND country.name ='CZ'
AND collection.name='topfreeapplications'
AND category.name='UTILITIES'
AND application_history.ranking <= 10
AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
ORDER BY application_history.ranking ASC

It produces this result :

enter image description here

I'd like to add both a column average ranking for a given package, and a column number of appearances, which would count the number a package appears in the list. I'd also like to Group results by package_name, so that I don't have redundancies.

So far, I've tried to add a GROUP BY By clause before the ORDER BY :

GROUP BY package_name

But it returns me an error :

column "application.id" must appear in the GROUP BY clause or be used in an aggregate function

If I add each and every column it asks me for, it doesn't work. I have also tried to count the number of package names, by adding after the SELECT :

COUNT(package_name) AS count

It produces a similar error.

How could I get the result I'm looking for ? Should I make two queries instead, or is it possible to get everything at once ? I precise I have looked at other answers on S.O, but none of them tries to make the COUNT on a "produced" column.

Thank you for your help.

Edit :

Here is the result I expected at first :

enter image description here

Although Gordon's advice didn't give me the proper result it put me on the good track, when I read this : From the docs : "Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row."

So I came back to using COUNT and AVG alone. My problem was that I wanted to display the ranking column and date to check whether things were right. But putting these column into the Select prevented the GROUP BY to work as expected, as mentioned by Jarlh in the comments.

The working query :

SELECT application.id,title,url,company.name AS company_name,package_name,platform,country.name AS country_name,collection.name AS collection_name,category.name AS category_name, 
    COUNT(package_name) AS count, AVG(application_history.ranking) AS avg
    FROM application
    JOIN application_history ON application_history.application_id = application.id
    JOIN company ON application.company_id = company.id
    JOIN country ON application_history.country_id = country.id
    JOIN collection ON application_history.collection_id = collection.id
    JOIN category ON application_history.category_id = category.id
    WHERE application.platform=0
    AND country.name ='CZ'
    AND collection.name='topfreeapplications'
    AND category.name='UTILITIES'
    AND application_history.ranking <= 10
    AND date::date BETWEEN date (CURRENT_DATE - INTERVAL '1 month') AND CURRENT_DATE
    GROUP BY package_name,application.id,company.name,country.name,collection.name,category.name
    ORDER BY count DESC

Upvotes: 1

Views: 201

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270021

I think you want window/analytic functions. The following adds two columns, one for the count of rows for each package and the other an average ranking for them:

SELECT application.id, title, url, company.name AS company_name, package_name, 
       ranking, date, platform, country.name AS country_name,
       collection.name AS collection_name, category.name AS category_name,
       count(*) over (partition by package_name) as count,
       avg(ranking) over (partition by package_name) as avg_package_ranking
FROM application . . .

Upvotes: 1

Related Questions