Anton Tarasenko
Anton Tarasenko

Reputation: 8455

Getting the percentile the item belongs to in SQL

I'm trying to put together some simple stats, but stuck with computing the percentile that the author belongs to according to his total scores:

select
    [by] author,
    count(*) count,
    sum(score) sum_score,
    quantiles(sum(score), 101) percentile_sum_score,
from
    [bigquery-public-data:hacker_news.stories]
group by
    author

This code returns each author's sum(score) in the column percentile_sum_score (defined as quantiles(sum(score), 101)). But not the percentile the author is in, relative to other authors.

This happens in BigQuery, and some conventional functions are not available (docs).

Is there a way to get the right statistics?

Upvotes: 1

Views: 185

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

You might be looking for PERCENT_RANK() function
Below is possible use for your example

SELECT
    author,
    [count],
    sum_score,
    PERCENT_RANK() OVER(ORDER BY sum_score DESC) percentile_sum_score,
FROM (
  SELECT
    [by] author,
    COUNT(1) [count],
    SUM(score) sum_score,
  FROM [bigquery-public-data:hacker_news.stories]
  GROUP BY author
)

Upvotes: 1

Related Questions