Reputation: 8455
I'm trying to put together some simple stats, but stuck with computing the percentile that the author belongs to according to his total scores:
select
[by] author,
count(*) count,
sum(score) sum_score,
quantiles(sum(score), 101) percentile_sum_score,
from
[bigquery-public-data:hacker_news.stories]
group by
author
This code returns each author's sum(score)
in the column percentile_sum_score
(defined as quantiles(sum(score), 101)
). But not the percentile the author is in, relative to other authors.
This happens in BigQuery, and some conventional functions are not available (docs).
Is there a way to get the right statistics?
Upvotes: 1
Views: 185
Reputation: 172993
You might be looking for PERCENT_RANK() function
Below is possible use for your example
SELECT
author,
[count],
sum_score,
PERCENT_RANK() OVER(ORDER BY sum_score DESC) percentile_sum_score,
FROM (
SELECT
[by] author,
COUNT(1) [count],
SUM(score) sum_score,
FROM [bigquery-public-data:hacker_news.stories]
GROUP BY author
)
Upvotes: 1