Reputation: 4716
What's the least verbose way to express a percentile rank function (not to be confused with percentile) in Q, excluding nulls?
I have:
q)x:0N 1 2 0N 2 1 5
q)@[count[x]#0Nf;i;:;(1%count i)*1+rank x i:where not null x]
0n 0.2 0.6 0n 0.8 0.4 1
The problem with rank above is that tied values don't end up with the same probability/percentile value.
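To see the tie issue in isolation, here is rank applied to just the non-null values from the example above. This is only a quick illustration of the behaviour, not a fix:

```q
/ rank is equivalent to iasc iasc, so tied values still get distinct positions
x:1 2 2 1 5
rank x              / 0 2 3 1 4
(1+rank x)%count x  / 0.2 0.6 0.8 0.4 1
```

The two 1s get 0.2 and 0.4, and the two 2s get 0.6 and 0.8, even though each pair should share a single value.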
Upvotes: 1
Views: 4262
Reputation: 4716
I've compared several approaches (including prank4 from the other answer):
prank1:{
  n:asc x where not null x;
  (sums[count each group n]%count n) @ x
  }
prank2:{
  p:(1+(asc n) bin n)%count n:x i:where not null x;
  @[count[x]#0Nf;i;:;p]
  }
prank3:{@[((1+til[count i])%count i)@last each group asc i:x where not null x;x]}
prank4:{
  X: x where not null x;
  grouped: group asc X;
  firstRank: first each value grouped;
  quantiles: (key grouped)! firstRank%count X;
  quantiles x
  }
Check that the output is consistent with the nearest-rank method of percentile calculation, except for prank4:
prank1 0N 1 2 0N 2 1 5 / 0n 0.4 0.8 0n 0.8 0.4 1
Compare timings and memory footprint:
x:10000000?0N,til 500
\ts prank1 x / 494 402661632
\ts prank2 x / 3905 671088960
\ts prank3 x / 552 536879392
\ts prank4 x / 496 533741888
prank2[x]~prank1 x / 1b
prank1[x]~prank3 x / 1b
prank1[x]~prank4 x / 0b
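The mismatch appears to come from prank4 dividing the 0-based index of the first occurrence of each value by the count, which gives the fraction of values strictly below it. Here is a sketch of a variant that uses the 1-based index of the last occurrence instead, which should line up with the other three on the small example. Note that prank4b and lastRank are hypothetical names of mine, not from either answer:

```q
/ prank4b: hypothetical variant of prank4 that takes the last
/ 1-based sorted position of each distinct value
prank4b:{
  X:x where not null x;                      / drop nulls
  grouped:group asc X;                       / value -> positions in sorted list
  lastRank:1+last each value grouped;        / last 1-based position per value
  quantiles:(key grouped)!lastRank%count X;  / value -> percentile rank
  quantiles x                                / nulls miss the lookup and yield 0n
  }
prank4b 0N 1 2 0N 2 1 5   / 0n 0.4 0.8 0n 0.8 0.4 1
```

I have not timed this variant, so the figures above apply only to the original four functions.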
Upvotes: 0
Reputation: 2569
I don't think this is the optimal solution, but it should solve the issue:
{
  X: x where not null x;
  grouped: group asc X;
  firstRank: first each value grouped;
  quantiles: (key grouped)! firstRank%count X;
  quantiles x
  }[0N 1 2 0N 2 1 5]
Upvotes: 1