Reputation: 2793
How can one compute a weighted median in KDB?
I can see that there is a function med for a simple median, but I could not find something like wmed, similar to wavg.
Thank you very much for your help!
Upvotes: 1
Views: 523
Reputation: 1097
For values v and weights w, med v where w gobbles space for larger values of w. Instead, sort w into ascending order of v and look for where the cumulative sums reach half their sum.
q)show v:10?100
17 23 12 66 36 37 44 28 20 30
q)show w:.001*10?1000
0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012
q)med v where "j"$w*1000
36f
q)w iasc v / sort w into ascending order of v
0.077 0.418 0.804 0.126 0.506 0.012 0.503 0.12 0.71 0.829
q)0.5 1*(sum;sums)@\:w iasc v / half the sum and cumulative sums of w
2.0525
0.077 0.495 1.299 1.425 1.931 1.943 2.446 2.566 3.276 4.105
q).[>]0.5 1*(sum;sums)@\:w iasc v / compared
1111110000b
q)v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v / weighted median
36
q)\ts:1000 med v where "j"$w*1000
18 132192
q)\ts:1000 v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v
2 2576
q)wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}
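As a quick check, the wmed lambda can be called on the sample data from the session and on small integer-weighted cases. This is a minimal sketch: v and w are typed in literally from the output shown above rather than regenerated at random, and the integer-weight cases are chosen here only for illustration.

```q
/ wmed as defined above: x is the values, y is the weights
wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}

/ sample data copied from the session above
v:17 23 12 66 36 37 44 28 20 30
w:0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012

wmed[v;w]                             / 36
wmed[10 34 23 123 5 56;4 1 1 1 1 1]   / 10
wmed[10 34 23 123 5 56;1 1 1 1 1 4]   / 56
```

Note that wmed returns an item of the values list, so the result keeps the type of v (36 here), whereas med returns a float (36f).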
Some vector techniques worth noticing:
Applying two functions to the same argument with (sum;sums)@\: and using Apply . with an operator on the result, rather than setting a variable, e.g. (0.5*sum yi)>sums yi:y i, or defining an inner lambda, e.g. {sums[x]<0.5*sum x}y i
Using iasc on one list to sort another.
Indexing by juxtaposition, as in v i sum ..
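The comparison forms mentioned above should all produce the same boolean mask. The sketch below checks that on the sample data; the names sw, a, b and c are introduced here for illustration only, and v and w are copied from the session output.

```q
/ sample data copied from the session above
v:17 23 12 66 36 37 44 28 20 30
w:0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012
sw:w iasc v                      / weights sorted into ascending order of v

a:.[>]0.5 1*(sum;sums)@\:sw      / two functions via Each Left, then Apply with >
b:(0.5*sum sw)>sums sw           / the same test using a named intermediate
c:{sums[x]<0.5*sum x}sw          / the same test as an inner lambda

(a~b;b~c)                        / 11b
```

Summing any of these masks gives the position of the weighted median in the sorted values.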
Upvotes: 5
Reputation: 13572
You could effectively weight the median by duplicating values (using where):
q)med 10 34 23 123 5 56 where 4 1 1 1 1 1
10f
q)med 10 34 23 123 5 56 where 1 1 1 1 1 4
56f
q)med 10 34 23 123 5 56 where 1 2 1 3 2 1
34f
If your weights are percentages (e.g. 0.15 0.10 0.20 0.30 0.25), convert them to equivalent whole (counting) numbers:
q)med 1 2 3 4 5 where "i"$100*0.15 0.10 0.20 0.30 0.25
4f
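To see what the conversion does, it may help to look at the intermediate counts and the length of the expanded list. This is a small sketch; the names p and c are introduced here and are not part of the answer.

```q
p:0.15 0.10 0.20 0.30 0.25       / the percentage weights from the example
c:"i"$100*p                      / 15 10 20 30 25i
count 1 2 3 4 5 where c          / 100
med 1 2 3 4 5 where c            / 4f
```

The expanded list has as many items as the sum of the counts, so weights that need a larger multiplier than 100 to become whole numbers would produce a proportionally longer list.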
Upvotes: 1