Reputation: 53
I have a panel data set for different country categories, each with some score. I need to calculate the percentile rank of the score but only against the scores within the same country. I managed to write an array function to do the job but it seems it is calculating the percentiles wrongly. Any ideas why?
Dataset
Country Score Manual percentiles Array-function percentiles
AT 0.211 0.555 0.368
AT 0.090 0.111 0.157
AT 0.722 0.888 0.842
AT 0.241 0.666 0.421
AT 0.099 0.222 0.21
AT 0.138 0.444 0.315
AT 0.068 0 0
AT 0.136 0.333 0.263
AT 0.461 0.777 0.736
AT 0.982 1 1
BE 0.068 0 0.052
BE 0.316 0.333 0.526
BE 0.304 0.222 0.473
BE 0.072 0.111 0.105
BE 0.703 0.777 0.789
BE 0.419 0.444 0.578
BE 0.788 0.888 0.894
BE 0.817 1 0.947
BE 0.444 0.666 0.684
BE 0.425 0.555 0.631
Manual percentiles I calculate as:
=PERCENTRANK($B$2:$B$11;B2) for Country = "AT"
=PERCENTRANK($B$12:$B$21;B12) for COuntry = "BE"
I type in the array formula for array-function percentiles (and enter with Ctrl + Shift + Enter)
=PERCENTRANK(IF($A$2:$A$21=INDEX($A$2:$A$21;ROW()-1);$B$2:$B$21);
INDEX($B$2:$B$21;ROW()-1))
The results appear to capture the gist of the data but in fact are different. Any ideas why? I found that if I replace INDEX($A$2:$A$21;ROW()-1) by "AT" or "BE", the results are the same.
Upvotes: 2
Views: 1111
Reputation: 601
You could create a custom function in vba which operates similar to the sumif formula.
below is an example of one I've done before. then you could reference that function in your existing sub, or use it directly in the workbook.
Note it gets a bit resource intensive if you are using it on large data ranges.
Option Explicit
Dim Data_range As Range
Dim Criteria_range As Range
Dim d_array As Variant
Dim c_array As Variant
Dim p_array() As Double
Dim Criteria As Range
Dim k As Double
Dim c As Long
Dim i As Long
Function PERCENTILEIF(Data_range, Criteria_range, Criteria, k)
' Define Inputs
d_array = Data_range.Value
c_array = Criteria_range.Value
' Seed Percentile array
c = 0
ReDim Preserve p_array(c)
' Loop through range, look for matching criteria, If found add to p_array
For i = 1 To UBound(c_array, 1)
If c_array(i, 1) = Criteria.Value Then
ReDim Preserve p_array(0 To c)
p_array(c) = d_array(i, 1)
c = c + 1
End If
Next
' worksheet percentile function of p_array
PERCENTILEIF = Application.WorksheetFunction.Percentile(p_array, k)
' Clean up
c = 0
Erase d_array
Erase c_array
Erase p_array
End Function
Upvotes: 0
Reputation: 16981
This works for me if I copy your data starting in A1
(still using ctrl+shift+enter):
=PERCENTRANK(IF($A$2:$A$21=A2;$B$2:$B$21);B2)
Upvotes: 1