wlq
wlq

Reputation: 53

Conditional percentile with index function

I have a panel data set for different country categories, each with some score. I need to calculate the percentile rank of the score but only against the scores within the same country. I managed to write an array function to do the job but it seems it is calculating the percentiles wrongly. Any ideas why?

Dataset

Country Score   Manual percentiles  Array-function percentiles
AT      0.211   0.555               0.368
AT      0.090   0.111               0.157
AT      0.722   0.888               0.842
AT      0.241   0.666               0.421
AT      0.099   0.222               0.21
AT      0.138   0.444               0.315
AT      0.068   0                   0
AT      0.136   0.333               0.263
AT      0.461   0.777               0.736
AT      0.982   1                   1
BE      0.068   0                   0.052
BE      0.316   0.333               0.526
BE      0.304   0.222               0.473
BE      0.072   0.111               0.105
BE      0.703   0.777               0.789
BE      0.419   0.444               0.578
BE      0.788   0.888               0.894
BE      0.817   1                   0.947
BE      0.444   0.666               0.684
BE      0.425   0.555               0.631

Manual percentiles I calculate as:

=PERCENTRANK($B$2:$B$11;B2) for Country = "AT"
=PERCENTRANK($B$12:$B$21;B12) for COuntry = "BE"

I type in the array formula for array-function percentiles (and enter with Ctrl + Shift + Enter)

=PERCENTRANK(IF($A$2:$A$21=INDEX($A$2:$A$21;ROW()-1);$B$2:$B$21);
INDEX($B$2:$B$21;ROW()-1))

The results appear to capture the gist of the data but in fact are different. Any ideas why? I found that if I replace INDEX($A$2:$A$21;ROW()-1) by "AT" or "BE", the results are the same.

Upvotes: 2

Views: 1111

Answers (2)

Bobby Heyer
Bobby Heyer

Reputation: 601

You could create a custom function in vba which operates similar to the sumif formula.

below is an example of one I've done before. then you could reference that function in your existing sub, or use it directly in the workbook.

Note it gets a bit resource intensive if you are using it on large data ranges.

Option Explicit
Dim Data_range As Range
Dim Criteria_range As Range
Dim d_array As Variant
Dim c_array As Variant
Dim p_array() As Double
Dim Criteria As Range
Dim k As Double
Dim c As Long
Dim i As Long

Function PERCENTILEIF(Data_range, Criteria_range, Criteria, k)

' Define Inputs
    d_array = Data_range.Value
    c_array = Criteria_range.Value

' Seed Percentile array
    c = 0
    ReDim Preserve p_array(c)

' Loop through range, look for matching criteria, If found add to p_array
    For i = 1 To UBound(c_array, 1)
        If c_array(i, 1) = Criteria.Value Then
            ReDim Preserve p_array(0 To c)
            p_array(c) = d_array(i, 1)
            c = c + 1
        End If
    Next

' worksheet percentile function of p_array
    PERCENTILEIF = Application.WorksheetFunction.Percentile(p_array, k)

' Clean up
    c = 0
    Erase d_array
    Erase c_array
    Erase p_array

End Function

Upvotes: 0

jblood94
jblood94

Reputation: 16981

This works for me if I copy your data starting in A1 (still using ctrl+shift+enter):

=PERCENTRANK(IF($A$2:$A$21=A2;$B$2:$B$21);B2)

Upvotes: 1

Related Questions