Hendrik

Reputation: 135

OptimalCutoff Youden index calculation

After calculating the ROC curve for a dichotomous variable (a vs b), I want to calculate the optimal cutoff value for differentiating the two groups. The Youden index is J = sensitivity + specificity - 1, and the cutoff that maximizes J is the one that gives the best combined sensitivity and specificity.
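
For reference, the definition above can be sketched in base R as follows. This is only an illustration on made-up toy data (not the dataset in this question), and it assumes the rule "value >= cutoff is classified as a"; the variable names are hypothetical.

```r
# Toy data (hypothetical, NOT the question's dataset): class "a" tends to have higher values.
value <- c(10, 9, 8, 7, 6, 5, 4, 3)
type  <- c("a", "a", "a", "a", "b", "a", "b", "b")

# Candidate cutoffs: every observed value.
# Assumed decision rule: value >= cutoff is classified as "a".
cutoffs <- sort(unique(value))

youden <- sapply(cutoffs, function(cut) {
  pred_a <- value >= cut
  sens <- sum(pred_a & type == "a") / sum(type == "a")    # true "a" classified as "a"
  spec <- sum(!pred_a & type == "b") / sum(type == "b")   # true "b" classified as "b"
  sens + spec - 1                                         # Youden's J at this cutoff
})

# Cutoff with the largest J (which.max returns the first one if there are ties).
best_cutoff <- cutoffs[which.max(youden)]
best_cutoff
max(youden)
```

On this toy data the largest J is 0.8, reached at a cutoff of 7.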

The "OptimalCutpoints" package should be able to do this, but when I run the code below I get the error shown after it:

library(pROC)
library(OptimalCutpoints)
df <- structure(list(value = c(1945.523629, 2095.549323, 2066.585153, 
                         2445.878083, 2112.252632, 2115.92955, 2000.285032, 2224.611905, 
                         1616.534694, 1668.017699, 1475.980978, 1940.849817, 1716.666667, 
                         2153.284314, 2063.353635, 2163.070313, 1856.319149, 1499.986928, 
                         2240.440449, 1869.083916, 1807.196078, 2025.603604, 1638.22973, 
                         1781.602941, 2014.013809, 1906.027356, 2033.148718, 1923.403162, 
                         1687.107744, 2632.280305, 1774.073084, 2196.162393, 2164.108659, 
                         2055.031216, 2229.501425, 1273.872576, 2224.126126, 2006.858974, 
                         1956.601942, 1808.214521, 1535.387136, 1382.15, 1596.69693, 1779.477273, 
                         1577.174699, 1908.321526, 1833.124454, 1679.492978, 1777.31114, 
                         1988.249023, 1736.75, 1985.68521, 1821.025974, 1745.325862, 1805.640777, 
                         2326.821229, 1858.558824, 2025.622727, 2197.781321, 1475.685446, 
                         2000.906423, 1714.749573, 1436.529412, 1981.15572, 1939.612779, 
                         2007.679335, 2029.189536, 1644.298246, 1824.697342, 2281.990385, 
                         2131.331776, 1143.722714, 1784.578076, 2143.131579, 982.4908457, 
                         2217.021592, 1799.512346, 526.7047753, 1613.25, 951.9103079, 
                         1006.241888, 1146.276835, 1651.474138, 1568.484778, 1938.867704, 
                         792.5410822, 1602.037383, 1244.281863, 957.5739437, 819.6116071, 
                         879.2128326, 1189.638632, 775.5525292, 1148.193333, 1130.812183, 
                         902.34, 994.3302961), type = c("a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
                                                        "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b", 
                                                        "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b"
                         )), .Names = c("value", "type"), row.names = c(NA, -97L), class = "data.frame")

rocobj <- plot.roc(df$type, df$value, percent = TRUE, main="ROC", col="#1c61b6", add=FALSE)

optimal.cutpoint.Youden <- optimal.cutpoints(X = "value", status = "type", tag.healthy = 0, methods = "Youden", 
                                             data = df, pop.prev = NULL,
                                             control = control.cutpoints(), ci.fit = FALSE, conf.level = 0.95, trace = FALSE)
summary(optimal.cutpoint.Youden)
plot(optimal.cutpoint.Youden)

Error: There are no healthy subjects in your dataset. Please review data and variables. Prevalence must be a value higher than 0 and lower than 1.

I am probably missing something very obvious here. I tried to modify the code based on the package help file, but I cannot get rid of the error.

Thank you very much, and my apologies for my R "skills".

PS: I understand the limitations of defining an "optimal cutoff" because it depends on how important your sensitivity is versus your specificity etc. I just want to have an idea of what value we would get using this technique.

Upvotes: 1

Views: 6546

Answers (1)

Frost_Maggot

Reputation: 309

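The problem is how you have defined the tag.healthy argument. It must be one of the values that actually occur in your status column, i.e. 'a' or 'b', but you have set it to 0. No row has type equal to 0, so the function finds no healthy subjects and stops with that error.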
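
A minimal sketch of a corrected call, under some assumptions: the small `toy` data frame is a made-up stand-in for your df, and I have arbitrarily treated 'b' as the "healthy" class because it has the lower values, which matches the default direction = "<" (healthy subjects score lower). If 'a' is your reference group instead, set tag.healthy = "a" together with direction = ">".

```r
library(OptimalCutpoints)

# Hypothetical stand-in for the question's df: "a" has the higher values.
toy <- data.frame(
  value = c(2100, 1950, 2250, 1800, 1700, 2000, 1600, 1100, 950, 1300, 800, 1650),
  type  = c("a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  stringsAsFactors = FALSE
)

fit <- optimal.cutpoints(
  X = "value", status = "type",
  tag.healthy = "b",   # assumption: "b" is the reference ("healthy") class
  direction = "<",     # the default: healthy subjects have the LOWER values
  methods = "Youden",
  data = toy
)

# Prints the Youden-optimal cutoff with its sensitivity and specificity.
summary(fit)
```

With your own data, the same call with data = df should get past the "no healthy subjects" error, since 'b' does occur in df$type.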

Hope this helps.

Upvotes: 1
