Reputation: 11
I have a modeled/forecasted change and an actual change. The forecasted change is in a column named forecastHPIChange and the actual change is named HPIChange. It's in the following form:
      HPIChange forecastHPIChange
1            NA      1.547368e-02
2 -0.0026155187      1.485668e-02
3  0.0002906977      1.251108e-02
4 -0.0077877127      1.718729e-02
5  0.0200058841      2.143551e-02
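For each of the 143 observations, I want to test whether the sign of the forecast matches the sign of the actual change. So there are really four cases: forecast positive and actual positive (correct), forecast negative and actual negative (correct), forecast negative and actual positive (wrong), and forecast positive and actual negative (wrong).
To check this, I've hacked together the following code. I could feed the results into a data frame, but is there a more elegant way to do this check?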
library(dplyr)

# forecast up, actual up
data1 %>%
  filter(forecastHPIChange > 0 & HPIChange > 0) %>%
  summarise(correct = n())
# forecast down, actual down
data1 %>%
  filter(forecastHPIChange < 0 & HPIChange < 0) %>%
  summarise(correct = n())
# forecast down, actual up
data1 %>%
  filter(forecastHPIChange < 0 & HPIChange > 0) %>%
  summarise(wrong = n())
# forecast up, actual down
data1 %>%
  filter(forecastHPIChange > 0 & HPIChange < 0) %>%
  summarise(wrong = n())
Upvotes: 1
Views: 44
Reputation: 23109
Starting with the following data (I changed your example slightly so that all four classes, TP, FP, TN and FN, are represented):
data1
      HPIChange forecastHPIChange
1            NA        0.01547368
2 -0.0026155187        0.01485668
3  0.0002906977        0.01251108
4 -0.0077877127       -0.01718729
5  0.0200058841       -0.02143551
# transform the data1 to dataset data2 where we have only + and - labels (represented by +1 and -1)
data2 <- as.data.frame(sapply(data1, function(x) ifelse(x > 0, 1, -1)))
table(data2)
         forecastHPIChange
HPIChange -1 1
       -1  1 1
        1  1 1
# cell labels as (HPIChange, forecastHPIChange):
# (1, 1) = TP     (1, -1) = FN
# (-1, -1) = TN   (-1, 1) = FP
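# using the package caret; the inputs are wrapped in factor() with shared
# levels because recent caret versions reject non-factor input
library(caret)
lv <- c(-1, 1)
confusionMatrix(factor(data2$forecastHPIChange, levels = lv),
                reference = factor(data2$HPIChange, levels = lv))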
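If you only need the number of correct and wrong sign calls, a base R comparison of the signs may be enough. This is a minimal sketch, not a replacement for the confusion matrix; the vector names actual and forecast, and hits and hit_rate, are made up for illustration, and the values are the modified example data from above.

```r
# Hypothetical toy vectors mirroring the modified example data above
actual   <- c(NA, -0.0026155187, 0.0002906977, -0.0077877127, 0.0200058841)
forecast <- c(0.01547368, 0.01485668, 0.01251108, -0.01718729, -0.02143551)

# TRUE where forecast and actual share a sign; NA where either value is missing
same_sign <- sign(forecast) == sign(actual)

# Counts of wrong (FALSE) and correct (TRUE) sign calls; table() drops NA by default
hits <- table(same_sign)

# Share of non-missing rows where the direction was forecast correctly
hit_rate <- mean(same_sign, na.rm = TRUE)

print(hits)
print(hit_rate)
```

Note that sign() returns 0 for an exact zero, so a zero change paired with a non-zero forecast counts as a wrong call under this comparison.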
Upvotes: 0
Reputation: 269860
Try confusionMatrix in the caret package:
library(caret)
make_factor <- function(x) factor(sign(x), levels = c(-1, 1))
signs <- as.data.frame(lapply(data1, make_factor))
with(signs, confusionMatrix(forecastHPIChange, reference = HPIChange))
or using a pipeline:
library(purrr)
data1 %>%
  map_df(make_factor) %>%
  { confusionMatrix(.$forecastHPIChange, reference = .$HPIChange) }
Either gives:
Confusion Matrix and Statistics

          Reference
Prediction -1 1
        -1  0 0
        1   2 2
Accuracy : 0.5
95% CI : (0.0676, 0.9324)
No Information Rate : 0.5
P-Value [Acc > NIR] : 0.6875
Kappa : 0
Mcnemar's Test P-Value : 0.4795
Sensitivity : 0.0
Specificity : 1.0
Pos Pred Value : NaN
Neg Pred Value : 0.5
Prevalence : 0.5
Detection Rate : 0.0
Detection Prevalence : 0.0
Balanced Accuracy : 0.5
For the input shown, not all factor levels appear, which is why make_factor sets the levels explicitly. If the actual input does contain all factor levels, then we could eliminate make_factor and just use sign instead.
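To illustrate the point about factor levels, here is a small sketch using the forecast column of the example input, where every value is positive. The variable names are made up for illustration.

```r
# Forecast column of the example input: every value is positive
forecast <- c(0.01547368, 0.01485668, 0.01251108, 0.01718729, 0.02143551)

# Without explicit levels, only the sign that actually occurs becomes a level
implicit_levels <- levels(factor(sign(forecast)))

# With explicit levels, the unused -1 level is kept with a zero count
explicit_counts <- table(factor(sign(forecast), levels = c(-1, 1)))

print(implicit_levels)
print(explicit_counts)
```

With the implicit version the two columns can end up with different level sets, so fixing the levels up front keeps the 2x2 layout even when one sign never occurs.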
Note: The input data1 used above, in reproducible form, is:
data1 <- structure(list(HPIChange = c(NA, -0.0026155187, 0.0002906977,
-0.0077877127, 0.0200058841), forecastHPIChange = c(0.01547368,
0.01485668, 0.01251108, 0.01718729, 0.02143551)), .Names = c("HPIChange",
"forecastHPIChange"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
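If a full confusion matrix is more than you need, the four separate filter/summarise calls from the question could also be collapsed into a single dplyr pipeline. This is only a sketch under the assumption that rows with a missing value should be dropped; the names sign_counts, forecast_sign, actual_sign and outcome are made up for illustration.

```r
library(dplyr)

# Same values as the example input, repeated so the snippet runs on its own
data1 <- data.frame(
  HPIChange = c(NA, -0.0026155187, 0.0002906977, -0.0077877127, 0.0200058841),
  forecastHPIChange = c(0.01547368, 0.01485668, 0.01251108, 0.01718729, 0.02143551)
)

sign_counts <- data1 %>%
  # drop rows where either value is missing (an assumption about the desired handling)
  filter(!is.na(HPIChange), !is.na(forecastHPIChange)) %>%
  # one row per observed combination of forecast sign and actual sign
  count(forecast_sign = sign(forecastHPIChange), actual_sign = sign(HPIChange)) %>%
  # label each combination as a correct or wrong sign call
  mutate(outcome = if_else(forecast_sign == actual_sign, "correct", "wrong"))

print(sign_counts)
```

Only combinations that actually occur get a row, so with this input you would see two rows rather than four.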
Upvotes: 2