Reputation: 316
I have a dataframe, and my goal is to take all possible pairwise combinations of the rank column and, for each pair, apply a function that uses the volume_metric and kpi_metric values. The resulting output would be a matrix like the one the cor() function provides, except holding each rank combination's p-value.
Basically, I want to take the first row's values of volume_metric and kpi_metric and the second row's values of volume_metric and kpi_metric, and apply the zTest function to them. Then rows 1 and 3, 1 and 4, etc.
library(tibble)  # provides tibble()

rank <- c('ad 1', 'ad 2', 'ad 3', 'ad 4', 'ad 5', 'ad 6', 'ad 7', 'ad 8')
volume_metric <- c(12321, 12321, 1232121, 4343, 14333, 52323, 234532, 2322)
kpi_metric <- c(12, 32, 111, 334, 653, 343, 232, 212)
# The df
df <- tibble(rank, volume_metric, kpi_metric)
# A tibble: 8 x 3
  rank  volume_metric kpi_metric
  <chr>         <dbl>      <dbl>
1 ad 1          12321         12
2 ad 2          12321         32
3 ad 3        1232121        111
4 ad 4           4343        334
5 ad 5          14333        653
6 ad 6          52323        343
7 ad 7         234532        232
8 ad 8           2322        212
# z-test function
zTest <- function(volume1, volume2, kpi1, kpi2) {
  z_test <- prop.test(
    x = c(kpi1, kpi2),
    n = c(volume1, volume2),
    alternative = "greater",
    conf.level = 0.95,
    correct = FALSE
  )
  p_value <- z_test$p.value
  return(p_value)
}
So far I have been able to get all of the rank combinations using
possible_combinations <- combn(nrow(df), 2)
which returns a matrix with all of the combos (the number of ranks will always be the same as nrow(df)).
I tried to loop through that matrix and subset df for each pair, but that resulted in a never-ending loop 🤦🏻‍♂️.
My question is: how do I use that matrix of combos to index into my df and apply the zTest function, or am I thinking about this all wrong?
Upvotes: 0
Views: 38
Reputation: 388982
combn accepts a function, so you can pass each pair of row numbers to it, subset the corresponding volume_metric and kpi_metric values from df, and pass them to the zTest function.
zTest <- function(volume, kpi) {
  z_test <- prop.test(
    x = kpi,
    n = volume,
    alternative = "greater",
    conf.level = 0.95,
    correct = FALSE
  )
  p_value <- z_test$p.value
  return(p_value)
}
do.call(rbind, combn(nrow(df), 2, function(x)
  data.frame(row1 = x[1], row2 = x[2],
             cor = zTest(df$volume_metric[x], df$kpi_metric[x])),
  simplify = FALSE))
# row1 row2 cor
#1 1 2 9.987e-01
#2 1 3 4.628e-23
#3 1 4 1.000e+00
#4 1 5 1.000e+00
#5 1 6 1.000e+00
#6 1 7 5.209e-01
#7 1 8 1.000e+00
#...
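The result above is in long format, while the question asked for a cor()-style matrix. One possible way to get that shape is to loop over the columns of the combn() matrix and fill a square matrix, as in the rough sketch below. The names p_matrix and pairs are illustrative and not from the original post, and the data is rebuilt with data.frame() only so the snippet runs with base R. Because the test is one-sided, both directions of each pair are computed: entry [i, j] tests whether row i's proportion is greater than row j's.

```r
# Rebuild the example data with base R so the sketch is self-contained.
df <- data.frame(
  rank          = paste("ad", 1:8),
  volume_metric = c(12321, 12321, 1232121, 4343, 14333, 52323, 234532, 2322),
  kpi_metric    = c(12, 32, 111, 334, 653, 343, 232, 212)
)

# Same one-sided two-sample proportion test as in the answer above.
zTest <- function(volume, kpi) {
  prop.test(x = kpi, n = volume, alternative = "greater",
            conf.level = 0.95, correct = FALSE)$p.value
}

n <- nrow(df)
labels <- as.character(df$rank)

# Square matrix of p-values; the diagonal stays NA (a row is not tested against itself).
p_matrix <- matrix(NA_real_, nrow = n, ncol = n,
                   dimnames = list(labels, labels))

# Each column of the combn() matrix is one pair of row indices (i, j) with i < j.
pairs <- combn(n, 2)
for (k in seq_len(ncol(pairs))) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  # Fill both directions, since "greater" makes the test asymmetric.
  p_matrix[i, j] <- zTest(df$volume_metric[c(i, j)], df$kpi_metric[c(i, j)])
  p_matrix[j, i] <- zTest(df$volume_metric[c(j, i)], df$kpi_metric[c(j, i)])
}

round(p_matrix, 4)
```

If only one direction is needed, dropping the second assignment leaves the lower triangle as NA.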
Upvotes: 2