Analysis of final outcome

Question

Here is sample data:

df <- data.frame(
  group = rep(1:5, rep(2, 5)),
  value = c(0, -150, 0, 50, 0, -120, 0, 30, 0, -20),
  flag1 = floor(runif(10)),  # always 0, because runif() draws from (0, 1)
  flag2 = rep(rbinom(5, 1, .5), rep(2, 5)),
  flag3 = rep(rbinom(5, 1, .5), rep(2, 5))
)

Each group starts with a value of 0, and the second row per group is the terminal value, which can be positive or negative.

For example group 1:

group value flag1 flag2 flag3
    1     0     0     0     0
    1  -150     0     0     0

I would like to find out which combinations of the values of flag1-flag3 lead to a negative terminal value and which lead to a positive one. The example above would indicate that having flag1-flag3 all at 0 in the starting state (row 1) leads to a negative outcome (row 2). I would like to obtain this association per group and overall.

coffeinjunky · Accepted Answer

Consider the following as an example. I group by every observed combination of flag1-flag3 and calculate the share of positive and negative terminal values within each combination.

library(dplyr)

# remove the starting rows (value == 0), keeping only the terminal values:
df <- df %>% filter(value != 0)

# get all combinations of flag1-flag3 by grouping on them,
# and then calculate the distribution:
df %>% group_by(flag1, flag2, flag3) %>% summarise(pos = mean(value > 0),
                                                   neg = mean(value < 0))
Source: local data frame [4 x 5]
Groups: flag1, flag2 [?]

  flag1 flag2 flag3   pos   neg
1     0     0     0   0.0   1.0
2     0     0     1   0.5   0.5
3     0     1     0   1.0   0.0
4     0     1     1   0.0   1.0

If you are looking for regression coefficients instead, you would probably want something like

lm(value > 0 ~ flag1 + flag2 + flag3, data = df)

I am not sure this is what you were asking for, though; I am just adding it in case.
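As a rough sketch of how such a fit could be read: the snippet below uses only the five terminal rows, with flag values fixed by hand. That is an assumption, chosen to be consistent with the table above, since the original flags are drawn at random. The response is written as as.numeric(value > 0) to make the 0/1 coding explicit, so each coefficient is a difference in the estimated probability of a positive outcome (a linear probability model).

```r
# Terminal rows only. The flag values are fixed by hand (an assumption),
# chosen to be consistent with the grouped table shown in this answer.
terminal <- data.frame(
  value = c(-150, 50, -120, 30, -20),
  flag1 = 0,
  flag2 = c(0, 1, 0, 0, 1),
  flag3 = c(0, 0, 1, 1, 1)
)

# Linear probability model: the response is 1 for a positive terminal value.
fit <- lm(as.numeric(value > 0) ~ flag1 + flag2 + flag3, data = terminal)

# flag1 never varies in this data, so its coefficient cannot be estimated
# and is reported as NA.
coef(fit)
```

With only five observations the estimates are not meaningful in themselves; the point is just how the output is laid out and why a constant flag yields NA.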

As a side note, you could also get the table of shares above with the built-in function ftable, but I usually prefer dplyr because it returns a tibble, which is easy to work with.
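For reference, a base R version might look like the sketch below. It is one possible way to do it, not necessarily what the author had in mind: it counts outcomes with xtabs, turns the counts into shares with prop.table, and uses ftable only to flatten the result for printing. The flag values are again fixed by hand (an assumption) so that the result is reproducible, and the names terminal, sign, counts and props are illustrative.

```r
# Terminal rows only. The flag values are fixed by hand (an assumption),
# chosen to be consistent with the dplyr output shown in this answer.
terminal <- data.frame(
  value = c(-150, 50, -120, 30, -20),
  flag1 = 0,
  flag2 = c(0, 1, 0, 0, 1),
  flag3 = c(0, 0, 1, 1, 1)
)
terminal$sign <- factor(ifelse(terminal$value > 0, "pos", "neg"),
                        levels = c("pos", "neg"))

# Count outcomes per flag combination, then convert the counts to shares
# within each combination of flag1, flag2 and flag3.
counts <- xtabs(~ flag1 + flag2 + flag3 + sign, data = terminal)
props  <- prop.table(counts, margin = 1:3)

# ftable() flattens the 4-way table into one row per flag combination.
ftable(props)
```

Unlike the dplyr version, xtabs crosses all observed levels of the flags, so a combination that never occurs in the data would show up as a row of NaN rather than being left out.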
