LLL

Reputation: 743

Inserting rows based on two factor levels in R

I've got a data frame (df) with four variables, two of which are factors: var1 and var2. Each of them has three levels.

Some combinations of var1 and var2 are not present in the data frame, e.g. there is no row with var2 level "4 or 5" for var1 level "slow".

I'd like to add a row for each missing combination, so that I end up with dfgoal, and set var3_freq and var4_n in those rows to 0.

I find adding rows tricky at the best of times, and have no idea how to achieve this. Any help would be much appreciated!

# Starting point 
df <- data.frame(var1=c("fast","fast","fast","medium","slow","slow"),
                 var2=c("1 or 2","3","4 or 5","3","1 or 2","3"),
                 var3_freq=c(22,56,22,100,36,64),
                 var4_n=c(10,26,10,2,5,9))
df$var1 <- as.factor(df$var1)
df$var2 <- as.factor(df$var2)

# Goal
dfgoal <- data.frame(var1=c("fast","fast","fast","medium","medium","medium","slow","slow","slow"),
                     var2=c("1 or 2","3","4 or 5","1 or 2","3","4 or 5","1 or 2","3","4 or 5"),
                     var3_freq=c(22,56,22,0,100,0,36,64,0),
                     var4_n=c(10,26,10,0,2,0,5,9,0))


Answers (1)

Roman

Reputation: 4989

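A simple solution without loading external packages is to build the missing rows by hand and append them with rbind(). The new rows end up at the bottom of the data frame: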

    var1   var2 var3_freq var4_n
1   fast 1 or 2        22     10
2   fast      3        56     26
3   fast 4 or 5        22     10
4 medium      3       100      2
5   slow 1 or 2        36      5
6   slow      3        64      9
7 medium 1 or 2         0      0
8 medium 4 or 5         0      0
9   slow 4 or 5         0      0

Code

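# The three missing var1/var2 combinations, written out by hand
new <- data.frame(var1 = c("medium", "medium", "slow"),
                  var2 = c("1 or 2", "4 or 5", "4 or 5"),
                  var3_freq = c(0, 0, 0),
                  var4_n = c(0, 0, 0))
# Append them to the original data frame
rbind(df, new)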

Data

df <- data.frame(var1=c("fast","fast","fast","medium","slow","slow"),
                 var2=c("1 or 2","3","4 or 5","3","1 or 2","3"),
                 var3_freq=c(22,56,22,100,36,64),
                 var4_n=c(10,26,10,2,5,9))
df$var1 <- as.factor(df$var1)
df$var2 <- as.factor(df$var2)    
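
If you would rather not type the missing combinations by hand, something along these lines might work as a more general base-R sketch. It assumes that var1 and var2 are factors whose levels cover every combination you want, and that var3_freq and var4_n are the only columns that need filling. The names grid, full and value_cols are just placeholders I made up for the example.

```r
# Same data as above, repeated so the example runs on its own
df <- data.frame(var1 = factor(c("fast","fast","fast","medium","slow","slow")),
                 var2 = factor(c("1 or 2","3","4 or 5","3","1 or 2","3")),
                 var3_freq = c(22,56,22,100,36,64),
                 var4_n = c(10,26,10,2,5,9))

# Every combination of the two factors' levels (3 x 3 = 9 rows)
grid <- expand.grid(var1 = levels(df$var1), var2 = levels(df$var2))

# Left join: keep all combinations, NA where df has no matching row
full <- merge(grid, df, by = c("var1", "var2"), all.x = TRUE)

# Replace the NAs in the value columns with 0
value_cols <- c("var3_freq", "var4_n")
full[value_cols][is.na(full[value_cols])] <- 0

# Sort by var1, then var2, and reset the row names
full <- full[order(full$var1, full$var2), ]
rownames(full) <- NULL
full
```

Unlike the rbind() version, this should keep working if the data changes, because the missing rows are derived from the factor levels rather than hard-coded.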
