Reputation: 1040
I have data such as this.
data.sample <- read_table2('score_label treatment score data1 data2 data3
A treatment 1 1 t yt
A treatment 2 1 t yt
A treatment 3 5 f yt
B treatment 1 5 f yt
B treatment 2 5 f yt
B treatment 3 5.5 g yt
B treatment 4 6.8 t yt
C treatment 1 9.4 t yt
C treatment 2 10.7 f yt
C treatment 3 12 j yt
C treatment 4 13.3 t yt
C control 1 14.6 t yt
C control 3 18.5 k yt
C control 4 19.8 t yt')
I would like to create a data frame like the one below, where every score_label-treatment group has a score running from 1 to 4, and where 0 is filled into the cells of any score that was not previously present.
output<- read_table2('score_label treatment score data1 data2 data3
A treatment 1 1 t yt
A treatment 2 1 t yt
A treatment 3 5 f yt
A treatment 4 0 0 0
B treatment 1 5 f yt
B treatment 2 5 f yt
B treatment 3 5.5 g yt
B treatment 4 6.8 t yt
C treatment 1 9.4 t yt
C treatment 2 10.7 f yt
C treatment 3 12 j yt
C treatment 4 13.3 t yt
C control 1 14.6 t yt
C control 2 0 0 0
C control 3 18.5 k yt
C control 4 19.8 t yt')
I thought of doing this to create a new score column, but it's not working how I hoped it would. Any suggestions appreciated!!
data.sample %>%
  group_by(score_label, treatment) %>%
  mutate(new_score = seq(4))
Upvotes: 2
Views: 192
Reputation: 887118
We can use complete() with its fill argument.
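library(dplyr)
library(tidyr)
data.sample %>%
  group_by(score_label, treatment) %>%
  # data2 and data3 are character columns, so they are filled with '0';
  # recent tidyr versions require the fill value to match the column type
  complete(score = unique(data.sample$score),
           fill = list(data1 = 0, data2 = '0', data3 = '0'))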
If there are many columns to fill, the fill values can be constructed as a named list.
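nm1 <- names(data.sample)[startsWith(names(data.sample), 'data')]
# use 0 for numeric columns and '0' for character ones, so each fill
# value matches the type of the column it goes into
fillcols <- lapply(data.sample[nm1], function(x) if (is.numeric(x)) 0 else '0')
data.sample %>%
  group_by(score_label, treatment) %>%
  complete(score = unique(data.sample$score), fill = fillcols)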
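As a possible variant, the same result can probably be reached without group_by() by wrapping the grouping columns in tidyr's nesting(), which restricts the expansion to the score_label/treatment pairs that already occur in the data. The sketch below is only an illustration: `toy` is a hypothetical cut-down copy of the question's data, built inline so the block runs on its own, and `filled` is just a name chosen here.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset of the question's data (two groups, each missing one score)
toy <- tibble::tribble(
  ~score_label, ~treatment,  ~score, ~data1, ~data2, ~data3,
  "A",          "treatment", 1,      1,      "t",    "yt",
  "A",          "treatment", 2,      1,      "t",    "yt",
  "A",          "treatment", 3,      5,      "f",    "yt",
  "C",          "control",   1,      14.6,   "t",    "yt",
  "C",          "control",   3,      18.5,   "k",    "yt",
  "C",          "control",   4,      19.8,   "t",    "yt"
)

filled <- toy %>%
  # nesting() keeps only the observed score_label/treatment combinations,
  # then each of them is crossed with the scores 1 to 4
  complete(nesting(score_label, treatment),
           score = c(1, 2, 3, 4),
           # fill values are chosen to match each column's type
           fill = list(data1 = 0, data2 = "0", data3 = "0"))

print(filled)
```

With this toy input, each of the two groups should end up with four rows, and the rows that were added (A/treatment score 4 and C/control score 2) carry the fill values instead of NA.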
Upvotes: 3