Reputation: 47
In R, I would like to select certain strings (A2, D1) from multiple columns (dx1-dx3) and put them in a column in a long format by id while keeping another column (i.e., Time). Below is the data frame.
id dx1 dx2 dx3 Time
1 A2 D1 1
1 B1 A1 D1 2
2 D1 1
2 A3 D1 2
2 A2 D1 3
The format I want is as follows:
id name value Time
1 dx1 A2 1
1 dx3 D1 1
1 dx3 D1 2
2 dx1 D1 1
2 dx2 D1 2
2 dx1 A2 3
2 dx2 D1 3
I tried d %>% select(id, contains('dx'), Time) %>% pivot_longer(cols = -c('id')) %>% filter(str_detect(value, 'A2|D1')). I got the results without the Time column. Do you have any suggestions on how to keep the Time column?
Upvotes: 0
Views: 48
Reputation: 287
Try reshape2::melt
and then subset the rows:
library(reshape2)
df = data.frame(id=c(1,1,2,2,2),
dx1=c('A2','B1','D1','A3','A2'),
dx2=c(NA, 'A1', NA, 'D1', 'D1'),
dx3=c('D1','D1',NA, NA, NA),
Time=c(1,2,1,2,3))
df2 = melt(df, measure.vars=c("dx1","dx2","dx3"), id.vars=c("id","Time"), variable.name="name")
df2 = df2[df2$value %in% c('A2','D1'),]
df2
id Time name value
1 1 1 dx1 A2
3 2 1 dx1 D1
5 2 3 dx1 A2
9 2 2 dx2 D1
10 2 3 dx2 D1
11 1 1 dx3 D1
12 1 2 dx3 D1
Upvotes: 1