I would like to create new columns that extract the 'b' and 'c' rows from type and put their values in those columns. There are other columns in between that should be kept; I have just named them random.
id <- c("1", "1", "1", "1","2", "2", "2", "2", "3", "3", "3", "3")
type <- c("a", "a", "b", "c", "a", "a", "b", "c", "a", "a", "b", "c")
random <- c("random")
value <- c("1", "2", "50", "100", "4", "5", "55", "110", "2.5", "3", "53", "105")
df <- data.frame(id, type, random, value)
   id type random value
1   1    a random     1
2   1    a random     2
3   1    b random    50
4   1    c random   100
5   2    a random     4
6   2    a random     5
7   2    b random    55
8   2    c random   110
9   3    a random   2.5
10  3    a random     3
11  3    b random    53
12  3    c random   105
What I would like to have is:
  id2 type2 random value2  b   c
1   1     a random      1 50 100
2   1     a random      2 50 100
3   2     a random      4 55 110
4   2     a random      5 55 110
5   3     a random    2.5 53 105
6   3     a random      3 53 105
Would very much appreciate some thoughts! Best,
Upvotes: 1
Views: 43
Reputation: 887851
We could group by 'id', create the 'b' column by extracting the 'value' where 'type' is 'b' (assuming a single 'b' value per group), ungroup, and then remove the rows where 'type' is 'b':
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(b = value[type == 'b']) %>%
  ungroup() %>%
  filter(type != 'b')
# A tibble: 4 x 5
# id type keepthis value b
# <fct> <fct> <fct> <fct> <fct>
#1 1 a keep 4 95
#2 1 a keep 5 95
#3 2 a keep 3 94
#4 2 a keep 5 94
Based on the updated dataset, we could filter the rows where 'type' is 'b' or 'c', reshape them to 'wide' format with pivot_wider, and then left_join the result with the original dataset filtered to only 'type' 'a':
library(tidyr)
df %>%
  filter(type %in% c('b', 'c')) %>%
  pivot_wider(names_from = type, values_from = value) %>%
  left_join(df %>%
              filter(type == 'a'))
# A tibble: 6 x 6
# id random b c type value
#* <fct> <fct> <fct> <fct> <fct> <fct>
#1 1 random 50 100 a 1
#2 1 random 50 100 a 2
#3 2 random 55 110 a 4
#4 2 random 55 110 a 5
#5 3 random 53 105 a 2.5
#6 3 random 53 105 a 3
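As a point of comparison, the grouped mutate idea from the first snippet can also be stretched to cover both 'b' and 'c' without reshaping. The following is only a minimal sketch on the example data from the question; it assumes exactly one 'b' row and one 'c' row per 'id', and the name res is just a placeholder for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  id     = c("1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3"),
  type   = c("a", "a", "b", "c", "a", "a", "b", "c", "a", "a", "b", "c"),
  random = "random",
  value  = c("1", "2", "50", "100", "4", "5", "55", "110", "2.5", "3", "53", "105")
)

res <- df %>%
  group_by(id) %>%
  # Assumption: each id has exactly one 'b' and one 'c' row, so each
  # extracted value has length 1 and is recycled across the group
  mutate(b = value[type == "b"],
         c = value[type == "c"]) %>%
  ungroup() %>%
  # Keep only the 'a' rows, now carrying the b and c values alongside
  filter(type == "a")

res
```

If an 'id' could have zero or several 'b' or 'c' rows, this version would fail inside mutate, and the pivot_wider plus left_join route above is probably the safer choice.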
With the updated dataset, the code would be:
df %>%
  filter(scale_id %in% c('IM', 'RT')) %>%
  select(Title, Task, task_id, scale_id, data_value) %>%
  pivot_wider(names_from = scale_id, values_from = data_value) %>%
  left_join(df %>%
              filter(!scale_id %in% c('IM', 'RT')) %>%
              group_by(task_id) %>%
              slice(which.max(data_value)))
Upvotes: 1