Reputation: 27
I have the following dataframe, I would like to 'transfer' the values of columns from v3 to v10 in the first two columns v1 and v2, respectively, since they are duplicated variables and their values belong to the first two variables (v1 and v2).
df <- data.frame(id = 1:5,
v1 = c("infection", "anxiety", "pain", "infection", "aspiration"),
v2 = c(31, 55, 33, 31, 24),
v3 = c("aspiration", NA, "ulcer", NA, "pain"),
v4 = c(26, NA, 43, NA, 20),
v5 = c("skin", NA, NA, NA, NA),
v6 = c(25, NA, NA, NA, NA),
v7 = c("ventilation", NA, NA, NA, NA),
v8 = c(8, NA, NA, NA, NA),
v9 = c("gas", NA, NA, NA, NA),
v10 = c(4, NA, NA, NA, NA))
I would like to have as output this new df:
out <- data.frame(id = c(1,1,1,1,1,2,3,3,4,5,5),
v1 = c("infection", "aspiration", "skin", "ventilation", "gas", "anxiety", "pain", "ulcer", "infection", "aspiration", "pain"),
v2 = c(31, 26, 25, 8, 4, 55, 33, 43, 31, 24, 20))
Thank you so much for your help!
Upvotes: 1
Views: 54
Reputation: 389055
Renaming column would be better since it will clearly identify which column belongs to which group. You can then use pivot_longer
directly by using names_pattern
argument.
library(dplyr)
library(tidyr)
df %>%
rename_with(~paste(., c('v1', 'v2'), sep = '_'), -1) %>%
pivot_longer(cols = -id,
names_to = '.value',
names_pattern = '.*_(\\w+)',
values_drop_na = TRUE) -> out
out
# id v1 v2
# <int> <chr> <dbl>
# 1 1 infection 31
# 2 1 aspiration 26
# 3 1 skin 25
# 4 1 ventilation 8
# 5 1 gas 4
# 6 2 anxiety 55
# 7 3 pain 33
# 8 3 ulcer 43
# 9 4 infection 31
#10 5 aspiration 24
#11 5 pain 20
Upvotes: 3
Reputation: 2262
This is ugly and I'm not sure how it works, but it does:
library(dplyr)
library(tidyr)
# make life easier for tidyr
names(df)[seq(2, 11, 2)] <- paste0("cond_", 1:5)
names(df)[seq(3, 11, 2)] <- paste0("var_", 1:5)
# stops tidyr complaining; if the numerics of var matter, you can change them back later
df <- df %>% mutate(across(starts_with("var"), as.character))
df %>%
pivot_longer(-id, names_to = c("var", "v2"), names_sep = "_") %>%
pivot_wider(c(id, v2), names_from = var) %>%
na.omit()
Upvotes: 1