How to reshape a dataframe from wide to long with columns that represent duplicated variables?

Question

I have the following dataframe, I would like to 'transfer' the values of columns from v3 to v10 in the first two columns v1 and v2, respectively, since they are duplicated variables and their values belong to the first two variables (v1 and v2).

df <- data.frame(id = 1:5,
              v1 = c("infection", "anxiety", "pain", "infection", "aspiration"),
              v2 = c(31, 55, 33, 31, 24),
              v3 = c("aspiration", NA, "ulcer", NA, "pain"),
              v4 = c(26, NA, 43, NA, 20),
              v5 = c("skin", NA, NA, NA, NA),
              v6 = c(25, NA, NA, NA, NA),
              v7 = c("ventilation", NA, NA, NA, NA),
              v8 = c(8, NA, NA, NA, NA),
              v9 = c("gas", NA, NA, NA, NA),
              v10 = c(4, NA, NA, NA, NA))

I would like to have as output this new df:

out <- data.frame(id = c(1,1,1,1,1,2,3,3,4,5,5),
              v1 = c("infection", "aspiration", "skin", "ventilation", "gas", "anxiety", "pain", "ulcer", "infection", "aspiration", "pain"),
              v2 = c(31, 26, 25, 8, 4, 55, 33, 43, 31, 24, 20))

Thank you so much for your help!

dash2 · Accepted Answer

This is ugly and I'm not sure how it works, but it does:

library(dplyr)
library(tidyr)

# make life easier for tidyr
names(df)[seq(2, 11, 2)]  <- paste0("cond_", 1:5)
names(df)[seq(3, 11, 2)]  <- paste0("var_", 1:5)

# stops tidyr complaining; if the numerics of var matter, you can change them back later
df <- df %>% mutate(across(starts_with("var"), as.character))

df %>% 
      pivot_longer(-id, names_to = c("var", "v2"), names_sep = "_") %>%
      pivot_wider(c(id, v2), names_from = var) %>% 
      na.omit()

How to reshape a dataframe from wide to long with columns that represent duplicated variables?

Answers (2)

Related Questions