Reputation: 7725
What is the tidyverse replacement for reshape() in this example? I want the wide version to take the name of the round: v2.1 and v2.2. I thought it should be gather(), but I haven't figured it out.
library(tidyverse)
r1 <- data.frame(id=c(1, 2, 3),
v1=c(1, 1, 0),
v2=c(0, 1, 1),
round=c(1, 1, 1))
r2 <- data.frame(id=c(1, 2, 3),
v2=c(1, 0, 0),
round=c(2, 2, 2))
r12 <- bind_rows(r1, r2)
r12w <- reshape(r12,
timevar = "round",
v.names = "v2",
idvar = "id",
direction = "wide")
r12w
# id v1 v2.1 v2.2
#1 1 1 0 1
#2 2 1 1 0
#3 3 0 1 0
Updated example with unbalanced rows across datasets.
r1 <- data.frame(id=c(1, 2, 3, 4),
v1=c(1, 1, 0, 0),
v2=c(0, 1, 1, 1),
round=c(1, 1, 1, 1))
r2 <- data.frame(id=c(1, 2, 3),
v2=c(1, 0, 0),
round=c(2, 2, 2))
This mimics a panel survey where some people are not found or refuse in later rounds. Here, person 4 is in r1, but not in r2. We want to keep this person in the final dataset, but with an NA value for v2 in round 2. Here is the desired output. I am looking for a tidyverse approach to go from r1 and r2 to this output.
# id v1 v2.1 v2.2
#1 1 1 0 1
#2 2 1 1 0
#3 3 0 1 0
#4 4 0 1 NA
Upvotes: 0
Views: 157
Reputation: 1037
I am not sure I fully understand what you want but here is an attempt:
library(dplyr)
full_join(r1, r2, by = "id", suffix = c(".1", ".2")) %>%
select(-starts_with("round"))
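As a quick check, here is a self-contained sketch that runs the same join on the unbalanced example from the question. The object name joined is only for illustration. Because full_join() keeps unmatched rows from both sides, id 4 should survive with NA in v2.2.

```r
library(dplyr)

# Unbalanced example from the question: id 4 appears only in round 1
r1 <- data.frame(id = c(1, 2, 3, 4),
                 v1 = c(1, 1, 0, 0),
                 v2 = c(0, 1, 1, 1),
                 round = c(1, 1, 1, 1))
r2 <- data.frame(id = c(1, 2, 3),
                 v2 = c(1, 0, 0),
                 round = c(2, 2, 2))

# v2 and round exist in both data frames, so they receive the suffixes;
# the two suffixed round columns are then dropped
joined <- full_join(r1, r2, by = "id", suffix = c(".1", ".2")) %>%
  select(-starts_with("round"))
joined
#   id v1 v2.1 v2.2
# 1  1  1    0    1
# 2  2  1    1    0
# 3  3  0    1    0
# 4  4  0    1   NA
```

Note that this relies on there being exactly two rounds; with more rounds the suffix trick would need repeated joins.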
Upvotes: 1
Reputation: 887501
We create the missing column in 'r2' before doing the bind_rows by assigning that column from 'r1'. For this, we can use setdiff to get the column that is found in 'r1' and not in 'r2'. Then, we paste the string 'v2.' with the 'round' column and spread to 'wide' format.
nm1 <- setdiff(names(r1), names(r2))
r2[nm1] <- r1[nm1]
bind_rows(r1, r2) %>%
mutate(round = paste0("v2.", round)) %>%
spread(round, v2)
# id v1 v2.1 v2.2
#1 1 1 0 1
#2 2 1 1 0
#3 3 0 1 0
NOTE: Here, we are assuming that the datasets have the same number of rows, with the ids in the same order, because the column from 'r1' is assigned to 'r2' by position.
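For the unbalanced case in the updated question, one possible variation is sketched below. It assumes tidyr 1.0.0 or later, where pivot_wider() supersedes spread(), and it assumes that the columns found only in 'r1' are constant per id, so they can be joined back by id instead of being copied by position. The names id_vars and res are only for illustration.

```r
library(dplyr)
library(tidyr)

# Unbalanced example from the question: id 4 appears only in round 1
r1 <- data.frame(id = c(1, 2, 3, 4),
                 v1 = c(1, 1, 0, 0),
                 v2 = c(0, 1, 1, 1),
                 round = c(1, 1, 1, 1))
r2 <- data.frame(id = c(1, 2, 3),
                 v2 = c(1, 0, 0),
                 round = c(2, 2, 2))

# Columns found in r1 but not in r2 (here "v1"), kept with the id key
nm1 <- setdiff(names(r1), names(r2))
id_vars <- r1[c("id", nm1)]

res <- bind_rows(r1, r2) %>%
  select(id, round, v2) %>%
  # one row per id; a missing id/round combination becomes NA
  pivot_wider(names_from = round, values_from = v2, names_prefix = "v2.") %>%
  # attach the r1-only columns by id rather than by row position
  left_join(id_vars, by = "id") %>%
  select(id, all_of(nm1), everything())
res
```

The result is a tibble with columns id, v1, v2.1 and v2.2, where id 4 has NA in v2.2. Joining by id avoids the equal-row-count assumption, at the cost of one extra join.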
Upvotes: 1