Reputation: 10538
I have a dataset that looks like this:
df <- data.frame(
x = c(rep("A", 3), rep("B", 2)),
y = c(1, 2, 6, 8, 3)
)
I need to (un)tidy it so that it looks like this:
df_new <- data.frame(
A = c(1, 2, 6),
B = c(8, 3, NA)
)
tidyr::spread threw duplicate value errors.
Upvotes: 1
Views: 173
Reputation: 887118
We can do this in base R: use unstack to create a list, pad each list element with NA at the end so that all elements have the same length, and then convert the result to a data.frame
lst <- unstack(df, y~x)
data.frame(lapply(lst, `length<-`, max(lengths(lst))))
# A B
#1 1 8
#2 2 3
#3 6 NA
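To make the padding step easier to follow, here is a minimal sketch of what `length<-` does on its own. The vector `v` and the names `padded` and `padded_lst` are hypothetical, introduced only for illustration; the rest reuses the question's data.

```r
# Hypothetical vector, used only to illustrate the padding step
v <- c(8, 3)

# `length<-` called as a regular function: extend v to length 3,
# the newly created position is filled with NA
padded <- `length<-`(v, 3)

# The same step applied to every element of the unstacked list
df <- data.frame(x = c(rep("A", 3), rep("B", 2)), y = c(1, 2, 6, 8, 3))
lst <- unstack(df, y ~ x)
padded_lst <- lapply(lst, `length<-`, max(lengths(lst)))
```

Once every element has the same length, data.frame() can bind them as columns without recycling errors.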
Or, if we are using a package, a compact option is stri_list2matrix from stringi
library(stringi)
stri_list2matrix(split(df$y, df$x))
The output is a character matrix, which can be converted to numeric
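One possible way to do that conversion is sketched below, assuming the stringi package is installed. The names `groups` and `m` are hypothetical helpers, and the column names are set explicitly rather than relying on the matrix to carry them.

```r
library(stringi)

df <- data.frame(x = c(rep("A", 3), rep("B", 2)), y = c(1, 2, 6, 8, 3))

groups <- split(df$y, df$x)        # named list with elements A and B
m <- stri_list2matrix(groups)      # character matrix, one column per group
storage.mode(m) <- "double"        # convert the strings back to numbers, keeping dims
colnames(m) <- names(groups)       # set the column names explicitly
df_new <- as.data.frame(m)
```

Shorter groups stay padded with NA after the conversion, so the result has the shape asked for in the question.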
Upvotes: 1
Reputation: 10538
Using dplyr with tidyr::complete and tidyr::spread:
library(dplyr)
library(tidyr)
df_new <- df %>%
group_by(x) %>%
mutate(index = row_number()) %>%
complete(index = 1:max(index)) %>%
spread(x, y, fill = NA) %>%
select(-index)
Upvotes: 0
Reputation: 145775
tidyr (to my knowledge) won't let you do this without an ID column. So we'll add that first and then spread:
library(dplyr)
library(tidyr)
df %>% group_by(x) %>%
mutate(id = 1:n()) %>%
spread(key = x, value = y, fill = NA)
# # A tibble: 3 x 3
# id A B
# * <int> <dbl> <dbl>
# 1 1 1 8
# 2 2 2 3
# 3 3 6 NA
You can, of course, remove the id column at the end if you prefer.
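As a sketch of that last step, the version below drops the helper column with select(). It also swaps spread for pivot_wider, which superseded spread in tidyr 1.0.0; that substitution is an assumption about your tidyr version, not part of the answer above.

```r
library(dplyr)
library(tidyr)

df <- data.frame(x = c(rep("A", 3), rep("B", 2)), y = c(1, 2, 6, 8, 3))

df_new <- df %>%
  group_by(x) %>%
  mutate(id = row_number()) %>%   # per-group index that identifies each output row
  ungroup() %>%
  pivot_wider(names_from = x, values_from = y) %>%  # cells with no value become NA
  select(-id)                     # drop the helper column
```

The id column is what makes each output row unique, which is why reshaping fails without it.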
Upvotes: 3