Reputation: 1305
I have this dataset:
VAR= c('X1','X1','X1','X1','X2','X2','X2','X3','X3','X3','X3','X3')
Ranking= c(1,2.5 ,2.5 ,1.5 ,1.5 ,NA, 1 ,NA ,NA ,1.5 ,1.5 ,3)
df<-data.frame(VAR,Ranking)
For which even if I add a unique identifier and spread
df$row <- 1:nrow(df)
df_wide<-spread(df, VAR, Ranking)
df_wide<-df_wide[,-1]
I don't get what I am after.
But what I need is:
How do I accomplish this?
Upvotes: 0
Views: 148
Reputation: 6020
You can do in R base:
spl <- split(df, df$VAR)
n <- max(sapply(spl, nrow))
do.call(cbind, lapply(spl, function(x) {
x <- x[!is.na(x$Ranking),'Ranking']
length(x) <- n
x}))
Upvotes: 3
Reputation: 160607
Your approach is close, but by making Row
unique across all rows, you are guaranteeing that those rows will not really "join" the way you intend. Instead, number them within VAR
and then spread:
library(dplyr)
library(tidyr)
group_by(df, VAR) %>%
mutate(Row = row_number()) %>%
ungroup() %>%
spread(VAR, Ranking)
# # A tibble: 5 × 4
# Row X1 X2 X3
# * <int> <dbl> <dbl> <dbl>
# 1 1 1.0 1.5 NA
# 2 2 2.5 NA NA
# 3 3 2.5 1.0 1.5
# 4 4 1.5 NA 1.5
# 5 5 NA NA 3.0
Upvotes: 1