J. Doe.
J. Doe.

Reputation: 1305

Long to wide with no unique key

I have this dataset:

VAR=       c('X1','X1','X1','X1','X2','X2','X2','X3','X3','X3','X3','X3')
Ranking=   c(1,2.5 ,2.5   ,1.5  ,1.5  ,NA, 1   ,NA    ,NA   ,1.5  ,1.5  ,3)
df<-data.frame(VAR,Ranking)

For which even if I add a unique identifier and spread

df$row <- 1:nrow(df)
df_wide<-spread(df, VAR, Ranking)
df_wide<-df_wide[,-1]

I don't get what I am after.

This is what I get

But what I need is:

enter image description here

How do I accomplish this?

Upvotes: 0

Views: 148

Answers (2)

Wietze314
Wietze314

Reputation: 6020

You can do in R base:

spl <- split(df, df$VAR)
n <- max(sapply(spl, nrow))
do.call(cbind, lapply(spl, function(x) {
  x <- x[!is.na(x$Ranking),'Ranking']
  length(x) <- n
  x}))

Upvotes: 3

r2evans
r2evans

Reputation: 160607

Your approach is close, but by making Row unique across all rows, you are guaranteeing that those rows will not really "join" the way you intend. Instead, number them within VAR and then spread:

library(dplyr)
library(tidyr)
group_by(df, VAR) %>%
  mutate(Row = row_number()) %>%
  ungroup() %>%
  spread(VAR, Ranking)
# # A tibble: 5 × 4
#     Row    X1    X2    X3
# * <int> <dbl> <dbl> <dbl>
# 1     1   1.0   1.5    NA
# 2     2   2.5    NA    NA
# 3     3   2.5   1.0   1.5
# 4     4   1.5    NA   1.5
# 5     5    NA    NA   3.0

Upvotes: 1

Related Questions