Laura

Reputation: 113

Reshape data frame in R: rows to columns

There are 3 columns in the original data frame: id, type and rank. I want to create a new data frame in which each possible value of type becomes its own column (see the small example below; the original data contains more than 100,000 rows and 30 types).

data1
id  type  rank
x   a     1
y   a     2
z   a     3
x   b     1
z   b     2
y   c     1     

data2
id  a   b   c
x   1   1   NA
y   2   NA  1
z   3   2   NA

That's what I have done so far:

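for (i in seq_len(nrow(data1))) {
  id    <- as.character(data1[i, 1])
  dtype <- as.character(data1[i, 2])  # name of the target column in data2
  if (any(data2$id == id, na.rm = TRUE)) {
    # id already present in data2: fill in the column for this type
    row <- which(data2$id == id)      # exact match on id
    data2[row, dtype] <- data1[i, 3]
  } else {
    # new id: append a row, then fill in the column for this type
    data2[nrow(data2) + 1, 1] <- id
    data2[nrow(data2), dtype] <- data1[i, 3]
  }
}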

This works (I hope this example explains what I am doing), but it is quite slow. Do you have any hints on how I can speed it up?

Upvotes: 0

Views: 1043

Answers (3)

Andrew Taylor

Reputation: 3488

Here's an example from the tidyr package.

library("tidyr")
library("dplyr")
data2 <- data1 %>% spread(type, rank)
data2

  id a  b  c
1  x 1  1 NA
2  y 2 NA  1
3  z 3  2 NA
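
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). Here is a minimal sketch of the same reshape with the newer function, assuming the example data from the question (rebuilt here only so the snippet runs on its own):

```r
library(tidyr)

# Example data from the question, rebuilt so the snippet is self-contained
data1 <- data.frame(
  id   = c("x", "y", "z", "x", "z", "y"),
  type = c("a", "a", "a", "b", "b", "c"),
  rank = c(1, 2, 3, 1, 2, 1),
  stringsAsFactors = FALSE
)

# One column per distinct value of type, filled with the matching rank;
# id/type combinations that do not occur are left as NA
data2 <- pivot_wider(data1, names_from = type, values_from = rank)
data2
```

The result is a tibble rather than a plain data frame; wrapping it in as.data.frame() should give the same layout as above if that matters downstream.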

Upvotes: 4

Arun

Reputation: 118809

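Here's how to do it with data.table:

library(data.table)
# setDT() converts data1 to a data.table by reference;
# value.var names the column whose values fill the new columns
ans = dcast(setDT(data1), id ~ type, value.var = "rank")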
ans
#    id a  b  c
# 1:  x 1  1 NA
# 2:  y 2 NA  1
# 3:  z 3  2 NA

Upvotes: 3

A5C1D2H2I1M1N2O1R2T1

Reputation: 193547

Taking the first word of your question title literally, you can just use reshape from base R:

> reshape(data1, direction = "wide", idvar = "id", timevar = "type")
  id rank.a rank.b rank.c
1  x      1      1     NA
2  y      2     NA      1
3  z      3      2     NA
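
If the "rank." prefix on the new columns is unwanted, it can be stripped afterwards. A small sketch, again assuming the example data from the question; the name wide is only illustrative:

```r
# Example data from the question, rebuilt so the snippet is self-contained
data1 <- data.frame(
  id   = c("x", "y", "z", "x", "z", "y"),
  type = c("a", "a", "a", "b", "b", "c"),
  rank = c(1, 2, 3, 1, 2, 1),
  stringsAsFactors = FALSE
)

wide <- reshape(data1, direction = "wide", idvar = "id", timevar = "type")

# reshape() names the new columns <value column>.<type>; drop the "rank." prefix
names(wide) <- sub("^rank\\.", "", names(wide))
wide
```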

Upvotes: 4
