zhd
zhd

Reputation: 13

Create new variable (in new columns but not create new column) from rows with same ID in R

I want to change may Data Frame (left) to (right). I used reshape package, however it will create new column not as what I want create

enter image description here

Thank you so much if someone can help me solve my issue!

Upvotes: 0

Views: 80

Answers (2)

markus
markus

Reputation: 26343

You need to create another variable first that is either v1 or v2 (for the example data given above).

We use ave and seq_along to create a run-length type id column, 'idx'. This variable will from 1 to the total number of observations for every group of 'IDBILL'. We use paste0 to add the suffix "v".

(DF <- transform(DF, 
                 idx = ave(IDPRODUCT,
                           IDBILL, # grouping variable
                           FUN = function(x) paste0("v", seq_along(x))
       )))
#  IDBILL IDPRODUCT idx
#1    111    ABC123  v1
#2    111    ABC124  v2
#3    112    BCH134  v1
#4    113    ABC123  v1
#5    113    GDF345  v2

Now use dcast from reshape2 to spread the 'idx' column and fill in the values of 'IDPRODUCT'.

library(reshape2)
(dcast(DF, IDBILL ~ idx, value.var = 'IDPRODUCT'))
#  IDBILL     v1     v2
#1    111 ABC123 ABC124
#2    112 BCH134   <NA>
#3    113 ABC123 GDF345

Without additional packages you could use reshape instead of dcast.

reshape(DF, idvar = "IDBILL", timevar = "idx", direction = "wide")

If you prefer data.table you could try

library(data.table)
setDT(DF)[, idx := paste0("v", rleid(IDPRODUCT)), by = IDBILL]
dcast(DF, IDBILL ~ idx, value.var = 'IDPRODUCT')

To finally complete the unholy trinity - try tidyverse

library(tidyverse)
DF %>% 
  group_by(IDBILL) %>% 
  mutate(idx = paste0("v", row_number())) %>% 
  spread(idx, IDPRODUCT) %>% 
  ungroup()

data

DF <- structure(list(IDBILL = c(111, 111, 112, 113, 113), IDPRODUCT = c("ABC123", 
"ABC124", "BCH134", "ABC123", "GDF345")), .Names = c("IDBILL", 
"IDPRODUCT"), row.names = c(NA, -5L), class = "data.frame")

Upvotes: 2

Sal-laS
Sal-laS

Reputation: 11639

I hope you consider this solution as well.

Data generation

dt<-data.frame("id"=sample(3,5,replace = TRUE),value=sample(5))
m<-merge(dt, dt) 

And then

IDs<-unique(m[,1])   

i<-1
res<-NULL
while(i<=length(IDs)){

   res<-rbind(res, c(IDs[i],m[m[,1]==IDs[i],2]))
   i<-i+1
}
res

You can use, either rbind, or rbind.na

Upvotes: 0

Related Questions