codemachino
codemachino

Reputation: 33

Reshape data table using multiple timevar

I have a dataset DT where each row represents the performance of an individual in a race:

personID raceDate raceID finPos
person1 2009-08-14 489801 2
person1 2010-04-17 502397 6
person1 2011-03-10 524554 4
person2 2009-08-14 489801 1
person2 2011-03-10 524554 3
... ... ... ...

I want to transform the dataset so there is 1 row per person, and there are columns for their race performances in order of the race date (with NA values for individuals who didn't attend a specific race). For example:

personID 489801 finPos 502397 finPos 524554 finPos
person1 2009-08-14 2 2010-04-17 6 2011-03-10 4
person2 2009-08-14 1 NA NA 2011-03-10 3

I know I can get part of the way there with reshape2 for example:

reshape(DT, direction = "wide", idvar = "raceID", timevar = "raceDate")

But is there a way to ensure that the combination of raceID/raceDat/finishPos stays together?

Upvotes: 0

Views: 339

Answers (2)

Jon Spring
Jon Spring

Reputation: 66415

Here's a tidyr approach. The data pairs are together, but I have them swapped in order.

library(tidyr)
DT %>%
  pivot_wider(id_cols = personID, 
              names_from = c(raceID), 
              names_glue = "{raceID}_{.value}",
              values_from = c(raceDate, finPos)) %>%
  select(personID, sort(colnames(.)))


## A tibble: 2 x 7
#  personID `489801_finPos` `489801_raceDate` `502397_finPos` `502397_raceDate` `524554_finPos` `524554_raceDate`
#  <chr>              <int> <chr>                       <int> <chr>                       <int> <chr>            
#1 person1                2 2009-08-14                      6 2010-04-17                      4 2011-03-10       
#2 person2                1 2009-08-14                     NA NA                              3 2011-03-10   

Upvotes: 1

Wimpel
Wimpel

Reputation: 27732

data.table approach

library(data.table)
DT <- fread("personID   raceDate    raceID  finPos
person1     2009-08-14  489801  2
person1     2010-04-17  502397  6
person1     2011-03-10  524554  4
person2     2009-08-14  489801  1
person2     2011-03-10  524554  3")

# Cast to wide
cols <- c("raceDate", "finPos")  #value columns to cast
answer <- dcast(DT, personID ~ raceID, value.var = cols, drop = FALSE)

now the data is cast to wide, but is 'grouped' by value-variable. All you need to do now is to reorder the columns...

# Determine column order
new_col_order <- CJ( sort(unique(DT$raceID)), cols, sorted = FALSE)[, paste(cols, V1, sep = "_")]
# Set new column order
setcolorder(answer, c(setdiff(names(answer), new_col_order), new_col_order))
#    personID raceDate_489801 finPos_489801 raceDate_502397 finPos_502397 raceDate_524554 finPos_524554
# 1:  person1      2009-08-14             2      2010-04-17             6      2011-03-10             4
# 2:  person2      2009-08-14             1            <NA>            NA      2011-03-10             3

Upvotes: 1

Related Questions