Louise Sørensen

Reputation: 257

Spread data based on multiple key variables

My data:

# Note: cbind() builds a character matrix, so every column (including Belob) ends up as text
df <- as.data.frame(cbind(Bilagstoptekst = c("A", "A", "A", "B", "B", "C", "D", "E", "E", "F", "F", "F", "F", "F"), 
              AKT=c("80", "80", "80", "80", "80", "25", "80", "80", "80", "80", "80", "25", "25", "80"), 
              IArt=c("HUVE", "HUVE", "HUVE", "HUVE", "HUBO", "BILÅ", "HUBO", "HUVE", "HUVE", "HUBO", "HUVE", "BILÅ", "BILÅ", "HUBO" ),
              Belob=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14)))

> df
Bilagstoptekst AKT IArt Belob
A               80 HUVE     1
A               80 HUVE     2
A               80 HUVE     3
B               80 HUVE     4
B               80 HUBO     5
C               25 BILÅ     6
D               80 HUBO     7
E               80 HUVE     8
E               80 HUVE     9
F               80 HUBO    10
F               80 HUVE    11
F               25 BILÅ    12
F               25 BILÅ    13
F               80 HUBO    14

Now I would like to spread the Belob column into several columns, with one row for each combination of Bilagstoptekst, AKT and IArt.

Output data should be like this:

Bilagstoptekst AKT IArt Belob1 Belob2 Belob3 
A               80 HUVE     1     2      3
B               80 HUVE     4    NA     NA
B               80 HUBO     5    NA     NA
C               25 BILÅ     6    NA     NA
D               80 HUBO     7    NA     NA
E               80 HUVE     8     9     NA
F               80 HUBO    10    14     NA
F               80 HUVE    11    NA     NA
F               25 BILÅ    12    13     NA

I've tried both spread and dcast, but I can't get either of them to work.

In my real dataset I have thousands of rows, so this is just sample data.

Upvotes: 0

Views: 57

Answers (1)

markus

Reputation: 26353

Here is one way, using dcast from data.table:

library(data.table)
dt <- as.data.table(df)
dt[, idx := rowid(Bilagstoptekst, AKT, IArt)] # creates the timevar
out <- dcast(dt, 
             Bilagstoptekst + AKT + IArt ~ paste0("Belob", idx),
             value.var = "Belob")
out
#   Bilagstoptekst AKT IArt Belob1 Belob2 Belob3
#1:              A  80 HUVE      1      2      3
#2:              B  80 HUBO      5   <NA>   <NA>
#3:              B  80 HUVE      4   <NA>   <NA>
#4:              C  25 BILÅ      6   <NA>   <NA>
#5:              D  80 HUBO      7   <NA>   <NA>
#6:              E  80 HUVE      8      9   <NA>
#7:              F  25 BILÅ     12     13   <NA>
#8:              F  80 HUBO     10     14   <NA>
#9:              F  80 HUVE     11   <NA>   <NA>

The key step is the idx column we created: it numbers the rows within each combination of Bilagstoptekst, AKT and IArt, and it serves as the "timevar" when we reshape your data.
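
To make the role of idx more concrete, here is a small sketch that only builds the counter and prints it. It rebuilds the sample data with data.table() and an integer Belob, which is my assumption and not how the question constructs df. The idx2 column is a hypothetical name used only to show that rowid() gives the same result as counting rows per group by hand.

```r
library(data.table)

# Sample data from the question, rebuilt directly as a data.table
# (assumption: Belob is stored as integer rather than text)
dt <- data.table(
  Bilagstoptekst = c("A", "A", "A", "B", "B", "C", "D", "E", "E", "F", "F", "F", "F", "F"),
  AKT   = c("80", "80", "80", "80", "80", "25", "80", "80", "80", "80", "80", "25", "25", "80"),
  IArt  = c("HUVE", "HUVE", "HUVE", "HUVE", "HUBO", "BILÅ", "HUBO", "HUVE", "HUVE", "HUBO", "HUVE", "BILÅ", "BILÅ", "HUBO"),
  Belob = 1:14
)

# rowid() numbers the rows within each combination of the key columns
dt[, idx := rowid(Bilagstoptekst, AKT, IArt)]
dt$idx
# 1 2 3 1 1 1 1 1 2 1 1 1 2 2

# The same counter written out explicitly: 1..N within each group
dt[, idx2 := seq_len(.N), by = .(Bilagstoptekst, AKT, IArt)]
identical(dt$idx, dt$idx2)
# TRUE
```

The largest value of idx (3 here) determines how many Belob columns the wide result gets.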


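In base R you could do

# number the rows within each key combination; seq_along(Belob) keeps the counter
# an integer regardless of how Belob itself is stored
df$idx <- with(df, ave(seq_along(Belob), Bilagstoptekst, AKT, IArt, FUN = seq_along))
# the value columns in the result are named Belob.1, Belob.2, Belob.3
reshape(df, idvar = c("Bilagstoptekst", "AKT", "IArt"), timevar = "idx", direction = "wide")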

The tidyverse approach is left as an exercise ;)
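
For readers who want a starting point, one possible tidyverse version could look like the sketch below. It is not part of the original answer. It assumes dplyr and tidyr (version 1.0.0 or later, for pivot_wider) are installed, and it rebuilds the sample data with data.frame() so that Belob is an integer; both are my assumptions.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (assumption: Belob stored as integer)
df <- data.frame(
  Bilagstoptekst = c("A", "A", "A", "B", "B", "C", "D", "E", "E", "F", "F", "F", "F", "F"),
  AKT   = c("80", "80", "80", "80", "80", "25", "80", "80", "80", "80", "80", "25", "25", "80"),
  IArt  = c("HUVE", "HUVE", "HUVE", "HUVE", "HUBO", "BILÅ", "HUBO", "HUVE", "HUVE", "HUBO", "HUVE", "BILÅ", "BILÅ", "HUBO"),
  Belob = 1:14
)

out <- df %>%
  group_by(Bilagstoptekst, AKT, IArt) %>%
  mutate(idx = row_number()) %>%   # counter within each key combination
  ungroup() %>%
  pivot_wider(
    names_from   = idx,            # one new column per counter value
    values_from  = Belob,
    names_prefix = "Belob"         # gives Belob1, Belob2, Belob3
  )

out
```

The idea is the same as in the data.table and base R versions: build a within-group counter first, then use it as the column key when going from long to wide. Combinations with fewer than three rows get NA in the remaining Belob columns.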


Not sure if your question is a duplicate of Transpose / reshape dataframe without “timevar” from long to wide format.

Upvotes: 2
