Brian Smith
Brian Smith

Reputation: 1353

Un-melting a dataframe in generalised way

Let say I have below data.frame

library(reshape2)

set.seed(1)

dat = data.frame(X1 = sample(letters, 10, replace = T), X2 = sample(letters, 10, replace = T), X3 = sample(LETTERS, 10, replace = T), X4 = sample(LETTERS[1:4], 10, replace = T), X5 = sample(11:13, 10, replace = T), X6 = sample(200:201, 10, replace = T))

dat

   X1 X2 X3 X4 X5  X6
1   q  w  J  B 12 201
2   t  c  U  A 12 200
3   q  c  W  A 11 200
4   b  u  E  C 11 201
5   p  m  C  B 13 201
6   g  t  V  A 12 201
7   t  d  F  C 13 201
8   x  t  P  D 11 201
9   d  e  E  D 13 200
10  l  m  L  D 13 201

Now I want to un-melt above data frame such that, the column will be the unique combination of the values of columns X4 & X5, value vector will be X6, which will corresponds to all rows except X4, X5, and X6 (my actual dataframe has many columns so I can not explicitly name those remaining columns)

So in above case, the columns of the final dataframe will be {X1, X2, X3, A-12 A-13, ... B-12, B-13, .. etc}

Can you please help me how to achieve this with dcast() function from reshape?

Thanks for your pointer.

Upvotes: 1

Views: 56

Answers (2)

Uwe
Uwe

Reputation: 42544

The casting formula of dcast() allows for a special variable ... which represents all other variables not used in the formula. So,

library(reshape2)
dcast(dat, ... ~ X4 + X5, value.var = "X6")

should return the expected result:

   X1 X2 X3 A_13 B_11 B_12 B_13 C_11 C_13 D_12
1   a  j  Y   NA   NA   NA   NA   NA   NA  201
2   b  v  N   NA   NA   NA   NA  201   NA   NA
3   d  u  E   NA   NA   NA   NA  201   NA   NA
4   g  u  I  201   NA   NA   NA   NA   NA   NA
5   k  j  E   NA   NA   NA  201   NA   NA   NA
6   n  g  B   NA   NA  200   NA   NA   NA   NA
7   r  i  J   NA   NA  201   NA   NA   NA   NA
8   s  o  Y   NA   NA   NA   NA   NA  201   NA
9   w  n  E   NA  201   NA   NA   NA   NA   NA
10  y  a  U   NA   NA   NA   NA   NA   NA  200

By the way, enhanced versions of the melt() and dcast() functions are also available in the data.table package.

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 388817

You can unite X4 and X5 columns and use pivot_wider.

library(dplyr)
library(tidyr)

dat %>%
  unite(col, X4, X5) %>%
  pivot_wider(names_from = col, values_from = X6)

#   X1    X2    X3     D_12  C_11  A_13  B_11  B_13  B_12  C_13
#   <chr> <chr> <chr> <int> <int> <int> <int> <int> <int> <int>
# 1 y     a     U       200    NA    NA    NA    NA    NA    NA
# 2 d     u     E        NA   201    NA    NA    NA    NA    NA
# 3 g     u     I        NA    NA   201    NA    NA    NA    NA
# 4 a     j     Y       201    NA    NA    NA    NA    NA    NA
# 5 b     v     N        NA   201    NA    NA    NA    NA    NA
# 6 w     n     E        NA    NA    NA   201    NA    NA    NA
# 7 k     j     E        NA    NA    NA    NA   201    NA    NA
# 8 n     g     B        NA    NA    NA    NA    NA   200    NA
# 9 r     i     J        NA    NA    NA    NA    NA   201    NA
#10 s     o     Y        NA    NA    NA    NA    NA    NA   201 

Upvotes: 0

Related Questions