ajax2000
ajax2000

Reputation: 711

dplyr convert data to wide format with two keys

I have a dataframe that looks like this:

ID, Type, Unit, Value, Status
A, L1, cm, 100, F
A, L2, %, 70, F
A, L3, cm, 100, F
A, L4, %, 80, F
B, L1, cm, 100, T
B, L2, %, 70, T
B, L3, cm, 100, T
B, L4, %, 80, T

And I want to convert this to wide format:

ID, L1(cm), L2(%), L3(cm), L4(%), Status
A, 100, 70, 100, 80, F
B, 100, 70, 100, 80, T

I'm using the spread function in dplyr, with the key being Type and value to be Value. And I think I will have to do it again on the ID column. I would like to know if there's a standard way to do this or if there's a more concise way.

Upvotes: 1

Views: 489

Answers (2)

wxxyyyzz
wxxyyyzz

Reputation: 251

@akrun 's answer is definitely beautiful. If you also want to try reshape, below should also work but you might need to rename the column names afterwards.

Using the same data

df <- data.frame(ID = c("A", "A", "A", "A", "B", "B", "B", "B"), 
             Type = c(" L1", " L2", " L3", " L4", " L1", " L2", " L3", " L4"), 
             Unit = c("cm", "%", "cm", "%", "cm", "%", "cm", "%"), 
             Value = c(100, 70, 100, 80, 100, 70, 100, 80), 
             Status = c(" F", " F", " F", " F", " T", " T", " T", " T"))

Using package reshape2

library(reshape2)
dcast(df, ID + Status  ~ Type + Unit, value.var = "Value" )

Output

  ID Status  L1_cm  L2_%  L3_cm  L4_%
1  A      F    100    70    100    80
2  B      T    100    70    100    80

Upvotes: 1

akrun
akrun

Reputation: 887078

We can use tidyverse. Create a concatenated column with 'Type' and 'Unit' either with paste or unite (less flexible), and then spread to 'wide' format after removing the unwanted columns

library(tidyverse)
df1 %>%
  mutate(TypeUnit = paste0(Type, "(", Unit, ")")) %>%
  select(-Type, -Unit) %>%
  spread(TypeUnit, Value) 
#   ID Status  L1(cm)  L2(%)  L3(cm)  L4(%)
#1  A      F     100     70     100     80
#2  B      T     100     70     100     80

data

df1 <- structure(list(ID = c("A", "A", "A", "A", "B", "B", "B", "B"), 
Type = c(" L1", " L2", " L3", " L4", " L1", " L2", " L3", 
" L4"), Unit = c("cm", "%", "cm", "%", "cm", "%", "cm", "%"
), Value = c(100L, 70L, 100L, 80L, 100L, 70L, 100L, 80L), 
Status = c(" F", " F", " F", " F", " T", " T", " T", " T"
)), row.names = c(NA, -8L), class = "data.frame")

Upvotes: 4

Related Questions