Reputation: 65
I've searched around for an adequate answer and I don't think that this question has been answered yet. Essentially, I have a data frame in long format that looks like this:
ID event_type event_value
123 A 1.1
123 A 1.2
123 A "Hello"
234 B "Hello"
456 A 2.8
Where there are multiple event types with a variety of values. What I want to do is to reshape the data such that it looks like this
ID event_type_A_1 event_type_A_2 event_type_A_3 event_type_B_1
123 1.1 1.2 "Hello" NA
234 NA NA NA "Hello"
456 2.8 NA NA NA
Such that the new columns extend out as far as the longest event type for any given patient, and the rest are filled with NA. I've mucked around with spread()
and cast()
, but for some reason it just isn't clicking. Thanks!
Upvotes: 3
Views: 1660
Reputation: 8072
You can do this with dplyr
and tidyr
. The trick is using group_by
and adding a within group index using mutate
and row_number
.
library(dplyr)
library(tidyr)
df <- data_frame(ID = c(123,123,123,234,456),
event_type = c("A","A","A","B","A"),
event_value = c(1.1, 1.2, "Hello", "Hello", 2.8))
df %>%
group_by(ID) %>%
mutate(sub_ID = row_number()) %>%
unite("ID_type", event_type, sub_ID,remove = TRUE) %>%
spread(ID_type, event_value)
Breaking down the dplyr
chain:
row_number()
event_type
and sub_ID
with unite
and remove the original columnsspread()
the ID_type into columns with the values from event_value
Upvotes: 5