Reputation: 17
I want the factor levels of several variables as column names, with the count per PatID as the value. This is what I have:
data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)
What I want is the following:
PatID  I250  M206  X560
    1     2     1     0
    2     2     1     1
    3     1     2     2
Can anyone help? I tried dcast and other approaches, but never got the result I wanted.
Upvotes: 0
Views: 142
Reputation: 66570
data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)
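library(tidyverse)
data_sample %>%
  # reshape to long format: one row per PatID and status value
  gather(status_num, value, -PatID) %>%
  # drop the placeholder strings "NA" and "."
  filter(value != "NA", value != ".") %>%
  # count each code per patient
  count(PatID, value) %>% # Improvement by @antoniosk
  # one column per code; codes a patient never has become 0
  spread(value, n, fill = 0)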
# A tibble: 3 x 4
  PatID  I250  M206  X560
  <int> <dbl> <dbl> <dbl>
1     1     2     1     0
2     2     2     1     1
3     3     1     2     2
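For newer tidyr versions (1.0.0 and later), where gather() and spread() are superseded, the same idea can probably be written with pivot_longer() and pivot_wider(). This is only a sketch under that assumption. The name `result` is just for illustration, and the data is repeated so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

data_sample <- data.frame(
  PatID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L),
  status1 = c("I250", "NA", "NA", "X560", "M206", "NA", "NA", "M206", "NA"),
  status2 = c(".", "M206", "NA", "I250", "I250", "M206", "NA", "NA", "X560"),
  status3 = c(".", "I250", "NA", "NA", "NA", "I250", "X560", "NA", "NA")
)

# `result` is a hypothetical name, not part of the original answer
result <- data_sample %>%
  # long format: one row per PatID and status value
  pivot_longer(-PatID, names_to = "status_num", values_to = "value") %>%
  # the placeholders are the literal strings "NA" and ".", not real NA values
  filter(!value %in% c("NA", ".")) %>%
  # count each code per patient
  count(PatID, value) %>%
  # one column per code; missing combinations are filled with 0
  pivot_wider(names_from = value, values_from = n, values_fill = list(n = 0L))

result
```

This should give the same counts as the table asked for in the question, with 0 for X560 for PatID 1.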
Upvotes: 1