Reputation: 307
I have a dataset of individuals' actual choices:
library(tibble)
set.seed(123)
data <- tibble(id = 1:100, choice = sample(c('a','b','c','d'), 100, replace = TRUE), x = runif(100, min = 0, max = 100))
head(data)
# A tibble: 6 x 3
id choice x
<int> <chr> <dbl>
1 1 a 91.1
2 2 d 87.5
3 3 d 3.88
4 4 b 32.0
5 5 d 27.8
6 6 c 76.3
id is the id number of an individual; choice is the actual choice among a, b, c and d; x is some individual characteristic.
To fit a specific model, I need a dataset that contains both the chosen and the unchosen alternatives for each individual. The dataset should look like this:
id choice x chosen
1 a 91.1 1
1 b 91.1 0
1 c 91.1 0
1 d 91.1 0
2 a 87.5 0
2 b 87.5 0
2 c 87.5 0
2 d 87.5 1
3 a 3.88 0
3 b 3.88 0
3 c 3.88 0
3 d 3.88 1
where chosen is a dummy variable indicating whether that alternative is the one the individual actually chose.
Is there a tidy way to do this?
Thank you so much for your help!
Upvotes: 0
Views: 186
Reputation: 174393
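You can use tidyr::complete. It adds a row for every id/choice combination that is missing from the data; the added rows have x = NA, so they can be flagged as not chosen before x is copied over from the individual's original row.
(Note that the random numbers generated in data were different from those in your example, despite the same random seed.)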
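library(dplyr)
library(tidyr)

complete(data = data, id, choice) %>%      # one row per id/choice pair
  group_by(id) %>%
  mutate(chosen = ifelse(is.na(x), 0, 1),  # rows added by complete() have x = NA
         x = x[!is.na(x)][1])              # copy x from the id's original row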
#> # A tibble: 400 x 4
#> # Groups: id [100]
#> id choice x chosen
#> <int> <chr> <dbl> <dbl>
#> 1 1 a 60.0 0
#> 2 1 b 60.0 0
#> 3 1 c 60.0 1
#> 4 1 d 60.0 0
#> 5 2 a 33.3 0
#> 6 2 b 33.3 0
#> 7 2 c 33.3 1
#> 8 2 d 33.3 0
#> 9 3 a 48.9 0
#> 10 3 b 48.9 0
#> # ... with 390 more rows
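A possible variant, offered only as a sketch: mark every original row as chosen first, then let complete() fill the new rows with 0. Wrapping id and x in nesting() keeps x attached to its id, so no group_by() step is needed. The name long is just an illustrative choice, and the data is regenerated here so the block runs on its own.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Recreate the example data from the question
set.seed(123)
data <- tibble(id = 1:100,
               choice = sample(c("a", "b", "c", "d"), 100, replace = TRUE),
               x = runif(100, min = 0, max = 100))

long <- data %>%
  mutate(chosen = 1) %>%                      # every original row is a chosen alternative
  complete(nesting(id, x), choice,            # expand to all id/choice pairs, keeping x with id
           fill = list(chosen = 0)) %>%       # rows added by complete() were not chosen
  select(id, choice, x, chosen)               # restore the column order from the question

head(long, 8)
```

One caveat with this approach: complete() only expands over the choice values that actually occur in data. If some alternative might never be chosen, converting choice to a factor with all levels listed explicitly should make complete() include it as well.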
Upvotes: 1