Reputation: 12112
If I have data like this
x <- data.frame(id=letters[1:4],v1=c(3,0,0,1),v2=c(1,1,0,1),v3=c(1,0,0,0),v4=c(2,0,2,0))
id v1 v2 v3 v4
a 3 1 1 2
b 0 1 0 0
c 0 0 0 2
d 1 1 0 0
How is the best way to change it to
id v1 v2 v3 v4
a 1 1 1 1
a 1 NA NA 1
a 1 NA NA NA
b 0 1 0 0
c 0 0 0 1
c NA NA NA 1
d 1 1 0 0
where the data is only composed of 1s and 0s. Each ID now gets multiple rows if needed. Empty positions are populated with NAs.
Upvotes: 3
Views: 123
Reputation: 35584
You can use uncount()
from tidyr
.
library(dplyr)
library(tidyr)
x %>%
uncount(apply(select(x, v1:v4), 1, max)) %>%
group_by(id) %>%
mutate(across(v1:v4, ~ {if(first(.) == 0) 0 else rep(1, first(.))}[1:n()]))
# # A tibble: 7 x 5
# # Groups: id [4]
# id v1 v2 v3 v4
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 a 1 1 1 1
# 2 a 1 NA NA 1
# 3 a 1 NA NA NA
# 4 b 0 1 0 0
# 5 c 0 0 0 1
# 6 c NA NA NA 1
# 7 d 1 1 0 0
Upvotes: 2
Reputation: 34511
tidyr::uncount()
is useful here but it's necessary to account for 0
values so they're not dropped:
library(tidyr)
library(dplyr)
x %>%
pivot_longer(-id) %>%
uncount(value + (value == 0)) %>%
group_by(id, name) %>%
mutate(rn = row_number(),
value = +(value != 0)) %>%
pivot_wider(names_from = name, values_from = value) %>%
select(-rn)
# A tibble: 7 x 5
# Groups: id [4]
id v1 v2 v3 v4
<chr> <int> <int> <int> <int>
1 a 1 1 1 1
2 a 1 NA NA 1
3 a 1 NA NA NA
4 b 0 1 0 0
5 c 0 0 0 1
6 c NA NA NA 1
7 d 1 1 0 0
Upvotes: 2