Reputation: 101
I have the following dataframe in R
DF2<-data.frame("A"= c(1,2,3,4,5), "A_1"= c(1,2,3,4,5), "B"= c(1,2,10,NA,NA), "B_1"=c(2,3,10,NA, NA), "ID"=c("A", "A", "A","A","A"))
I have tried to expand the dataframe using the code
library(dplyr)
library(tidyr)
df2= DF2 %>%
mutate(E=pmax(A,B,na.rm = TRUE))%>%
complete(E = seq(max(E)))
The column E should consist of the range of values in A and B. A_1, B_1 values should align with the corresponding values of column E
I am getting the following output
ID E A A_1 B B_1
1 A 1 1 1 1 2
2 A 2 2 2 2 3
3 A 3 NA NA NA NA
4 A 4 4 4 NA NA
5 A 5 5 5 NA NA
6 A 6 NA NA NA NA
7 A 7 NA NA NA NA
8 A 8 NA NA NA NA
9 A 9 NA NA NA NA
10 A 10 3 3 10 10
However, I expect the following output
E ID A A_1 B B_1
1 1 A 1 1 1 2
2 2 A 2 2 2 3
3 3 A 3 3 NA NA
4 4 A 4 4 NA NA
5 5 A 5 5 NA NA
6 6 A NA NA NA NA
7 7 A NA NA NA NA
8 8 A NA NA NA NA
9 9 A NA NA NA NA
10 10 A NA NA 10 10
I request someone to help here. The final output should match A_1 value with E column value such that E value should equal A value, else it should return zero. ID is the grouping variable
Upvotes: 1
Views: 64
Reputation: 388862
Use group_by
ID
and complete
using max
of E
and replace
the value to NA
where it doesn't match column E
library(dplyr)
library(tidyr)
DF2 %>%
mutate(E=pmax(A,B,na.rm = TRUE)) %>%
group_by(ID) %>%
complete(E = seq(max(E))) %>%
mutate_at(vars(matches("^A|^B")), ~replace(., . != E, NA))
# ID E A A_1 B B_1
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 A 1 1 1 1 2
# 2 A 2 2 2 2 3
# 3 A 3 NA NA NA NA
# 4 A 4 4 4 NA NA
# 5 A 5 5 5 NA NA
# 6 A 6 NA NA NA NA
# 7 A 7 NA NA NA NA
# 8 A 8 NA NA NA NA
# 9 A 9 NA NA NA NA
#10 A 10 NA NA 10 10
Upvotes: 1