Reputation: 4635
I have run into an issue with the ifelse function not working properly in my data frame. I want to add a new column based on a condition within the grouped data, but it seems that only the first element is being passed into the new column.
df <- data.frame(ID = c(1, 1, 1 ,2, 2, 5), A = c("foo", "bar", "bar", "foo", "foo", "bar"), B = c(seq(1:6)))
ID A B
1 1 foo 1
2 1 bar 2
3 1 bar 3
4 2 foo 4
5 2 foo 5
6 5 bar 6
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(C = ifelse(length(which(A == 'bar')) >= 2, B, NA))
# A tibble: 6 x 4
# Groups: ID [3]
ID A B C
<dbl> <fctr> <int> <int>
1 1 foo 1 1
2 1 bar 2 1
3 1 bar 3 1
4 2 foo 4 NA
5 2 foo 5 NA
6 5 bar 6 NA
I also tried do, as in tidyverse/dplyr/issues/489, but it produces the same result.
What is the MATRIX;)
Expected output:
# A tibble: 6 x 4
# Groups: ID [3]
ID A B C
<dbl> <fctr> <int> <int>
1 1 foo 1 1
2 1 bar 2 2
3 1 bar 3 3
4 2 foo 4 NA
5 2 foo 5 NA
6 5 bar 6 NA
Upvotes: 2
Views: 76
Reputation: 887088
Here the condition returns a logical vector of length 1 for each 'ID':
df %>%
  group_by(ID) %>%
  summarise(ind = length(which(A == 'bar')) >= 2)
# A tibble: 3 x 2
# ID ind
# <dbl> <lgl>
#1 1 TRUE
#2 2 FALSE
#3 5 FALSE
so it is better to use if/else. ifelse returns a result with the same length as test, so test, yes and no should all be of the same length. Here test is a single element, so ifelse returns only the first element of 'B', and that value then populates the entire 'ID'. With if/else the whole of 'B' is returned for the group instead:
df %>%
  group_by(ID) %>%
  mutate(C = if (length(which(A == 'bar')) >= 2) B else NA)
# A tibble: 6 x 4
# Groups: ID [3]
# ID A B C
# <dbl> <fctr> <int> <int>
#1 1 foo 1 1
#2 1 bar 2 2
#3 1 bar 3 3
#4 2 foo 4 NA
#5 2 foo 5 NA
#6 5 bar 6 NA
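The length rule can also be checked outside of dplyr. The following is a minimal base R sketch; the vector B and the names short, full and whole are made up for illustration and are not part of the question's data.

```r
B <- 1:3

# A length-1 test gives a length-1 result: only the first element of B survives.
short <- ifelse(TRUE, B, NA)
print(short)

# Repeating the test to the length of B keeps every element.
full <- ifelse(rep(TRUE, length(B)), B, NA)
print(full)

# if/else is not vectorised: it returns the whole selected object.
whole <- if (TRUE) B else NA
print(whole)
```

Inside a grouped mutate, the length-1 result from the first call is what gets recycled over every row of the group.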
However, if we still need to use ifelse, then replicate the test to the size of the group with rep and n():
df %>%
  group_by(ID) %>%
  mutate(C = ifelse(rep(length(which(A == 'bar')) >= 2, n()), B, NA))
# A tibble: 6 x 4
# Groups: ID [3]
# ID A B C
# <dbl> <fctr> <int> <int>
#1 1 foo 1 1
#2 1 bar 2 2
#3 1 bar 3 3
#4 2 foo 4 NA
#5 2 foo 5 NA
#6 5 bar 6 NA
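As a possible variation on the if/else approach, the count can be written with sum() on the logical vector instead of length(which()), and a typed NA can be used so that every group returns an integer. This is only a sketch under those assumptions; the name res is hypothetical, and the data frame is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

df <- data.frame(ID = c(1, 1, 1, 2, 2, 5),
                 A = c("foo", "bar", "bar", "foo", "foo", "bar"),
                 B = 1:6)

res <- df %>%
  group_by(ID) %>%
  # sum() counts the TRUE values, i.e. the rows where A is 'bar' in the group.
  # NA_integer_ matches the integer type of B.
  mutate(C = if (sum(A == "bar") >= 2) B else NA_integer_) %>%
  ungroup()

print(res)
```

The condition is still evaluated once per group, so if/else remains the natural fit here; the length-1 NA_integer_ is recycled to the group size by mutate.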
Upvotes: 4