Reputation: 310
Sorry if this is a dumb question, I'm new to R. I have a data set like this:
t a b
1 1 1 0
2 2 1 0
3 3 1 4
4 4 1 0
5 5 1 2
6 1 2 0
7 2 2 1
8 3 2 3
9 4 2 0
10 5 2 5
I want to add a new column c which is 1 if b is zero and no previous b within the same a group was nonzero, and 0 otherwise. Basically, I want to mark the leading zeros for each a, ordered by the t index. The result should look like this:
t a b c
1 1 1 0 1
2 2 1 0 1
3 3 1 4 0
4 4 1 0 0
5 5 1 2 0
6 1 2 0 1
7 2 2 1 0
8 3 2 3 0
9 4 2 0 0
10 5 2 5 0
I tried running
data$c <- ifelse(nrow(subset(data, t < data$t & a == data$a & b != 0)) == 0 & data$b == 0, 1, 0)
but that just set c to 1 if b was 0. What am I doing wrong? How would you approach this?
Thanks
Reproducible example:
t <- "time a b
1 1 1 0
2 2 1 0
3 3 1 4
4 4 1 0
5 5 1 2
6 1 2 0
7 2 2 1
8 3 2 3
9 4 2 0
10 5 2 5"
data <- read.table(text=t, header = TRUE)
data$c <- ifelse(nrow(subset(data, t < data$t & a == data$a & b != 0)) == 0 & data$b == 0, 1, 0)
Upvotes: 1
Views: 45
Reputation: 50678
How about the following, using dplyr and cumsum:
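library(dplyr)

df %>%
  group_by(a) %>%
  arrange(a, time) %>%   # keep rows in time order within each a
  # c becomes 0 once b is, or has previously been, nonzero within the group
  mutate(c = ifelse(b != 0 | cumsum(b) > 0, 0, 1)) %>%
  ungroup()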
# time a b c
# <int> <int> <int> <dbl>
# 1 1 1 0 1.00
# 2 2 1 0 1.00
# 3 3 1 4 0
# 4 4 1 0 0
# 5 5 1 2 0
# 6 1 2 0 1.00
# 7 2 2 1 0
# 8 3 2 3 0
# 9 4 2 0 0
#10 5 2 5 0
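For comparison, the same idea can be sketched in base R without dplyr. This is only an illustration, not part of the original answer: it assumes rows should be ordered by time within each a, and the helper name nonzero_so_far is made up here. Counting nonzero values with cumsum(b != 0), rather than summing b, should also keep the flag correct if b could ever contain negative values (the question's data does not).

```r
# Same sample data as in the question, without the row-name column.
df <- read.table(text = "time a b
1 1 0
2 1 0
3 1 4
4 1 0
5 1 2
1 2 0
2 2 1
3 2 3
4 2 0
5 2 5", header = TRUE)

# Sort by group and time so that "previous" means "earlier in time".
df <- df[order(df$a, df$time), ]

# Running count of nonzero b values within each a (hypothetical helper name).
# The count stays at 0 only while we are still in the run of leading zeros.
nonzero_so_far <- ave(as.integer(df$b != 0), df$a, FUN = cumsum)

# Flag the leading zeros with 1, everything else with 0.
df$c <- as.integer(nonzero_so_far == 0)
df
```

The result is an integer column, so it prints as 1 and 0 rather than 1.00 and 0.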
Sample data (define df before running the pipeline above):
df <- read.table(text =
"time a b
1 1 1 0
2 2 1 0
3 3 1 4
4 4 1 0
5 5 1 2
6 1 2 0
7 2 2 1
8 3 2 3
9 4 2 0
10 5 2 5", header = TRUE)
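As for what went wrong in the original attempt, here is a likely explanation, assuming the time column is named t as in the question's first table. nrow(subset(...)) returns a single number for the whole data frame, not one value per row. Inside subset(), t and data$t are the same column, so t < data$t is never true and the subset is always empty. The condition therefore collapses to data$b == 0. A minimal sketch on the first group only (n_prev and flag are illustrative names):

```r
# First group from the question, with the time column named t.
data <- read.table(text = "t a b
1 1 0
2 1 0
3 1 4
4 1 0
5 1 2", header = TRUE)

# Inside subset(), `t` and `data$t` are the same whole column, so
# `t < data$t` is FALSE for every row and the subset is empty.
n_prev <- nrow(subset(data, t < data$t & a == data$a & b != 0))
n_prev   # one number for the whole data frame, not one per row

# `n_prev == 0` is a single TRUE that gets recycled across all rows,
# so only `data$b == 0` decides the result.
flag <- ifelse(n_prev == 0 & data$b == 0, 1, 0)
flag
```

Row 4 is flagged even though a nonzero b comes before it, which matches the behaviour described in the question. A per-group running computation such as cumsum avoids this, because it produces one value per row.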
Upvotes: 1