Reputation: 61
Supposing I have the data frame below:
key values
A 118
B 118
B 118
C 100
A 90
A 90
A 90
I want to replace the values that have the same key of the previous row with 0. Then, the data would be like:
key values
A 118
B 118
B 0
C 100
A 90
A 0
A 0
For a data frame called 'm', I used for and if:
for (i in 1:(length(m[,1])-1)) {
if (m[,1][i+1]==m[,1][i]) {m[,2][i+1]<- 0}
}
It works, but my data has 2104776 rows and 23 columns. So, it took 3 hours.
I'm wondering if there is a more efficient way to do this task. Any ideas would be appreciated. Thank you.
Upvotes: 6
Views: 1342
Reputation: 2076
You basically take cusmsum of the keys cols and then subtract an index of length of those key to give you a vector which should hold the values and you can convert everything else to 0.
i=cumsum(rle(as.character(df$key))$lengths)-(rle(as.character(df$key))$lengths-1)
df$values[-i]=0
Upvotes: 1
Reputation: 388807
We can use lag
from dplyr
and check if the value in key
is same as its previous value and assign 0 to those who match.
library(dplyr)
df$values[df$key == lag(df$key)] <- 0
df
# key value
#1 A 118
#2 B 118
#3 B 0
#4 C 100
#5 A 90
#6 A 0
#7 A 0
Upvotes: 6
Reputation: 38500
In data.table
you could use rleid
to group, like this:
library(data.table)
setDT(df)
df[, values := c(head(values, 1), rep(0L, .N-1)), by=rleid(key)]
df
key values
1: A 118
2: B 118
3: B 0
4: C 100
5: A 90
6: A 0
7: A 0
c(head(values, 1), rep(0L, .N-1))
takes the first element of values and prepends this to a vector of 0s the length of the groups minus 1.
Upvotes: 2