Reputation: 513
I am trying to use the data.table package to update a data.table y
depending on how columns in y
relate to columns in x
, where the two dataframes can be mapped together by an id.
Consider the following two data.tables:
x <- data.table(id=c(1,2,3),status=c(0,1,1),xend=c(2,4,7))
y <- data.table(id=c(1,1,2,2,3,3),yend=c(2,2,3,5,6,8))
setkey(x,id)
setkey(y,id)
Now in y
I want to update the column yend
when status == 1 and yend > xend
so that yend
becomes xend
. Of course the following notation does not perform the operation, because I am mixing a data.table with an expression, but I thought it would help to show what I would like to do.
y[x[status==1] & yend>xend,yend:=xend]
What is the simplest way to do this?
Upvotes: 4
Views: 199
Reputation: 118859
Using NOCB (next observation carried forward) rolling joins:
# v1.9.6
y[x[status == 1L], yend := i.xend, on=c(id="id", yend="xend"), roll=-Inf]
Also have a look at rollends
argument if you want to roll along the edges of each group.
There's no need to setkey()
anymore, with the new on=
argument implemented that allows for adhoc joins (as subsets).
From the next version, on=c("id", yend="xend")
should be sufficient.
Note that this'll only replace the first yend
value that is > xend
. To replace all values, we'd need to perform a non-equi join, which is not yet implemented, but something we're looking into.
Until then, you can use pmin
as suggested by @Frank, but you'll need by=.EACHI
:
y[x[status == 1L], yend := pmin(yend, i.xend), on="id", by=.EACHI]
Upvotes: 6