Reputation: 3291
I have two dataframes, here's the first:
df <- data.frame(p=letters[1:3],y1=c(2,4,3))
df
p y1
1 a 2
2 b 4
3 c 3
and the second one:
df2 <- data.frame(p=rep(letters[1:3],c(3,2,4)),y2=c(3,1,1,4,3,4,3,3,1),d=rep(1,length=9))
df2
p y2 d
1 a 3 1
2 a 1 1
3 a 1 1
4 b 4 1
5 b 3 1
6 c 4 1
7 c 3 1
8 c 3 1
9 c 1 1
What I want to do is get those lines in df2
, where for each value of p
(a,b,c etc.) where d=1
(which in this case are all rows), y2
is bigger than y1
grouped by p
in df
.
Because this explanation probably doesn't make sense, the two lines that need to be kicked: line 1 in df2
, because for a
, y2=3
is greater than y1=2
in df
, and line 6, because for c
in df2
, y
has value 4, but the value for c
in df
is 3.
Since I'm working with data.tables, a "data.table-solution" would be nice, maybe something like:
setkey(df2,d)
df2[1,y>??,by="p"]
Upvotes: 1
Views: 130
Reputation: 121568
You should use merge
before subsetting.
Using data.table
:
library(data.table)
merge(data.table(df1,key='p'),
data.table(df2,key='p'))[d==1 & y2 > y1]
p y1 y2 d
1: a 2 3 1
2: c 3 4 1
Using base merge
:
subset(merge(df1,df2), d==1 & y2 > y1)
p y1 y2 d
1 a 2 3 1
6 c 3 4 1
EDIT
For the data.table
solution , here is better to use a join Y[X] , looking up Y's rows using X's key.(LEFT OUTER JOIN)
DF2 <- data.table(df2,key='p')
DF1 <- data.table(df1,key='p')
DF2[DF1][d==1 & y2 > y1]
p y2 d y1
1: a 3 1 2
2: c 4 1 3
Upvotes: 6
Reputation: 5424
Similar to rmk but using plyr:
library(plyr)
dfa <- data.frame(p=letters[1:3],y1=c(2,4,3))
dfa
dfb <- data.frame(p=rep(letters[1:3],c(3,2,4)),y2=c(3,1,1,4,3,4,3,3,1),d=rep(1,length=9))
dfb
dfb <- join(dfa, dfb, by = "p", type = "left", match = "all")
dfb
dfb$z <- ifelse(dfb$y2>dfb$y1, 1, 0)
dfb[dfb$z==1, ]
Upvotes: 1
Reputation: 2785
Try:
df3 <- merge(df,df2,by=1)
> df3
p y1 y2 d
1 a 2 3 1
2 a 2 1 1
3 a 2 1 1
4 b 4 4 1
5 b 4 3 1
6 c 3 4 1
7 c 3 3 1
8 c 3 3 1
9 c 3 1 1
> df3[df3$y2>df3$y1 & df3$d==1,]
p y1 y2 d
1 a 2 3 1
6 c 3 4 1
Upvotes: 0