Reputation: 937
This seems like a fairly simple operation but I can't figure out how to do it or search for it.
I have data similar to the following:
id,status
123,0
123,1
124,1
125,1
125,2
126,0
126,2
I want a list of ids that don't have any rows where the status is 2. So in the example above, the operation would return:
123
124
I believe that in sql it might be something like this:
SELECT DISTINCT id
FROM user
WHERE id NOT IN (
SELECT DISTINCT id
FROM user
WHERE status='2')
)
But I'm not sure how to do that in R.
Upvotes: 1
Views: 31
Reputation: 66819
If you put it into a data.table:
require(data.table)
mydt <- data.table(user)
Then the syntax works like this:
mydt[
,
list(keep=all(status!=2)),
by=id][
(keep),
id
]
That is, FROM[WHERE,SELECT,BY][WHERE2,SELECT2,BY2]...[WHEREK,SELECTK,BYK]
. After each set of brackets, you have a new data.table to work with for the next query in the chain.
After you install the data.table
package, you can type vignette("datatable-faq")
and browse to question 2.16 for details on the analogy between SQL syntax and what's available in data.table.
Upvotes: 2