Removing rows of data in R

Question

What's the most reliable way to remove rows with matching IDs from a large data frame in R?

For example, I have a list of participants who do not want to be contacted (n=200). I would like to remove them from my dataset of over 100 variables and 200,000 observations.

This is the list of 200 participant IDs that I need to remove from the dataset, followed by the command I tried:

exclude=read.csv("/home/Project/file/excludeids.csv", header=TRUE, sep=",") 
dataset.exclusion<- dataset[-which(exclude$ParticipantId %in% dataset$ParticipantId  ), ]

Is this the correct command to use?

I don't think this command is doing what I want, because when I check the result with length(which(dataset.exclusion$ParticipantId %in% exclude$ParticipantId)), I don't get 0.

Any insight?

agstudy · Accepted Answer

You can do this, for example:

dataset[!dataset$ParticipantId %in% 
            unique(exclude$ParticipantId),]
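
To see why the direction of the %in% test matters, here is a minimal sketch on toy data. The values and the score column are made up for illustration. The names dataset, exclude and ParticipantId follow the question, so adjust them to whatever your objects are actually called.

```r
# Toy stand-ins for the real tables (hypothetical values).
dataset <- data.frame(ParticipantId = c(1, 2, 3, 4, 5),
                      score = c(10, 20, 30, 40, 50))
exclude <- data.frame(ParticipantId = c(2, 5))

# Original attempt: %in% returns one TRUE/FALSE per row of `exclude`,
# so which() yields positions within `exclude` (here 1 and 2), and
# those row numbers are then dropped from `dataset` by mistake.
wrong <- dataset[-which(exclude$ParticipantId %in% dataset$ParticipantId), ]

# Corrected direction: one TRUE/FALSE per row of `dataset`.
keep <- !(dataset$ParticipantId %in% exclude$ParticipantId)
dataset.exclusion <- dataset[keep, ]

# Verification: number of excluded IDs still present in the result.
remaining <- sum(dataset.exclusion$ParticipantId %in% exclude$ParticipantId)
```

Logical indexing is also safer than -which(...): if no IDs match, which() returns an empty vector, and indexing with a negated empty vector selects zero rows instead of keeping all of them.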
