Reputation: 1036
I have a CSV that I'm reading into an R data frame. It looks like this:
clientx,clienty,screenx,screeny
481,855,481,847
481,784,481,847
481,784,481,847
879,292,879,355
The first line is the header. There are 4 columns of numeric data, with values of 1 to 4 digits. There are no negative numbers in the set except -1, which marks a missing value. I want to remove every row that contains a -1 in any of the 4 columns.
Thanks in advance for the help
Upvotes: 3
Views: 23478
Reputation: 4133
The direct way:
df <- df[!apply(df, 1, function(x) {any(x == -1)}),]
UPDATE: this approach can fail if the data.frame contains character columns, because apply implicitly converts the data.frame to a matrix, which holds data of only one type. Character takes priority over numeric types, so the data.frame is converted into a character matrix.
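One way to sidestep the matrix conversion is to compare the data.frame to -1 directly, which works column by column, and count the matches per row with rowSums. This is only a sketch, not part of the original answer: the sample data and the names label, has_sentinel and cleaned are made up for illustration.

```r
# Hypothetical sample data: two numeric columns plus one character column.
df <- data.frame(
  clientx = c(481, -1, 879),
  clienty = c(855, 784, -1),
  label   = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# df == -1 is evaluated one column at a time and yields a logical matrix,
# so the numeric columns are never reformatted as padded strings.
has_sentinel <- rowSums(df == -1, na.rm = TRUE) > 0

# Keep only the rows with no -1 in any column.
cleaned <- df[!has_sentinel, ]
```

Here na.rm = TRUE means rows that already contain NA are judged only on their non-missing values; drop that argument if such rows should be handled differently.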
Or replace -1 with NA and then use na.omit:
df[df==-1] <- NA
df <- na.omit(df)
These should work, but I haven't tested them. Please always try to provide a reproducible example to illustrate your question.
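For reference, here is a small reproducible sketch of the NA-replacement approach. The data is adapted from the question, with a -1 placed in the second row as an assumption (the posted sample contains none), and the name cleaned is illustrative.

```r
# Sample data based on the question; the -1 in row 2 is added for illustration.
df <- read.csv(text = "clientx,clienty,screenx,screeny
481,855,481,847
481,784,-1,847
879,292,879,355")

# Recode the sentinel value as NA, then drop every row containing an NA.
df[df == -1] <- NA
cleaned <- na.omit(df)
```

Note that na.omit also drops rows that contained NA before the recoding, which may or may not be what you want.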
Upvotes: 8
Reputation: 179558
The most efficient way is to use the na.strings argument of read.csv() to code all -1 values as NA, then drop the incomplete cases.
Step 1: set na.strings=-1 in read.csv():
x <- read.csv(text="
clientx,clienty,screenx,screeny
481,855,481,847
481,784,481,847
481,784,481,847
-1,292,879,355", header=TRUE, na.strings=-1)
x
clientx clienty screenx screeny
1 481 855 481 847
2 481 784 481 847
3 481 784 481 847
4 NA 292 879 355
Step 2: Now use complete.cases or na.omit:
x[complete.cases(x), ]
clientx clienty screenx screeny
1 481 855 481 847
2 481 784 481 847
3 481 784 481 847
na.omit(x)
clientx clienty screenx screeny
1 481 855 481 847
2 481 784 481 847
3 481 784 481 847
Upvotes: 9