Joey

Reputation: 181

Delete rows with negative values

In R I am trying to delete rows within a dataframe (ants) which have a negative value under the column heading Turbidity. I have tried

ants<-ants[ants$Turbidity<0,]

but it returns the following warning:

Warning message:
In Ops.factor(ants$Turbidity, 0) : < not meaningful for factors

Any ideas why this may be? Perhaps I need to make the negative values NA before I then delete all NAs?

Any ideas much appreciated, thank you!

@Joris: result is

str(ants$Turbidity)

num [1:291] 0 0 -0.1 -0.2 -0.2 -0.5 0.1 -0.4 0 -0.2 ...

Upvotes: 3

Views: 14271

Answers (4)

Seth Brundle

Reputation: 170

This should also work using the tidyverse (assuming the column is already numeric).

ants %>% dplyr::filter(Turbidity >= 0)

Upvotes: 0

Spacedman

Reputation: 94317

Always do summary(ants) after reading in data, and check if you get what you expect.

It will save you lots of problems. Numeric data is prone to magic conversion to character or factor types.
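A minimal sketch of that check follows. The inline CSV and its stray "n/a" entry are made up for illustration, and stringsAsFactors = TRUE is set explicitly to mimic the default of R versions before 4.0:

```r
# Hypothetical data: one non-numeric entry ("n/a") in an otherwise numeric column.
csv_text <- "Site,Turbidity
A,0.1
B,-0.2
C,n/a
D,0.4"

# stringsAsFactors = TRUE reproduces the pre-4.0 default behaviour of read.csv().
ants <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# A numeric column would show min/median/max here; a factor shows counts per level.
print(summary(ants))

# The column classes make the problem explicit.
col_classes <- sapply(ants, class)
print(col_classes)
```

If Turbidity shows up as a factor or character column here, something in the file is not a number, and it is worth finding it before doing any subsetting.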

Upvotes: 3

Joris Meys

Reputation: 108613

Marek is right, it's a data problem. Be careful if you use as.numeric(ants$Turbidity) directly, as the result will always be positive. It returns the internal integer codes of the factor levels (1 to nlevels(ants$Turbidity)), not the numeric values that the level labels represent.
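To illustrate the difference, here is a small sketch with made-up readings (the vector f is hypothetical, not the asker's data):

```r
# Made-up turbidity readings stored as a factor.
f <- factor(c("-0.5", "0.1", "-0.2", "0"))

# as.numeric() on a factor returns the integer level codes, which are never negative.
codes <- as.numeric(f)

# Going through as.character() first recovers the actual values.
values <- as.numeric(as.character(f))

print(levels(f))
print(codes)
print(values)
```

The exact codes depend on how the levels are sorted, but they always lie between 1 and nlevels(f), so a comparison such as codes < 0 can never be TRUE.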

Try this :

tt <- as.numeric(as.character(ants$Turbidity))
which(is.na(tt))

It will give you a list of indices where the value was not numeric in the first place. This should enable you to first clean up your data.

eg:

> Turbidity <- factor(c(1,2,3,4,5,6,7,8,9,0,"a"))
> tt <- as.numeric(as.character(Turbidity))
Warning message:
NAs introduced by coercion 
> which(is.na(tt))
[1] 11

You shouldn't use the as.numeric(as.character(...)) construct to convert problematic data without cleaning it first, as it generates NAs that carry through into later subsetting. E.g.:

> Turbidity[tt > 5]
[1] 6    7    8    9    <NA>
Levels: 0 1 2 3 4 5 6 7 8 9 a
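If the non-numeric entries cannot be cleaned up first, one way to keep the stray NA out of the result is to handle it explicitly. This is only a sketch on the same illustrative factor, not a substitute for fixing the data:

```r
# Same illustrative factor as above, with one non-numeric entry.
Turbidity <- factor(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 0, "a"))

# The conversion warns about the "a" entry; the warning is expected here.
tt <- suppressWarnings(as.numeric(as.character(Turbidity)))

# which() silently drops NA, so no <NA> element appears in the result.
idx <- which(tt > 5)
print(Turbidity[idx])

# Equivalent logical form that handles NA explicitly.
keep <- !is.na(tt) & tt > 5
print(Turbidity[keep])
```

For the original question the same pattern would apply with the comparison tt >= 0 on the rows of the data frame.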

Upvotes: 3

Marek

Reputation: 50783

EDIT: I forgot about the as.character conversion (see Joris's comment).


The message means that ants$Turbidity is a factor. It will work if you do

ants <- ants[as.numeric(as.character(ants$Turbidity)) >= 0,]

or

ants <- subset(ants, as.numeric(as.character(Turbidity)) >= 0)

But the real problem is that your data are not prepared for analysis. Such conversions should be done at the beginning. Be careful, because there could also be non-numeric values.

Upvotes: 0
