Reputation: 63
I want to get a subset of my dataframe by keeping only the rows that have numeric values in all columns, so
>small
0 16h 24h 48h
ID1 1 0 0
ID2 453 254 21 12
ID3 true 3 2 1
ID4 65 23 12 12
would be
>small_numeric
0 16h 24h 48h
ID2 453 254 21 12
ID4 65 23 12 12
I tried
sapply(small, is.numeric)
but got this
0 16h 24h 48h
FALSE FALSE FALSE FALSE
Upvotes: 2
Views: 1336
Reputation: 83215
Using:
small[!rowSums(is.na(sapply(small, as.numeric))),]
gives:
0 16h 24h 48h
ID2 453 254 21 12
ID4 65 23 12 12
What this does:
With sapply(small, as.numeric) you force all columns to numeric. Non-numeric values are converted to NA values as a result.
Next, you count the NA values per row with rowSums(is.na(sapply(small, as.numeric))), which gives you back a numeric vector, [1] 1 0 1 0, with the number of non-numeric values by row.
Negating this with ! gives you a logical vector of the rows where all columns have numeric values.
Used data:
small <- read.table(text=" 0 16h 24h 48h
ID1 1 0 0
ID2 453 254 21 12
ID3 true 3 2 1
ID4 65 23 12 12", header=TRUE, stringsAsFactors = FALSE, fill = TRUE, check.names = FALSE)
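The steps described above can be broken out into intermediate objects, which may make them easier to inspect. This is only a sketch using the same data; the names num, na_per_row, keep and small_numeric are chosen here for illustration, suppressWarnings() is added to silence the coercion warning that as.numeric raises for "true", and na_per_row == 0 is used as an equivalent of the ! in the one-liner.

```r
# Same data as above
small <- read.table(text = " 0 16h 24h 48h
ID1 1 0 0
ID2 453 254 21 12
ID3 true 3 2 1
ID4 65 23 12 12", header = TRUE, stringsAsFactors = FALSE, fill = TRUE, check.names = FALSE)

# Step 1: coerce every column to numeric; entries that cannot be parsed become NA
num <- suppressWarnings(sapply(small, as.numeric))

# Step 2: count the NA entries in each row
na_per_row <- rowSums(is.na(num))

# Step 3: keep only the rows without any NA (equivalent to !na_per_row)
keep <- na_per_row == 0
small_numeric <- small[keep, ]
```

Note that the subset keeps the original column types, so a column that was read as character (here the column named 0) is still character in small_numeric and would need a separate conversion if numeric columns are required.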
For the updated example data, the problem is that the columns with non-numeric values are factors instead of character. In that case you'll have to adapt the above code as follows:
testdata[!rowSums(is.na(sapply(testdata[-1], function(x) as.numeric(as.character(x))))),]
which gives:
0 16h 24h 48h NA
ID2 ID2 46 23 23 48
ID3 ID3 44 10 14 22
ID4 ID4 17 11 4 24
ID5 ID5 13 5 3 18
ID7 ID7 4387 4216 2992 3744
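To illustrate why the factor columns need the extra as.character step, here is a small sketch. The factor f is a made-up example and not part of the asker's testdata; the names codes and vals are likewise only illustrative.

```r
# Hypothetical factor column that mixes numbers and text
f <- factor(c("46", "n/a", "4387", "13"))

# Direct conversion returns the internal integer codes of the levels,
# not the values that are printed
codes <- as.numeric(f)

# Converting to character first recovers the values;
# entries that are not numbers become NA
vals <- suppressWarnings(as.numeric(as.character(f)))
vals
# [1]   46   NA 4387   13
```

Because codes contains no NA at all, a check based on is.na would wrongly treat every row as numeric if the as.character step were skipped.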
Extra explanation:
You have to convert the factors to character first, using as.numeric(as.character(x)). If you don't do that, as.numeric will give back the numbers of the factor levels.
I used testdata[-1] as I supposed that you didn't want to include the first column in the check for numeric values.
Upvotes: 5