Reputation:
I have a large dataset (202k points). I know that there are 8 values over 0.5. I want to subset on those rows.
How do I find/return a list the row numbers where the values are > 0.5?
Upvotes: 1
Views: 585
Reputation: 1095
Using the argument arr.ind=TRUE
with which
is a great way for finding the row (or column) numbers where a condition is TRUE
,
df <- matrix(c(0.6,0.2,0.1,0.25,0.11,0.13,0.23,0.18,0.21,0.29,0.23,0.51), nrow=4)
# [,1] [,2] [,3]
# [1,] 0.60 0.11 0.21
# [2,] 0.20 0.13 0.29
# [3,] 0.10 0.23 0.23
# [4,] 0.25 0.18 0.51
which
with arr.ind=TRUE
returns the array indices where the condition is TRUE
which(df > 0.5, arr.ind=TRUE)
row col
[1,] 1 1
[2,] 4 3
so the subset becomes
df[-which(df > 0.5, arr.ind=TRUE)[, "row"], ]
# [,1] [,2] [,3]
# [1,] 0.2 0.13 0.29
# [2,] 0.1 0.23 0.23
Upvotes: 0
Reputation: 34621
Here's some dummy data:
D<-matrix(c(0.6,0.1,0.1,0.2,0.1,0.1,0.23,0.1,0.8,0.2,0.2,0.2),nrow=3)
Which looks like:
> D
[,1] [,2] [,3] [,4]
[1,] 0.6 0.2 0.23 0.2
[2,] 0.1 0.1 0.10 0.2
[3,] 0.1 0.1 0.80 0.2
And here's the logical row index,
index <- (rowSums(D>0.5))>=1
You can use it to extract the rows you want:
PeakRows <- D[index,]
Which looks like this:
> PeakRows
[,1] [,2] [,3] [,4]
[1,] 0.6 0.2 0.23 0.2
[2,] 0.1 0.1 0.80 0.2
Upvotes: 0
Reputation: 31820
If the dataset is a vector named x
:
(1:length(x))[x > 0.5]
If the dataset is a data.frame or matrix named x
and the variable of interest is in column j
:
(1:nrow(x))[x[,j] > 0.5]
But if you just want to find the subset and don't really need the row numbers, use
subset(x, x > 0.5)
for a vector and
subset(x, x[,j] > 0.5)
for a matrix or data.frame.
Upvotes: 5