user3224522

Reputation: 1149

Find overlapping ranges based on positions in R

I have two datasets. The first, df1, holds ranges (chromosome, start, end):

 chr1 25 85
 chr1 2000 3000
 chr2 345 2300

and the second, df2, holds ranges with a value in a fourth column (missing for some rows):

chr1 34 45 1.2
chr1 100 1000
chr2 456 1500 1.3

This is my desired output: the df1 ranges that overlap a df2 range, with the value from df2 attached:

chr1 25 85 1.2
chr2 345 2300 1.3

Below is my code:

sb <- NULL
rangesC <- NULL
sb$bin <- NULL
for(i in levels(df1$V1)){
   s <- subset(df1, df1$V1 == i)
   sb <- subset(df2, df2$V1 == i)
   for(j in 1:nrow(sb)){
     sb$bin[j] <-s$V4[(s$V2 <= sb$V2[j] & s$V3 >= sb$V3[j])]
  }
 rangesC <- try(rbind(rangesC, sb),silent = TRUE)
}

The error I get is :

"replacement has length zero". Alternatively, when I use as.character, rangesC is empty.

I would like to get the corresponding V4 value whenever the positions overlap. What is going wrong?

Upvotes: 0

Views: 40

Answers (1)

Uwe

Reputation: 42544

The foverlaps() function from the data.table package does an overlap join of two data.tables:

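library(data.table)
# convert to data.table by reference and key on chromosome, start, end
setDT(df1, key = names(df1))
setDT(df2, key = key(df1))
# overlap join; nomatch = 0L drops df2 ranges without a match in df1
foverlaps(df2, df1, nomatch = 0L)[, -c("i.V2", "i.V3")]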
     V1  V2   V3  V4
1: chr1  25   85 1.2
2: chr2 345 2300 1.3
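
Note that foverlaps() defaults to type = "any", which matches partial overlaps as well. The condition in the question's loop reads more like containment (the df2 range lies entirely inside the df1 range). If that is what is wanted, type = "within" may be the closer fit. A minimal, self-contained sketch, assuming the two tables look like the sample data (the name res is only for illustration):

```r
library(data.table)

# Same toy data as in the question, built directly as data.tables
df1 <- data.table(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(25L, 2000L, 345L),
                  V3 = c(85L, 3000L, 2300L))
df2 <- data.table(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(34L, 100L, 456L),
                  V3 = c(45L, 1000L, 1500L),
                  V4 = c(1.2, NA, 1.3))

# foverlaps() needs the second table (y) to be keyed; the last two key
# columns are treated as interval start and end
setkey(df1, V1, V2, V3)

# type = "within": keep only df2 ranges lying entirely inside a df1 range
res <- foverlaps(df2, df1, by.x = c("V1", "V2", "V3"),
                 type = "within", nomatch = 0L)

# drop the df2 coordinates, keeping the df1 range plus V4
res[, c("i.V2", "i.V3") := NULL]
print(res)
```

For the sample data both types give the same two rows; they would differ only where a df2 range sticks out of a df1 range on either side.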

Data

library(data.table)
df1 <- fread(
  "chr1 25 85
 chr1 2000 3000
 chr2 345 2300", header = FALSE
)

df2 <- fread(
  "chr1 34 45 1.2
chr1 100 1000 
chr2 456 1500 1.3", header = FALSE
)
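
As for what goes wrong in the loop from the question, as far as I can tell there are three things. First, df1 has no V4 column, so s$V4 is NULL, and assigning a zero-length value to sb$bin[j] gives "replacement has length zero". The same error would occur for any range without a match. Second, if V1 is character rather than factor, levels(df1$V1) is NULL, the loop body never runs, and rangesC stays NULL. Third, the roles of the two tables seem to be swapped, because the value lives in df2. A base R sketch of the loop with these points addressed, assuming containment is the intended condition and that the first match is enough (pieces, chr and hit are names made up for this example):

```r
df1 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(25, 2000, 345),
                  V3 = c(85, 3000, 2300))
df2 <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(34, 100, 456),
                  V3 = c(45, 1000, 1500),
                  V4 = c(1.2, NA, 1.3))

pieces <- list()
# unique(as.character()) works for character and factor columns alike
for (chr in unique(as.character(df1$V1))) {
  s  <- df1[df1$V1 == chr, ]   # ranges without a value
  sb <- df2[df2$V1 == chr, ]   # ranges carrying V4
  s$V4 <- NA_real_             # pre-fill so unmatched rows stay NA
  for (j in seq_len(nrow(s))) {
    # df2 ranges on this chromosome lying inside the j-th df1 range
    hit <- which(sb$V2 >= s$V2[j] & sb$V3 <= s$V3[j])
    # take the first match only; skip the assignment when there is none
    if (length(hit) > 0) s$V4[j] <- sb$V4[hit[1]]
  }
  pieces[[chr]] <- s
}

rangesC <- do.call(rbind, pieces)
# drop df1 ranges that ended up without a value
rangesC <- rangesC[!is.na(rangesC$V4), ]
rownames(rangesC) <- NULL
print(rangesC)
```

This is only meant to show where the original loop breaks; for anything beyond toy data the foverlaps() join is likely the better choice.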

Upvotes: 1
