Reputation: 383
I have a large set of ranges stored in GenomicRanges format.
Converted to a data.frame, it looks like this example:
file <- "seqnames start end width strand
chr1 2 5 4 *
chr2 3 7 5 *"
file <- read.table(text = file, header = TRUE)
I would like to decompose these ranges into individual positions, like this example:
file2 <- "seqnames Position
chr1 2
chr1 3
chr1 4
chr1 5
chr2 3
chr2 4
chr2 5
chr2 6
chr2 7"
file2 <- read.table(text = file2, header = TRUE)
How can I do this?
Upvotes: 1
Views: 157
Reputation: 31452
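We can use data.table. Grouping by seqnames alone assumes each seqnames value appears in only one row, as in the example: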
library(data.table)
setDT(file)[, .(position = start:end), by = seqnames]
# seqnames position
# 1: chr1 2
# 2: chr1 3
# 3: chr1 4
# 4: chr1 5
# 5: chr2 3
# 6: chr2 4
# 7: chr2 5
# 8: chr2 6
# 9: chr2 7
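With real data, the same chromosome usually appears in many rows, and `start:end` within a `seqnames` group would then receive vectors rather than single values. A minimal base R sketch that expands each row independently is shown below; the `ranges` data frame and its values are hypothetical, chosen only to include a repeated chromosome, and this is one possible approach rather than the answer's own method.

```r
# Hypothetical input: chr1 appears in two rows.
ranges <- data.frame(
  seqnames = c("chr1", "chr1", "chr2"),
  start    = c(2L, 10L, 3L),
  end      = c(5L, 11L, 7L)
)

# Number of positions covered by each range (both ends inclusive).
widths <- ranges$end - ranges$start + 1L

# Repeat each row's seqnames and start once per covered position,
# then add the 0-based offset of the position within its range.
positions <- data.frame(
  seqnames = rep(ranges$seqnames, widths),
  position = rep(ranges$start, widths) + sequence(widths) - 1L
)

print(positions)
```

Because everything is vectorised over rows, no grouping is needed, and duplicated or overlapping ranges are expanded as they appear in the input.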
Upvotes: 1
Reputation: 46856
If you are using Bioconductor's GenomicRanges, GPos() expands ranges into individual positions:
> GPos(GRanges(c("chr1:2-5", "chr2:3-7")))
GPos object with 9 positions and 0 metadata columns:
seqnames pos strand
<Rle> <integer> <Rle>
[1] chr1 2 *
[2] chr1 3 *
[3] chr1 4 *
[4] chr1 5 *
[5] chr2 3 *
[6] chr2 4 *
[7] chr2 5 *
[8] chr2 6 *
[9] chr2 7 *
-------
seqinfo: 2 sequences from an unspecified genome; no seqlengths
If you are starting from the data.frame, perhaps convert it to a GRanges object first with
GRanges(file)
Upvotes: 2