Reputation: 3108
I have a data.frame that looks like this.
bed <- data.frame(chrom=c(rep("Chr1",5)),
chromStart=c(18915152,24199229,73730,81430,89350),
chromEnd=c(18915034,24199347,74684,81550,89768),
strand=c("-","+","+","+","+"))
write.table(bed, "test_xRNA.bed",row.names = F,col.names = F, sep="\t", quote=FALSE)
Created on 2022-07-29 by the reprex package (v2.0.1)
and I want to convert it into a BED file. I tried to do it with the write.table function, but it fails: when I run intersect on the resulting file, I get this error
Error: unable to open file or unable to determine types for file test_xRNA.bed
- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the
expected columns (e.g., cols 2 and 3 for BED).
Any ideas of how I can properly convert a data.frame to a .bed file in R?
I have heard about the rtracklayer package. Does anyone have experience with it?
I have tried the approach from the post "export file from R in bed format", but it does not work for me. Any help is highly appreciated.
Upvotes: 0
Views: 2001
Reputation: 3108
I think it's a lot more complicated to make a BED file. Here is a solution I have been working on over the last few days:
suppressPackageStartupMessages(library(GenomicRanges))
suppressPackageStartupMessages(library(rtracklayer))
suppressPackageStartupMessages(library(tidyverse))
# data
bed <- data.frame(chrom=c(rep("Chr1",5)),
chromStart=c(18915152,24199229,73730,81430,89350),
chromEnd=c(18915034,24199347,74684,81550,89768),
strand=c("-","+","+","+","+"))
# swap coordinates where needed so that chromStart < chromEnd always holds
# (transform() evaluates both expressions against the original columns)
bed2 <- bed |>
  transform(chromStart = pmin(chromStart, chromEnd),
            chromEnd = pmax(chromStart, chromEnd))
# Genomic Ranges
bed3 <- GenomicRanges::makeGRangesFromDataFrame(bed2)
head(bed3)
#> GRanges object with 5 ranges and 0 metadata columns:
#> seqnames ranges strand
#> <Rle> <IRanges> <Rle>
#> [1] Chr1 18915034-18915152 -
#> [2] Chr1 24199229-24199347 +
#> [3] Chr1 73730-74684 +
#> [4] Chr1 81430-81550 +
#> [5] Chr1 89350-89768 +
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
# rtracklayer
bed4 <- rtracklayer::export(bed3, format="bed", ignore.strand = FALSE)
bed4
#> [1] "Chr1\t18915033\t18915152\t.\t0\t-" "Chr1\t24199228\t24199347\t.\t0\t+"
#> [3] "Chr1\t73729\t74684\t.\t0\t+" "Chr1\t81429\t81550\t.\t0\t+"
#> [5] "Chr1\t89349\t89768\t.\t0\t+"
# write it as a BED file
# quote = FALSE keeps the tab-separated records unquoted; note that
# append = TRUE adds to an existing test.bed instead of overwriting it
write.table(bed4, "test.bed", sep="\t", col.names=FALSE, row.names = FALSE, append = TRUE, quote = FALSE)
Created on 2022-08-02 by the reprex package (v2.0.1)
Now you have a functional BED file to use with bedtools.
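Since export() without a file argument returns the records as a character vector, a simpler way to write them might be writeLines(), which overwrites the file instead of appending to it. The sketch below copies the five records from the output above and reads the file back to check the two things the bedtools error message asks for (tab-delimited columns, integer coordinates). The file path is hypothetical. As far as I know, export() can also write straight to a path given as its second argument (con), which would skip this step.

```r
# Records copied from the export() output shown above
bed_lines <- c("Chr1\t18915033\t18915152\t.\t0\t-",
               "Chr1\t24199228\t24199347\t.\t0\t+",
               "Chr1\t73729\t74684\t.\t0\t+",
               "Chr1\t81429\t81550\t.\t0\t+",
               "Chr1\t89349\t89768\t.\t0\t+")

# Hypothetical output path; writeLines() replaces any existing file
out_file <- file.path(tempdir(), "test_lines.bed")
writeLines(bed_lines, out_file)

# Read the file back as tab-delimited text and check the coordinates
check <- read.table(out_file, sep = "\t", header = FALSE,
                    stringsAsFactors = FALSE)
stopifnot(ncol(check) == 6,          # six tab-separated columns
          is.integer(check$V2),      # start parsed as integer
          is.integer(check$V3),      # end parsed as integer
          all(check$V2 < check$V3))  # start < end in every record
```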
Upvotes: 0
Reputation: 344
Check the BED format specification. The first three columns (chromosome, start, end) are obligatory. Strand is the sixth column, and if you want to use it, you need to include columns 4 (name) and 5 (score). They can be filled with "." if you have nothing to put there.
bed <- data.frame(chrom=c(rep("Chr1",5)),
chromStart=c(18915152,24199229,73730,81430,89350),
chromEnd=c(18915034,24199347,74684,81550,89768),
name = rep(".", 5),
score = rep(".", 5),
strand=c("-","+","+","+","+"))
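A minimal sketch of how this six-column data frame could then be written out with write.table alone. It rests on two assumptions that are not confirmed in the question: the coordinates are already in BED convention, and a row with chromStart > chromEnd simply has the two values swapped (as the other answer also assumes). The output path is hypothetical.

```r
bed <- data.frame(chrom = rep("Chr1", 5),
                  chromStart = c(18915152, 24199229, 73730, 81430, 89350),
                  chromEnd = c(18915034, 24199347, 74684, 81550, 89768),
                  name = rep(".", 5),
                  score = rep(".", 5),
                  strand = c("-", "+", "+", "+", "+"))

# Assumption: a start larger than its end means the two values were swapped
starts <- pmin(bed$chromStart, bed$chromEnd)
ends <- pmax(bed$chromStart, bed$chromEnd)

# Store the coordinates as integers so they are written as plain digits
bed$chromStart <- as.integer(starts)
bed$chromEnd <- as.integer(ends)

# Hypothetical output path; no header, no row names, no quotes, tab-separated
out_file <- file.path(tempdir(), "test_xRNA_6col.bed")
write.table(bed, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

first_record <- readLines(out_file, n = 1)
```

Here first_record holds the first line of the file, which can be inspected to confirm that the columns are tab-separated and that the start is now smaller than the end.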
Upvotes: 1