LAUREN CHUKRALLAH

Reputation: 1

Combining two custom .gtf files (R)

I have two custom files and have been trying to combine them into a single .gtf file, to eventually make a custom FASTA for kallisto. I have been using the Biostrings library (Bioconductor) in R and have read both files in as GRanges objects:

genes.gf <- readGFFAsGRanges("novel_genes.gtf")
isoforms.gf <- readGFFAsGRanges("novel_isoforms.gtf")

However, when I try to use cat to combine them, I get an error:

cat(genes.gf, isoforms.gf, file = "custom.gf")

Error in cat(genes.gf, isoforms.gf, file = "custom.gf") : argument 1 (type 'S4') cannot be handled by 'cat'

rbind also doesn't seem to work:

rbind(genes.gf, isoforms.gf, file = "custom.gf")

giving this error: Error in rbind2(argl[[i]], r) : no method for coercing this S4 class to a vector

I'm still very new to this, and any advice or suggestions would be greatly appreciated!

Upvotes: 0

Views: 1147

Answers (1)

StupidWolf

Reputation: 46968

First, I make two example GTF files:

library(rtracklayer)
gtf = import("ftp://ftp.ensembl.org/pub/release-99/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.99.gtf.gz")
export(gtf[gtf$gene_biotype=="protein_coding"],"coding.gtf")
export(gtf[gtf$gene_biotype=="tRNA"],"tRNA.gtf")

If both objects are GRanges, you can combine them using c():

g1 = import("coding.gtf")
g2 = import("tRNA.gtf")

export(c(g1,g2),"combined.gtf")

However, note that the combined GTF above is not sorted, and that if a metadata column appears in one file but not the other, it will be filled with NAs for the ranges that lack it.
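To make those two caveats concrete, here is a minimal sketch using toy GRanges objects. The ranges, gene IDs and column values are made up for illustration and are not taken from the files above. It assumes a reasonably recent Bioconductor release, where c() fills metadata columns missing from one object with NA, and it uses sort() with ignore.strand = TRUE to order the ranges by sequence name and then by position.

```r
library(GenomicRanges)

# Toy data (hypothetical): g1 has a gene_biotype column, g2 does not
g1 <- GRanges(seqnames = c("chrI", "chrI"),
              ranges = IRanges(start = c(500, 100), width = 50),
              strand = c("+", "+"),
              gene_id = c("geneB", "geneA"),
              gene_biotype = c("protein_coding", "protein_coding"))
g2 <- GRanges(seqnames = "chrI",
              ranges = IRanges(start = 300, width = 20),
              strand = "-",
              gene_id = "geneC")

# c() concatenates in the order given, so the result is not sorted
combined <- c(g1, g2)

# The column present only in g1 is NA for the range that came from g2
na_flags <- is.na(combined$gene_biotype)

# Sort by sequence name, then start position, ignoring strand
combined_sorted <- sort(combined, ignore.strand = TRUE)
sorted_starts <- start(combined_sorted)
```

The sorted object can then be written out with rtracklayer's export(), as in the answer above. Whether NA-filled columns are acceptable depends on the downstream tool, so it may be worth checking mcols(combined) before exporting.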

Upvotes: 1
