ww22an

Reputation: 51

How do I download BAM Files from SRA? I have the SRA Toolkit but I'm confused

I'm trying to download a dataset in BAM format from GEO/SRA that I can use for analysis in RStudio.

I tried this method, where I downloaded the .sra file and converted it to .bam:

prefetch GSM2692389
sam-dump C:\Users\Desktop\sratoolkit.2.10.8-win64\bin\ncbi\SRA\sra\GSM2692389.sra --output-file GSM2692389.bam

However, this didn't work in RStudio: it returned an error saying it couldn't read the BAM file. This is my R code; I'm using Rsamtools.

> bamfiles <- list.files("directory redacted due to privacy", ".bam")
> file.exists(bamfiles)
[1] TRUE
> 
> 
> #---> Define bam files for count step on Rsamtools
> 
> library("Rsamtools")
> bamfiles <- BamFileList(bamfiles, yieldSize=2000000)
> seqinfo(bamfiles)
Error in value[[3L]](cond) : 
  failed to open BamFile: SAM/BAM header missing or empty
  file: 'GSM2692389.bam'

Does anyone know how to help me download the SRA data into readable .bam files? Any help or guidance would be much appreciated as I'm really trying to meet a deadline with this.

Upvotes: 1

Views: 8011

Answers (1)

athiebaut

Reputation: 172

I'd say your problem is that you don't actually have BAM files! Right now, your command is writing SAM output (hence the name sam-dump), and you're just saving it with a .bam extension. A simple test is to run head on your "bam files": if you can read them, they're not binary, which means they're not BAM. Otherwise, you can use samtools view, as bli suggested.
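If you'd rather not eyeball the output of head, here is a minimal sketch of the same check as a shell function. It relies on the fact that a BAM file is BGZF-compressed (so it starts with the gzip magic bytes 1f 8b) and that its decompressed content starts with "BAM". The function name is_bam is just something I made up for illustration, and I'm assuming a POSIX shell with head, od, tr and gzip available (on Windows that would mean something like Git Bash or WSL).

```shell
# is_bam FILE: succeed (exit status 0) if FILE looks like a real BAM file.
# Hypothetical helper for illustration; not part of samtools or the SRA Toolkit.
is_bam() {
    # Step 1: the first two bytes must be the gzip magic number 1f 8b.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ] || return 1

    # Step 2: the decompressed stream must begin with the characters "BAM".
    [ "$(gzip -dc "$1" 2>/dev/null | head -c 3)" = "BAM" ]
}

# Usage: report the result for the file named on the command line, if any.
if [ -n "${1:-}" ]; then
    if is_bam "$1"; then
        echo "$1 looks like BAM"
    else
        echo "$1 is not BAM (possibly plain-text SAM)"
    fi
fi
```

A SAM file saved with a .bam extension fails at the first step, because it starts with plain text rather than the gzip magic bytes. This only inspects the start of the file, so it says nothing about whether the rest of the file is intact.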

That being said, can you try this (make sure samtools is installed first):

sam-dump C:\Users\Desktop\sratoolkit.2.10.8-win64\bin\ncbi\SRA\sra\GSM2692389.sra | samtools view -bS - > GSM2692389.bam

Also, if you're not particularly interested in downloading the .sra files, you might as well use this, which is shorter and easier (and maybe faster as well):

sam-dump SRR5799988 | samtools view -bS - > GSM2692389.bam

I took the liberty of replacing your GSM number with the associated SRR number (see https://www.ncbi.nlm.nih.gov/sra?term=SRX2979455 ), but don't hesitate to double-check the SRR!


More information on sam-dump: https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=sam-dump

Upvotes: 8
