Forest

Reputation: 721

Why does fastx_clipper think that my fastq file is an unknown file format?

I have some .fastq files from an Illumina NextSeq run. Many of the reads contain poly-A tracts that complicate mapping. I want to remove every run of ten consecutive A's and have been trying to do so using fastx_clipper as follows:

ha1c6n8$ fastx_clipper -l 32 -Q33 -n -v -a AAAAAAAAAA –i FR0826_S1_L004_R1_001.fastq –o FR0826_L004_trimmed.fastq

This has resulted in the following error message:

fastx_clipper: input file (-) has unknown file format (not FASTA or FASTQ), first character = (10)

I'm not entirely sure what this means. I looked into the fastq file using head:

ha5c6n8$ head FR0826_S1_L004_R1_001.fastq

@NS500289:18:H1237BGXX:4:11401:2791:1023 1:N:0:1
NCTACATTGGTTCCTCAGCCAAGCACATACACCAAATGTCTGAACCTGCGGTACCTCTCGTACTGAGCAGGATT
+
#<<AAFAFFFAFFFFF7FF)FF.F<FAFFFFF<FF.AFFF7F.F.FFAFFFF)7AF7F<FFF<<F7FFFFFF7F
@NS500289:18:H1237BGXX:4:11401:19266:1023 1:N:0:1
NAATGGGTCTGCGAGAGCGCCAGCTATCCTGAGGGAAACTTCGGAGGGGGCCGGCTACTAGATGGTTCGCTTAGT
+
#<7AAFAFFFFFFFF7FFAA.AFF<F...<AFFFF7F..FA.A<AA<F7)FA7.FF.<FA..F.A7AF..FFF.A
@NS500289:18:H1237BGXX:4:11401:6297:1023 1:N:0:1
NATAAGAGGGGTGTGGCTAGGCTAAGCGTTTTGAGCTGCATTGCTGCGTGCTTGATGCTTGTCCCTTTTGATCGT

As far as I can tell, this looks like a perfectly normal fastq-formatted file. Can anyone explain what's causing this error? Thanks!

Upvotes: 0

Views: 972

Answers (1)

dkatzel

Reputation: 31658

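The "first character = (10)" in the error message is the ASCII value of the first byte that fastx_clipper read, and 10 is a newline (line feed). A FASTQ file must begin with the "@" of the first record header, so a leading blank line is not allowed. Delete the empty first line and it should be OK.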
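To confirm this before editing the file, you can look at the first byte directly. The following is a minimal sketch in Python, not part of the FASTX-Toolkit; the helper names first_byte and strip_leading_blank_lines are made up for illustration, and it assumes an uncompressed FASTQ file.

```python
def first_byte(path):
    """Return the first byte of the file as an integer, or None if the file is empty."""
    with open(path, "rb") as fh:
        b = fh.read(1)
    return b[0] if b else None


def strip_leading_blank_lines(src, dst):
    """Copy src to dst, skipping any blank lines before the first non-blank line.

    Hypothetical helper: it only removes leading empty lines and does not
    validate the FASTQ records that follow.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        started = False
        for line in fin:
            if not started and line.strip() == b"":
                continue  # still in the leading run of blank lines
            started = True
            fout.write(line)
```

A valid file should give first_byte(path) == 64, the ASCII value of "@"; a value of 10 means it starts with a newline, in which case strip_leading_blank_lines can write a cleaned copy to a new file. If the file already starts with "@", it may also be worth checking the command itself: the error names the input as "(-)", which in many command-line tools denotes standard input, so the dashes in front of i and o should be plain ASCII hyphens rather than en dashes.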

Upvotes: 0
