awk: regular expression ... exceeds implementation size limit

Question

Does anyone have any insight into this error? Can it be fixed and, if so, how best?

awk: line 1: regular expression /splice_acc ... exceeds implementation size limit

The expression used in my bash script was...

grep -v '^##' $IN | awk 'BEGIN{FS=" "; OFS=" "} $1~/#CHROM/ || $10~/^1\/1/ && ($11~/^1\/0/ || $11~/^0\/0/ || $11~/^0\/1/) && $1~/^[0-9X]*$/ && /splice_acceptor_variant|splice_donor_variant|splice_region_variant|stop_lost|start_lost|stop_gained|missense_variant|coding_sequence_variant|inframe_insertion|disruptive_inframe_insertion|inframe_deletion|disruptive_inframe_deletion|exon_variant|exon_loss_variant|exon_loss_variant|duplication|inversion|frameshift_variant|feature_ablation|duplication|gene_fusion|bidirectional_gene_fusion|rearranged_at_DNA_level|miRNA|initiator_codon_variant|start_retained/ {$3=$7=""; print $0}' | sed 's/ */ /g' | awk '{split($9,a,":"); split(a[2],b,","); if (b[1]>b[2] || $1~/#CHROM/) print $0}' > $OUT

Thanks for any help given, very much appreciated.

Thank you for your suggestions!

A sample of the input is:

Chr1 926694 . C T 2510.49 . AB=0;ABP=0;AC=2;AF=1;AN=2;AO=82;CIGAR=1X;DP=85;DPB=85;DPRA=0;EPP=6.82362;EPPR=9.52472;GTI=0;LEN=1;MEANALT=1;MQM=57.0854;MQMR=60;NS=1;NUMALT=1;ODDS=108.152;PAIRED=1;PAIREDR=1;PAO=0;PQA=0;PQR=0;PRO=0;QA=2916;QR=42;RO=3;RPL=46;RPP=5.65844;RPPR=9.52472;RPR=36;RUN=1;SAF=45;SAP=4.70511;SAR=37;SRF=0;SRP=9.52472;SRR=3;TYPE=snp;ANN=T|upstream_gene_variant|MODIFIER|AT1G03720|AT1G03720|transcript|AT1G03720.1|protein_coding||c.-321G>A|||||321|,T|downstream_gene_variant|MODIFIER|AT1G03700|AT1G03700|transcript|AT1G03700.1|protein_coding||c.*4850C>T|||||4793|,T|downstream_gene_variant|MODIFIER|AT1G03710|AT1G03710|transcript|AT1G03710.1|protein_coding||c.*2407C>T|||||1968|,T|downstream_gene_variant|MODIFIER|AT1G03730|AT1G03730|transcript|AT1G03730.1|protein_coding||c.*4323G>A|||||4134|,T|downstream_gene_variant|MODIFIER|AT1G03710|AT1G03710|transcript|AT1G03710.2|protein_coding||c.*2407C>T|||||2339|,T|intergenic_region|MODIFIER|AT1G03720-AT1G03730|AT1G03720-AT1G03730|intergenic_region|AT1G03720-AT1G03730|||n.926694C>T|||||| GT:DP:AD:RO:QR:AO:QA:GL 1/1:85:3,82:3:42:82:2916:-252.316,-21.6676,0
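
One possible workaround, sketched under the assumption that the error comes from the awk implementation (the message format looks like mawk) refusing to compile one very long alternation: test each effect name separately instead of building a single large regular expression. The names below are copied from the pattern in the command above with the repeated entries dropped; `EFFECTS` and `keep_effects` are hypothetical names introduced only for this illustration.

```shell
# Effect names taken from the alternation in the question (duplicates dropped).
EFFECTS="splice_acceptor_variant splice_donor_variant splice_region_variant \
stop_lost start_lost stop_gained missense_variant coding_sequence_variant \
inframe_insertion disruptive_inframe_insertion inframe_deletion \
disruptive_inframe_deletion exon_variant exon_loss_variant duplication \
inversion frameshift_variant feature_ablation gene_fusion \
bidirectional_gene_fusion rearranged_at_DNA_level miRNA \
initiator_codon_variant start_retained"

# keep_effects is a hypothetical helper name, not part of the original script.
# It prints every input line that contains at least one of the effect names.
keep_effects() {
  awk -v list="$EFFECTS" '
    BEGIN { n = split(list, eff, " ") }   # one array entry per effect name
    {
      # index() does a plain substring search, so no large regex is compiled.
      for (i = 1; i <= n; i++)
        if (index($0, eff[i]) > 0) { print; next }
    }'
}

# Small demonstration on two made-up lines; only the first contains a listed effect.
printf 'r1 missense_variant\nr2 upstream_gene_variant\n' | keep_effects
```

Because the effect names contain no regex metacharacters, a substring test with `index()` should select the same lines as the original alternation. In the pipeline this would be an extra stage (for example `... | keep_effects | ...`), with the long `/splice_acceptor_variant|.../` term removed from the first awk and the genotype tests left where they are. Note that a separate stage would also filter the `#CHROM` header line, so that line would need its own exception. Another option, if it is installed, may be to run the unchanged script with gawk, which as far as I know does not impose this limit.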
