Reputation: 67
I'm using bash and have tried everything I can think of. The pattern always works in regexr, and it works for the other files. For one file it simply does not work, no matter what: I can't use ? to
make a quantifier lazy, and I can't isolate the dbsnp file. Please help me!
vcfs=$(find . -type f -name "*\.vcf" -printf "%f\n")
echo "$vcfs"
>1000G_omni2.5.b37.vcf
>1000G_phase1.indels.b37.vcf
>1000G_phase1.snps.high_confidence.b37.vcf
>dbsnp_138.b37.vcf
>hapmap_3.3.b37.vcf
>Mills_and_1000G_gold_standard.indels.b37.vcf
thousandG=`expr "$vcfs" : '.*\(1000G_phase1.indels[^\n]*\.vcf\)'`
echo $thousandG
>1000G_phase1.indels.b37.vcf
GoldStandard=`expr "$vcfs" : '.*\(Mills_and_1000G_gold_standard.indels[^\n]*\.vcf\)'`
echo $GoldStandard
>Mills_and_1000G_gold_standard.indels.b37.vcf
dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9][^\n]*\.vcf\)'`
echo $dbsnp
>dbsnp_138.b37.vcf
>hapmap_3.3.b37.vcf
dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9][^\n]*?\.vcf\)'`
echo $dbsnp
>
dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9].*?\.vcf\)'`
echo $dbsnp
>
echo `expr "$vcfs" : '.*\(hapmap_[^\n]*\.vcf\)'`
>hapmap_3.3.b37.vcf
Upvotes: 0
Views: 487
Reputation: 40056
From the man page of expr:
STRING : REGEX Perform pattern matching. The arguments are coerced to strings and the second is considered to be a (basic, a la GNU `grep') regular expression
Basic GNU grep regular expressions do not support the non-greedy (lazy) match modifier.
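To see what that means for the patterns in the question, here is a small sketch, assuming GNU expr and a hand-written copy of the file list (the variable names literal and greedy are made up for illustration). It shows two things: a bare ? is an ordinary character in a basic regex, and inside a bracket expression the backslash is literal, so [^\n] excludes a backslash and the letter n rather than a newline.

```shell
# Hand-written copy of the question's list, one filename per line.
vcfs='1000G_omni2.5.b37.vcf
1000G_phase1.indels.b37.vcf
1000G_phase1.snps.high_confidence.b37.vcf
dbsnp_138.b37.vcf
hapmap_3.3.b37.vcf
Mills_and_1000G_gold_standard.indels.b37.vcf'

# In a basic regex a bare ? is not a quantifier: it only matches a literal "?".
literal=$(expr 'a?b' : '\(a?\)')
echo "$literal"

# [^\n] means "not a backslash and not the letter n", so it happily matches
# the newline and keeps going until the "n" in "Mills_and". The capture then
# ends at the last ".vcf" before that point, which is on the hapmap line.
greedy=$(expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9][^\n]*\.vcf\)')
echo "$greedy"
```

The second echo should reproduce the two-line result from the question (dbsnp followed by hapmap). The other files only appeared to work because an "n" or the end of the string happened to stop the match in time.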
Consider using other tools like sed, awk, grep -P, etc.
In your specific example (the filenames will not contain newlines), you can simply do echo "$vcfs" | grep "^dbsnp_[1-9][3-9][3-9].*\.vcf"
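A minimal sketch of that line-by-line approach, again using a hand-written copy of the list. The trailing $ anchor and the variable names dbsnp, dbsnp_expr and nl are additions for illustration, not part of the original command. The second half is an alternative that stays with expr by putting a real newline character into the bracket expression; it assumes an expr that accepts a newline there, as GNU expr does.

```shell
# Hand-written copy of the question's list, one filename per line.
vcfs='1000G_omni2.5.b37.vcf
1000G_phase1.indels.b37.vcf
1000G_phase1.snps.high_confidence.b37.vcf
dbsnp_138.b37.vcf
hapmap_3.3.b37.vcf
Mills_and_1000G_gold_standard.indels.b37.vcf'

# grep tests each line on its own, so a greedy .* cannot run past the end
# of a filename and no lazy quantifier is needed.
dbsnp=$(printf '%s\n' "$vcfs" | grep '^dbsnp_[1-9][3-9][3-9].*\.vcf$')
echo "$dbsnp"

# Alternative that keeps expr: store an actual newline in a variable and
# exclude it in the bracket expression instead of writing \n.
nl='
'
dbsnp_expr=$(expr "$vcfs" : ".*\(dbsnp_[1-9][3-9][3-9][^$nl]*\.vcf\)")
echo "$dbsnp_expr"
```

Both variables should end up holding only dbsnp_138.b37.vcf. The grep form is the simpler of the two, since it does not depend on how a particular expr treats newlines.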
Upvotes: 1