eugheugh

Reputation: 67

How to do non-greedy match in expr

I'm using bash and have tried everything I can think of. The pattern always works in RegExr, and it works for the other files, but for one file it simply does not work, no matter what. I can't use ? to make a quantifier lazy, so I can't isolate the dbsnp file. Please help!

vcfs=$(find . -type f -name "*\.vcf" -printf "%f\n")

echo "$vcfs"

>1000G_omni2.5.b37.vcf

>1000G_phase1.indels.b37.vcf

>1000G_phase1.snps.high_confidence.b37.vcf

>dbsnp_138.b37.vcf

>hapmap_3.3.b37.vcf

>Mills_and_1000G_gold_standard.indels.b37.vcf

thousandG=`expr "$vcfs" : '.*\(1000G_phase1.indels[^\n]*\.vcf\)'`

echo $thousandG

>1000G_phase1.indels.b37.vcf

GoldStandard=`expr "$vcfs" : '.*\(Mills_and_1000G_gold_standard.indels[^\n]*\.vcf\)'`

echo $GoldStandard

>Mills_and_1000G_gold_standard.indels.b37.vcf

dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9][^\n]*\.vcf\)'`

echo $dbsnp

>dbsnp_138.b37.vcf

>hapmap_3.3.b37.vcf

dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9][^\n]*?\.vcf\)'`

echo $dbsnp

>

dbsnp=`expr "$vcfs" : '.*\(dbsnp_[1-9][3-9][3-9].*?\.vcf\)'`

echo $dbsnp

>

echo `expr "$vcfs" : '.*\(hapmap_[^\n]*\.vcf\)'`

>hapmap_3.3.b37.vcf

Upvotes: 0

Views: 487

Answers (1)

Adrian Shum

Reputation: 40056

From the man page of expr:

STRING : REGEX Perform pattern matching. The arguments are coerced to strings and the second is considered to be a (basic, a la GNU `grep') regular expression

Basic GNU grep regular expressions do not support the non-greedy (lazy) match modifier, so ? after a quantifier does not do what you want in expr.

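Consider using another tool. grep -P (Perl-compatible regex) supports lazy quantifiers, and line-oriented tools such as grep, sed, or awk examine each line separately, so a greedy match cannot run on into the next filename.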
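To illustrate the difference, here is a minimal sketch comparing a greedy and a lazy match on a single line. It assumes GNU grep built with PCRE support (the -P option); the sample string is made up from two of the filenames in the question, joined by a space.

```shell
#!/usr/bin/env bash
# Hypothetical sample: two filenames from the question on one line.
line='dbsnp_138.b37.vcf hapmap_3.3.b37.vcf'

# Greedy (basic regex): .* extends to the LAST ".vcf" on the line.
greedy=$(printf '%s\n' "$line" | grep -o 'dbsnp_.*\.vcf')

# Lazy (PCRE, needs grep -P): .*? stops at the FIRST ".vcf".
lazy=$(printf '%s\n' "$line" | grep -oP 'dbsnp_.*?\.vcf')

echo "$greedy"  # dbsnp_138.b37.vcf hapmap_3.3.b37.vcf
echo "$lazy"    # dbsnp_138.b37.vcf
```

If grep -P is not available on your system, the line-by-line approach avoids the need for a lazy quantifier altogether.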


In your specific case (filenames will not contain newlines), you can simply run echo "$vcfs" | grep "^dbsnp_[1-9][3-9][3-9].*\.vcf". grep tests each line on its own, so the match cannot spill over into the next filename.
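A minimal sketch of that line-by-line approach, assuming the list is in vcfs with one name per line (it is hard-coded here, copied from the question, purely for illustration). It also hints at why the expr version misbehaves: inside a bracket expression in a basic regex, \n is apparently not treated as a newline escape, so [^\n] excludes a backslash and the letter n rather than a newline. That would explain why the dbsnp match ran on through hapmap_3.3.b37.vcf and only stopped at the n in the Mills_and name.

```shell
#!/usr/bin/env bash
# File list copied from the question, one name per line.
vcfs='1000G_omni2.5.b37.vcf
1000G_phase1.indels.b37.vcf
1000G_phase1.snps.high_confidence.b37.vcf
dbsnp_138.b37.vcf
hapmap_3.3.b37.vcf
Mills_and_1000G_gold_standard.indels.b37.vcf'

# grep checks each line separately; ^ and $ anchor the pattern to a whole name.
dbsnp=$(printf '%s\n' "$vcfs" | grep '^dbsnp_[1-9][3-9][3-9].*\.vcf$')
thousandG=$(printf '%s\n' "$vcfs" | grep '^1000G_phase1\.indels.*\.vcf$')

echo "$dbsnp"      # dbsnp_138.b37.vcf
echo "$thousandG"  # 1000G_phase1.indels.b37.vcf
```

Quoting the variables in echo keeps any whitespace in the result intact, and $(...) is easier to nest and read than backticks.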

Upvotes: 1
