TonyW

Reputation: 18875

Find a string in a file name (shell script)

I am trying to use a regex to match a file name and extract only a portion of it. My file names follow this pattern: galax_report_for_Sample11_8757.xls, and in this case I want to extract the string Sample11. I tried the following regex, but it does not work. Could someone help with the correct one?

name=galax_report_for_Sample11_8757.xls
sampleName=$([[ "$name" =~ ^[^_]+_([^_]+) ]] && echo ${BASH_REMATCH[2]})

Edit: I just found that this works for me:

sampleName=$([[ "$name" =~ ^[^_]+_([^_]+)_([^_]+)_([^_]+) ]] && echo ${BASH_REMATCH[3]})

Upvotes: 0

Views: 65

Answers (2)

mklement0

Reputation: 437373

In a simple case like this, where the name is essentially a list of fields, each separated by a single separator character, consider using cut to extract the field of interest:

sampleName=$(echo 'galax_report_for_Sample11_8757.xls' | cut -d _ -f 4)

If you're using bash, zsh, or ksh, you can make it a little more efficient by replacing the echo pipeline with a here-string:

sampleName=$(cut -d _ -f 4 <<< 'galax_report_for_Sample11_8757.xls')
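If the file name is already in a variable and you would rather not start an external process at all, the same field split can probably be done in bash itself. This is only a sketch and not part of the cut approach above: the array name fields is made up for illustration, and it assumes bash (read -a is not POSIX). Note that cut counts fields from 1, while bash arrays count from 0.

```shell
#!/usr/bin/env bash
# Sample file name taken from the question.
name='galax_report_for_Sample11_8757.xls'

# Split the name on underscores into an array.
# "fields" is a hypothetical name chosen for this sketch.
# Prefixing the assignment to read keeps the IFS change local to this command.
IFS=_ read -r -a fields <<< "$name"

# cut -f 4 (1-based) corresponds to array index 3 (0-based).
sampleName=${fields[3]}

echo "$sampleName"
```

For a one-off extraction the cut version is easier to read; the array form mainly helps when you need several fields from the same name.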

Upvotes: 2

Andrew Clark

Reputation: 208455

Here is a slightly shorter alternative to the approach you used:

sampleName=$([[ "$name" =~ ^([^_]+_){3}([^_]+) ]] && echo ${BASH_REMATCH[2]})
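To see why the first attempt in the question printed nothing, it may help to look at how BASH_REMATCH is indexed: element 0 is the whole match, and element N is the N-th parenthesized group. The original pattern has only one group, so index 2 is never set. The sketch below assumes bash; the variable names re1, re2, first_group and missing_group are illustrative only.

```shell
#!/usr/bin/env bash
name='galax_report_for_Sample11_8757.xls'

# Pattern from the first attempt: it contains a single capture group.
re1='^[^_]+_([^_]+)'
if [[ $name =~ $re1 ]]; then
  first_group=${BASH_REMATCH[1]}      # second field of the name
  missing_group=${BASH_REMATCH[2]-}   # there is no group 2, so this is empty
fi

# Pattern from this answer: group 1 is repeated three times to skip
# three "field_" prefixes, and group 2 captures the fourth field.
re2='^([^_]+_){3}([^_]+)'
if [[ $name =~ $re2 ]]; then
  sampleName=${BASH_REMATCH[2]}
fi

printf 'first_group=%s\n' "$first_group"
printf 'missing_group=%s\n' "$missing_group"
printf 'sampleName=%s\n' "$sampleName"
```

Storing the pattern in a variable and leaving it unquoted on the right-hand side of =~ is a common way to avoid quoting surprises inside [[ ]].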

Upvotes: 1
