Reputation: 18875
I am trying to use a regex to match a file name and extract only a portion of it. My file names follow this pattern: galax_report_for_Sample11_8757.xls, and I want to extract the string Sample11 in this case. I tried the following regex, but it does not work for me. Could someone help with the correct regex?
name=galax_report_for_Sample11_8757.xls
sampleName=$([[ "$name" =~ ^[^_]+_([^_]+) ]] && echo ${BASH_REMATCH[2]})
Edit: I just found that this works for me:
sampleName=$([[ "$name" =~ ^[^_]+_([^_]+)_([^_]+)_([^_]+) ]] && echo "${BASH_REMATCH[3]}")
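For anyone comparing the two attempts, here is a minimal sketch of what each regex captures, assuming bash (BASH_REMATCH is bash-specific). The variable names re_first, re_edit, first_g1, first_g2 and edit_g3 are made up for this illustration. The first regex has only one capture group, so index 2 is never set; the edited regex has three groups, and the third one lands on the sample name.

```shell
#!/usr/bin/env bash
name='galax_report_for_Sample11_8757.xls'

# First attempt: a single capture group, so only indices 0 and 1 are filled.
re_first='^[^_]+_([^_]+)'
if [[ $name =~ $re_first ]]; then
    first_g1=${BASH_REMATCH[1]}   # second underscore-separated field
    first_g2=${BASH_REMATCH[2]}   # no such group: expands to an empty string
fi

# Edited version: three capture groups, one per field after the first.
re_edit='^[^_]+_([^_]+)_([^_]+)_([^_]+)'
if [[ $name =~ $re_edit ]]; then
    edit_g3=${BASH_REMATCH[3]}    # fourth underscore-separated field
fi

printf 'first attempt: group 1 = "%s", group 2 = "%s"\n' "$first_g1" "$first_g2"
printf 'edited regex:  group 3 = "%s"\n' "$edit_g3"
```

Storing the pattern in a variable and expanding it unquoted on the right-hand side of =~ is a common way to avoid quoting surprises inside [[ ]].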
Upvotes: 0
Views: 65
Reputation: 437373
In a simple case like this, where you essentially have a list of values, each separated by a single instance of a separator character, consider using cut to extract the field of interest:
sampleName=$(echo 'galax_report_for_Sample11_8757.xls' | cut -d _ -f 4)
If you're using bash, zsh, or ksh, you can make it a little more efficient by feeding the string to cut with a here-string instead of a pipeline:
sampleName=$(cut -d _ -f 4 <<< 'galax_report_for_Sample11_8757.xls')
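As a small sketch of how this would look with the file name held in a variable, as in the question (assuming bash for the here-string; the names sampleName and lastField are just for illustration): cut splits on every underscore and numbers the fields from 1, so the sample name is field 4.

```shell
#!/usr/bin/env bash
name='galax_report_for_Sample11_8757.xls'

# Fields when splitting on "_":
#   1=galax  2=report  3=for  4=Sample11  5=8757.xls
sampleName=$(cut -d _ -f 4 <<< "$name")

# The same idea for the trailing field, including the extension.
lastField=$(cut -d _ -f 5 <<< "$name")

printf '%s\n' "$sampleName" "$lastField"
```

Note that this relies on the sample name always being the fourth field; a file name with a different number of underscores before it would need a different field number.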
Upvotes: 2
Reputation: 208455
Here is a slightly shorter alternative to the approach you used:
sampleName=$([[ "$name" =~ ^([^_]+_){3}([^_]+) ]] && echo "${BASH_REMATCH[2]}")
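To see why the index is 2 here, the following sketch (assuming bash; the variable names re, whole, repeated and sampleName are illustrative) prints each element of BASH_REMATCH. Groups are numbered by their opening parenthesis, so the repeated group ([^_]+_) is group 1 and, after three repetitions, holds only its last iteration, while the second pair of parentheses is group 2.

```shell
#!/usr/bin/env bash
name='galax_report_for_Sample11_8757.xls'
re='^([^_]+_){3}([^_]+)'

if [[ $name =~ $re ]]; then
    whole=${BASH_REMATCH[0]}       # everything the regex matched
    repeated=${BASH_REMATCH[1]}    # last iteration of the repeated group
    sampleName=${BASH_REMATCH[2]}  # the field that follows the three prefixes
fi

printf 'whole match: %s\n' "$whole"
printf 'group 1:     %s\n' "$repeated"
printf 'group 2:     %s\n' "$sampleName"
```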
Upvotes: 1