Reputation: 5939
I have the following string: connect_2014-06-03.csv
and the following regex: (\S+)[_-]
What I want to do is extract only the first word, i.e. connect, from the string, but for some reason the regex matches connect_2014-06- instead. I have tried to make it non-greedy by using (\S+)[_-]? but that does not seem to work.
Does anyone have any idea?
Upvotes: 0
Views: 75
Reputation: 54323
It's the + that is greedy, not the overall regex. You need to make the \S+ inside your capture group non-greedy by putting the ? directly after the +:
(\S+?)[_-]
Also see this regex101.
Maybe it makes sense not to match any non-whitespace character at all, but instead just use ([a-z]+)_ ? Remember, dash and underscore are also non-whitespace, so \S will happily consume them.
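To see the difference side by side, here is a minimal sketch in Python's re module. Python is my choice for illustration only (the question does not say which regex engine is in use), and the variable names are made up; the patterns themselves are the ones from the question and this answer.

```python
import re

s = "connect_2014-06-03.csv"

# Greedy: \S+ consumes the whole string, then backtracks only as far
# as the LAST '_' or '-' it can find.
greedy = re.match(r"(\S+)[_-]", s)
print(greedy.group(0))    # connect_2014-06-
print(greedy.group(1))    # connect_2014-06

# The attempt from the question: the ? makes the separator optional,
# it does not make \S+ lazy, so \S+ keeps everything.
optional = re.match(r"(\S+)[_-]?", s)
print(optional.group(1))  # connect_2014-06-03.csv

# Lazy: \S+? expands one character at a time and stops at the FIRST
# position followed by '_' or '-'.
lazy = re.match(r"(\S+?)[_-]", s)
print(lazy.group(1))      # connect

# Restricting the allowed characters avoids the question of greediness.
letters = re.match(r"([a-z]+)_", s)
print(letters.group(1))   # connect
```

The second case is why the attempt in the question had no visible effect: a ? after the character class applies to the class, not to the \S+ before it.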
Upvotes: 4
Reputation: 35198
There are two easy solutions to this.
You can explicitly state that you want non-greedy matching by adding a ? to your quantifier.
(\S+?)[_-]
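Or you can make your character class limit itself, so that it stops before the first separator (the hyphen goes last in the class so it is taken literally rather than as a range):
([^\s_-]*)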
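As a rough sketch of the second option, again in Python purely for illustration (the helper name first_word and the extra file names are hypothetical, not from the question):

```python
import re

# Everything up to the first whitespace, underscore, or hyphen.
# The hyphen is placed last in the class so it is a literal.
FIRST_WORD = re.compile(r"[^\s_-]*")

def first_word(name: str) -> str:
    # re.match anchors at the start of the string; because of the *,
    # the pattern always matches (possibly the empty string).
    return FIRST_WORD.match(name).group(0)

print(first_word("connect_2014-06-03.csv"))  # connect
print(first_word("disconnect-2014.csv"))     # disconnect
print(first_word("plain.csv"))               # plain.csv
```

Note that with no separator in the name, the whole string comes back, which may or may not be what you want.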
Upvotes: 1
Reputation: 784898
You can use BASH string manipulation instead of regex:
s='connect_2014-06-03.csv'
echo "${s%%_*}"
connect
If you prefer a regex, Bash's =~ operator works too:
[[ "$s" =~ ^([^_]+) ]] && echo "${BASH_REMATCH[1]}"
connect
Upvotes: 1