Reputation: 49
I have strings like the following, for example:
input1 = abc-def-ghi-jkl
input2 = mno-pqr-stu-vwy
I want to extract the first word between the "-" characters.
For the first string I want to get: def
If the input is the second string, I want to get: pqr
I want to use the sed command. Could you help me, please?
Upvotes: 1
Views: 5898
Reputation:
grep solution (in my opinion this is the most natural approach, as you are only trying to find matches to a regular expression - you are not looking to edit anything, so there should be no need for the more advanced command sed)
grep -oP '^[^-]*-\K[^-]*(?=-)' << EOF
> abc-qrs-bobo-the-clown
> 123-45-6789
> blah-blah-blah
> no dashes here
> mahi-mahi
> EOF
Output
qrs
45
blah
Explanation
Look at the inputs first, included here for completeness as a heredoc (more likely you would name your file as the last argument to grep). The solution requires at least two dashes to be present in the string; in particular, for mahi-mahi it will find no match. If you want to find the second mahi as a match, you can remove the lookahead assertion at the end of the regular expression.
The regular expression works as follows. First note the command options: -o to return only the matched substring, not the entire line; and -P to use Perl extensions. Then, the regular expression: start from the beginning of the line (^); look for zero or more non-dash characters followed by a dash, and then (\K) discard this part of the required match from the substrings found to match the pattern. Then look for zero or more non-dash characters again; this is the part the command returns. Finally, require a dash following this pattern, but do not include it in the match. This is done with a lookahead (marked by (?= ... )).
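The variant without the lookahead, as described above, might look like the following. This is only a sketch, assuming GNU grep built with PCRE support (-P) and the same sample lines; the variable name result is just for illustration.

```shell
# Same pattern without the trailing (?=-) lookahead, so one dash is enough.
# Assumes GNU grep with -P (PCRE) support.
result=$(grep -oP '^[^-]*-\K[^-]*' << 'EOF'
abc-qrs-bobo-the-clown
123-45-6789
blah-blah-blah
no dashes here
mahi-mahi
EOF
)
printf '%s\n' "$result"
```

With this version mahi-mahi yields mahi, while the line without any dash still produces no match.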
Upvotes: 0
Reputation: 19982
When you want to use sed, you can choose between solutions like
# Double processing
echo "$input1" | sed 's/[^-]*-//;s/-.*//'
# Normal approach
echo "$input1" | sed -r 's/^[^-]*-([^-]*).*/\1/'
# Funny alternative
echo "$input1" | sed -r 's/(^[^-]*-|-.*)//g'
The obvious "external" tool would be cut. You can also look at a Bash builtin solution like
[[ ${input1} =~ ([^-]*)-([^-]*) ]] && printf %s "${BASH_REMATCH[2]}"
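To see how the "double processing" sed command behaves on the inputs from the question and on two edge cases, a small sketch like this may help. The function name second_field is hypothetical, and the edge-case strings are made up for illustration.

```shell
# second_field is a hypothetical helper name, not part of any tool.
# It strips everything up to the first dash, then everything from the next dash on.
second_field() {
  printf '%s\n' "$1" | sed 's/[^-]*-//;s/-.*//'
}

for s in 'abc-def-ghi-jkl' 'mno-pqr-stu-vwy' 'mahi-mahi' 'no dashes here'; do
  printf '%s -> %s\n' "$s" "$(second_field "$s")"
done
```

Note that a string without any dash passes through unchanged, because neither substitution finds anything to replace.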
Upvotes: 0
Reputation: 4688
With bash:
var='input1 = abc-def-ghi-jkl'
var=${var#*-} # remove shortest prefix `*-`, this removes `input1 = abc-`
echo "${var%%-*}" # remove longest suffix `-*`, this removes `-ghi-jkl`
Or with awk:
awk -F'-' '{print $2}' <<<'input1 = abc-def-ghi-jkl'
Use - as the input field separator and print the second field.
Or with cut:
cut -d'-' -f2 <<<'input1 = abc-def-ghi-jkl'
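One difference between the cut and awk commands shows up on lines that contain no dash at all. The sketch below uses a made-up sample line and illustrative variable names; it relies on the standard -s option of cut.

```shell
# A line with no dash: cut prints the whole line unless -s is given,
# while awk prints an empty second field.
line='no dashes here'
cut_plain=$(printf '%s\n' "$line" | cut -d'-' -f2)
cut_s=$(printf '%s\n' "$line" | cut -s -d'-' -f2)
awk_out=$(printf '%s\n' "$line" | awk -F'-' '{print $2}')
printf 'cut:    [%s]\ncut -s: [%s]\nawk:    [%s]\n' "$cut_plain" "$cut_s" "$awk_out"
```

If lines without a dash should be skipped entirely, cut -s does that, and in awk a condition such as NF > 1 before the action should have the same effect.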
Upvotes: 1
Reputation: 18611
Use
sed 's,^[^-]*-\([^-]*\).*,\1,' file
The string after the first - will be captured up to the second -, the rest of the line will be matched, and the whole matched line will be replaced with the group text.
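Lines that contain no dash do not match the pattern, so the command above prints them unchanged. If that is not wanted, one possible variation (a sketch, not part of the original answer) is to combine -n with the p flag so that only lines where the substitution succeeded are printed. The sample lines and the variable name result are illustrative.

```shell
# -n suppresses automatic printing; the trailing p prints only lines
# on which the substitution actually happened.
result=$(printf '%s\n' 'abc-def-ghi-jkl' 'no dashes here' 'mno-pqr-stu-vwy' |
  sed -n 's,^[^-]*-\([^-]*\).*,\1,p')
printf '%s\n' "$result"
```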
Upvotes: 2