Stuber

Reputation: 477

sed to make substitution on first character left to right

I want a sed substitution that matches at the first occurrence of a character, reading left to right, and captures everything from that match onward. The following does not work:

echo "a_4_3_2_1" | sed 's/.*\(_.*\)/\1/'  

and

echo "a_3_2_1" | sed 's/.*\(_.*\)/\1/'

Both output just _1.

I want the output to be _4_3_2_1 and _3_2_1, respectively.

How should a sed substitution be written to match left to right and capture everything to the right of the match?

Upvotes: 0

Views: 1018

Answers (3)

user2141130

Reputation: 1004

echo "a_3_2_1" | sed -r 's/^.(.*)/\1/'

sed regexes are "greedy", meaning each quantifier matches as much as it can while still letting the whole pattern match. So your examples fail because the first .* consumes everything up to the last underscore, leaving only _1 for _.* to capture.
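To make the greedy behaviour visible, here is a small sketch that wraps both halves of the original pattern in groups and prints them in brackets. The variable name result and the bracket markers are only for illustration.

```shell
# Group 1 is the leading .* and group 2 is the _.* from the question.
# Printing both shows how much the greedy .* swallowed.
result=$(printf '%s\n' "a_4_3_2_1" | sed 's/\(.*\)\(_.*\)/[\1][\2]/')
echo "$result"   # [a_4_3_2][_1]
```

The first group takes everything up to the last underscore, which is why only _1 is left for the second group.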

Upvotes: 1

clt60

Reputation: 63892

For example:

echo "a_4_3_2_1" | sed 's/[^_]*\(_.*\)/\1/'

or

echo "a_4_3_2_1" | sed 's/[^_]*//'

or

echo "a_4_3_2_1" | grep -oP '^[^_]*\K.*'

In bash, piping from echo is a Useless Use Of Echo (UUOE). It is better to use a here-string:

grep -oP '^[^_]*\K.*' <<<"a_4_3_2_1"

To skip exactly the first character:

grep -oP '^.\K.*' <<< "a_4_3_2_1"

or

sed 's/.//' <<< "a_4_3_2_1"

Pure bash:

s="a_4_3_2_1"
echo "${s:1}"
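If the goal is to keep everything from the first underscore rather than to drop exactly one character, parameter expansion can do that too. This is a sketch using only POSIX expansions; the variable names prefix and rest are illustrative.

```shell
s="a_4_3_2_1"
# %%_* removes the longest suffix that starts with an underscore,
# leaving whatever precedes the first underscore ("a" here).
prefix=${s%%_*}
# Removing that prefix from the front leaves the first underscore and the rest.
rest=${s#"$prefix"}
echo "$rest"   # _4_3_2_1
```

Note that if s contains no underscore, prefix equals the whole string and rest ends up empty.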

Upvotes: 3

MassPikeMike

Reputation: 682

The good news is that your substitution is already written to "match left to right and capture everything right of match"! But there are two things to keep in mind about why your current code does not work.

First, the pattern you wrote before the parenthesized group is .* where . matches any character and * matches zero or more occurrences. So that part matches zero or more occurrences of any character. But how many occurrences is zero or more?

That's affected by the second thing. By default, matches are greedy: they match as many characters as they can without causing the overall match to fail. In this case, .* can match everything up to the last underscore (the one before the final 1).

You can fix your pattern to match just a single character: . without the *:

echo "a_4_3_2_1" | sed 's/.\(_.*\)/\1/'  
_4_3_2_1

Some tools support minimal (non-greedy) matching, but sed does not.
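A common workaround in sed is to replace the leading .* with a negated bracket expression, which cannot run past the delimiter and so stops at the first one. The sketch below assumes the delimiter is an underscore, as in the question; the variable names are illustrative.

```shell
# [^_]* matches only non-underscore characters, so it ends at the first
# underscore, giving the same result a minimal match would here.
out1=$(printf '%s\n' "a_4_3_2_1" | sed 's/^[^_]*\(_.*\)/\1/')
out2=$(printf '%s\n' "a_3_2_1"   | sed 's/^[^_]*\(_.*\)/\1/')
echo "$out1"   # _4_3_2_1
echo "$out2"   # _3_2_1
```

Unlike the single . fix, this also works when more than one character precedes the first underscore.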

Upvotes: 2
