Open the way

Reputation: 27379

Remove from the beginning till certain part in a string

I work with strings like

abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf

and I need a new string with everything removed from the beginning up to the last occurrence of "_", keeping that last "_" and the characters after it (there can be 3, 4, or any number of them)

so in this case I would get

_adf

How could I do it with "sed" or another bash tool?

Upvotes: 0

Views: 3184

Answers (8)

ysth

Reputation: 98398

Just for fun:

echo abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf | tr _ '\n' | tail -n 1 | rev | tr '\n' _ | rev

Upvotes: 0

brandizzi

Reputation: 27070

Just group the last non-underscore characters preceded by the last underscore with \(_[^_]*\), then reference this group with \1:

 sed 's/^.*\(_[^_]*\)$/\1/'

Result:

$ echo abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf | sed 's/^.*\(_[^_]*\)$/\1/'
_adf

Upvotes: 1

DavidO

Reputation: 13942

In Perl, you could do this:

my $string = "abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf";

if ( $string =~ m/(_[^_]+)$/ ) {
    print $1;
}

[Edit] A Perl one-liner approach (i.e., it can be run from bash directly):

perl -lne 'm/(_[^_]+)$/ && print $1;' infile > outfile

Or using substitution:

perl -pe 's/.*(_[^_]+)$/$1/' infile > outfile

Upvotes: 1

glenn jackman

Reputation: 247042

If you have strings like these in bash variables (I don't see that specified in the question), you can use parameter expansion:

s="abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf"
t="_${s##*_}"
echo "$t"  # ==> _adf
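One thing to watch with the expansion above: if the string has no underscore at all, ${s##*_} leaves it untouched, so t ends up as "_" plus the whole string. A minimal sketch of a guarded version follows; last_segment is a hypothetical helper name used only for illustration, not something from the question.

```shell
# last_segment (hypothetical helper): prints the last "_" plus whatever
# follows it, or prints nothing and returns 1 when there is no "_".
last_segment() {
    case $1 in
        *_*) printf '%s\n' "_${1##*_}" ;;  # strip the longest prefix ending in "_"
        *)   return 1 ;;                   # no underscore: nothing to extract
    esac
}

last_segment "abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf"   # prints _adf
last_segment "nounderscore" || echo "no underscore found"
```

The case pattern keeps everything inside the shell, so no external process is started per string.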

Upvotes: 1

Larry Morell

Reputation: 502

Regular expression pattern matching is greedy. Hence ^.*_ will match all characters up to and including the last _. Then just put the underscore back in:

echo abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf | sed 's/^.*_/_/'
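To see how the greedy match behaves on edge cases, here is a small sketch run over a few made-up inputs. The function name strip_to_last_underscore and the sample strings are assumptions for illustration only.

```shell
# strip_to_last_underscore (hypothetical name): reads lines on stdin and
# replaces everything up to and including the last "_" with a single "_".
strip_to_last_underscore() {
    sed 's/^.*_/_/'
}

printf '%s\n' 'abc_def_ghi' 'nounderscore' 'trailing_' | strip_to_last_underscore
# Output:
#   _ghi
#   nounderscore
#   _
```

A line without any underscore does not match the pattern, so sed passes it through unchanged.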

Upvotes: 5

Greg Jackson

Reputation: 118

Do you need to modify the string, or just find everything after the last underscore? The regex to find the last _{anything} would be /(_[^_]+)$/ ($ matches the end of the string), or if you also want to match a trailing underscore with nothing after it, /(_[^_]*)$/.
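As a rough illustration of the difference between the two patterns, the sketch below uses grep -oE to print only the matched part. The -o option is not in POSIX but is available in GNU, BSD and BusyBox grep; find_plus and find_star are hypothetical names, and the sample strings are made up.

```shell
# find_plus: last "_" followed by at least one non-underscore character.
find_plus() { grep -oE '_[^_]+$'; }
# find_star: also accepts a trailing "_" with nothing after it.
find_star() { grep -oE '_[^_]*$'; }

echo 'abc_def_ghi' | find_plus                        # prints _ghi
echo 'abc_def_'    | find_star                        # prints _
echo 'abc_def_'    | find_plus || echo '(no match)'   # prints (no match)
```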

Unless you really need to modify the string in place instead of just finding this piece, or you really want to do this from the command line instead of a script, this regex is a bit simpler (you tagged this with perl, so I wasn't sure quite how committed to using just the command line as opposed to a simple script you were).

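If you do need to modify the string in place, use sed -i -E 's/^.*(_[^_]+)$/\1/' myfile. The -E flag enables extended regular expressions, so ( ) and + act as grouping and repetition instead of literal characters, and the leading ^.* is what actually removes the text before the last underscore. The -i flag overwrites the old file with the new one (this is the GNU sed form; BSD/macOS sed expects a backup suffix after -i, e.g. -i ''). If you want to create a new file and not clobber the old one, sed -E 's/.../.../' oldfile > newfile. A g after the s/// applies the substitution to every match on a line rather than only the first one; because this pattern is anchored with $, it can match at most once per line, so g is not needed here.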

If the string is not by itself at the end of the line, but rather embedded in other text and just separated by whitespace, replace the $ with \s, which will match a whitespace character (the end of a word). Note that \s is supported by Perl and GNU sed; with other sed implementations, use [[:space:]] instead.

Upvotes: 1

Toto

Reputation: 91508

A Perl way:

echo 'abc_dsdsds_ss_gsgsdsfsdf_ewew_wewewewewew_adf' | \
perl -e 'print ((split/(_)/,<>)[-2..-1])'

output:

_adf

Upvotes: 0

Ignacio Vazquez-Abrams

Reputation: 799170

sed -E 's/^(.*)_([^_]*)$/_\2/' < input.txt

Upvotes: 1
