Reputation: 9
I have a variable that holds values like this:
VAL1="59809_RH_EA_TEST_1_P1_Q"
or
VAL1="89292-RH_EA_TEST_1_P1_Q"
How can I extract only RH_EA_TEST_1_P1_Q using sed or any other Bash command?
Upvotes: 0
Views: 498
Reputation: 5768
With awk
#!/bin/sh
# remove numeric characters before any alphabetic characters
rnum () {
    awk '
        function ch(i) { return substr(ARGV[1], i, 1) }  # ith character
        BEGIN {
            a = "[a-zA-Z]" ; d = "[0-9]"
            n = length(ARGV[1]); i = 1
            # before the first letter: keep everything except digits
            for ( ; i <= n && ch(i) !~ a; i++) if (ch(i) !~ d) ans = ans ch(i)
            # from the first letter on: keep everything
            for ( ; i <= n ; i++) ans = ans ch(i)
            print ans
        }
    ' "$1"
}
# usage
rnum 59809_RH_EA_TEST_1_P1_Q
rnum 89292-RH_EA_TEST_1_P1_Q
rnum "123 abc 456 efg"
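Note that rnum removes only the digits, so the separator survives: the first call prints _RH_EA_TEST_1_P1_Q, with the leading underscore. If the separator should go as well, a shorter sketch with awk's sub() may be enough. It assumes the prefix is one run of digits followed by at most one _ or -, and the name rnum_sep is made up for this example:

```shell
#!/bin/sh
# rnum_sep is a hypothetical helper name used only for this illustration.
# It strips leading digits plus one optional "_" or "-" separator.
rnum_sep () {
    printf '%s\n' "$1" | awk '{ sub(/^[0-9]+[_-]?/, ""); print }'
}

# usage
rnum_sep 59809_RH_EA_TEST_1_P1_Q
rnum_sep 89292-RH_EA_TEST_1_P1_Q
```

Both calls print RH_EA_TEST_1_P1_Q.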
Upvotes: 0
Reputation: 92854
Alternative approaches:
VAL1="59809_RH_EA_TEST_1_P1_Q"
sed approach:
sed 's/^[^_-]*[_-]\(.*\)/\1/' <<< "$VAL1"
cut approach (splits on underscores only):
cut -d'_' -f2- <<< "$VAL1"
The output (for both approaches):
RH_EA_TEST_1_P1_Q
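For the hyphen-separated value from the question, the two approaches diverge, because cut -d'_' knows nothing about the hyphen. A small sketch to show the difference; the function names strip_sed and strip_cut are invented here, and the sed expression is a simplified variant that assumes the prefix consists of digits only:

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.
strip_sed () {
    # Drop leading digits and the single "_" or "-" that follows them.
    printf '%s\n' "$1" | sed 's/^[0-9]*[_-]//'
}
strip_cut () {
    # Keep everything after the first underscore.
    printf '%s\n' "$1" | cut -d'_' -f2-
}

strip_sed "89292-RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
strip_cut "89292-RH_EA_TEST_1_P1_Q"   # EA_TEST_1_P1_Q (the "RH" part is lost)
```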
Upvotes: 0
Reputation: 6995
One way is to use Bash regex matching.
VAL1="59809_RH_EA_TEST_1_P1_Q"
if [[ $VAL1 =~ ^[0-9]+_(.*) ]]
then
    VAL1=${BASH_REMATCH[1]}
fi
This assumes the number is always followed by an underscore. To make the underscore optional, you could use:
if [[ $VAL1 =~ ^[0-9]+_?(.*) ]]
then
    VAL1=${BASH_REMATCH[1]}
fi
Bash regex matching works as a test (the [[ =~ ]] expression returns 0 if there is a match), and sub-expressions (defined by parentheses around the areas of interest in the pattern) are available as elements of the array BASH_REMATCH, starting at index 1. Extended regular expressions are used.
In case anyone wonders, no double quoting is required anywhere in the above. [[ ]] is special shell syntax (not a command with arguments like [ or test), so no word splitting is performed inside it. The assignment does not perform word splitting either.
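Neither pattern above strips the hyphen in the second value from the question. One possible variation is to write the separator as a bracket expression, [_-]. A minimal sketch, assuming the prefix is always digits followed by at most one _ or -; strip_num_prefix is a made-up name:

```shell
#!/usr/bin/env bash
# strip_num_prefix is a hypothetical helper, not a Bash builtin.
strip_num_prefix() {
    local val=$1
    # Leading digits, then an optional "_" or "-"; group 1 captures the rest.
    if [[ $val =~ ^[0-9]+[_-]?(.*) ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        # No numeric prefix: print the value unchanged.
        printf '%s\n' "$val"
    fi
}

strip_num_prefix "59809_RH_EA_TEST_1_P1_Q"
strip_num_prefix "89292-RH_EA_TEST_1_P1_Q"
```

Both calls print RH_EA_TEST_1_P1_Q. This needs Bash, since [[ =~ ]] and BASH_REMATCH are not available in a plain POSIX sh.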
Upvotes: 1
Reputation: 23667
With Parameter Expansion
$ VAL1='59809_RH_EA_TEST_1_P1_Q'
$ echo "${VAL1#*[_-]}"
RH_EA_TEST_1_P1_Q
$ VAL1='89292-RH_EA_TEST_1_P1_Q'
$ echo "${VAL1#*[_-]}"
RH_EA_TEST_1_P1_Q
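${VAL1#*[_-]} removes the shortest leading match of the pattern *[_-], that is, everything up to and including the first _ or -.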
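The single # matters here, because the part to keep contains underscores itself. A short sketch contrasting # (shortest matching prefix) with ## (longest matching prefix); the variable names short and long are chosen only for this example:

```shell
#!/bin/sh
VAL1='59809_RH_EA_TEST_1_P1_Q'

short=${VAL1#*[_-]}    # removes up to the first "_" or "-"
long=${VAL1##*[_-]}    # removes up to the last "_" or "-"

echo "$short"   # RH_EA_TEST_1_P1_Q
echo "$long"    # Q
```

Both forms are standard POSIX parameter expansions, so this does not depend on Bash.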
Upvotes: 1