Manish

Reputation: 9

How to remove numeric characters before any alphabetic characters

I have a variable whose values look like this:

VAL1="59809_RH_EA_TEST_1_P1_Q" 

or

VAL1="89292-RH_EA_TEST_1_P1_Q"

How can I get only RH_EA_TEST_1_P1_Q using sed or any other bash command?

Upvotes: 0

Views: 498

Answers (4)

slitvinov

Reputation: 5768

With awk

#!/bin/sh

rnum () { # remove numeric characters before any alphabetic characters
 awk '
 function ch(i) { return substr(ARGV[1], i, 1) } # ith character
 BEGIN {
         a = "[a-zA-Z]" ; d = "[0-9]"
         n = length(ARGV[1]); i = 1
         for ( ; i <= n && ch(i) !~ a; i++) if (ch(i) !~ d) ans = ans ch(i)
         for ( ; i <= n              ; i++)                 ans = ans ch(i)
         print ans
       }
 ' "$1"
}

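# usage (only the digits are removed; other leading characters are kept)
rnum 59809_RH_EA_TEST_1_P1_Q    # prints _RH_EA_TEST_1_P1_Q
rnum 89292-RH_EA_TEST_1_P1_Q    # prints -RH_EA_TEST_1_P1_Q
rnum "123 abc 456 efg"          # prints " abc 456 efg" (leading space kept)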

Upvotes: 0

RomanPerekhrest

Reputation: 92854

Alternative approaches:

VAL1="59809_RH_EA_TEST_1_P1_Q"

sed approach:

sed 's/^[^_-]*[_-]\(.*\)/\1/' <<< "$VAL1"

cut approach:

cut -d'_' -f2- <<< "$VAL1"

The output (for both approaches):

RH_EA_TEST_1_P1_Q
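
Note that the cut command splits on _ only, so it would not handle the 89292-RH_EA_TEST_1_P1_Q form from the question. As a minimal sketch (the function name strip_with_sed is hypothetical, not part of the answer), a sed expression that removes only the leading digits plus one optional _ or - separator could look like this:

```shell
# strip_with_sed is a hypothetical helper name used for illustration.
strip_with_sed() {
  # ^[0-9][0-9]*   one or more leading digits
  # [_-]\{0,1\}    at most one "_" or "-" right after them
  printf '%s\n' "$1" | sed 's/^[0-9][0-9]*[_-]\{0,1\}//'
}

strip_with_sed "59809_RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
strip_with_sed "89292-RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
```

Because the pattern is anchored on digits, a value with no numeric prefix passes through unchanged.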

Upvotes: 0

Fred

Reputation: 6995

One way is to use Bash regex matching.

VAL1="59809_RH_EA_TEST_1_P1_Q"

if
  [[ $VAL1 =~ ^[0-9]+_(.*) ]]
then
  VAL1=${BASH_REMATCH[1]}
fi

This assumes your numbers are always followed by an underscore. If you want to avoid this assumption, you can make the underscore optional:

if
  [[ $VAL1 =~ ^[0-9]+_?(.*) ]]
then
  VAL1=${BASH_REMATCH[1]}
fi

Bash regex matching works as a test (the [[ =~ ]] expression returns 0 if there is a match), and sub-expressions (defined in the matching string by using parentheses around the areas of interest) are available as elements in array BASH_REMATCH, starting at index 1. Extended regular expressions are used.

In case anyone wonders, no double quoting is required anywhere in the above. The [[ ]] construct is special shell syntax (unlike [ and test, which are commands taking arguments), so no word splitting is performed inside it. The assignment does not perform word splitting either.
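
The second pattern above still leaves the hyphen in place for an input such as 89292-RH_EA_TEST_1_P1_Q, because only _ is optional. A minimal sketch of a variant that accepts either separator follows. It requires Bash, and the function name strip_num_prefix is hypothetical, not part of the answer:

```shell
#!/usr/bin/env bash
# strip_num_prefix is a hypothetical helper name used for illustration.
strip_num_prefix() {
  local s=$1
  # Keeping the regex in a variable avoids quoting surprises inside [[ =~ ]].
  local re='^[0-9]+[_-]?(.*)$'
  if [[ $s =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"   # text after the digits and separator
  else
    printf '%s\n' "$s"                   # no leading digits: leave unchanged
  fi
}

strip_num_prefix "59809_RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
strip_num_prefix "89292-RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
```

The else branch is a design choice for this sketch: values without a numeric prefix are printed as-is rather than treated as an error.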

Upvotes: 1

Sundeep

Reputation: 23667

With Parameter Expansion

$ VAL1='59809_RH_EA_TEST_1_P1_Q'
$ echo "${VAL1#*[_-]}"
RH_EA_TEST_1_P1_Q

$ VAL1='89292-RH_EA_TEST_1_P1_Q'
$ echo "${VAL1#*[_-]}"
RH_EA_TEST_1_P1_Q
  • This removes the shortest match from the start of the string, up to and including the first occurrence of _ or -
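
The pattern *[_-] does not check that the removed prefix is numeric, so a value such as A_B would become B. If that matters, a stricter version can still be written with parameter expansion alone. This is a minimal sketch in POSIX shell syntax, and the function name strip_digits_pe is hypothetical:

```shell
# strip_digits_pe is a hypothetical helper name used for illustration.
strip_digits_pe() {
  s=$1
  # Drop everything from the first non-digit onward, leaving the leading digits.
  digits=${s%%[!0-9]*}
  # Remove those digits from the front of the original value.
  rest=${s#"$digits"}
  # Remove a single separator only if digits were actually stripped.
  if [ -n "$digits" ]; then
    case $rest in
      [_-]*) rest=${rest#?} ;;
    esac
  fi
  printf '%s\n' "$rest"
}

strip_digits_pe "59809_RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
strip_digits_pe "89292-RH_EA_TEST_1_P1_Q"   # RH_EA_TEST_1_P1_Q
strip_digits_pe "A_B"                       # A_B
```

No external process is started, which may be preferable when the function is called many times in a loop.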

Upvotes: 1
