David Botezatu

Reputation: 189

Bash: extract a part of a string, after a number

I have a few strings like this:

var1="string one=3423423 and something which i don't care"
var2="another bigger string=413145 and something which i don't care"
var3="the longest string ever=23442 and something which i don't care"

These strings are the output of a Python script (which I am not allowed to touch), and I need a way to extract the first part of each string, up to and including the number. Basically, my outputs should be:

"string one=3423423"
"another bigger string=413145"
"the longest string ever=23442"

As you can see, I can't use fixed positions, because the number and the string length are not always the same. I assume I would need a regex, but I don't really understand regexes. Can you please help with a command that can do this?

Upvotes: 1

Views: 318

Answers (3)

glenn jackman

Reputation: 246807

$ shopt -s extglob

$ echo "${var1%%+([^0-9])}"
string one=3423423

$ echo "${var2%%+([^0-9])}"
another bigger string=413145

$ echo "${var3%%+([^0-9])}"
the longest string ever=23442

+([^0-9]) is an extended pattern (which is why extglob must be enabled) that matches one or more non-digits.
${var%%pattern} removes the longest match of the pattern from the end of the variable's value, so ${var%%+([^0-9])} strips everything after the last digit in the string.

Refs: patterns, parameter substitution
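
One caveat worth checking against the real script output: because the pattern only strips trailing non-digits, a digit anywhere in the text to be discarded changes the result. A minimal sketch of that case, using a made-up input string (the variable names sample and trimmed are hypothetical, not from the question):

```bash
#!/usr/bin/env bash
shopt -s extglob   # required for the +(...) pattern

# Hypothetical input whose trailing text contains a digit.
sample="string one=3423423 and 2 things which i don't care"

# Only " things which i don't care" is removed; stripping stops at the "2".
trimmed="${sample%%+([^0-9])}"
echo "$trimmed"
```

If the trailing text is guaranteed never to contain digits, as in the three sample strings, this is not a concern.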

Upvotes: 0

janos

Reputation: 124646

You could use parameter expansion, for example:

var1="string one=3423423 and something which i don't care"
name=${var1%%=*}
value=${var1#*=}
value=${value%%[^0-9]*}
echo "$name=$value"
# prints: string one=3423423

Explanation of ${var1%%=*}:

  • %% - remove the longest matching suffix
  • = - match =
  • * - match everything

Explanation of ${var1#*=}:

  • # - remove the shortest matching prefix
  • * - match everything
  • = - match =

Explanation of ${value%%[^0-9]*}:

  • %% - remove the longest matching suffix
  • [^0-9] - match any non-digit
  • * - match everything
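
To make the three expansions easier to follow, here is a small sketch that prints each intermediate value. The variable name rest is introduced only for illustration; the answer's code reuses value instead. It assumes the name part contains no "=", since the string is split at the first one.

```bash
var1="string one=3423423 and something which i don't care"

name=${var1%%=*}        # drop everything from the first "=" onward
rest=${var1#*=}         # drop everything up to and including the first "="
value=${rest%%[^0-9]*}  # drop everything from the first non-digit onward

printf 'name=[%s]\nrest=[%s]\nvalue=[%s]\n' "$name" "$rest" "$value"
```

Because value keeps only the digits that immediately follow the "=", digits appearing later in the string do not affect the result.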

To do the same thing for more than one value, you could wrap this logic in a function:

extract_and_print() {
    local input=$1
    local name=${input%%=*}
    local value=${input#*=}
    value=${value%%[^0-9]*}
    echo "$name=$value"
}

extract_and_print "$var1"
extract_and_print "$var2"
extract_and_print "$var3"

Upvotes: 0

P....

Reputation: 18371

grep -oP '^.*?=\d+' inputfile
string one=3423423
another bigger string=413145
the longest string ever=23442

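Here the -o flag makes grep print only the matching part of each line, and -P enables Perl-compatible regular expressions (a GNU grep feature). \d+ means one or more digits, and .*? is a non-greedy match. So ^.*?=\d+ matches from the start of the line up to the first = that is followed by digits, and then through the last of those digits.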
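
If the installed grep has no -P support, a similar result can probably be had with extended regular expressions. This is only a sketch: it assumes the part before the number contains no "=" characters, and it still relies on -o, which GNU and BSD grep provide but POSIX does not require. The sample lines are piped in with printf instead of being read from inputfile, and the variable out is a hypothetical name.

```bash
out=$(printf '%s\n' \
  "string one=3423423 and something which i don't care" \
  "another bigger string=413145 and something which i don't care" \
  "the longest string ever=23442 and something which i don't care" |
  grep -oE '^[^=]*=[0-9]+')

echo "$out"
```

Here [^=]* takes the place of the non-greedy .*? by matching everything up to the first "=".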

Upvotes: 1
