TheRed

Reputation: 327

Bash: How to extract numbers preceded by _ and followed by .

I have the following format for filenames: filename_1234.svg

How can I retrieve the number that is preceded by an underscore and followed by a dot? There can be between one and four digits before the .svg

I have tried:

width=${fileName//[^0-9]/}

but if the file name contains other digits as well, this returns every digit in the name, e.g.

file6name_1234.svg

I found solutions for two underscores (and splitting it into an array), but I am looking for a way to check for the underscore as well as the dot.

Upvotes: 3

Views: 2511

Answers (6)

Doe Johnson

Reputation: 1414

There's a solution using cut:

name="file6name_1234.svg"
num=$(echo "$name" | cut -d '_' -f 2 | cut -d '.' -f 1)
echo "$num"

-d is for specifying a delimiter.

-f refers to the desired field.

I can't speak to its performance, but it's simple to understand and simple to maintain.
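As a rough sketch of what each stage of that pipeline produces (the function name cut_number is made up for illustration): -f 2 always selects the second underscore-separated field, so this assumes the name contains exactly one underscore.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wraps the two cut stages from the answer above.
cut_number() {
    local after_underscore
    # Stage 1: keep the second '_'-separated field, e.g. "1234.svg"
    after_underscore=$(printf '%s\n' "$1" | cut -d '_' -f 2)
    # Stage 2: keep the first '.'-separated field, e.g. "1234"
    printf '%s\n' "$after_underscore" | cut -d '.' -f 1
}

cut_number "file6name_1234.svg"    # prints 1234
cut_number "file_6_1234.svg"       # prints 6, not 1234 (second field only)
```

If names with several underscores can occur, one of the approaches that anchors on the last underscore is probably the safer choice.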

Upvotes: 0

Aserre

Reputation: 5062

Try the following code:

filename="filename_6_1234.svg"
if [[ "$filename" =~ ^(.*)_([^.]*)\..*$ ]];
then
    echo "${BASH_REMATCH[0]}" #will display 'filename_6_1234.svg'
    echo "${BASH_REMATCH[1]}" #will display 'filename_6'
    echo "${BASH_REMATCH[2]}" #will display '1234'
fi

Explanation:

  • =~ : bash operator for regex comparison
  • ^(.*)_([^.]*)\..*$ : we look for any characters, followed by an underscore, followed by any characters other than a dot, followed by a dot and an extension. This creates 2 capture groups: one for everything before the last underscore, and one for the text between that underscore and the following dot
  • BASH_REMATCH : array containing the whole match at index 0 and the captured groups from index 1 onwards
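If you also want to enforce the "one to four digits" rule from the question, a stricter pattern could look like the sketch below. The function name extract_width and the hard-coded .svg extension are assumptions for illustration, not part of the answer above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the 1-4 digits between the last '_' and ".svg".
extract_width() {
    # Keep the regex in a variable so the shell does not re-interpret it.
    local re='_([0-9]{1,4})\.svg$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1    # no underscore-number-dot part found
    fi
}

extract_width "file6name_1234.svg"    # prints 1234
extract_width "filename_6_1234.svg"   # prints 1234
```

Because the pattern is anchored at the end with $, digits elsewhere in the name (the 6 in file6name) are ignored, and names with five or more digits are rejected with a non-zero exit status.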

Upvotes: 2

David C. Rankin

Reputation: 84551

You can use simple parameter expansion with substring removal: trim from the right up to, and including, the '.', then trim from the left up to, and including, the '_', leaving the number you want, e.g.

$ width=filename_1234.svg; val="${width%.*}"; val="${val##*_}"; echo $val
1234

note: # removes the shortest match from the left (up to the first occurrence), while ## removes the longest match (up to the last occurrence). % and %% work the same way from the right.
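To make the difference between the four operators concrete, here is a small illustration on a made-up file name with two underscores and two dots (the name itself is just an example):

```bash
#!/usr/bin/env bash
name="file6name_6_1234.tar.svg"

echo "${name#*_}"     # shortest prefix removed: 6_1234.tar.svg
echo "${name##*_}"    # longest prefix removed:  1234.tar.svg
echo "${name%.*}"     # shortest suffix removed: file6name_6_1234.tar
echo "${name%%.*}"    # longest suffix removed:  file6name_6_1234
```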

Explained:

  • width=filename_1234.svg - width holds your filename

  • val="${width%.*}" - val holds filename_1234

  • val="${val##*_}" - finally val holds 1234

Of course, there is no need to use a temporary value like val if your intent is that width should hold the width. I just used a temp to protect against changing the original contents of width. If you want the resulting number in width, just replace val with width everywhere above and operate directly on width.

note 2: using shell features such as parameter expansion avoids the subshell and separate process that come with calling a utility like sed, grep or awk (or anything else that isn't part of the shell, for that matter).
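Parameter expansion on its own does not check that what is left is actually a number. A minimal sketch that adds such a check, still without leaving the shell, might look like this (the function name number_from_name is hypothetical):

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the digits between the last '_' and the last '.'.
number_from_name() {
    local base=${1%.*}       # drop the extension: filename_1234
    local num=${base##*_}    # drop everything up to the last '_': 1234
    case $num in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    printf '%s\n' "$num"
}

number_from_name "file6name_1234.svg"   # prints 1234
```

A name without an underscore, such as filename.svg, leaves the whole base name in num, so the case check rejects it instead of printing something misleading.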

Upvotes: 2

ceving

Reputation: 23824

If you set IFS, you can use Bash's built-in read.

This splits the filename by underscores and dots and stores the result in the array a.

IFS='_.' read -a a <<<'file1b2aname_1234.svg'

And this takes the second-to-last element from the array.

echo ${a[-2]}
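For Bash versions that do not accept negative array subscripts, the same element can be computed from the array length. This is only a sketch, and the function name second_to_last_field is made up:

```bash
#!/usr/bin/env bash
# Hypothetical helper: splits on '_' and '.' and prints the second-to-last field.
second_to_last_field() {
    local -a parts
    # IFS is changed only for this one read command.
    IFS='_.' read -r -a parts <<< "$1"
    local n=${#parts[@]}
    (( n >= 2 )) || return 1          # need at least "<number>.<ext>"
    printf '%s\n' "${parts[n-2]}"
}

second_to_last_field "file1b2aname_1234.svg"   # prints 1234
```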

Upvotes: 0

Akshay Hegde

Reputation: 16997

Some more ways:

[akshay@localhost tmp]$ filename=file1b2aname_1234.svg
[akshay@localhost tmp]$ after=${filename##*_}
[akshay@localhost tmp]$ echo ${after//[^0-9]}
1234

Using awk

[akshay@localhost tmp]$ awk -F'[_.]' '{print $2}' <<< "$filename"
1234
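Note that $2 is the second field, which is only the number when the name has a single underscore. If more underscores are possible, a variant that counts from the end may be more robust. A sketch, with the function name awk_number invented for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the field just before the extension.
awk_number() {
    # Fields are split on '_' or '.'; NF-1 is the one before the last field.
    awk -F'[_.]' 'NF >= 3 { print $(NF-1) }' <<< "$1"
}

awk_number "file1b2aname_1234.svg"   # prints 1234
awk_number "file_6_1234.svg"         # prints 1234
```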

Upvotes: 1

schorsch312

Reputation: 5694

I would use

sed 's!_! !g' | awk '{print "_" $NF}' 

to get from filename_1234.svg to _1234.svg then

sed 's!svg!!g' 

to get rid of the extension.
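As written, those two steps turn filename_1234.svg into _1234. (the underscore and the dot are still there). If sed is the tool of choice, a single substitution with a capture group can pull out just the digits. The following is a sketch rather than a drop-in replacement, and the function name sed_number is hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the 1-4 digits between the last '_' and the extension.
sed_number() {
    # -n plus the p flag: print only when the substitution matched.
    printf '%s\n' "$1" | sed -n 's/.*_\([0-9]\{1,4\}\)\.[^.]*$/\1/p'
}

sed_number "filename_1234.svg"   # prints 1234
sed_number "file_6_1234.svg"     # prints 1234
```

Names that do not fit the pattern produce no output at all, which is easy to test for with [ -z ... ].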

Upvotes: 0
