nmr

Reputation: 753

Extract substring from string after third last occurrence of underscore

I have a string in a Linux shell, and the string contains underscores.

I want to extract the substring that follows the third-to-last underscore, that is, the third underscore counted from the end of the string.

file_name='email_Tracking_export_history_2018_08_15'
string_name="${file_name#*_*_*_}"
file_name2='email_Tracking_export_2018_08_15'
string_name2="${file_name2#*_*_*_}"

echo "$string_name"
echo "$string_name2"

The result:

history_2018_08_15
2018_08_15

As you can see, string_name="${file_name#*_*_*_}" does not do what I want: it counts the underscores from the start of the string, not from the end.

Desired result:

2018_08_15
2018_08_15

How can I achieve my desired result?

Upvotes: 3

Views: 1945

Answers (6)

Benjamin W.

Reputation: 52132

You can do it in a single step, but it's a bit convoluted. After setting the filename

file_name='email_Tracking_export_history_2018_08_15'

we get the substring that contains everything except what we want to have in the end:

$ echo "${file_name%_*_*_*}"
email_Tracking_export_history

This is almost what we want, just an underscore missing, so we add that:

$ echo "${file_name%_*_*_*}_"
email_Tracking_export_history_

Now we know what we have to remove from the beginning of the string and insert that into the ${word#pattern} expansion:

$ echo "${file_name#"${file_name%_*_*_*}_"}"
2018_08_15

or we assign it to a variable for further use:

string_name=${file_name#"${file_name%_*_*_*}_"}
              └───┬───┘ │  └───┬───┘ └─┬──┘  │
             outer word │  inner word  └─────┼──inner pattern
                        └───outer pattern────┘

The second string works analogously.
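
The nested expansion can also be wrapped in a small helper. This is only a minimal sketch: the function name last_three_fields is hypothetical and not part of the answer. If the input has fewer than three underscores, the inner pattern does not match and the input comes back unchanged.

```shell
#!/bin/bash
# Hypothetical helper: print the part of $1 after the third-to-last underscore.
last_three_fields() {
    local s=$1
    # Inner expansion: strip the shortest suffix matching _*_*_* (the last three fields).
    # Outer expansion: strip what is left, plus one underscore, from the front.
    printf '%s\n' "${s#"${s%_*_*_*}_"}"
}

last_three_fields 'email_Tracking_export_history_2018_08_15'
last_three_fields 'email_Tracking_export_2018_08_15'
```

Both calls should print 2018_08_15.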

Upvotes: 5

James Brown

Reputation: 37404

Is expr already banned to deepest hell, even for string matching?

$ expr "$file_name" : '.*_\([^_]*_[^_]*_[^_]*\)'
2018_08_15
$ expr "$file_name2" : '.*_\([^_]*_[^_]*_[^_]*\)'
2018_08_15

From https://www.tldp.org/LDP/abs/html/string-manipulation.html :

expr "$string" : '.*\($substring\)'

    Extracts $substring at end of $string, where $substring is a regular expression.
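
To make the matching explicit: expr anchors the pattern at the start of the string and prints whatever \( \) captured. When nothing matches it prints an empty line and exits with a non-zero status, which a script can test. The sketch below assumes a hypothetical wrapper named extract_with_expr, which is not from the answer.

```shell
#!/bin/bash
# Hypothetical wrapper around the expr call from the answer.
extract_with_expr() {
    # The greedy .*_ consumes everything up to the underscore that precedes
    # the last three fields; \( \) captures those fields for printing.
    expr "$1" : '.*_\([^_]*_[^_]*_[^_]*\)'
}

# expr exits non-zero when the captured result is empty (no match).
if result=$(extract_with_expr 'email_Tracking_export_2018_08_15'); then
    echo "$result"
else
    echo 'fewer than three underscores' >&2
fi
```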

Upvotes: 0

oliv

Reputation: 13249

Using BRE (works with most sed implementations):

sed 's/.*_\([^_]*\(_[^_]*\)\{2\}\)$/\1/' <<< "$file_name"
2018_08_15

Using GNU sed and ERE:

sed -r 's/.*_([^_]*(_[^_]*){2})$/\1/' <<< "$file_name"
2018_08_15

Upvotes: 0

user1551605

Reputation: 303

% echo "$file_name" | rev | cut -f1-3 -d'_' | rev
2018_08_15
% echo "$file_name2" | rev | cut -f1-3 -d'_' | rev
2018_08_15

rev reverses the string, so the last three underscore-separated fields become the first three and are easy to select with cut. The extracted part is then reversed back.
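
The intermediate stages of the pipeline can be inspected one at a time. This is a sketch for illustration; the helper name last_three_rev is hypothetical, and it assumes rev (from util-linux) is installed.

```shell
#!/bin/bash
# Hypothetical helper: keep the last three underscore-separated fields of $1.
last_three_rev() {
    # rev flips the string, so the last three fields become fields 1-3 for cut;
    # the second rev restores the original character order.
    printf '%s\n' "$1" | rev | cut -d'_' -f1-3 | rev
}

file_name='email_Tracking_export_history_2018_08_15'
printf '%s\n' "$file_name" | rev                     # 51_80_8102_yrotsih_tropxe_gnikcarT_liame
printf '%s\n' "$file_name" | rev | cut -d'_' -f1-3   # 51_80_8102
last_three_rev "$file_name"                          # 2018_08_15
```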

Upvotes: 0

tshiono

Reputation: 22012

How about using a regex in bash:

#!/bin/bash

# Extract the substring after the third-to-last underscore
function extract() {
    if [[ "$1" =~ _([^_]+_[^_]+_[^_]+$) ]]; then
        echo "${BASH_REMATCH[1]}"
    fi
}

file_name='email_Tracking_export_history_2018_08_15'
string_name=$(extract "$file_name")

file_name2='email_Tracking_export_2018_08_15'
string_name2=$(extract "$file_name2")

echo "$string_name"
echo "$string_name2"

Upvotes: 0

Ipor Sircer

Reputation: 3141

Use a temporary variable:

file_name='email_Tracking_export_history_2018_08_15'
temp="${file_name%_*_*_*}"
string_name="${file_name/${temp}_}"
file_name2='email_Tracking_export_2018_08_15'
temp="${file_name2%_*_*_*}"
string_name2="${file_name2/${temp}_}"

echo "$string_name"
echo "$string_name2"

Upvotes: 0
