Ibrahim

Reputation: 100

Using sed or grep to split a URL

For instance, I have a URL like http://google.com/test/to_be_extracted.html and I want to split it to get only the to_be_extracted part. The http://google.com/test/ prefix and the .html suffix should be excluded.

How can I set up a regex pattern for this using grep or sed?

Upvotes: 3

Views: 870

Answers (2)

jaypal singh

Reputation: 77105

You can use this:

$ echo 'http://google.com/test/to_be_extracted.html' | sed -r 's#.*\/([^.]+).*#\1#'
to_be_extracted

Breakdown:

sed -r '          # -r enables extended regular expressions (GNU sed; -E is the more portable spelling)
s                 # Substitute command
#                 # Use # as the delimiter, since the input contains `/`
.*\/              # Greedily match everything up to and including the last `/` (the backslash is optional with # as delimiter)
([^.]+)           # Capture group: one or more characters up to the first literal `.`
.*                # Match the rest of the line
#                 # Delimiter between pattern and replacement
\1                # Replace the whole match with the captured group
#'                # Closing delimiter
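
Since the question also mentions grep, here is a rough sketch of the same idea using grep alone. It assumes a grep that supports the -o option (GNU and BSD grep do, though -o is not in POSIX), and the variable names are only illustrative:

```shell
url='http://google.com/test/to_be_extracted.html'

# Step 1: -o prints only the matched text; [^/]+$ is the run of
# non-slash characters at the end of the line, i.e. the file name.
file=$(printf '%s\n' "$url" | grep -oE '[^/]+$')

# Step 2: keep the leading run of non-dot characters, which drops
# the extension (and anything else after the first dot).
name=$(printf '%s\n' "$file" | grep -oE '^[^.]+')

echo "$name"
```

Like the sed version, this stops at the first dot, so a name such as a.b.html would be cut down to a.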

Upvotes: 2

tripleee

Reputation: 189397

Why do you need the solution to involve unnecessary tools?

basename "$url" .html

will do what you require, trivially and transparently.
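
A minimal sketch of how this might look in a script, assuming the URL is held in a variable named url as in the command above. The second half shows POSIX parameter expansion as a possible alternative that avoids the external process; it is not part of the original suggestion:

```shell
url='http://google.com/test/to_be_extracted.html'

# basename removes the directory part; the optional second argument
# is a suffix to strip from the end of the result.
name=$(basename "$url" .html)
echo "$name"

# Alternative using shell parameter expansion only:
file=${url##*/}       # drop the longest prefix ending in /
name2=${file%.html}   # drop the trailing .html
echo "$name2"
```

Both variants only remove the suffix if it is exactly .html, so other extensions would be left in place.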

Upvotes: 1
