Reputation: 100
For instance, I have a URL like this: http://google.com/test/to_be_extracted.html
I want to split this URL and get only the to_be_extracted part, excluding the leading http://google.com/test/ and the trailing .html.
How can I set up a regex pattern for this using grep or sed?
Upvotes: 3
Views: 870
Reputation: 77105
You can use this:
$ echo 'http://google.com/test/to_be_extracted.html' | sed -r 's#.*\/([^.]+).*#\1#'
to_be_extracted
sed -r '        # -r enables extended regular expressions (GNU sed)
s               # Substitute command
#               # Use # as the delimiter, since the input contains `/`
.*\/            # Greedily match everything up to the last `/`
([^.]+)         # Capture everything up to the next literal `.`
.*              # Match the rest of the line
#               # Delimiter between the pattern and the replacement
\1              # Replace the whole match with the captured group
#'              # Closing delimiter
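If the URL is already in a shell variable, the same substitution can be applied to it directly. This is only a sketch: the variable names `url` and `name` are made up for illustration, and `-E` is used here as the more portable spelling of `-r` (it should be accepted by both GNU and BSD sed, but check your version).

```shell
# Hypothetical variable holding the URL from the question
url='http://google.com/test/to_be_extracted.html'

# Same pattern as above; the `/` needs no escaping because # is the delimiter
name=$(printf '%s\n' "$url" | sed -E 's#.*/([^.]+).*#\1#')

printf '%s\n' "$name"
```

Note that the capture stops at the first `.` after the last `/`, so a file name such as `a.b.html` would yield `a` rather than `a.b`.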
Upvotes: 2
Reputation: 189397
Why do you need the solution to involve unnecessary tools?
basename "$url" .html
will do what you require, trivially and transparently.
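For completeness, here is a small sketch of that in context, assuming the URL is stored in a variable named `url` (the names `name`, `file` and `stem` are illustrative only). The second half shows a roughly equivalent approach using POSIX parameter expansion, which avoids spawning any external process at all.

```shell
url='http://google.com/test/to_be_extracted.html'

# basename strips the leading path and, if given, the trailing suffix
name=$(basename "$url" .html)
printf '%s\n' "$name"

# Pure-shell alternative using parameter expansion
file=${url##*/}      # remove the longest prefix ending in `/`
stem=${file%.html}   # remove the trailing `.html`, if present
printf '%s\n' "$stem"
```

Both variants only remove a suffix that is exactly `.html`; any other extension is left in place.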
Upvotes: 1