Reputation: 1819

Capture last element either between or after / and before?

Say I have the following urls:

https://test.com/welcome/
https://sub.test.com/home/edit
https://test.com/home/view?view=column
https://test.com/home/view/?view=list

I would like to capture the following result:

welcome
edit
view
view

Right now I have (?:\/[^\/]+)+?\/(.*?)/{0,1}$, (?:\/[^\/]+)+?(?:.*\/)(.*?)\?{0,1}$, and (?:\/[^\/]+)+?(?:.*\/)(.*)/\?.*$ but they are complicated and I can't seem to combine them.

Upvotes: 2

Answers (4)

warren

Reputation: 33435

Go simple - regular expressions are all well and good, but split() is much easier (and, very often, much faster):

index=ndx sourcetype=srctp url=*
| eval url=split(url,"/")
| eval lastpart=mvindex(url,-1)

This splits the field url into a multivalue field, using the forward slash ('/') as the delimiter.

Then select the last entry with mvindex and an index of -1, which always refers to the last entry.
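
One caveat worth checking against the question's sample URLs: a plain split on "/" leaves an empty last entry when the URL ends in a slash, and otherwise keeps the query string attached to the last entry. The sketch below uses Python as a stand-in for SPL to illustrate the same split-and-take-last idea, with the query string and trailing slash removed first. `last_segment` is a hypothetical helper for illustration, not a Splunk function.

```python
def last_segment(url: str) -> str:
    """Return the last non-empty path segment of url (hypothetical helper)."""
    # Drop the query string and fragment, if any.
    path = url.split("?", 1)[0].split("#", 1)[0]
    # Drop trailing slashes so the last entry is not empty.
    path = path.rstrip("/")
    # Same idea as split(url, "/") followed by mvindex(..., -1).
    return path.split("/")[-1]


urls = [
    "https://test.com/welcome/",
    "https://sub.test.com/home/edit",
    "https://test.com/home/view?view=column",
    "https://test.com/home/view/?view=list",
]

for url in urls:
    print(url, "->", last_segment(url))
```

In SPL, a comparable clean-up step before the split would probably be an eval replace() on the field, but that is an untested assumption here.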

Upvotes: 1

Toshihisa Kawamata

Reputation: 7

| makeresults
| eval _raw="https://test.com/welcome/
https://sub.test.com/home/edit
https://test.com/home/view?view=column
https://test.com/home/view/?view=list"
| makemv delim="
" _raw
| stats count by _raw
| rex "^.*\/(?<result>\w+)"

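Greedy matching is fine here: .* consumes as much as possible, then backtracks to the last / that is followed by at least one word character.

\w is [a-zA-Z0-9_].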
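
As a quick sanity check outside Splunk, the same pattern can be tried in Python. This is only a sketch: Python's `re` module stands in for Splunk's regex engine, the named group has to be spelled `(?P<result>...)` in Python, and `extract` is a hypothetical helper name.

```python
import re

# Splunk's rex uses (?<result>...); Python spells the named group (?P<result>...).
PATTERN = re.compile(r"^.*/(?P<result>\w+)")


def extract(url):
    """Return the captured 'result' group, or None if nothing matches."""
    m = PATTERN.search(url)
    return m.group("result") if m else None


urls = [
    "https://test.com/welcome/",
    "https://sub.test.com/home/edit",
    "https://test.com/home/view?view=column",
    "https://test.com/home/view/?view=list",
]

for url in urls:
    print(url, "->", extract(url))
```

Because \w does not match a hyphen, a segment such as my-page would be cut short at the hyphen with this pattern.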

Upvotes: 1

Cary Swoveland

Reputation: 110675

You can use the plain-vanilla regex:

(?<=[\/])[^\/?=]+(?=\/?$|\/?\?)

Demo

The regex can be written in free-spacing mode¹ to make it self-documenting:

/
(?<=[\/])      # match '/' in a positive lookbehind
[^\/?=]+       # match 1+ chars other than '/', '?' and '='
(?=            # begin a positive lookahead
  \/?$         # optionally match '/' then match end of line
  |            # or
  \/?\?        # optionally match '/' then match '?'
)              # end positive lookahead
/x             # free-spacing mode

¹ I don't know if Splunk supports free-spacing mode but that is of no matter as I am using it merely to show how the regex works.
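
For readers who want to try the lookaround version outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in engine and uses its `re.VERBOSE` flag for free-spacing mode. `last_segment` is a hypothetical helper name.

```python
import re

LAST_SEGMENT = re.compile(
    r"""
    (?<=/)      # preceded by '/'
    [^/?=]+     # 1+ chars other than '/', '?' and '='
    (?=         # followed by
        /?$     #   an optional '/' and the end of the string
      |         #   or
        /?\?    #   an optional '/' and a '?'
    )
    """,
    re.VERBOSE,
)


def last_segment(url):
    """Return the first match of the pattern in url, or None."""
    m = LAST_SEGMENT.search(url)
    return m.group(0) if m else None


urls = [
    "https://test.com/welcome/",
    "https://sub.test.com/home/edit",
    "https://test.com/home/view?view=column",
    "https://test.com/home/view/?view=list",
]

for url in urls:
    print(url, "->", last_segment(url))
```

The match itself is the wanted text here, so no capturing group is needed. In a Splunk rex command the pattern would presumably have to be wrapped in a named group to produce a field.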

Upvotes: 1

Wiktor Stribiżew

Reputation: 626802

In Splunk, you can use a regex that matches all text up to the last occurrence of / that is followed by one or more chars other than /, ? or #. Those chars can be captured with a named capturing group:

".*/(?<lasturlpart>[^/?#]+)"

See the regex demo. Note that the \n or (?:/?(?:[#?].*|$)) in my top comment is used in the demo only to make sure the match does not overflow across lines, since the input there is a single multiline string, while you will be running the regex against standalone strings.

Pattern details

.* - any 0 or more chars other than line break chars, as many as possible
/ - a / char
(?<lasturlpart>[^/?#]+) - Named capturing group matching 1 or more chars other than /, ? and #.
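
The breakdown above can be checked with a short Python sketch. Python's `re` module is used here as a stand-in for Splunk's engine, so the group is written `(?P<lasturlpart>...)`. `last_url_part` is a hypothetical helper name.

```python
import re

# Splunk form: .*/(?<lasturlpart>[^/?#]+)
PATTERN = re.compile(r".*/(?P<lasturlpart>[^/?#]+)")


def last_url_part(url):
    """Return the 'lasturlpart' group, or None if the pattern does not match."""
    m = PATTERN.search(url)
    return m.group("lasturlpart") if m else None


urls = [
    "https://test.com/welcome/",
    "https://sub.test.com/home/edit",
    "https://test.com/home/view?view=column",
    "https://test.com/home/view/?view=list",
]

for url in urls:
    print(url, "->", last_url_part(url))
```

Since .* is greedy, a / inside the query string (for example ?next=/a/b) would be treated as the last slash, so this sketch assumes query strings without slashes.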

Upvotes: 2
