Nitin D

Reputation: 49

Regex to remove a specific word from a URL

In Dynatrace, some URLs contain a word that is dynamic. I want to remove that dynamic word from the URL using a regex.

Below are the different urls

Expected output

So far I have come up with this regex:

(\S+?)ab_cd(.*)

But it's not working for dynamic values or for all of the URLs. How can I improve the regex to remove the dynamic value?

Upvotes: 1

Views: 1065

Answers (2)

The fourth bird

Reputation: 163577

You could use 2 capturing groups and match the part containing the underscore after matching a forward slash:

^(\S+/)[^\s_]+_[^\s_/?]+(.*)
  • ^ Start of string
  • (\S+/) Capture group 1, match 1+ times a non whitespace char followed by /
  • [^\s_]+ Match 1+ times any char except a whitespace char or _
  • _ Match an underscore literally
  • [^\s_/?]+ Match 1+ times any char except a whitespace char, _, / or ?
  • (.*) Capture group 2, match 0+ times any char except a newline


In the replacement, use the 2 capturing groups, for example $1$2.
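As a minimal sketch of that replacement, here is the same pattern applied with Python's `re` module. The sample URLs and the function name `strip_dynamic_segment` are hypothetical, and Python writes the group references as `\1\2` rather than `$1$2`; the replacement syntax in Dynatrace may differ.

```python
import re

# Pattern taken from the answer above; everything else here is illustrative.
PATTERN = re.compile(r"^(\S+/)[^\s_]+_[^\s_/?]+(.*)")

def strip_dynamic_segment(url):
    # Keep group 1 (everything up to the slash) and group 2 (the rest),
    # which drops the "xx_yy" part matched between them.
    return PATTERN.sub(r"\1\2", url)

print(strip_dynamic_segment("/asd/fdsadx/sadsa/ab_cd?x=1"))  # /asd/fdsadx/sadsa/?x=1
print(strip_dynamic_segment("/asd/fdsadx/sadsa/en_US"))      # /asd/fdsadx/sadsa/
print(strip_dynamic_segment("/asd/fdsadx/sadsa/plain"))      # unchanged, no underscore
```

Note that when the dynamic part is followed by another path segment, group 2 starts with that segment's slash, so `/asd/fdsadx/sadsa/ab_cd/page` becomes `/asd/fdsadx/sadsa//page`. Whether that matters depends on the expected output.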

If you want to match country codes and you know that they, for example, consist only of the chars a-zA-Z, you could make the character class more specific:

^(\S+/)[A-Za-z]+_[A-Za-z]+(.*)

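To illustrate the difference between the two patterns, the following sketch runs both against two hypothetical URLs, one with a letters-only part (`en_US`) and one with digits after the underscore (`item_42`). The names `GENERAL`, `LETTERS_ONLY`, and `remove_segment` are made up for this example.

```python
import re

# Both patterns come from the answer above; the URLs are assumptions.
GENERAL = re.compile(r"^(\S+/)[^\s_]+_[^\s_/?]+(.*)")
LETTERS_ONLY = re.compile(r"^(\S+/)[A-Za-z]+_[A-Za-z]+(.*)")

def remove_segment(pattern, url):
    # Replace the whole match with the two captured groups.
    return pattern.sub(r"\1\2", url)

locale_url = "/asd/fdsadx/sadsa/en_US?x=1"
other_url = "/asd/fdsadx/sadsa/item_42?x=1"

print(remove_segment(GENERAL, locale_url))       # /asd/fdsadx/sadsa/?x=1
print(remove_segment(LETTERS_ONLY, locale_url))  # /asd/fdsadx/sadsa/?x=1
print(remove_segment(GENERAL, other_url))        # /asd/fdsadx/sadsa/?x=1
print(remove_segment(LETTERS_ONLY, other_url))   # unchanged: "42" is not letters
```

The stricter class is the safer choice if other underscore-containing segments in the URLs should be left alone.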

Upvotes: 2

Gerb

Reputation: 496

It seems that the first portion is fixed, and you're trimming everything after a '/' or '?'. Given that, perhaps you want something like:

s/(\/asd\/fdsadx\/sadsa\/)[^/?]+(.*)/\1\2/

This will capture the head in \1, skip a run of characters that are neither '/' nor '?', and capture the tail in \2.
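For readers not using sed, a rough Python equivalent of that substitution might look like the sketch below. The prefix is the one from the sed command; the sample URLs and the function name `drop_segment_after_prefix` are hypothetical.

```python
import re

# Same expression as the sed command, without the escaped delimiters.
FIXED_PREFIX = re.compile(r"(/asd/fdsadx/sadsa/)[^/?]+(.*)")

def drop_segment_after_prefix(url):
    # \1 is the fixed head, \2 is the tail starting at the next '/' or '?'.
    return FIXED_PREFIX.sub(r"\1\2", url)

print(drop_segment_after_prefix("/asd/fdsadx/sadsa/ab_cd?x=1"))   # /asd/fdsadx/sadsa/?x=1
print(drop_segment_after_prefix("/asd/fdsadx/sadsa/anything"))    # /asd/fdsadx/sadsa/
print(drop_segment_after_prefix("/other/path/ab_cd"))             # unchanged
```

Unlike the underscore-based patterns, this one removes whatever follows the fixed prefix, with or without an underscore, and leaves URLs with a different prefix untouched.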

Upvotes: 1
