Eduardo Mello

Reputation: 925

Regular Expression to help with Rewriting a URL

I have strings like this:

http://localhost:2055/web-site-2009/paginas/noticias/**IGP-M recua 0,36% em agosto, aponta FGV**-46.aspx

I'd like to remove all characters that could cause trouble in a URL (like ?, |, &, etc.), as well as any hyphen (-) in the bold part of the string. It's important that I keep the hyphen next to 46.aspx.

What is the regex for that?

Upvotes: 0

Views: 176

Answers (1)

Jon Galloway

Reputation: 53115

Another approach would be to simply URL-encode the string. If you need to use a regex for some other reason, I think this would handle the characters you're asking about:

string cleaned = Regex.Replace(stringToCleanUp, @"[^a-zA-Z0-9/;\-%:]", string.Empty);

Regex explanation:

  • Match any character that is NOT in this list - [] defines a character class, and a leading ^ negates it
  • List of characters: a-z (all characters between a and z, lower case)
  • List of characters: A-Z (all characters between A and Z, upper case)
  • Digits: 0-9
  • After that, I've included a list of characters to allow: / ; - (escaped with \ because an unescaped - inside [] denotes a range) % :

You can add or remove from that final list - anything in this list will be ALLOWED in your final URL since it will not be replaced.
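Note that, applied to the whole URL, the pattern above keeps every hyphen and also strips the dot in .aspx. One way around that is to clean only the title part of the last path segment and leave the trailing -46.aspx alone. The following is a rough sketch under that assumption, not a definitive implementation; the class name SlugCleaner, the method CleanLastSegment, and the stricter letters-and-digits-only allow list are all hypothetical choices made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugCleaner
{
    // Hypothetical helper: cleans only the title part of the last path segment,
    // keeping the trailing "-<id>.aspx" suffix (hyphen included) untouched.
    public static string CleanLastSegment(string url)
    {
        int slash = url.LastIndexOf('/');
        string prefix = url.Substring(0, slash + 1);   // everything up to and including the last '/'
        string segment = url.Substring(slash + 1);     // e.g. "Some title-46.aspx"

        // Split the segment into the free-text title and the "-<digits>.aspx" suffix.
        Match m = Regex.Match(segment, @"^(?<title>.*)(?<suffix>-[0-9]+\.aspx)$");
        if (!m.Success)
        {
            return url; // Assumption: leave URLs that do not follow this shape unchanged.
        }

        // Assumed allow list: letters and digits only, so hyphens, spaces,
        // commas, '%' and similar characters are removed from the title.
        string title = Regex.Replace(m.Groups["title"].Value, @"[^a-zA-Z0-9]", string.Empty);

        return prefix + title + m.Groups["suffix"].Value;
    }
}
```

With the URL from the question, this should yield http://localhost:2055/web-site-2009/paginas/noticias/IGPMrecua036emagostoapontaFGV-46.aspx. If you want to keep more characters in the title, add them to the character class in the second pattern.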

I recommend using an interactive RegEx tool if you need to tweak this, like RegExr.
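For the URL-encoding alternative mentioned at the start of this answer, a minimal sketch might look like the following. It uses the framework's Uri.EscapeDataString; the class name UrlEncodeExample, the method BuildFileName, and the idea of passing the title and numeric id separately are assumptions for illustration, not something taken from the question.

```csharp
using System;

public static class UrlEncodeExample
{
    // Hypothetical helper: percent-encodes the free-text title and then
    // appends the "-<id>.aspx" suffix, so the final hyphen is never touched.
    public static string BuildFileName(string title, int id)
    {
        // EscapeDataString leaves letters, digits and '-' as they are and
        // percent-encodes spaces, commas, '%', '?', '&', '|' and similar characters.
        return Uri.EscapeDataString(title) + "-" + id + ".aspx";
    }
}
```

Unlike the regex, this keeps the information in the title (nothing is dropped, only encoded), but it does not remove the hyphens inside the title, so it only fits if those are acceptable in your URLs.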

Upvotes: 10
