Justin808

Reputation: 21512

Regex issue removing & characters

I have this string

../cms/Client Files/gallery images/home1.jpg&w=914&h=360&cache=1:28:02 PM

and I want to remove everything after the file name. In C# I'm trying:

html = Regex.Replace(html, @"&(w=([0-9]*))", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"&(h=([0-9]*))", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"&(cache=([0-9]*):([0-9]*):([0-9]*) [AP]M)", "", RegexOptions.IgnoreCase);

but it's not removing anything. If I try

html = Regex.Replace(html, @"w=([0-9]*)", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"h=([0-9]*)", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"cache=([0-9]*):([0-9]*):([0-9]*) [AP]M", "", RegexOptions.IgnoreCase);

then I get

../cms/Client Files/gallery images/home1.jpg&&&

How can I remove the &'s as well?

Upvotes: 2

Views: 111

Answers (3)

Scott Rippey

Reputation: 15810

You don't need to escape the & to match it, as others have incorrectly suggested.

As a matter of fact, your code works as posted. I just ran it in LINQPad and verified the result:

var html = "../cms/Client Files/gallery images/home1.jpg&w=914&h=360&cache=1:28:02 PM";

html = Regex.Replace(html, @"&(w=([0-9]*))", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"&(h=([0-9]*))", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"&(cache=([0-9]*):([0-9]*):([0-9]*) [AP]M)", "", RegexOptions.IgnoreCase);

html.Dump(); // Outputs: "../cms/Client Files/gallery images/home1.jpg"

Therefore, you should inspect the rest of your code and see if another error exists. This is where a debugger might show you the light.

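Another idea: since your variable is named html, is it possible that the & is actually encoded as &amp; in the string? That would explain why the patterns starting with & match nothing while the ones without it leave the ampersands behind.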
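
If the ampersands do turn out to be HTML-encoded, one option is to let each pattern accept either form. The following is only a sketch under that assumption; the class and method names (AmpersandDemo, StripEncodedParams) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmpersandDemo
{
    // Hypothetical helper: removes the w, h and cache parameters whether the
    // separator is a literal "&" or the HTML entity "&amp;".
    public static string StripEncodedParams(string html)
    {
        // "&(?:amp;)?" matches "&" optionally followed by "amp;".
        html = Regex.Replace(html, @"&(?:amp;)?w=[0-9]*", "", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, @"&(?:amp;)?h=[0-9]*", "", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, @"&(?:amp;)?cache=[0-9]*:[0-9]*:[0-9]* [AP]M", "", RegexOptions.IgnoreCase);
        return html;
    }
}
```

Calling AmpersandDemo.StripEncodedParams on either the encoded or the plain version of the string from the question should leave only the path up to home1.jpg.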

As a side note: None of your patterns need the (), and they will be simpler without them.
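
To illustrate the side note, here is a rough sketch of the same replacements without capturing groups, folded into a single pattern. The names SimplifiedPatterns and StripParams are hypothetical, and it assumes the ampersands are literal & characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplifiedPatterns
{
    // Hypothetical helper: one alternation covers all three parameters.
    // The only group is non-capturing; nothing is captured because the
    // matches are simply replaced with an empty string.
    public static string StripParams(string html)
    {
        return Regex.Replace(
            html,
            @"&(?:w|h)=[0-9]*|&cache=[0-9]*:[0-9]*:[0-9]* [AP]M",
            "",
            RegexOptions.IgnoreCase);
    }
}
```

Three separate Regex.Replace calls without the parentheses behave the same way; the single pattern just avoids scanning the string three times.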

Upvotes: 1

Bojan Bjelic

Reputation: 3532

This should be enough...

html = Regex.Replace(html, @"\&.*", "", RegexOptions.IgnoreCase);

Upvotes: 0

Marco

Reputation: 57573

I would try this (easier than using Regex):

int index = html.IndexOf("&");
if (index >= 0) html = html.Substring(0, index);

or try this:

html = Regex.Replace(html, @"\&w=([0-9]*)", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"\&h=([0-9]*)", "", RegexOptions.IgnoreCase);
html = Regex.Replace(html, @"\&cache=([0-9]*):([0-9]*):([0-9]*) [AP]M", "", RegexOptions.IgnoreCase);

Upvotes: 1
