Ben

Reputation: 1023

Regex.Replace finding variable width and height in an html string and replacing with a set value

I'm trying to write a regex that finds the width and height attributes in a string (which will always be an HTML iframe) and replaces their values.

What I have is a string where ### could be any value, and not necessarily always 3 digits.

string iFrame = "<iframe width=\"###\" height=\"###\" src=\"http://www.youtube.com/embed/xxxxxx\" frameborder=\"0\" allowfullscreen></iframe>";

I want to end up with set values for the width and height:

<iframe width="315" height="215" src="http://www.youtube.com/embed/xxxxxx" frameborder="0" allowfullscreen></iframe>

I tried this, but am not good with regular expressions:

iFrame = Regex.Replace(iFrame, "width=\".*\"", "width=\"315\"");
iFrame = Regex.Replace(iFrame, "height=\".*\"", "height=\"215\"");

which resulted in:

<iframe width="315" allowfullscreen></iframe>

which is not what I want. Can someone help me?

Upvotes: 1

Views: 4270

Answers (3)

Shai Cohen

Reputation: 6249

Replace your patterns with these:

"width=\"([0-9]{1,4})\""

and

"height=\"([0-9]{1,4})\""

Basically, you were using .* which is greedy, meaning it grabs as many characters as possible, so the match runs all the way to the last " in the string. The patterns above match a digit [0-9] repeated between 1 and 4 times {1,4}, which is what you are really looking for.
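
To see the difference, here is a minimal sketch that prints what the original greedy pattern matches and then applies the digit-only patterns. The class name IframeResizer, the method SetSize, and the sample values 560 and 349 are made up for illustration; only the patterns come from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IframeResizer
{
    // Hypothetical helper (not from the question): swaps in fixed width/height values.
    public static string SetSize(string iframeHtml, int width, int height)
    {
        // [0-9]{1,4} matches only 1 to 4 digits, so each match stops at the attribute's closing quote.
        string result = Regex.Replace(iframeHtml, "width=\"([0-9]{1,4})\"", "width=\"" + width + "\"");
        return Regex.Replace(result, "height=\"([0-9]{1,4})\"", "height=\"" + height + "\"");
    }

    public static void Main()
    {
        // Sample input; 560 and 349 are assumed values standing in for ###.
        string iFrame = "<iframe width=\"560\" height=\"349\" src=\"http://www.youtube.com/embed/xxxxxx\" frameborder=\"0\" allowfullscreen></iframe>";

        // The greedy .* runs to the last quote in the string,
        // swallowing the height, src and frameborder attributes.
        Console.WriteLine(Regex.Match(iFrame, "width=\".*\"").Value);

        // With the digit-only patterns, only the two values change.
        Console.WriteLine(SetSize(iFrame, 315, 215));
    }
}
```

Note that these patterns only match values made of 1 to 4 digits, so something like width="100%" would be left untouched.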

Upvotes: 9

Richard

Reputation: 86

I agree that this isn't the best way to work with HTML. The problem with your example is the .* in your regex, which matches any characters, including spaces, up to the last " in the string. Change it to the code below, which only matches non-whitespace characters.

iFrame = Regex.Replace(iFrame, @"width=""[^\s]*""", "width=\"315\"");
iFrame = Regex.Replace(iFrame, @"height=""[^\s]*""", "height=\"215\"");

Upvotes: 3

Oded

Reputation: 499212

You are better off using the HTML Agility Pack to parse and query HTML. It handles HTML fragments well.

RegEx is not a good solution for parsing HTML, as this SO answer may convince you.
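
As a rough sketch of what that could look like, assuming the HtmlAgilityPack NuGet package is referenced: the class name IframeResizerHap, the method SetSize, and the sample values 560 and 349 are made up for illustration, and the serialized markup may differ slightly from the input string.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET

public static class IframeResizerHap
{
    // Hypothetical helper: parses the fragment and sets width/height on the first iframe.
    public static string SetSize(string iframeHtml, string width, string height)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(iframeHtml);

        // XPath query for the first iframe element anywhere in the fragment.
        HtmlNode iframe = doc.DocumentNode.SelectSingleNode("//iframe");
        if (iframe == null)
        {
            return iframeHtml; // no iframe found, return the input unchanged
        }

        // Overwrites the attribute value, or adds the attribute if it is missing.
        iframe.SetAttributeValue("width", width);
        iframe.SetAttributeValue("height", height);

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Sample input; 560 and 349 are assumed values standing in for ###.
        string iFrame = "<iframe width=\"560\" height=\"349\" src=\"http://www.youtube.com/embed/xxxxxx\" frameborder=\"0\" allowfullscreen></iframe>";
        Console.WriteLine(SetSize(iFrame, "315", "215"));
    }
}
```

Because the attributes are addressed by name rather than by pattern, this does not depend on the old values being digits or on the order of the attributes.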

Upvotes: 3
