Liron Harel

Reputation: 11247

YouTube Regex replacement C# breaking HTML

I am using this code to replace YouTube URLs with an icon that, when clicked, opens a lightbox showing the video.

Here's the C# code:

    const string pattern = @"(?:https?:\/\/)?(?:www\.)?(?:(?:(?:youtube.com\/watch\?[^?]*v=|youtu.be\/)([\w\-]+))(?:[^\s?]+)?)";
    const string replacement = "<a title='Click to watch the video' rel='nofollow' class='youtube-popup' href='//www.youtube.com/watch?v=$1' data-lity><span class='fa fa-play'></span>Watch</a>";

    var rgx = new Regex(pattern);
    var result = rgx.Replace(theinput, replacement);
    if (!string.IsNullOrEmpty(result))
    {
        return result;
    }

The code replaces the video URLs and shows the icons, but it also cuts into the HTML that follows the last link. The markup <p class="tags"></p> is reduced to class='tags'> (both paragraph tags are gone), and because of that the leftover fragment ends up inside the preceding element, the one that contains the links.

I tested it with two links in the same paragraph, separated with text and spaces between them of course.

How can I change the regex so that it works and does not break the HTML in this particular example?

Upvotes: 0

Views: 77

Answers (1)

pinkfloydx33

Reputation: 12739

This regex seems to work for me, though I'm not completely sure of all the formats that YouTube URLs can come in. Your regex was not stopping at the < and kept matching until the first whitespace (the space before class), because the trailing (?:[^\s?]+)? consumes every character after the video ID that is not whitespace or a ?. That is why it was eating part of the following tags. Also note that you need to escape the . inside youtube.com and youtu.be; unescaped, it matches any character.

(?:https?:\/\/)?(?:www\.)?(?:(?:(?:youtube\.com\/watch\?[^?]*v=|youtu\.be\/)))([\w-]+)
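As a rough illustration of how that pattern could be plugged into the original replacement code, here is a minimal sketch. The class name YouTubeLinkDemo, the method name ReplaceYouTubeLinks, and the sample input are made up for the example; only the pattern and the replacement string come from this thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class YouTubeLinkDemo
{
    // Pattern from this answer: it captures the video ID and consumes nothing after it.
    private const string Pattern =
        @"(?:https?:\/\/)?(?:www\.)?(?:(?:(?:youtube\.com\/watch\?[^?]*v=|youtu\.be\/)))([\w-]+)";

    // Replacement string from the question; $1 is the captured video ID.
    private const string Replacement =
        "<a title='Click to watch the video' rel='nofollow' class='youtube-popup' href='//www.youtube.com/watch?v=$1' data-lity><span class='fa fa-play'></span>Watch</a>";

    // Hypothetical helper: replaces every matched YouTube URL with the lightbox link.
    public static string ReplaceYouTubeLinks(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }
        return Regex.Replace(input, Pattern, Replacement);
    }

    public static void Main()
    {
        // Made-up input: two links in one paragraph, followed by the tags paragraph.
        const string html =
            "<p>First https://www.youtube.com/watch?v=abc123 then https://youtu.be/xyz-789</p><p class=\"tags\"></p>";

        Console.WriteLine(ReplaceYouTubeLinks(html));
    }
}
```

With this input the match for the second link should end at the video ID, so the closing </p> and the <p class="tags"></p> that follow are left untouched.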

Also please keep in mind: You can't parse HTML with regex

Upvotes: 1
