preg matching all hrefs and srcs in a string

Question

I'm trying to extract all the hrefs and srcs in a string like this:

$content = "
At vero eos et accusamus et iusto odio dignissimos ducimus qui blanditiis praesentium
voluptatum deleniti Image:  Link: test.xls";

Basically, what I want to do is change example.com to a different domain name (say test.com) and then extract all the filenames from the hrefs and srcs. I was able to do the domain name replacement with a simple str_replace, but now I'm stuck trying to extract the hrefs and srcs.

Here's what I tried using:

$regex = "/src=[\"' ]?([^\"' >]+)[\"' ]?[^>]*>.*?href=[\"' ]?([^\"' >]+)[\"' ]?[^>]*>/i";

This seems to work if there is no space between src (or href) and the =, but if there is a space it does not work. I've tried adding the space character to the pattern, but then the preg match fails. I don't want to use a heavy library like Simple HTML DOM; besides, I don't think it would work, as the input is not a proper HTML document. It's a string coming out of CKEditor.

Answers (1)
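
One possible approach, as a minimal sketch rather than a definitive solution: match each attribute on its own with preg_match_all instead of requiring a src followed by an href in a single pattern, and allow optional whitespace on both sides of the = with \s*. The sample $content below is hypothetical (the markup from the question was not preserved), and the sketch assumes attribute values are always quoted.

```php
<?php
// Hypothetical sample input: the real CKEditor markup is not shown in the question.
$content = '<p>Image: <img src = "http://test.com/images/photo.jpg" alt=""> '
         . "Link: <a href='http://test.com/files/test.xls'>test.xls</a></p>";

// \b(src|href)   attribute name, starting on a word boundary
// \s*=\s*        optional whitespace on either side of the equals sign
// (["'])(.*?)\2  quoted value, closed by the same quote character that opened it
$regex = '/\b(src|href)\s*=\s*(["\'])(.*?)\2/i';

// PREG_SET_ORDER gives one array per match: [full, attribute, quote, value]
preg_match_all($regex, $content, $matches, PREG_SET_ORDER);

$urls = [];
foreach ($matches as $m) {
    $urls[] = $m[3];
}

// Reduce each URL to its filename (last path segment).
$files = array_map(function ($url) {
    return basename((string) parse_url($url, PHP_URL_PATH));
}, $urls);

print_r($urls);
print_r($files);
```

With the sample input above, $files ends up as photo.jpg and test.xls. Note that this pattern also matches attributes such as data-src and skips unquoted values; if the input turns out to be messier than that, PHP's built-in DOMDocument may be worth a look, since it can load HTML fragments and needs no external library.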
