Doug

Reputation: 49

Replace spaces in all URLs with %20 using Regex

I have a large block of HTML that contains multiple URLs with spaces in them. How do I use a regex to replace every space that occurs in a URL with '%20'? The good thing is that all of the URLs end with '.pdf'.

I'm looking for something I could run in BBEdit/TextWrangler, or even PHP.

Example: http://www.site-name.com/dir/file name here.pdf

Need to return: http://www.site-name.com/dir/file%20name%20here.pdf

Upvotes: 0

Views: 1553

Answers (2)

Paul Chris Jones

Reputation: 2919

I was faced with exactly the same problem. I solved it with this:

    $text = preg_replace("/http(.*) (.*)\.pdf/U", "http$1%20$2.pdf", $text);

This looks for a space between http and pdf and then replaces the space with %20.

If your URLs have multiple spaces, then simply run the code over and over until all the spaces are gone:

    // Repeat until no URL ending in .pdf contains a space.
    while (preg_match("/http(.*) (.*)\.pdf/U", $text))
    {
        $text = preg_replace("/http(.*) (.*)\.pdf/U", "http$1%20$2.pdf", $text);
    }

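However, I've found that this also replaces spaces in the text between URLs when two or more URLs appear on the same line, because the pattern matches from the first http to a later .pdf on that line. I haven't found a solution for this yet.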
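One possible way around the same-line problem is to match each URL on its own and replace spaces only inside that match. The following is a minimal sketch, not a tested fix for every input: the function name encode_pdf_url_spaces is hypothetical, and it assumes that a URL never contains quotes or angle brackets, so a match cannot run from one href attribute into the next.

```php
<?php
// Sketch: encode spaces inside each .pdf URL independently.
// Assumption: URLs contain no quotes, angle brackets, or line breaks,
// so the character class stops a match at the end of an HTML attribute.
function encode_pdf_url_spaces(string $html): string
{
    $result = preg_replace_callback(
        '~https?://[^"\'<>\r\n]*?\.pdf~i',
        function (array $m): string {
            // $m[0] is one whole URL; only its spaces are touched.
            return str_replace(' ', '%20', $m[0]);
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<a href="http://www.site-name.com/dir/file name here.pdf">one</a> '
      . '<a href="http://www.site-name.com/dir/other file.pdf">two</a>';

echo encode_pdf_url_spaces($html), "\n";
// <a href="http://www.site-name.com/dir/file%20name%20here.pdf">one</a> <a href="http://www.site-name.com/dir/other%20file.pdf">two</a>
```

Because the replacement happens in a callback, all spaces in a URL are handled in one pass, so no loop is needed. URLs that appear as bare text followed by other words on the same line would still need a stricter pattern.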

Upvotes: 1

Mehran Hatami

Reputation: 12961

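Instead of a regex, you could use urlencode in PHP, which escapes the URL for you, similar to encodeURI in JavaScript. Note that urlencode encodes a space as + rather than %20 and also escapes : and /, so rawurlencode applied to the file name is the closer match for the output you want.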
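A sketch of this approach, applied per path segment so that the scheme, host, and slashes are left alone. The function name encode_url_path is hypothetical, and the sketch assumes an http(s) URL with no query string, no fragment, and no segments that are already percent-encoded (those would be encoded twice).

```php
<?php
// Sketch: percent-encode each path segment of a URL with rawurlencode.
// Assumption: the URL looks like scheme://host/path, without ?query or #fragment.
function encode_url_path(string $url): string
{
    // Split into "scheme://host" and the path that follows it.
    if (!preg_match('~^(https?://[^/]+)(/.*)?$~i', $url, $m)) {
        return $url; // not an http(s) URL: leave it untouched
    }

    $path = $m[2] ?? '';
    // Encode every segment on its own so the separating slashes survive.
    $segments = array_map('rawurlencode', explode('/', $path));

    return $m[1] . implode('/', $segments);
}

echo urlencode('file name here.pdf'), "\n";    // file+name+here.pdf
echo rawurlencode('file name here.pdf'), "\n"; // file%20name%20here.pdf
echo encode_url_path('http://www.site-name.com/dir/file name here.pdf'), "\n";
// http://www.site-name.com/dir/file%20name%20here.pdf
```

This still needs some way to find the URLs inside the HTML first, so it complements a regex rather than replacing it entirely.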

Upvotes: 1
