mateus

Replace URL paths using regex

How can I change the URLs of my images from this:

http://www.myOLDwebsite.com/**********.*** (I have GIFs, JPGs, and PNGs)

to this:

http://www.myNEWwebiste.com/somedirectory/**********.***

using a text editor that supports regular expressions?

Thanks a lot for your time.

Regards,

Mateus

Upvotes: 0

Views: 4554

Answers (2)

Tomalak

Reputation: 338228

Why use regex?

Using conventional means, replace:

src="http://www.myOLDwebsite.com/

with:

src="http://www.myNEWwebiste.com/somedirectory/

Granted, this assumes your image tags always follow the 'src="<url>"' pattern exactly, with double quotes and no whitespace around the equals sign.
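If you would rather script the plain replacement than do it by hand in an editor, here is a minimal sketch in Python. It is only an illustration: the function name replace_prefix is made up, and the two prefixes are the ones from the question.

```python
# Prefixes taken from the question; adjust to your real host names.
OLD_PREFIX = 'src="http://www.myOLDwebsite.com/'
NEW_PREFIX = 'src="http://www.myNEWwebiste.com/somedirectory/'


def replace_prefix(html: str) -> str:
    """Literal (non-regex) replacement of every occurrence of OLD_PREFIX."""
    return html.replace(OLD_PREFIX, NEW_PREFIX)


print(replace_prefix('<img src="http://www.myOLDwebsite.com/logo.png">'))
# prints: <img src="http://www.myNEWwebiste.com/somedirectory/logo.png">
```

Note that a tag written with single quotes, or with spaces around the equals sign, is left untouched by this version.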

Using regex is of course also possible. Replace this:

(src\s*=\s*["'])http://www\.myOLDwebsite\.com/

with:

\1http://www.myNEWwebiste.com/somedirectory/

Alternatively, if your text editor uses $ to mark back-references:

$1http://www.myNEWwebiste.com/somedirectory/
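For reference, the same pattern can be tried outside an editor. The following is a rough sketch using Python's re module; the function name rewrite_src is hypothetical, and Python writes the back-reference as \g<1> (or \1).

```python
import re

# Same pattern as above: capture 'src', optional whitespace, '=', and the opening quote.
PATTERN = re.compile(r"""(src\s*=\s*["'])http://www\.myOLDwebsite\.com/""")
# \g<1> re-inserts the captured 'src=' part, so the original quote style is kept.
REPLACEMENT = r"\g<1>http://www.myNEWwebiste.com/somedirectory/"


def rewrite_src(html: str) -> str:
    """Rewrite the old host prefix in src attributes; other text is left alone."""
    return PATTERN.sub(REPLACEMENT, html)


print(rewrite_src("<img src = 'http://www.myOLDwebsite.com/a.gif'>"))
# prints: <img src = 'http://www.myNEWwebiste.com/somedirectory/a.gif'>
```

Because the pattern is anchored on src, links in href attributes or in plain text are not changed.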

On second thought - why do your images have absolute URLs in the first place? Isn't that unnecessary?

Upvotes: 4

Charles Duffy

Reputation: 295472

Well, the easiest way is probably to use sed in in-place mode:

# GNU sed; BSD/macOS sed needs an explicit backup suffix, e.g. sed -i '' ...
sed -i \
 's@http://www[.]myOLDwebsite[.]com/@http://www.myNEWwebsite.com/subdirectory/@g' \
 file1 file2 ...

If for some reason you need to actually interpret the HTML (rather than just do a simple string replacement), a quick script built around BeautifulSoup is going to be safer -- lots of people try to do HTML or XML parsing via regular expressions, but it's very hard if not impossible to cover all corner cases.
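To give an idea of what the parser-based route could look like, here is a small sketch. It deliberately swaps in Python's standard-library html.parser for BeautifulSoup so that it runs without installing anything; the names ImgTagLocator and rewrite_img_src are invented for this example. The parser is used only to locate real <img> tags, and the replacement is then applied inside those tags while the rest of the file is copied through unchanged.

```python
from html.parser import HTMLParser

OLD = "http://www.myOLDwebsite.com/"
NEW = "http://www.myNEWwebsite.com/subdirectory/"


class ImgTagLocator(HTMLParser):
    """Records where each <img> tag whose src starts with OLD sits in the input."""

    def __init__(self):
        super().__init__()
        self.hits = []  # (line, column, raw tag text)

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased; self-closing tags
        # (<img ... />) are routed here too by the default handler.
        if tag == "img" and any(
            name == "src" and value and value.startswith(OLD)
            for name, value in attrs
        ):
            line, column = self.getpos()
            self.hits.append((line, column, self.get_starttag_text()))


def rewrite_img_src(html: str) -> str:
    locator = ImgTagLocator()
    locator.feed(html)
    locator.close()

    # getpos() reports 1-based lines split on "\n", so index line starts the same way.
    line_starts = [0] + [i + 1 for i, ch in enumerate(html) if ch == "\n"]

    pieces, cursor = [], 0
    for line, column, raw_tag in locator.hits:
        start = line_starts[line - 1] + column
        pieces.append(html[cursor:start])
        # Replace only inside this one tag; everything else is copied verbatim.
        pieces.append(raw_tag.replace(OLD, NEW))
        cursor = start + len(raw_tag)
    pieces.append(html[cursor:])
    return "".join(pieces)
```

Unlike a global search and replace, this leaves the old URL alone when it appears in ordinary text, comments, or script blocks. The trade-off is that any other attribute of a matched <img> tag that contains the old prefix is rewritten as well.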

All that said, it would be better to use relative links, so that your HTML does not depend on the server it is hosted on. See also the <BASE HREF="..."> element, which you can put in your <HEAD> to specify the location that relative URLs are resolved against; if you were using that, you'd only need to do a single replacement.

Upvotes: 2
