anarchitecton

Reputation: 31

Regular Expression for relative links ONLY

I'm writing a JavaScript snippet that checks for links in the DOM and changes those that are NOT absolute. Unfortunately, I'm not having any luck...

I would like to match only the first type of link below and prepend a folder path to it:

  1. <a href="somepage.html">link</a>
  2. <a href="http://somesite.net/somepage.html">link</a>

I've used string.replace(/a.+href="([^http]+)"/, 'path'+$1); to no avail...

Can someone help me here? Thanks in advance.

Upvotes: 3

Views: 3260

Answers (6)

anarchitecton

Reputation: 31

Thanks everyone.

I was able to replace ONLY the relative paths by using the following syntax:

var basepath = "pathto/";
var html = html.replace(/(<(a|img)[^>]+(href|src)=")(?!http)([^"]+)/g, '$1'+basepath+'$4');

Upvotes: 0

Ateş Göral

Reputation: 140052

If the regular expression that you've written to solve a problem using just regular expressions starts to look like overkill, then it is probably overkill. Sometimes a simple if statement used in conjunction with regular expressions can do wonders:

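$("a").each(function () {
    // Read the raw attribute: the this.href property is already resolved to an absolute URL.
    var href = this.getAttribute("href");
    if (href && !/^https?:\/\//.test(href)) {
        this.href = "http://example.com/folder/" + href; // etc.
    }
});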
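
The same "simple if plus a small regex" idea can be sketched without jQuery or a DOM, as a plain function over raw href values. The helper name prefixIfRelative and the base URL are assumptions made up for this illustration; in a browser you would likely pass it this.getAttribute("href"), since the href property usually comes back already resolved to an absolute URL.

```javascript
// Hypothetical helper: prefix `base` only when the raw href has no http(s) scheme.
function prefixIfRelative(rawHref, base) {
    if (/^https?:\/\//i.test(rawHref)) {
        return rawHref; // already absolute, leave it untouched
    }
    return base + rawHref;
}

// Assumed base folder, borrowed from the example above.
var base = "http://example.com/folder/";

// The two link types from the question.
var results = ["somepage.html", "http://somesite.net/somepage.html"].map(function (href) {
    return prefixIfRelative(href, base);
});

console.log(results);
```

Keeping the decision in an if statement makes it easy to add further cases later (for example mailto: links) without growing the regex.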

Upvotes: 2

CrayonViolent

Reputation: 32532

For example's sake, I just made a variable with a couple of links in it. You can easily adapt the .replace() to work with however you get the links.

var content = '<a href="/somepage.html">link</a><a href="http://somesite.net/somepage.html">link</a><a href="somepage.html">link</a>';

// whatever you want to prefix link with
var base='http://somesite.net';

// The optional \/? drops a leading slash from the link, so base + '/' never yields a double slash
content = content.replace(/(href=")(?!https?:\/\/)\/?([^"]*)/gi,'$1'+base+'/$2');

Upvotes: 0

Gabriele Petrioli

Reputation: 195982

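You can use a negative lookahead, (?!http), to skip absolute links. Matching [^>]* and [^"]+ instead of .+ keeps each match inside a single tag:

string.replace(/(<a\b[^>]*?\shref=)"(?!http)([^"]+)"/gi, '$1"path/$2"')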

Upvotes: 0

SpliFF

Reputation: 38956

You've created a character class with the square brackets: [^http] matches any single character other than h, t, or p, not "anything that doesn't start with http". Remove them. What you want is a negative lookahead, (?!http), placed right after the opening quote.

Note that older JavaScript engines don't support lookbehind (it was only added in ES2018). If you need one there, this may help: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
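
To see why the square brackets matter, here is a small sketch contrasting the character class from the question with a (?!http) negative lookahead, which JavaScript does support. The sample string and the "path/" prefix are taken from the question; the variable names are only for illustration.

```javascript
// Sample markup from the question: one relative link, one absolute link.
var sample = '<a href="somepage.html">link</a> <a href="http://somesite.net/somepage.html">link</a>';

// [^http]+ is a character class: one or more characters that are not h, t or p.
// "somepage.html" contains p, h and t, so the class can never reach the closing quote.
var classMatches = /href="([^http]+)"/.test(sample);

// (?!http) is a negative lookahead: it consumes nothing and only asserts
// that the attribute value does not begin with "http".
var rewritten = sample.replace(/(href=")(?!http)([^"]+)/g, "$1" + "path/" + "$2");

console.log(classMatches);
console.log(rewritten);
```

Only the first link gets the prefix; the absolute link is left alone because the lookahead rejects it before anything is replaced.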

Upvotes: 0

John Douthat

Reputation: 41179

You may want to look at the <base> HTML tag instead. It lets you set the base URL against which all relative links and image paths on the page are resolved.

http://www.w3schools.com/tags/tag_base.asp

http://www.w3.org/TR/html5/semantics.html#the-base-element
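
As a rough illustration of the kind of resolution <base> triggers, the standard URL constructor resolves a reference against a base URL in much the same way. This is only a sketch: the base value below is an assumption, and the URL constructor needs it to be absolute.

```javascript
// Assumed base URL, playing the role of <base href="http://example.com/folder/">.
var base = "http://example.com/folder/";

// A relative reference is resolved against the base...
var relative = new URL("somepage.html", base).href;

// ...while an absolute URL ignores the base entirely.
var absolute = new URL("http://somesite.net/somepage.html", base).href;

console.log(relative);
console.log(absolute);
```

With <base> in the page head, the browser does this for every relative link, so no string rewriting is needed.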

Upvotes: 1
