Reputation: 1452

Extract just the path from CSS and exclude anything after the path

Using the regex below, I can extract the path from the CSS, but the result also includes the part after the "#" and "?". Is there any way to extract just the path?

Regex

url\([\s]?[\"|\']?(.*?)[\"|\']?[\s]?\)

String

url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot?#iefix') format('embedded-opentype')
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.woff2') format('woff2')
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.woff') format('woff')
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.ttf') format('truetype')
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.svg#OpenSans') format('svg')

Expected

../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot

Actual

../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot?#iefix

Upvotes: 2

Answers (2)

J-Cake

Reputation: 1618

If you're looking for a non-regex option, I wrote this little function that parses a URL or path and returns an object with the individual pieces:

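function parseURL(url) {
    url = decodeURI(url);

    // Split off the fragment first: everything after the first "#".
    const hashIndex = url.indexOf("#");
    const fragments = hashIndex === -1 ? undefined : url.slice(hashIndex + 1);
    const beforeHash = hashIndex === -1 ? url : url.slice(0, hashIndex);

    // Then split off the query string: everything after the first "?".
    const queryIndex = beforeHash.indexOf("?");
    const path = queryIndex === -1 ? beforeHash : beforeHash.slice(0, queryIndex);
    const query = queryIndex === -1 ? "" : beforeHash.slice(queryIndex + 1);

    // Build the params object, skipping empty pairs (e.g. for "file.eot?#iefix").
    const params = {};
    query.split("&").filter(Boolean).forEach(pair => {
        const [key, value] = pair.split("=");
        params[key] = value;
    });

    return {
        path: encodeURI(path),
        params,
        fragments
    };
}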

All you need to do is get the path property from the returned object.

$ node
> parseURL("https://example.com/path/to/resource?name=param&purpose=none#extrainfo")
{
    path: "https://example.com/path/to/resource",
    params: {
      name: "param",
      purpose: "none"
    },
    fragments: "extrainfo"
}
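
If only the path is needed and the query parameters and fragment can be discarded, a shorter sketch is to cut the string at the first "?" or "#". This is a simplification rather than the function above, and the helper name stripQueryAndFragment is hypothetical:

```javascript
// Hypothetical helper: keeps everything before the first "?" or "#".
// If neither character is present, the input is returned unchanged.
function stripQueryAndFragment(url) {
    // The limit of 1 makes split() return only the first piece.
    return url.split(/[?#]/, 1)[0];
}

console.log(stripQueryAndFragment("../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot?#iefix"));
// → ../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot
```

This still expects the bare URL as input, so the url('...') wrapper has to be removed from the CSS first.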

Upvotes: 0

Wiktor Stribiżew

Reputation: 626806

You may use

url\(\s*(["']?)([^()?#]*)(?:[#?].*?)?\1\s*\)

See the regex demo. The result will be in Group 2.

Details

url\( - url( substring
\s* - 0+ whitespaces
(["']?) - Group 1: an optional ' or "
([^()?#]*) - Group 2: any 0+ chars other than ?, #, ) and (
(?:[#?].*?)? - an optional substring starting with ? or # followed by any 0+ chars other than line break chars, as few as possible (to make it more efficient, replace .*? here with [^()]*; compare with this demo)
\1 - same value as captured into Group 1
\s* - 0+ whitespaces
\) - a ) char.
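
One way to apply the pattern in JavaScript is String.prototype.matchAll with the g flag, reading Group 2 from each match. The following is a minimal sketch that assumes the CSS is already available as a string; the helper name extractPaths and the shortened sample input are illustrative only:

```javascript
// Pattern from this answer; the g flag lets matchAll find every url(...) in the input.
const URL_PATH = /url\(\s*(["']?)([^()?#]*)(?:[#?].*?)?\1\s*\)/g;

// Hypothetical helper: returns the path part (Group 2) of every url(...) in a CSS string.
function extractPaths(css) {
    return Array.from(css.matchAll(URL_PATH), match => match[2]);
}

const css = `
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot?#iefix') format('embedded-opentype'),
url('../fonts/Google_OpenSans/open-sans-v15-latin-italic.svg#OpenSans') format('svg')
`;

console.log(extractPaths(css));
// → [ '../fonts/Google_OpenSans/open-sans-v15-latin-italic.eot',
//     '../fonts/Google_OpenSans/open-sans-v15-latin-italic.svg' ]
```

Note that for an unquoted URL with whitespace before the closing parenthesis, Group 2 would include that trailing whitespace, so calling trim() on the result may be worthwhile.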

Upvotes: 1