Phil Barnes

Reputation: 133

Using Regex to identify a string starting with Quotation Marks but returning a substring

I'm really struggling to get this Regex code to play ball. I'm a beginner and trying to use Regex to identify a certain string within JSON.

For example, within this data:

window.dataAnalyticsJSON = {
    "configuration": {
        "SiteCatalyst": {
            "reportSuiteId": "testsuite"
        },
        "marketingRegion": "gb",
        "contentLanguage": "en",
        "contentLocale": "gb",
        "currency": "GBP"
    },
    "pageId": "testpage",
    "siteSection": "testsitesection",
    "site": "testsite"
}

I am trying to extract the value 'testpage' (without quotes) - only that. I have tried multiple "begins with" patterns, but none of them returns just this value.

My best solution so far returns this:

"pageId": "testpage

With the regular expression being

/["'](pageId": ".*?)["']/g

How can I return just testpage on its own? The idea is that I could then run this code across a website to quickly get individual page names.

Thanks in advance for any help you may have to offer!

Upvotes: 0

Views: 97

Answers (5)

HopefullyHelpful

Reputation: 1799

What you are looking for is lookahead and lookbehind: the regex engine checks that certain text appears directly before or after a possible match, but does not include that text in the match itself.

What works for your case would be (?<=\"pageId\"\:\s\")(.*)(?=\")

?<= indicates a lookbehind, which means the group must be found immediately before any possible match at that location in the regex

?= indicates a lookahead, which means the group must be found immediately after any possible match at that location in the regex

Tested with https://regex101.com/

If you need a refresher on the syntax, here is a good lookup table for lookahead/lookbehind: http://www.rexegg.com/regex-lookarounds.html
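As a minimal sketch of the lookaround approach in JavaScript: the sample string below is a shortened copy of the question's data, and the variable names are made up for illustration. It swaps the greedy (.*) for [^"]*, which cannot run past the closing quote. Lookbehind needs an engine with ES2018 regex support, so this is an assumption about where the code runs.

```javascript
// Sample text shaped like the question's data (shortened for the example).
const source = `window.dataAnalyticsJSON = {
    "pageId": "testpage",
    "siteSection": "testsitesection"
}`;

// The lookbehind anchors on the key; the lookahead requires the closing quote.
// [^"]* matches the value only, since it cannot cross a quotation mark.
const pageIdPattern = /(?<="pageId":\s")[^"]*(?=")/;

const lookaroundMatch = source.match(pageIdPattern);
const pageIdFromLookaround = lookaroundMatch ? lookaroundMatch[0] : null;
console.log(pageIdFromLookaround);
```

Because the surrounding text sits in lookarounds, the whole match (index 0) is already the bare value and no capture group is needed.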

Upvotes: 1

Nabeel Khan

Reputation: 3993

You need a lot of learning dude :/

Try this:

/\"pageId\"\:\s\"(.*?)\"/   

Tested with your example here:

https://regex101.com/r/mL3yR9/1

If you simply want the value of pageId, you can also decode the JSON into an object and read the pageId property from it directly.
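A rough sketch of the decoding route, assuming the data is available as raw text with the window.dataAnalyticsJSON = prefix and that the braces enclose valid JSON. The sample string and variable names are hypothetical.

```javascript
// Hypothetical raw text, as it might appear inside a page's script tag.
const rawScript = `window.dataAnalyticsJSON = {
    "configuration": { "marketingRegion": "gb" },
    "pageId": "testpage",
    "site": "testsite"
}`;

// Keep everything from the first "{" to the last "}" and parse it as JSON.
const jsonText = rawScript.slice(
  rawScript.indexOf("{"),
  rawScript.lastIndexOf("}") + 1
);
const analytics = JSON.parse(jsonText);

// pageId is a top-level property of the parsed object.
console.log(analytics.pageId);
```

If the code runs inside the page itself, the assignment in the question suggests the object is already available, so window.dataAnalyticsJSON.pageId should work without any parsing.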

You're welcome :) Mark this as best answer. Let me know if you want to donate :P or if you want me to give you some tuition!

Upvotes: -1

user4956851

Reputation:

Why do you need regex if you have a JSON file?

$.getJSON('../data/fileName.json', function (data) {
  // Visit each entry of the loaded data
  $.each(data, function (index, instance) {
    if (instance.pageId === "testpage") {
      // do your stuff with testpage
    }
  });
}); // end get

JSON files usually have a meaningful structure. Once you understand it, you can use the .getJSON function to read whatever you need.

Upvotes: 2

isopropylcyanide

Reputation: 425

I think this would work:

"pageId": "([a-z0-9]*)",

The part in parentheses forms a group, then you could use

$1

to get the corresponding name. If special characters are allowed,

"pageId": "(.*)", #would work

Upvotes: 2

JBux

Reputation: 1394

You're putting pageId in the capturing group.

Try:

/pageId": "(.*?)"/g
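To show how the captured value is read in JavaScript, here is a small sketch using that pattern with matchAll (which requires the g flag and an ES2020 engine). The sample string and variable names are assumptions for illustration.

```javascript
// Shortened sample text based on the question's data.
const pageSource = `"pageId": "testpage",
"siteSection": "testsitesection"`;

const capturePattern = /pageId": "(.*?)"/g;

// matchAll yields one match array per hit; index 0 is the whole match,
// index 1 is the first capture group, i.e. the value without quotes.
const pageIds = Array.from(pageSource.matchAll(capturePattern), (m) => m[1]);
console.log(pageIds);
```

The full match still contains pageId": " and the closing quote; only the group at index 1 holds the bare page name.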


Upvotes: 4
