Reputation: 822
I am very new to regex and I'm not sure how to pluck a piece of text from a very large string using regex.
Suppose the string is the one below. FYI: this string is generated dynamically, pulling different elements from the database and the DOM, and I don't have much control over how it gets created.
Lorem ipsum dolor sit amet, consectetur adipisicing elit. Voluptas architecto dicta amet cumque, atque, labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.
<span itemprop="itemNum">56789</span>
labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.
I need to get the text inside the span that has an itemprop labeled itemNum.
I tried this but it did not work for me:
/\b(itemprop=\"sku\"")\b/g
Ultimately I would have only 56789 in a variable.
Thank you all in advance.
Upvotes: 0
Views: 86
Reputation: 1531
If you don't necessarily have to use regex, one approach is to parse the string with DOMParser first and then select the element with, e.g., querySelector:
const str = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Voluptas architecto dicta amet cumque, atque, labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis. <span itemprop="itemNum">56789</span> labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.';
const parser = new DOMParser();
const doc = parser.parseFromString(str, "text/html");
console.log(doc.querySelector('span[itemprop="itemNum"]').innerHTML)
Upvotes: 4
Reputation: 37775
One possible solution, which collects the inner text of every tag in the string:
let str = `Lorem ipsum dolor sit amet, consectetur adipisicing elit. Voluptas architecto dicta amet cumque, atque, labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis. <span itemprop="itemNum">56789</span> labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.`
let op = str.match(/<[^>]+>([^<]+)<\/[^>]+>/g).map(e=>e.replace(/.*?>(.*)<.*/, "$1"))
console.log(op)
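If only the itemNum span is wanted, the pattern can be anchored on the attribute and the value read from a capture group. This is a minimal sketch, assuming the attribute is written exactly as itemprop="itemNum" with double quotes and the span contains no nested tags; `extractItemNum` is a hypothetical helper name, not something from the question.

```javascript
// Hypothetical helper: returns the text of the first <span itemprop="itemNum">, or null.
function extractItemNum(html) {
  // [^>]* tolerates other attributes around itemprop; ([^<]*) stops at the next tag.
  const match = html.match(/<span\b[^>]*\bitemprop="itemNum"[^>]*>([^<]*)<\/span>/);
  return match ? match[1].trim() : null;
}

const sample = 'Lorem ipsum <b>dolor</b> <span itemprop="itemNum">56789</span> sit amet.';
console.log(extractItemNum(sample));         // 56789
console.log(extractItemNum('no span here')); // null
```

Returning null instead of throwing keeps the caller in control when the generated string happens not to contain the span.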
Upvotes: 0
Reputation: 17636
Use a regex lookbehind for itemprop="itemNum"> and a lookahead for </, then capture whatever is between them.
const data = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Voluptas architecto dicta amet cumque, atque, labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis. <span itemprop="itemNum">56789</span> labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.'
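const res = data
  // lazy .+? stops at the first closing tag instead of running to the last one
  .match(/(?<=itemprop="itemNum">).+?(?=<\/)/)
  // match() returns an array (or null if nothing matches); take the first value
  .shift();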
console.log(res);
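With a lookbehind/lookahead pair, the quantifier in between matters as soon as the string contains more than one closing tag: a greedy `.+` runs to the last `</`, while a lazy `.+?` stops at the first. A small sketch on a hypothetical two-tag input (the `twoTags` string is made up for illustration):

```javascript
// Hypothetical input with a second tag after the span, to expose the difference.
const twoTags = '<span itemprop="itemNum">56789</span> and <b>more</b>';

const greedy = twoTags.match(/(?<=itemprop="itemNum">).+(?=<\/)/);
const lazy = twoTags.match(/(?<=itemprop="itemNum">).+?(?=<\/)/);

// Guard against null in case the marker is missing from the string.
console.log(greedy ? greedy[0] : null); // 56789</span> and <b>more
console.log(lazy ? lazy[0] : null);     // 56789
```

Lookbehind assertions are not available in every older JavaScript engine, so a capture group may be the safer choice where browser support is uncertain.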
Upvotes: -1
Reputation: 433
Based on https://stackoverflow.com/a/14210948/3999647, with only the regex and the input updated:
function getMatches(string, regex, index) {
  index || (index = 1); // default to the first capturing group
  var matches = [];
  var match;
  while (match = regex.exec(string)) {
    matches.push(match[index]);
  }
  return matches;
}
// Example :
var myString = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit. Voluptas architecto dicta amet cumque, atque, labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis. <span itemprop="itemNum">56789</span> labore eos nobis earum fuga tempore officiis excepturi rerum placeat. Perferendis, earum officiis veniam dicta eius aliquid, similique porro quam necessitatibus nobis velit debitis.';
var myRegEx = /(<span itemprop="\w+">)(\d+)(<\/span>)/g;
// Get an array containing the second capturing group (the digits) for every match
var matches = getMatches(myString, myRegEx, 2);
// Log results
document.write(matches.length + ' matches found: ' + JSON.stringify(matches))
console.log(matches);
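On engines that support String.prototype.matchAll (ES2020), the exec loop can be written more compactly. The following is only a sketch: `getItemNums` is a hypothetical helper, and it assumes the span is written exactly as `<span itemprop="itemNum">` with a purely numeric value.

```javascript
// Hypothetical helper: collects the digits of every <span itemprop="itemNum"> in the string.
function getItemNums(html) {
  const regex = /<span itemprop="itemNum">(\d+)<\/span>/g; // matchAll requires the g flag
  // Each match m is an array; m[1] is the first capturing group (the digits).
  return Array.from(html.matchAll(regex), (m) => m[1]);
}

const input =
  '<span itemprop="itemNum">56789</span> text ' +
  '<span itemprop="other">1</span> ' +
  '<span itemprop="itemNum">42</span>';
console.log(getItemNums(input)); // two entries: '56789' and '42'
```

Pinning the attribute value to itemNum, rather than `\w+`, avoids picking up spans that carry a different itemprop.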
Upvotes: 1