shadowmonkey

Reputation: 1

Searching for a sub-string (which changes each loop iteration), inside a big string

I am looping through HTML elements, and extracting the following string from a custom dataset.

"<div class="item"><span class="label">Test:</span><span>server</span></div><div class="item"><span <span class="label">Assigned to:</span><span>name(position) </span></div><div"

What I need to do is extract the name. The problem is that the name is essentially a variable: each time the string appears in the loop, it always contains Assigned to:</span><span>name(position), and only the {{name}} part changes, though it shows up as a plain string. How do I search for and extract the name? Do I look for Assigned to:</span><span> and copy the word that follows? How would I do that?

Thanks!

Upvotes: 0

Views: 56

Answers (3)

Roger Krueger

Reputation: 303

Complicated searches are generally better done with regex. This appears to do what you want:

    const subj='<div class="item"><span class="label">Test:</span><span>server</span></div><div class="item"><span><span class="label">Assigned to:</span><span>Fred(position) </span></div><div';
    const re=/Assigned to:<\/span><span>([^\(]*)/;
    const answers=re.exec(subj);
    console.log(answers[1]);

We're assigning the string to "subj" and the regex search pattern to "re", then assigning the result of running exec on them to "answers".

The regex itself starts with a slash, followed by the literal text that precedes the name field, and then the magic.

The parentheses are a "capture group"; whatever matches what's inside them will be returned separately. Inside the square brackets, "^" means "not"; the "\" escapes the "(", which is a reserved character (the escape is optional inside square brackets, but harmless); and "*" means "find as many as you can."

So we're telling it to look for the prefix string, then return every character until we hit the "(".

Then the return handling. "exec" returns the entire match in [0]--not what we want. The first (only in this case) capture group is in [1].
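One caveat: exec returns null when the pattern isn't found, so answers[1] would throw on an item that has no "Assigned to:" field. Here is a minimal sketch of a guarded version for use inside a loop. The helper name extractAssignee and the sample strings are made up for illustration and are not from the question.

```javascript
// Hypothetical helper: returns the name after "Assigned to:", or null if absent.
function extractAssignee(html) {
  const re = /Assigned to:<\/span><span>([^(]*)/;
  const match = re.exec(html);
  // exec() returns null when nothing matches, so guard before indexing.
  return match ? match[1].trim() : null;
}

// Illustrative inputs, not taken from the real dataset.
const items = [
  '<span class="label">Assigned to:</span><span>Fred(position) </span>',
  '<span class="label">Assigned to:</span><span>Wilma(position) </span>',
  '<span class="label">Test:</span><span>server</span>',
];

for (const html of items) {
  console.log(extractAssignee(html)); // Fred, then Wilma, then null
}
```

Returning null rather than throwing lets the loop skip items without an assignee; whether that is the right behaviour depends on your data.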

Upvotes: 0

panepeter

Reputation: 4242

Without knowing the full scope of your problem, it's hard to say – but it might be a good alternative to get the contents you want straight from the DOM instead of manually filtering strings. In many cases, this tends to be more robust and maintainable compared to using regexes (which are great, anyway).

Document.querySelectorAll() and the Adjacent sibling combinator could be your friends here:

    // fetch all spans which are neighbour to a span with the class 'label'
    const targetSpans = document.querySelectorAll('span.label + span');
    // Iterate the items, outputting each of their contents
    targetSpans.forEach(target => {
      console.log(target.textContent);
    });

Given your code snippet, this would also match the span containing 'server' as its textContent. But if that's your only 'false positive', filtering it out should be pretty easy.

Like I said, with the provided information it's hard to say what solution solves your actual problem best. But DOM parsing might be an option as well.
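One way to do that filtering, as a rough sketch: start from the label spans, keep only those whose text is "Assigned to:", and read the span next to each. The helper name assignedNames is hypothetical, and I'm assuming the value always looks like name(position) as in the question.

```javascript
// Hypothetical helper: `root` is anything with querySelectorAll,
// e.g. `document` or a container element.
function assignedNames(root) {
  const names = [];
  // Start from the label spans so their text can be checked.
  root.querySelectorAll('span.label').forEach(label => {
    const value = label.nextElementSibling;
    if (label.textContent.trim() === 'Assigned to:' && value) {
      // "name(position) " -> "name"
      names.push(value.textContent.split('(')[0].trim());
    }
  });
  return names;
}

// In a browser: console.log(assignedNames(document));
```

Comparing the label text keeps the 'server' span out without needing a second pass over the results.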

Upvotes: 1

Ed_

Reputation: 19098

Your initial thought is correct - you need to search for the constant thing that will be surrounding your name in every string, and extract the name from within it.

From your question, it appears your name looks like this:

    <span class="label">Assigned to:</span><span>name(position) </span>

The way I would do this is with a regular expression. I find the site https://regex101.com/ really useful for getting them right. Paste your whole string in there, then build the expression and check that the part you want is being matched.

In this case, you want to have a regex something like this:

    const regex = /<span class="label">Assigned to:<\/span><span>(.*?)<\/span>/

You can see how this looks on Regex 101 here (note the captured group):

[Screenshot of Regex 101 showing the matched label]

The site also explains what each part of the regex does. In this case it's pretty much a straight text match (the \/ part simply escapes the / character within the regex). The only interesting part is that we capture everything within the match using a lazy quantifier, which means it won't capture more than it needs to. Without the lazy quantifier (the ?), the group would match everything after the first span up to the last closing span in the whole string, because the . character matches any character (except line breaks), so be careful of this (try adding an extra </span> onto your test string to see what I mean).
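To make the lazy versus greedy difference concrete, here is a small sketch. The sample string is a trimmed version of the one in the question with an extra </span> appended; the variable names are just for illustration.

```javascript
// Test string with an extra </span> appended, as suggested above.
const sample =
  '<span class="label">Assigned to:</span><span>name(position) </span></span>';

const lazy = /<span class="label">Assigned to:<\/span><span>(.*?)<\/span>/;
const greedy = /<span class="label">Assigned to:<\/span><span>(.*)<\/span>/;

// JSON.stringify makes the trailing space visible in the output.
console.log(JSON.stringify(sample.match(lazy)[1]));   // "name(position) "
console.log(JSON.stringify(sample.match(greedy)[1])); // "name(position) </span>"
```

The greedy version swallows the first closing tag because it keeps going until the last </span> it can find.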

I'll leave it up to you to read up on how to implement a regex match in JavaScript.
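As a starting point, here is one possible sketch using String.prototype.matchAll, which finds every occurrence in one big string. The input string and the names in it are invented for the example, and splitting on "(" assumes the name(position) shape from the question.

```javascript
// Illustrative input: two items concatenated, names invented.
const bigString =
  '<div class="item"><span class="label">Assigned to:</span><span>Fred(manager) </span></div>' +
  '<div class="item"><span class="label">Assigned to:</span><span>Wilma(engineer) </span></div>';

// Same pattern as above, plus the g flag that matchAll() requires.
const assignedRe = /<span class="label">Assigned to:<\/span><span>(.*?)<\/span>/g;

const found = [];
for (const m of bigString.matchAll(assignedRe)) {
  // m[1] is "name(position) "; keep only the part before "(".
  found.push(m[1].split('(')[0].trim());
}

console.log(found); // [ 'Fred', 'Wilma' ]
```

If you only ever have one occurrence per string, str.match(regex) without the g flag works the same way, but it returns null when nothing matches, so check for that before reading index 1.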

Upvotes: 0
