Devin

Reputation: 1992

Search for a pattern while excluding some characters in the input

This is part of a bigger problem I'm trying to solve. I'll describe both.

Say, with a needle like bi and haystack like Bird, I could use the following code to search for the needle and apply some formatting to the haystack:

const regExp = new RegExp(/bi/, 'gi');
inputStr = inputStr.replace(regExp, '*' + '$&' + '@');

The above turns inputStr into *Bi@rd. (I'll explain below why I'm adding these markers.)

Now, is it possible to search this modified haystack *Bi@rd for a needle ir?

The bigger problem I'm trying to solve: finding multiple needles in a haystack and highlighting them. The needles come as a space-separated string. The haystacks are user names, so they are not very long.

This is one of the solutions I came up with:

function highlighter(inputStr, searchQuery) {
  const needles = searchQuery.split(' ');
  let regExp;

  needles.forEach((pattern) => {
    regExp = new RegExp(pattern, 'gi');
    inputStr = inputStr.replace(regExp, '*' + '$&' + '@');
  });

  const highlightedStr = inputStr.replace(/[*]/g, '<span class="h">').replace(/[@]/g, '</span>');
  return highlightedStr;
}

My solution fails if I try to highlight bi ir in Bird: after the first pass the string is *Bi@rd, so ir no longer matches.

I have another solution, but it's complicated. So I'm wondering: short of using a library, what's the best way to solve my problem?

Upvotes: 1

Views: 29

Answers (1)

CertainPerformance

Reputation: 371233

One option is to put [(insertedChars)]* between each pair of characters in the needle, so that any characters that might have been inserted are matched optionally. In your particular example, the only character inserted between the needle's characters is @, so to find ir, you would use:

i[@]*r

But there's only one character in that character set, so it reduces to i@*r. Example:

const haystack = '*Bi@rd';
const re = /i@*r/i;
console.log(haystack.replace(re, '<<foobar>>'));

Another example: if the inserted characters can be @, #, or %, then for a haystack of qw##e@%rty and a needle of wer:

const haystack = 'qw##e@%rty';
const re = /w[@#%]*e[@#%]*r/i;
console.log(haystack.replace(re, '<<foobar>>'));

Also note that if the searchQuery items can contain regular expression special characters, they'll have to be escaped first.
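To illustrate why escaping matters, here is a minimal sketch. The helper name escapeRegExp is an assumption made for this example; it is not a built-in.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Unescaped, "." matches any character, so "a.b" also matches "axb".
const unescapedMatches = new RegExp('a.b', 'i').test('axb');

// Escaped, "." only matches a literal dot.
const escapedMatches = new RegExp(escapeRegExp('a.b'), 'i').test('axb');

// An unbalanced "(" is not a valid pattern at all unless it is escaped.
let unescapedThrows = false;
try {
  new RegExp('(');
} catch (e) {
  unescapedThrows = e instanceof SyntaxError;
}

console.log(unescapedMatches, escapedMatches, unescapedThrows); // true false true
```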

For a dynamic needle, you'll have to insert the character set in between each character programmatically. Using the above example again:

const escape = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

const haystack = 'qw##e@%rty';
const needle = 'wer';
const inserted = '@#%';
const re = new RegExp(
  escape(needle).replace(/\\?.(?!^)/g, '$&[' + inserted + ']*'),
  'gi'
);

console.log(haystack.replace(re, '<<foobar>>'));
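As far as I can tell, the (?!^) lookahead above never fails once a character has been consumed, so the character set also ends up after the last character of the needle. That is harmless in this example. If you want the set strictly between characters, one possible sketch is to escape each character and join them with the set. The names escapeRegExp, escapeForClass and buildLooseRegExp are made up for this illustration.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: escape characters that are special inside a
// character class (backslash, closing bracket, caret, hyphen).
const escapeForClass = (s) => s.replace(/[\\\]^-]/g, '\\$&');

// Build a regex that matches `needle` while tolerating any run of
// `inserted` characters between (not after) its characters.
function buildLooseRegExp(needle, inserted, flags = 'gi') {
  const gap = '[' + escapeForClass(inserted) + ']*';
  const pattern = Array.from(needle).map(escapeRegExp).join(gap);
  return new RegExp(pattern, flags);
}

const looseRe = buildLooseRegExp('wer', '@#%');
console.log(looseRe.source); // w[@#%]*e[@#%]*r
console.log('qw##e@%rty'.replace(looseRe, '<<foobar>>')); // q<<foobar>>ty
```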

Another thing to keep in mind is that rather than concatenating three strings together, you can simply use a single string:

inputStr = inputStr.replace(regExp, '*$&@');
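Putting the pieces together, a rough sketch of the highlighter from the question might look like the following. It assumes that neither the user names nor the needles contain the marker characters * and @, and it does not HTML-escape the name. escapeRegExp is a hypothetical helper, not a built-in.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

function highlighter(inputStr, searchQuery) {
  // Tolerate markers left by earlier passes between the needle's characters.
  const gap = '[*@]*';

  // Split on whitespace and drop empty needles (e.g. from double spaces).
  const needles = searchQuery.split(/\s+/).filter(Boolean);

  let marked = inputStr;
  for (const needle of needles) {
    const pattern = Array.from(needle).map(escapeRegExp).join(gap);
    marked = marked.replace(new RegExp(pattern, 'gi'), '*$&@');
  }

  // Turn the temporary markers into the real tags.
  return marked.replace(/\*/g, '<span class="h">').replace(/@/g, '</span>');
}

console.log(highlighter('Bird', 'bi ir'));
// <span class="h">B<span class="h">i</span>r</span>d
```

Overlapping needles produce nested spans rather than one merged span. Every opening marker still precedes its closing marker, so the tags stay balanced.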

Upvotes: 2
