Jericho

Reputation: 145

Build a JavaScript regex that is case-sensitive for the first character and case-insensitive for the rest

I have a large list of job skills, like you might see on LinkedIn ("Nuclear Physics", "Python", "Heavy Machinery", etc). I also have a large block of text: a job description. I am trying to iterate through the list and identify which skills are present in the block of text. Here is my current code:

  // escape possible special characters in a string
  // https://stackoverflow.com/questions/4371565/
  const escapeRegExp = (s) => {
    return s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&')
  }

  let skills_in_job = {}

  skills.forEach(skill => {
    // Creating a regexp to search for all instances of <skill>
    // \b means it is a standalone word (to prevent 'React' being in 'Reactive')
    // 'g' means it will search globally (not just the first it finds)
    // 'i' means it will be case insensitive
    // Add word boundaries to make sure it is not a substring of a word
    const rx = RegExp("\\b" + escapeRegExp(skill) + "\\b", 'gi')
    const count = (job.match(rx) || []).length
    if (count) skills_in_job[skill] = count
  })

However, the i flag is giving me some issues:

Ideally, my regex should be case-sensitive only for the first letter and ignore case for the rest. I am not sure how to do this programmatically.

Upvotes: 2

Views: 55

Answers (1)

Wiktor Stribiżew

Reputation: 627409

JavaScript regex does not support inline modifiers ((?i)), and modifier groups ((?i:...)) are only available in recent engines that implement ES2025, so they cannot be relied on everywhere.

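You may either follow the path suggested by Barmar: escape the skill, turn each letter that does not start a word into a character class matching both cases, and then build a case-sensitive regex. The escaping has to come first, otherwise escapeRegExp would also escape the brackets of the generated character classes:

const pattern = escapeRegExp(skill).replace(/\B[a-z]/gi, (x) => `[${x.toLowerCase()}${x.toUpperCase()}]`);
const rx = RegExp("\\b" + pattern + "\\b", 'g');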
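
As a rough, self-contained sketch of this character-class approach: `buildSkillRegex` and the sample `job` text below are hypothetical names and data made up for illustration, and `escapeRegExp` is the helper from the question. The sketch escapes the skill before building the classes, and it only handles ASCII letters.

```javascript
// Escape regex metacharacters (same helper as in the question).
const escapeRegExp = (s) => s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical helper: build a case-sensitive regex in which every ASCII
// letter that does not start a word may appear in either case.
const buildSkillRegex = (skill) => {
  const pattern = escapeRegExp(skill).replace(
    /\B[a-z]/gi,
    (ch) => `[${ch.toLowerCase()}${ch.toUpperCase()}]`
  );
  return new RegExp("\\b" + pattern + "\\b", "g");
};

// Hypothetical sample text.
const job = "We use React and REACT daily, and we react quickly to reactive bugs.";
const count = (job.match(buildSkillRegex("React")) || []).length;
console.log(count); // 2 ("React" and "REACT"; "react" and "reactive" are skipped)
```

Because `\B` does not match at the start of a word, the first letter of every word in a multi-word skill stays case-sensitive, so "Nuclear Physics" would match "NUCLEAR PHYSICS" but not "nuclear physics".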

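Or, you may simply keep the case-insensitive regex and filter out the matches whose first letter differs in case from the first letter of the skill:

const rx = RegExp("\\b" + escapeRegExp(skill) + "\\b", 'gi');
// Keep only the matches whose first character is identical to the skill's first character
const matches = (job.match(rx) || []).filter(x => x.charAt(0) === skill.charAt(0));
const count = matches.length;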
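
For completeness, here is a minimal sketch of how the filtering variant might slot into the loop from the question. `countSkill`, `skillsInJob`, and the sample `skills` and `job` values are hypothetical names and data chosen for illustration; only `escapeRegExp` comes from the question.

```javascript
// Escape regex metacharacters (same helper as in the question).
const escapeRegExp = (s) => s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical helper: count case-insensitive whole-word matches of `skill`
// in `text`, keeping only those whose first character equals the skill's.
const countSkill = (text, skill) => {
  const rx = new RegExp("\\b" + escapeRegExp(skill) + "\\b", "gi");
  return (text.match(rx) || [])
    .filter((m) => m.charAt(0) === skill.charAt(0))
    .length;
};

// Hypothetical sample data.
const skills = ["Go", "Python", "Heavy Machinery"];
const job = "Go developers wanted. You will go far if you know PYTHON, Python, and heavy machinery.";

const skillsInJob = {};
for (const skill of skills) {
  const count = countSkill(job, skill);
  if (count) skillsInJob[skill] = count;
}
console.log(skillsInJob); // { Go: 1, Python: 2 }
```

Note that this variant checks only the very first character of the skill, so "Heavy machinery" would still count for "Heavy Machinery", whereas the character-class variant keeps the first letter of each word case-sensitive.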

Upvotes: 1
