Reputation: 145
I have a large list of job skills, like you might see on LinkedIn ("Nuclear Physics", "Python", "Heavy Machinery", etc.). I also have a large block of text: a job description. I am trying to iterate through the list and identify which skills are present in the block of text. Here is my current code:
// escape possible special characters in a string
// https://stackoverflow.com/questions/4371565/
const escapeRegExp = (s) => {
  return s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&')
}
let skills_in_job = {}
skills.forEach(skill => {
  // Creating a regexp to search for all instances of <skill>
  // \b means it is a standalone word (to prevent 'React' being in 'Reactive')
  // 'g' means it will search globally (not just the first it finds)
  // 'i' means it will be case insensitive
  // Add word boundaries to make sure it is not a substring of a word
  const rx = RegExp("\\b" + escapeRegExp(skill) + "\\b", 'gi')
  const count = (job.match(rx) || []).length
  if (count) skills_in_job[skill] = count
})
However, the i flag is giving me some issues:
Ideally, my regex should care only about the capitalization of the first letter. I am not sure how to do this programmatically.
Upvotes: 2
Views: 55
Reputation: 627409
JavaScript regex does not support inline modifiers ((?i)), nor modifier groups ((?i:...)).
You may either follow the path suggested by Barmar: turn each non-initial letter into a character class and then build a case-sensitive regex. Escape the skill first, so that the brackets added by the replacement are not escaped afterwards:
const pattern = escapeRegExp(skill).replace(/\B[a-z]/gi, (x) => `[${x.toLowerCase()}${x.toUpperCase()}]`);
const rx = RegExp("\\b" + pattern + "\\b", 'g');
Or, you may simply filter out the matches whose first letter differs in case from the first letter of the skill:
const rx = RegExp("\\b" + escapeRegExp(skill) + "\\b", 'gi')
const matches = (job.match(rx) || []).filter(x => x.charAt(0) === skill.charAt(0));
const count = matches.length;
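To compare the two approaches side by side, here is a minimal self-contained sketch. The function names countWithClasses and countWithFilter and the sample job text are made up for illustration; escapeRegExp is the helper from the question. It assumes skills consist of plain ASCII letters plus the occasional special character.

```javascript
// Escape regex metacharacters (same helper as in the question).
const escapeRegExp = (s) => s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&');

// Approach 1 (hypothetical name): case-sensitive regex in which every
// non-initial letter of each word becomes a two-letter character class,
// e.g. "React" -> R[eE][aA][cC][tT]. The skill is escaped first so the
// generated brackets stay unescaped.
const countWithClasses = (skill, job) => {
  const pattern = escapeRegExp(skill).replace(
    /\B[a-z]/gi,
    (ch) => `[${ch.toLowerCase()}${ch.toUpperCase()}]`
  );
  const rx = new RegExp(`\\b${pattern}\\b`, 'g');
  return (job.match(rx) || []).length;
};

// Approach 2 (hypothetical name): case-insensitive regex, then keep only
// the matches whose first character equals the skill's first character.
const countWithFilter = (skill, job) => {
  const rx = new RegExp(`\\b${escapeRegExp(skill)}\\b`, 'gi');
  return (job.match(rx) || []).filter(
    (m) => m.charAt(0) === skill.charAt(0)
  ).length;
};

// Made-up sample text.
const job = 'We use React and REACT daily; react quickly, stay Reactive.';
console.log(countWithClasses('React', job)); // 2 ("React", "REACT")
console.log(countWithFilter('React', job));  // 2 ("React", "REACT")
```

The two are not fully equivalent for multi-word skills: the character-class version keeps the first letter of every word case-sensitive, so "Heavy Machinery" will not match "Heavy machinery", while the filter version only checks the very first character and will count it.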
Upvotes: 1