Reputation: 710
I have a string like this:
const string = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
I want to split the string by the following keywords:
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];
and get a result like this:
const result = [
{isKeyword: true, text: 'John Smith'},
{isKeyword: false, text: 'I want to buy '},
{isKeyword: true, text: '100'},
{isKeyword: true, text:'apples'},
{isKeyword: false, text:'\r\nI want to buy'},
{isKeyword: true, text:'200'},
{isKeyword: true, text:'oranges'},
{isKeyword: false, text:'\r\n, and add'},
{isKeyword: true, text:'300'},
{isKeyword: true, text:'apples'}];
Keywords may be lowercase or uppercase, but I want each piece in the array to keep exactly the same text as in the original string.
I also want the array to keep the same order as the string, with each piece marked as a keyword or not.
How can I do this?
Upvotes: 0
Views: 120
Reputation: 92440
I would start by finding the indexes of all your keywords. From these you know where every keyword in the string starts and stops. You can then sort them by the index where each keyword starts.
Then it's just a matter of taking the substrings up to the start of each keyword (these will be the isKeyword: false substrings), then adding the keyword substring. Repeat until you are done.
const string = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];
// find all [start, end] index pairs of one keyword in str (case-insensitive)
function getInd(kw, str) {
  let regex = new RegExp(kw, 'gi'), result, pos = []
  while ((result = regex.exec(str)) != null)
    pos.push([result.index, result.index + kw.length]);
  return pos
}
// find all indexes of all keywords, sorted by start position
let positions = keywords.reduce((a, word) => a.concat(getInd(word, string)), [])
positions.sort((a, b) => a[0] - b[0])
// go through the string and make the array
let start = 0, res = []
for (let next of positions) {
  // skip gaps of at most one character (e.g. a single space between two keywords)
  if (start + 1 < next[0])
    res.push({isKeyword: false, text: string.slice(start, next[0]).trim()})
  res.push({isKeyword: true, text: string.slice(next[0], next[1])})
  start = next[1]
}
// get any remaining text
if (start < string.length) res.push({isKeyword: false, text: string.slice(start, string.length).trim()})
console.log(res)
I'm trimming whitespace as I go, but you may want to do something different.
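If the non-keyword pieces should keep their whitespace exactly, as the question asks, one option is to match all keywords with a single case-insensitive regular expression and slice the text between matches. This is only a sketch: `escapeRegExp` and `segment` are hypothetical helper names, and sorting longer keywords first is an assumption about how overlapping keywords should be resolved.

```javascript
// Hypothetical helper: escape regex metacharacters so keywords match literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: split `text` into keyword / non-keyword pieces, no trimming.
function segment(text, keywords) {
  // Ignore empty keywords; try longer keywords first (assumed tie-break rule).
  const sorted = keywords.filter(k => k.length > 0).sort((a, b) => b.length - a.length);
  if (sorted.length === 0) return text ? [{ isKeyword: false, text }] : [];

  const regex = new RegExp(sorted.map(escapeRegExp).join('|'), 'gi');
  const pieces = [];
  let start = 0;
  for (const match of text.matchAll(regex)) {
    // Text between the previous keyword and this one, kept verbatim.
    if (match.index > start) {
      pieces.push({ isKeyword: false, text: text.slice(start, match.index) });
    }
    // match[0] is the keyword as it appears in the text (original casing).
    pieces.push({ isKeyword: true, text: match[0] });
    start = match.index + match[0].length;
  }
  // Any text after the last keyword.
  if (start < text.length) {
    pieces.push({ isKeyword: false, text: text.slice(start) });
  }
  return pieces;
}

const input = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
const words = ['John smith', '100', 'apples', '200', 'oranges', '300'];
const pieces = segment(input, words);
console.log(pieces);
```

Because nothing is trimmed or dropped, joining the `text` fields of `pieces` gives back the original string, including the single spaces between adjacent keywords.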
Here's a much more succinct way to do this, if you are willing to pick a pair of delimiters that can't appear in your text. The example below uses {}.
Here we simply wrap the keywords with the delimiter and then split them out. Grabbing the keyword with the delimiter makes it easy to tell which parts of the split are your keywords:
const string = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];
let res = keywords.reduce((str, k) => str.replace(new RegExp(`(${k})`, 'ig'), '{$1}'), string)
  .split(/({.*?})/).filter(i => i.trim())
  .map(s => s.startsWith('{')
    ? {isKeyword: true, text: s.slice(1, s.length - 1)}
    : {isKeyword: false, text: s.trim()})
console.log(res)
Upvotes: 2
Reputation: 3077
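Build a single case-insensitive regular expression with a capturing group around the keywords:
const rx = new RegExp('(' + keywords.join('|') + ')', 'i')
Because the pattern contains a capturing group, split keeps the matched keywords in the result:
str.split(rx)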
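A fuller sketch of this idea, turning the split output into the objects the question asks for. The names `text`, `keywordList`, `escapeKeyword`, `pattern`, and `parts` are illustrative, and escaping the keywords is an added precaution that the short answer above does not include.

```javascript
// Sample data copied from the question.
const text = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
const keywordList = ['John smith', '100', 'apples', '200', 'oranges', '300'];

// Escape regex metacharacters so each keyword is matched literally.
const escapeKeyword = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Exactly one capturing group, so split() keeps the matched keywords in its output.
const pattern = new RegExp('(' + keywordList.map(escapeKeyword).join('|') + ')', 'i');

const parts = text
  .split(pattern)
  // With a single capturing group the output alternates: plain text at even
  // indexes, captured keywords at odd indexes.
  .map((part, i) => ({ isKeyword: i % 2 === 1, text: part }))
  // Drop the empty strings split() emits when a keyword sits at the very
  // start or end of the text.
  .filter((part) => part.text !== '');

console.log(parts);
```

The captured keywords keep the casing found in the text (`'John Smith'`, not `'John smith'`), and the pieces stay in their original order. This relies on every keyword being a non-empty string.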
Upvotes: 0