Reputation: 3286
I've got the following string:
|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|
I am trying to write a regular expression that matches only terms that include the word Africa or any derivative of it (meaning yes to all terms above except for |Mafricano| and |Go Mafricano Go|). Each term is enclosed between two | characters.
Right now I've come up with /\|[^\|]*africa[^\|]*\|/gi, which says:
\| Match |
[^\|]* Match zero to unlimited instances of any character except |
africa Match africa literally
[^\|]* Match zero to unlimited instances of any character except |
\| Match |
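To make the problem concrete, here is a minimal sketch that runs this pattern against the sample string. The variable names (input, current, matched) are only illustrative. Because africa may appear anywhere between the two pipes, every term matches, including the two that should be rejected.

```javascript
// Sketch: run the pattern from the question against the sample string.
const input = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// The pattern as posted: pipe, anything but a pipe, "africa", anything but a pipe, pipe.
const current = /\|[^\|]*africa[^\|]*\|/gi;

const matched = input.match(current);
console.log(matched.length);                       // 7, i.e. every term matches
console.log(matched.includes("|Mafricano|"));       // true (unwanted)
console.log(matched.includes("|Go Mafricano Go|")); // true (unwanted)
```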
I've tried inserting ((?:\s)|(?!\w)) to make it /\|[^\|]*((?:\s)|(?!\w))africa[^\|]*\|/gi. Although it succeeds in excluding |Mafricano| and |Go Mafricano Go|, it also excludes all other entries except for |West Africa| and |Go Africa Go|. So that is a start, but it needs to include the single word Africa and its derived forms too.
Can anybody help me?
Upvotes: 2
Views: 72
Reputation: 59232
You can use this regex. The word boundary \b requires Africa to start a word, so Mafricano is rejected:
[^|]*\bAfrica[a-z]*\b[^|]*
var str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";
var arr = str.match(/[^|]*\bAfrica[a-z]*\b[^|]*/g);
console.log(arr); // ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]
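Note that the pattern above is case-sensitive, while the question used the gi flags. The following sketch, under the assumption that lowercase terms should match too, adds the i flag and checks that the \b word boundary alone still excludes Mafricano. The second input string (extra) is a made-up example, not part of the original data.

```javascript
// Sketch: the same word-boundary idea with the i flag added.
const input = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";
const insensitive = /[^|]*\bafrica[a-z]*\b[^|]*/gi;

const terms = input.match(insensitive);
console.log(terms);
// ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]

// Hypothetical input with a lowercase term, to show what the i flag changes.
const extra = "|south africa||Mafricano|";
console.log(extra.match(/[^|]*\bAfrica[a-z]*\b[^|]*/g)); // null (case-sensitive version finds nothing)
console.log(extra.match(insensitive));                   // ["south africa"]
```

There is no word boundary between M and a, so \bafrica cannot match inside Mafricano regardless of case.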
Upvotes: 4
Reputation: 174696
I think you want something like this:
\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|
> "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|".match(/\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|/gi);
[ '|Africa|',
'|Africans|',
'|African Society|',
'|Go Africa Go|',
'|West Africa|' ]
Don't forget to turn on the i modifier to do a case-insensitive match.
Explanation:
\| '|'
(?: group, but do not capture (0 or more times):
(?! look ahead to see if there is not:
Mafrica 'Mafrica'
| OR
\| '|'
) end of look-ahead
. any character except \n
)*? end of grouping
africa 'africa'
(?: group, but do not capture (0 or more times):
(?! look ahead to see if there is not:
Mafrica 'Mafrica'
| OR
\| '|'
) end of look-ahead
. any character except \n
)*? end of grouping
\| '|'
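One caveat worth checking: the look-ahead rejects only the literal text Mafrica, so another word that merely contains africa would still get through. The sketch below uses a made-up term, |Pafrica|, that is not in the original string. For comparison it also tries a variant that adds \b to the pattern from the question. That variant is an assumption on my part, not something from this answer.

```javascript
// Hypothetical sample: |Pafrica| is invented to probe the hard-coded look-ahead.
const sample = "|Mafricano||Pafrica||Africa|";

// The look-ahead pattern from this answer, with the g and i flags.
const lookahead = /\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|/gi;
const viaLookahead = sample.match(lookahead);
console.log(viaLookahead); // ["|Pafrica|", "|Africa|"]

// Assumed alternative: the question's pattern plus a word boundary before africa.
const boundary = /\|[^|]*\bafrica[^|]*\|/gi;
const viaBoundary = sample.match(boundary);
console.log(viaBoundary); // ["|Africa|"]
```

If only Mafricano-style terms ever need excluding, the look-ahead is enough. The word boundary is the more general check when any prefix glued onto africa should disqualify a term.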
Upvotes: 1