Reputation: 547
Suppose I have the following string
Bimkingo Clasico Prom 135g LON 49835 Gansito ME 5p 250g MTA MLA 49860 Wonder
I want to extract only tokens that don't contain numbers or only upper case letters.
The output should be
Bimkingo Clasico Prom Gansito Wonder
This doesn't seem to work: \b(([a-zA-Z]+)+)\b
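For reference, a quick check in JavaScript suggests the pattern does skip the tokens containing digits but still keeps the all-uppercase ones, since [a-zA-Z] accepts either case. This is only a sketch; the variable names are illustrative.

```javascript
// Sketch: list what the attempted pattern matches on the sample string.
const sample = 'Bimkingo Clasico Prom 135g LON 49835 Gansito ME 5p 250g MTA MLA 49860 Wonder';
const attempted = /\b(([a-zA-Z]+)+)\b/g;

// With the g flag, match() returns every full match (not the capture groups).
const attemptedMatches = sample.match(attempted);
console.log(attemptedMatches.join(' '));
// prints: Bimkingo Clasico Prom LON Gansito ME MTA MLA Wonder
```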
Upvotes: 2
Views: 67
Reputation: 2672
The following regex, (\b[a-zA-Z]+[a-z]+\b), should work for the example output in OP's post and for some other edge cases:
const regexp = /(\b[a-zA-Z]+[a-z]+\b)/g;

// OP's example: the matches, joined by spaces, should equal the expected output.
let input = 'Bimkingo Clasico Prom 135g LON 49835 Gansito ME 5p 250g MTA MLA 49860 Wonder';
const output = 'Bimkingo Clasico Prom Gansito Wonder';
let matches = input.match(regexp);
console.log('Matches test output provided by OP "' + output + '":');
console.log(output === matches.join(' '));
console.log(''); // blank line

// Cases not contained in OP's example output string...
input = 'mArceLino marcelino';
console.log('Also matches all lowercase and uppercase mid word "' + input + '":');
matches = input.match(regexp);
console.log(matches.length === 2);
console.log(''); // blank line

input = 'MARCEL1N0 MARCELINO11';
console.log('Excludes all uppercase with number mix "' + input + '":');
matches = input.match(regexp); // match() returns null when nothing matches
console.log(matches === null);
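The accepted answer by "krzyk" matches the output OP posted, but fails to "extract only tokens that don't contain numbers or only upper case letters" in edge cases not represented by OP's example output. For example, its pattern, (\b[A-Z]?[a-z]+\b)+, only allows an uppercase letter at the start of a word, so it skips a mixed-case token such as mArceLino, which the snippet above shows this regex does match.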
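To make the difference concrete, here is a small sketch that runs both patterns on the same mixed-case input. The variable names are illustrative, and the patterns are copied from the two answers.

```javascript
// Compare the two answers' patterns on mixed-case input.
const acceptedPattern = /(\b[A-Z]?[a-z]+\b)+/g;   // pattern from the accepted answer
const proposedPattern = /(\b[a-zA-Z]+[a-z]+\b)/g; // pattern from this answer
const mixedCase = 'mArceLino marcelino';

// match() returns null when nothing matches, so fall back to an empty array.
const acceptedResult = mixedCase.match(acceptedPattern) || [];
const proposedResult = mixedCase.match(proposedPattern) || [];

console.log(acceptedResult); // [ 'marcelino' ]
console.log(proposedResult); // [ 'mArceLino', 'marcelino' ]
```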
Regex explanation:
( --> // start capture
\b --> // match start of word
[a-zA-Z]+ --> // match one or more letters, lowercase or uppercase
[a-z]+ --> // match one or more lowercase letters, so the word must end in a lowercase letter and be at least two letters long
\b --> // match end of word
) --> // end capture
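The breakdown above implies two limits: a one-letter word and a word that ends in an uppercase letter are both skipped, even though neither is all uppercase. If those tokens should be kept, one possible variant (an assumption on my part, not taken from any of the answers here) is a negative lookahead that rejects only words that are uppercase from start to finish. A rough sketch with illustrative names:

```javascript
// Pattern from the answer above: word of letters that ends in a lowercase letter.
const endsLowercase = /\b[a-zA-Z]+[a-z]+\b/g;

// Assumed alternative (not from the original answers): letters only, but the
// negative lookahead (?![A-Z]+\b) rejects a word made entirely of uppercase letters.
const notAllUppercase = /\b(?![A-Z]+\b)[a-zA-Z]+\b/g;

const edgeCases = 'a marcelinO Wonder LON 135g';
const endsLowercaseResult = edgeCases.match(endsLowercase) || [];
const notAllUppercaseResult = edgeCases.match(notAllUppercase) || [];

console.log(endsLowercaseResult);   // [ 'Wonder' ]
console.log(notAllUppercaseResult); // [ 'a', 'marcelinO', 'Wonder' ]
```

Both patterns rely on \b, so they treat any run of letters, digits and underscores as one token; input with hyphens or other punctuation may need different handling.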
Upvotes: 1
Reputation: 27476
Use:
(\b[A-Z]?[a-z]+\b)+
Then retrieve all the matches with your regex library (there is no way to get them all in a single group) and join them with spaces.
Test case: https://regex101.com/r/hY2fM8/1
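As an illustration of the "retrieve and join" step, here is a minimal JavaScript sketch. The question does not say which language is in use, so JavaScript is an assumption here, and extractWords is a hypothetical helper name.

```javascript
// Hypothetical helper: collect every match of the pattern and join with spaces.
function extractWords(text, pattern) {
  // With the g flag, match() returns all full matches, or null if there are none.
  const found = text.match(pattern);
  return found === null ? '' : found.join(' ');
}

const wordPattern = /(\b[A-Z]?[a-z]+\b)+/g;
const line = 'Bimkingo Clasico Prom 135g LON 49835 Gansito ME 5p 250g MTA MLA 49860 Wonder';

console.log(extractWords(line, wordPattern));
// prints: Bimkingo Clasico Prom Gansito Wonder
```

Other regex libraries expose the same idea under different names (a "find all" call followed by a string join).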
Upvotes: 1