Scheinin

Reputation: 195

How to group the following strings by regular expression

This is the string I want to process. (It has at least one part prefixed with an underscore. The last part never has an underscore.)

'_A._B._C._D._F.f'

I expected

["A", "B", "C", "D", "F", "f"]

How can I achieve this with a regular expression? I tried, but I can't get the repeated part of the format to loop.

new RegExp('^[(_(.+)\\.)]+(.+)$')

Upvotes: 3

Views: 101

Answers (5)

The fourth bird

Reputation: 163237

In your regex, the anchor ^ asserts the start of the string. It is followed by a character class, which matches only one of the listed characters per repetition (it could, for example, also be written as [_(+\\.)]+), so the parentheses inside it do not form a group. The rest of the string is then captured in a capturing group, and $ asserts the end of the string.
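
To illustrate the point about the character class, here is a small sketch that runs the pattern from the question against the sample string. The variable names attempt and m are only illustrative. It shows that the class consumes just the leading underscore and that the single capturing group takes everything else.

```javascript
// The pattern from the question, written as a regex literal
// (equivalent to new RegExp('^[(_(.+)\\.)]+(.+)$')).
const attempt = /^[(_(.+)\.)]+(.+)$/;

// Inside [...] the parentheses, dot and plus are literal characters,
// so the class creates no group and nothing "loops" per part.
const m = '_A._B._C._D._F.f'.match(attempt);

console.log(m.length); // 2 (the full match plus one capturing group)
console.log(m[1]);     // A._B._C._D._F.f
```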

If you want to check the format of the string first, you might use a more exact pattern. When that pattern matches, you could do a case-insensitive match for single characters, as the format is already validated:

const regex = /^_[A-Z](?:\._[A-Z])+\.[a-z]$/;
const str = `_A._B._C._D._F.f`;

if (regex.test(str)) {
  console.log(str.match(/[a-z]/ig));
}

See the regex demo

That will match:

  • ^ Assert the start of the string
  • _[A-Z] Match an underscore and an uppercase character
  • (?:\._[A-Z])+ Non-capturing group, repeated 1+ times, matching ._ followed by an uppercase character
  • \.[a-z] Match a dot and a lowercase character
  • $ Assert the end of the string
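
The two steps can be wrapped in one function. This is only a sketch: the name splitParts and the choice to return null for an invalid string are assumptions, not part of the question. Note that the + in the pattern requires at least two underscore-prefixed parts; if a single one should be accepted, as the question seems to allow, replacing that + with * may be closer to the intent.

```javascript
// Hypothetical helper: validate the overall shape first, then extract the letters.
function splitParts(str) {
  const shape = /^_[A-Z](?:\._[A-Z])+\.[a-z]$/;
  if (!shape.test(str)) {
    return null; // assumption: signal an invalid format with null
  }
  // The format is known to be valid here, so every letter is one part.
  return str.match(/[a-z]/gi);
}

console.log(splitParts('_A._B._C._D._F.f')); // [ 'A', 'B', 'C', 'D', 'F', 'f' ]
console.log(splitParts('_A.f'));             // null (only one underscore-prefixed part)
```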

Upvotes: 1

mrzasa

Reputation: 23317

You can use split with [._]+ as the separator (any run of dots or underscores) and then filter to remove the initial empty string:

'_A._B._C._D._F.f'.split(/[._]+/).filter(function(s){ return s.length > 0})
// => [ "A", "B", "C", "D", "F", "f" ]

EDIT: Simplification suggested in comments:

'_A._B._C._D._F.f'.split(/[._]+/).filter(Boolean)
// => [ "A", "B", "C", "D", "F", "f" ]

Upvotes: 2

Vadim Hulevich

Reputation: 1833

The string method .match with the global flag can help you:

console.log('_A._B._C._D._F.f'.match(/[a-z]+/gi))

Upvotes: 0

A l w a y s S u n n y

Reputation: 38502

How about doing it without a regex?

const parts = '_A._B._C._D._F.f'.split('.');
// Strip the leading underscore from each part (the last part has none).
const alphabets = parts.map(c => c.replace('_', ''));
console.log(alphabets);

Upvotes: 3

Nina Scholz

Reputation: 386560

You could exclude dot and underscore from matching.

var string = '_A._B._C._D._F.f',
    result = string.match(/[^._]+/g);

console.log(result);
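
As a rough comparison, excluding the separators keeps each part whole even when it is more than a single letter, while a letters-only pattern does not. The input below is hypothetical (the question only shows single-letter parts), and the variable names are illustrative.

```javascript
// Hypothetical input with a digit and a two-letter part (not from the question).
const sample = '_A1._Bc.d';

// Everything that is not a dot or underscore stays together.
const bySeparators = sample.match(/[^._]+/g);
// Only runs of letters are kept, so the digit is dropped.
const byLetters = sample.match(/[a-z]+/gi);

console.log(bySeparators); // [ 'A1', 'Bc', 'd' ]
console.log(byLetters);    // [ 'A', 'Bc', 'd' ]
```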

Upvotes: 3
