javadude

Reputation: 1813

Get groups with regex and OR

I have input lines like these:

AD ABCDEFG HIJKLMN
AB HIJKLMN
AC DJKEJKW SJKLAJL JSHELSJ

Rule: each line always starts with a two-character code (AB, AC, or AD), followed by any number of seven-character codes.

With this regex:

^(AB|AC|AD)|((\S{7})?)

in this Groovy code sample:

import java.util.regex.Pattern

def m = Pattern.compile(/^(AB|AC|AD)|((\S{7})?)/).matcher("AC DJKEJKW SJKLAJL JSHELSJ")
println m.getCount()

I always get 8 as the count, which means it also counts the spaces.
How do I get 4 groups (as expected), without the spaces?

Thanks from a not-yet-regex expert,
Sven

Upvotes: 2

Views: 5272

Answers (2)

stema

Reputation: 92986

This pattern will match your requirements:

^A[BCD](?:\s\S{7})+

See it here online on Regexr

Meaning: the line starts with an A, followed by either a B, a C, or a D. This is followed by at least one group consisting of a whitespace character and 7 non-whitespace characters.
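
The pattern above validates a whole line but does not capture the individual codes. One possible way to get them, sketched here under the assumption that the codes are separated by single whitespace characters, is to check the line first and then split it. The closure name codesOf is hypothetical and only used for illustration.

```groovy
// Pattern from the answer above: A, then B/C/D, then one or more
// groups of "one whitespace character + seven non-whitespace characters".
def linePattern = /^A[BCD](?:\s\S{7})+/

// Hypothetical helper: returns the codes of a valid line, or an empty list.
def codesOf = { String line ->
    // ==~ only succeeds if the entire line matches the pattern
    (line ==~ linePattern) ? line.split(/\s/).toList() : []
}

assert codesOf('AC DJKEJKW SJKLAJL JSHELSJ') == ['AC', 'DJKEJKW', 'SJKLAJL', 'JSHELSJ']
assert codesOf('AB HIJKLMN') == ['AB', 'HIJKLMN']
assert codesOf('AE HIJKLMN') == []   // unknown two-character code
assert codesOf('AB') == []           // the + requires at least one 7-character code
```

Because of the trailing +, a line with only the two-character code is rejected. If such lines should count as valid ("any number" including zero), replacing + with * would probably be the change to make.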

Upvotes: 2

tim_yates

Reputation: 171094

Using this code:

def input = [ 'AD ABCDEFG HIJKLMN', 'AB HIJKLMN', 'AC DJKEJKW SJKLAJL JSHELSJ' ]

def regexp = /^(AB|AC|AD)|((\S{7})+)/

def result = input.collect {
  def matcher = ( it =~ regexp )
  println "got $matcher.count for $it"
  matcher.collect { it[0] }
}

println result

I get the output:

got 3 for AD ABCDEFG HIJKLMN
got 2 for AB HIJKLMN
got 4 for AC DJKEJKW SJKLAJL JSHELSJ
[[AD, ABCDEFG, HIJKLMN], [AB, HIJKLMN], [AC, DJKEJKW, SJKLAJL, JSHELSJ]]

Is this more what you wanted?
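
As far as I can tell, the difference comes from the quantifier: with ? the second alternative can match the empty string, so the matcher also reports a zero-length match at each space and at the end of the line, which is where the count of 8 comes from. With + every match has to consume at least seven non-whitespace characters. A small sketch comparing the two on the line from the question:

```groovy
def line = 'AC DJKEJKW SJKLAJL JSHELSJ'

// Optional group: (\S{7})? may match nothing, so empty matches are
// found at the three spaces and once more at the end of the line.
def optional = (line =~ /^(AB|AC|AD)|((\S{7})?)/).collect { it[0] }

// Required group: empty matches are no longer possible.
def required = (line =~ /^(AB|AC|AD)|((\S{7})+)/).collect { it[0] }

assert optional.size() == 8
// Dropping the empty strings leaves the four expected codes
assert optional.findAll { it } == ['AC', 'DJKEJKW', 'SJKLAJL', 'JSHELSJ']
assert required == ['AC', 'DJKEJKW', 'SJKLAJL', 'JSHELSJ']
```

Note that (\S{7})+ would treat 14 consecutive non-whitespace characters as a single match, so this relies on the codes being separated by whitespace.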

Upvotes: 3
