rstarter

Reputation: 287

Using regex to find a substring

I'm using regex with Groovy (Grails) to find a substring that consists only of uppercase letters, underscores and digits.

The regex

"THIS_WORD" ==~ /([A-Z_0-9]*)/

returns true, but the following statement

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll{([A-Z_0-9]*)/}
println str

returns [W, W, T, H, I, S, _, W, O, R, D]

I need only the word THIS_WORD, not the letter W, which appears twice. What am I missing here?

Upvotes: 0

Views: 910

Answers (3)

Dmitry E.

Reputation: 103

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll{([A-Z_0-9]*)/}

This doesn't compile. Perhaps you meant this:

"Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/[A-Z_0-9]*/)

which gives

[W, , , , , , , , , , , , , , , , W, , , , , , , , , , , , THIS_WORD, , , , , , , , , , , , , , ]
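
The empty entries are there because * also accepts zero characters, so the pattern succeeds with an empty match at every position where no matching character starts. A minimal sketch of that, using a short made-up input ("a B" is only an illustration, not from the question):

```groovy
// Hypothetical short input, chosen so the result is easy to read.
def sample = "a B"

// `*` allows zero characters, so findAll also reports an empty match at
// each position where no upper-case letter starts, including the end.
def withStar = sample.findAll(/[A-Z]*/)
assert withStar == ['', '', 'B', '']

// `+` requires at least one character, so the empty matches disappear.
def withPlus = sample.findAll(/[A-Z]+/)
assert withPlus == ['B']
```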

If you are looking for all upper-case words, a regex like this will work better:

"Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/\b[A-Z_0-9]+\b/)

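As a rough check of what the \b anchors contribute, here is a sketch that runs the question's string with and without them (the variable names are just for illustration). Switching to + alone still picks up the capital W at the start of "Wlkjjf" and "Wk"; \b rejects those because the W is immediately followed by another word character.

```groovy
def text = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd"

// Without boundaries, the leading capitals of "Wlkjjf" and "Wk" still match.
def unanchored = text.findAll(/[A-Z_0-9]+/)
assert unanchored == ['W', 'W', 'THIS_WORD']

// \b needs a word/non-word transition on both sides of the match,
// so only a run that forms a whole word is returned.
def wholeWords = text.findAll(/\b[A-Z_0-9]+\b/)
assert wholeWords == ['THIS_WORD']
```
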
Upvotes: 1

user471679

Reputation:

* means 0 or more, whereas + means 1 or more. To match 2 or more, you need to use the {MIN,MAX} syntax after the []:

([A-Z0-9_]{2,})

After learning a bit about Groovy and testing in the Groovy console at http://groovyconsole.appspot.com/, I found that this works:

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/([A-Z_0-9]{2,})/)
println str

Upvotes: 1

Casimir et Hippolyte

Reputation: 89639

Perhaps you can use {2,} instead of * to get all matches with more than one character:

def str = "Wlkjjf als Wk;lfs fk THIS_WORD dsjf kjd".findAll(/[A-Z_0-9]{2,}/)
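
One thing to keep in mind: a length limit and a word boundary are not the same filter. The sketch below uses a made-up input (not from the question) with a one-letter word and a mixed-case word to show where {2,} and \b disagree, and what combining them would give. Which one is right depends on whether single-letter words should count.

```groovy
// Hypothetical input with a single-letter word and a mixed-case word.
def sample = "ID A FOO_1 XYz"

// Length only: skips "A", but also returns the "XY" prefix of "XYz".
def lengthOnly = sample.findAll(/[A-Z_0-9]{2,}/)
assert lengthOnly == ['ID', 'FOO_1', 'XY']

// Boundaries only: keeps "A", rejects the partial match inside "XYz".
def boundariesOnly = sample.findAll(/\b[A-Z_0-9]+\b/)
assert boundariesOnly == ['ID', 'A', 'FOO_1']

// Both: whole words of at least two characters.
def both = sample.findAll(/\b[A-Z_0-9]{2,}\b/)
assert both == ['ID', 'FOO_1']
```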

Upvotes: 2
