Aphex

Reputation: 408

Formatting Regex to use for a Java application

I have a text document containing a few pieces of information that I want to extract with regular expressions. I wrote a regex that captures what I need; you can look at it here.

The regex looks like this:

\w+(?!\>)(?=\-)\W+\w+|\w+\s+\w+(?!\>)(?=\s+\d+\s+)|\w+(?!\>)(?=\s+\d+\s+)

I rewrote it for use in Java. As far as I know, each backslash has to be doubled inside a Java string literal, like so:

\\w+(?!\\>)(?=\\-)\\W+\\w+|\\w+\\s+\\w+(?!\\>)(?=\\s+\\d+\\s+)|\\w+(?!\\>)(?=\\s+\\d+\\s+)
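As a quick check of the doubling rule, here is a minimal sketch. The class name `EscapeDemo`, the helper `compiledSource`, and the short pattern are made up for illustration and are not part of my application. It only shows that a doubled backslash in Java source becomes a single backslash in the string the regex engine receives.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper: returns the pattern text the regex engine actually receives.
    static String compiledSource(String javaLiteral) {
        return Pattern.compile(javaLiteral).pattern();
    }

    public static void main(String[] args) {
        // In Java source, "\\w+\\s+\\d+" is the 9-character string \w+\s+\d+
        String regex = "\\w+\\s+\\d+";
        System.out.println(compiledSource(regex)); // single backslashes reach the engine
        System.out.println(regex.length());        // length of the string after escape processing
        System.out.println(Pattern.compile(regex).matcher("abc 123").matches());
    }
}
```

So the doubling itself should be fine as long as every backslash of the original regex is doubled exactly once.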

The problem is that the text it should match (according to several regex testing sites) is not matched when I use it in Java. Can anyone point out why?

EDIT: To clarify, my regex doesn't match anything in Java.

Upvotes: 0

Views: 61

Answers (1)

tomse

Reputation: 501

If you don't rely on all the lookaheads, try the following simplified pattern:

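import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture the run of non-digit characters between '>' and the following space.
Pattern p = Pattern.compile("\\>([^\\d]+) ");
Matcher m = p.matcher(">Sea-Cucumber 576151 1HLB");
if (m.find()) System.out.println(m.group(1)); // prints "Sea-Cucumber"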
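The question doesn't show the Java code that applies the pattern, so the following is only a guess at the cause. `Matcher.matches()` succeeds only if the whole input matches, whereas `Matcher.find()` looks for a match anywhere in the input, which is how online regex testers usually behave. The sketch below runs the original pattern from the question both ways on the sample line used above. The class name `FindVersusMatches` and the helper `findAll` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVersusMatches {
    // The pattern from the question, with backslashes doubled for a Java literal.
    static final Pattern ORIGINAL = Pattern.compile(
        "\\w+(?!\\>)(?=\\-)\\W+\\w+"
        + "|\\w+\\s+\\w+(?!\\>)(?=\\s+\\d+\\s+)"
        + "|\\w+(?!\\>)(?=\\s+\\d+\\s+)");

    // Hypothetical helper: collects every substring that find() locates in the input.
    static List<String> findAll(String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = ORIGINAL.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String line = ">Sea-Cucumber 576151 1HLB"; // assumed sample input
        // matches() must consume the entire line, including the leading '>'.
        System.out.println(ORIGINAL.matcher(line).matches()); // prints "false"
        // find() scans for matching substrings instead.
        System.out.println(findAll(line)); // prints "[Sea-Cucumber]"
    }
}
```

If your code calls `matches()` (or `String.matches`), switching to a `find()` loop may be all that is needed. If it already uses `find()`, the cause lies elsewhere and the input text itself would be worth checking.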

Upvotes: 2
