Reputation: 39
I'm trying to use pattern entities to match some data, but the regex doesn't behave the way it does in other engines, such as Python's, or on sites like regexr.com. Here is an example:
Pattern: ([\w]{8}-[\w]{4}-[\w]{4}-[\w]{4}-[\w]{12}-[\w]{3})
String style to match: 83123e42-d9ad-a26a-b13f-b0ec91c7fedf-ABC
However, when I test this, I get:
@id:83123e42
@id:d9ad
@id:a26a
@id:b13f
@id:b0ec91c7fedf
@id:ABC
I've tried grouping the whole string, using string delimiters, escaping the hyphens, and using .{4}- instead of \w, but none of it gave a solid result. I usually get the exact same output, where the string is split into separate matches rather than one full match.
Is this a regex issue? I tried not grouping the whole string, but seem to keep running into the exact same issue, where it won't even find the last 3 letters anyway.
If Watson Assistant uses a different regex engine, is there documentation for it that I just haven't been able to find? The docs seem to assume that any normal regex will work, but skipping the hyphens is strange behavior.
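For comparison, here is a minimal sketch of what I mean by "other engines": the same pattern checked against the sample string with Python's standard re module. The names PATTERN and SAMPLE are just labels for this example, and it says nothing about how Watson Assistant evaluates the pattern internally.

```python
import re

# Pattern and sample string copied from the question above.
PATTERN = r"([\w]{8}-[\w]{4}-[\w]{4}-[\w]{4}-[\w]{12}-[\w]{3})"
SAMPLE = "83123e42-d9ad-a26a-b13f-b0ec91c7fedf-ABC"

# fullmatch succeeds only if the entire string fits the pattern.
match = re.fullmatch(PATTERN, SAMPLE)

# Group 1 is the single outer capture group; it should span the whole ID.
whole = match.group(1) if match else None
print(whole)
```

In Python this yields one match covering the full ID, hyphens included, which is the behavior I expected from the entity.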
Upvotes: 2
Views: 2211
Reputation: 39
Ended up finding a more direct answer from an awesome helper in the Slack channel:
Turns out that something in the Watson Assistant regex handling doesn't recognize hyphens.
He worked with me and showed me a bit of SpEL that I now use to assign the match to a context variable:
"<? input.text.extract('(\\w{8}\\-\\w{4}\\-\\w{4}\\-\\w{4}\\-\\w{12}\\-\\w{3}[^\\w]+)', 0) ?>"
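As a rough way to see what that pattern matches, here is a Python sketch of the same regex. It assumes Python's re treats this particular pattern the same way the Java-based engine does. The extract helper is a hypothetical stand-in for input.text.extract, not the Watson Assistant implementation, and returning None on no match is my own choice for the sketch.

```python
import re

# Same pattern as in the SpEL expression; the doubled backslashes there are
# string-level escaping, so they are reduced to single ones here.
ID_PATTERN = re.compile(r"(\w{8}\-\w{4}\-\w{4}\-\w{4}\-\w{12}\-\w{3}[^\w]+)")


def extract(text, group_index=0):
    """Hypothetical stand-in for input.text.extract(pattern, group_index)."""
    m = ID_PATTERN.search(text)
    return m.group(group_index) if m else None


# The ID is followed by a non-word character, so the pattern can match.
with_trailing = extract("my id is 83123e42-d9ad-a26a-b13f-b0ec91c7fedf-ABC.")

# The ID is the last thing in the input, so the trailing [^\w]+ has nothing to match.
at_end = extract("my id is 83123e42-d9ad-a26a-b13f-b0ec91c7fedf-ABC")

print(repr(with_trailing), repr(at_end))
```

Two things to watch for with this pattern, at least in the Python version: group 0 includes the trailing non-word character(s), and an ID at the very end of the input does not match because [^\w]+ requires at least one character after it.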
Upvotes: 1
Reputation: 17176
Citing the Watson Assistant docs for defining entities, here are the relevant parts:
The regular expression engine is loosely based on the Java regular expression engine. The Watson Assistant service will produce an error if you try to upload an unsupported pattern, either via the API or from within the Watson Assistant service Tooling UI.
That section has some information on limitations and what to consider when writing regular expressions. The most significant limitations cited are:
Entity patterns may not contain:
- Positive repetitions (for example x*+)
- Backreferences (for example \g1)
- Conditional branches (for example (?(cond)true))
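To check a pattern against those three restrictions before uploading it, something like the following Python sketch could help. It is only a heuristic built from the examples quoted above: the UNSUPPORTED table and the find_unsupported helper are names made up for this answer, the checks can produce false positives (for example on an escaped plus sign followed by a quantifier), and they do not reproduce the validation the service itself performs.

```python
import re

# Heuristic detectors for the three constructs quoted from the docs.
# These are illustrative assumptions, not the service's own validation rules.
UNSUPPORTED = {
    "positive repetition": re.compile(r"(?:[*+?]|\})\+"),  # e.g. x*+
    "backreference": re.compile(r"\\(?:g\d|[1-9]|k<)"),    # e.g. \g1
    "conditional branch": re.compile(r"\(\?\("),           # e.g. (?(cond)true)
}


def find_unsupported(pattern):
    """Return the names of the listed constructs that appear in the pattern text."""
    return [name for name, rx in UNSUPPORTED.items() if rx.search(pattern)]


# The pattern from the question uses none of the three constructs.
question_pattern = r"([\w]{8}-[\w]{4}-[\w]{4}-[\w]{4}-[\w]{12}-[\w]{3})"
print(find_unsupported(question_pattern))
print(find_unsupported(r"x*+"))
```

By this check the pattern from the question does not use any of the listed constructs, so these restrictions alone would not explain the hyphen behavior.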
Upvotes: 0