Jared Smith

Reputation: 21973

Unicode regex does not work in Clojure

I have the following string I'd like to match:

"Ambrosia,Restore Health, , , "

It contains Unicode whitespace (don't ask me why). /,\s*,/u works just fine in regex101.

But #"(?u),\s*," does not work in Clojure:

(re-find #"(?u),\s*," "Ambrosia,Restore Health, , , ") ;nil, should be , ,

Why does this fail?

Upvotes: 3

Views: 258

Answers (1)

glts

Reputation: 22734

I believe that in Java regular expressions, which Clojure uses, \s by default matches six ASCII characters and those six only, namely [ \t\n\x0B\f\r]: see the documentation for java.util.regex.Pattern.
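A small REPL sketch of that behaviour. It assumes the odd whitespace is U+00A0 (no-break space), which the question does not actually confirm, and writes it as an explicit escape so it survives copy and paste. The name sample is just for illustration. Note that in Java (?u) is the UNICODE_CASE flag, which only changes case-insensitive matching, so it has no effect on \s here.

```clojure
;; Assumption: the unusual whitespace is U+00A0 (no-break space).
(def sample "Ambrosia,Restore Health,\u00A0,\u00A0,\u00A0")

;; Plain \s covers only [ \t\n\x0B\f\r], so the no-break space blocks the match.
(println (re-find #",\s*," sample))      ; prints nil

;; (?u) is UNICODE_CASE; it leaves \s untouched, so the result is the same.
(println (re-find #"(?u),\s*," sample))  ; prints nil

;; With ordinary ASCII spaces the very same pattern matches.
(println (re-find #",\s*," "Ambrosia,Restore Health, , , "))  ; prints , ,
```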

As you found out already, it may be worth trying some of the other whitespace character classes like \h or \v.

Also, the \p{...} construct can do actual Unicode property matching. White_Space seems the most promising; in Java it is written \p{IsWhite_Space}.
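Here is a sketch of alternatives that should match, again assuming the whitespace is U+00A0 (the question does not say which character it is). The var names are made up for the example. The capital (?U) flag is UNICODE_CHARACTER_CLASS, which makes \s follow the Unicode White_Space property; \h needs Java 8 or later.

```clojure
;; Assumption: the unusual whitespace is U+00A0 (no-break space).
(def sample "Ambrosia,Restore Health,\u00A0,\u00A0,\u00A0")

;; (?U) = UNICODE_CHARACTER_CLASS: \s now means Unicode White_Space.
(def by-flag (re-find #"(?U),\s*," sample))

;; Match the Unicode White_Space property directly.
(def by-property (re-find #",\p{IsWhite_Space}*," sample))

;; \h is horizontal whitespace, which includes U+00A0.
(def by-h (re-find #",\h*," sample))

;; All three find the first comma, no-break space, comma run.
(println (= ",\u00A0," by-flag by-property by-h))  ; prints true
```

If the input may also contain line separators or other exotic whitespace, the (?U) flag or \p{IsWhite_Space} is probably the safer choice, since \h only covers horizontal whitespace.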

Upvotes: 5
