claj

Reputation: 5412

Turn on/off greedy-ness in clojure re-patterns

How do I turn greediness on or off in Clojure regex patterns?

(re-find #"(.+)-(.+)" "hello-world-you") => ["hello-world-you" "hello-world" "you"]

vs. the result I want:

(re-find #"(.+)-(.+)" "hello-world-you") => ["hello-world-you" "hello" "world-you"]

Upvotes: 7

Views: 3141

Answers (2)

Brigand

Reputation: 86260

Appending ? to a quantifier such as + makes it non-greedy (lazy). By default, quantifiers are greedy.

  • Greedy: (.+)
  • Non-greedy: (.+?)
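Applied to the string from the question, the difference can be sketched roughly as follows. The var names (greedy, lazy-first, lazy-both) are only illustrative and do not come from the question. Note that only the first group needs to be lazy here; if the trailing group is made lazy as well, re-find is satisfied with a single character for it.

```clojure
;; Greedy first group: (.+) grabs as much as it can, then backtracks
;; just far enough to find a dash, so it ends at the LAST dash.
(def greedy (re-find #"(.+)-(.+)" "hello-world-you"))
;; => ["hello-world-you" "hello-world" "you"]

;; Non-greedy first group: (.+?) stops at the FIRST dash,
;; and the greedy second group takes the rest of the string.
(def lazy-first (re-find #"(.+?)-(.+)" "hello-world-you"))
;; => ["hello-world-you" "hello" "world-you"]

;; Both groups non-greedy: the second group matches as little as
;; possible, which for re-find is one character.
(def lazy-both (re-find #"(.+?)-(.+?)" "hello-world-you"))
;; => ["hello-w" "hello" "w"]
```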

By the way, this is just the direct, simple, and to-the-point answer. @fge's answer suggests the better way of doing it. Check it out for future expressions.

Upvotes: 22

fge

Reputation: 121790

Don't use .+, use a complemented character class: this avoids having to care about greediness at all.

You should have used this as a regex: ([^-]+)-([^-]+).

Always make the effort to qualify your input as well as possible. Here you wanted to match everything which is not a dash, once or more, and capture it (([^-]+)), then a dash (-), then (again) everything which is not a dash, once or more, and capture it (([^-]+)).
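As a rough sketch of how this pattern behaves on the string from the question (the var names input, by-class, whole and first-and-rest are illustrative only): because neither group can cross a dash, the second group stops at "world". If the second capture is meant to hold everything after the first dash, as in the output the question asks for, one option is to keep the character class for the first group only.

```clojure
(def input "hello-world-you")

;; Neither group can contain a dash, so the match ends before the second dash.
(def by-class (re-find #"([^-]+)-([^-]+)" input))
;; => ["hello-world" "hello" "world"]

;; re-matches requires the whole string to match; with two dashes it cannot.
(def whole (re-matches #"([^-]+)-([^-]+)" input))
;; => nil

;; Character class for the first group, "the rest" for the second.
(def first-and-rest (re-find #"([^-]+)-(.+)" input))
;; => ["hello-world-you" "hello" "world-you"]
```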

Relying on quantifiers' (non-)greediness is a fundamental error if you know you can describe your input without relying on it. Not only is it a source of errors (as you yourself demonstrate), it also keeps the regex engine from performing at its maximum efficiency.

Upvotes: 13
