Reputation: 5412
How do I turn greediness on or off in Clojure re-patterns?
(re-find #"(.+)-(.+)" "hello-world-you") => ["hello-world-you" "hello-world" "you"]
vs
(re-find #"(.+)-(.+)" "hello-world-you") => ["hello-world-you" "hello" "world-you"]
Upvotes: 7
Views: 3141
Reputation: 86260
The ? makes quantifiers, such as +, non-greedy. By default, they are greedy.
Greedy: (.+)
Non-greedy: (.+?)
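As a minimal sketch of the difference, using the input string from the question (the var names greedy-match and lazy-match are only illustrative), making just the first group non-greedy appears to give the split the question asks for:

```clojure
;; Greedy first group: (.+) takes as much as it can, so the engine
;; backtracks only as far as the LAST dash.
(def greedy-match (re-find #"(.+)-(.+)" "hello-world-you"))
;; => ["hello-world-you" "hello-world" "you"]

;; Non-greedy first group: (.+?) takes as little as it can, so it
;; stops at the FIRST dash; the second group stays greedy.
(def lazy-match (re-find #"(.+?)-(.+)" "hello-world-you"))
;; => ["hello-world-you" "hello" "world-you"]
```

Only the first group needs the ? here. Because re-find does not anchor the match at the end of the string, a non-greedy second group would match as little as possible too.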
By the way, this is just the direct, simple, and to-the-point answer. @fge's answer suggests the better way of doing it. Check it out for future expressions.
Upvotes: 22
Reputation: 121790
Don't use .+, use a complemented character class: this avoids having to care about greediness at all.
You should have used this as a regex: ([^-]+)-([^-]+).
Always make the effort to qualify your input as well as possible. Here you wanted to match everything which is not a dash, once or more, and capture it (([^-]+)), then a dash (-), then (again) everything which is not a dash, once or more, and capture it (([^-]+)).
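A short sketch of how this pattern behaves in Clojure (the var names below are illustrative, not from the original posts). Note that with the three-part input from the question, neither group can contain a dash, so the second group comes out as "world" rather than "world-you":

```clojure
;; Two runs of non-dash characters separated by a single dash.
(def dash-pair #"([^-]+)-([^-]+)")

;; re-find returns the first place the pattern fits.
(def first-pair (re-find dash-pair "hello-world-you"))
;; => ["hello-world" "hello" "world"]

;; re-matches requires the WHOLE string to fit the pattern.
(def exact-pair (re-matches dash-pair "hello-world"))
;; => ["hello-world" "hello" "world"]

;; A third dash-separated part makes a whole-string match impossible.
(def too-many-parts (re-matches dash-pair "hello-world-you"))
;; => nil
```

If the second part is meant to include any remaining dashes, as in the question's desired output, one option would be to keep the character class for the first group only, for example #"([^-]+)-(.+)".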
Relying on quantifiers' (non-)greediness is a fundamental error if you know you can describe your input without relying on it. Not only is it a source of error (as you yourself demonstrate), it also keeps the regex engine from performing at its maximum efficiency.
Upvotes: 13