Rob Buhler

Reputation: 2511

Clojure regex to match words and everything in between

This is more of a regex question than a Clojure one, but I am testing it in Clojure.

(re-seq #"\w+" "This is a test. Only a test!")

produces:

("This" "is" "a" "test" "Only" "a" "test")

I want to have this:

("This" " " "is" " " "a" " " "test" ". " "Only" " " "a" " " "test" "!")

That is, I get all the words, but everything between the words is included too. I don't care whether the period and space come out separately ("." " ") or together (". ").

Is this simple to do with a regex?

Upvotes: 2

Views: 1018

Answers (2)

soulcheck

Reputation: 36777

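You could probably split on \b, which matches word boundaries, using clojure.string/split. The one catch is that \b also matches at the beginning of the string, which on Java 7 and earlier yields a leading empty string, hence the rest below. As far as I know, Java 8 and later no longer produce that empty leading element, so there the rest would drop the first word and should be omitted: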

(rest (clojure.string/split "This is a test. Only a test!" #"\b"))

The result won't be lazy either, since split returns a fully realized vector.
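Whether the split produces a leading empty string appears to depend on the Java version underneath Clojure, so here is a minimal sketch that avoids relying on either behaviour by dropping empty strings instead of calling rest. The names text and boundary-tokens are made up for illustration.

```clojure
(require '[clojure.string :as str])

;; Illustrative input, taken from the question.
(def text "This is a test. Only a test!")

;; Split at every word boundary, then discard any empty strings.
;; This drops a leading "" if the JVM produces one and is a no-op otherwise.
;; Note that str/split itself is still eager; only remove is lazy.
(def boundary-tokens
  (remove empty? (str/split text #"\b")))
```

This keeps the idea of splitting on \b while not depending on how zero-width matches at the start of the input are handled.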

Upvotes: 0

arshajii

Reputation: 129587

Try using the following regex:

\w+|\W+

> (re-seq #"\w+|\W+" "This is a test. Only a test!")
("This" " " "is" " " "a" " " "test" ". " "Only" " " "a" " " "test" "!")
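The reason this works is that \w+ matches a run of word characters and \W+ a run of everything else, so the alternation consumes every character of the input. A small sketch of that, plus a variant that keeps whitespace and punctuation as separate tokens (the question allows either), follows. The names text, tokens, round-trip and fine-tokens are illustrative, and the second regex is my own assumption rather than part of the answer above.

```clojure
;; Illustrative input, taken from the question.
(def text "This is a test. Only a test!")

;; Words and the runs of non-word characters between them.
;; re-seq returns a lazy sequence of successive matches.
(def tokens (re-seq #"\w+|\W+" text))

;; Because every character belongs to exactly one match,
;; concatenating the tokens should give back the original string.
(def round-trip (apply str tokens))

;; Assumed variant: whitespace and punctuation as separate tokens,
;; so ". " becomes "." followed by " ".
(def fine-tokens (re-seq #"\w+|\s+|[^\w\s]+" text))
```

Checking (= round-trip text) is a quick way to confirm that nothing between the words was lost.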

Upvotes: 3
