Reputation: 2511
This is more of a regex question than a Clojure one, but I am testing it in Clojure.
(re-seq #"\w+" "This is a test. Only a test!")
produces:
("This" "is" "a" "test" "Only" "a" "test")
I want to have this:
("This" " " "is" " " "a" " " "test" ". " "Only" " " "a" " " "test" "!")
That is, I get all the words, but everything else between the words is included too.
I don't mind whether the period and space come back separately ("." " ") or together (". ").
Is this simple to do with a regex?
Upvotes: 2
Views: 1018
Reputation: 36777
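You could probably use \b, which matches word boundaries, together with clojure.string/split. The one catch is that \b also matches at the beginning of the string, so on Java 7 and earlier the result starts with an empty string (Java 8 and later no longer emit it). Dropping empty strings works on either:
(remove empty? (clojure.string/split "This is a test. Only a test!" #"\b"))
Unlike re-seq, this won't be lazy: split builds the whole vector up front.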
Upvotes: 0
Reputation: 129587
Try using the following regex:
\w+|\W+
> (re-seq #"\w+|\W+" "This is a test. Only a test!")
("This" " " "is" " " "a" " " "test" ". " "Only" " " "a" " " "test" "!")
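As a rough sketch of why the alternation works: every character is either a word character (\w) or not (\W), so the two branches between them consume the whole string, and joining the tokens should give back the input. The helper name `tokenize` and the var `sample` below are made up for illustration and are not part of Clojure.

```clojure
(require '[clojure.string :as str])

;; `tokenize` is a hypothetical helper name, not a Clojure built-in.
(defn tokenize
  "Returns a lazy seq of alternating word and non-word runs in s."
  [s]
  (re-seq #"\w+|\W+" s))

(def sample "This is a test. Only a test!")

(def tokens (tokenize sample))

;; Each character lands in exactly one token, so joining the tokens
;; reproduces the original string.
(println (= sample (str/join tokens)))
;; prints true

;; The plain word list from the question is still easy to recover.
(println (filter #(re-matches #"\w+" %) tokens))
;; prints (This is a test Only a test)
```

Note that \w in Java regexes is ASCII-only by default, so accented letters would end up in the non-word tokens; if that matters, prefixing the pattern with (?U) should switch \w and \W to their Unicode definitions.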
Upvotes: 3