Reputation:
I have a feeling the answer to my question is something to do with Clojure's lazy evaluation (which I am still fuzzy on...)
So I have a function:
(defn fix-str-old [string]
  (let [words (->> string
                   (clojure.string/split-lines)
                   (map #(clojure.string/replace % #"\W" "")))]
    (apply str (interleave words (repeat " ")))))
Basically, it takes a wacky sentence that has non-alphanumeric chars, carriage returns, line feeds, etc. in place of spaces and turns it into a regular sentence. The reason for this, if you're curious, is that whenever I try to copy text out of certain PDFs, it puts line feeds and other mysterious characters in between the words.
Here is an example:
(fix-str "A
block
of
SQL
statements
that
must
all
complete
successfully
before
returning
or
changing
anything ")
==> "A block of SQL statements that must all complete successfully before returning or changing anything"
It works fine in the REPL, but when it is evaluated inside a little Swing GUI you get this:
"AblockofSQLstatementsthatmustallcompletesuccessfullybeforereturningorchanginganything "
(note the space at the end of the string)
I was pretty sure that this was because of some gap in my understanding of how Clojure handles lazy seqs, so I whipped up this function that just does regex operations:
(defn fix-str [string]
  (-> string
      (clojure.string/replace #"[ \t\n\r]+" " ")
      (clojure.string/replace #"[^a-zA-Z0-9 ]" "")
      (clojure.string/trimr)))
which isn't lazy and works fine in both the REPL and in the GUI.
Note: I also tried putting doall calls in various places in the original function where I thought they might force evaluation of the lazy seqs, but I couldn't get that to work either.
So my question isn't really whether the first way is a good way to fix the strings, but rather why I am getting a different result in the REPL and in my GUI.
Upvotes: 2
Views: 110
Reputation: 13961
Laziness should not be your problem here, because (apply str ...) forces the output from map to be realized (and because there's no bindings here, which is usually your first clue that laziness is the culprit).
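To illustrate that point, here is a minimal sketch showing that (apply str ...) consumes the whole lazy seq. The names realized-count and words are made up for this example and are not part of your code:

```clojure
;; Hypothetical counter that records how many elements map has processed.
(def realized-count (atom 0))

;; map returns a lazy seq; nothing is computed at definition time.
(def words
  (map (fn [w] (swap! realized-count inc) w)
       ["a" "b" "c"]))

(def before @realized-count)   ; still 0, the seq has not been consumed

;; apply str has to walk every element to build the string.
(def joined (apply str words))

(def after @realized-count)    ; 3, every element was realized

(println before joined after)
```

So by the time fix-str-old returns, the seq has already been fully realized, and adding doall would not change the result.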
Looks to me like there's something funky going on with the line-endings coming from the GUI, and that split-lines is not splitting anything. That function splits on \n or \r\n - maybe somehow you're getting \r as line-endings from the GUI? You can verify this by adding this to the beginning of your fix-str function:
(doseq [c string] (println (int c)))
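If the output does show 13 (carriage return) without a following 10 (line feed), the sketch below reproduces the symptom and shows one possible workaround. It assumes the GUI really is handing you bare \r separators, which I can't confirm from here. The names cr-text and fix-str-any-eol are hypothetical:

```clojure
(require '[clojure.string :as string])

;; Hypothetical input: words separated by bare carriage returns.
(def cr-text "A\rblock\rof\rSQL")

;; split-lines only breaks on \n or \r\n, so this stays one "line".
(count (string/split-lines cr-text))
;; => 1

;; Same logic as the original function: the single line has its \r
;; characters stripped by #"\W", then one space is appended.
(defn fix-str-old [s]
  (let [words (->> s
                   (string/split-lines)
                   (map #(string/replace % #"\W" "")))]
    (apply str (interleave words (repeat " ")))))

(fix-str-old cr-text)
;; => "AblockofSQL "

;; One possible workaround: split on any of the three line-ending styles.
(defn fix-str-any-eol [s]
  (->> (string/split s #"\r\n|\n|\r")
       (map #(string/replace % #"\W" ""))
       (remove empty?)          ; drop lines that were only punctuation
       (string/join " ")))      ; join avoids the trailing space

(fix-str-any-eol cr-text)
;; => "A block of SQL"
```

That would also explain the trailing space in your GUI output: interleave pairs the one big "word" with a single space.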
Upvotes: 2