Reputation: 790
I'm trying to write a reader for big files in Clojure that works by iteration. How can I return the file's lines one at a time? I want to do something like this:
(println (do_something (readFile (:file opts)))) ; process and print first line
(println (do_something (readFile (:file opts)))) ; process and print second line
Code:
(ns testapp.core
  (:gen-class)
  (:require [clojure.tools.cli :refer [cli]])
  (:require [clojure.java.io]))
(defn readFile [file, cnt]
  ; Iterate over opened file (read line by line)
  (with-open [rdr (clojure.java.io/reader file)]
    (let [seq (line-seq rdr)]
      ; how return only one line there? and after, when needed, take next line?
      )))
(defn -main [& args]
  ; Main function for project
  (let [[opts args banner]
        (cli args
             ["-h" "--help" "Print this help" :default false :flag true]
             ["-f" "--file" "REQUIRED: File with data"]
             ["-c" "--clusters" "Count of clusters" :default 3]
             ["-g" "--hamming" "Use Hamming algorithm"]
             ["-e" "--evklid" "Use Evklid algorithm"])]
    ; Print help, when no typed args
    (when (:help opts)
      (println banner)
      (System/exit 0))
    ; Or process args and start work
    (if (and (:file opts) (or (:hamming opts) (:evklid opts)))
      (do
        ; Use Hamming algorithm
        (if (:hamming opts)
          (do
            (println (readFile (:file opts)))
            (println (readFile (:file opts))))
          ;(count (readFile (:file opts)))
          ; Use Evklid algorithm
          (println "Evklid")))
      (println "Please, type path for file and algorithm!"))))
Upvotes: 10
Views: 11319
Reputation: 3504
You can also try to read lazily from the reader, which is not the same as the lazy list of strings returned by line-seq. The details are discussed in this answer to a very similar question, but the gist of it is here:
(defn lazy-file-lines [file]
  (letfn [(helper [rdr]
            (lazy-seq
              (if-let [line (.readLine rdr)]
                (cons line (helper rdr))
                (do (.close rdr) nil))))]
    (helper (clojure.java.io/reader file))))
You can then map over the lines, which will only be read as far as necessary. As discussed in more detail in the linked answer, the downside is that if you don't read to the end of the file, (.close rdr) will never run, which can leak the open file handle.
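To make the trade-off concrete, here is a minimal sketch of how the helper might be used. The temp file and the names sample-file, first-two and all-lines are made up for the demonstration; lazy-file-lines is repeated only so the snippet runs on its own.

```clojure
(require '[clojure.java.io :as io])

;; Same helper as above, repeated so this sketch is self-contained.
(defn lazy-file-lines [file]
  (letfn [(helper [rdr]
            (lazy-seq
              (if-let [line (.readLine rdr)]
                (cons line (helper rdr))
                (do (.close rdr) nil))))]
    (helper (io/reader file))))

;; Hypothetical sample file, created only for this demonstration.
(def sample-file (java.io.File/createTempFile "lines" ".txt"))
(spit sample-file "alpha\nbeta\ngamma\n")

;; Lines are pulled from the reader only as far as take asks for them.
;; The end of the file is never reached here, so this reader is never closed.
(def first-two (doall (take 2 (map count (lazy-file-lines sample-file)))))

;; Realizing the whole sequence reaches the end of the file,
;; which is the only point where the helper calls .close.
(def all-lines (doall (lazy-file-lines sample-file)))
```

If partial reads are common in your program, a with-open based approach is probably the safer choice, since it closes the reader no matter how much was consumed.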
Upvotes: 5
Reputation: 61
Try doseq:
(defn readFile [file]
  (with-open [rdr (clojure.java.io/reader file)]
    (doseq [line (line-seq rdr)]
      (println line))))
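Note that doseq returns nil, so this only suits side effects such as printing. If you need some lines back as a value, they have to be realized before with-open closes the reader, otherwise consuming the lazy sequence later fails because the stream is already closed. A rough sketch, assuming you only want the first n lines (first-lines and the temp file are hypothetical names, not part of the answer above):

```clojure
(require '[clojure.java.io :as io])

(defn first-lines
  "Hypothetical helper: returns the first n lines of file as a vector."
  [file n]
  (with-open [rdr (io/reader file)]
    ;; vec forces the lines while the reader is still open;
    ;; take keeps the rest of the file from being read into memory.
    (vec (take n (line-seq rdr)))))

;; Hypothetical sample file, created only for this demonstration.
(def sample-file (java.io.File/createTempFile "lines" ".txt"))
(spit sample-file "one\ntwo\nthree\n")

(def two-lines (first-lines sample-file 2))
```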
Upvotes: 6
Reputation: 3452
Maybe I'm not understanding what you mean by "return line by line", but I'd suggest writing a function that accepts a file and a processing function, then prints the result of the processing function for every line of your big file. Or, more generally, accept both a processing function and an output function (println by default), in case we want not just to print the result but to send it over the network, save it somewhere, send it to another thread, etc.:
(defn process-file-by-lines
  "Process file reading it line-by-line"
  ([file]
   (process-file-by-lines file identity))
  ([file process-fn]
   (process-file-by-lines file process-fn println))
  ([file process-fn output-fn]
   (with-open [rdr (clojure.java.io/reader file)]
     (doseq [line (line-seq rdr)]
       (output-fn
         (process-fn line))))))
So:
(process-file-by-lines "/tmp/tmp.txt") ;; Will just print the file line by line
(require 'clojure.string)
(process-file-by-lines "/tmp/tmp.txt"
                       clojure.string/reverse) ;; Will print each line reversed (plain reverse would print a seq of characters)
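The three-argument form can be sketched the same way. Here the output function collects results into an atom instead of printing them; the atom, the temp file and the choice of upper-casing are only illustrative assumptions, and the function is repeated so the snippet runs on its own.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Same function as above, repeated so this sketch is self-contained.
(defn process-file-by-lines
  "Process file reading it line-by-line"
  ([file]
   (process-file-by-lines file identity))
  ([file process-fn]
   (process-file-by-lines file process-fn println))
  ([file process-fn output-fn]
   (with-open [rdr (io/reader file)]
     (doseq [line (line-seq rdr)]
       (output-fn
         (process-fn line))))))

;; Hypothetical sample file, created only for this demonstration.
(def sample-file (java.io.File/createTempFile "lines" ".txt"))
(spit sample-file "abc\ndef\n")

;; Hypothetical sink: gather processed lines in an atom rather than printing.
(def results (atom []))
(process-file-by-lines sample-file str/upper-case #(swap! results conj %))
```

After the call, @results holds the processed lines in file order. Keep in mind that collecting everything like this gives up the memory benefit for really big files, so a real output function would more likely write each result somewhere as it arrives.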
Upvotes: 14