Reputation: 33249
How do I open a text file and read it line by line? There are two different cases I'm interested in answers for: reading all the lines into an array at once, and processing the lines one at a time.
For the second case, I don't want to have to keep all the lines in memory at one time.
Upvotes: 40
Views: 16878
Reputation: 33249
Reading a file into memory all at once as an array of lines is just a call to the readlines function:
julia> words = readlines("/usr/share/dict/words")
235886-element Array{String,1}:
"A"
"a"
"aa"
⋮
"zythum"
"Zyzomys"
"Zyzzogeton"
By default this discards the newlines, but if you want to keep them, you can pass the keyword argument keep=true:
julia> words = readlines("/usr/share/dict/words", keep=true)
235886-element Array{String,1}:
"A\n"
"a\n"
"aa\n"
⋮
"zythum\n"
"Zyzomys\n"
"Zyzzogeton\n"
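If /usr/share/dict/words is not available on your system, the same behavior can be checked on a small temporary file. This is a minimal sketch; the file contents and the variable names (path, tmpio, stripped, kept) are made up for illustration:

```julia
# Create a small throwaway file; contents are hypothetical sample data.
path, tmpio = mktemp()
write(tmpio, "alpha\nbeta\ngamma\n")
close(tmpio)

stripped = readlines(path)              # default: line endings are removed
kept = readlines(path, keep=true)       # keep=true: line endings are retained

# With keep=true, concatenating the lines reproduces the file contents.
roundtrip = join(kept)

rm(path)                                # clean up the temporary file
```

Here stripped holds "alpha", "beta", "gamma", while kept holds the same strings with a trailing "\n" each, so join(kept) gives back the original text.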
If you have an already opened file object, you can also pass that to the readlines function:
julia> open("/usr/share/dict/words") do io
           readline(io) # throw out the first line
           readlines(io)
       end
235885-element Array{String,1}:
"a"
"aa"
"aal"
⋮
"zythum"
"Zyzomys"
"Zyzzogeton"
This demonstrates the readline function, which reads a single line from an open I/O object or, when given a file name, opens the file and reads the first line from it:
julia> readline("/usr/share/dict/words")
"A"
If you don't want to load the file contents all at once (or if you're processing streaming data, such as from a network socket), then you can use the eachline function to get an iterator that produces lines one at a time:
julia> for word in eachline("/usr/share/dict/words")
           if length(word) >= 24
               println(word)
           end
       end
formaldehydesulphoxylate
pathologicopsychological
scientificophilosophical
tetraiodophenolphthalein
thyroparathyroidectomize
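Because eachline returns an ordinary iterator, it can also be handed to generic functions that consume iterators, which keeps only one line in memory at a time. A minimal sketch, assuming a small temporary file with made-up contents (path, tmpio, longest, and n_short are illustrative names):

```julia
# Create a small throwaway file; contents are hypothetical sample data.
path, tmpio = mktemp()
write(tmpio, "a\nabc\nab\n")
close(tmpio)

# Length of the longest line, computed without collecting the lines.
longest = maximum(length, eachline(path))

# Number of lines shorter than three characters.
n_short = count(line -> length(line) < 3, eachline(path))

rm(path)                                # clean up the temporary file
```

Each call to eachline(path) opens the file afresh, so the two passes are independent of each other.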
The eachline function can, like readlines, also be given an opened file handle to read lines from. You can also "roll your own" iterator by opening the file and calling readline repeatedly:
julia> open("/usr/share/dict/words") do io
           while !eof(io)
               word = readline(io)
               if length(word) >= 24
                   println(word)
               end
           end
       end
formaldehydesulphoxylate
pathologicopsychological
scientificophilosophical
tetraiodophenolphthalein
thyroparathyroidectomize
This is equivalent to what eachline does for you, and it's rare to need to do this yourself, but if you need to, the ability is there. For more information about reading a file character by character, see this question and answer: How do we use julia to read through each character of a .txt file, one at a time?
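The examples above pass a file name to eachline; passing an already opened handle looks like the sketch below. It combines a single readline call (to skip a header line) with eachline over the rest of the same handle. The file contents and the names path, tmpio, and long_words are assumptions made for illustration:

```julia
# Create a small throwaway file; contents are hypothetical sample data.
path, tmpio = mktemp()
write(tmpio, "header\nshort\nformaldehydesulphoxylate\n")
close(tmpio)

long_words = open(path) do io
    readline(io)                        # throw out the first line
    # eachline continues from the handle's current position
    [word for word in eachline(io) if length(word) >= 24]
end

rm(path)                                # clean up the temporary file
```

The do-block form of open closes the handle when the block finishes, even if an error is thrown inside it.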
Upvotes: 54