Reputation: 2077
I have a 2 GiB file, and I want to read the first line of the file.
I can call the File#readlines method, which returns an array, and then use the [0] bracket syntax, at(0), slice(0), or first.
But there's a problem. My PC has 3.7 GiB RAM, and the usage goes from 1.1 GiB all the way up to 3.7 GiB. But all I want is the first line of the file. Is there an efficient way to do that?
Upvotes: 0
Views: 632
Reputation: 972
Taken from https://www.rosettacode.org/wiki/Read_a_specific_line_from_a_file#Ruby
seventh_line = open("/etc/passwd").each_line.take(7).last
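Note that take(7) builds a seven-element array before last picks the final entry. For large line numbers, a lazy enumerator avoids that intermediate array. A minimal sketch, assuming a hypothetical helper name nth_line_lazy and 1-based line numbers (neither comes from the Rosetta Code page):

```ruby
# Hypothetical helper: returns line `n` (1-based) of the file at `path`,
# or nil if the file has fewer than `n` lines.
def nth_line_lazy(path, n)
  raise ArgumentError, "n must be >= 1" if n < 1

  # File.foreach without a block returns an Enumerator. Making it lazy means
  # drop skips lines one at a time instead of collecting them, and first
  # stops the iteration as soon as one line is available.
  File.foreach(path).lazy.drop(n - 1).first
end
```

With this, nth_line_lazy("/etc/passwd", 7) would play the role of the one-liner above, without holding the first six lines in memory.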
Upvotes: 1
Reputation: 2077
I came up with code that does the job quite efficiently, using the IO#each_line method. Say we need line 3,000,000:
#!/usr/bin/ruby -w
file = File.open(File.join(__dir__, 'hello.txt'))
final = nil
read_upto = 3_000_000 - 1
file.each_line.with_index do |l, i|
  if i == read_upto
    final = l
    break
  end
end
file.close
p final
Running with the time shell builtin:
[I have a big hello.txt file in which every line reads "#!/usr/bin/ruby -w #<line number>".]
$ time ruby p.rb
"#!/usr/bin/ruby -w #3000000\n"
real 0m1.298s
user 0m1.240s
sys 0m0.043s
We can also get the first line very easily:
#!/usr/bin/ruby -w
enum = IO.foreach(File.join(__dir__, 'hello.txt'))
# Getting the first line
p enum.first
# Getting the 100th line
# This can still cause memory issues for large line numbers because it
# builds an array of the first 100 lines
p enum.take(100)[-1]
# The time consuming but memory efficient way
# reading the 3,000,000th line
# While loops are fastest
index, i = 3_000_000 - 1, 0
enum.next && i += 1 while i < index
p enum.next # reading the 3,000,000th line
Running with time:
time ruby p.rb
"#!/usr/bin/ruby -w #1\n"
"#!/usr/bin/ruby -w #100\n"
"#!/usr/bin/ruby -w #3000000\n"
real 0m2.341s
user 0m2.274s
sys 0m0.050s
There could be other ways, like IO#readpartial, IO#sysread, and so on. But IO.foreach and IO#each_line are the easiest and quite fast to work with.
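To illustrate the IO#readpartial route, here is a minimal sketch that reads fixed-size chunks and stops at the first newline. The helper name first_line_chunked and the 4096-byte chunk size are assumptions chosen for illustration; they are not part of the scripts above.

```ruby
# Arbitrary chunk size, assumed for illustration only.
CHUNK_SIZE = 4096

# Hypothetical helper: returns the first line of the file at `path`
# (including its trailing newline, if any), or nil for an empty file.
def first_line_chunked(path, chunk_size = CHUNK_SIZE)
  buffer = String.new # empty binary (ASCII-8BIT) string
  File.open(path, "rb") do |file|
    loop do
      begin
        # readpartial returns at most chunk_size bytes and raises
        # EOFError once the end of the file is reached.
        buffer << file.readpartial(chunk_size)
      rescue EOFError
        break
      end
      newline = buffer.index("\n")
      # Stop reading as soon as the first newline has been seen.
      return buffer[0..newline] if newline
    end
  end
  # No newline found: the whole file is a single line, or it is empty.
  buffer.empty? ? nil : buffer
end
```

Because the file is opened in "rb" mode, the result is a binary string; call force_encoding on it if a specific encoding is needed.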
Hope this helps!
Upvotes: 0
Reputation: 406
I would use the command line. For example:
exec("cat #{filename} | head -#{nth_line} | tail -1")
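Two caveats with the line above: Kernel#exec replaces the running Ruby process, so no Ruby code after it executes, and an unescaped filename containing spaces or shell metacharacters will break the command. A minimal sketch of a variant that returns the line to Ruby instead, assuming head and tail are available on the PATH; the helper name nth_line_via_shell is hypothetical:

```ruby
require "shellwords"

# Hypothetical helper: returns line `nth_line` (1-based) of `filename` by
# shelling out to head and tail.
def nth_line_via_shell(filename, nth_line)
  # Integer() raises ArgumentError for non-numeric input.
  count = Integer(nth_line)
  # Shellwords.escape protects spaces and shell metacharacters in the path.
  path = Shellwords.escape(filename)
  # Backticks run the command in a subshell and return its standard output.
  `head -n #{count} -- #{path} | tail -n 1`
end
```

As with the original pipeline, if the file has fewer lines than requested, this returns the last line of the file rather than nothing.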
I hope this is useful to you.
Upvotes: 0
Reputation: 10536
What about IO.foreach?
IO.foreach('filename') { |line| p line; break }
That should read the first line, print it, and then stop. It does not read the entire file; it reads one line at a time.
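If the line is needed as a value rather than printed, break can carry it out of the block. A minimal sketch; the helper name first_line is hypothetical:

```ruby
# Hypothetical helper: returns the first line of the file at `path`,
# or nil if the file is empty.
def first_line(path)
  # `break line` ends the iteration after the first line and becomes the
  # return value of IO.foreach; with no lines at all, IO.foreach returns nil.
  IO.foreach(path) { |line| break line }
end
```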
Upvotes: 0
Reputation: 859
Have you tried readline instead of readlines?
File.open('file-name') { |f| f.readline }
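One thing to keep in mind: IO#readline raises EOFError when the file is empty, whereas IO#gets returns nil. A minimal sketch of the gets variant; the helper name first_line_or_nil is hypothetical:

```ruby
# Hypothetical helper: returns the first line of the file at `path`,
# or nil if the file is empty.
def first_line_or_nil(path)
  # The block form of File.open closes the file automatically and
  # returns the value of the block.
  File.open(path) { |f| f.gets }
end
```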
Upvotes: 0