Cole Shores

Reputation: 319

How do I read a gzip file in Ruby 1.8.7 line by line?

When I try to read a gzip file line by line in Ruby 1.8.7, only the first line of the gzipped file is read. This does not happen on my testing machine, only on my production server.

It may have something to do with zlib or GzipReader, but I am at a loss as to what to try next. Any suggestions would be fantastic.

require 'zlib'
require 'open-uri'

list = Array.new
file = Dir.glob("*").max_by {|f| File.mtime(f)}


File.open(file) do |f|
  gz = Zlib::GzipReader.new(f)
  #something right here is causing an issue on production system
  list = gz.read
  gz.close
end

#I need to take the array and push it to redis
list = list.split("\n")
list.shift
list.each do |list|
    puts list
    puts "\n\n"
end

Upvotes: 3

Views: 1606

Answers (3)

user492203

Reputation:

First, you might want to use '*.gz' instead of '*', in case there are other files in the script's working directory.

Here are a couple of solutions:

Using GzipReader (recommended)

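require 'zlib'

# Pick the most recently modified file in the working directory.
file = Dir.glob('*').max_by { |f| File.mtime(f) }
fd = File.open(file, 'rb')
gz = Zlib::GzipReader.new(fd)

# Skip the first line, then print each remaining line.
gz.readlines[1..-1].each do |line|
  line.chomp!
  puts line, "\n\n"
end

gz.close # also closes the underlying file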
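
If the file is large, reading it one line at a time avoids loading everything into memory. The following is only a minimal sketch: the helper name each_line_after_header and the sample archive are made up for illustration, and it assumes the first line is a header to be skipped, as in the question.

```ruby
require 'zlib'
require 'tempfile'

# Hypothetical helper: yields every line of a gzip file except the first,
# reading one line at a time instead of loading the whole file.
def each_line_after_header(path)
  Zlib::GzipReader.open(path) do |gz|
    gz.gets # discard the first (header) line
    gz.each_line { |line| yield line.chomp }
  end
end

# Build a small sample archive so the sketch can run on its own.
sample = Tempfile.new(['sample', '.gz'])
sample.close
Zlib::GzipWriter.open(sample.path) { |gz| gz.write("header\nfirst\nsecond\n") }

lines = []
each_line_after_header(sample.path) { |line| lines << line }
puts lines
```

Zlib::GzipReader.open closes the reader when the block exits, so no explicit close call is needed.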

Using IO#popen and zcat

You should not pass unsanitized user input to Kernel#exec or similar functions, as it could be used to execute arbitrary commands.

In your case you're not dealing with user input, so an attacker would need write access to the script's working directory to exploit this. It's still bad practice, though: a filename containing special shell characters (', ", $, etc.) could cause unexpected issues.

The following solution passes the command as an array, so no shell is involved and it should be as safe as the GzipReader one. It's still usually better practice to use the standard library than to rely on external programs. Note that the array form of IO.popen requires Ruby 1.9 or later.

file = Dir.glob('*').max_by { |f| File.mtime(f) }

IO.popen(['zcat', file]).readlines[1..-1].each do |line|
  line.chomp!
  puts line, "\n\n"
end
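
When a command string really has to go through the shell (with backticks, for example), escaping the file name is one way to limit the problem with special characters. This is a rough sketch using Shellwords from the standard library; the file name is hypothetical, and the zcat command is only built as a string here, never executed.

```ruby
require 'shellwords'

# Hypothetical file name containing characters the shell treats specially.
file = "backup $(date) it's.gz"

# Escaping makes the shell treat the whole name as one literal argument.
safe_command = "zcat #{Shellwords.escape(file)}"

# Round-trip check: shell-style splitting recovers the original name intact.
recovered = Shellwords.split(safe_command)

puts safe_command
puts recovered.inspect
```

Interpolating the raw name instead would let the shell expand $(date) and trip over the unmatched quote.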

Upvotes: 2

the Tin Man

Reputation: 160551

Here's how to write that in a more Ruby-like way:

require 'open-uri'

file = Dir.glob("*").max_by { |f| File.mtime(f) }
`zcat #{file}`.split("\n")[1..-1].each do |list|
  puts list, "\n\n"
end

Here's what it does:

  • It runs zcat in a subshell using backticks, passing the file name as its argument.
  • The captured output string is split on line endings.
  • The resulting array is sliced to skip the first element, then iterated with each.
  • Each line is passed into the block as list.

What's wrong with the original code, besides being written in a non-Ruby-like way?

  • Don't initialize an array using Array.new. This isn't Java, so use [] unless you need some of the darker Array initialization magic.
  • Everything beyond that point is very much a target for DRYing (Don't Repeat Yourself).
  • Your variable names are largely undescriptive; use names that are useful.
  • Don't assign to a variable and use it only once, unless the assignment is so complex that inlining it would make later code confusing.
  • You use list multiple times and in multiple ways. That's a terrible idea, especially when you move from trivial apps to large ones. Don't create "slush" variables; create usefully named ones. And, especially, don't stomp on them as you work your way through the logic.
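
As a rough sketch of how those points might look in practice: the method name data_lines and the sample text below are hypothetical and not from the original code. Each value gets one descriptive name and is never reassigned.

```ruby
# Hypothetical helper: returns the lines of a decompressed text,
# skipping the first (header) line.
def data_lines(decompressed_text)
  decompressed_text.split("\n").drop(1)
end

# Stand-in for the output of decompressing the newest file.
sample_text = "header\nalpha\nbeta\n"

data_lines(sample_text).each { |line| puts line, "\n\n" }
```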

Upvotes: 1

Cole Shores

Reputation: 319

I figured out the solution based on the suggestion below. I run zcat on the newest file and capture the result in a string called output. I then split that string on newlines into an array called list. This is obviously for logstashing purposes. Thanks again.

# Pick the most recently modified file in the working directory.
file = Dir.glob("*").max_by { |f| File.mtime(f) }

# Decompress it with zcat and capture the output as a string.
output = `zcat #{file}`

#I need to take the array and push it to redis
list = output.split("\n")
list.shift # drop the first line
list.each do |line|
  puts line
  puts "\n\n"
end

Upvotes: 0
