Reputation: 3412
There is a program that generates huge CSV files. For example:
require 'csv'
arr = (0..10).to_a
CSV.open("foo.csv", "wb") do |csv|
  (2**16).times { csv << arr }
end
This generates a big file, so I want to compress it on the fly: instead of writing an uncompressed CSV file (foo.csv), I want to write a bzip2-compressed CSV file (foo.csv.bzip).
I have an example from the "ruby-bzip2" gem:
writer = Bzip2::Writer.new File.open('file', 'wb')
writer << 'data1'
writer.close
I am not sure how to combine the Bzip2 writer with the CSV writer.
Upvotes: 0
Views: 729
Reputation: 114238
Maybe it would be more flexible to write the CSV data to stdout:
# csv.rb
require 'csv'
$stdout.sync = true
arr = (0..10).to_a
(2**16).times do
puts arr.to_csv
end
... and pipe the output to bzip2:
$ ruby csv.rb | bzip2 > foo.csv.bz2
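A minimal sketch of why this is flexible: if the row-writing code takes its destination as an argument, the same code can feed stdout (and therefore the pipe above), a file, or an in-memory buffer. The helper name `emit_csv` is hypothetical and only used for illustration.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: appends `count` copies of `row`, formatted as CSV
# lines, to any object that responds to <<.
def emit_csv(out, row, count)
  count.times { out << row.to_csv } # Array#to_csv already appends "\n"
  out
end

# Demonstrate with an in-memory buffer instead of $stdout.
buffer = StringIO.new
emit_csv(buffer, (0..10).to_a, 3)
print buffer.string
```

Calling `emit_csv($stdout, arr, 2**16)` in csv.rb should leave the `| bzip2 > foo.csv.bz2` pipeline unchanged.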
Upvotes: 3
Reputation: 84172
You can also construct a CSV object with an IO, or something sufficiently like an IO, such as a Bzip2::Writer.
For example:
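require 'csv'
require 'bzip2' # provides Bzip2::Writer (the gem from the question)
arr = (0..10).to_a
File.open('file.bz2', 'wb') do |f|
  writer = Bzip2::Writer.new f
  CSV(writer) do |csv|
    (2**16).times { csv << arr }
  end
  writer.close
end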
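If you want to try the pattern without installing a gem, here is a sketch that swaps in Zlib::GzipWriter from the standard library. Note that this produces gzip output, not bzip2. It only shows that CSV will write into any IO-like compressor. The output path and the row count of 1000 are arbitrary choices for the demo.

```ruby
require 'csv'
require 'zlib'
require 'tmpdir'

arr = (0..10).to_a
path = File.join(Dir.tmpdir, 'foo.csv.gz') # hypothetical output path

# Zlib::GzipWriter is a stand-in for Bzip2::Writer here: it is IO-like
# and accepts <<, so CSV can push each row straight into the compressor.
Zlib::GzipWriter.open(path) do |gz|
  csv = CSV.new(gz)
  1000.times { csv << arr }
end # the block form finishes the gzip stream and closes the file

# Read the file back to check that the rows survive the round trip.
rows = nil
Zlib::GzipReader.open(path) { |gz| rows = CSV.parse(gz.read) }
puts rows.size
```

Rows are compressed as they are written, so the uncompressed CSV never has to exist on disk.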
Upvotes: 4