port5432

Reputation: 6391

Ruby memory issues with Kernel.open

I am wrestling with a memory issue on our production server. I am using a bioruby gem from within a delayed_job (in a Rails 4 app). Previously all worked fine, and it also works fine on the local development (OS X) machine.

There is plenty of memory on the server: it has 8 GB and is barely using 2 GB. Memory usage does not change when the file is accessed.

The exact line of code that causes the error is a Kernel.open call (line 35): https://github.com/misshie/bioruby-ucsc-api/blob/master/lib/bio-ucsc/file/twobit.rb

def self.load(filename)
  two_bit = nil
  Kernel.open(filename, 'rb') { |f| two_bit = f.read }
  tbq = Bio::Ucsc::File::ByteQueue.new(two_bit)

The file it is trying to open contains the human genome, and is 800MB, but this process has been working fine for the past 9 months.

1.9.3p327 :001 > Kernel.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit', 'rb') {|f| two_bit = f.read}
NoMemoryError: failed to allocate memory
from (irb):1:in `read'
from (irb):1:in `block in irb_binding'
from (irb):1:in `open'
from (irb):1
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands/console.rb:90:in `start'
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands/console.rb:9:in `start'
from /home/assay/apps/assay/shared/bundle/ruby/1.9.1/gems/railties-4.0.2/lib/rails/commands.rb:62:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'

The server runs Ubuntu 12.04 LTS:

assay@assaypipeline:~/apps/assay/shared/bin/hg19$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04 LTS
Release:    12.04
Codename:   precise

EDIT

In response to CMoi's comment below, I tried an open only, and it seemed to be OK. Not sure how to proceed now.

1.9.3p327 :001 > Kernel.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit', 'rb')
 => #<File:/home/assay/apps/assay/shared/bin/hg19/hg19.2bit>
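Since the bare open succeeds, the failure appears to come from the allocation that read performs, not from opening the file. One way to narrow it down, as a rough sketch only, is to read the file in fixed-size chunks and see whether the whole file can be traversed when just one chunk is held at a time. The helper name bytes_readable and the 64 MB default chunk size are hypothetical, chosen for illustration.

```ruby
# Hypothetical diagnostic helper (not part of bioruby-ucsc-api):
# reads a file in fixed-size chunks and returns the total bytes read.
# Each chunk is discarded before the next is read, so only about one
# chunk's worth of file data is held at a time.
def bytes_readable(path, chunk_size = 64 * 1024 * 1024)
  total = 0
  File.open(path, 'rb') do |f|
    # IO#read(length) returns nil once end of file is reached.
    while (chunk = f.read(chunk_size))
      total += chunk.bytesize
    end
  end
  total
end
```

If bytes_readable(path) returns the same value as File.size(path) while a single f.read still raises NoMemoryError, that would suggest the process cannot obtain one large allocation, rather than a problem with the file itself.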

Upvotes: 1

Views: 632

Answers (1)

engineersmnky

Reputation: 29478

What if you tried this:

tbq = Bio::Ucsc::File::ByteQueue.new(File.open('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit', 'rb', &:read))

Or

tbq = Bio::Ucsc::File::ByteQueue.new(File.binread('/home/assay/apps/assay/shared/bin/hg19/hg19.2bit'))

This will eliminate the block reading the file into a local variable and instead place it directly into your Bio::Ucsc::File::ByteQueue object.
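Note that either variant still builds the whole file as a single String before handing it to ByteQueue. If it helps to test with less than the full 800MB, File.binread also accepts a length and an offset, so a slice of the file can be read in binary mode. A minimal sketch, where read_slice is a hypothetical helper and not part of the gem:

```ruby
# Hypothetical helper: read `length` bytes starting at byte `offset`,
# in binary mode, without loading the rest of the file.
def read_slice(path, offset, length)
  # File.binread(name, length, offset) is inherited from IO.binread.
  File.binread(path, length, offset)
end
```

For example, read_slice(path, 0, 16) would return only the first 16 bytes of the file. Whether ByteQueue can do anything useful with a partial file is not something this sketch assumes; it is only a way to check that smaller reads succeed.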

Upvotes: 1
