Alexander Yakovlev

Reputation: 215

Parallel MySQL I/O in Ruby

Good day. I'm writing a cron job that should split the work on a huge MySQL table across several parallel workers. This is a minimal sample of what I have at the moment:

require 'mysql'
require 'parallel'
@db = Mysql.real_connect("localhost", "root", "", "database")
@threads = 10

Parallel.map(1..@threads, :in_processes => 8) do |i|
  begin
    @db.query("SELECT url FROM pages LIMIT 1 OFFSET #{i}")
  rescue Mysql::Error => e
    @db.reconnect()
    puts "Error code: #{e.errno}"
    puts "Error message: #{e.error}"
    puts "Error SQLSTATE: #{e.sqlstate}" if e.respond_to?("sqlstate")
  end
end
@db.close

The workers don't need to return anything; each one gets its share of the job and does it. Except they don't. Depending on the run, I get one of these errors: the connection to MySQL is lost during the query, the connection doesn't exist ("MySQL server has gone away"), or "no _dump_data is defined for class Mysql::Result" followed by Parallel::DeadWorker.

What is the right way to do this?

Upvotes: 0

Views: 393

Answers (1)

Alexander Yakovlev

Reputation: 215

The map method expects each block to return a result; I don't need one, so I switched to each:

Parallel.each(1..@threads, :in_processes => 8) do |i|
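For context, "no _dump_data is defined for class ..." is the TypeError that Ruby's Marshal raises for objects it cannot serialize. The likely explanation is that a block running in another process has its return value marshalled back to the parent, so a block ending in @db.query(...) tries to ship a Mysql::Result. A stdlib-only sketch of the same failure, using a Proc as a stand-in for the result object (an assumption for illustration; no MySQL is involved, and marshal_failure is a hypothetical helper):

```ruby
# Returns the exception Marshal raises for obj, or nil if obj can be dumped.
def marshal_failure(obj)
  Marshal.dump(obj)
  nil
rescue TypeError => e
  e
end

# A Proc stands in for Mysql::Result: Marshal has no way to dump either one.
puts marshal_failure(proc { :row }).inspect

# Plain data (arrays, strings, numbers) can be dumped, so this prints nil.
puts marshal_failure(["http://example.com/"]).inspect
```

If a worker does need to hand something back, returning plain data such as an array of strings avoids this error.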

This also solved the MySQL problem: I just needed to open the connection inside each parallel process, which is possible in the each loop. Of course, the connection should also be closed inside that process.
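A minimal sketch of that pattern, under stated assumptions: it uses plain fork from the standard library instead of the parallel gem, and a hypothetical StubConnection class instead of a real MySQL connection, so that it runs without gems or a server. The names process_offsets and StubConnection are made up for illustration. The point it shows is that each child opens, uses, and closes its own connection, and nothing but an exit status goes back to the parent.

```ruby
# Hypothetical stand-in so the sketch runs without a database.
class StubConnection
  def query(sql)
    sql # pretend the real work happens here
  end

  def close; end
end

# Fork one child per offset. `connect` is any callable that returns a fresh
# connection; it is called AFTER the fork, so no connection is ever shared.
def process_offsets(offsets, connect)
  pids = offsets.map do |offset|
    fork do
      db = connect.call
      begin
        db.query("SELECT url FROM pages LIMIT 1 OFFSET #{offset}")
      ensure
        db.close # closed in the same process that opened it
      end
      exit!(0) # leave without running at_exit handlers inherited from the parent
    end
  end
  # Wait for every child and report whether each one exited cleanly.
  pids.map { |pid| Process.wait2(pid).last.success? }
end

statuses = process_offsets(1..10, -> { StubConnection.new })
puts statuses.inspect
```

With the mysql gem from the question, the callable would presumably wrap the same Mysql.real_connect call, and the body of the fork block would move into the Parallel.each block.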

Upvotes: 1
