user1376350

Reputation: 563

Is there a way to write to Kiba CSV destination line by line or in batches instead of all at once?

Kiba is really cool!

I'm trying to set up an ETL process in my Rails app where I'll dump a large amount of data from my SQL DB to a CSV file. If I were to implement this myself, I'd use something like find_each to load, say, 1000 records at a time and write them to the file in batches. Is there a way to do this using Kiba? From my understanding, by default all of the rows from the Source get passed to the Destination at once, which wouldn't be feasible for my situation.

Upvotes: 0

Views: 175

Answers (1)

Thibaut Barrère

Reputation: 8873

Glad you like Kiba!

I'm going to make you happy by stating that your understanding is incorrect.

The rows are yielded & processed one by one in Kiba.

To see exactly how things work, I suggest you try this code:

class MySource
  def initialize(enumerable)
    @enumerable = enumerable
  end

  def each
    @enumerable.each do |item|
      puts "Source is reading #{item}"
      yield item
    end
  end
end

class MyDestination
  def write(row)
    puts "Destination is writing #{row}"
  end
end

source MySource, (1..10)
destination MyDestination

Run this and you'll see that each item is read then written.
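If you'd like to check the interleaving without installing Kiba, here is a minimal plain-Ruby sketch of the same idea. The names LoggingSource, LoggingDestination and run_pipeline are hypothetical stand-ins made up for this illustration; run_pipeline is a simplified imitation of the row-by-row flow, not Kiba's actual runner.

```ruby
# Hypothetical stand-in for a Kiba source: yields items one at a time.
class LoggingSource
  def initialize(enumerable, log)
    @enumerable = enumerable
    @log = log
  end

  def each
    @enumerable.each do |item|
      @log << "read #{item}"
      yield item
    end
  end
end

# Hypothetical stand-in for a Kiba destination: receives one row per call.
class LoggingDestination
  def initialize(log)
    @log = log
  end

  def write(row)
    @log << "write #{row}"
  end
end

# Simplified imitation of the flow (an assumption, not Kiba's real code):
# each row is handed to the destination as soon as the source yields it.
def run_pipeline(source, destination)
  source.each { |row| destination.write(row) }
end

log = []
run_pipeline(LoggingSource.new(1..3, log), LoggingDestination.new(log))
puts log
```

The log alternates between reads and writes (read 1, write 1, read 2, write 2, and so on), so the full set of rows is never collected before writing starts.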

Now to your actual concrete case - what's above means that you can implement your source this way:

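class ActiveRecordSource
  # model can be a model class or a relation such as Person.where(...)
  def initialize(model:)
    @model = model
  end

  def each
    # find_each loads records in batches (1000 by default) and yields them one by one
    @model.find_each do |record|
      yield record
    end
  end
end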

then you can use it like this:

source ActiveRecordSource, model: Person.where("age > 21")

(You could also leverage find_in_batches if you wanted each row to be an array of multiple records, but that's probably not what you need here).
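For the destination side, here is a rough sketch of a CSV destination that appends one line per row as it arrives, using Ruby's standard csv library. The class name StreamingCsvDestination and the filename: and headers: keywords are assumptions made up for this example, and it assumes each row responds to [] with the header keys (a Hash does, and so does an ActiveRecord record for its attributes).

```ruby
require "csv"

# Hypothetical destination: writes the header once, then one CSV line per row.
class StreamingCsvDestination
  def initialize(filename:, headers:)
    @headers = headers
    @csv = CSV.open(filename, "w")
    @csv << headers
  end

  # Called once per row; assumes row[key] works for every header key.
  def write(row)
    @csv << @headers.map { |header| row[header] }
  end

  # Kiba calls close (when defined) once all rows are processed,
  # which closes the underlying file.
  def close
    @csv.close
  end
end

# Possible declaration in a Kiba job (file name and headers are illustrative only):
#   destination StreamingCsvDestination, filename: "people.csv", headers: [:name, :age]
```

Since write only ever sees a single row, memory use stays flat no matter how many records the source yields.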

Hope this properly answers your question!

Upvotes: 1
