Reputation: 563
Kiba is really cool!
I'm trying to set up an ETL process in my Rails app where I'll dump a large amount of data from my SQL DB to a CSV file. If I were to implement this myself, I'd use something like find_each
to load, say, 1000 records at a time and write them to the file in batches. Is there a way to do this using Kiba? From my understanding, by default all of the rows
from the Source get passed to the Destination, which wouldn't be feasible for my situation.
Upvotes: 0
Views: 175
Reputation: 8873
Glad you like Kiba!
I'm going to make you happy by stating that your understanding is incorrect.
The rows are yielded & processed one by one in Kiba.
To see how things work exactly, I suggest you try this code:
class MySource
  def initialize(enumerable)
    @enumerable = enumerable
  end

  def each
    @enumerable.each do |item|
      puts "Source is reading #{item}"
      yield item
    end
  end
end

class MyDestination
  def write(row)
    puts "Destination is writing #{row}"
  end
end

source MySource, (1..10)
destination MyDestination
Run this and you'll see that each item is read then written.
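If you want to convince yourself of the interleaving without wiring up a full Kiba job, here is a self-contained plain-Ruby sketch of the same one-row-at-a-time flow (illustration only, not Kiba's actual internals — the class names and the events array are mine):

```ruby
# Records the interleaving of reads and writes to show that each
# row is written before the next one is read.
events = []

class DemoSource
  def initialize(enumerable, events)
    @enumerable = enumerable
    @events = events
  end

  def each
    @enumerable.each do |item|
      @events << "read #{item}"
      yield item
    end
  end
end

class DemoDestination
  def initialize(events)
    @events = events
  end

  def write(row)
    @events << "write #{row}"
  end
end

source = DemoSource.new(1..3, events)
destination = DemoDestination.new(events)

# This is conceptually what the runner does: stream each row from
# the source straight into the destination.
source.each { |row| destination.write(row) }

p events
# → ["read 1", "write 1", "read 2", "write 2", "read 3", "write 3"]
```

Because reads and writes alternate, only one row is ever held in memory at a time, which is exactly why the large-table case works.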
Now to your actual concrete case: what's above means that you can implement your source this way:
class ActiveRecordSource
  def initialize(model:)
    @model = model
  end

  def each
    @model.find_each do |record|
      yield record
    end
  end
end
then you can use it like this:
source ActiveRecordSource, model: Person.where("age > 21")
(You could also leverage find_in_batches
if you wanted each row to be an array of multiple records, but that's probably not what you need here).
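For the CSV end of your pipeline, the destination can stream rows to disk the same way. Here is a minimal sketch (the class name and header handling are mine, not part of Kiba, and it assumes each row reaching the destination is a Hash — e.g. produced by a transform calling record.attributes):

```ruby
require "csv"

# Streaming CSV destination sketch: each row is appended to the
# file as soon as it arrives, so the full dataset never sits in
# memory. Assumes rows are Hashes with consistent keys.
class CsvDestination
  def initialize(filename)
    @csv = CSV.open(filename, "w")
    @headers_written = false
  end

  def write(row)
    unless @headers_written
      @csv << row.keys # write the header line once, from the first row
      @headers_written = true
    end
    @csv << row.values
  end

  # Kiba calls close (when defined) after all rows are processed,
  # which is the right place to release the file handle.
  def close
    @csv.close
  end
end
```

You would then declare it like the other components, e.g. destination CsvDestination, "people.csv" (Kiba instantiates the class with the arguments you pass).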
Hope this properly answers your question!
Upvotes: 1