Reputation: 53
Is there a way to skip loading certain rows that I deem invalid when using the kiba-etl gem?
For example, a row might have to pass a validation before it is loaded into the system, or an error might occur on a row that I still need to push into the system regardless, while logging the problem.
Upvotes: 1
Views: 254
Reputation: 8873
Author of Kiba here! To remove a row from the pipeline, simply return nil
at the end of a transform:
transform do |row|
  row_valid = some_custom_operation
  row_valid ? row : nil
end
You could also "write down" the offending rows and report on them later using a post_process
block like this (this assumes a low to moderate number of bogus rows, since their ids are kept in memory):
@bogus_row_ids = []
transform do |row|
  # SNIP
  if row_valid(row)
    row
  else
    @bogus_row_ids << row[:id]
    nil # remove from pipeline
  end
end
post_process do
  # do something with @bogus_row_ids, send an email, write a file etc
end
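If you prefer something you can unit-test outside of a job, the same bookkeeping can live in a class-based transform, since Kiba only needs an object that responds to process(row). The following is a rough sketch rather than anything shipped with Kiba: the class name RejectInvalidRows, the shared bogus_row_ids array, and the email check are all made up for illustration, and the last lines simulate the pipeline in plain Ruby so the snippet runs without the gem.

```ruby
# Hypothetical class-based transform (the name is an assumption, not part of Kiba).
# Kiba calls #process once per row; returning nil removes the row from the pipeline.
class RejectInvalidRows
  # rejected_ids is a plain array shared with the caller, so it can be
  # inspected later (for instance from a post_process block).
  def initialize(rejected_ids)
    @rejected_ids = rejected_ids
  end

  def process(row)
    return row if valid?(row)

    @rejected_ids << row[:id]
    nil # remove from pipeline
  end

  private

  # Placeholder rule, purely illustrative: replace with your own validation.
  def valid?(row)
    row[:email].to_s.include?("@")
  end
end

# Simulating the pipeline without Kiba: nil results are dropped.
bogus_row_ids = []
transform = RejectInvalidRows.new(bogus_row_ids)
rows = [{ id: 1, email: "a@example.com" }, { id: 2, email: "nope" }]
kept = rows.map { |row| transform.process(row) }.compact
# kept now holds only the row with id 1, and bogus_row_ids == [2]
```

Inside a job this would presumably be declared as transform RejectInvalidRows, bogus_row_ids, with the array then read in post_process.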
Let me know if this properly answers your question, or if you need a more refined answer.
Upvotes: 1
Reputation: 53
I'm dumb. I realized you can just catch your errors within the transformation/loading process and return nil.
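For anyone landing here later, this is roughly what I mean. It is only a sketch under my own assumptions: ParseAmount, the :amount field, and the keep_invalid flag are invented for the example, and it is written as a plain Ruby class with a process(row) method so it can be run without the gem.

```ruby
require "logger"

# Hypothetical transform: converts row[:amount] to an Integer.
# On failure it logs the problem, then either drops the row (returns nil)
# or passes it through untouched when keep_invalid is true.
class ParseAmount
  def initialize(logger, keep_invalid = false)
    @logger = logger
    @keep_invalid = keep_invalid
  end

  def process(row)
    row.merge(amount: Integer(row[:amount]))
  rescue ArgumentError, TypeError => e
    # Integer() raises ArgumentError for malformed strings and TypeError for nil.
    @logger.warn("problem with row #{row[:id]}: #{e.message}")
    @keep_invalid ? row : nil
  end
end

logger = Logger.new($stderr)
strict = ParseAmount.new(logger)
lenient = ParseAmount.new(logger, true)

strict.process({ id: 1, amount: "12" })   # row with amount converted to 12
strict.process({ id: 2, amount: "abc" })  # logs a warning and returns nil
lenient.process({ id: 3, amount: "abc" }) # logs a warning and returns the row as-is
```

The lenient variant covers the case where the data still has to be loaded regardless, with the problem only being logged.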
Upvotes: 0