TheAznShumai

Reputation: 53

Is it possible to skip loading a row using the kiba-etl gem?

Is there a way I can skip loading certain rows if I deem the row invalid using the kiba-etl gem?

For example, a row might have to pass a validation before I load it into the system. Or an error might occur for a row, and I still need to push the data into the system regardless, while logging the problem.

Upvotes: 1

Views: 254

Answers (2)

Thibaut Barrère

Reputation: 8873

Author of Kiba here! To remove a row from the pipeline, simply return nil at the end of a transform:

transform do |row|
  row_valid = some_custom_operation
  row_valid ? row : nil
end

You could also "write down" the offending rows and report on them later using a post_process block, like this (the ids are kept in memory, so this assumes a low to moderate number of bogus rows):

@bogus_row_ids = []

transform do |row|
  # SNIP
  if row_valid(row)
    row
  else
    @bogus_row_ids << row[:id]
    nil # remove from pipeline
  end
end

post_process do
  # do something with @bogus_row_ids, send an email, write a file etc
end

Let me know if this properly answers your question, or if you need a more refined answer.

Upvotes: 1

TheAznShumai

Reputation: 53

I'm dumb. I realized you can just catch your errors within the transformation/loading process and return nil.
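
A minimal sketch of that idea, written in plain Ruby so it runs without the gem. The names validate!, SAFE_TRANSFORM, LOGGER and LOG_IO are hypothetical and only there for illustration; in a real Kiba job, the body of the lambda would go inside a transform do |row| ... end block instead.

```ruby
require 'logger'
require 'stringio'

# Log to an in-memory buffer here; a real job would log to a file or STDOUT.
LOG_IO = StringIO.new
LOGGER = Logger.new(LOG_IO)

# Hypothetical validation: raises when the row has no id.
def validate!(row)
  raise ArgumentError, 'missing id' if row[:id].nil?
  row
end

# Stand-in for the body of a Kiba transform block (assumption: same logic,
# just wrapped in a lambda so it can be called directly).
SAFE_TRANSFORM = lambda do |row|
  begin
    validate!(row)
  rescue ArgumentError => e
    LOGGER.warn("skipping row #{row.inspect}: #{e.message}")
    nil # returning nil is what removes the row from the pipeline
  end
end

rows = [{ id: 1 }, { id: nil }, { id: 3 }]

# Simulate the pipeline: rows for which the transform returned nil are dropped.
kept = rows.map { |row| SAFE_TRANSFORM.call(row) }.compact
```

If the row should still be loaded despite the error, return row from the rescue branch instead of nil, so the problem is logged but the row stays in the pipeline.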

Upvotes: 0
