Kranthi
Kranthi

Reputation: 1417

How to write data and headers to csv dynamically?

I need to write headers and data into csv. The problem i'm facing is that my data may change order. I want to map my data exactly under the paricular header column. Can anyone tell me how to do this?

Ex: My data looks like

Min6    ItemNumber  FC1 FC2 Retailer    Brand   Description  Size
630   12549        0     0             Too-Tart  SWEET&SOUR   .7OZ
                   0     0      Ahold  Too-Tart  SWEET&SOUR   .7OZ
                   0     0      Test   Too-Tart  SWEET&SOUR   .7OZ

630   12550        0     0             Too       SWEET&Salt   .60Z
                   0     0      Test   Too       SWEET&SOUR   .7OZ
                   0     0      Ahold  Too       SWEET&SOUR   .23Z

and am expecting as

   Min6 ItemNumber  FC1 FC2 Retailer    Brand   Description     Size     Ahold-Description Ahold-Brand  Test-Description Test-Brand
   630    12549        0     0             Too-Tart  SWEET&SOUR .7OZ      SWEET&SOUR        Too-Tart     SWEET&SOUR       Too-Tart
   630    12549        0     0             Too-Tart  SWEET&Salt .6OZ      SWEET&SOUR        Too          SWEET&SOUR       Too

Here order of the retailer may change and many retailers will be there. For each new retailer new headers will be created. If a retailer is repeated then I want to map his brand and description under his headers exactly which are already created.

class CouponsDataPreProcessor < Transformer
  def run(path)
    @records = []
    input_base_name = File.basename(path, '.csv')
    first_row = true
    output_file = File.join('public', "coupons_data_preprocessor_#{input_base_name}_#{Time.now.strftime('%Y%m%d%H%M')}.csv")
    CSV.open(output_file, 'w') do |csv|
      CSV.foreach(path, {:headers => true}) do |row|
        if first_row
          @headers = row.headers
          first_row = false
        end
        if row['ItemNumber'].present?
          @records << row.fields
        else
          form_retailer_headers_and_fields(row)
        end
      end
      csv << @headers
      @records.each { |record| csv << record }
    end
    output_file
  end

  def form_retailer_headers_and_fields(row)
    retailer = row['Retailer']
    unless @headers.include?("Description-#{retailer}")
      @headers.push("Description-#{retailer}", "Brand-#{retailer}", "SubBrand-#{retailer}", "Size-#{retailer}")
    end
    @records.last.push(row['Description'], row['Brand'], row['SubBrand'], row['Size'])
  end

  def self.about
    'Preprocess coupons data into input line items'
  end
end

Here if a retailer is repeated and he is in different order then i'm not able to map his values exactly under his header columns(ex: Brand-retailername)

How do I create a row object with key as header and value as the value for that header and push it into the output csv file?

Upvotes: 1

Views: 1962

Answers (1)

dcorking
dcorking

Reputation: 1206

There seem to be several bugs in your code.

  1. (line 1) You inherit from Transformer, but you don't seem to re-use any methods from Transformer. Consider deleting the inheritance.

  2. (line 1) You need require 'csv'

  3. (line 4) Perhaps you intended to write File.basename(path << '.csv')

  4. (line 14) You wrote if row['ItemNumber'].present? but if that column is empty, then row['ItemNumber'] will be nil, and so the program will abort on MethodMissing. You probably intended to use the present? method in the activesupport gem, which Stack Overflow users will need to require to test this part of your code

I don't think the CSV library has a tool to detect that the first 2 columns (of row 3 for example) are blank, so I think you need to preprocess with another tool that identifies the columns before using CSV.

You can write a method to identify the column numbers, but I just did it clumsily and manually (please edit my answer with tidier code):

COMMA = ','
f = File.open(path)
g = File.open('commas.csv', mode='w')
lines = f.readlines
lines = lines.reject{|ln| ln.length < 61} # reject the short line(s)
lines_with_commas = lines.collect{|ln| ln[0..3] << COMMA << ln[4..17] \
<< COMMA << ln[18..23] << COMMA << ln[24..27] << COMMA << ln[28..38] \
<< COMMA << ln[39..47]<< COMMA << ln[48..60]<< COMMA << ln[61..-1]} 
lines_with_commas.each{ |ln| g.write(ln)}
g.close

Then you can import your CSV in one go and search it for retailers.

> table = CSV.table('commas.csv')
> table[:retailer_].collect(&:strip).uniq
=> ["", "Ahold", "Test"]

Upvotes: 1

Related Questions