Jackson

Reputation: 6851

Splitting files in Ruby

I have a method in my application that splits a file into many smaller files.

The method opens a tab-delimited file and, for each line in the file, appends the first value in the line to a file that it creates, which is named after another value in that line. For instance, given a row with the following values:

Atlanta Boston 354 Goalie

It would add the value 'Atlanta' to a file called 'Boston.csv'.

def split_file(file)
  File.open(file.decompressed_file_path, 'rb').each { |line|
    row_values = line.split("\t")

    file_name = build_file_name(file,row_values)
    file_path = build_file_path(file_name)

    open(file_path, "a+") { |f| f << row_values[0] + "\n" }
  }
end



def build_file_path(name)
  File.join(file.directory,name)
end


def build_file_name(file,row_values)
  "#{row_values[file.hometown_index]}.csv"
end

My question is: is there a more efficient way of accomplishing this? More specifically, is there a quicker way, or is this just a tedious task?

Upvotes: 2

Views: 55

Answers (1)

Boris Stitnicky

Reputation: 12578

Whether there is a more efficient way of accomplishing your task depends on what you mean by "efficient" and on how you reuse your code. Obviously, computational complexity is not a question here, so you must be asking about ergonomic efficiency. The code you wrote is good enough.

But if you want to refactor it from good to great, you could make more thorough use of object-oriented techniques. For example, you could write a class RowValues:

class RowValues < Array
  def self.of line
    new line.split "\t"
  end

  def build_file_name file
    "#{self[file.hometown_index]}.csv"
  end

  def write file_path
    open( file_path, "a+" ) { |f| f << self[0] + "\n" }
  end
end

def split_file file
  File.open(file.decompressed_file_path, 'rb').each { |line|
    row_values = RowValues.of line
    file_name = row_values.build_file_name file
    file_path = build_file_path file_name
    row_values.write file_path
  }
end

You could continue refactoring along these lines, for example by making #split_file an instance method of your file object, so that instead of split_file( file ) you would call file.split. I do not fully understand your code: for example, your #build_file_path method uses a file variable (or method) that is not declared in the code you posted. In any case, I recommend watching the lecture by Ben Orenstein, Refactoring from Good to Great.
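The suggestion to turn #split_file into an instance method might look roughly like the sketch below. The class name SplittableFile, its constructor and the helper path_for are hypothetical. Only decompressed_file_path, directory and hometown_index come from the question, and how the real file object provides them is assumed. The sketch also calls chomp on each line, which the original code does not do, so that a trailing newline cannot end up in a file name.

```ruby
require "tmpdir"

# Hypothetical stand-in for the asker's file object. The three attribute
# names mirror the question; the constructor is an assumption.
class SplittableFile
  attr_reader :decompressed_file_path, :directory, :hometown_index

  def initialize(decompressed_file_path, directory, hometown_index)
    @decompressed_file_path = decompressed_file_path
    @directory = directory
    @hometown_index = hometown_index
  end

  # Replaces split_file(file): appends the first value of every row to
  # the CSV file named after that row's hometown column.
  def split
    File.foreach(decompressed_file_path) do |line|
      # chomp is an addition to the original code (see the note above).
      row_values = line.chomp.split("\t")
      File.open(path_for(row_values), "a") { |f| f << row_values[0] << "\n" }
    end
  end

  private

  # Hypothetical helper combining build_file_name and build_file_path.
  def path_for(row_values)
    File.join(directory, "#{row_values[hometown_index]}.csv")
  end
end

# Usage with the sample row from the question, written to a temp directory.
Dir.mktmpdir do |dir|
  input = File.join(dir, "players.tsv")
  File.write(input, "Atlanta\tBoston\t354\tGoalie\n")
  SplittableFile.new(input, dir, 1).split
  puts File.read(File.join(dir, "Boston.csv")) # prints "Atlanta"
end
```

Keeping the path logic private means a caller only needs file.split. It also gives the file reference in #build_file_path an obvious home, since directory becomes a method on the same object.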

Upvotes: 1
