Reputation: 5
I have a CSV file with the columns test and id, and the values are:
"abc is 123 test", 1
"abc is 123 test", 2
"abc is 123 test", 3
"abc is 123 test", 4
"abc is 123 test", 5
I want to replace "abc is 123 test" with "abc is 567 test".
Note: The values 123 and 567 are dynamic. With every new CSV, 123 changes, but the surrounding string "abc is <value> test" always remains the same.
Code I tried:
folder_path = "/home/test/files/"
f1 = folder_path + "abc.csv"
string_replace = "abc is 567 test"
file = IO.read(/home/test/files/abc.csv")
file_final = expected_file.gsub!("abc is".*, string_replace)
File.open(f1, 'w') { |f| f.write(file_final) }
I am getting the error:
"ArgumentError: wrong number of arguments calling * (0 for 1)"
Can anyone help?
Upvotes: 0
Views: 1182
Reputation: 160551
While technically the files are CSV, we can treat CSV files as text, since that's what they are. That makes it much easier to munge them when they're simple.
I'd start with:
File.open('csv.new', 'w') do |fo|
  DATA.each_line do |li|
    fo.puts li.sub('123', '456')
  end
end
__END__
"abc is 123 test", 1
"abc is 123 test", 2
"abc is 123 test", 3
"abc is 123 test", 4
"abc is 123 test", 5
Running it generates a file called "csv.new" which contains:
"abc is 456 test", 1
"abc is 456 test", 2
"abc is 456 test", 3
"abc is 456 test", 4
"abc is 456 test", 5
Instead of:
DATA.each_line do |li|
you'd want to open your original file using:
File.foreach("/home/test/files/abc.csv") do |li|
(DATA and __END__ are a way to access sample data stored at the end of a Ruby script.)
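Putting those two pieces together, a minimal sketch of the line-by-line version might look like the following. The method name rewrite_lines and the temporary file names are placeholders of mine, not anything from your setup; swap in your real paths.

```ruby
require 'tmpdir'

# Copy src to dest one line at a time, replacing the first occurrence of
# `from` with `to` on each line. Both files are closed when the blocks end.
# (rewrite_lines is a hypothetical helper name used for illustration.)
def rewrite_lines(src, dest, from, to)
  File.open(dest, 'w') do |fo|
    File.foreach(src) do |li|
      fo.puts li.sub(from, to)
    end
  end
end

# Demo in a throwaway directory; the file names are placeholders.
Dir.mktmpdir do |dir|
  src  = File.join(dir, 'abc.csv')
  dest = File.join(dir, 'abc.csv.new')
  File.write(src, %("abc is 123 test", 1\n"abc is 123 test", 2\n))

  rewrite_lines(src, dest, '123', '456')
  puts File.read(dest)
end
```

Writing to a new file rather than over the original means a failure part-way through doesn't leave you with a half-written CSV.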
A bare '123' is prone to false-positive hits, and would change sub-strings:
'0123456'.sub('123', '456') # => "0456456"
To counter that, if there is any chance of sub-string matches, you'd want to use a more intelligent search pattern; I'd use a regular expression:
'0123456'.sub(/\b123\b/, '456') # => "0123456"
which now checks to see if there's a word boundary on both sides of 123:
'0 123 456'.sub(/\b123\b/, '456') # => "0 456 456"
Since "123" could change, it'd make sense to assign it to a constant, then interpolate that into the pattern:
TARGET_STR = '123'
'0123456'.sub(/\b#{TARGET_STR}\b/, '456') # => "0123456"
'0 123 456'.sub(/\b#{TARGET_STR}\b/, '456') # => "0 456 456"
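Since the question says the value changes with every CSV while the text around it stays fixed, another option is to anchor on the fixed text instead of the value. This is only a sketch: it assumes the changing value is always a run of digits, and VALUE_PATTERN, swap_value and replace_word are names I made up for illustration. The second helper shows Regexp.escape, which keeps an interpolated target literal if it ever contains regex metacharacters.

```ruby
# Match whatever digits sit between "abc is " and " test".
# Assumption: the changing value is always a run of digits.
VALUE_PATTERN = /(?<=abc is )\d+(?= test)/

# Replace the first matching value on a line with new_value.
def swap_value(line, new_value)
  line.sub(VALUE_PATTERN, new_value.to_s)
end

# Replace a known target, escaped so characters like "." match literally.
def replace_word(text, target, replacement)
  text.sub(/\b#{Regexp.escape(target)}\b/, replacement)
end

puts swap_value('"abc is 123 test", 1', 567)    # old value is 123
puts swap_value('"abc is 98765 test", 2', 567)  # old value is 98765
puts replace_word('abc is 1.5 test', '1.5', '2.0')
puts replace_word('abc is 1x5 test', '1.5', '2.0')  # left alone: "." is literal
```

With this approach you don't need to know the old value in advance, only the new one.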
Because I'm using blocks with open and foreach, Ruby will automatically close the files once the blocks end, resulting in cleaner code and better management of file handles.
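If you want to see the automatic close for yourself, here's a small sketch. Keeping a reference to the handle outside the block is something you wouldn't normally do; it's only there so the state can be inspected afterwards, and handle_after_block is a made-up helper name.

```ruby
require 'tempfile'

# Return the File object that was used inside the block so its state can be
# checked afterwards. Holding on to the handle is for demonstration only.
def handle_after_block(path)
  handle = nil
  File.open(path) do |f|
    handle = f
    f.read
  end
  handle
end

Tempfile.create('demo') do |tmp|
  tmp.write("line\n")
  tmp.flush
  # The block form of File.open closed the handle when the block ended.
  puts handle_after_block(tmp.path).closed?
end
```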
Your code:
file = IO.read(/home/test/files/abc.csv")
file_final = expected_file.gsub!("abc is".*, string_replace)
File.open(f1, 'w') { |f| f.write(file_final) }
... is a ... mess.
read is great for files you know will always be below 1MB in size. If you don't know that, especially if you're working in a production environment where files can be well into the GB range, using line-by-line IO is faster and safer as it sidesteps scalability issues. See "Why is "slurping" a file not a good practice?" for more information.
I don't know what expected_file is, but it'll cause an error because it's undefined: Ruby would raise a NameError before gsub! is ever called.
If expected_file is a String, expected_file.gsub! would mutate expected_file, but assigning the result to file_final wastes CPU. Instead reuse expected_file, or, better, use:
file_final = expected_file.gsub(
"abc is".* is an invalid parameter. Possibly "abc is.*" would be closer, but it appears you're reaching for a regular expression, /abc is.*/. That wouldn't be necessary to change the string, though; /123/ or '123' would be sufficient.
gsub would be overkill here too, since you only need a single replacement per line, so sub would be faster.
Technically,
File.open(f1, 'w') { |f| f.write(file_final) }
will work, but it's much more easily written as
File.write(f1, file_final)
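Two of those points are easy to check in isolation. The sketch below is my reading of what goes wrong, not a trace of your exact script: "abc is".* looks to Ruby like a call to String#* with no argument, and gsub! returns nil when nothing matched, which is why assigning its result is risky. The helper names are made up for illustration.

```ruby
# Calling String#* with no argument raises ArgumentError, which appears to be
# the source of the "wrong number of arguments" message in the question.
def star_error_class
  "abc is".*()
rescue ArgumentError => e
  e.class
end

# gsub returns a new string and leaves the receiver alone; gsub! edits in
# place and returns nil when no substitution was made.
def compare_gsub(text)
  copy = text.dup
  [copy.gsub(/\b123\b/, '456'), copy.gsub!(/\b999\b/, '456')]
end

p star_error_class
p compare_gsub('"abc is 123 test", 1')
```

If file_final had been assigned from a gsub! that matched nothing, the following write would have written an empty file, which is another reason to prefer the non-bang form here.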
You could reduce the code to:
File.write(
'file.csv.new',
File.read('file.csv').gsub(/\b123\b/, '456')
)
which, out of perverseness, could be written as:
File.write('file.csv.new', File.read('file.csv').gsub(/\b123\b/, '456'))
There'd be no improvement in speed, and instead it'd reduce readability.
Upvotes: 1