Converting gsub() pattern from ruby 1.8 to 2.0

Question

I have a Ruby program that I'm trying to upgrade from Ruby 1.8 to Ruby 2.0.0-p247.

This works just fine in 1.8.7:

 begin
   ARGF.each do |line|
     # a collection of peculiarities, appended as they appear in data
     line.gsub!("\x92", "'")
     line.gsub!("\x96", "-")
     puts line
   end
 rescue => e
   $stderr << "exception on line #{$.}:\n"
   $stderr << "#{e.message}:\n"
   $stderr << @line
 end

But under Ruby 2.0, this raises an exception when it encounters a 0x92 or 0x96 byte in a data file that otherwise appears to contain only ASCII:

 invalid byte sequence in UTF-8

I have tried all manner of things (double backslashes, using a regex object instead of the string, force_encoding(), etc.) and am stumped.

Can anybody fill in the missing puzzle piece for me?

Thanks.

=============== additions: 2013-09-25 ============

Changing \x92 to \u2019 did not fix the problem.

The program does not error until it actually hits a 0x92 or 0x96 byte in the input file, so I'm confused as to how the character pattern in the string can be the problem when hundreds of thousands of lines of input data are matched against the patterns without incident.
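
The behaviour described above is consistent with the input line itself being invalid UTF-8, rather than the pattern being at fault. As a minimal sketch of one possible workaround, the snippet below assumes the stray bytes are Windows-1252 punctuation (0x92 as a right single quote, 0x96 as an en dash); the question does not confirm that encoding. The helper name normalize_line and the sample string are hypothetical and not part of the original program.

```ruby
# Hypothetical helper, not from the original program.
# ASSUMPTION: the input bytes are Windows-1252; the question does not say so.
def normalize_line(raw)
  # Re-tag a copy of the raw bytes as Windows-1252, then transcode to UTF-8.
  utf8 = raw.dup.force_encoding("Windows-1252").encode("UTF-8")
  # After transcoding, 0x92 is U+2019 and 0x96 is U+2013.
  utf8.gsub("\u2019", "'").gsub("\u2013", "-")
end

# Made-up sample line; String#b yields the raw bytes without a UTF-8 tag.
sample = "It\x92s 10 \x96 20\n".b
puts normalize_line(sample)
```

Inside the ARGF loop this would be called as puts normalize_line(line). If the file is not actually Windows-1252, the encode step may raise on bytes that encoding leaves undefined, so this is only a starting point.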
