Joshua Flanagan

Reputation: 8557

How do I run a non-ASCII/Unicode shell command from Ruby on Windows?

I cannot figure out the proper way to encode a shell command to run from Ruby on Windows. The following script reproduces the problem:

# encoding: utf-8

def test(word)
  returned = `echo #{word}`.chomp
  puts "#{word} == #{returned}"
  raise "Cannot roundtrip #{word}" unless word == returned
end

test "good"

test "bÃd"

puts "Success"

# win7, cmd.exe font set to Lucida Console, chcp 65001
# good == good
# bÃd == bÃd

Is this a bug in Ruby, or do I need to encode the command string manually to a specific encoding, before it gets passed to the cmd.exe process?

Update: I want to make it clear that the problem is not with reading the output back into Ruby; it is purely with sending the command to the shell. To demonstrate:

# encoding: utf-8

File.open("bbbÃd.txt", "w") do |f|
  f.puts "nothing to see here"
end

filename = Dir.glob("bbb*.txt").first
command = "attrib #{filename}"

puts command.encoding

puts "#{filename} exists?: #{ File.exist?(filename) }"
system command
File.delete(filename)

#=>
# UTF-8
# bbbÃd.txt exists?: true
# File not found - bbbÃd.txt

You can see that the file gets created correctly and the existence check confirms that Ruby can see it, but when I try to run the attrib command on it, it tries to use a different filename.
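To make the "encode the command string manually" idea concrete, here is a minimal sketch. The helper name encode_for_shell is hypothetical, and both the "locale" default and the Windows-1252 target in the example are assumptions; which encoding cmd.exe actually expects depends on the active code page, so this is something to experiment with rather than a confirmed fix.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the original script): transcode a command
# string to a target encoding before it is handed to the shell.
# The default target, Ruby's "locale" encoding alias, is an assumption.
def encode_for_shell(command, target = Encoding.find("locale"))
  # Characters the target cannot represent are replaced instead of raising.
  command.encode(target, invalid: :replace, undef: :replace)
end

# "\u00C3" is the non-ASCII character used in the filename above.
command = encode_for_shell("attrib bbb\u00C3d.txt", Encoding::Windows_1252)

# Inspect what would be sent; passing it to `system command` is the
# experiment to run on the affected Windows machine.
puts command.encoding
puts command.bytes.inspect
```

Comparing the printed bytes against the bytes of the filename as Windows stores it is one way to see whether the mismatch comes from the encoding of the command string.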

Upvotes: 6

Views: 1507

Answers (2)

peter

Reputation: 42182

I had the same issue using drag-and-drop in Windows. When I dropped a file with Unicode characters in its name, the Unicode characters were replaced by question marks. I tried everything with encoding, changing the drop handler, etc. The only thing that worked was creating a batch file with the following contents.

ruby.exe -Eutf-8 C:\Users\user\myscript.rb %*

The batch file does receive the Unicode characters correctly, as you can see if you put an echo %* followed by a pause at the top of it.

I needed to add the -Eutf-8 parameter to have the filename come in as UTF-8 in the script itself. Having the following lines in my script was not enough:

#encoding: UTF-8
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
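To check which encoding the arguments actually arrive in, with and without -Eutf-8, a small diagnostic like the sketch below can be dropped at the top of the script. describe_args is a made-up helper name for illustration, not part of my original script.

```ruby
# Hypothetical diagnostic helper: report the encoding Ruby assigned to
# each argument and whether its bytes are valid in that encoding.
def describe_args(args)
  args.map do |arg|
    { text: arg, encoding: arg.encoding.name, valid: arg.valid_encoding? }
  end
end

# Show the process-wide defaults, then one row per command-line argument.
puts "default_external: #{Encoding.default_external}"
puts "default_internal: #{Encoding.default_internal.inspect}"
describe_args(ARGV).each { |row| puts row.inspect }
```

Running the script once directly and once through the batch file makes it easy to compare the reported encodings side by side.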

Hope this helps people with similar problems.

Upvotes: 0

Litmus

Reputation: 10986

Try setting the environment variable LC_CTYPE like this:

 LC_CTYPE=en_US.UTF-8

Set this globally in the command shell or inside your Ruby script:

ENV['LC_CTYPE']='en_US.UTF-8' 
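If you would rather not change the environment of the whole Ruby process, Kernel#system also accepts a hash of environment variables as its first argument, so the setting applies to the child process only. A minimal sketch follows; run_with_ctype is a hypothetical helper name, and whether a given Windows tool honors LC_CTYPE at all is an assumption you would need to verify.

```ruby
# Hypothetical helper (name and default are illustrative): run a command
# with LC_CTYPE set for the child process only, leaving ENV untouched.
def run_with_ctype(*command, ctype: "en_US.UTF-8")
  system({ "LC_CTYPE" => ctype }, *command)
end

# Example: ask a child Ruby process what it sees in LC_CTYPE.
# Assumes a `ruby` executable is on the PATH.
run_with_ctype("ruby", "-e", "puts ENV['LC_CTYPE']")
```

Passing the command as separate arguments, as in the example, also bypasses the shell, which can be useful when narrowing down where the characters get mangled.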

Upvotes: 2
