Evgenii

Reputation: 37329

How to copy files with Unicode characters in file names in Ruby?

I cannot copy files that have Unicode characters in their names using Ruby 1.9.2p290 on Windows 7.

For example, I have two files in a dir:

file
ハリー・ポッターと秘密の部屋

(The second name contains Japanese characters, in case they do not display for you.)

Here is the code:

> entries = Dir.entries(path) - %w{ . .. }
> entries[0]
=> "file"
> entries[1]
=> "???????????????" # <--- what?

> File.file? entries[0]
=> true
> File.file? entries[1]
=> false   # <---  !!! Ruby cannot see the file, so it will not be copied

> entries[1].encoding.name
=> "Windows-1251"
> Encoding.find('filesystem').name
=> "Windows-1251"

As you can see, my Ruby filesystem encoding is "Windows-1251", which is an 8-bit encoding and cannot represent Japanese characters. Setting the default_external and default_internal encodings to 'utf-8' does not help.

How can I copy those files from Ruby?

Update

I found a solution. It works if I use Dir.glob or Dir[] instead of Dir.entries: file names are then returned in UTF-8 encoding and the files can be copied.
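For anyone who wants to try this, here is a minimal sketch of the copy step built on the Dir.glob("*") listing. The helper name copy_cwd_files and its dest_dir parameter are made up for illustration, and it assumes the source directory is already the current working directory. Note that Dir.glob("*") skips dotfiles.

```ruby
require "fileutils"

# Hypothetical helper (not part of Ruby): copy every regular file in the
# current working directory into dest_dir, using the Dir.glob("*") listing.
# Returns the names that were copied.
def copy_cwd_files(dest_dir)
  dest_dir = File.expand_path(dest_dir)
  FileUtils.mkdir_p(dest_dir)

  # Keep regular files only; subdirectories are skipped.
  Dir.glob("*").select { |name| File.file?(name) }.each do |name|
    FileUtils.cp(name, File.join(dest_dir, name))
  end
end
```

I have not verified this on Ruby 1.9.2 with a Windows-1251 filesystem encoding, so treat it as a starting point rather than a confirmed fix.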

Update #2

My Dir.glob solution turns out to be quite limited. It only works with the "*" pattern:

Dir.glob("*") # <--- Shows Unicode names correctly
Dir.glob("c:/test/*") # <--- Does not work for Unicode names

Upvotes: 5

Views: 1916

Answers (2)

cbley

Reputation: 4608

It's been a while, but I was looking into the same problem and it was anything but obvious how to do it.

It turns out that you can specify an encoding when you call Dir.entries in Ruby >= 2.1.

Dir.entries(path, encoding: Encoding::UTF_8)
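To connect this back to the original goal of copying, here is a rough sketch that lists a directory with the encoding: option and copies each regular file. The helper name copy_files_utf8 and both parameters are hypothetical. It joins each name onto src_dir before calling File.file?, so it does not depend on the current working directory.

```ruby
require "fileutils"

# Hypothetical helper (not part of Ruby): list src_dir with an explicit
# UTF-8 encoding, then copy each regular file into dest_dir.
# Returns the names that were copied.
def copy_files_utf8(src_dir, dest_dir)
  FileUtils.mkdir_p(dest_dir)

  # Ask for UTF-8 names instead of the default filesystem encoding.
  names = Dir.entries(src_dir, encoding: Encoding::UTF_8) - %w[. ..]

  names.select { |name| File.file?(File.join(src_dir, name)) }.each do |name|
    FileUtils.cp(File.join(src_dir, name), File.join(dest_dir, name))
  end
end
```

Usage would look something like copy_files_utf8("c:/test", "c:/backup"), where both paths are placeholders.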

Upvotes: 0

Darshan Rivka Whittle

Reputation: 34031

Not so much a real solution, but as a workaround, given:

Dir.glob("*") # <--- Shows Unicode names correctly
Dir.glob("c:/test/*") # <--- Does not work for Unicode names

is there any reason you can't do this:

Dir.chdir("c:/test/")
Dir.glob("*")

?
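If changing the working directory for the rest of the script is a concern, the block form of Dir.chdir restores the previous directory when the block exits. A small sketch, with the helper name glob_in_dir and its parameters being my own invention:

```ruby
# Hypothetical helper (not part of Ruby): switch into src_dir only for the
# duration of the block, glob with a relative pattern, and return absolute
# paths so callers do not depend on the working directory.
def glob_in_dir(src_dir, pattern = "*")
  base = File.expand_path(src_dir)

  # Dir.chdir with a block returns the block's value and then restores
  # the original working directory.
  Dir.chdir(base) do
    Dir.glob(pattern).map { |name| File.join(base, name) }
  end
end
```

Whether the relative pattern sidesteps the encoding problem on the asker's setup is an assumption based on the observation in the question, not something I have tested.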

Upvotes: 1
