Radek

Reputation: 11091

invalid chars filter for file/folder name? (ruby)

My script downloads files from the net and saves them under the names supplied by the web server. I need a filter that removes characters that are invalid in file/folder names on Windows NTFS.

I would be happy with a multi-platform filter too.

NOTE: something like htmlentities would be great.

Upvotes: 5

Views: 6864

Answers (4)

Sully

Reputation: 14943

filename_string.gsub(/[^\w\.]/, '_')

Explanation: this replaces every character except word characters (letters, digits, underscore) and dots with an underscore.
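
As a quick sketch of how this behaves, here is the same call applied to a made-up file name (the sample string and the safe_name variable are only for illustration):

```ruby
# Hypothetical file name as it might arrive from a web server.
filename_string = 'report: Q1/2010 "final".pdf'

# Replace every character that is not a word character or a dot.
safe_name = filename_string.gsub(/[^\w.]/, '_')

puts safe_name  # prints report__Q1_2010__final_.pdf
```

Note that Ruby's \w covers only ASCII letters, digits and the underscore, so spaces and accented letters get replaced as well. That makes this filter stricter than NTFS itself requires.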

Upvotes: 15

Mladen Jablanović

Reputation: 44080

I don't know how you plan to use those files later, but probably the most reliable solution is to keep the original filenames in a db table (or another serialized hash) and name the physical files after a unique ID that you (or the database) generate.

PS Another advantage of this approach is that you don't have to worry about files with the same name (or different names that filter down to the same name).
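
A minimal sketch of that idea, assuming a plain Hash stands in for the db table and SecureRandom.uuid supplies the unique ID (store_download and its parameters are hypothetical names, not part of any library):

```ruby
require 'securerandom'
require 'tmpdir'

# Writes data under a generated ID and records the original name in index.
# index is a plain Hash standing in for the db table (an assumption).
def store_download(dir, index, original_name, data)
  id = SecureRandom.uuid                    # hex digits and dashes only
  File.binwrite(File.join(dir, id), data)   # the physical file is named after the ID
  index[id] = original_name                 # the original name is kept as data
  id
end

index = {}
Dir.mktmpdir do |dir|
  id = store_download(dir, index, 'what?:<name>.txt', 'hello')
  puts "#{id} => #{index[id]}"
end
```

Because the name on disk never contains anything taken from the server, no filtering is needed at all, and two downloads with the same original name get different IDs.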

Upvotes: 0

liwp

Reputation: 6926

Like Geo said, by using gsub you can easily convert all invalid characters to a valid character. For example:

file_names.map! do |f|
  f.gsub(/[<invalid characters>]/, '_')
end

You need to replace <invalid characters> with all the possible characters that your file names might have in them that are not allowed on your file system. In the above code each invalid character is replaced with a _.

Wikipedia tells us that the following characters are not allowed on NTFS:

  • U+0000 (NUL)
  • / (slash)
  • \ (backslash)
  • : (colon)
  • * (asterisk)
  • ? (question mark)
  • " (quote)
  • < (less than)
  • > (greater than)
  • | (pipe)

So your gsub call could be something like this:

file_names.map! { |f| f.gsub(/[\x00\/\\:\*\?\"<>\|]/, '_') }

which replaces all the invalid characters with an underscore.
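
The character list is not the whole story on Windows. A slightly fuller sketch, assuming you also want to handle the other control characters, names that end in a dot or a space, and reserved device names such as CON or COM1 (sanitize_filename is a hypothetical helper, not a library method):

```ruby
# Returns a copy of name that should be acceptable as a Windows file name.
# A sketch under the assumptions stated above, not an exhaustive filter.
def sanitize_filename(name, replacement = '_')
  # The characters listed above, widened to all control characters
  # (\x00-\x1F), which Windows also rejects.
  cleaned = name.gsub(/[\x00-\x1F\/\\:*?"<>|]/, replacement)

  # Windows does not accept names that end in a dot or a space.
  cleaned = cleaned.sub(/[. ]+\z/, '')

  # Device names are reserved, with or without an extension.
  if cleaned =~ /\A(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(\.|\z)/i
    cleaned = replacement + cleaned
  end

  # Fall back to the replacement if nothing usable is left.
  cleaned.empty? ? replacement : cleaned
end

puts sanitize_filename('a/b:c*d?.txt')  # prints a_b_c_d_.txt
puts sanitize_filename('con.txt')       # prints _con.txt
```

Different inputs can still map to the same output (for example a:b and a*b), so check for collisions before writing if that matters for your script.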

Upvotes: 22

Geo

Reputation: 96777

I think your best bet would be gsub on the filename. One of the things I know you'll need to delete/replace is :.
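
For the colon alone, a small sketch of both options, deleting versus replacing (the sample name is made up):

```ruby
# Hypothetical name containing colons, e.g. from a timestamp.
name = 'backup 12:30:05.zip'

deleted  = name.delete(':')     # drop the colons entirely
replaced = name.gsub(':', '_')  # or swap each one for an underscore

puts deleted   # prints backup 123005.zip
puts replaced  # prints backup 12_30_05.zip
```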

Upvotes: 0
