Billy Bones

Reputation: 2975

Base64 Encoded String for Filename

I can't think of an OS (Linux, Windows, Unix) where this would cause an issue, but maybe someone here can tell me if this approach is undesirable.

I would like to use a base64 encoded string as a filename. Something like gH9JZDP3+UEXeZz3+ng7Lw==. Is this likely to cause issues anywhere?

Edit: I will likely keep this to a max of 24 characters

Edit: It looks like I have a character that will cause issues. The function that generates my string is producing strings like J2db3/pULejEdNiB+wZRow==. Notice that this contains a /, which is going to cause issues.

According to this site, / is a valid base64 character, so I will not be able to use a base64 encoded string for a filename.

Upvotes: 2

Views: 3128

Answers (3)

Billy Bones

Reputation: 2975

No. You cannot safely use a standard base64 encoded string as a filename, because / is a valid base64 character and it is also the path separator on most file systems.

https://base64.guru/learn/base64-characters

Alternatives:

You could use base64 and then replace the unwanted characters, but a better option would be to hex encode your original string using a function like bin2hex().
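A rough sketch of both options, written in Python rather than PHP (here `bytes.hex()` plays the role of `bin2hex()`). The sample bytes are simply the decoded form of the string from the question, and the `-` and `_` replacements are one possible choice, not a requirement:

```python
import base64

# The 16 raw bytes behind the example string from the question.
raw = base64.b64decode("J2db3/pULejEdNiB+wZRow==")

# Option 1: hex encoding uses only 0-9 and a-f, so it is safe in a filename.
# Note that 16 bytes become 32 hex characters (base64 needs only 24).
hex_name = raw.hex()

# Option 2: keep base64 but swap out the characters that cause trouble.
b64_name = (
    base64.b64encode(raw)
    .decode("ascii")
    .replace("+", "-")
    .replace("/", "_")
)

print(hex_name)
print(b64_name)
```

The hex form is longer, so it would not fit the 24-character limit mentioned in the question without hashing or truncating the input first.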

Upvotes: 4

DadiBit

Reputation: 811

The official RFC 4648 states:

An alternative alphabet has been suggested that would use "~" as the 63rd character. Since the "~" character has special meaning in some file system environments, the encoding described in this section is recommended instead. The remaining unreserved URI character is ".", but some file system environments do not permit multiple "." in a filename, thus making the "." character unattractive as well.

I also found this on Server Fault:

There is no such thing as a "Unix" filesystem. Nor a "Windows" filesystem come to that. Do you mean NTFS, FAT16, FAT32, ext2, ext3, ext4, etc. Each have their own limitations on valid characters in names.

Also, your question title and question refer to two totally different concepts? Do you want to know about the subset of legal characters, or do you want to know what wildcard characters can be used in both systems?

http://en.wikipedia.org/wiki/Ext3 states "all bytes except NULL and '/'" are allowed in filenames.

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx describes the generic case for valid filenames "regardless of the filesystem". In particular, the following characters are reserved < > : " / \ | ? *

Windows also places restrictions on not using device names for files: CON, PRN, AUX, NUL, COM1, COM2, COM3, etc.

Most commands in Windows and Unix based operating systems accept * as a wildcard. Windows accepts % as a single char wildcards, whereas shells for Unix systems use ? as single char wildcard.
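The rules quoted above can be turned into a quick check. This is only a sketch: `is_portable_name` is a hypothetical helper, the `COM1`-`COM9` and `LPT1`-`LPT9` ranges are an assumption that spells out the "etc." in the quote, and the check is not exhaustive (for example, it ignores control characters and trailing dots or spaces).

```python
# Characters reserved on Windows, as listed in the quoted passage.
RESERVED_CHARS = set('<>:"/\\|?*')

# Reserved device names; the COM/LPT ranges are an assumption filling in "etc.".
DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_portable_name(name: str) -> bool:
    """Return True if name avoids the reserved characters and device names."""
    if not name or "\0" in name:
        return False
    if any(ch in RESERVED_CHARS for ch in name):
        return False
    # Device names are reserved even with an extension, e.g. "con.txt".
    stem = name.split(".", 1)[0].upper()
    return stem not in DEVICE_NAMES


print(is_portable_name("gH9JZDP3+UEXeZz3+ng7Lw=="))
print(is_portable_name("J2db3/pULejEdNiB+wZRow=="))
```

Of the two strings from the question, only the one containing / fails this check; + and = are not in the reserved list.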

And this other one:

Base64 only contains A–Z, a–z, 0–9, +, / and =. So the list of characters not to be used is: all possible characters minus the ones mentioned above.

For special purposes . and _ are possible, too.

This means that instead of the standard / base64 character, you should use _ or ., on both UNIX and Windows.

Many programming languages let you replace every / with _ or . directly; since it is a single-character substitution, a simple loop is enough.
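Some standard libraries already provide this substitution. As one example, Python's `base64.urlsafe_b64encode` implements the RFC 4648 "URL and filename safe" alphabet, which uses - in place of + and _ in place of /. A minimal sketch, assuming the name is derived from 16 raw bytes as in the question; stripping the = padding is an optional extra here, not something the RFC requires:

```python
import base64

# The 16 raw bytes behind the example string from the question.
raw = base64.b64decode("J2db3/pULejEdNiB+wZRow==")

# URL- and filename-safe alphabet: "-" replaces "+", "_" replaces "/".
safe = base64.urlsafe_b64encode(raw).decode("ascii")

# Optionally drop the "=" padding to shorten the name.
name = safe.rstrip("=")

# To decode later, restore the padding to a multiple of four characters.
padded = name + "=" * (-len(name) % 4)
restored = base64.urlsafe_b64decode(padded)

print(name)
```

Keep in mind that base64 is case-sensitive, so on a case-insensitive file system two different encoded names could in principle refer to the same file.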

Upvotes: 1

Mekroebo

Reputation: 461

In Windows, you should be fine as long as you conform to the Windows naming conventions: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions.

As far as I know, a base64 encoded string does not contain any of the reserved characters.

The thing that is probably going to be a problem is the length of the file name.

Upvotes: -2
