Reputation: 36524
All platforms (and probably filesystems) have different rules regarding which characters are allowed as a file or directory name. Furthermore, some systems have a blacklist of filenames: for example on Windows, com1
is an invalid file name.
Is there a way to know programmatically the rules to compute a valid file name in PHP?
As an alternative, is there a trustable list of safe characters that are guaranteed to be valid on any system, apart from [0-9a-zA-Z]
?
Please note that a solution based on try to save, if it fails, the filename is invalid is not acceptable for my use case.
Upvotes: 4
Views: 1260
Reputation: 467
Already answered well, Sanitizing strings to make them URL and filename safe?
I found this larger function in the Chyrp code:
/** * Function: sanitize * Returns a sanitized string, typically for URLs. * * Parameters: * $string - The string to sanitize. * $force_lowercase - Force the string to lowercase? * $anal - If set to *true*, will remove all non-alphanumeric characters. */ function sanitize($string, $force_lowercase = true, $anal = false) { $strip = array("~", "`", "!", "@", "#", "$", "%", "^", "&", "*", "(", ")", "_", "=", "+", "[", "{", "]", "}", "\\", "|", ";", ":", "\"", "'", "‘", "’", "“", "”", "–", "—", "—", "–", ",", "<", ".", ">", "/", "?"); $clean = trim(str_replace($strip, "", strip_tags($string))); $clean = preg_replace('/\s+/', "-", $clean); $clean = ($anal) ? preg_replace("/[^a-zA-Z0-9]/", "", $clean) : $clean ; return ($force_lowercase) ? (function_exists('mb_strtolower')) ? mb_strtolower($clean, 'UTF-8') : strtolower($clean) : $clean; }
and this one in the wordpress code
/** * Sanitizes a filename replacing whitespace with dashes * * Removes special characters that are illegal in filenames on certain * operating systems and special characters requiring special escaping * to manipulate at the command line. Replaces spaces and consecutive * dashes with a single dash. Trim period, dash and underscore from beginning * and end of filename. * * @since 2.1.0 * * @param string $filename The filename to be sanitized * @return string The sanitized filename */ function sanitize_file_name( $filename ) { $filename_raw = $filename; $special_chars = array("?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}"); $special_chars = apply_filters('sanitize_file_name_chars', $special_chars, $filename_raw); $filename = str_replace($special_chars, '', $filename); $filename = preg_replace('/[\s-]+/', '-', $filename); $filename = trim($filename, '.-_'); return apply_filters('sanitize_file_name', $filename, $filename_raw); }
Update Sept 2012
Alix Axel has done some incredible work in this area. His phunction framework includes several great text filters and transformations.
Upvotes: 2