Reputation: 5409
I want to disallow all symbols in a string, and instead of going and disallowing each one I thought it'd be easier to just allow alphanumeric characters (a-z A-Z 0-9).
How would I go about parsing a string and converting it to one which only has allowed characters? I also want to convert any spaces into _.
At the moment I have:
function parseFilename($name) {
    $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    $name = str_replace(' ', '_', $name);
    return $name;
}
Thanks
Upvotes: 0
Views: 2551
Reputation: 48073
Replacing spaces with underscores does not require the might of the regex engine; str_replace() can handle it before the regex pass runs.
Purging every character that is not alphanumeric or an underscore is concisely handled by \W, which matches any character not in a-z, A-Z, 0-9, or _.
Code: (Demo)
function sanitizeFilename(string $name): string {
    return preg_replace(
        '/\W+/',
        '',
        str_replace(' ', '_', $name)
    );
}
echo sanitizeFilename('This/is My     1! FilenAm3');
Output:
Thisis_My_____1_FilenAm3
...but if you want to condense consecutive spaces and replace them with a single underscore, then use regex. (Demo)
function sanitizeFilename(string $name): string {
    return preg_replace(
        ['/ +/', '/\W+/'],
        ['_', ''],
        $name
    );
}
echo sanitizeFilename('This/has a Gap !n 1t');
Output:
Thishas_a_Gap_n_1t
Upvotes: 0
Reputation: 7
Try handling this on the HTML side with the pattern attribute of the input element:
pattern="[A-Za-z]{8}" title="Eight letter country code">
Upvotes: -2
Reputation: 281
You could do both replacements at once by passing arrays as the find / replace params to preg_replace():
$str = 'abc def+ghi&jkl ...z';
$find = array( '#[\s]+#','#[^\w]+#' );
$replace = array( '_','' );
$newstr = preg_replace( $find,$replace,$str );
print $newstr;
// outputs:
// abc_defghijkl_z
\s matches whitespace (each run is replaced with a single underscore), and as @F.J described, [^\w] is anything "not a word character" (replaced with an empty string).
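The order of the two patterns matters here, because preg_replace() applies each pattern in turn to the result of the previous one. A minimal sketch of that point; the function names underscoreThenStrip() and stripThenUnderscore() are hypothetical and exist only for illustration:

```php
<?php
// Hypothetical helper: whitespace first, then strip non-word characters.
function underscoreThenStrip(string $str): string {
    return preg_replace(['#\s+#', '#[^\w]+#'], ['_', ''], $str);
}

// Hypothetical helper: the same patterns in the opposite order.
// [^\w] also matches whitespace, so the spaces are already gone
// before the \s+ pattern could turn them into underscores.
function stripThenUnderscore(string $str): string {
    return preg_replace(['#[^\w]+#', '#\s+#'], ['', '_'], $str);
}

echo underscoreThenStrip('abc def+ghi&jkl ...z'), "\n";  // abc_defghijkl_z
echo stripThenUnderscore('abc def+ghi&jkl ...z'), "\n";  // abcdefghijklz
```

So the whitespace pattern has to come first in the array if the underscores are meant to survive.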
Upvotes: 1
Reputation: 208665
preg_replace() is the way to go here. The following should do what you want:
function parseFilename($name) {
    $name = str_replace(' ', '_', $name);
    $name = preg_replace('/[^\w]+/', '', $name);
    return $name;
}
[^\w] is equivalent to [^a-zA-Z0-9_], which will match any character that is not alphanumeric or an underscore. The + after it means match one or more; this should be slightly more efficient than replacing each character individually.
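If you would rather have the allowed set visible in the pattern itself, the character class can be spelled out. A minimal sketch, assuming the same two-step approach as in the function above; the name parseFilenameExplicit() is hypothetical:

```php
<?php
// Hypothetical variant of parseFilename() that spells out the allowed set.
function parseFilenameExplicit(string $name): string {
    // Turn each space into an underscore first, so it survives the strip.
    $name = str_replace(' ', '_', $name);
    // Remove every run of characters outside a-z, A-Z, 0-9 and underscore.
    return preg_replace('/[^a-zA-Z0-9_]+/', '', $name);
}

echo parseFilenameExplicit('my file (v2).txt'), "\n";  // my_file_v2txt
```

Note that the dot is removed as well, so if the input carries a file extension you would probably want to split it off and re-attach it separately.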
Upvotes: 0