Reputation: 5073
I'd like to use a PHP function (or anything else that works) to remove all HTML code and special characters, so that I am left with alphanumeric output only.
$des = "Hello world)<b> (*&^%$#@! it's me: and; love you.<p>";
I want the output to become: Hello world it s me and love you
(just A-Z, a-z, 0-9 and whitespace)
I've tried strip_tags,
but it removes only the HTML tags:
$clear = strip_tags($des);
echo $clear;
So is there any way to do it?
Upvotes: 51
Views: 121172
Reputation: 1131
Remove all special characters and collapse repeated spaces, written on a single line:
trim(preg_replace('/ +/', ' ', preg_replace('/[^A-Za-z0-9 ]/', ' ',
urldecode(html_entity_decode(strip_tags($string))))));
Upvotes: 0
Reputation: 1
$string = preg_replace('/[^a-zA-Z0-9\s]/', '', $string);
This removes only the special characters and keeps the whitespace between the words.
Upvotes: 0
Reputation: 16312
All the other solutions are creepy because they come from someone who arrogantly assumes that English is the only language in the world :)
All of those solutions also strip letters with diacritics, such as ç or à.
If the goal is only to remove the HTML, the solution given in the PHP documentation is simply strip_tags (note that it leaves punctuation and other special characters in place):
$clear = strip_tags($des);
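For a version that also removes special characters but keeps letters with diacritics, a Unicode-aware character class is one option. This is only a sketch: the function name stripToUnicodeAlnum is made up for illustration, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): keep letters and digits
// from any script, plus single spaces between words.
function stripToUnicodeAlnum(string $text): string
{
    // Remove HTML tags, then turn entities such as &amp; into characters.
    $text = strip_tags($text);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // \p{L} = any letter, \p{M} = combining marks, \p{N} = any number.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $text = preg_replace('/[^\p{L}\p{M}\p{N}\s]+/u', ' ', $text);
    // Collapse runs of whitespace into one space and trim the ends.
    $text = preg_replace('/\s+/u', ' ', $text);
    return trim($text);
}

echo stripToUnicodeAlnum('<p>Ça va, à bientôt!</p>');
```

With the sample input, the tags, comma and exclamation mark are dropped while Ç, à and ô survive.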
Upvotes: 6
Reputation: 1
To allow periods and any other characters, just add them to the character class, like so:
change: '#[^a-zA-Z ]#'
to: '#[^a-zA-Z .()!]#'
Upvotes: 0
Reputation: 547
Here's a function I've been using, put together from various threads around the net. It removes everything, including all tags, and leaves you with a clean phrase. Does anyone know how to modify this script to allow periods (.)? In other words, keep everything as is, but leave periods and other punctuation, such as ! or a comma, alone. Let me know.
function stripAlpha( $item )
{
    $search = array(
        '@<script[^>]*?>.*?</script>@si' // Strip out javascript
        ,'@<style[^>]*?>.*?</style>@siU' // Strip style tags properly
        ,'@<[\/\!]*?[^<>]*?>@si' // Strip out HTML tags
        ,'@<![\s\S]*?--[ \t\n\r]*>@' // Strip multi-line comments including CDATA
        ,'/\s{2,}/' // Remove runs of two or more whitespace characters
        ,'/(\s){2,}/'
    );
    $pattern = array(
        '#[^a-zA-Z ]#' // Non alpha characters
        ,'/\s+/' // More than one whitespace
    );
    $replace = array(
        ''
        ,' '
    );
    $item = preg_replace( $search, '', html_entity_decode( $item ) );
    $item = trim( preg_replace( $pattern, $replace, strip_tags( $item ) ) );
    return $item;
}
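One possible way to allow periods and other punctuation is to pass the extra characters in as a parameter and build the character class from them. The sketch below is an assumption on my part, not part of the original function: stripAlphaKeeping and its $allowed parameter are hypothetical names, and it skips the script/style patterns for brevity.

```php
<?php
// Hypothetical variant of stripAlpha(): $allowed lists extra
// punctuation characters that should survive the cleanup.
function stripAlphaKeeping(string $item, string $allowed = '.,!'): string
{
    // Same order as the original: decode entities, then strip tags.
    $item = strip_tags(html_entity_decode($item));
    // preg_quote() escapes regex metacharacters (including '-' and ']'),
    // so the characters are safe inside the character class.
    $class = preg_quote($allowed, '#');
    // Drop everything that is not a letter, a space, or an allowed character.
    $item = preg_replace('#[^a-zA-Z ' . $class . ']#', '', $item);
    // Collapse repeated whitespace and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $item));
}

echo stripAlphaKeeping('<p>Hello, world! It works.</p> #@%');
```

Calling it with a different second argument, for example '.', keeps only periods in addition to letters and spaces.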
Upvotes: 1
Reputation: 1127
You can do it in a single line :) This is especially useful for values from GET or POST requests:
$clear = preg_replace('/[^A-Za-z0-9\-]/', '', urldecode($_GET['id']));
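Note that PHP has already URL-decoded the values in $_GET, and the PHP manual warns against calling urldecode() on them a second time. A minimal sketch of the same whitelist without the extra decode follows; cleanId is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical helper: keep only ASCII letters, digits and hyphens.
function cleanId(string $raw): string
{
    return preg_replace('/[^A-Za-z0-9\-]/', '', $raw);
}

// In a web request this would be: $clear = cleanId($_GET['id'] ?? '');
echo cleanId('abc-123<script>');
```

The pattern removes spaces as well, so it suits identifiers rather than free text.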
Upvotes: 1
Reputation: 22081
To go into more detail than the example above, suppose this is your string:
$string = '<div>This..</div> <a>is<a/> <strong>hello</strong> <i>world</i> ! هذا هو مرحبا العالم! !@#$%^&&**(*)<>?:";p[]"/.,\|`~1@#$%^&^&*(()908978867564564534423412313`1`` "Arabic Text نص عربي test 123 و,.m,............ ~~~ ٍ،]ٍْ}~ِ]ٍ}"; ';
Code:
echo preg_replace('/[^A-Za-z0-9 !@#$%^&*().]/u','', strip_tags($string));
Allows:
English letters (uppercase and lowercase), digits 0 to 9, spaces, and the characters ! @ # $ % ^ & * ( ) and the period
Removes:
All HTML tags, and all special characters other than those above
Upvotes: 1
Reputation: 24951
A regex replace is probably the better tool here:
// Strip HTML Tags
$clear = strip_tags($des);
// Clean up things like &amp;
$clear = html_entity_decode($clear);
// Strip out any url-encoded stuff
$clear = urldecode($clear);
// Replace non-AlNum characters with space
$clear = preg_replace('/[^A-Za-z0-9]/', ' ', $clear);
// Replace Multiple spaces with single space
$clear = preg_replace('/ +/', ' ', $clear);
// Trim the string of leading/trailing space
$clear = trim($clear);
Or, in one go:
$clear = trim(preg_replace('/ +/', ' ', preg_replace('/[^A-Za-z0-9 ]/', ' ', urldecode(html_entity_decode(strip_tags($des))))));
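To check the steps against the string from the question, they can be wrapped in a small function. This is just a sketch; the name toAlnumWords is hypothetical and not from the original answer.

```php
<?php
// Hypothetical wrapper around the steps above.
function toAlnumWords(string $des): string
{
    $clear = strip_tags($des);                               // remove HTML tags
    $clear = html_entity_decode($clear);                     // decode entities such as &amp;
    $clear = urldecode($clear);                              // decode %XX sequences
    $clear = preg_replace('/[^A-Za-z0-9]/', ' ', $clear);    // non-alphanumerics become spaces
    $clear = preg_replace('/ +/', ' ', $clear);              // collapse repeated spaces
    return trim($clear);                                     // drop leading/trailing space
}

// Single quotes keep the $ in the sample text literal.
echo toAlnumWords('Hello world)<b> (*&^%$#@! it\'s me: and; love you.<p>');
// prints: Hello world it s me and love you
```

Replacing with a space rather than an empty string is what turns "it's" into "it s" instead of "its", which matches the output asked for in the question.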
Upvotes: 151
Reputation: 3053
Strip out tags, leave only alphanumeric characters and space:
$clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags($des));
Edit: all credit to DaveRandom for the perfect solution...
$clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags(html_entity_decode($des)));
Upvotes: 13