Ricardo

Reputation: 173

Regex validation including HTML encoded characters

I'm trying to validate a string (PHP & Regex) and I want option A to validate successfully but option B to not validate at all:

a) fsadgfsd!^_-@<>&lt; 
            OR
   fsadgfsd!^_-@<>&gt;&lt;


b) fsadgfsd!^_-@<>&; 
          OR 
   fsadgfsd!^_-@<>;&&;

So far I have this:

/^[a-zA-Z0-9 \!\^\_\@\<\>-]+$/

This validates everything except the HTML-encoded substrings &lt; and &gt;, and I've hit a brick wall at this stage. Any help is greatly appreciated.

Basically, in addition to my existing regex, which matches my special characters, I also need to match the exact substrings &lt; or &gt;, but not the & or ; characters on their own.

Due to restrictions in the code that I'm using, I'm not in a position to decode the data before validating it either.

Upvotes: 0

Views: 2041

Answers (1)

Louis XIV

Reputation: 2224

$regex = '/^[\w !\^@<>-]+$/';


$string = 'fsadgfsd!^_-@<>&gt;&lt;';
$string = html_entity_decode($string);
if (preg_match($regex, $string))
    echo 'ok';
// echo ok


$string = 'fsadgfsd!^_-@<>;&&;';
$string = html_entity_decode($string);
if (preg_match($regex, $string))
    echo 'ok';
// echo nothing

\w is a shortcut for [a-zA-Z0-9_]
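
As a quick check of that equivalence, here is a minimal sketch that runs both forms over a few sample strings. The samples are made up for illustration, and it assumes the default locale and no u modifier, where \w is limited to ASCII letters, digits and underscore.

```php
<?php
// Compare \w with the explicit ASCII class on a few sample strings.
// The sample strings are illustrative, not taken from the question.
$samples = ['abc', 'ABC_123', 'a-b', 'a b', ''];
$allAgree = true;
foreach ($samples as $sample) {
    $explicit  = preg_match('/^[a-zA-Z0-9_]+$/', $sample);
    $shorthand = preg_match('/^\w+$/', $sample);
    if ($explicit !== $shorthand) {
        $allAgree = false;
    }
}
var_dump($allAgree); // bool(true) for these ASCII samples
```

This is also why the separate \_ from the question's character class is not needed once \w is used.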

EDIT: without html_entity_decode

// Each repetition consumes one allowed character or one exact entity (&gt; or &lt;).
$regex = '/^(?:[\w !^@<>-]|&[gl]t;)+$/';


$string = 'fsadgfsd!^_-@<>&gt;&lt;';
if (preg_match($regex, $string))
    echo 'ok';
// echo ok
echo '-------------';

$string = 'fsadgfsd!^_-@<>;&&;';
if (preg_match($regex, $string))
    echo 'ok';
// echo nothing
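
If the check is needed in several places, it could be wrapped in a small helper. The following is only a sketch: the function name isValidEncoded is hypothetical, and it assumes &gt; and &lt; are the only entities that should be accepted. It uses an alternation so that only the exact entities match (something like &ggg; is rejected), and the D modifier so that $ does not also match before a trailing newline.

```php
<?php
// Hypothetical helper: accepts the allowed characters plus the exact
// entities &lt; and &gt;, without decoding the input first.
function isValidEncoded(string $input): bool
{
    // Each step consumes either one allowed character or one whole entity,
    // so a bare "&" or ";" can never match. The D modifier makes $ match
    // only at the very end of the string.
    $pattern = '/^(?:[\w !^@<>-]|&[gl]t;)+$/D';
    return preg_match($pattern, $input) === 1;
}

var_dump(isValidEncoded('fsadgfsd!^_-@<>&gt;&lt;')); // bool(true)
var_dump(isValidEncoded('fsadgfsd!^_-@<>;&&;'));     // bool(false)
var_dump(isValidEncoded('abc&ggg;'));                // bool(false)
```

Matching one character or one entity per repetition also avoids nesting one quantifier inside another, which can backtrack heavily on long invalid input.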

Upvotes: 1
