GDP

Reputation: 8178

How to use RegEx to strip specific leading and trailing punctuation in PHP

We're scrubbing a ridiculous amount of data and are finding many cases where otherwise clean data is left with irrelevant punctuation at the beginning and end of the final string. Single and double quotes are fine, but leading/trailing dashes, commas, etc. need to be removed.

I've studied the answer at "How can I remove all leading and trailing punctuation?", but I'm unable to find a way to accomplish the same in PHP.

- some text.                dash and period should be removed
"Some Other Text".          period should be removed
it's a matter of opinion    apostrophe should be kept
/ some more text?           Slash should be removed and question mark kept

In short, how can I accomplish this in PHP? The few examples I've found are beyond my RegEx/JS abilities.

Upvotes: 1

Views: 1296

Answers (3)

will

Reputation: 149

If the punctuation to strip could be more than one character long, you could do this:

function trimFormatting($str) {
    // Tokens to strip from either end: "<br>", a comma, or a run of whitespace
    $osl = 0;
    $pat = '(<br>|,|\s+)';
    // Repeat until a pass no longer shortens the string
    while ($osl !== strlen($str)) {
        $osl = strlen($str);
        $str = preg_replace('/^' . $pat . '|' . $pat . '$/i', '', $str);
    }
    return $str;
}

echo trimFormatting('<BR>,<BR>Hello<BR>World<BR>, <BR>');

// will give "Hello<BR>World"

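The routine checks for "<BR>", "," and one or more whitespace characters ("\s+"); the "i" flag makes the match case-insensitive, which is why "<br>" in the pattern also matches "<BR>". The "|" is the OR operator: it appears twice inside $pat to separate the three tokens, and once more in the full expression to join the start-anchored ("^") and end-anchored ("$") halves, so both ends are trimmed in the same call. The loop keeps going until a pass trims nothing more (i.e. there is no further reduction in string length).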
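If the loop feels heavy, the same trimming can probably be done in one pass by letting the regex repeat the group itself. This is only a sketch: trimFormattingOnce is a made-up name, and I swapped "\s+" for "\s" inside the group so the outer "+" does the repeating without nesting quantifiers.

```php
<?php
// Hypothetical single-pass variant of the looped routine; the name is illustrative.
function trimFormattingOnce(string $str): string
{
    // Same three tokens: "<br>", a comma, or a whitespace character.
    $pat = '(?:<br>|,|\s)';
    // The "+" after each group removes a whole run of tokens at once,
    // so no while loop is needed. Fall back to the input if the regex fails.
    return preg_replace('/^' . $pat . '+|' . $pat . '+$/i', '', $str) ?? $str;
}

echo trimFormattingOnce('<BR>,<BR>Hello<BR>World<BR>, <BR>'), PHP_EOL;
// prints "Hello<BR>World"
```

The "<BR>" between "Hello" and "World" is left alone because a match must touch the start or the end of the string.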

Upvotes: 0

Asenar

Reputation: 7010

This is an answer without regex.

You can use the function trim (or a combination of ltrim/rtrim) and pass it all the characters you want to remove. For your example:

$str = trim($str, " \t\n\r\0\x0B-.");

(I assume you also want to remove whitespace and newlines at the beginning and end, so I kept the default mask and appended the dash and period to it.)

See also rtrim and ltrim if you don't want to remove the same charlist at the beginning and the end of your strings.
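As a rough illustration of the ltrim/rtrim combination, here is a sketch run against the four sample strings from the question. The helper name stripEdges and the two masks are my own assumptions: the leading mask includes the slash, while the trailing mask leaves the question mark and quotes alone.

```php
<?php
// Default trim() mask (whitespace characters), reused for both ends.
$whitespace = " \t\n\r\0\x0B";

// Hypothetical helper: strip one character list from the left, another from the right.
function stripEdges(string $str, string $leading, string $trailing): string
{
    return rtrim(ltrim($str, $leading), $trailing);
}

$samples = [
    '- some text.',
    '"Some Other Text".',
    "it's a matter of opinion",
    '/ some more text?',
];

foreach ($samples as $s) {
    // Leading: whitespace, dash, slash, comma, period.
    // Trailing: whitespace, dash, comma, period.
    echo stripEdges($s, $whitespace . '-/,.', $whitespace . '-,.'), PHP_EOL;
}
// some text
// "Some Other Text"
// it's a matter of opinion
// some more text?
```

One thing to watch for: trim treats ".." inside the mask as a range (e.g. "a..z"), so avoid putting two periods next to each other in it.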

Upvotes: 1

php_nub_qq

Reputation: 16017

You can use a character class and adjust it to include whichever characters you want stripped from the start and end.

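$array = array(
    '- some text.',
    '"Some Other Text".',
    'it\'s a matter of opinion',
    '/ some more text?'
);

foreach ($array as $key => $string) {
    // Strip any run of whitespace, periods, commas, dashes and slashes
    // from the start and from the end. Whitespace is included so that
    // "- some text." does not keep its leading space.
    $array[$key] = preg_replace(array(
        '/^[\s.,\-\/]+/',
        '/[\s.,\-\/]+$/'
    ), '', $string);
}

print_r($array);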
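
The two patterns can also be folded into a single expression with "|". The following is a minimal sketch rather than a drop-in solution: the function name stripEdgePunctuation is hypothetical, and the character class (whitespace, period, comma, slash, dash) is an assumption based on the examples in the question.

```php
<?php
// Hypothetical helper; extend the character class with whatever else should go.
function stripEdgePunctuation(string $s): string
{
    // Left alternative is anchored at the start, right alternative at the end.
    // The dash sits last in the class so it is taken literally.
    return preg_replace('/^[\s.,\/-]+|[\s.,\/-]+$/', '', $s) ?? $s;
}

echo stripEdgePunctuation('- some text.'), PHP_EOL;       // some text
echo stripEdgePunctuation('/ some more text?'), PHP_EOL;  // some more text?
```

Quotes, apostrophes and question marks are not in the class, so they survive at either end.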

Upvotes: 0
