Pekka

Reputation: 449613

An explode() function that ignores characters inside quotes?

Does somebody know a quick and easy explode()-like function that can ignore splitter characters that are enclosed in a pair of arbitrary characters (e.g. quotes)?

Example:

my_explode(
  "/", 
  "This is/a string/that should be/exploded.//But 'not/here',/and 'not/here'"
);

should result in an array with the following members:

This is
a string 
that should be 
exploded.

But 'not/here', 
and 'not/here'

Splitter characters wrapped in single quotes should be left alone and not treated as splitters.

Bonus points for a solution that can deal with two different wrapper characters (an opening and a closing one), e.g.:

(not/here)

A native PHP solution would be preferred, but I don't think such a thing exists!

Upvotes: 9

Views: 7035

Answers (3)

Brilliand

Reputation: 13714

This is near-impossible with preg_split, because you can't tell from the middle of the string whether you're between quotes or not. However, preg_match_all can do the job.

Simple solution for a single type of quote:

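function quoted_explode($subject, $delimiter = ',', $quote = '\'') {
    // A piece is a run of non-delimiter, non-quote characters and/or whole quoted sections.
    $regex = "(?:[^$delimiter$quote]|[$quote][^$quote]*[$quote])+";
    // The pattern is wrapped in /.../, so any literal slash inside it has to be escaped.
    preg_match_all('/'.str_replace('/', '\\/', $regex).'/', $subject, $matches);
    return $matches[0];
}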

That function breaks if the delimiter or quote is one of the characters that are special inside a regex character class (\, ^, - and ], according to http://www.regular-expressions.info/reference.html), so you'll need to escape those. Here's a general solution that escapes those characters and can track multiple kinds of quotes separately:

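// Escape the characters that are special inside a regex character class.
function regex_escape($subject) {
    return str_replace(array('\\', '^', '-', ']'), array('\\\\', '\\^', '\\-', '\\]'), $subject);
}

function quoted_explode($subject, $delimiters = ',', $quotes = '\'') {
    // Outside quotes: any single character that is neither a delimiter nor a quote.
    $clauses = array('[^'.regex_escape($delimiters.$quotes).']');
    // One clause per quote character: opening quote, anything but that quote, closing quote.
    foreach(str_split($quotes) as $quote) {
        $quote = regex_escape($quote);
        $clauses[] = "[$quote][^$quote]*[$quote]";
    }
    // A piece is one or more of those clauses in a row.
    $regex = '(?:'.implode('|', $clauses).')+';
    preg_match_all('/'.str_replace('/', '\\/', $regex).'/', $subject, $matches);
    return $matches[0];
}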

(Note that I keep all of the variables between square brackets to minimize what needs escaping - outside of square brackets, there are about twice as many special characters.)
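
For reference, here is the general function run against the string from the question. The two functions are copied unchanged so the snippet runs on its own. Note that because the pattern ends in +, it never matches an empty piece, so the empty member between the two slashes in "exploded.//But" is dropped.

```php
<?php
// Copied from the answer so this snippet is self-contained.
function regex_escape($subject) {
    return str_replace(array('\\', '^', '-', ']'), array('\\\\', '\\^', '\\-', '\\]'), $subject);
}

function quoted_explode($subject, $delimiters = ',', $quotes = '\'') {
    $clauses[] = '[^'.regex_escape($delimiters.$quotes).']';
    foreach(str_split($quotes) as $quote) {
        $quote = regex_escape($quote);
        $clauses[] = "[$quote][^$quote]*[$quote]";
    }
    $regex = '(?:'.implode('|', $clauses).')+';
    preg_match_all('/'.str_replace('/', '\\/', $regex).'/', $subject, $matches);
    return $matches[0];
}

$input = "This is/a string/that should be/exploded.//But 'not/here',/and 'not/here'";
$parts = quoted_explode($input, '/', "'");

// Six pieces come back; the empty one between "//" is not among them.
print_r($parts);
```

If the empty member matters, the result needs post-processing or a different approach (a splitting one rather than a matching one).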

If you wanted to use ] as a quote, then you probably wanted to use [ as the corresponding quote, but I'll leave adding that functionality as an exercise for the reader. :)
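
One possible way to do that exercise is to drop the regex and walk the string character by character. The sketch below is not the regex approach from above: paired_explode and its $pairs parameter (opening character => closing character) are made-up names for illustration. It assumes single-byte characters, a one-character delimiter, and no nesting of wrappers.

```php
<?php
// Hypothetical helper: split on $delimiter except inside wrapper pairs.
// $pairs maps each opening character to its closing character.
function paired_explode(string $subject, string $delimiter, array $pairs): array
{
    $parts = [];
    $current = '';
    $closer = null; // closing character we are waiting for, or null when outside a wrapper
    $length = strlen($subject);

    for ($i = 0; $i < $length; $i++) {
        $char = $subject[$i];
        if ($closer !== null) {
            // Inside a wrapper: copy everything, and leave the wrapper at its closer.
            if ($char === $closer) {
                $closer = null;
            }
            $current .= $char;
        } elseif (isset($pairs[$char])) {
            // An opening character: remember which closer ends this wrapper.
            $closer = $pairs[$char];
            $current .= $char;
        } elseif ($char === $delimiter) {
            // A delimiter outside any wrapper ends the current piece (empty pieces are kept).
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current;

    return $parts;
}

$result = paired_explode("a/(b/c)/'d/e'//f", '/', ['(' => ')', "'" => "'"]);
print_r($result);
```

Unlike the preg_match_all version, this keeps empty pieces, which is closer to what explode() itself does. An unclosed wrapper simply swallows the rest of the string.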

Upvotes: 5

greg0ire

Reputation: 23255

Something very close, using preg_split: https://www.php.net/manual/en/function.preg-split.php#92632

It handles multiple wrapper characters AND multiple delimiter characters.
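
For comparison, here is a minimal preg_split sketch for the single-quote case. It is not the code from the linked comment. It relies on the PCRE backtracking verbs (*SKIP)(*FAIL) to step over quoted sections, and split_outside_quotes is a made-up name for illustration.

```php
<?php
// Hypothetical helper, not taken from the linked manual comment.
function split_outside_quotes(string $subject): array
{
    // '[^']*'(*SKIP)(*FAIL)  consume a whole quoted section, then reject it as a split point
    // |/                     otherwise, split on a slash
    return preg_split("~'[^']*'(*SKIP)(*FAIL)|/~", $subject);
}

$parts = split_outside_quotes("This is/a string/that should be/exploded.//But 'not/here',/and 'not/here'");
print_r($parts);
```

Because this splits rather than matches, the empty member between the two slashes is kept. A quote that is never closed is not treated as a wrapper, so slashes after it split as usual.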

Upvotes: 0

Ignacio Vazquez-Abrams

Reputation: 799082

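str_getcsv($str, '/', "'")

The third argument is the enclosure character. It defaults to a double quote, so it has to be passed explicitly for the single-quoted example in the question.

There's a recipe for PHP < 5.3 on the linked page.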
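
A small check of how str_getcsv behaves with a custom separator and enclosure. The input string is made up for illustration, and the escape character is passed explicitly so the call does not depend on its default.

```php
<?php
// Separator "/", enclosure "'", escape character "\".
$fields = str_getcsv("a/'b/c'/d", '/', "'", "\\");

// The slash inside the quotes does not split, and the quotes themselves are removed.
print_r($fields);
```

Two differences from the output asked for in the question: the wrapper characters are stripped from the result, and CSV parsing is designed around fields that are enclosed as a whole. How it treats a quote that opens in the middle of a field (as in the question's But 'not/here') is worth checking before relying on it.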

Upvotes: 8
