TJR

Reputation: 6577

RegEx - get count of elements which are not wrapped

For example I have a string like this:

first:second:third"test:test":fourth

I want to count the ':' characters and later split on each ':' to get the individual strings.

This is my regex:

/(.*):(.*)/iU

I don't know if this is the best solution, but it works. There is a difference between a bare ':' and a ':' inside quotes, as in "[...] : [...]", so I need to tell them apart. I realized that my regex also counts a : when it sits between two " characters.

I tried to solve this with this regex:

/(((.*)[^"]):((.*)[^"]))/iU

I thought this was the right approach, but it isn't. I tried to learn the regex syntax, but I don't understand this problem.

The regex is meant to say: search for ':' where anything can be in front of it and after it, EXCEPT when a " is in front of it AND a " is after it.

Maybe you can help me.

Edit: I use my regex in PHP, in case that matters.
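
To make the problem concrete, here is a minimal sketch of what a plain, quote-unaware split does with the sample string. It only uses the built-in explode() and substr_count(); the variable names are chosen for illustration.

```php
<?php
// Sample string from the question.
$subject = 'first:second:third"test:test":fourth';

// A plain split ignores the quotes, so the quoted part is cut in two.
$parts = explode(':', $subject);
print_r($parts);

// Likewise, a plain count includes the colon inside the quotes.
$colons = substr_count($subject, ':');
echo $colons, "\n";
```

This yields five pieces and four colons, whereas the wanted result is four pieces and three colons.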

Upvotes: 2

Views: 246

Answers (3)

Pierluc SS

Reputation: 3176

This regex should do it. If it matches your needs and you want additional explanation, just ask :)

(?<=:|^)(?<!"[^:][^"]+:)\w+?(?=:|"|$)

This is the test string I used:

"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth

And these are the 6 resulting matches:

first
second
third
fourth
fifth
sixth
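
One caveat for PHP: PCRE lookbehinds generally need a fixed (or at least bounded) length, so the `[^"]+` inside the lookbehind above may be rejected by the preg_* functions. As a sketch of an alternative (a different technique, not the pattern above), the backtracking verbs (*SKIP)(*F) can be used to skip quoted sections and match only the remaining words. The variable names are assumptions, and the sketch assumes the quotes are balanced.

```php
<?php
$subject = '"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth';

// Left alternative: consume a whole "..." section, then discard it with (*SKIP)(*F).
// Right alternative: match a run of word characters outside quotes.
$pattern = '/"[^"]*"(*SKIP)(*F)|\w+/';

preg_match_all($pattern, $subject, $matches);
print_r($matches[0]);
```

On the test string this returns the same six words listed above.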

Upvotes: 2

Tim Pietzcker

Reputation: 336128

How about using

$result = preg_split(
    '/:       # Match a colon
    (?=       # only if followed by
     (?:      # the following group:
      [^"]*"  #  Any number of characters except ", followed by one "
      [^"]*"  #  twice in a row (to ensure even number of "s)
     )*       # (repeated zero or more times)
     [^"]*    # followed by any number of non-quotes until...
     $        # the end of the string.
    )         # End of lookahead assertion
    /x', 
    $subject);

which will give you the result

first
second
third"test:test"
fourth

directly?

This regex splits on a : only if it's followed by an even number of quotes. This means that it won't split on a : inside a quoted string.
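
Since the question also asks for the count, here is a minimal sketch that reuses the same pattern for both tasks, written on one line without the /x comments. It assumes the quotes in the input are balanced; the variable names are for illustration only.

```php
<?php
$subject = 'first:second:third"test:test":fourth';

// Same lookahead as above: a colon followed by an even number of quotes.
$pattern = '/:(?=(?:[^"]*"[^"]*")*[^"]*$)/';

// Split into fields on colons outside quotes.
$fields = preg_split($pattern, $subject);

// Count those colons directly; preg_match_all() returns the number of matches.
$count = preg_match_all($pattern, $subject);

print_r($fields);
echo $count, "\n";
```

For the sample string, $count is 3 and $fields holds the four strings shown above.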

Upvotes: 4

Shiplu Mokaddim

Reputation: 57650

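I love parsing text, so I wrote a parser for you.

$sample = 'first:second:third"test:test":fourth';
$len = strlen($sample);
$buffer = "";
$output = array();
$quote = null; // the quote character we are currently inside, or null
for ($i = 0; $i < $len; $i++) {
    $ch = $sample[$i];
    if ($ch == '"' or $ch == "'") {
        // Open a quoted section, or close it if this matches the opening quote.
        if ($quote === null) {
            $quote = $ch;
        } elseif ($quote === $ch) {
            $quote = null;
        }
        $buffer .= $ch;
    } elseif ($quote === null and $ch == ':') {
        // A colon outside quotes ends the current field.
        $output[] = $buffer;
        $buffer = "";
    } else {
        $buffer .= $ch;
    }
}
// Compare against "" so that a final field such as "0" is not dropped.
if ($buffer !== "") $output[] = $buffer;

print_r($output);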

See the code in action. Also note that for huge strings a regular expression will perform poorly.
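
If the parser is needed in more than one place, it could be wrapped in a function. The following is only a sketch: the name splitOutsideQuotes() is hypothetical, it handles double quotes only, it assumes a single-character delimiter, and unlike the loop above it always keeps the last field, even when that field is empty.

```php
<?php
// Hypothetical helper: split $text on $delimiter, ignoring delimiters inside double quotes.
function splitOutsideQuotes(string $text, string $delimiter = ':'): array
{
    $fields = [];
    $buffer = '';
    $inQuotes = false;
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $ch = $text[$i];
        if ($ch === '"') {
            $inQuotes = !$inQuotes;   // toggle on every double quote
            $buffer .= $ch;
        } elseif ($ch === $delimiter && !$inQuotes) {
            $fields[] = $buffer;      // a delimiter outside quotes ends the field
            $buffer = '';
        } else {
            $buffer .= $ch;
        }
    }
    $fields[] = $buffer;              // always keep the last field
    return $fields;
}

$sample = 'first:second:third"test:test":fourth';
$fields = splitOutsideQuotes($sample);
print_r($fields);

// The number of colons outside quotes is one less than the number of fields.
$colonCount = count($fields) - 1;
echo $colonCount, "\n";
```

For the sample string this gives four fields and a count of 3.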

Upvotes: 0
