Reputation: 6577
For example, I have a string like this:
first:second:third"test:test":fourth
I want to count the ':' characters and later split on every ':' to get the substrings.
This is my regex:
/(.*):(.*)/iU
I don't know if this is the best solution, but it works. There is a difference between a ':' and a "[...] : [...]", so I need to separate them. I realized that my regex counts the ':' but also matches when the ':' is between double quotes.
I tried to solve this with this regex:
/(((.*)[^"]):((.*)[^"]))/iU
I thought this was the right way, but it isn't. I tried to learn the regex syntax, but I don't understand this problem.
This regex is supposed to mean: search for ':' - anything can be in front of it and after it EXCEPT when a " is in front of it AND a " is after it.
Maybe you can help me.
edit: I use my regex in PHP - maybe this is important information.
Upvotes: 2
Views: 246
Reputation: 3176
This regex should do it. If it matches your needs and you want additional explanation, just ask :)
(?<=:|^)(?<!"[^:][^"]+:)\w+?(?=:|"|$)
This is the test string I used:
"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth
And these are the 6 resulting matches:
first
second
third
fourth
fifth
sixth
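Since the question mentions PHP: PCRE generally requires a lookbehind to have a bounded length, so the pattern above may be rejected by preg_match_all. Below is a minimal sketch of an alternative, assuming PHP's PCRE backtracking verbs (*SKIP)(*F). This is a swapped-in technique, not the regex above: it consumes each quoted section, discards it, and matches the words that remain. The variable names are only illustrative.

```php
<?php
// Same test string as above.
$subject = '"test1:test2:test3":first:second:third"test1:test2:test3":fourth:fifth"test1:test2:test3":sixth';

// Left alternative: match a whole "..." section, then force a failure
// and skip past it, so nothing inside quotes can match.
// Right alternative: a run of word characters outside quotes.
$pattern = '/"[^"]*"(*SKIP)(*F)|\w+/';

preg_match_all($pattern, $subject, $matches);
print_r($matches[0]);
```

Note that this assumes the double quotes in the input are balanced; an unmatched " is simply not treated as the start of a quoted section.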
Upvotes: 2
Reputation: 336128
How about using
$result = preg_split(
    '/:          # Match a colon
    (?=          # only if followed by
     (?:         # the following group:
      [^"]*"     # Any number of characters except ", followed by one "
      [^"]*"     # twice in a row (to ensure even number of "s)
     )*          # (repeated zero or more times)
     [^"]*       # followed by any number of non-quotes until...
     $           # the end of the string.
    )            # End of lookahead assertion
    /x',
    $subject);
which will give you the result
first
second
third"test:test"
fourth
directly?
This regex splits on a : only if it's followed by an even number of quotes. This means that it won't split on a : inside a string.
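As a quick check against the string from the question, here is a sketch that uses a compact, comment-free form of the same pattern and also derives the number of unquoted colons the question asks about. The variable names are illustrative, and it assumes the quotes in the input are balanced.

```php
<?php
$subject = 'first:second:third"test:test":fourth';

// Same lookahead as above, written without the /x comments.
$pattern = '/:(?=(?:[^"]*"[^"]*")*[^"]*$)/';

$parts = preg_split($pattern, $subject);

// Every split consumed exactly one unquoted colon.
$colonCount = count($parts) - 1;

print_r($parts);
echo $colonCount, "\n";
```

Here $parts holds the four fields and $colonCount is 3, because the colon inside "test:test" is followed by an odd number of quotes and is therefore left alone.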
Upvotes: 4
Reputation: 57650
I love parsing text, so I wrote a parser for you.
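$sample = 'first:second:third"test:test":fourth';
$len    = strlen($sample);
$buffer = "";
$output = array();
$quote  = null; // the quote character we are currently inside, or null

for ($i = 0; $i < $len; $i++) {
    $ch = $sample[$i];
    if ($quote === null && ($ch == '"' || $ch == "'")) {
        // opening quote: remember its kind so only the same kind closes it
        $quote = $ch;
        $buffer .= $ch;
    } elseif ($ch === $quote) {
        // matching closing quote
        $quote = null;
        $buffer .= $ch;
    } elseif ($quote === null && $ch == ':') {
        // colon outside quotes: the current field is complete
        $output[] = $buffer;
        $buffer = "";
    } else {
        $buffer .= $ch;
    }
}
// compare against "" so that a final field of "0" is not dropped
if ($buffer !== "") $output[] = $buffer;
print_r($output);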
See the code in action. Also note that for huge strings a regular expression will perform poorly.
Upvotes: 0