Reputation: 468
This question is very similar to "use preg_split instead of split", but I'm confused about the regex and would like to clear that up.
I'm trying to update some existing split() calls to use preg_split() instead, and I'm getting some unclear results. Running the code below gives me arrays of different lengths and I'm not sure why.
From what I can see, split() is matching on \n with an optional \r beforehand. I think preg_split() is doing the same, but then why is it creating 2 splits? Is this to do with lazy/greedy matching?
Demo code:
$test = "\r\n";
$val = split('\r?\n', $test); //literal interpretation of string
$val_new = split("\r?\n", $test); //php understanding that these are EOL chars
$val2 = preg_split('/\r?\n/', $test);
var_dump($val); // returns array(1) { [0]=> string(2) " " }
var_dump($val2); // returns array(2) { [0]=> string(0) "" [1]=> string(0) "" }
Edit: added $val_new based on Kolink's comments, because they helped clear up my understanding of the problem and so may be of use to others too.
Upvotes: 1
Views: 1093
Reputation: 785126
You should use the PREG_SPLIT_NO_EMPTY flag as the 4th argument of preg_split (the 3rd argument is the limit; pass -1 for no limit) to ignore empty tokens in the split array. So if you use
preg_split('/\r?\n/', $test, -1, PREG_SPLIT_NO_EMPTY);
then the empty strings are dropped from the result.
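As a rough illustration of what the flag does, here is a minimal sketch. The sample string in $text is hypothetical and not taken from the question. It also shows what seems to happen when the flag is passed in the third position by mistake: PREG_SPLIT_NO_EMPTY has the integer value 1, so it is read as a limit of 1 and the string comes back unsplit.

```php
<?php
// Hypothetical sample input: three words, with an empty line between the first two.
$text = "one\r\n\r\ntwo\nthree";

// Default flags: empty tokens are kept.
$all = preg_split('/\r?\n/', $text);
// $all is ["one", "", "two", "three"]

// Flag passed as the 4th argument; -1 as the 3rd argument means "no limit".
$nonEmpty = preg_split('/\r?\n/', $text, -1, PREG_SPLIT_NO_EMPTY);
// $nonEmpty is ["one", "two", "three"]

// Flag passed as the 3rd argument: it is treated as limit = 1,
// so the whole string is returned as a single element.
$asLimit = preg_split('/\r?\n/', $text, PREG_SPLIT_NO_EMPTY);
// $asLimit is [$text]
```

Note that for the question's input of "\r\n" alone, dropping empty tokens leaves an empty array, since both tokens around the delimiter are empty.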
By the way, your use of \r?\n in the split function is not doing any splitting (since split doesn't understand \r and \n in single quotes), so it returns your original string back.
Edit: Alternatively, you can use split with a double-quoted regex:
split("\r?\n", $test);
to split your string into a 2-element array.
Upvotes: 2
Reputation: 324640
split does not understand \r and \n as special characters, and because you used single quotes PHP doesn't treat them as special characters either. So split is looking for literal \\n or \r\n.
preg_split, on the other hand, does understand \r and \n as special characters, so even though PHP doesn't treat them as such, PCRE does, and the string is therefore split correctly.
This has nothing to do with lazy/greedy matching; it's all because the single quotes don't parse \r\n into their newline meanings.
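The quoting difference can be checked directly. The sketch below is only an illustration (the variable names $single and $double are mine). It avoids split() itself, which was removed in PHP 7.0, and compares the two pattern strings and how preg_split treats them.

```php
<?php
// Single quotes: PHP leaves the backslashes alone.
$single = '\r?\n';   // 5 characters: backslash, r, ?, backslash, n
// Double quotes: PHP converts the escapes to real CR and LF characters.
$double = "\r?\n";   // 3 characters: CR, ?, LF

var_dump(strlen($single)); // int(5)
var_dump(strlen($double)); // int(3)

// PCRE interprets \r and \n itself, so either form of the pattern
// splits "\r\n" into the same two empty strings.
$test = "\r\n";
$fromSingle = preg_split('/' . $single . '/', $test);
$fromDouble = preg_split('/' . $double . '/', $test);
var_dump($fromSingle === $fromDouble); // bool(true)
```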
Upvotes: 1