Reputation: 4099
Consider the following string:
LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}
I'm looking for a way to lowercase everything except what is between {brackets}, which should be left untouched. So the desired output is:
lorem {FOO} ipsum dolor {BAR} samet {fooBar}
In another topic @stema pointed to https://www.php.net/manual/en/functions.anonymous.php to achieve something like this, but I don't understand how:
echo preg_replace_callback('~\{.*?\}~', function ($match) {
return strtolower($match[1]);
}, 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}');
This returns only the string without the bracketed {tags}, and not even lowercased. Who can help me solve this? Any help is greatly appreciated :)
Upvotes: 0
Views: 2348
Reputation: 198124
You want to match all characters except those within {}. Then replace each match with its strtolower() result.
To do so, you need to create a pattern that matches everything but the bracket-pairs:
~(?:{\w+}(*SKIP)(*FAIL))|[^{}]+~
This will skip over all bracket pairs (dropping them from the match) but match everything else that is not a bracket character ({ or }). You can then just lowercase the match using your callback function:
$str = '{LoReM {FOO} IPSUM { dolor {BAR} Samet {fooBar} Tou}Louse';
$out = preg_replace_callback('~(?:{\w+}(*SKIP)(*FAIL))|[^{}]+~', function ($m) {
    return strtolower($m[0]);
}, $str);
echo $out;
Output:
{lorem {FOO} ipsum { dolor {BAR} samet {fooBar} tou}louse
As the example shows, unpaired brackets aren't a problem. This pattern also specifies how the bracket pairs should be written: \w stands for any word character, and you can replace it with any character class that fulfills your needs if it doesn't fit (e.g. in your duplicate question).
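As one illustration of swapping the character class: assuming tags may contain spaces or punctuation but never nested braces, \w+ could be replaced with [^{}]*. The sample string below is made up for this sketch and is not from the question:

```php
<?php
// Assumption: a tag is '{', then any run of non-brace characters, then '}'.
$pattern = '~\{[^{}]*\}(*SKIP)(*FAIL)|[^{}]+~';

// Hypothetical input with a space and a hyphen inside the tags.
$str = 'LoReM {FOO bar} IPSUM {x-Y}';

// Only text outside the skipped tags reaches the callback.
$out = preg_replace_callback($pattern, function ($m) {
    return strtolower($m[0]);
}, $str);

echo $out; // lorem {FOO bar} ipsum {x-Y}
```

The only change from the pattern above is the character class between the braces; the (*SKIP)(*FAIL) mechanism stays the same.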
This is actually pretty similar to a question that has already been answered: How to let regex ignore everything between brackets? It's practically an exact duplicate, which I only noticed after writing this more detailed answer.
Upvotes: 2
Reputation: 7035
You can use preg_replace() with the PREG_REPLACE_EVAL modifier as in:
$string = 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}';
$pattern = '/(?<![[:word:]{])[[:word:]]*?(?![[:word:]}])/e';
echo preg_replace($pattern, 'strtolower($0)', $string);
Everything that the pattern matches is then replaced by evaluating strtolower() on the match. If you want to understand the regex, it's easiest to start in the middle (I've separated the blocks with spaces for readability):
(?<![[:word:]{]) [[:word:]]*? (?![[:word:]}])
       ^               ^              ^
       |               |              |
       |               +-- match any amount of word characters (alphanums)
       |                              |
       +-- that are not preceded by a word character or {
                                      |
                                      +-- and are not followed by a word character or }
Where word characters are alphanumeric characters and underscores.
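Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0. On newer versions the same pattern can be passed to preg_replace_callback() instead. A minimal sketch, assuming the pattern is otherwise kept unchanged:

```php
<?php
$string = 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}';

// Same pattern as above, without the removed /e modifier.
$pattern = '/(?<![[:word:]{])[[:word:]]*?(?![[:word:]}])/';

// The callback receives each match; $m[0] is the whole matched word
// (or an empty string where the pattern matches nothing).
$result = preg_replace_callback($pattern, function ($m) {
    return strtolower($m[0]);
}, $string);

echo $result; // lorem {FOO} ipsum dolor {BAR} samet {fooBar}
```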
Upvotes: 1
Reputation: 543
Your expression must catch the other parts:
echo preg_replace_callback('~^.*?{|}.*?{|}.*?$~', function ($match) {
    return strtolower($match[0]);
}, 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}');
Upvotes: 3
Reputation: 91488
Change your regex to:
~(?:^|})(.*?)(?:\{|$)~
explanation:
~        : delimiter
(?:      : start non capture group
  ^|}    : begin of string or }
)        : end of group
(        : start capture group #1
  .*?    : any number of any char. non greedy
           (ie: all char outside of {})
)        : end of group
(?:      : start non capture group
  \{|$   : { or end of string
)        : end of group
~        : delimiter
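One thing to watch: the non-capturing groups still consume the braces, so a callback that returns only strtolower($match[1]) would remove them from the result. A possible variant, which is my assumption and not part of the answer above, turns the two outer groups into lookarounds so that only the text outside the braces is matched:

```php
<?php
$input = 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}';

// (?:^|(?<=\}))  start of string, or a position right after '}'
// (.*?)          the text outside the braces (capture group #1)
// (?=\{|$)       followed by '{' or the end of the string, not consumed
$pattern = '~(?:^|(?<=\}))(.*?)(?=\{|$)~';

$out = preg_replace_callback($pattern, function ($match) {
    return strtolower($match[1]);
}, $input);

echo $out; // lorem {FOO} ipsum dolor {BAR} samet {fooBar}
```

Because the braces are only asserted and never matched, the callback can keep returning just the lowercased capture group.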
Upvotes: 4
Reputation: 5829
This is the type of problem that a regex has a lot of trouble with. A better solution would be to write a parser that reads character by character and can switch state:
If a { character is read in lowercase mode, switch to uppercase mode (where text is left as-is).
If a } character is read in uppercase mode, switch to lowercase mode.
Keep in mind that it will be more complicated if you want to handle nested braces.
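The state switching described above could be sketched roughly as follows. The function name lowercaseOutsideBraces is hypothetical, and the sketch assumes ASCII text and no nested braces:

```php
<?php
// Hypothetical helper: lowercases everything outside {...} pairs.
// Assumes single-byte (ASCII) letters and no nested braces.
function lowercaseOutsideBraces(string $input): string
{
    $output = '';
    $insideBraces = false; // false = lowercase mode, true = leave text as-is
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '{' && !$insideBraces) {
            $insideBraces = true;   // entering a tag
        } elseif ($char === '}' && $insideBraces) {
            $insideBraces = false;  // leaving a tag
        }
        // Braces themselves are unaffected by strtolower().
        $output .= $insideBraces ? $char : strtolower($char);
    }

    return $output;
}

$result = lowercaseOutsideBraces('LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}');
echo $result; // lorem {FOO} ipsum dolor {BAR} samet {fooBar}
```

In this sketch, an opening brace that is never closed leaves the rest of the string unchanged. Whether that is the desired behavior depends on the input you expect.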
Upvotes: 0
Reputation: 3493
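How about this?
$input = 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}';
// Remember every {tag} exactly as it was written.
preg_match_all('~\{.*?\}~', $input, $matches);
// Lowercase the whole string, tags included.
$output = strtolower($input);
// Put each original tag back in place of its lowercased copy.
foreach ($matches[0] as $match) {
    $output = str_replace(strtolower($match), $match, $output);
}
echo $output;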
Upvotes: 1
Reputation: 28929
Using preg_replace_callback() is probably the best method. You just need to fix the regular expression to be this instead:
~(^|\})(.*?)(\{|$)~
And then return this:
return $match[1] . strtolower($match[2]) . $match[3];
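Put together with the string from the question, the two pieces above might look like this (a minimal sketch; the variable names are mine):

```php
<?php
$input = 'LoReM {FOO} IPSUM dolor {BAR} Samet {fooBar}';

$output = preg_replace_callback('~(^|\})(.*?)(\{|$)~', function ($match) {
    // $match[1]: start of string or '}', $match[3]: '{' or end of string.
    // Both are returned unchanged; only the text in between is lowercased.
    return $match[1] . strtolower($match[2]) . $match[3];
}, $input);

echo $output; // lorem {FOO} ipsum dolor {BAR} samet {fooBar}
```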
Upvotes: 2