Reputation: 37810
I'm looking for a regular expression (in PHP PCRE) that can match media queries and their contents reliably, including the somewhat odd case where a media query body is empty. Source text might be:
@media only screen {
p {
color:red;
}
}
@media only screen and (max-width: 596px) {
p {
color:blue;
}
img {
max-width: 200px;
}
}
@media only screen {
}
img {
display: block;
}
@media only screen and (max-width: 240px) {
p {
color:green;
}
}
p {
font-weight: normal;
}
I want to capture each media query and its CSS body as subpatterns, so I'll end up with a PHP array like:
[['@media only screen {
p {
color:red;
}
}','p {
color:red;
}'],...]
The key thing is that this needs to be a recursive or subroutine pattern in order to balance the braces. The empty query is enough to confuse the pattern in this question because it can't distinguish the end of a css rule from the end of the empty media query:
/@media[^{]+\{([\s\S]+?\})\s*\}/
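A minimal reproduction of the problem, as a sketch: the input is a shortened version of the sample above, and the variable names are only illustrative. Because the lazy group needs a closing brace followed by another closing brace, the empty query keeps expanding until it reaches the end of the next media query:

```php
<?php
// Non-recursive pattern from above.
$naive = '/@media[^{]+\{([\s\S]+?\})\s*\}/';

// Shortened sample: an empty media query, a plain rule, then a normal media query.
$css = "@media only screen {\n}\nimg {\n display: block;\n}\n"
     . "@media only screen and (max-width: 240px) {\n p {\n color:green;\n }\n}";

$count = preg_match_all($naive, $css, $m);

// Only one match is reported, and it swallows the unrelated img rule
// together with the second media query.
var_dump($count);
var_dump(strpos($m[0][0], 'img {') !== false);
```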
I've been trying and failing to use the advice in this article to make a pattern of the form (b(?:m|(?1))*e), where b is what begins the construct, m is what can occur in the middle of the construct, and e is what can occur at the end, and none of them can match the same thing.
So, b should be @media[^{]+\{, e should be \}, and m needs to consume CSS rules, perhaps ([^{]+?\{[^}]*?\s*\}), giving me:
/(@media[^{]+\{(?:([^{]+?\{[^}]*?\}\s*)*|(?1))*\})/s
However, that doesn't work so I'm a bit lost. Can anyone suggest an effective pattern?
I've set up a regex test here.
Alternatively, a non-regex parser might work better.
Note that I'm not attempting to validate or match CSS selectors in general (not a job for a regex), just grab the content of the query and its body.
Update: added more sample content and explained the output I want.
Upvotes: 2
Views: 938
Reputation: 627469
If you are sure the blocks you want to match always have balanced braces, you can use a regex with a subroutine call like this:
'~@media\b[^{]*({((?:[^{}]+|(?1))*)})~'
See the regex demo
And here is an IDEONE demo:
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';
$str = "@media only screen {\n p {\n color:red;\n }\n}\n@media only screen and (max-width: 596px) {\n p {\n color:blue;\n }\n img {\n max-width: 200px;\n }\n}\n@media only screen {\n\n}\nimg {\n display: block;\n}\n@media only screen and (max-width: 240px) {\n p {\n color:green;\n }\n}\np {\n font-weight: normal;\n}";
preg_match_all($re, $str, $matches, PREG_PATTERN_ORDER);
print_r($matches[0]); // the whole @media blocks
print_r($matches[2]); // the contents between each block's outer braces
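To get the array shape the question asks for (each media query paired with its body), one option is PREG_SET_ORDER. This is a minimal sketch on a shortened sample; the names $css, $sets and $pairs are illustrative, and trimming the body is an assumption about the desired output:

```php
<?php
// Same pattern as above: group 1 is the balanced {...} block, group 2 its contents.
$re = '~@media\b[^{]*({((?:[^{}]+|(?1))*)})~';

// Shortened sample input, including an empty media query and a non-media rule.
$css = "@media only screen {\n p {\n color:red;\n }\n}\n"
     . "@media only screen {\n\n}\nimg {\n display: block;\n}";

// PREG_SET_ORDER yields one array per match: [0] whole match, [1] block, [2] contents.
preg_match_all($re, $css, $sets, PREG_SET_ORDER);

// Build [[whole media query, trimmed body], ...].
$pairs = array_map(
    function (array $m): array {
        return [$m[0], trim($m[2])];
    },
    $sets
);

print_r($pairs);
```

The empty media query still produces an entry; its body is simply an empty string after trimming.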
Pattern details:
@media\b - matches @media as a whole word (\b is a word boundary)
[^{]* - matches 0+ characters other than {
({((?:[^{}]+|(?1))*)}) - capturing group #1, which matches a {...} block with balanced { and } (it is a technical group: we need to recurse into this group's subpattern in order to match nested {...} blocks correctly). It matches:
    { - an opening brace
    ((?:[^{}]+|(?1))*) - Group 2 (the contents inside the balanced {...}), matching zero or more of:
        [^{}]+ - 1+ characters other than { and } (everything that is not one of the delimiters)
        | - or...
        (?1) - the whole Group 1 subpattern
    } - a closing brace
Note that $matches[2] is an array, while preg_match_all expects a string subject, so each captured body has to be processed separately: preg_match_all('~\s*(\w+)\s*{\s*([^}]*?)\s*}~', $body, $subblocks), where $body is one element of $matches[2].
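As a sketch of that second step, here is the inner pattern applied to a single captured body. The $body string is a hypothetical sample shaped like the second media query in the question, and $ruleRe and $rules are illustrative names. Note that (\w+) only covers simple selectors such as p or img:

```php
<?php
// One captured media-query body (hypothetical sample).
$body = "\n p {\n color:blue;\n }\n img {\n max-width: 200px;\n }\n";

// Inner pattern: group 1 is a simple \w+ selector, group 2 its declarations.
$ruleRe = '~\s*(\w+)\s*{\s*([^}]*?)\s*}~';

preg_match_all($ruleRe, $body, $subblocks, PREG_SET_ORDER);

// Collect selector => declarations; a repeated selector would overwrite the earlier entry.
$rules = [];
foreach ($subblocks as $m) {
    $rules[$m[1]] = $m[2];
}

print_r($rules);
```

For all bodies, the same call can be run inside a foreach over the array of captured contents.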
Upvotes: 4