Reputation: 509
What regular expression can be used to validate a CSS selector, in such a way that an invalid selector fails quickly?
Valid selectors:
EE
#myid
.class
.class.anotherclass
EE .class
EE .class EEE.anotherclass
EE[class="test"]
.class[alt~="test"]
#myid[alt="test"]
EE:hover
EE:first-child
E[lang|="en"]:first-child
EE#test .class>.anotherclass
EE#myid.classshit.anotherclass[class~="test"]:hover
EE#myid.classshit.anotherclass[class="test"]:first-child EE.Xx:hover
Invalid selectors (for example, ones that contain extra whitespace at the end of the line):
EE:hover EE
EE .class EEE.anotherclass
EE#myid.classshit.anotherclass[class="test"]:first-child EE.Xx:hov 9
EE#myid.classshit.anotherclass[class="test"]:first-child EE.Xx:hov -daf
Upvotes: 0
Views: 1598
Reputation: 63566
Regular expressions are the wrong tool. CSS selectors are way too complex. Example:
bo\
dy:not(.\}) {}
Use a parser with a real tokenizer, such as PHP-CSS-Parser. Porting it to Java is easier than getting a regex right.
Upvotes: 4
Reputation: 11
This is a regex that I use in my code:
[+>~, ]?\s*(\w*[#.]\w+|\w+|\*)+(:[\w\-]+\([\w\s\-\+]*\))*(\[[\w ]+=?[^\]]*\])*([#.]\w+)*(:[\w\-]+\([\w\s\-\+]*\))*
After tokenizing, I use the trim function to remove extra spaces, e.g.:
expression:
EE.class EE#id.class
tokens:
EE.class
EE#id.class
tokens after trim:
EE.class
EE#id.class
Or, e.g.:
>EE.class (this flags a direct child, which I then handle with some substring code)
Other routines can check, for example, whether a token is a number.
You can use http://regexpal.com/ for tests.
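To make the tokenize-then-trim steps above concrete, here is a minimal sketch in Python. The language choice and the names SELECTOR_TOKEN and tokenize are assumptions for illustration; the pattern itself is the one quoted above, only split across lines for readability.

```python
import re

# The pattern from the answer, unchanged, split into adjacent raw strings.
SELECTOR_TOKEN = re.compile(
    r"[+>~, ]?\s*(\w*[#.]\w+|\w+|\*)+"   # optional combinator, then element/id/class
    r"(:[\w\-]+\([\w\s\-\+]*\))*"        # pseudo-classes written with parentheses
    r"(\[[\w ]+=?[^\]]*\])*"             # attribute selectors
    r"([#.]\w+)*"                        # trailing ids/classes
    r"(:[\w\-]+\([\w\s\-\+]*\))*"        # more parenthesized pseudo-classes
)

def tokenize(expression):
    """Hypothetical helper: collect each match, then trim surrounding spaces."""
    return [m.group(0).strip() for m in SELECTOR_TOKEN.finditer(expression)]

print(tokenize("EE.class EE#id.class"))  # ['EE.class', 'EE#id.class']
print(tokenize(">EE.class"))             # ['>EE.class']
```

A token that still starts with > after trimming marks a direct child. Note that, as written, the pattern only consumes pseudo-classes that have parentheses, so a selector such as EE:hover is split into EE and hover rather than kept as one token.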
Upvotes: 1
Reputation: 12299
The problem with typical regular expressions is that they are unable to handle arbitrary levels of nesting. They have no memory. Consider a string of some number of a's followed by the same number of b's, aaabbb, and a reasonable regexp, a*b*. When the regexp gets to the first 'b', it has no memory of how many a's it recognized, and therefore it can't require the same number of b's.
Now replace a and b with ( and ), IF and END, <x> and </x>, etc., and you can see the problem.
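A small sketch of the point, in Python (the function names are made up for illustration): the pattern a*b* accepts strings whose counts differ, while a check that keeps a counter, which is exactly the memory a plain regular expression lacks, can tell them apart.

```python
import re

def loose_match(s):
    # a*b* accepts any run of a's followed by any run of b's.
    return re.fullmatch(r"a*b*", s) is not None

def same_count(s):
    # Requires n a's followed by exactly n b's.
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def balanced_parens(s):
    # The same idea for nesting: track the current depth in a counter.
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ')' with no matching '('
                return False
    return depth == 0

print(loose_match("aaabb"))       # True, although the counts differ
print(same_count("aaabb"))        # False
print(balanced_parens("(()())"))  # True
print(balanced_parens("(()"))     # False
```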
Upvotes: 0