frenchie

Reputation: 52047

Regex to extract CSS classnames and IDs

I have some CSS and I'm looking to create a list of all the class names and identifiers. This is what I have:

var TheList = new List<string>();
var Test2 = Regex.Matches(TheCSS, ".-?[_a-zA-Z]+[_a-zA-Z0-9-]*(?=[^}]*\\{)");

foreach(Match m in Test2)
{
    TheList.Add(m.Value);
}

The problem is that there are some unwanted elements:

body
:hover
select
input
label
[for
input
[type
'radio

I've tried several regexes that I found online; this one is the closest, but it's not perfect yet. Basically, it should include only names that begin with # or . (so as to exclude body and [type) and should not include pseudo-classes like :hover.

What do I need to change in the regex to make it work?

Thanks.

Upvotes: 1

Views: 1520

Answers (1)

Niels Keurentjes

Reputation: 41968

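Following the CSS standard, a class name or ID is built from the characters [_A-Za-z0-9\-]+ (strictly speaking, the grammar also restricts the first character and allows escapes and non-ASCII characters, but this is close enough for most stylesheets). A class or ID selector is thus that string prefixed directly by either a # or a . character.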

Once that is settled, all you need to do is ensure that the match is followed by a { before a } occurs, which confirms that you're in a selector and not inside a declaration block.

The resulting regexp would then be: ([\.#][_A-Za-z0-9\-]+)[^}]*{
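As a minimal sketch of how this could be wired into the code from the question: the version below keeps the capture group but, as an assumption on my part, moves the "followed by { before }" check into a lookahead (as the question's own attempt did), so that several names in one selector such as "#main .item" are each matched instead of being swallowed by [^}]*. The names CssNameExtractor and Extract are hypothetical, and comments, strings and decimals inside @media conditions are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CssNameExtractor
{
    // Group 1 holds the name including its leading '.' or '#'.
    // The lookahead requires a '{' to appear before the next '}',
    // i.e. the name sits in a selector, not in a declaration block.
    private static readonly Regex SelectorName =
        new Regex(@"([.#][_A-Za-z0-9\-]+)(?=[^}]*\{)");

    // Returns the distinct class names and IDs in order of first appearance.
    public static List<string> Extract(string css)
    {
        var names = new List<string>();
        foreach (Match m in SelectorName.Matches(css))
        {
            string name = m.Groups[1].Value;
            if (!names.Contains(name))
            {
                names.Add(name);
            }
        }
        return names;
    }
}
```

Calling CssNameExtractor.Extract(TheCSS) would then replace the loop in the question; values such as #fff or 0.5em inside a declaration block are skipped because a } follows them before any {.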

Your sample case. Same regexp applied to Facebook CSS.

Upvotes: 5
