code.feind

Reputation: 77

Extract CSS classes and IDs from source with PHP

I thought this was going to be pretty simple, but I've been struggling with it for a while now. I know there are CSS parser classes out there that can achieve what I want to do... but I don't need 95% of the functionality they have, so they're not really feasible and would just be too heavy.

All I need to be able to do is pull out any class and/or ID names used in a CSS file via regex. Here's the regex I thought would work, but it doesn't.

[^a-z0-9][\w]*(?=\s)

When run against my sample:

.stuffclass {
 color:#fff;
 background:url('blah.jpg');
}
.newclass{
 color:#fff;
 background:url('blah.jpg');
}
.oldclass {
 color:#fff;
 background:url('blah.jpg');
}
#blah.newclass {
 color:#fff;
 background:url('blah.jpg');
}
.oldclass#blah{
 color:#fff;
 background:url('blah.jpg');
}
.oldclass #blah {
 color:#fff;
 background:url('blah.jpg');
}
.oldclass .newclass {
 text-shadow:1px 1px 0 #fff;
 color:#fff;
 background:url('blah.jpg');
}
.oldclass:hover{
 color:#fff;
 background:url('blah.jpg');
}
.newclass:active {
 text-shadow:1px 1px 0 #000;
}

It does match most of what I want, but it also includes the curly brackets and doesn't match the IDs. I need to match IDs and classes separately when they are conjoined. So basically #blah.newclass would be 2 separate matches: #blah AND .newclass.

Any ideas?

===================

FINAL SOLUTION

I wound up using 2 regexes: the first strips out everything between { and }, and the second simply matches the selectors in the remaining input.

Here's a full working example:

//Grab contents of css file
$file = file_get_contents('css/style.css');

//Strip out everything between { and }
$pattern_one = '/(?<=\{)(.*?)(?=\})/s';

//Match any and all selectors (and pseudos)
//Note: inside a character class "|" is a literal pipe, so it is left out here
$pattern_two = '/[.#][\w]([:\w]+?)+/';

//Run the first regex pattern on the input
$stripped = preg_replace($pattern_one, '', $file);

//Variable to hold results
$selectors = array();

//Run the second regex pattern on $stripped input
//($matches receives the number of matches; the matched text goes into $selectors)
$matches = preg_match_all($pattern_two, $stripped, $selectors);

//Show the results
print_r(array_unique($selectors[0]));
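
For anyone who wants the classes and IDs in separate lists, here is a minimal sketch of the same two-step idea wrapped in a function. The function name extract_selectors and the sample CSS are made up for illustration, and the name pattern (which also accepts hyphens) is an assumption rather than a full implementation of CSS identifier rules. It also does not handle nested blocks such as @media rules.

```php
<?php
// Hypothetical helper (not part of the original solution): returns the class
// and ID names that appear in the selectors of a CSS string.
function extract_selectors(string $css): array
{
    // Remove comments, then drop every declaration block, braces included.
    $css = preg_replace('~/\*.*?\*/~s', '', $css);
    $selectorsOnly = preg_replace('/\{.*?\}/s', ' ', $css);

    // A "." or "#" followed by a name made of word characters and hyphens.
    // Pseudo-classes such as :hover are not part of the captured name.
    preg_match_all('/([.#])(-?[A-Za-z_][\w-]*)/', $selectorsOnly, $found, PREG_SET_ORDER);

    $seen = ['classes' => [], 'ids' => []];
    foreach ($found as $match) {
        $kind = $match[1] === '.' ? 'classes' : 'ids';
        $seen[$kind][$match[2]] = true; // array keys de-duplicate the names
    }

    return [
        'classes' => array_keys($seen['classes']),
        'ids'     => array_keys($seen['ids']),
    ];
}

// Made-up sample input in the style of the question.
$css = <<<'CSS'
#blah.newclass { color:#fff; background:url('blah.jpg'); }
.oldclass:hover{ color:#fff; }
.old-class #blah { text-shadow:1px 1px 0 #000; }
CSS;

print_r(extract_selectors($css));
```

Because the declaration blocks are removed first, color values like #fff and file names like blah.jpg never reach the second pattern.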

Upvotes: 3


Answers (4)

modded

Reputation: 21

This version is based on nealio82's, but it also matches pseudo-selectors and hyphens: [^a-z0-9][\w:-]+(?=\s)

Upvotes: 1

Andrew Ward

Reputation: 61

The solution posted by OP works, though it didn't work for me with CSS classes that had hyphens. As such, I've amended the second pattern to work more effectively:

$pattern_two = '/[\.|#]([A-Za-z0-9_\-])*(\s?)+/';
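
To illustrate the hyphen problem, here is a small comparison. The selector string and the second pattern are my own assumptions for the sake of the example, not the pattern above. Note that inside a character class "|" is a literal pipe, so the sketch leaves it out.

```php
<?php
// Made-up selector string, as it might look after the { ... } blocks are stripped.
$selectors = '.nav-bar #main-menu.is_active:hover';

// Pattern from the question (without the "|"): \w does not include the
// hyphen, so hyphenated names are cut short.
preg_match_all('/[.#][\w]([:\w]+?)+/', $selectors, $plain);

// Hypothetical hyphen-aware variant: a "." or "#", an optional leading
// hyphen, a letter or underscore, then word characters and hyphens.
preg_match_all('/[.#]-?[A-Za-z_][\w-]*/', $selectors, $hyphenated);

print_r($plain[0]);      // names truncated at the first hyphen
print_r($hyphenated[0]); // full class and ID names, pseudo-class left out
```

The variant deliberately stops at ":" so that .is_active and .is_active:hover are not reported as two different selectors.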

Upvotes: 0

K..

Reputation: 4223

/(?<!:\s)[#.][\w]*/

Something like this? It excludes color values such as #FFFFFF, as long as they follow a colon and a space (as in color: #FFFFFF).
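
A quick way to check how the lookbehind behaves. The two input strings are made up for illustration; the pattern is the one above.

```php
<?php
$pattern = '/(?<!:\s)[#.][\w]*/';

// Colon, space, then the color: the lookbehind sees ": " right before the
// "#" and rejects it.
preg_match_all($pattern, 'a { color: #FFFFFF; }', $spaced);

// No space after the colon: the two characters before "#" are "r:", so the
// lookbehind does not apply and the color is matched like an ID.
preg_match_all($pattern, 'a { color:#FFFFFF; }', $tight);

var_dump($spaced[0]);
var_dump($tight[0]);
```

So with CSS written like the sample in the question (no space after the colon), the color values would still show up in the results.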

Upvotes: 0

nealio82

Reputation: 2633

[^a-z0-9][\w]+(?=\s)

I changed your * to a +, so at least one word character has to follow the leading symbol.

It works fine in RegExr, an awesome regex development tool: http://gskinner.com/RegExr/ (see the bottom right of the window to download the desktop version)
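
Here is a rough PHP check of the difference between the two quantifiers, run on a trimmed-down sample in the style of the question (the sample string is shortened by me). It also shows one thing the + version does not fix: the lookahead needs whitespace right after the name, so an ID that is directly followed by a class is still skipped.

```php
<?php
// Shortened sample: one plain class and one conjoined ID + class.
$css = ".stuffclass {\n color:#fff;\n}\n#blah.newclass {\n color:#fff;\n}\n";

// Original pattern: [\w]* may match nothing, so a lone "{", ";" or "}"
// followed by whitespace counts as a match.
preg_match_all('/[^a-z0-9][\w]*(?=\s)/', $css, $star);

// Amended pattern: [\w]+ needs at least one word character after the
// leading symbol, which rules out the bare punctuation.
preg_match_all('/[^a-z0-9][\w]+(?=\s)/', $css, $plus);

var_dump(in_array('{', $star[0], true));     // the * version picks up the brace
var_dump(in_array('{', $plus[0], true));     // the + version does not
var_dump(in_array('#blah', $plus[0], true)); // #blah is followed by ".", not whitespace
print_r($plus[0]);
```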

Upvotes: 1
