PHP - extract data from a web page HTML

Question

I need to extract the words FIESTA ERASMUS and /event/83318 from the following HTML code:

      
            
                
                    FIESTA ERASMUS 
                    
                    soirée étudiante             
                    Duplex
                house, electro, r&b chic, latino, disco
                    pass

I tried something like this:

$PATTERN = "/\(.*)/";
preg_match($PATTERN, $html, $matches);

but it doesn't work.

ᴍᴇʜᴏᴠ · Accepted Answer

I suggest the following pattern:

$PATTERN = '%(.*?)[\s]+%i';
preg_match($PATTERN, $html, $matches);

The (.*?) part is a non-greedy (lazy) pattern: instead of matching as far as possible toward the end of the supplied string, the regex engine stops at the first place where the rest of the pattern can match, which in this case is just before the closing ".
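To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample markup in $html is an assumption (the question's real HTML is not reproduced here), and the patterns are simplified illustrations rather than the exact pattern suggested above.

```php
<?php
// Hypothetical markup: an assumed stand-in for the question's HTML.
$html = '<a href="/event/83318" class="title">FIESTA ERASMUS</a>';

// Greedy: (.*) runs on to the LAST double quote in the string.
preg_match('%href="(.*)"%i', $html, $greedy);

// Non-greedy: (.*?) stops at the FIRST double quote after href=".
preg_match('%href="(.*?)"%i', $html, $lazy);

// Capture the link target and the link text in one pass.
// The "s" flag lets "." match line breaks as well.
preg_match('%<a\s+href="(.*?)"[^>]*>\s*(.*?)\s*</a>%is', $html, $matches);

$url   = $matches[1]; // link target
$title = $matches[2]; // link text
```

With this assumed markup, the greedy capture also swallows the class attribute, while the non-greedy one contains only the link target.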

You may also want to pre-process the HTML before applying the regex, e.g. remove all line breaks, so that the [\s]+ part is no longer needed.
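A rough sketch of that pre-processing step follows. It collapses every run of whitespace into a single space instead of deleting line breaks outright, which is a slight variation chosen so that words on adjacent lines do not get glued together. The multi-line $html value and the simplified pattern are assumptions for illustration only.

```php
<?php
// Hypothetical multi-line markup standing in for the fetched page.
$html = "<a href=\"/event/83318\">\n        FIESTA ERASMUS \n    </a>";

// Collapse each run of whitespace (spaces, tabs, line breaks) into one space.
$flat = trim(preg_replace('/\s+/', ' ', $html));

// With line breaks gone, the pattern only has to allow an optional space.
preg_match('%<a href="(.*?)"> ?(.*?) ?</a>%i', $flat, $matches);

$url   = $matches[1]; // link target
$title = $matches[2]; // link text without surrounding spaces
```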

You can try it online here.
