Eugene

Reputation: 4389

Parsing an HTML page in PHP

Today, when I was parsing a page with Simple HTML DOM parser, I didn't get any results. That seemed strange, so I looked at the page's HTML source and found that it contains many mistakes.

So here is the question: what should I do when the parser works correctly but the HTML is a mess? Maybe someone could suggest an approach, or another parser that can handle this sort of markup.

Thank you all for help.

Upvotes: 3

Views: 516

Answers (2)

dogmatic69

Reputation: 7575

It seems like PHP's built-in DOM functions should work fine for HTML that is not well written. Have a read through the comments on the manual page, as some people have shared information about it.

http://docs.php.net/manual/en/domdocument.loadhtml.php
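
A minimal sketch of that approach, assuming the DOM extension is enabled. The sample markup and the `//li` query are made up for illustration; they are not from the page in the question. `libxml_use_internal_errors(true)` keeps the parser's complaints about broken markup from being emitted as PHP warnings.

```php
<?php
// Hypothetical malformed input: unclosed <li> and <p> tags.
$html = '<ul><li>One<li>Two</ul><p>Unclosed paragraph';

// Collect libxml parse errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Keep the collected errors for inspection, then restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Query the repaired tree with XPath.
$xpath = new DOMXPath($doc);
$items = [];
foreach ($xpath->query('//li') as $li) {
    $items[] = trim($li->textContent);
}

print_r($items);
```

How well the tree is recovered depends on how badly the markup is broken, so it is worth checking `$errors` when a query returns nothing.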

Upvotes: 0

David Gillen

Reputation: 1172

Run it through Tidy before trying to load it into a DOM tree: http://php.net/manual/en/book.tidy.php
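
Roughly, that could look like the sketch below. It assumes the tidy extension is installed (it is not part of every PHP build), and the sample markup and config values are only illustrative choices.

```php
<?php
// Hypothetical malformed input: unclosed <li> and <p> tags.
$html = '<ul><li>One<li>Two</ul><p>Unclosed paragraph';

if (!function_exists('tidy_repair_string')) {
    exit("The tidy extension is not available in this PHP build.\n");
}

// Ask Tidy for XHTML output and disable line wrapping.
$config = ['output-xhtml' => true, 'wrap' => 0];
$clean = tidy_repair_string($html, $config, 'utf8');
if ($clean === false) {
    exit("Tidy could not repair the markup.\n");
}

// Load the cleaned markup into a DOM tree.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($clean);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$paragraphs = $doc->getElementsByTagName('p');
echo trim($paragraphs->item(0)->textContent), "\n";
```

The cleaned string can also be handed to any other parser that expects well-formed markup, rather than to `DOMDocument`.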

Upvotes: 2
