YQL - CDATA ]]> error when selecting data using YQL

Question

I'm trying to scrape data from totalfilm.com using YQL, but I'm getting a strange error:

"The character sequence "]]>" must not appear in content unless used to mark the end of a CDATA section."

select * from html where url="www.totalfilm.com"


salathe · Accepted Answer

As commented, some fudging may need to occur to get the broken XHTML working as you would like.

Here is a quick, very crude open data table for you which strips any <![CDATA[ and ]]> from an (X)HTML page (and also runs it through Tidy) before applying an optional XPath expression, as in the normal html table, to get at the data you need.



You can use it like:

use "https://github.com/salathe/yql-tables/raw/examples/data/nocdata.xml" as html;
select * from html where url="www.totalfilm.com"
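
To illustrate the stripping step described above, here is a minimal sketch in Python. It is not the code inside the open data table. The names `CDATA_MARKERS` and `strip_cdata_markers` and the sample markup are made up for this example, and it covers neither the Tidy pass nor the XPath step. It only shows the idea of removing the CDATA delimiters while keeping the text they wrap, so that a stray ]]> no longer trips the XML parser.

```python
import re

# Matches only the CDATA delimiters themselves, not the text between them.
# (Hypothetical helper for illustration; not taken from the nocdata table.)
CDATA_MARKERS = re.compile(r"<!\[CDATA\[|\]\]>")


def strip_cdata_markers(markup: str) -> str:
    """Remove every '<![CDATA[' and ']]>' but keep the enclosed content."""
    return CDATA_MARKERS.sub("", markup)


# Made-up sample: a script wrapped in CDATA plus a stray ']]>' in content.
sample = (
    "<script>//<![CDATA[\nvar x = 1;\n//]]></script>"
    "<p>stray ]]> here</p>"
)
cleaned = strip_cdata_markers(sample)
print(cleaned)
```

A plain regular expression is enough here because the goal is only to drop the delimiters before the page is parsed. Anything that must treat CDATA sections as real XML nodes would need a proper parser instead.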
