Reputation: 166329
I'm using boss.search
table to extract the data in XML, but I would like to retrieve specific fields.
The standard syntax is:
SELECT * from boss.search WHERE q="stackoverflow" AND ck = 'ConsumerKey' AND secret = 'ConsumerSecret';
which works.
But when I'm trying to select fields like "/title"
or "/results/result/title"
, I've got the syntax error:
Query syntax error(s) [line 1:7 expecting fields_or_star got '/result/title']
What's the correct syntax of retrieving fields (such as title
)?
The REST response looks like:
<results>
<bossresponse>
<web>
<results>
<result>
<date/>
<clickurl>https://example.com/</clickurl>
<url>https://example.com/</url>
<dispurl type="default">https://example.com/</dispurl>
<title type="default">Example</title>
<abstract type="default">Example</abstract>
</result>
<result>
Upvotes: 1
Views: 604
Reputation: 166329
You can check that at 'Syntax of SELECT - Specifying the Elements Returned' (cached) Yahoo Query Language documentation.
If the fields in the result set contain sub-fields, you can indicate the sub-fields by using periods (dots) as delimiters.
Custom tables
For example, for the social.profile
table, to get only the imageUrl
sub-field of the image
field from the following example:
. . .
<results>
<profile xmlns="http://social.yahooapis.com/v1/schema.rng">
<image>
<imageUrl>http://l.yimg.com/us.yimg.com/i/identity/nopic_192.gif</imageUrl>
</image>
</profile>
</results>
enter the following:
select image.imageUrl from social.profile where guid=me
BOSS API
In BOSS response it's a bit different. To get title from boss.search table, you need to name the field as: web.results.result.title
, e.g.
SELECT web.results.result.title FROM boss.search WHERE q="stackoverflow"
HTML
To get just the content from an HTML page, you can specify content
keyword after the word SELECT
. A statement with the content
keyword processes the HTML in the following order:
content
within the elements found.content
is not found, the statement looks for an attribute named content
.content
is found, the statement returns the element's textContent
.For example, the following statement extracts only the HTML links (href
tags) for Yahoo Groups:
SELECT href FROM html WHERE url="http://groups.yahoo.com/search?query=surfing&sort=relevance" and compat="html5" AND xpath='//li[contains(@class,"hbox groupsSearch-result-entry")]/h4/a'
The following statement, for example, returns the textContent of each anchor a
tag retrieved by the XPath expression:
SELECT content FROM html WHERE url="http://groups.yahoo.com/search?query=surfing&sort=relevance" AND compat="html5" AND xpath='//li[contains(@class,"hbox groupsSearch-result-entry")]/h4/a'
Source: Extracting HTML - Using YQL and Open Data Tables at YQL Guide
Upvotes: 1