smgsaga

Reputation: 23

Use XPath to query HTML in iOS

I have been struggling for two days to query the text and value of HTML <select><option> elements, with no luck so far.

I have an HTML document that contains a select element like the one below:

<select name="ctl00$ContentPlaceHolder1$ddlAreas" id="ctl00_ContentPlaceHolder1_ddlAreas">
    <option value="01">Area1</option>
    <option value="02">Area2</option>
    <option value="03">Area3</option>
    <option value="04">Area4</option>
</select>

I am using this XPath expression to retrieve the inner text of each option (Area1, Area2, Area3, Area4, ...):

//select[@id=\"ctl00_ContentPlaceHolder1_ddlAreas\"]/option/text()

and this XPath expression to retrieve the value of each option (01, 02, 03, 04, ...):

//select[@id=\"ctl00_ContentPlaceHolder1_ddlAreas\"]/option/@value

Actually, I want both the inner text and the value extracted and joined by a delimiter such as "#". The output I would like is:

Area1#01
Area2#02
Area3#03
Area4#04
...

I tried to use the concat() function:

//select[@id=\"ctl00_ContentPlaceHolder1_ddlAreas\"]/option/[concat(/text(),\"#\",/@value)]

but it seems that only the first option (Area1) is returned, without any delimiter.

I would appreciate it if anybody could figure out a solution.

Upvotes: 1

Views: 761

Answers (2)

Mathias Müller

Reputation: 22647

Could there be a better solution that retrieves both the text and the value at once via a single XPath expression?

No, this cannot be done with a single XPath 1.0 expression. The reason why a solution using concat(), such as

concat(//select[@id = 'ctl00_ContentPlaceHolder1_ddlAreas']/option/text(),"#",//select[@id = 'ctl00_ContentPlaceHolder1_ddlAreas']/option/@value)

only returns the first result:

Area1#01

is that an XPath 1.0 function that expects a string argument, such as concat(), converts a node-set it is handed to the string value of its first node in document order and ignores all the rest. Also, in XPath 1.0, a function call cannot be used as a step in a path expression.
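To see this first-node behaviour directly, here is a minimal sketch that evaluates a string-valued XPath 1.0 expression with libxml2, a C library with an XPath 1.0 engine. The helper name EvaluateXPathString and the shortened two-option HTML are made up for illustration, and the sketch assumes libxml2 is linked into the project with its headers on the search path.

```objc
#import <Foundation/Foundation.h>
#import <libxml/HTMLparser.h>
#import <libxml/xpath.h>

// Hypothetical helper: evaluates an XPath 1.0 expression and converts the
// result to a string, using the same conversion rules as XPath's string().
// Returns nil if the HTML cannot be parsed or the expression is invalid.
static NSString *EvaluateXPathString(NSData *htmlData, NSString *xpath) {
    htmlDocPtr doc = htmlReadMemory((const char *)htmlData.bytes, (int)htmlData.length,
                                    NULL, "UTF-8",
                                    HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING);
    if (doc == NULL) {
        return nil;
    }

    NSString *result = nil;
    xmlXPathContextPtr context = xmlXPathNewContext(doc);
    if (context != NULL) {
        xmlXPathObjectPtr object =
            xmlXPathEvalExpression((const xmlChar *)xpath.UTF8String, context);
        if (object != NULL) {
            // xmlXPathCastToString allocates a new string that must be freed.
            xmlChar *value = xmlXPathCastToString(object);
            if (value != NULL) {
                result = [NSString stringWithUTF8String:(const char *)value];
                xmlFree(value);
            }
            xmlXPathFreeObject(object);
        }
        xmlXPathFreeContext(context);
    }
    xmlFreeDoc(doc);
    return result;
}

int main(void) {
    @autoreleasepool {
        // Shortened sample document (assumption: two options are enough to show the effect).
        NSString *html = @"<select id=\"ctl00_ContentPlaceHolder1_ddlAreas\">"
                         @"<option value=\"01\">Area1</option>"
                         @"<option value=\"02\">Area2</option></select>";
        NSString *xpath =
            @"concat(//select[@id='ctl00_ContentPlaceHolder1_ddlAreas']/option/text(), '#', "
            @"//select[@id='ctl00_ContentPlaceHolder1_ddlAreas']/option/@value)";
        NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
        // Each node-set argument collapses to its first node, so only one pair comes back.
        NSLog(@"%@", EvaluateXPathString(data, xpath));
    }
    return 0;
}
```

Under these assumptions the logged string should be the single pair Area1#01, even though the document holds two options: each node-set argument of concat() is reduced to its first node before the strings are joined.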

In XPath 2.0 you could have

//select[@id = 'ctl00_ContentPlaceHolder1_ddlAreas']/option/concat(.,'#',@value)

and concat() would be applied to each option element in turn.


To sum up, this cannot be done with pure XPath 1.0. Retrieve all the option element nodes with an XPath expression and process them further outside of XPath, in the higher-level language you embed XPath in, as exemplified in myte's answer.

Upvotes: 0

myte

Reputation: 877

You could use an XML/HTML parser such as TFHpple (hpple) to parse your HTML:

https://github.com/topfunky/hpple

#import "TFHpple.h"

NSString * html = @"<select name=\"ctl00$ContentPlaceHolder1$ddlAreas\" id=\"ctl00_ContentPlaceHolder1_ddlAreas\"><option value=\"01\">Area1</option><option value=\"02\">Area2</option><option value=\"03\">Area3</option><option value=\"04\">Area4</option></select>";

NSData* data = [html dataUsingEncoding:NSUTF8StringEncoding];

TFHpple *parser = [TFHpple hppleWithHTMLData:data];
NSString *optionPath = @"//select[@id=\"ctl00_ContentPlaceHolder1_ddlAreas\"]/option";
NSArray *optionNodes = [parser searchWithXPathQuery:optionPath];

for (TFHppleElement *element in optionNodes) {
    // Each <option> carries its value as an attribute and its label as inner text.
    NSString *value = [[element attributes] objectForKey:@"value"];
    if (value) {
        // Join label and value with the "#" delimiter, e.g. "Area1#01".
        NSString *str = [NSString stringWithFormat:@"%@#%@", element.text, value];
        NSLog(@"%@", str);
    }
}

The output is:

Area1#01
Area2#02
Area3#03
Area4#04

Upvotes: 1
