Kasper Hansen

Reputation: 6547

Creating a LINQ query that extracts two values

I want to extract some information from a website, and I am using HtmlAgilityPack with LINQ to query the HTML.

In this particular example, I want to get the value of the m_name parameter from the href attribute of the A tag, and the value of the src attribute of the IMG tag.

<A href="/index.php?lang=eng&ssid=&wbid=&refid=website.com&mref=&showall=0&Submit=m_info&refname=&id=37447&m_name=LacosteShoe">
    <DIV name="prdiv1" id="prdiv1" overflow:hidden;">
        <IMG name="pic1" id="pic1" class=pic_2 alt="for sale here for 2 days" title="for sale here for 2 days" src="item/preview/37447_pr2.jpg?55995" >
    </DIV>
</A>

I would like to collect these values into something like a List<string,string>, so that for this example the result is equivalent to:

list.add("LacosteShoe","item/preview/37447_pr2.jpg?55995");

Is it possible to do this in a LINQ query? It is far too advanced for my beginner's knowledge. I would also have to make sure that it doesn't fail if, for example, the href attribute doesn't exist.

I basically got this so far:

var query = document.DocumentNode.Descendants("a")
    .Where(a => a.Attributes["href"].Value.Contains("m_name="))
    .Select(/* what goes here? */);

Upvotes: 0

Views: 157

Answers (2)

Igor Ševo

Reputation: 5495

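var query = document.DocumentNode.Descendants("a")
    // GetAttributeValue returns the default ("") when href is missing, so this does not throw
    .Where(a => a.GetAttributeValue("href", "").Contains("m_name="))
    .Select(b => new {
        Name = ExtractName(b.Attributes["href"].Value),
        // First() throws if the link contains no div or img
        Link = b.Descendants("div").First()
                .Descendants("img").First().Attributes["src"].Value
    }).ToList();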

Define a function ExtractName(string str) to extract the name from the href value. You can use a Regex for this.
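One possible way to write ExtractName is sketched here. The HrefHelper class name and the exact pattern are assumptions for illustration, not part of the original answer. The sketch also assumes the href value contains plain "&" separators (not "&amp;") and returns an empty string when m_name is absent.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the name is illustrative only.
public static class HrefHelper
{
    // Finds "m_name=" preceded by '?' or '&' and captures everything
    // up to the next '&' or '#' (or the end of the string).
    private static readonly Regex NamePattern =
        new Regex(@"[?&]m_name=([^&#]*)", RegexOptions.Compiled);

    public static string ExtractName(string href)
    {
        // A missing or empty href yields an empty name instead of an exception.
        if (string.IsNullOrEmpty(href))
        {
            return string.Empty;
        }

        Match match = NamePattern.Match(href);

        // Undo percent-encoding (e.g. "%20") in the captured value.
        return match.Success
            ? Uri.UnescapeDataString(match.Groups[1].Value)
            : string.Empty;
    }
}
```

With this in place, the query can call HrefHelper.ExtractName(...) (or ExtractName directly, if the method lives in the same class as the query). For the href in the question it should return "LacosteShoe".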

Upvotes: 2

fahadash

Reputation: 3281

Try

List<string> products = document.DocumentNode.Descendants("a")
    .Where(a => a.Attributes["href"] != null
             && a.Attributes["href"].Value.Contains("m_name="))
    .Select(l => l.Attributes["href"].Value)
    // 7 is the length of "m_name="; this takes everything after it
    .Select(href => href.Substring(href.IndexOf("m_name=") + 7))
    .ToList();

Upvotes: 1
