How to extract the text value of a given attribute using XPath?

Question

I want to extract the text within the content attribute using XPath.

I want to select only "football,cricket,Rugby,Volleyball".

I'm using C# with HtmlAgilityPack.

This is how I tried to do it, but it did not work:

private void scrapBtn_Click(object sender, EventArgs e)
        {
            string url = urlTextBox.Text;
            HtmlWeb web = new HtmlWeb();
            HtmlAgilityPack.HtmlDocument doc = web.Load(url);


            try
            {
                var node = doc.DocumentNode.SelectSingleNode("//head/title/text()");
                var node1 = doc.DocumentNode.SelectSingleNode("//head/meta[@name='DESCRIPTION']/@content");

                try
                {
                    label4.Text = "Title:";
                    label4.Text += "\t" + node.Name.ToUpper() + ": " + node.OuterHtml;
                }
                catch (NullReferenceException)
                {
                    MessageBox.Show(url + "does not contain ", "Oppz, Sorry");
                }

                try
                {
                    label4.Text += "\nMeta Keywords:";
                    label4.Text += "\n\t" + node1.Name.ToUpper() + ": " + node1.OuterHtml;
                }
                catch (NullReferenceException)
                {
                    MessageBox.Show(url + "does not contain <meta='Keywords'>", "Oppz, Sorry");
                }

            }
            catch(Exception ex){
                MessageBox.Show(ex.StackTrace, "Oppz, Sorry");
            }
        }

Martin Honnen · Accepted Answer

With HTML Agility Pack you can use doc.DocumentNode.SelectSingleNode("/html/head/meta[@name = 'keywords']").Attributes["content"].Value. I think their XPath support for attribute nodes is a bit odd, so it is better to select the element, then use the Attributes property to select the attribute and the Value property to extract its value. If you want to use pure XPath to get the attribute value as a string, use doc.CreateNavigator().Evaluate("string(/html/head/meta[@name = 'keywords']/@content)").
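To make the two approaches concrete, here is a minimal sketch that runs both against an inline HTML string instead of a live URL. The sample markup, the class name MetaContentExample, and the helper names ViaAttributes and ViaXPath are assumptions made up for illustration; the question does not show the actual page. Note that XPath string comparison is case-sensitive, so the value in the predicate ('keywords' here) has to match the page's name attribute exactly.

```csharp
using System;
using HtmlAgilityPack;

public static class MetaContentExample
{
    // Hypothetical sample page; the real markup is not shown in the question.
    public const string SampleHtml =
        "<html><head><title>Sports</title>" +
        "<meta name=\"keywords\" content=\"football,cricket,Rugby,Volleyball\">" +
        "</head><body></body></html>";

    // Approach 1: select the <meta> element, then read the attribute from it.
    public static string ViaAttributes(HtmlDocument doc)
    {
        HtmlNode meta = doc.DocumentNode.SelectSingleNode(
            "/html/head/meta[@name = 'keywords']");
        // Null checks avoid relying on NullReferenceException when the
        // element or the attribute is missing.
        return meta?.Attributes["content"]?.Value;
    }

    // Approach 2: let XPath convert the attribute node to a string.
    public static string ViaXPath(HtmlDocument doc)
    {
        object result = doc.CreateNavigator().Evaluate(
            "string(/html/head/meta[@name = 'keywords']/@content)");
        return result as string;
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        Console.WriteLine(ViaAttributes(doc));
        Console.WriteLine(ViaXPath(doc));
    }
}
```

In the click handler from the question, the same idea would mean dropping /@content from the SelectSingleNode expression and reading Attributes["content"].Value from the returned node, rather than node1.OuterHtml.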
