Reputation: 89
I am working with a webBrowser in C# and I need to get the text from the link. The link is just a href without a class.
its like this
<div class="class1" title="myfirstClass">
<a href="link.php">text I want read in C#
<span class="order-level"></span>
Shouldn't it be something like this?
HtmlElementCollection theElementCollection = default(HtmlElementCollection);
theElementCollection = webBrowser1.Document.GetElementsByTagName("div");
foreach (HtmlElement curElement in theElementCollection)
{
if (curElement.GetAttribute("className").ToString() == "class1")
{
HtmlElementCollection childDivs = curElement.Children.GetElementsByName("a");
foreach (HtmlElement childElement in childDivs)
{
MessageBox.Show(childElement.InnerText);
}
}
}
Upvotes: 3
Views: 2313
Reputation: 2952
This is how you get the element by tag name:
String elem = webBrowser1.Document.GetElementsByTagName("div");
And with this you should extract the value of the href:
var hrefLink = XElement.Parse(elem)
.Descendants("a")
.Select(x => x.Attribute("href").Value)
.FirstOrDefault();
If you have more then 1 "a" tag in it, you could also put in a foreach loop if that is what you want.
EDIT:
With XElement:
You can get the content including the outer node by calling element.ToString()
.
If you want to exclude the outer tag, you can call String.Concat(element.Nodes())
.
To get innerHTML with HtmlAgilityPack
:
HtmlWeb web = new HtmlWeb();
HtmlDocument dc = web.Load("Your_Url");
var s = dc.DocumentNode.SelectSingleNode("//a[@name="a"]").InnerHtml;
I hope it helps!
Upvotes: 1
Reputation: 4598
Here I created the console app to extract the text of anchor.
static void Main(string[] args)
{
string input = "<div class=\"class1\" title=\"myfirstClass\"><a href=\"link.php\">text I want read in C#<span class=\"order-level\"></span>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(input);
foreach (HtmlNode item in doc.DocumentNode.Descendants("div"))
{
var link = item.Descendants("a").First();
var text = link.InnerText.Trim();
Console.Write(text);
}
Console.ReadKey();
}
Note this is htmlagilitypack
question so tag the question properly.
Upvotes: 1