Reputation: 1265
Preamble
I'm using the WebBrowser control, which a user will interact with, so a solution will need to work with a visible WebBrowser control.
Question
How do I check whether an element has an anchor as a child? All browsers are able to detect that an element contains an anchor (<a href=""...) and offer "open in new tab" functionality. That is what I am attempting to replicate. However, when I right-click on an HtmlElement, I'm only able to obtain the parent element.
Example
Taking the BBC website as an example, when I right-click on the highlighted element (picture below), my output is DIV, but viewing the source code there is an anchor element as a child of this div.
SSCCE
using System;
using System.Diagnostics;
using System.Windows.Forms;

namespace BrowserLinkClick
{
    public partial class Form1 : Form
    {
        private WebBrowser wb;
        private bool firstLoad = true;

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            wb = new WebBrowser();
            wb.Dock = DockStyle.Fill;
            Controls.Add(wb);
            wb.Navigate("http://bbc.co.uk");
            wb.DocumentCompleted += wb_DocumentCompleted;
        }

        private void Document_MouseDown(object sender, HtmlElementEventArgs e)
        {
            if (e.MouseButtonsPressed == MouseButtons.Right)
            {
                HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
                //I assume I need to check if this element has child elements that contain a TagName "A"
                if (element.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
                else
                    Debug.WriteLine(element.TagName);
            }
        }

        private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (firstLoad)
            {
                wb.Document.MouseDown += new HtmlElementEventHandler(Document_MouseDown);
                firstLoad = false;
            }
        }
    }
}
Please test any proposed solution using the BBC website and the highlighted headline (the headline changes, but the DOM remains the same).
Upvotes: 4
Views: 2566
Reputation: 5151
The challenge with the BBC website is that it takes a slightly non-standard approach to its URLs. Below is a sample of one of their anchor elements:
<A tabIndex=-1 aria-hidden=true class=block-link__overlay-link href="http://www.bbc.com/news/world-africa-36132482" rev=hero5|overlay>Barbie challenges the 'white saviour complex' </A>
So you need two checks in the if condition:
1. element.TagName == "A"
2. that the href attribute is present, like this: element.GetAttribute("href")
Together, these two checks guarantee that you are dealing with an a tag and that the tag has an href attribute. Here is an example of usage:
private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        //I assume I need to check if this element has child elements that contain a TagName "A"
        // The element is an anchor and its href attribute is non-empty.
        if (element.TagName == "A" && !string.IsNullOrEmpty(element.GetAttribute("href")))
        {
            Debug.WriteLine("Get link location, open in new tab.");
            var url = element.GetAttribute("href");
            Debug.WriteLine(url);
        }
        else
            Debug.WriteLine(element.TagName);
    }
}
Upvotes: 1
Reputation: 61
There has to be something else wrong with your program. On the BBC website your code works for the news articles (although I see the non-UK version of the site). On other websites where there are anchor elements as children, the code below works:
private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        if (element.Children.Count > 0)
        {
            foreach (HtmlElement child in element.Children)
            {
                if (child.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
            }
        }
        else
        {
            //I assume I need to check if this element has child elements that contain a TagName "A"
            if (element.TagName == "A")
                Debug.WriteLine("Get link location, open in new tab.");
            else
                Debug.WriteLine(element.TagName);
        }
    }
}
Upvotes: 2
Reputation: 5151
I propose the following solution. The url variable will hold the URL you are after; you'll be able to see it in the debugger's output window.
private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        //I assume I need to check if this element has child elements that contain a TagName "A"
        if (element.TagName == "A")
        {
            Debug.WriteLine("Get link location, open in new tab.");
            var urlRaw = element.OuterHtml;
            string hrefBegin = "href=";
            var idxStart = urlRaw.IndexOf(hrefBegin);
            // Skip past href= and the opening quote (assumes the value is double-quoted).
            var idxHref = idxStart + hrefBegin.Length + 1;
            // Only look for the closing quote if href= was actually found.
            var idxEnd = idxStart < 0 ? -1 : urlRaw.IndexOf("\"", idxHref);
            if (idxEnd > idxHref)
            {
                var url = urlRaw.Substring(idxHref, idxEnd - idxHref);
                Debug.WriteLine(url);
            }
        }
        else
            Debug.WriteLine(element.TagName);
    }
}
Upvotes: 2
Reputation: 3246
To access the needed properties you need to cast the HtmlElement's underlying DOM object (its DomElement property) to one of the unmanaged MSHTML interfaces, e.g. IHTMLAnchorElement.
You have to add the Microsoft HTML Object Library COM reference to your project. (The file name is mshtml.tlb.)
foreach (HtmlElement child in element.Children)
{
    if (String.Equals(child.TagName, "a", StringComparison.OrdinalIgnoreCase))
    {
        var anchorElement = (mshtml.IHTMLAnchorElement)child.DomElement;
        Console.WriteLine("href: [{0}]", anchorElement.href);
    }
}
There are plenty of such interfaces, but MSDN will help you choose. :) See: Scripting Object Interfaces (MSHTML)
Upvotes: 2
Reputation: 7666
You have to get the child elements of element and check whether each one is an anchor:
HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
foreach (HtmlElement child in element.Children)
{
    if (child.TagName == "A")
        Debug.WriteLine("Get link location, open in new tab.");
}
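Note that Children only returns direct children, so an anchor nested more deeply, or one that wraps the clicked element, would be missed. Below is a minimal sketch of a broader lookup, assuming the link may be the element itself, one of its ancestors, or any descendant. The AnchorFinder class and its FindHref and IsLink methods are hypothetical names introduced here for illustration. It has not been tested against the BBC page.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper, not part of the original code.
static class AnchorFinder
{
    // Returns the href of the first anchor found on the element itself,
    // on one of its ancestors, or among its descendants; null if none.
    public static string FindHref(HtmlElement element)
    {
        if (element == null)
            return null;

        // 1. The element itself, then each ancestor in turn.
        for (HtmlElement current = element; current != null; current = current.Parent)
        {
            if (IsLink(current))
                return current.GetAttribute("href");
        }

        // 2. Any descendant anchor, at any depth.
        foreach (HtmlElement candidate in element.GetElementsByTagName("a"))
        {
            if (IsLink(candidate))
                return candidate.GetAttribute("href");
        }

        return null;
    }

    // True when the element is an anchor with a non-empty href.
    private static bool IsLink(HtmlElement element)
    {
        return string.Equals(element.TagName, "A", StringComparison.OrdinalIgnoreCase)
            && !string.IsNullOrEmpty(element.GetAttribute("href"));
    }
}
```

In Document_MouseDown you could then call AnchorFinder.FindHref(element) and treat a non-null result as the link to open in a new tab. Checking ancestors before descendants is a design choice: it favours the link that encloses the clicked element over a link that merely sits somewhere inside a large container.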
Upvotes: 2