TEK

Reputation: 1265

Obtain child anchor element within WebBrowser control

Preamble

I'm using the WebBrowser control, which a user will interact with, so a solution will need to work with a visible WebBrowser control.

Question

How do I check whether an element has an anchor as a child? All browsers can tell that an element contains an anchor (<a href=""...) and offer "open in new tab" functionality. That is what I am attempting to replicate. However, when I right-click on an HtmlElement I'm only able to obtain the parent element.

Example

Taking the BBC website as an example, when I right click on the highlighted element (picture below), my output is DIV, but viewing the source code there is an anchor element as a child of this div.

[Image: BBC homepage example]

SSCCE

using System;
using System.Diagnostics;
using System.Windows.Forms;

namespace BrowserLinkClick
{
    public partial class Form1 : Form
    {
        private WebBrowser wb;
        private bool firstLoad = true;

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            wb = new WebBrowser();
            wb.Dock = DockStyle.Fill;
            Controls.Add(wb);
            wb.Navigate("http://bbc.co.uk");
            wb.DocumentCompleted += wb_DocumentCompleted;
        }

        private void Document_MouseDown(object sender, HtmlElementEventArgs e)
        {
            if (e.MouseButtonsPressed == MouseButtons.Right)
            {
                HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
                //I assume I need to check if this element has child elements that contain a TagName "A"
                if (element.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
                else
                    Debug.WriteLine(element.TagName);
            }
        }


        private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (firstLoad)
            {
                wb.Document.MouseDown += new HtmlElementEventHandler(Document_MouseDown);
                firstLoad = false;
            }
        }

    }
}

Please test any proposed solution using the BBC website and the highlighted headline (the headline changes, but the DOM remains the same).

Upvotes: 4

Views: 2566

Answers (5)

Yuriy Zaletskyy

Reputation: 5151

The challenge with the BBC website is that it takes a slightly non-standard approach to its links. Below is a sample of one of its anchors:

<A tabIndex=-1 aria-hidden=true class=block-link__overlay-link href="http://www.bbc.com/news/world-africa-36132482" rev=hero5|overlay>Barbie challenges the 'white saviour complex' </A>

So you need two checks in the if:
1. element.TagName == "A"
2. a non-empty href attribute, read like this: element.GetAttribute("href")

Together, those two checks guarantee that you are dealing with an a tag and that the tag has an href attribute. Here is an example of their use:

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        // Treat the element as a link only if it is an anchor with a non-empty href.
        if (element.TagName == "A" && !string.IsNullOrEmpty(element.GetAttribute("href")))
        {
            Debug.WriteLine("Get link location, open in new tab.");
            var url = element.GetAttribute("href");
            Debug.WriteLine(url);
        }
        else
            Debug.WriteLine(element.TagName);
    }
}

Upvotes: 1

Ryan Ward

Reputation: 61

There has to be something else wrong with your program. On the BBC website your code works for the news articles (although I see the non-UK version of the site). On other websites where there are anchor elements as children, the code below works:

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        if (element.Children.Count > 0)
        {
            // Look for an anchor among the direct children of the clicked element.
            foreach (HtmlElement child in element.Children)
            {
                if (child.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
            }
        }
        else
        {
            // No children: check whether the clicked element itself is an anchor.
            if (element.TagName == "A")
                Debug.WriteLine("Get link location, open in new tab.");
            else
                Debug.WriteLine(element.TagName);
        }
    }
}

Upvotes: 2

Yuriy Zaletskyy

Reputation: 5151

I propose the following solution.
The url variable will hold the URL you are after; you'll be able to see it in the debugger's output window.

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        if (element.TagName == "A")
        {
            Debug.WriteLine("Get link location, open in new tab.");
            // Cut the URL out of the raw markup; this assumes the href value is double-quoted.
            var urlRaw = element.OuterHtml;
            string hrefBegin = "href=\"";
            var idxHref = urlRaw.IndexOf(hrefBegin, StringComparison.OrdinalIgnoreCase);
            if (idxHref >= 0)
            {
                idxHref += hrefBegin.Length;
                var idxEnd = urlRaw.IndexOf('"', idxHref);
                if (idxEnd >= 0)
                {
                    var url = urlRaw.Substring(idxHref, idxEnd - idxHref);
                    Debug.WriteLine(url);
                }
            }
        }
        else
            Debug.WriteLine(element.TagName);
    }
}
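
The string slicing above relies on the href value being wrapped in double quotes. As a rough sketch of a more tolerant alternative, a regular expression can also accept single-quoted and unquoted values. HrefParser and ExtractHref are hypothetical names used only for illustration, and the pattern is deliberately simple: it would, for instance, also match an attribute such as data-href.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefParser
{
    // Matches href="...", href='...' or an unquoted href=value, ignoring case.
    private static readonly Regex HrefPattern = new Regex(
        @"\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^\s>]+))",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the href value found in the markup, or null if there is none.
    public static string ExtractHref(string outerHtml)
    {
        if (string.IsNullOrEmpty(outerHtml))
            return null;

        Match match = HrefPattern.Match(outerHtml);
        return match.Success ? match.Groups["url"].Value : null;
    }
}
```

In the handler this would be called as HrefParser.ExtractHref(element.OuterHtml). When the element is already known to be an anchor, element.GetAttribute("href") is probably the simpler route and avoids parsing markup at all.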

Upvotes: 2

Gabor

Reputation: 3246

To access the properties you need, cast the HtmlElement's DomElement to one of the unmanaged MSHTML interfaces, e.g. IHTMLAnchorElement.

You have to add Microsoft HTML Object Library COM reference to your project.
(The file name is mshtml.tlb.)

foreach (HtmlElement child in element.Children)
{
    if (String.Equals(child.TagName, "a", StringComparison.OrdinalIgnoreCase))
    {
        var anchorElement = (mshtml.IHTMLAnchorElement)child.DomElement;
        Console.WriteLine("href: [{0}]", anchorElement.href);
    }
}

There are plenty of such interfaces but MSDN will help you choose. :)

Scripting Object Interfaces (MSHTML)

Upvotes: 2

diiN__________

Reputation: 7666

You have to get the child elements of element before checking if it's an anchor:

HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
foreach (HtmlElement child in element.Children)
{
    if (child.TagName == "A")
        Debug.WriteLine("Get link location, open in new tab.");
}
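
Note that Children only yields direct children, while the anchor may sit deeper in the subtree or, if the click landed on something inside a link, higher up among the parents. Below is a minimal sketch of both searches. It is written generically (the delegates tagOf, childrenOf and parentOf, and the class name AnchorSearch, are hypothetical) so that it does not depend on a live WebBrowser; this is an illustration under those assumptions, not a tested fix for the BBC page.

```csharp
using System;
using System.Collections.Generic;

public static class AnchorSearch
{
    // Depth-first search of 'node' and all of its descendants for the given tag.
    public static T FindInSubtree<T>(T node, string tag,
        Func<T, string> tagOf, Func<T, IEnumerable<T>> childrenOf) where T : class
    {
        if (node == null)
            return null;
        if (string.Equals(tagOf(node), tag, StringComparison.OrdinalIgnoreCase))
            return node;

        foreach (T child in childrenOf(node))
        {
            T hit = FindInSubtree(child, tag, tagOf, childrenOf);
            if (hit != null)
                return hit;
        }
        return null;
    }

    // Walks from 'node' up through its parents; returns the first match or null.
    public static T FindSelfOrAncestor<T>(T node, string tag,
        Func<T, string> tagOf, Func<T, T> parentOf) where T : class
    {
        for (T current = node; current != null; current = parentOf(current))
        {
            if (string.Equals(tagOf(current), tag, StringComparison.OrdinalIgnoreCase))
                return current;
        }
        return null;
    }
}
```

With the WebBrowser control the calls would look roughly like AnchorSearch.FindInSubtree(element, "A", el => el.TagName, el => el.Children.Cast<HtmlElement>()) (which needs using System.Linq) and AnchorSearch.FindSelfOrAncestor(element, "A", el => el.TagName, el => el.Parent). For the descendant case alone, the built-in element.GetElementsByTagName("a") does the same job without a helper.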

Upvotes: 2
