spaceplane

Reputation: 607

How can I get the HTML from a Microsoft.Toolkit WebView?

I am using Microsoft.Toolkit.Forms.UI.Controls.WebView, available from https://www.nuget.org/packages/Microsoft.Toolkit.Forms.UI.Controls.WebView

I have added the WebView into my WinForms application - Designer code:

namespace Testing
{
    partial class TestForm
    {
        /// <summary>
        /// Required designer variable.
        /// </summary>
        private System.ComponentModel.IContainer components = null;

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        /// <param name="disposing">true if managed resources should be disposed; otherwise, false.</param>
        protected override void Dispose(bool disposing)
        {
            if (disposing && (components != null))
            {
                components.Dispose();
            }
            base.Dispose(disposing);
        }

        #region Windows Form Designer generated code

        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        { 
            this.pnlWebPage = new System.Windows.Forms.Panel();
            this.webView = new Microsoft.Toolkit.Forms.UI.Controls.WebView();
            this.pnlWebPage.SuspendLayout();
            ((System.ComponentModel.ISupportInitialize)(this.webView)).BeginInit();
            this.SuspendLayout();                            
            // 
            // pnlWebPage
            // 
            this.pnlWebPage.Controls.Add(this.webView);
            this.pnlWebPage.Dock = System.Windows.Forms.DockStyle.Fill;
            this.pnlWebPage.Location = new System.Drawing.Point(0, 101);
            this.pnlWebPage.Margin = new System.Windows.Forms.Padding(4);
            this.pnlWebPage.Name = "pnlWebPage";
            this.pnlWebPage.Size = new System.Drawing.Size(1205, 624);
            this.pnlWebPage.TabIndex = 3;
            // 
            // webView
            // 
            this.webView.Dock = System.Windows.Forms.DockStyle.Fill;
            this.webView.Location = new System.Drawing.Point(0, 0);
            this.webView.Margin = new System.Windows.Forms.Padding(4);
            this.webView.MinimumSize = new System.Drawing.Size(27, 25);
            this.webView.Name = "webView";
            this.webView.Size = new System.Drawing.Size(1205, 624);
            this.webView.Source = new System.Uri("https://www.bbc.com", System.UriKind.Absolute);
            this.webView.TabIndex = 0;
            this.webView.NavigationCompleted += new System.EventHandler<Microsoft.Toolkit.Win32.UI.Controls.Interop.WinRT.WebViewControlNavigationCompletedEventArgs>(this.webView_NavigationCompleted);
            // 
            // TestForm
            // 
            this.AutoScaleDimensions = new System.Drawing.SizeF(8F, 16F);
            this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
            this.ClientSize = new System.Drawing.Size(1505, 725);
            this.Controls.Add(this.pnlWebPage);           
            this.Margin = new System.Windows.Forms.Padding(4);
            this.Name = "TestForm";
            this.Text = "TestForm";
            this.pnlWebPage.ResumeLayout(false);
            ((System.ComponentModel.ISupportInitialize)(this.webView)).EndInit();
            this.ResumeLayout(false);

        }

        #endregion

        private System.Windows.Forms.Panel pnlWebPage;
        private Microsoft.Toolkit.Forms.UI.Controls.WebView webView;
    }
}

Once a website has loaded into the WebView, I want to be able to extract the page's HTML and store it as a string or similar.

I've added the following method, which is called when the WebView's NavigationCompleted event fires.

private async void wvLI_NavigationCompleted(object sender, Microsoft.Toolkit.Win32.UI.Controls.Interop.WinRT.WebViewControlNavigationCompletedEventArgs e)
{
    // HTML extraction to go here
}

However, I can't see a way to access the HTML: there doesn't seem to be a property or method for it. When researching this, I could only find questions and answers relating to Android or Xamarin, and none of those solutions work for the Microsoft Toolkit WebView. Does anyone know how I can achieve this?

Upvotes: 0

Views: 613

Answers (1)

spaceplane

Reputation: 607

You can invoke the JavaScript eval() function by calling InvokeScriptAsync (or InvokeScript) to get the outerHTML (or innerHTML or whatever you need) as follows:

private async void wvLI_NavigationCompleted(object sender, Microsoft.Toolkit.Win32.UI.Controls.Interop.WinRT.WebViewControlNavigationCompletedEventArgs e)
{
    string html = await webView.InvokeScriptAsync("eval", new string[] { "document.documentElement.outerHTML;" });
}

It is important that this goes in the NavigationCompleted handler rather than, say, DOMContentLoaded, because the page is only fully loaded once NavigationCompleted fires.
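
As a rough sketch, a variant of the handler above could also skip failed navigations and keep the result around for later use. This assumes the event args expose an IsSuccess property (worth checking against your Toolkit version), and lastPageHtml is a hypothetical field name introduced here purely for illustration:

```csharp
using System;
using System.Diagnostics;
using Microsoft.Toolkit.Win32.UI.Controls.Interop.WinRT;

namespace Testing
{
    partial class TestForm
    {
        // Hypothetical field (not in the original post): the most recently captured page source.
        private string lastPageHtml;

        private async void wvLI_NavigationCompleted(object sender, WebViewControlNavigationCompletedEventArgs e)
        {
            // Assumption: IsSuccess is false when the navigation failed,
            // in which case there is no useful page to read.
            if (!e.IsSuccess)
            {
                lastPageHtml = null;
                return;
            }

            try
            {
                // Same call as in the answer: evaluate a JavaScript expression
                // in the page and get its result back as a string.
                lastPageHtml = await webView.InvokeScriptAsync(
                    "eval",
                    new string[] { "document.documentElement.outerHTML;" });
            }
            catch (Exception ex)
            {
                // Script invocation can throw; log it and treat the HTML as unavailable.
                Debug.WriteLine("Could not read page HTML: " + ex.Message);
                lastPageHtml = null;
            }
        }
    }
}
```

The try/catch is there because the handler is async void, so an exception thrown by the script call would otherwise go unhandled.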

Upvotes: 1
