eclipxe

Reputation: 93

Anyone know of a good algorithm for rendering an HTML table to an image?

There is a standard two-pass algorithm described in RFC 1942 (http://www.ietf.org/rfc/rfc1942.txt), but I haven't seen any good real-world implementations. Does anyone know of one? I haven't been able to find anything useful in the Mozilla or WebKit code bases, but I am not entirely sure where to look.

I guess this is really part of a deeper problem, since the contents of table cells are themselves HTML that has to be rendered. To keep it simple, assume a plain-text HTML table rendered as an image. Even an HTML table rendering algorithm that ignores the "as an image" part would help.

Upvotes: 1

Views: 2526

Answers (6)

Christoph Schiessl

Reputation: 6868

Take a look at Prince XML. It's a commercial tool that renders CSS-styled XML (including XHTML) documents to PDF. It conforms to major W3C standards such as XHTML and CSS 2.1. You can try the free demo version from their homepage.

Since you want an image: it shouldn't be a big problem to convert the generated PDFs to images programmatically.

Upvotes: 0

eclipxe

Reputation: 93

One tool that comes close is: http://www.terrainformatica.com/htmlayout/main.whtm

This library offers a way to capture rendered HTML to an image. It is not open source, but it is free. Hope it is useful to some!

Unfortunately my app is cross-platform C/C++ with no MFC or other platform dependencies (nightmare!), so what I'm really looking for is a general-purpose algorithm for table rendering. I think the two-pass option from the RFC comes pretty close, so I'm probably going to just dig in and work against that. I'll be sure to blog about it and post my eventual solution here if I can!
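For anyone else reading, here is a minimal sketch of the two-pass idea as I understand the autolayout description in RFC 1942. It is not a tested implementation. It assumes plain-text cells in a monospace font (one character is one unit of width), no colspan/rowspan, and that `available` is the space left for the columns after borders and padding. The names `ColumnExtent`, `Row`, `measure_columns` and `assign_column_widths` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for this sketch; widths are in character cells.
struct ColumnExtent {
    int min_width;  // longest single word: the narrowest the column can wrap to
    int max_width;  // longest cell text: the width needed with no wrapping
};
using Row = std::vector<std::string>;

// Pass 1: scan every cell and record each column's minimum and maximum width.
std::vector<ColumnExtent> measure_columns(const std::vector<Row>& rows) {
    std::vector<ColumnExtent> cols;
    for (const Row& row : rows) {
        if (row.size() > cols.size()) cols.resize(row.size(), ColumnExtent{0, 0});
        for (std::size_t i = 0; i < row.size(); ++i) {
            cols[i].max_width = std::max(cols[i].max_width, static_cast<int>(row[i].size()));
            std::istringstream words(row[i]);
            std::string word;
            while (words >> word) {
                cols[i].min_width = std::max(cols[i].min_width, static_cast<int>(word.size()));
            }
        }
    }
    return cols;
}

// Pass 2: choose the final column widths for the space that is available.
std::vector<int> assign_column_widths(const std::vector<ColumnExtent>& cols, int available) {
    long long min_total = 0;
    long long max_total = 0;
    for (const ColumnExtent& c : cols) {
        assert(c.min_width <= c.max_width);
        min_total += c.min_width;
        max_total += c.max_width;
    }
    std::vector<int> widths;
    widths.reserve(cols.size());
    for (const ColumnExtent& c : cols) {
        if (min_total >= available) {
            widths.push_back(c.min_width);  // too narrow: the table overflows
        } else if (max_total <= available) {
            widths.push_back(c.max_width);  // everything fits without wrapping
        } else {
            // Share the slack W in proportion to how far each column can stretch (d).
            const long long W = available - min_total;
            const long long D = max_total - min_total;  // > 0 in this branch
            const long long d = c.max_width - c.min_width;
            widths.push_back(static_cast<int>(c.min_width + d * W / D));
        }
    }
    return widths;
}
```

The integer division rounds down, so a few units of `available` can be left unassigned. A real renderer would also have to measure text with actual font metrics and account for cell padding and borders.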

Upvotes: 0

Dimitry

Reputation: 6613

If you have XHTML, not plain HTML, you should be able to retrieve the content of those cells along with information about the table's structure: colspan, rowspan, etc. Using this information, you can render the table using your own border, padding and margin values.

Things get complex when you also want to honor user-defined dimensions. But for retrieving the table data and drawing it, you could use an XML parser. PHP's parser is here: https://www.php.net/xml
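To make the colspan/rowspan bookkeeping concrete, here is a rough sketch of placing parsed cells onto a grid. It assumes the XML parser has already produced, for each row, the list of cells with their spans. `Cell`, `PlacedCell` and `place_cells` are hypothetical names. The sketch clips a rowspan at the last row and does not try to resolve overlapping spans in malformed tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical input type: a parsed cell that only carries its spans.
struct Cell {
    std::size_t colspan;
    std::size_t rowspan;
};

// Where a cell ended up in the grid (its top-left slot) plus its spans.
struct PlacedCell {
    std::size_t row;
    std::size_t col;
    std::size_t colspan;
    std::size_t rowspan;
};

std::vector<PlacedCell> place_cells(const std::vector<std::vector<Cell>>& rows) {
    // taken[r][c] is true once some cell covers grid slot (r, c).
    std::vector<std::vector<bool>> taken(rows.size());
    std::vector<PlacedCell> placed;
    for (std::size_t r = 0; r < rows.size(); ++r) {
        std::size_t c = 0;
        for (const Cell& cell : rows[r]) {
            assert(cell.colspan >= 1 && cell.rowspan >= 1);
            // Skip slots already covered by a rowspan from an earlier row.
            while (c < taken[r].size() && taken[r][c]) ++c;
            // Mark every slot this cell covers, clipping at the last row.
            const std::size_t end_row = std::min(rows.size(), r + cell.rowspan);
            for (std::size_t rr = r; rr < end_row; ++rr) {
                if (taken[rr].size() < c + cell.colspan) {
                    taken[rr].resize(c + cell.colspan, false);
                }
                for (std::size_t cc = c; cc < c + cell.colspan; ++cc) {
                    taken[rr][cc] = true;
                }
            }
            placed.push_back(PlacedCell{r, c, cell.colspan, cell.rowspan});
            c += cell.colspan;
        }
    }
    return placed;
}
```

Once every cell has a grid position, drawing is a matter of summing the widths of the columns and the heights of the rows that the cell spans, then adding your own border, padding and margin values.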

Upvotes: 0

micahwittman

Reputation: 12476

If a commercial tool is an option, look at:

HtmlCapture ActiveX Control V2.0 (originally named HtmlSnap)

Some features they claim:

  • By calling SnapHtmlString(), you can take a snapshot for a html string.
  • Get snapshot images rendered by either Microsoft IE or Mozilla Firefox.
  • Just by calling SnapUrl() and SaveImage(), you can take a snapshot of a webpage into various images, such as BMP, JPG, JPEG, GIF, PNG, TIF, TGA and PCX.
  • Convert html to vector image format like EMF and WMF.
  • Self contained ActiveX control with no third party dependencies.
  • Support custom gdi output of the resulting image.
  • Support saving resulting image both to file and in memory.
  • Support saving both full-size web page and thumbnail one.
  • Take a snapshot of a whole webpage into one image without scrollbars.
  • Make grayscale or B&W images with efficient algorithms to keep the quality.
  • Support JPEG compression level, compression method selection of TIFF and GIF.
  • Support setting color depth in images while keeping the quality of the image as much as possible.
  • Selectively save activeX, image, java applets, scripts and videos on a web page as you want.
  • Send custom cookies, http headers, credentials in snapshot requests.
  • Take snapshots of webpages via a Proxy server.
  • More than 30 samples written in VC, C#, Delphi, VB, C++ Builder, Java, JScript, Perl, VBScript, ASP, ASP.net and PHP are provided.

Upvotes: 1

Gerald

Reputation: 23499

I'm not sure if this will meet your constraints or not, but you can try using IE or an IE control with MSHTML and the IHTMLElementRender interface to render the table to a device context.

Upvotes: 0

Steven A. Lowe

Reputation: 61242

HTML table rendering is non-trivial because of the various ways that cell sizes may be specified, tables nested within tables, etc.

If all you want is the image, a simple solution would be the .NET browser control (which is basically the COM component for IE) and a screen-capture function.

If you want some source to work from, the Mozilla source should still be available.

Upvotes: 0
