
Reputation: 401

PHPWord HTML reader doesn't work with tables?

When I use the HTML reader to convert my HTML into docx, the reader cuts off my table.

PHP example:

$reader = IOFactory::createReader('HTML');
$phpWord = $reader->load($this->getReportDir() . '/' . $fileName);
$writer = IOFactory::createWriter($phpWord);
$writer->save($this->getReportDir() . '/' . $fileName);

Table example:

<table>
    <tr>
        <td>No.</td>
        <td>Overall results of completed checklists</td>
        <td>Number of checks passed</td>
        <td>% of total checks passed</td>
    </tr>
</table>

Upvotes: 5

Views: 12194

Answers (3)

Arivan Bastos

Reputation: 1996

The current HTML class from PHPWord is very limited. The issue you are hitting is a known issue (see

I'm working on a project that needs HTML-table-to-DOC conversion, so I spent a little time improving the Html class. It is only lightly tested, and so far I have tested DOC conversion only.

My version is able to convert the following HTML:

<table style="width: 50%; border: 6px #0000FF solid;">
        <tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">
        <tr><td>1</td><td colspan="2">2</td></tr>

It generates the following DOC table:

[image: the generated table]
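A side note on the `width: 50%` in the HTML above: Word stores percentage table widths in fiftieths of a percent, which is why the class below multiplies the value by 50. Here is a minimal standalone sketch of that mapping. The function name `cssWidthToWord` is made up for illustration and is not part of PHPWord. The pixel branch assumes 96 px per inch (15 twips per pixel), whereas the class below passes the pixel number through unchanged.

```php
<?php
// Hypothetical helper, NOT part of PHPWord: maps a CSS width to a
// width/unit pair similar to what parseWidth() in the class below produces.
function cssWidthToWord(string $cssWidth): ?array
{
    if (preg_match('/^\s*(\d+)\s*%\s*$/', $cssWidth, $m)) {
        // 'pct' widths are expressed in fiftieths of a percent.
        return ['width' => (int) $m[1] * 50, 'unit' => 'pct'];
    }
    if (preg_match('/^\s*(\d+)\s*px\s*$/', $cssWidth, $m)) {
        // Assumption: 96 px per inch, 1440 twips per inch => 15 twips per px.
        return ['width' => (int) $m[1] * 15, 'unit' => 'dxa'];
    }

    // Anything else (e.g. 'auto', 'em' units) is left to the caller.
    return null;
}

$half = cssWidthToWord('50%');
$fixed = cssWidthToWord('100px');
echo $half['width'], ' ', $half['unit'], PHP_EOL;
echo $fixed['width'], ' ', $fixed['unit'], PHP_EOL;
```

If you prefer the pass-through behaviour of the class below, drop the `* 15` factor.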

It uses PHPWord version 0.13:

<?php
/**
 * This file is part of PHPWord - A pure PHP library for reading and writing
 * word processing documents.
 *
 * PHPWord is free software distributed under the terms of the GNU Lesser
 * General Public License version 3 as published by the Free Software Foundation.
 *
 * For the full copyright and license information, please read the LICENSE
 * file that was distributed with this source code. For the full list of
 * contributors, visit
 *
 * @link
 * @copyright   2010-2016 PHPWord contributors
 * @license LGPL version 3
 */

namespace PhpOffice\PhpWord\Shared;

use PhpOffice\PhpWord\Element\AbstractContainer;
use PhpOffice\PhpWord\Element\Table;
use PhpOffice\PhpWord\Element\Row;

/**
 * Common Html functions
 *
 * @SuppressWarnings(PHPMD.UnusedPrivateMethod) For readWPNode
 */
class Html
{
    //public static $phpWord=null;

    /**
     *  Holds styles from parent elements,
     *  allowing child elements to inherit attributes.
     *  So if you want your table row to have a bold font
     *  you can do:
     *     <tr style="font-weight: bold; ">
     *  instead of
     *     <tr>
     *       <td>
     *           <p style="font-weight: bold;">
     *       ...
     *
     *  Before the children of a DOM element are processed,
     *  the parent DOM element styles are added to the stack.
     *  The styles for each child element are composed of
     *  its own styles plus the parent styles.
     */
    public static $stylesStack=null;

    /**
     * Add HTML parts.
     *
     * Note: $stylesheet parameter is removed to avoid PHPMD error for unused parameter
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element Where the parts need to be added
     * @param string $html The code to parse
     * @param bool $fullHTML If it's a full HTML, no need to add 'body' tag
     * @return void
     */
    public static function addHtml($element, $html, $fullHTML = false)
    {
        /*
         * @todo parse $stylesheet for default styles.  Should result in an array based on id, class and element,
         * which could be applied when such an element occurs in the parseNode function.
         */

        // Preprocess: remove all line ends, decode HTML entity,
        // fix ampersand and angle brackets and add body tag for HTML fragments
        $html = str_replace(array("\n", "\r"), '', $html);
        $html = str_replace(array('&lt;', '&gt;', '&amp;'), array('_lt_', '_gt_', '_amp_'), $html);
        $html = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
        $html = str_replace('&', '&amp;', $html);
        $html = str_replace(array('_lt_', '_gt_', '_amp_'), array('&lt;', '&gt;', '&amp;'), $html);

        if (false === $fullHTML) {
            $html = '<body>' . $html . '</body>';
        }

        // Load DOM
        $dom = new \DOMDocument();
        $dom->preserveWhiteSpace = true;
        $dom->loadXML($html);
        $node = $dom->getElementsByTagName('body');

        //self::$phpWord = $element->getPhpWord();
        self::$stylesStack = array();

        self::parseNode($node->item(0), $element);
    }

    /**
     * Parse the inline style of a node
     *
     * @param \DOMNode $node Node to check on attributes and to compile a style array
     * @param array $styles is supplied, the inline style attributes are added to the already existing style
     * @return array
     */
    protected static function parseInlineStyle($node, $styles = array())
    {
        if (XML_ELEMENT_NODE == $node->nodeType) {
            $stylesStr = $node->getAttribute('style');
            $styles = self::parseStyle($node, $stylesStr, $styles);
        } else {
            // Just to balance the stack
            // (make number of pushes = number of pops).
            self::pushStyles(array());
        }

        return $styles;
    }

    /**
     * Parse a node and add a corresponding element to the parent element.
     *
     * @param \DOMNode $node node to parse
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element object to add an element corresponding with the node
     * @param array $styles Array with all styles
     * @param array $data Array to transport data to a next level in the DOM tree, for example level of listitems
     * @return void
     */
    protected static function parseNode($node, $element, $styles = array(), $data = array())
    {
        // Populate styles array
        $styleTypes = array('font', 'paragraph', 'list', 'table', 'row', 'cell'); //@change
        foreach ($styleTypes as $styleType) {
            if (!isset($styles[$styleType])) {
                $styles[$styleType] = array();
            }
        }
        // Node mapping table
        $nodes = array(
                              // $method        $node   $element    $styles     $data   $argument1      $argument2
            'p'         => array('Paragraph',   $node,  $element,   $styles,    null,   null,           null),
            'h1'        => array('Heading',     null,   $element,   $styles,    null,   'Heading1',     null),
            'h2'        => array('Heading',     null,   $element,   $styles,    null,   'Heading2',     null),
            'h3'        => array('Heading',     null,   $element,   $styles,    null,   'Heading3',     null),
            'h4'        => array('Heading',     null,   $element,   $styles,    null,   'Heading4',     null),
            'h5'        => array('Heading',     null,   $element,   $styles,    null,   'Heading5',     null),
            'h6'        => array('Heading',     null,   $element,   $styles,    null,   'Heading6',     null),
            '#text'     => array('Text',        $node,  $element,   $styles,    null,   null,           null),
            'strong'    => array('Property',    null,   null,       $styles,    null,   'bold',         true),
            'em'        => array('Property',    null,   null,       $styles,    null,   'italic',       true),
            'sup'       => array('Property',    null,   null,       $styles,    null,   'superScript',  true),
            'sub'       => array('Property',    null,   null,       $styles,    null,   'subScript',    true),
            // @change
            //'table'     => array('Table',       $node,  $element,   $styles,    null,   'addTable',     true),
            //'tr'        => array('Table',       $node,  $element,   $styles,    null,   'addRow',       true),
            //'td'        => array('Table',       $node,  $element,   $styles,    null,   'addCell',      true),
            'table'     => array('Table' ,       $node,  $element,   $styles,    null,   null,     true),
            'tr'        => array('Row'   ,       $node,  $element,   $styles,    null,   null,       true),
            'td'        => array('Cell'  ,       $node,  $element,   $styles,    null,   null,      true),
            'th'        => array('Cell'  ,       $node,  $element,   $styles,    null,   null,      true),
            'ul'        => array('List',        null,   null,       $styles,    $data,  3,              null),
            'ol'        => array('List',        null,   null,       $styles,    $data,  7,              null),
            'li'        => array('ListItem',    $node,  $element,   $styles,    $data,  null,           null),
        );

        $newElement = null;
        $keys = array('node', 'element', 'styles', 'data', 'argument1', 'argument2');

        if (isset($nodes[$node->nodeName])) {
            // Execute method based on node mapping table and return $newElement or null
            // Arguments are passed by reference
            $arguments = array();
            $args = array();
            list($method, $args[0], $args[1], $args[2], $args[3], $args[4], $args[5]) = $nodes[$node->nodeName];
            for ($i = 0; $i <= 5; $i++) {
                if ($args[$i] !== null) {
                    $arguments[$keys[$i]] = &$args[$i];
                }
            }
            $method = "parse{$method}";
            $newElement = call_user_func_array(array('PhpOffice\PhpWord\Shared\Html', $method), $arguments);

            // Retrieve back variables from arguments
            foreach ($keys as $key) {
                if (array_key_exists($key, $arguments)) {
                    $$key = $arguments[$key];
                }
            }
        } else {
            // Just to balance the stack:
            // number of pushes = number of pops.
            self::pushStyles(array());
        }

        if ($newElement === null) {
            $newElement = $element;
        }

        self::parseChildNodes($node, $newElement, $styles, $data);

        // After the parent element has been processed,
        // its styles are removed from the stack.
        self::popStyles();
    }

    /**
     * Parse child nodes.
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array $styles
     * @param array $data
     * @return void
     */
    private static function parseChildNodes($node, $element, $styles, $data)
    {
        if ('li' != $node->nodeName) {
            $cNodes = $node->childNodes;
            if (count($cNodes) > 0) {
                foreach ($cNodes as $cNode) {
                    if (($element instanceof AbstractContainer) or ($element instanceof Table) or ($element instanceof Row)) { // @change
                        self::parseNode($cNode, $element, $styles, $data);
                    }
                }
            }
        }
    }

    /**
     * Parse paragraph node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return \PhpOffice\PhpWord\Element\TextRun
     */
    private static function parseParagraph($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['paragraph']);

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse heading node
     *
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Name of heading style
     * @return \PhpOffice\PhpWord\Element\TextRun
     *
     * @todo Think of a clever way of defining header styles, now it is only based on the assumption, that
     * Heading1 - Heading6 are already defined somewhere
     */
    private static function parseHeading($element, &$styles, $argument1)
    {
        $elementStyles = $argument1;

        $newElement = $element->addTextRun($elementStyles);

        return $newElement;
    }

    /**
     * Parse text node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @return null
     */
    private static function parseText($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['font']);

        $textStyles = self::getInheritedTextStyles();
        $paragraphStyles = self::getInheritedParagraphStyles();

        // Commented as source of bug #257. `method_exists` doesn't seems to work properly in this case.
        // @todo Find better error checking for this one
        // if (method_exists($element, 'addText')) {
            $element->addText($node->nodeValue, $textStyles, $paragraphStyles);
        // }

        return null;
    }

    /**
     * Parse property node
     *
     * @param array &$styles
     * @param string $argument1 Style name
     * @param string $argument2 Style value
     * @return null
     */
    private static function parseProperty(&$styles, $argument1, $argument2)
    {
        $styles['font'][$argument1] = $argument2;

        return null;
    }

    /**
     * Parse table node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param string $argument1 Method name
     * @return \PhpOffice\PhpWord\Element\AbstractContainer $element
     *
     * @todo As soon as TableItem, RowItem and CellItem support relative width and height
     */
    private static function parseTable($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['table']);

        $newElement = $element->addTable($elementStyles);

        // $attributes = $node->attributes;
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement->setWidth($attributes->getNamedItem('width')->value);
        // }

        // if ($attributes->getNamedItem('height') !== null) {
            // $newElement->setHeight($attributes->getNamedItem('height')->value);
        // }
        // if ($attributes->getNamedItem('width') !== null) {
            // $newElement=$element->addCell($width=$attributes->getNamedItem('width')->value);
        // }

        return $newElement;
    }

    private static function parseRow($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['row']);

        $newElement = $element->addRow(null, $elementStyles);

        return $newElement;
    }

    private static function parseCell($node, $element, &$styles, $argument1)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['cell']);

        // Map the HTML colspan attribute to the PHPWord gridSpan cell style.
        $colspan = $node->getAttribute('colspan');
        if (!empty($colspan)) {
            $elementStyles['gridSpan'] = (int) $colspan;
        }

        $newElement = $element->addCell(null, $elementStyles);

        return $newElement;
    }

    /**
     * Parse list node
     *
     * @param array &$styles
     * @param array &$data
     * @param string $argument1 List type
     * @return null
     */
    private static function parseList(&$styles, &$data, $argument1)
    {
        if (isset($data['listdepth'])) {
            $data['listdepth']++;
        } else {
            $data['listdepth'] = 0;
        }
        $styles['list']['listType'] = $argument1;

        return null;
    }

    /**
     * Parse list item node
     *
     * @param \DOMNode $node
     * @param \PhpOffice\PhpWord\Element\AbstractContainer $element
     * @param array &$styles
     * @param array $data
     * @return null
     *
     * @todo This function is almost the same like `parseChildNodes`. Merged?
     * @todo As soon as ListItem inherits from AbstractContainer or TextRun delete parsing part of childNodes
     */
    private static function parseListItem($node, $element, &$styles, $data)
    {
        $cNodes = $node->childNodes;
        if (count($cNodes) > 0) {
            $text = '';
            foreach ($cNodes as $cNode) {
                if ($cNode->nodeName == '#text') {
                    $text = $cNode->nodeValue;
                }
            }
            $element->addListItem($text, $data['listdepth'], $styles['font'], $styles['list'], $styles['paragraph']);
        }

        return null;
    }

    /**
     * Parse style
     *
     * @param \DOMNode $node
     * @param string $stylesStr Content of the node's style attribute
     * @param array $styles
     * @return array
     */
    private static function parseStyle($node, $stylesStr, $styles)
    {
        // Parses element styles.
        $newStyles = array();

        if (!empty($stylesStr)) {
            $properties = explode(';', trim($stylesStr, " \t\n\r\0\x0B;"));
            foreach ($properties as $property) {
                list($cKey, $cValue) = explode(':', $property, 2);
                $cValue = trim($cValue);
                switch (trim($cKey)) {
                    case 'text-decoration':
                        switch ($cValue) {
                            case 'underline':
                                $newStyles['underline'] = 'single';
                                break;
                            case 'line-through':
                                $newStyles['strikethrough'] = true;
                                break;
                        }
                        break;
                    case 'text-align':
                        $newStyles['alignment'] = $cValue; // todo: any mapping?
                        break;
                    case 'color':
                        $newStyles['color'] = trim($cValue, "#");
                        break;
                    case 'background-color':
                        $newStyles['bgColor'] = trim($cValue, "#");
                        break;

                    // @change
                    case 'colspan':
                        $newStyles['gridSpan'] = (int) $cValue;
                        break;
                    case 'font-weight':
                        if ($cValue == 'bold') {
                            $newStyles['bold'] = true;
                        }
                        break;
                    case 'width':
                        $newStyles = self::parseWidth($newStyles, $cValue);
                        break;
                    case 'border-width':
                        // The method defined below is parseBorderWidth().
                        $newStyles = self::parseBorderWidth($newStyles, $cValue);
                        break;
                    case 'border-color':
                        $newStyles = self::parseBorderColor($newStyles, $cValue);
                        break;
                    case 'border':
                        $newStyles = self::parseBorder($newStyles, $cValue);
                        break;
                }
            }
        }

        // Add styles to stack.
        self::pushStyles($newStyles);

        // Inherit parent styles (including itself).
        $inheritedStyles = self::getInheritedStyles($node->nodeName);

        // Override default styles with the inherited ones.
        $styles = array_merge($styles, $inheritedStyles);

        /* DEBUG
        if ($node->nodeName=='th') {
            echo '<pre>';
            echo '</pre>';
        }
        */

        return $styles;
    }

    /**
     *  Parses the "width" style attribute, adding to styles
     *  array the corresponding PHPWORD attributes.
     */
    public static function parseWidth($styles, $cValue)
    {
        if (preg_match('/([0-9]+)px/', $cValue, $matches)) {
            $styles['width'] = $matches[1];
            $styles['unit'] = 'dxa';
        } else if (preg_match('/([0-9]+)%/', $cValue, $matches)) {
            $styles['width'] = $matches[1]*50;
            $styles['unit'] = 'pct';
        } else if (preg_match('/([0-9]+)/', $cValue, $matches)) {
            $styles['width'] = $matches[1];
            $styles['unit'] = 'auto';
        }

        $styles['alignment'] = \PhpOffice\PhpWord\SimpleType\JcTable::START;

        return $styles;
    }

    /**
     *  Parses the "border-width" style attribute, adding to styles
     *  array the corresponding PHPWORD attributes.
     */
    public static function parseBorderWidth($styles, $cValue)
    {
        // border-width: 2px;
        if (preg_match('/([0-9]+)px/', $cValue, $matches)) {
            $styles['borderSize'] = $matches[1];
        }

        return $styles;
    }

    /**
     *  Parses the "border-color" style attribute, adding to styles
     *  array the corresponding PHPWORD attributes.
     */
    public static function parseBorderColor($styles, $cValue)
    {
        // border-color: #FFAACC;
        $styles['borderColor'] = $cValue;

        return $styles;
    }

    /**
     *  Parses the "border" style attribute, adding to styles
     *  array the corresponding PHPWORD attributes.
     */
    public static function parseBorder($styles, $cValue)
    {
        if (preg_match('/([0-9]+)px\s+(\#[a-fA-F0-9]+)\s+solid+/', $cValue, $matches)) {
            $styles['borderSize'] = $matches[1];
            $styles['borderColor'] = $matches[2];
        }

        return $styles;
    }

    /**
     *  Return the inherited styles for text elements,
     *  considering current stack state.
     */
    public static function getInheritedTextStyles()
    {
        return self::getInheritedStyles('#text');
    }

    /**
     *  Return the inherited styles for paragraph elements,
     *  considering current stack state.
     */
    public static function getInheritedParagraphStyles()
    {
        return self::getInheritedStyles('p');
    }

    /**
     *  Return the inherited styles for a given nodeType,
     *  considering current stack state.
     */
    public static function getInheritedStyles($nodeType)
    {
        $textStyles = array('color', 'bold', 'italic');
        $paragraphStyles = array('color', 'bold', 'italic', 'alignment');

        // List of PHPWord styles relevant for each element type.
        $stylesMapping = array(
            'p'         => $paragraphStyles,
            'h1'        => $textStyles,
            'h2'        => $textStyles,
            'h3'        => $textStyles,
            'h4'        => $textStyles,
            'h5'        => $textStyles,
            'h6'        => $textStyles,
            '#text'     => $textStyles,
            'strong'    => $textStyles,
            'em'        => $textStyles,
            'sup'       => $textStyles,
            'sub'       => $textStyles,
            'table'     => array('width', 'borderSize', 'borderColor', 'unit'),
            'tr'        => array('bgColor', 'alignment'),
            'td'        => array('bgColor', 'alignment'),
            'th'        => array('bgColor', 'alignment'),
            'ul'        => $textStyles,
            'ol'        => $textStyles,
            'li'        => $textStyles,
        );

        $result = array();

        if (isset($stylesMapping[$nodeType])) {
            $nodeStyles = $stylesMapping[$nodeType];

            // Loop through the styles stack, applying styles in
            // the right order.
            foreach (self::$stylesStack as $styles) {
                // Loop through all styles, applying only the ones
                // relevant for that node type.
                foreach ($styles as $name => $value) {
                    if (in_array($name, $nodeStyles)) {
                        $result[$name] = $value;
                    }
                }
            }
        }

        return $result;
    }

    /**
     *  Add the parent styles to the stack, allowing
     *  child elements to inherit from them.
     */
    public static function pushStyles($styles)
    {
        self::$stylesStack[] = $styles;
    }

    /**
     *  Remove parent styles at end of recursion.
     */
    public static function popStyles()
    {
        array_pop(self::$stylesStack);
    }
}

With this new structure it is easy to add support for new styles. You just need to edit the parseStyle() method and the $stylesMapping variable (in the getInheritedStyles() method). Hope it helps.
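To make the inheritance idea easier to follow, here is a stripped-down standalone sketch of the style stack. It does not use PHPWord at all: the class name `StyleStack` and its methods are made up for illustration, and the mapping is a shortened stand-in for $stylesMapping. Each parent pushes its parsed styles before its children are processed, and each node only picks up the style names that are whitelisted for its type.

```php
<?php
// Hypothetical stand-in for the style stack in the class above;
// names are illustrative and not part of PHPWord.
final class StyleStack
{
    /** @var array<int, array<string, mixed>> one entry per open DOM element */
    private array $stack = [];

    /** Style names each node type may inherit (shortened $stylesMapping). */
    private array $mapping = [
        '#text' => ['color', 'bold', 'italic'],
        'td'    => ['bgColor', 'alignment'],
    ];

    public function push(array $styles): void
    {
        $this->stack[] = $styles;
    }

    public function pop(): void
    {
        array_pop($this->stack);
    }

    public function inheritedFor(string $nodeType): array
    {
        $allowed = $this->mapping[$nodeType] ?? [];
        $result = [];
        // Outer elements first, so inner elements override them.
        foreach ($this->stack as $styles) {
            foreach ($styles as $name => $value) {
                if (in_array($name, $allowed, true)) {
                    $result[$name] = $value;
                }
            }
        }

        return $result;
    }
}

$stack = new StyleStack();
// <tr style="background-color: #FF0000; color: #FFFFFF; font-weight: bold;">
$stack->push(['bgColor' => 'FF0000', 'color' => 'FFFFFF', 'bold' => true]);
// <td> without a style attribute still pushes an (empty) entry.
$stack->push([]);

$textStyles = $stack->inheritedFor('#text'); // color and bold only
$cellStyles = $stack->inheritedFor('td');    // bgColor only

$stack->pop();
$stack->pop();
```

Supporting a new style then comes down to two steps: produce the new key when parsing the style string, and whitelist that key for the node types that should see it.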

Example of use:

include_once 'Sample_Header.php';

// New Word Document
echo date('H:i:s') , ' Create new PhpWord object' , EOL;
$phpWord = new \PhpOffice\PhpWord\PhpWord();

$section = $phpWord->addSection();
$html  = '<table style="width: 50%; border: 6px #0000FF solid;">'.
                '<tr style="background-color: #FF0000; text-align: center; color: #FFFFFF; font-weight: bold; ">'.
                '<tr><td>1</td><td colspan="2">2</td></tr>'.

\PhpOffice\PhpWord\Shared\Html::addHtml($section, $html);

// Save file
echo write($phpWord, basename(__FILE__, '.php'), $writers);
if (!CLI) {
    include_once 'Sample_Footer.php';
}

Upvotes: 8


Reputation: 505

This answer is really a comment on Arivan's post (I don't have enough reputation to comment).

I tried his method and got this error:

Attempted to call an undefined method named "addText" of class "PhpOffice\PhpWord\Element\Table". 

To fix this, I had to modify the parseText function like this:

    private static function parseText($node, $element, &$styles)
    {
        $elementStyles = self::parseInlineStyle($node, $styles['font']);

        $textStyles = self::getInheritedTextStyles();
        $paragraphStyles = self::getInheritedParagraphStyles();

        // Commented as source of bug #257. `method_exists` doesn't seems to work properly in this case.
        // @todo Find better error checking for this one
        // Note: for the instanceof check to match, Cell must be imported with
        // `use PhpOffice\PhpWord\Element\Cell;` at the top of the file.
        if (method_exists($element, 'addText') || $element instanceof Cell) {
            $element->addText($node->nodeValue, $textStyles, $paragraphStyles);
        }

        return null;
    }

It's not a great bug fix, but it works very well for me.
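My guess at the cause, which I have not verified against PHPWord itself: addHtml() only strips line breaks, so spaces or tabs between `<table>`, `<tr>` and `<td>` survive as text nodes directly under the table, and parseText() then calls addText() on a Table. If that is what happens, an alternative is to strip whitespace between tags before passing the HTML in. A minimal sketch follows; the helper name `stripInterTagWhitespace` is made up.

```php
<?php
// Hypothetical helper, not part of PHPWord: removes whitespace-only runs
// between a closing '>' and the next '<', so no stray text nodes end up
// directly under <table> or <tr>.
function stripInterTagWhitespace(string $html): string
{
    // preg_replace() returns null on failure; fall back to the input then.
    return preg_replace('/>\s+</', '><', $html) ?? $html;
}

$html  = "<table>  <tr> <td>1</td>\t<td>two words</td> </tr>  </table>";
$clean = stripInterTagWhitespace($html);

echo $clean, PHP_EOL;
```

Text inside the cells is left alone, but note that this also removes the space between adjacent inline tags such as `</strong> <em>`, so it is best applied to table markup only.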

Upvotes: 0


Reputation: 28355

Try to use the thead and tbody tags, like this:

<table>
  <thead>
    <tr>
      <th>No.</th>
      <th>Overall results of completed checklists</th>
      <th>Number of checks passed</th>
      <th>% of total checks passed</th>
    </tr>
  </thead>
  <tbody>
    ...
  </tbody>
</table>

Upvotes: 0
