Anna from Svetlogorsk

Reputation: 21

PhpWord and getting image size from docx

I'm using PHPWord to parse a docx file that contains an image.

    $image = $textRun_element->getImageStringData(true);
    $image_type = $textRun_element->getImageType();
    $image_style = $textRun_element->getStyle();
    $width = $image_style->getWidth();

With this code I get the original image width, but in the Word document the image is displayed at a smaller size, and I need the size as it appears in the document (for example, the original image is 12.28 cm wide, while its width in the document is set to 6.11 cm).

Is there any way to get the image size that is set in the Word document?

Edited.

After studying some of the PHPWord source files, it seems to me that this is some kind of bug. The file \phpoffice\phpword\src\PhpWord\Element\Image.php contains this piece of code:

    /**
     * Set proportional width/height if one dimension not available.
     *
     * @param int $actualWidth
     * @param int $actualHeight
     */
    private function setProportionalSize($actualWidth, $actualHeight)
    {
        $styleWidth = $this->style->getWidth();
        $styleHeight = $this->style->getHeight();
        if (!($styleWidth && $styleHeight)) {
            if ($styleWidth == null && $styleHeight == null) {
                $this->style->setWidth($actualWidth);
                $this->style->setHeight($actualHeight);
            } elseif ($styleWidth) {
                $this->style->setHeight($actualHeight * ($styleWidth / $actualWidth));
            } else {
                $this->style->setWidth($actualWidth * ($styleHeight / $actualHeight));
            }
        }
    }

If I understand it correctly, this function checks whether the Image object has width and height defined in its style. If neither is defined, it takes both values from the actual image in the docx archive; if only one is defined, it calculates the other from the proportions of the actual image.
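To make the three cases concrete, here is a minimal standalone sketch of the same logic operating on plain numbers. It is not the library's code: proportionalSize() is a hypothetical helper name, and it returns the resulting width and height instead of writing them into a style object.

```php
<?php
// Hypothetical helper mirroring the branching of setProportionalSize().
// Returns array(width, height) instead of modifying a style object.
function proportionalSize($styleWidth, $styleHeight, $actualWidth, $actualHeight)
{
    if ($styleWidth && $styleHeight) {
        // Both dimensions are already set: keep them unchanged.
        return array($styleWidth, $styleHeight);
    }
    if ($styleWidth == null && $styleHeight == null) {
        // Neither is set: fall back to the size of the image file itself.
        return array($actualWidth, $actualHeight);
    }
    if ($styleWidth) {
        // Only the width is set: scale the height by the same ratio.
        return array($styleWidth, $actualHeight * ($styleWidth / $actualWidth));
    }
    // Only the height is set: scale the width by the same ratio.
    return array($actualWidth * ($styleHeight / $actualHeight), $styleHeight);
}

// An 800x600 image with no style falls back to 800x600;
// with only a width of 400 set, the height becomes 300.
var_dump(proportionalSize(null, null, 800, 600));
var_dump(proportionalSize(400, null, 800, 600));
```

The first call corresponds to the case I ran into: the reader passes no style, so the original image dimensions are used.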

I put an echo into the if ($styleWidth == null && $styleHeight == null) {} branch to see whether the condition is true, and the echo was triggered. This means that the Image object created during parsing has no width or height defined in its style.

What I don't understand now is which part of the PHPWord Word2007 reader code is supposed to read these properties from the docx file.

I'm sorry if I used some of the terminology incorrectly; I'm completely new to object-oriented programming.

Upvotes: 1

Views: 1276

Answers (1)

Anna from Svetlogorsk

Reputation: 21

I solved my particular problem of getting an image's size from a docx file. As far as I can tell, this is not a bug but a missing feature: it seems that a function for reading the image style from a docx file simply has not been added yet.

I solved it by adding a method to the file \phpoffice\phpword\src\PhpWord\Reader\Word2007\AbstractPart.php:

// Get the image size
protected function readImageStyle(XMLReader $xmlReader, \DOMElement $domNode)
{
    $styleDefs = array(
        'width'  => array(self::READ_VALUE, array('wp:inline/wp:extent', 'wp:anchor/wp:extent'), 'cx'),
        'height' => array(self::READ_VALUE, array('wp:inline/wp:extent', 'wp:anchor/wp:extent'), 'cy'),
    );

    $style = $this->readStyleDefs($xmlReader, $domNode, $styleDefs);

    // Convert EMU to points
    if (array_key_exists('width', $style)) {
        $style['width'] = Drawing::pixelsToPoints(Drawing::emuToPixels($style['width']));
    }

    if (array_key_exists('height', $style)) {
        $style['height'] = Drawing::pixelsToPoints(Drawing::emuToPixels($style['height']));
    }

    return $style;
}
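For reference, the cx and cy attributes of wp:extent are in EMU (English Metric Units), where 914400 EMU make one inch, so 12700 EMU make one point and 360000 EMU make one centimeter. The sketch below converts directly with those ratios instead of going through pixels. The function names emuToPoints() and emuToCentimeters() are hypothetical helpers of my own, not part of PHPWord, and the sample value is made up to match the 6.11 cm from the question.

```php
<?php
// Hypothetical helpers, not part of PHPWord.
// 914400 EMU = 1 inch = 72 points = 2.54 cm.
function emuToPoints($emu)
{
    return $emu / 12700;    // 914400 / 72
}

function emuToCentimeters($emu)
{
    return $emu / 360000;   // 914400 / 2.54
}

// Made-up cx value corresponding to a 6.11 cm wide image.
$cx = 2199600;
printf("%.2f cm, %.1f pt\n", emuToCentimeters($cx), emuToPoints($cx));
// prints "6.11 cm, 173.2 pt"
```

This is only a way to sanity-check the numbers; the method above keeps using the Drawing class so that it stays consistent with the rest of the library.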

I also added this line at the beginning of the file

use PhpOffice\PhpWord\Shared\Drawing;

so that I could use the conversion functions from the Drawing class. Then I added the line

$style=$this->readImageStyle($xmlReader,$node);

to the method readRunChild and changed null to $style in the line

$parent->addImage($imageSource, null, false, $name);

in that method. The relevant part of readRunChild now looks like this:

// Office 2011 Image
            $xmlReader->registerNamespace('wp', 'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing');
            $xmlReader->registerNamespace('r', 'http://schemas.openxmlformats.org/officeDocument/2006/relationships');
            $xmlReader->registerNamespace('pic', 'http://schemas.openxmlformats.org/drawingml/2006/picture');
            $xmlReader->registerNamespace('a', 'http://schemas.openxmlformats.org/drawingml/2006/main');

            $name = $xmlReader->getAttribute('name', $node, 'wp:inline/a:graphic/a:graphicData/pic:pic/pic:nvPicPr/pic:cNvPr');
            $embedId = $xmlReader->getAttribute('r:embed', $node, 'wp:inline/a:graphic/a:graphicData/pic:pic/pic:blipFill/a:blip');
            if ($name === null && $embedId === null) { // some Converters puts images on a different path
                $name = $xmlReader->getAttribute('name', $node, 'wp:anchor/a:graphic/a:graphicData/pic:pic/pic:nvPicPr/pic:cNvPr');
                $embedId = $xmlReader->getAttribute('r:embed', $node, 'wp:anchor/a:graphic/a:graphicData/pic:pic/pic:blipFill/a:blip');
            }
            $target = $this->getMediaTarget($docPart, $embedId);
            if (!is_null($target)) {
                $imageSource = "zip://{$this->docFile}#{$target}";
                $style=$this->readImageStyle($xmlReader,$node); // My addition
                $parent->addImage($imageSource, $style, false, $name); //Changed null to $style here
            }
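To check which values the document actually stores, independently of PHPWord, the wp:extent elements can also be read straight from the document XML with PHP's DOM extension. This is only a rough sketch: readImageExtents() is a hypothetical helper name, it assumes the dom extension is available, and it looks only at the same two paths (wp:inline and wp:anchor) that the method above uses.

```php
<?php
// Hypothetical helper, independent of PHPWord.
// Returns an array of array('cx' => ..., 'cy' => ...) in EMU,
// one entry per wp:extent found under wp:inline or wp:anchor.
function readImageExtents($documentXml)
{
    $dom = new DOMDocument();
    $dom->loadXML($documentXml);

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace(
        'wp',
        'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing'
    );

    $extents = array();
    foreach ($xpath->query('//wp:inline/wp:extent | //wp:anchor/wp:extent') as $extent) {
        $extents[] = array(
            'cx' => (int) $extent->getAttribute('cx'),
            'cy' => (int) $extent->getAttribute('cy'),
        );
    }

    return $extents;
}

// Possible usage with a real file (the file name is a placeholder):
// $zip = new ZipArchive();
// if ($zip->open('example.docx') === true) {
//     print_r(readImageExtents($zip->getFromName('word/document.xml')));
//     $zip->close();
// }
```

If the values printed here match the size shown in Word but getWidth() still returns the original image width, the reader is not passing the style on to addImage.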

Upvotes: 1
