Use pandoc to embed images from an HTML file into a docx file

Is it possible to embed images that are in an HTML file into a docx file?

I have tried, but it is not working for me. Perhaps I am missing some extra parameter when running pandoc:

pandoc -f html -t docx -o testdoc.docx image.html

Thank you very much!

Upvotes: 9

Views: 4545

Answers (1)

Ma'moon Al-Akash

Reputation: 5393

I managed to solve this by executing the following command:

pandoc -s file_name.html -o file_name.docx;

There are actually 2 important points that you need to consider:

  1. The quality of the output file depends on how pandoc interprets your HTML file, so if the source is complex you should not expect high-quality output. For instance, in my case the <hr/> tag was not recognized by pandoc, while the <p> tag was.
  2. The image path must be a full disk path rather than an HTTP path, meaning:

This is NO good:

<img src="http://www.example.com/images/img.jpg" />

And this is what pandoc can actually read:

<img src="/var/www/example.com/images/img.jpg" />
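
If your HTML already points at HTTP URLs and the same files exist on disk, one way to apply the second point is to rewrite the src attributes before running pandoc. This is only a rough sketch, not a tested recipe for every page. The function name localize_img_src, the variables URL_PREFIX and LOCAL_ROOT, and the output name file_name.local.html are all made up for illustration, and the URL-to-directory mapping is assumed from the example above:

```shell
#!/bin/sh
# Hypothetical helper: rewrite src attributes that start with one site
# prefix so they point at the matching local directory instead.
# URL_PREFIX and LOCAL_ROOT are assumptions; adjust both to your setup.
URL_PREFIX='http://www.example.com'
LOCAL_ROOT='/var/www/example.com'

localize_img_src() {
    # Reads HTML on stdin and writes the rewritten HTML to stdout.
    # '|' is used as the sed delimiter so slashes need no escaping.
    # Note: this touches every src="..." with that prefix, not only <img>,
    # and the dots in URL_PREFIX match any character in a sed pattern.
    sed "s|src=\"${URL_PREFIX}/|src=\"${LOCAL_ROOT}/|g"
}

# Intended usage (not run here):
#   localize_img_src < file_name.html > file_name.local.html
#   pandoc -s file_name.local.html -o file_name.docx
```

A plain text substitution like this is fragile (it assumes double-quoted attributes and a single site prefix), so check the rewritten file before converting it.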

HTH

Upvotes: 5
