Import Images and zip .docx files with repository in Palantir Foundry

Question

This is a follow-up to the (working) solution given in "Output .docx document using repository in palantir foundry", which generates Word files in a Foundry repository and writes the .docx files into a Spark DataFrame.

I managed to write data from other data sources (DataFrames) into the document. What I did not manage is to get an image into the document using doc.add_picture(). In theory it should work: even though Foundry is mostly used for tabular data analysis, image processing is (afaik) also supported. However, I could not find a proper way to bring the image in as a repository input in a form that doc.add_picture() accepts.
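To make the image part concrete: python-docx's add_picture() accepts either a file path or a file-like object, so the image does not have to exist as a file on disk. Below is a minimal sketch under that assumption. The helper name add_picture_from_bytes is hypothetical, and the Foundry call in the trailing comment (reading a raw file through the input's filesystem) is an untested assumption about how the image would be supplied as a repository input.

```python
import io


def add_picture_from_bytes(doc, image_bytes, width=None):
    """Insert an in-memory image into a python-docx document.

    `doc` is expected to be a python-docx Document (docx.Document()).
    `image_bytes` is the raw content of an image file (e.g. PNG or JPEG).
    `width` is optional, e.g. docx.shared.Inches(4); None keeps the native size.
    """
    # add_picture() takes a path or a stream, so wrap the bytes in BytesIO.
    stream = io.BytesIO(image_bytes)
    doc.add_picture(stream, width=width)
    return doc


# Hypothetical use inside a transform, assuming `images` is an input whose
# dataset holds raw files and "logo.png" is one of them:
#
#     with images.filesystem().open("logo.png", "rb") as f:
#         add_picture_from_bytes(doc, f.read())
```

Since the helper only needs bytes, the image could just as well come from a binary column of a DataFrame.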

More generally (I know that creating well-formatted docx files is not the fundamental idea of Foundry), the python-docx library has some weaknesses when it comes to formatting. So, on a meta level, one could think of unzipping a template docx file, which contains:

_rels/
docProps/
word/
[Content_Types].xml

Now the question: if I put the unzipped docx file into Foundry and, for each row iteration (as in the previous question), only update document.xml (in the word folder) with the corresponding content from a given df (its rows), is Foundry then able to zip the unzipped docx folders/files together with the updated XML into new docx files that are, as in the previous question, saved in a PySpark df?
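The zipping step itself can be sketched with Python's standard zipfile module, entirely in memory. This is only an illustration of the idea, not a tested Foundry transform: the function name fill_docx_template and the {{placeholder}} convention are hypothetical, and the sketch assumes the template is kept as a normal (zipped) .docx, so nothing has to be unzipped to disk. It also assumes each placeholder sits inside a single run in word/document.xml; Word sometimes splits text across runs, in which case a plain string replace would not find it.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_docx_template(template_bytes, replacements):
    """Return the bytes of a new .docx with placeholders replaced.

    `template_bytes` is the content of a template .docx (a zip archive).
    `replacements` maps placeholder strings to replacement text.
    Only word/document.xml is modified; all other parts are copied as-is.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the XML stays well-formed.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            dst.writestr(item.filename, data)
    return out.getvalue()
```

Calling this once per row, for example fill_docx_template(template_bytes, {"{{name}}": row_value}), would yield one bytes object per document, which could then be stored in a binary column in the same way as in the previous question.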

