Reputation: 23
I managed to add HTML (text only) to an existing Word document by following the post "Add HTML String to OpenXML".
Unfortunately, I can't find a way to apply a style from this Word template to the newly added text. It is always "Times New Roman", size 12px, although the default style of the template is "Arial", size 9px.
So far I tried:
Paragraph para = body.AppendChild(new Paragraph());
Run run = para.AppendChild(new Run());
run.AppendChild(altChunk);
para.ParagraphProperties = new ParagraphProperties(new ParagraphStyleId() { Val = "berschrift2" });
AltChunkProperties altChunkProperties = new AltChunkProperties();
altChunkProperties.MatchSource = new MatchSource() { Val = new OnOffValue(false) };
altChunk.AppendChild<AltChunkProperties>(altChunkProperties);
Any suggestions?
EDIT: I found a workaround, which isn't really a solution to my question, but works for me. I'm no longer trying to use the style from Word; instead, I add the styles to my HTML before using altChunk.
Upvotes: 1
Views: 2559
Reputation: 1
What worked for me in my situation (if you don't want to go down the rather complex Open-XML-PowerTools HTML converter route) is to add an HTML style attribute to the body element of your HTML fragment as follows:
Encoding.UTF8.GetBytes(
@$"<html><head><title></title></head><body style=""font-family: Calibri"">{ConvertUnconventionalUnicodeCharsToAscii(htmlAsString)}</body></html>");
It might be possible to dynamically derive the font family of the "Normal" style embedded in the document you are updating and insert that name into the style attribute, if it is deemed compatible.
That way, if you decide to change the base/Normal font, the HTML import will attempt to use the same font family.
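A minimal sketch of how that font lookup might be done, assuming the document's styles part has been loaded into an XDocument. The class and method names (DefaultFontSketch, FindDefaultFont) are made up for illustration, and the sketch only reads an explicit w:ascii font name; templates that reference theme fonts instead would need extra handling.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class DefaultFontSketch
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:ascii font of the default paragraph style, falling back to
    // w:docDefaults. Returns null when neither names a font explicitly.
    public static string FindDefaultFont(XDocument stylesXml)
    {
        XElement root = stylesXml.Root;
        if (root == null) return null;

        // The default paragraph style is marked w:type="paragraph" w:default="1".
        XElement defaultStyle = root.Elements(W + "style")
            .FirstOrDefault(s =>
                (string)s.Attribute(W + "type") == "paragraph" &&
                IsOn((string)s.Attribute(W + "default")));

        string fromStyle = (string)defaultStyle?
            .Element(W + "rPr")?
            .Element(W + "rFonts")?
            .Attribute(W + "ascii");
        if (!string.IsNullOrEmpty(fromStyle)) return fromStyle;

        // Fall back to the document-wide run property defaults.
        return (string)root.Element(W + "docDefaults")?
            .Element(W + "rPrDefault")?
            .Element(W + "rPr")?
            .Element(W + "rFonts")?
            .Attribute(W + "ascii");
    }

    // On/off attributes may be written as "1", "true" or "on".
    private static bool IsOn(string value) =>
        value == "1" || value == "true" || value == "on";
}
```

With the Open XML SDK, the input could be obtained with XDocument.Load(doc.MainDocumentPart.StyleDefinitionsPart.GetStream()), and the returned name inserted into the font-family value of the body's style attribute.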
Sorry if this is a bit off topic: I also could not get alternativeFormatImportPart.FeedData() to process "’" (code 8217) UTF-16 characters, so I had to replace them specifically with "'" (code 39) to prevent them from being rendered as the following sequence: ’
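The body of ConvertUnconventionalUnicodeCharsToAscii is not shown above. The following is only a guess at what such a helper could look like: it replaces the right single quote (code 8217) as described, and, as an added assumption, the other common typographic quotes as well. The containing class name is hypothetical.

```csharp
using System.Text;

// Hypothetical implementation; the original helper's body is not shown in the answer.
public static class HtmlTextSketch
{
    public static string ConvertUnconventionalUnicodeCharsToAscii(string html)
    {
        var sb = new StringBuilder(html.Length);
        foreach (char c in html)
        {
            switch (c)
            {
                case '\u2018': // left single quotation mark (assumed addition)
                case '\u2019': // right single quotation mark, code 8217
                    sb.Append('\''); // apostrophe, code 39
                    break;
                case '\u201C': // left double quotation mark (assumed addition)
                case '\u201D': // right double quotation mark (assumed addition)
                    sb.Append('"');
                    break;
                default:
                    sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }
}
```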
Upvotes: -1
Reputation: 131
Some explanation: if you look at the definition of altChunk in ISO 29500-1 17.17.2.1 and specifically in the A.1 section, the schema shows that altChunk is an EG_BlockLevelElts element, which makes it a peer of paragraphs (i.e. w:p). It is technically not correct to add it as a child of a run or even of a paragraph. It should be added at the body level. The fact that Word doesn't complain when it is added as a run or paragraph child is unintentional and shouldn't be relied on.
As a result, Word uses the document's default font properties to format this new content. You can verify this by changing the document defaults in the styles.xml part. With the MatchSource property set to false, there isn't a way to specify the font other than through the document defaults.
Having said that, I think that Thomas' alternative is a better way to go.
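For reference, the body-level placement described above could look roughly like this with the Open XML SDK (DocumentFormat.OpenXml NuGet package). This is a sketch under assumptions: the class and method names are invented, the caller supplies a unique altChunkId, and the document is assumed to already have a main part with a body.

```csharp
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper illustrating body-level altChunk placement.
public static class AltChunkSketch
{
    public static void AppendHtmlAtBodyLevel(
        WordprocessingDocument doc, string html, string altChunkId)
    {
        MainDocumentPart mainPart = doc.MainDocumentPart;
        Body body = mainPart.Document.Body;

        // Store the HTML in an alternative format import part.
        AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
            AlternativeFormatImportPartType.Html, altChunkId);
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        {
            chunk.FeedData(stream);
        }

        // altChunk is a block-level element: a sibling of paragraphs, not a child
        // of a run or a paragraph.
        var altChunk = new AltChunk { Id = altChunkId };
        altChunk.AppendChild(new AltChunkProperties(
            new MatchSource { Val = new OnOffValue(false) }));

        // If the body ends with section properties, keep them as the last child.
        SectionProperties sectPr = body.Elements<SectionProperties>().LastOrDefault();
        if (sectPr != null)
        {
            body.InsertBefore(altChunk, sectPr);
        }
        else
        {
            body.AppendChild(altChunk);
        }
    }
}
```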
Upvotes: 2
Reputation: 2289
The real solution for your question is to transform HTML into Open XML markup "yourself" rather than relying on the alternative format import parts in conjunction with w:altChunk elements. This creates a dependency on how Microsoft Word handles the import, often with little control on your side.
How do you transform HTML (or XML in general) to Open XML markup? The best way is to write so-called recursive pure functional transformations, which translate HTML elements and attributes to Open XML elements and attributes. If you have really simple HTML documents, that is not a big task. However, doing this for "arbitrary" HTML and CSS is quite a feat.
The good news is that Open-XML-PowerTools, an open-source library, contains functionality to transform HTML to Open XML and vice versa. Thus, I'd recommend you have a look at that library.
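To give an idea of what a recursive, purely functional transformation looks like for very simple HTML, here is a minimal sketch using LINQ to XML. It is not how Open-XML-PowerTools implements its converter. It assumes the HTML is well-formed XML and handles only html, body, p, b/strong and i/em; all names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch: each function returns new elements and mutates nothing.
public static class HtmlToWmlSketch
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Block level: html and body are unwrapped, each p becomes one w:p.
    public static IEnumerable<XElement> TransformBlock(XElement element)
    {
        switch (element.Name.LocalName)
        {
            case "html":
            case "body":
                return element.Elements().SelectMany(child => TransformBlock(child));
            case "p":
                return new[]
                {
                    new XElement(W + "p",
                        element.Nodes().SelectMany(n => TransformInline(n, false, false)))
                };
            default:
                // Unsupported block elements are dropped in this sketch.
                return Enumerable.Empty<XElement>();
        }
    }

    // Inline level: formatting is passed down the recursion as parameters,
    // and each text node becomes one w:r carrying the accumulated formatting.
    private static IEnumerable<XElement> TransformInline(XNode node, bool bold, bool italic)
    {
        if (node is XText text)
        {
            XElement runProperties = (bold || italic)
                ? new XElement(W + "rPr",
                    bold ? new XElement(W + "b") : null,
                    italic ? new XElement(W + "i") : null)
                : null;
            return new[]
            {
                new XElement(W + "r",
                    runProperties,
                    new XElement(W + "t",
                        new XAttribute(XNamespace.Xml + "space", "preserve"),
                        text.Value))
            };
        }

        if (node is XElement element)
        {
            string name = element.Name.LocalName;
            bool childBold = bold || name == "b" || name == "strong";
            bool childItalic = italic || name == "i" || name == "em";
            return element.Nodes()
                .SelectMany(n => TransformInline(n, childBold, childItalic));
        }

        // Comments and other node types are ignored.
        return Enumerable.Empty<XElement>();
    }
}
```

The resulting w:p elements would then be added to the document body and can reference the template's paragraph styles directly, which is the control that the altChunk approach lacks.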
Upvotes: 1