Reputation: 2793
Input are Excel files - the cells may contain some basic HTML formatting like <b>, <br>, <h2>.
I want to read the strings and insert the text as formatted text into word documents, i.e. <b>Foo</b> would be shown as a bold string in Word.
I don't know which tags are used so I need a "generic solution", a find/replace approach does not work for me.
I found a solution from January 2011 using the WebBrowser component. So the HTML is converted to RTF and the RTF is inserted into Word. I was wondering if there is a better solution today.
Using a commercial component is fine for me.
Update
I came across Matthew Manela's MarkupConverter class. It converts HTML to RTF. Then I use the clipboard to insert the snippet into the word file
// rtf contains the converted html string using MarkupConverter
Clipboard.SetText(rtf, TextDataFormat.Rtf);
// objTable is a table in my word file
objTable.Cell(1, 1).Range.Paste();
This works, but will copy/pasting up to a few thousand strings using the clipboard break anything?
Upvotes: 2
Views: 4807
Reputation: 144
Use this code it is working..
Response.AppendHeader("content-disposition", "attachment;filename=FileEName.xls");
Response.Charset = "";
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.ContentType = "application/vnd.ms-excel";
this.EnableViewState = false;
//Response.Write("Your HTML Code");
Response.Write("<table border='1 px solid'><tr><th>sfsd</th><th>sfsdfssd</th></tr><tr>
<td>ssfsdf</td><td><table border='1 px solid'><tr><th>sdf</th><th>hhsdf</th></tr><tr>
<td>sdfds</td><td>sdhjhfds</td></tr></table></td></tr></table>");
Response.End();
Upvotes: 2
Reputation: 6578
You will need the OpenXML SDK in order to work with OpenXML. It can be quite tricky getting into, but it is very powerful, and a whole lot more stable and reliable than Office Automation or Interop.
The following will open a document, create an AltChunk
part, add the HTML to it, and embed it into the document. For a broader overview of AltChunk
see Eric White's blog
using (var wordDoc = WordprocessingDocument.Open("DocumentName.docx", true))
{
var altChunkId = "AltChunkId1";
var mainPart = wordDoc.MainDocumentPart;
var chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, altChunkId);
using (var textStream = new MemoryStream())
{
var html = "<html><body>...</body></html>";
var data = Encoding.UTF8.GetBytes(html);
textStream.Write(data, 0, data.Length);
textStream.Position = 0;
chunk.FeedData(textStream);
}
var altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document.Body.InsertAt(altChunk, 0);
mainPart.Document.Save();
}
Obviously for your case, you will want to find (or build) the table you want and insert the AltChunk
there instead of at the first position in the body. Note that the HTML that you insert into the word doc must be full HTML documents, with an <html>
tag. I'm not sure if <body>
is required, but it doesn't hurt. If you just have HTML formatted text, simply wrap the text in these tags and insert into the doc.
It seems that you will need to use Office Automation/Interop to get the table heights. See this answer which says that the OpenXML SDK does not update the heights, only Word does.
Upvotes: 3
Reputation: 12934
Why not let WORD do its owns translation since it understands HTML.
Upvotes: 1