askeet

Reputation: 719

How to convert an HTML document with lots of tables into a Word document?

I have created an HTML document with many tables. How can I convert the document to Word?

The problem is that when I open the HTML document in Word, the tables are rendered with non-standard double-line borders for some reason.

<table border="1" color="#000000" cellpadding="0" cellspacing="0" width=100%>
<tr>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td width = 15%>0</td>
<td width = 15%>0</td>
<td width = 40%>0</td>
<td> - </td>
</tr>
</table>

Upvotes: 7

Views: 33364

Answers (4)

Aaron Digulla

Reputation: 328556

The simplest solution: open the HTML in a browser, select the table (or the whole document), copy it, and paste it into Word. You might get even better results by pasting into Excel first and then copying from there into Word (kudos to Josiah for this tip). This often works well, especially if the table looks correct in IE.

There are other solutions, but they are much more complicated: you would need an HTML parser and something that can create OOXML files. If you want to try this, use Python with Beautiful Soup as the HTML parser. Writing OOXML is explained in this question: How can I create a Word document using Python?

Note that this approach would probably take 1-2 weeks of effort.
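As a rough illustration of the parsing half of that approach, here is a minimal sketch that pulls the cell text out of each table. It uses the standard-library `html.parser` as a stand-in for Beautiful Soup so that it runs without extra packages. The class name `TableExtractor` and the `SAMPLE` string are made up for this example, and nested tables, `colspan`, and `rowspan` are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> as a list of rows (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, each a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Sample input modelled on the table from the question (attributes quoted here).
SAMPLE = """
<table border="1" cellpadding="0" cellspacing="0" width="100%">
<tr><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr><td width="15%">0</td><td width="15%">0</td><td width="40%">0</td><td> - </td></tr>
</table>
"""

extractor = TableExtractor()
extractor.feed(SAMPLE)
print(extractor.tables)
```

The extracted rows would still have to be written out with an OOXML-capable library, which is the larger part of the work.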

Upvotes: 9

Daniel Wong

Reputation: 1

From http://www.wordbanter.com/showthread.php?t=105850

"You have to go into the table, select 'Table', then 'Table properties', then 'Options'. Under 'default cell spacing', deselect 'allow spacing between cells'."

Upvotes: -1

askeet

Reputation: 719

I solved the problem of converting many tables to a Word document by using CSS styles. After opening Generate.html with Word, all tables look normal.

File CSSTable.css

table.CSSTable {
    border-width: 1px;
    border-spacing: 0px;
    border-style: solid;
    border-color: black;
    border-collapse: collapse;
    background-color: white;
}
table.CSSTable th {
    border-width: 1px;
    padding: 0px;
    border-style: solid;
    border-color: black;
    background-color: white;
}
table.CSSTable td {
    border-width: 1px;
    padding: 0px;
    border-style: solid;
    border-color: black;
    background-color: white;
}

Generate.html

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" href="CSSTable.css" type="text/css">
</head>
<body>
<table class="CSSTable" width=100%>
<tr>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td width = 15%>0</td>
<td width = 15%>0</td>
<td width = 40%>0</td>
<td> - </td>
</tr>
</table>
</body>
</html>
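If the document has many tables, a page like Generate.html can be produced by a script instead of by hand. The following is only a sketch under assumptions: the function name `build_generate_html` and its parameters are invented for illustration, every cell is emitted as a plain `<td>`, and the stylesheet is expected to sit next to the generated file as CSSTable.css.

```python
from html import escape


def build_generate_html(rows, css_href="CSSTable.css"):
    """Return an HTML page with one table that uses the CSSTable class.

    `rows` is a list of rows, each a list of cell values (hypothetical input shape).
    """
    table_rows = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f'<link rel="stylesheet" href="{escape(css_href, quote=True)}" type="text/css">\n'
        "</head>\n<body>\n"
        '<table class="CSSTable" width="100%">\n'
        f"{table_rows}\n"
        "</table>\n</body>\n</html>\n"
    )


page = build_generate_html([[1, 2, 3, 4], [0, 0, 0, "-"]])

# Write the page next to CSSTable.css, then open it with Word:
# with open("Generate.html", "w", encoding="utf-8") as fh:
#     fh.write(page)
```

Cell values are passed through `html.escape`, so text containing `<` or `&` does not break the markup.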

Upvotes: 3

m0bi5

Reputation: 9452

You can use an altChunk, provided the document will be opened in Word. Word is needed only to open the document, not to create it.

In terms of the classes in Microsoft's Open XML SDK, you will want an AlternativeFormatImportPart of type AlternativeFormatImportPartType.Html.

See this or this for examples
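To show what an altChunk package looks like without the Open XML SDK, here is a minimal sketch that assembles the .docx parts by hand with Python's standard `zipfile` module. It is not the SDK approach named above. The function name `build_docx_bytes`, the part name `word/chunk.html`, and the relationship id `htmlChunk` are arbitrary choices for this example, and the result has not been checked against every Word version.

```python
import io
import zipfile

CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Default Extension="html" ContentType="text/html"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

PACKAGE_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
</Relationships>"""

# The main document body holds nothing but the altChunk reference.
DOCUMENT_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
            xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <w:body>
    <w:altChunk r:id="htmlChunk"/>
  </w:body>
</w:document>"""

# Maps the id used by <w:altChunk> to the embedded HTML part.
DOCUMENT_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="htmlChunk" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk" Target="chunk.html"/>
</Relationships>"""


def build_docx_bytes(html_text):
    """Return the bytes of a .docx whose body is a single HTML altChunk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr("word/document.xml", DOCUMENT_XML)
        package.writestr("word/_rels/document.xml.rels", DOCUMENT_RELS)
        package.writestr("word/chunk.html", html_text)  # str is stored as UTF-8
    return buffer.getvalue()


docx_bytes = build_docx_bytes(
    "<html><head><meta charset='utf-8'></head>"
    "<body><table border='1'><tr><td>1</td><td>2</td></tr></table></body></html>"
)

# with open("tables.docx", "wb") as fh:
#     fh.write(docx_bytes)
```

Word converts the HTML when it opens the file, so other consumers of .docx files may show an empty document.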

Upvotes: 1
