user5005768Himadree

Reputation: 1439

Extract the list of headings and their contained topic pages from an HHC file

I have a CHM file. I want to extract the table of contents (TOC) information, such as Heading-Section, Topic-Name, Topic-File-Path, etc. I noticed that its HHC file contains similar information, as in the sample below:

main Help-Project.hhc

<UL> // its root
   <LI> <OBJECT type="text/sitemap">
      <param name="Name" value="Topic 1">
      <param name="Local" value="Topics/topic1.htm">
      </OBJECT>
   <LI> <OBJECT type="text/sitemap">
      <param name="Name" value="Topic 2">
      <param name="Local" value="Topics/topic2.htm">
      </OBJECT>
  <OBJECT type="text/sitemap">
      <param name="Merge" value="Heading-Section1.chm::/Heading-Section1.hhc">
      </OBJECT>

Is there any way to extract the Heading-Section and Topic info from this and the other .hhc files?

Upvotes: 0

Views: 56

Answers (2)

Marco van de Voort

Reputation: 26371

Many CHM decompression tools can extract a single .hhc file. If the file has a binary TOC, the chmls utility in the Free Pascal distribution has an extracttoc command that attempts to extract the binary TOC as an .hhc file.

The Free Pascal CHM tools unfortunately can't generate merged CHM files yet. I tried several times (both for master and slave CHMs) but the result is rejected by the Windows CHM viewer.

Upvotes: 0

help-info.de

Reputation: 7298

Your *.hhc sample is very short and the question does not state the desired target format, so I can only give some hints.

In Microsoft HTML Help you can create modular projects with separate help files. The Index, Full-text search and Contents items of all compiled help modules (.CHM files) are merged at runtime. For example, open Master.CHM and you will see the master and slave help files unified into a single help system. Open any slave CHM and you see only that single CHM help.

To make the slave modules' table of contents merge with the master module's table of contents at runtime, "File Include" block statements must be added to the master contents file (master.hhc), e.g.:

<OBJECT type="text/sitemap">
    <param name="Merge" value="slave1.chm::/slave1.hhc">
</OBJECT>
<OBJECT type="text/sitemap">
    <param name="Merge" value="slave2.chm::/slave2.hhc">
</OBJECT>

For a single .chm (or its .hhc file) without runtime merging, exporting a basic table of contents to HTML is easy with a script or an XML-to-HTML transformation. I use tools like FAR HTML to save an .hhc as HTML and then trim it down manually with regular expressions in Notepad++.
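As a rough illustration of the script approach for a single .hhc file, here is a minimal Python sketch. It assumes the usual sitemap layout (nested <UL> lists, one <OBJECT type="text/sitemap"> per entry, and a heading being an entry with a "Name" but no "Local"). The names HhcParser and parse_hhc are made up for this example and are not part of any tool.

```python
from html.parser import HTMLParser


class HhcParser(HTMLParser):
    """Collect (depth, name, local) tuples from HHC sitemap markup."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current <UL> nesting level
        self.current = None   # params of the sitemap <OBJECT> being read
        self.entries = []     # (depth, name, local) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            self.depth += 1
        elif tag == "object" and attrs.get("type") == "text/sitemap":
            self.current = {}
        elif tag == "param" and self.current is not None:
            # Param names such as "Name"/"Local" are compared case-insensitively.
            key = (attrs.get("name") or "").lower()
            self.current[key] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "object" and self.current is not None:
            # Entries without a "Name" (e.g. Merge blocks) are skipped here.
            if "name" in self.current:
                self.entries.append(
                    (self.depth, self.current["name"], self.current.get("local"))
                )
            self.current = None


def parse_hhc(text):
    """Return a list of (depth, name, local); local is None for headings."""
    parser = HhcParser()
    parser.feed(text)
    parser.close()
    return parser.entries
```

You could then read the file (the encoding is an assumption, many .hhc files are not UTF-8) and print each entry indented by its depth, or write the tuples to CSV or whatever target format you need.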

In your example, you show some lines of the master .hhc (your main Help-Project.hhc). Each module (e.g. Heading-Section1.chm) requires its own *.hhc file.

So you need all .hhc files of your modular CHM help file to extract Heading-Section, Topic-Name, Topic-File-Path etc.
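To find out which modules a master .hhc pulls in, you can scan it for "Merge" params and split each value at "::/". The following Python sketch does only that. It assumes the value has the form "module.chm::/module.hhc" as in the samples above, and the names MergeFinder and find_merges are hypothetical. Extracting the referenced .hhc from each module's CHM still has to be done with a CHM decompression tool.

```python
from html.parser import HTMLParser


class MergeFinder(HTMLParser):
    """Collect (chm_file, hhc_path) pairs from <param name="Merge"> tags."""

    def __init__(self):
        super().__init__()
        self.merges = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "param" and (attrs.get("name") or "").lower() == "merge":
            value = attrs.get("value") or ""
            # Assumed format: "<module>.chm::/<contents>.hhc"
            chm, sep, hhc = value.partition("::/")
            self.merges.append((chm, hhc if sep else None))


def find_merges(text):
    """Return the merge references of a master .hhc in document order."""
    finder = MergeFinder()
    finder.feed(text)
    finder.close()
    return finder.merges
```

Each returned pair names a module CHM and the contents file inside it, so you know which .hhc files to extract and parse next.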

Upvotes: 1
