Holograham

Reputation: 1368

Splitting Word Documents Into Smaller Ones

I want a user to be able to upload a Word document and have my program split it into separate Word documents. The split points have to be marked manually, because the documents are not all formatted the same way. My initial thought is that, before uploading, the user tags each section with a beginning and an end tag (of some sort, maybe a comment), which my program can then parse to split the document into separate documents. This also needs to work for both .doc and .docx, so a common solution is desirable.

Ex. Input:

Doc1

Chapter 1

Blah Blah Blah

Chapter 2

Blah blah

/end Doc1

Ex. Output:

Doc1

Chapter 1

Blah Blah Blah

/end Doc1

Doc2

Chapter 2

Blah blah

/end Doc2
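The tagging idea can be sketched independently of the Word format. This is only a rough illustration in Python, assuming the paragraph text has already been extracted from the document (extraction is not shown) and that each section is wrapped in its own marker pair, e.g. "Doc1" ... "/end Doc1". The function name `split_by_markers` and the two marker prefixes are hypothetical, chosen to match the example above.

```python
from typing import Dict, List, Optional

START_PREFIX = "Doc"   # hypothetical start marker, e.g. "Doc1"
END_PREFIX = "/end "   # hypothetical end marker, e.g. "/end Doc1"


def split_by_markers(paragraphs: List[str]) -> Dict[str, List[str]]:
    """Group paragraphs into sections delimited by start/end marker lines."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None  # name of the section being collected
    for text in paragraphs:
        line = text.strip()
        if current is None:
            # Outside a section: only a start marker such as "Doc1" matters;
            # any other text here is ignored.
            suffix = line[len(START_PREFIX):]
            if line.startswith(START_PREFIX) and suffix.isdigit():
                current = line
                sections[current] = []
        elif line == END_PREFIX + current:
            current = None  # the matching end marker closes the section
        else:
            sections[current].append(text)
    if current is not None:
        raise ValueError(f"section {current!r} has no end marker")
    return sections


tagged = ["Doc1", "Chapter 1", "Blah Blah Blah", "/end Doc1",
          "Doc2", "Chapter 2", "Blah blah", "/end Doc2"]
for name, body in split_by_markers(tagged).items():
    print(name, body)
```

Each key of the returned dictionary would become one output document. Writing the pieces back out as .doc or .docx files still needs a Word-aware library or macro, which this sketch does not cover.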

Any ideas? I have been struggling with this for a while.

Upvotes: 3

Views: 3311

Answers (5)

Kuppu

Reputation: 1

VBA Macro to split files into sub documents

Sub UpdateDocuments()
    ' Opens each .doc file in the chosen folder, converts its content to
    ' subdocuments, and saves the result under C:\Data\Split\.
    Dim strFolder As String, strFile As String, wdDoc As Document
    strFolder = GetFolder
    If strFolder = "" Then Exit Sub
    ' Turn off screen updating only once we know there is work to do,
    ' so an early exit never leaves it disabled
    Application.ScreenUpdating = False
    strFile = Dir(strFolder & "\*.doc", vbNormal)
    While strFile <> ""
        Set wdDoc = Documents.Open(FileName:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
        With wdDoc
            'Call your other macro or insert its code here
            'BreakOnSection
            .Activate
            ' Switch to Outline view before creating subdocuments
            .ActiveWindow.View.Type = wdOutlineView
            Selection.WholeStory
            .Subdocuments.AddFromRange Range:=Selection.Range
            .SaveAs "C:\Data\Split\" & .Name
            .Close SaveChanges:=True
        End With
        strFile = Dir()
    Wend
    Set wdDoc = Nothing
    Application.ScreenUpdating = True
End Sub

Function GetFolder() As String
    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function

Upvotes: 0

John Laffoon

Reputation: 2915

I've had great success with Aspose.Words for document manipulation and generation.

Upvotes: 0

bryanjonker

Reputation: 3416

Something that may help is HTML Transit. It's incredibly old and incredibly expensive software, and from an initial search it may no longer be supported. But it did have the ability to take one Word document and split it into smaller pieces (converting it to HTML in the process). Something to look into, maybe. Google "HTML Transit" for more information and a free demo.

Upvotes: 0

Paul Kohler

Reputation: 2714

What you want to do is non-trivial! I have done my fair share of document manipulation. That said, if you are working with DOCX, it is not too bad these days thanks to the supporting libraries; see:

http://openxmldeveloper.org/
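To give a feel for why DOCX is approachable: a .docx file is a ZIP archive whose main body is stored as XML in `word/document.xml`. The following is a minimal sketch using only the Python standard library, not a replacement for a proper Open XML library. The helper names `read_docx_paragraphs` and `make_minimal_docx` are hypothetical, and the demo archive contains only the one XML part, so it is enough for this reader but is not a file Word itself would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_docx_paragraphs(source) -> list:
    """Return the plain text of each paragraph in a .docx path or file object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across <w:t> elements inside its runs.
        pieces = (t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append("".join(pieces))
    return paragraphs


def make_minimal_docx(texts) -> io.BytesIO:
    """Build a tiny in-memory archive with only word/document.xml (demo only)."""
    body = "".join(f"<w:p><w:r><w:t>{t}</w:t></w:r></w:p>" for t in texts)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


demo = make_minimal_docx(["Doc1", "Chapter 1", "Blah Blah Blah", "/end Doc1"])
print(read_docx_paragraphs(demo))
```

Once the paragraph text is available like this, scanning it for user-supplied begin and end markers is straightforward. Note that this approach does not apply to the older binary .doc format, which is not a ZIP archive, and that it ignores complications such as paragraphs nested inside text boxes.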

Older versions (.doc) are more difficult: you would need to source a library for them or, as suggested, use macros.

Is the "program" a web site? If so, make sure you do not use COM interop! Office automation is not designed to run unattended on a server.

Upvotes: 4

No Refunds No Returns

Reputation: 8336

I'd say your best bet is to investigate VSTO or VBA macros to accomplish this. Both give you full access to the Word object model, whatever version the document is.

Upvotes: 0
