Sudio

Reputation: 155

How to open a PDF?

I'm trying to use VBA in Word to open a PDF document, and eventually do some text manipulation.

I have the code below as a test. It runs without error and appears to open the PDF.

When I try to insert the text of the PDF I just opened into my Word document, it pastes a box with a question mark inside. I'm not sure whether I'm opening the PDF incorrectly or whether the file is opening at all.

'Earlier code truncated for brevity
Dim FSOSubFolder As Object
Dim FSOFile As Object
Dim anotherString As String

For Each FSOSubFolder In FSOFolder.SubFolders
    LoopAllSubFolders FSOSubFolder
Next

For Each FSOFile In FSOFolder.Files
    
    ActiveDocument.Range.InsertAfter ("File Name: " & FSOFile.Name & vbNewLine) 'This part works
    Dim myWord As Word.Application, myDoc As Word.Document
    Set myWord = New Word.Application  
    Set myDoc = myWord.Documents.Open(FileName:=FSOFile.Path, ConfirmConversions:=False, Format:="PDF Files")
    myDoc.Activate
    Selection.WholeStory
    anotherString = Selection.Range.Text
    myDoc.Close
    ActiveDocument.Range.InsertAfter (anotherString) 'This pastes a box with a question mark inside

    Set FSOFile = Nothing
    Set FSOSubFolder = Nothing

Next 

I don't have screen updating or warnings suppressed, and I don't see a new Word document open up either.

Upvotes: 1

Views: 880

Answers (1)

Timothy Rylatt

Reputation: 7850

It looks as though you have some code written for Excel that you are trying to use in Word:

    Dim myWord As Word.Application, myDoc As Word.Document
    Set myWord = New Word.Application  
    Set myDoc = myWord.Documents.Open(FileName:=FSOFile.Path, ConfirmConversions:=False, Format:="PDF Files")
    myDoc.Activate

As you are running the code in Word you already have an instance of Word so there is no need to open a new one, and you definitely don't want to create a new instance of Word for each file you attempt to open, especially as your code doesn't quit any of those extra instances. You are not seeing the documents being opened as your code does not make the additional instances of Word visible.
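
For comparison, here is a minimal sketch of what the separate-instance approach would need if you really did want it. The function name ReadPdfTextInSeparateInstance is hypothetical and only for illustration. It assumes the code runs from Word, so the Word object library is already referenced. Note that an unqualified Selection belongs to the instance running the macro, not to myWord, so the text is read through the document object instead.

```vba
' Sketch only: ReadPdfTextInSeparateInstance is a hypothetical helper name.
Function ReadPdfTextInSeparateInstance(ByVal pdfPath As String) As String
    Dim myWord As Word.Application
    Dim myDoc As Word.Document

    Set myWord = New Word.Application
    myWord.Visible = True   ' instances created from code start hidden

    Set myDoc = myWord.Documents.Open(FileName:=pdfPath, _
                                      ConfirmConversions:=False, _
                                      ReadOnly:=True)

    ' Read via the document object; an unqualified Selection would refer
    ' to the Word instance running this macro, not to myWord.
    ReadPdfTextInSeparateInstance = myDoc.Content.Text

    myDoc.Close SaveChanges:=wdDoNotSaveChanges
    myWord.Quit             ' otherwise the extra instance keeps running

    Set myDoc = Nothing
    Set myWord = Nothing
End Function
```

Starting and quitting a Word instance for every file is slow, which is another reason to reuse the instance you already have.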

Your code with all the unnecessary bits removed:

   'Earlier code truncated for brevity
   Dim FSOSubFolder As Object
   Dim FSOFile As Object
   Dim anotherString As String

   For Each FSOSubFolder In FSOFolder.SubFolders
      LoopAllSubFolders FSOSubFolder
   Next
   
   Dim originalDoc As Word.Document
   Set originalDoc = ActiveDocument

   For Each FSOFile In FSOFolder.Files
    
      originalDoc.Range.InsertAfter ("File Name: " & FSOFile.Name & vbNewLine) 'This part works
      Dim pdfDoc As Word.Document
      Set pdfDoc = Documents.Open(filename:=FSOFile.Path, ConfirmConversions:=False, Format:=wdOpenFormatAuto)
      'use this if you just want the text content without formatting
      originalDoc.Range.InsertAfter pdfDoc.Content.Text
      'use this if you want the formatted content
      'originalDoc.Range.InsertParagraphAfter
      'originalDoc.Paragraphs.Last.Range.FormattedText = pdfDoc.Content.FormattedText
      'close the converted copy without saving it
      pdfDoc.Close SaveChanges:=wdDoNotSaveChanges

      Set FSOFile = Nothing
      Set FSOSubFolder = Nothing
   Next

As you are opening PDFs, you will likely still get the notification that Word is about to convert the file, despite setting ConfirmConversions:=False.

Upvotes: 4
