CJ7

Reputation: 23275

Detect language of Word document

Using Range.DetectLanguage, how can I detect the language of each of the paragraphs of a Word document and determine the most used language of the Word document?

Each document in the set I wish to run this over is written in either French or English, but all of them have both English and French in the header, so I cannot use Document.DetectLanguage: it returns wdUndefined on every document. I need to check all paragraphs and determine the most used language in the document.

What is the best way to do this in VBA?

Upvotes: 2

Views: 2769

Answers (2)

Johan D.

Reputation: 71

I work with Dutch, French and English documents, and in my experience Office DOES NOT recognize the language correctly. When I write a document in the system language, everything is fine: spelling and grammar are checked. But the language is always set automatically to the system language, even when the two other languages are installed on the system and enabled in the Office language options.

Even while I am writing this text, every word is underlined in red, so Chrome does not detect the language either.

My system language is Dutch, and this problem has always existed. Whatever I try, I have to select everything, set the language manually, and only then run the spelling check.

Looping through the languages makes no sense if the detection itself is not right. It seems to me that language detection and spelling and grammar checking have been at a standstill since MS Office 2007, almost a decade now. See here.

Whether this has to do with Dutch being a 'small' language, I don't know. If there were a way to "set language" for the current document, a simple start-up macro would do the job. So far I have not found code that does this, apart from this simple routine I wrote:

Sub setlng()
    ' Set the language of the whole main text from a two-letter code
    Selection.WholeStory
    With Selection
        ' UCase makes the comparison case-insensitive, so "nl", "Nl" and "NL" all match
        Select Case UCase(Trim(InputBox("What's your language? (NL = Nederlands, FR = Français, EN = English, DE = Deutsch)")))
            Case "NL"
                .LanguageID = wdDutch
            Case "FR"
                .LanguageID = wdFrench
            Case "EN"
                .LanguageID = wdEnglishUS
            Case "DE"
                .LanguageID = wdGerman
        End Select
    End With
    ' Let Word detect the language automatically as you type
    Application.CheckLanguage = True
End Sub

Clearly, since MS Office was written in English, you have to use the English word for your language rather than the language's own name for itself, which would be more logical...

I'm very curious whether people who live in, say, Azerbaijan can even find their language: "Selection.LanguageID = wdAzeriCyrillic" ...hm...
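On the "start-up code" idea, here is a minimal sketch of how that might look without a prompt. It assumes the macro is stored in a module of Normal.dotm (or the document's template), where Word runs a macro named AutoOpen when a document is opened, and it assumes Dutch as the target language. ActiveDocument.Content covers the main text only, so headers and footers are left untouched.

```vba
' Sketch only: apply one language to the main text whenever a document opens.
' Assumption: stored in Normal.dotm or the attached template so AutoOpen fires.
Sub AutoOpen()
    With ActiveDocument.Content
        .LanguageID = wdDutch   ' assumed target language; change as needed
        .NoProofing = False     ' make sure the text is not excluded from checking
    End With
End Sub
```

This overwrites whatever language was set before, so it only suits a setup where every document is expected to be in the same language.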

Upvotes: 2

CJ7

Reputation: 23275

' Requires a reference to "Microsoft Scripting Runtime" (Tools > References) for Dictionary
Sub FindMostUsedLanguage()
    Dim doc As Document, para As Paragraph
    Dim lang As WdLanguageID
    Dim dict As New Dictionary
    Dim key As Variant

    Set doc = ActiveDocument
    ' Run language detection once if Word has not done so already
    If Not doc.LanguageDetected Then doc.DetectLanguage

    ' Count how many paragraphs carry each language ID
    For Each para In doc.Paragraphs
        lang = para.Range.LanguageID
        If Not dict.Exists(lang) Then
            dict.Add lang, 1
        Else
            dict(lang) = dict(lang) + 1
        End If
    Next para

    ' Determine the most popular language
    Dim maxCount As Long, maxKey As WdLanguageID
    For Each key In dict.Keys()
        If dict(key) > maxCount Then
            maxCount = dict(key)
            maxKey = key
        End If
    Next key

    ' Prints the numeric WdLanguageID value
    Debug.Print "Most popular language is: " & maxKey
End Sub
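The question asks specifically about Range.DetectLanguage, so here is a hedged variant of the same counting idea that runs detection on each paragraph instead of once on the document. It is a sketch, not a tested solution: the Sub name is made up, the choice to skip empty paragraphs and the wdUndefined, wdNoProofing and wdLanguageNone values is an assumption, and it uses late binding so no Scripting Runtime reference is needed. Note that DetectLanguage stores its result in LanguageID, so running this changes the language tagging of the text.

```vba
' Sketch: per-paragraph detection with Range.DetectLanguage, then a majority count.
Sub ReportDominantLanguage()
    Dim para As Paragraph
    Dim rng As Range
    Dim counts As Object
    Dim key As Variant
    Dim langId As Long
    Dim bestId As Long, bestCount As Long

    ' Late-bound dictionary: language ID -> number of paragraphs
    Set counts = CreateObject("Scripting.Dictionary")

    For Each para In ActiveDocument.Paragraphs
        Set rng = para.Range
        ' Skip empty paragraphs (only a paragraph mark)
        If Len(rng.Text) > 1 Then
            rng.LanguageDetected = False   ' force Word to re-evaluate this range
            rng.DetectLanguage
            langId = rng.LanguageID
            ' Assumption: ignore paragraphs with no single usable language
            If langId <> wdUndefined And langId <> wdNoProofing _
               And langId <> wdLanguageNone Then
                counts(langId) = counts(langId) + 1   ' missing keys start as Empty
            End If
        End If
    Next para

    ' Pick the language ID with the highest paragraph count
    For Each key In counts.Keys
        If counts(key) > bestCount Then
            bestCount = counts(key)
            bestId = key
        End If
    Next key

    If bestCount = 0 Then
        Debug.Print "No language could be determined."
    Else
        ' Languages(...) accepts a WdLanguageID and gives a readable name
        Debug.Print "Most used language: " & _
            Application.Languages(bestId).NameLocal & _
            " (" & bestCount & " paragraphs)"
    End If
End Sub
```

Counting paragraphs treats a one-line heading the same as a long paragraph. If that skews the result, weighting each paragraph by Len(rng.Text) instead of adding 1 is a possible alternative.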

Upvotes: 4
