Reputation: 23275
Using Range.DetectLanguage, how can I detect the language of each paragraph of a Word document and determine the most used language of the document?
Each document I want to run this over is written in either French or English, but all of them have both English and French in the header, so I cannot use Document.DetectLanguage because this returns WdUndefined on all documents. I need to check all paragraphs and determine which language is the most common in the document.
What is the best way to do this in VBA?
Upvotes: 2
Views: 2769
Reputation: 71
I work with Dutch, French and English documents, and in my experience Office does NOT recognize the language correctly. When I write a document in the system language, everything is fine: spelling and grammar are checked, and the language is automatically set to the system language (even though the other two languages are installed on the system and in the Office language options).
Even while I write this text, all words are underlined in red, so Chrome does not detect the language either.
The system language is Dutch, and this problem has always existed. Whatever I try, I have to select everything, set the language manually, and then run the spelling check.
Looping through the languages makes no sense if the detection is not right. It seems to me that the language, spelling and grammar detection and correction options have been on stand-by since MS Office 2007, almost a decade. See here.
Whether this has to do with Dutch being a 'small' language, I don't know. If there were a way to set the language for the current document, a simple start-up macro would do the job. So far I have not found code that does this, except this simple macro I wrote:
Sub setlng()
    ' Set the proofing language of the whole document from a two-letter code
    Selection.WholeStory
    With Selection
        ' UCase$/Trim$ so that "nl", "Nl", " NL " etc. are all accepted
        Select Case UCase$(Trim$(InputBox("What's your language? (NL = Nederlands, FR = Français, EN = English, DE = Deutsch)")))
            Case "NL"
                .LanguageID = wdDutch
            Case "FR"
                .LanguageID = wdFrench
            Case "EN"
                .LanguageID = wdEnglishUS
            Case "DE"
                .LanguageID = wdGerman
        End Select
        Application.CheckLanguage = True
    End With
End Sub
Clearly, since MS Office was written in English, you have to use the English word for your language instead of the language's own name for itself, which would be more logical...
I'm very curious whether people who live in, e.g., Azerbaijan even find their language: "Selection.LanguageID = wdAzeriCyrillic"
...hm...
Upvotes: 2
Reputation: 23275
' Requires a reference to "Microsoft Scripting Runtime" (Tools > References) for Dictionary
Sub MostPopularLanguage()
    Dim doc As Document, para As Paragraph
    Dim lang As WdLanguageID
    Dim dict As New Dictionary
    Dim key As Variant

    Set doc = ActiveDocument
    If Not doc.LanguageDetected Then doc.DetectLanguage

    ' count languages in paragraphs
    For Each para In doc.Paragraphs
        lang = para.Range.LanguageID
        If Not dict.Exists(lang) Then
            dict.Add lang, 1
        Else
            dict(lang) = dict(lang) + 1
        End If
    Next para

    ' determine most popular language
    Dim maxCount As Long, maxKey As WdLanguageID
    For Each key In dict.Keys()
        If dict(key) > maxCount Then
            maxCount = dict(key)
            maxKey = key
        End If
    Next key

    ' prints the numeric WdLanguageID value of the winner
    Debug.Print "Most popular language is: " & maxKey
End Sub
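Since the question asks about Range.DetectLanguage specifically, here is a possible per-paragraph variation. Treat it as a sketch under assumptions rather than a tested solution: the name ReportDominantLanguage is made up, the dictionary is late-bound through CreateObject so no library reference is needed, and paragraphs that Word reports as mixed (wdUndefined) are simply skipped. I have not checked how reliable detection is on short paragraphs.

```vba
' Sketch only: run DetectLanguage per paragraph and tally the results.
' ReportDominantLanguage is a hypothetical name, not part of the Word object model.
Sub ReportDominantLanguage()
    Dim doc As Document, para As Paragraph
    Dim counts As Object                 ' late-bound Scripting.Dictionary
    Dim langId As Long, k As Variant
    Dim bestId As Long, bestCount As Long
    Dim langName As String

    Set doc = ActiveDocument
    Set counts = CreateObject("Scripting.Dictionary")

    ' doc.Paragraphs covers the main text only, so header text is not counted
    For Each para In doc.Paragraphs
        With para.Range
            If Len(.Text) > 1 Then       ' skip paragraphs holding only the paragraph mark
                .LanguageDetected = False    ' force a fresh detection
                .DetectLanguage
                langId = .LanguageID
                ' wdUndefined means the paragraph mixes several languages
                If langId <> wdUndefined Then
                    ' reading a missing key adds it as Empty, so this also initialises to 1
                    counts(langId) = counts(langId) + 1
                End If
            End If
        End With
    Next para

    If counts.Count = 0 Then
        Debug.Print "No paragraph with a single detected language was found."
        Exit Sub
    End If

    ' pick the language with the highest paragraph count
    For Each k In counts.Keys
        If counts(k) > bestCount Then
            bestCount = counts(k)
            bestId = k
        End If
    Next k

    ' look up a readable name; fall back to the numeric ID if the lookup fails
    On Error Resume Next
    langName = Application.Languages(bestId).NameLocal
    On Error GoTo 0
    If Len(langName) = 0 Then langName = "ID " & bestId

    Debug.Print "Most used language: " & langName & " (" & bestCount & " paragraphs)"
End Sub
```

Run it from the VBA editor with the document open and read the result in the Immediate window. Note that DetectLanguage changes the language formatting of the text it analyzes, so try it on a copy first.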
Upvotes: 4