Reputation: 21
I have the VB code below
Sub CountWordFrequencies()
Dim InputSheet As Worksheet
Dim WordListSheet As Worksheet
Dim PuncChars As Variant, x As Variant
Dim i As Long, r As Long, b As Long
Dim txt As String
Dim wordCnt As Long
Dim AllWords As Range
Dim PC As PivotCache
Dim PT As PivotTable
Dim PF As PivotField
Application.ScreenUpdating = False
Set InputSheet = ActiveSheet
Set WordListSheet = Worksheets.Add(after:=Worksheets(Sheets.Count))
WordListSheet.Range("A1").Font.Bold = True
WordListSheet.Range("A1") = "All Words"
InputSheet.Activate
wordCnt = 2
PuncChars = Array(".", ",", ";", ":", "'", "!", "#", _
"$", "%", "&", "(", ")", " - ", "_", "--", "+", _
"=", "~", "/", "\", "{", "}", "[", "]", """", "?", "*")
r = 2
Dim NotRealWord As Variant
NotRealWord = Array("OF","THE")
Do While Cells(r, 1) <> ""
txt = UCase(Cells(r, 1))
For i = 0 To UBound(PuncChars)
txt = Replace(txt, PuncChars(i), "")
Next i
txt = WorksheetFunction.Trim(txt)
x = Split(txt)
For i = 0 To UBound(x)
WordListSheet.Cells(wordCnt, 1) = x(i)
wordCnt = wordCnt + 1
Next i
r = r + 1
Loop
WordListSheet.Activate
Set AllWords = Range("A1").CurrentRegion
Set PC = ActiveWorkbook.PivotCaches.Add _
(SourceType:=xlDatabase, _
SourceData:=AllWords)
Set PT = PC.CreatePivotTable _
(TableDestination:=Range("C1"), _
TableName:="PivotTable1")
With PT
.AddDataField .PivotFields("All Words")
.PivotFields("All Words").Orientation = xlRowField
.PivotFields("All Words") _
.AutoSort xlDescending, "Count of All Words"
End With
Set PF = ActiveSheet.PivotTables("PivotTable1").PivotFields("All Words")
With PF
.ClearManualFilter
.EnableMultiplePageItems = True
For b = LBound(NotRealWord) To UBound(NotRealWord)
.PivotItems(NotRealWord(b)).Visible = False
Next b
End With
End Sub
This one is a Word Frequency Analysis function where the user will insert the list of strings in column A, starting from A2. They will click a button that run this script. The script will then break the strings into single words and create a pivot table that will count the frequency of each word, sorted by the frequency.
Here are the pictures showing the mechanism:
The result
Now I have an issue with the filter. Ultimately, I want the pivot table to automatically filter out the list of words in the "NotRealWord" array because these are not useful words to analyze. My code works only when the script can find all values in the array list in the words being broken out. So in my example, I set NotRealWord = Array("OF", "THE") and the pivot table field does have these words so it works perfectly. But if I added "BY", it returns this error "Unable to get the PivotItems property of the PivotField class". How do I fix this?
Or even better, how can I make NotRealWord a dynamic array which takes the list of values in let's say column F so that the user can add in more words that they want to filter out without having to fix the code (my first picture also shows that column F).
Please note that I'm not super good at VB. I know how to read and adapt complicated codes but don't know the in and out of FB word
Upvotes: 0
Views: 512
Reputation: 166126
Here's one possible approach which is a little different from your current one but should do what you want:
Sub WordCountTester()
Dim d As Object, k, i As Long, ws As Worksheet
Set ws = ActiveSheet
Set d = WordCounts(ws.Range("A2:A" & ws.Cells(Rows.Count, "A").End(xlUp).Row), _
ws.Range("F2:F" & ws.Cells(Rows.Count, "F").End(xlUp).Row))
'list words and frequencies
For Each k In d.keys
ws.Range("H2").Resize(1, 2).Offset(i, 0).Value = Array(k, d(k))
i = i + 1
Next k
End Sub
'rngTexts = range with text to be word-counted
'rngExclude = range with words to exclude from count
Public Function WordCounts(rngTexts As Range, rngExclude As Range) As Object 'dictionary
Dim words, c As Range, dict As Object, regexp As Object, w, wd As String, m
Set dict = CreateObject("scripting.dictionary")
Set regexp = CreateObject("VBScript.RegExp") 'see link below for reference
With regexp
.Global = True
.MultiLine = True
.ignorecase = True
.Pattern = "[\dA-Z-]{2,}" 'at least two characters...
End With
'loop over input range
For Each c In rngTexts.Cells
If Len(c.Value) > 0 Then
Set words = regexp.Execute(UCase(c.Value))
'loop over matches
For Each w In words
wd = w.Value 'the text of the match
If Not IsNumeric(wd) Then 'EDIT: ignore numbers
'increment count if the word is not found in the "excluded" range
If IsError(Application.Match(wd, rngExclude, 0)) Then
dict(wd) = dict(wd) + 1
End If
Else
Debug.Print "Filtered out", wd 'excluding numbers...
End If '>1 char
Next w
End If
Next c
Set WordCounts = dict
End Function
Regular expressions reference: https://learn.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/scripting-articles/ms974570(v=msdn.10)
Upvotes: 2