Reputation: 35
My data looks like this:
negative Comment1
neutral Comment1
positive Comment1
neutral Comment1
positive Comment1
negative Comment1
I have a simple macro that I have written to do three steps:
Finds the most frequently used word in the first column
Deletes all other rows that do not contain this word
Deletes all duplicate rows.
Basically, my macro looks at what was written in the first column and reduces the entire chunk to only one row based on what word was the most frequently used.
An example:
This works so far, but now I would like to apply this same macro to different 'chunks' in an entire sheet. In reality, my data looks more like this:
positive Comment1
neutral Comment1
positive Comment1
negative Comment1
positive Comment2
neutral Comment2
positive Comment2
negative Comment3
negative Comment3
negative Comment3
positive Comment3
And I want to essentially use my macro to reduce it to this:
positive Comment1
positive Comment2
negative Comment3
So far, my macro only works for one comment. Is there a way I can detect differences in comments, for instance, so I can make my macro loop through each 'chunk' of similar comments? Any help is greatly appreciated, thank you.
Here is my macro so far:
Sub MostFrequent()
Dim rng As Range
Dim WorkRng As Range
Set dic = CreateObject("scripting.dictionary")
On Error Resume Next
xTitleId = "KutoolsforExcel"
Set WorkRng = Application.Selection
Set WorkRng = Application.InputBox("Range", xTitleId, WorkRng.Address, Type:=8)
xMax = 0
xOutValue = ""
For Each rng In WorkRng
xValue = rng.Value
If xValue <> "" Then
dic(xValue) = dic(xValue) + 1
xCount = dic(xValue)
If xCount > xMax Then
xMax = xCount
xOutValue = xValue
End If
End If
Next
Dim xRow As Range
Dim xStr As String
On Error Resume Next
xTitleId = "KutoolsforExcel"
xStr = xOutValue
Application.ScreenUpdating = False
For i = WorkRng.Rows.Count To 1 Step -1
Set xRow = WorkRng.Rows(i)
Set rng = xRow.Find(xStr, LookIn:=xlValues)
If rng Is Nothing Then
xRow.EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
With ActiveSheet
Set rng = Range("A1", Range("B1").End(xlDown))
rng.RemoveDuplicates Columns:=Array(1, 2), Header:=xlNo
End With
End Sub
Upvotes: 0
Views: 81
Reputation: 1815
Assuming that you are really only using Column A and B.
The code below will insert formulas on Column C, D, and E. (I could really only use 1 formula, left all 3 to be easier to understand what i'm doing)
With these formulas i get the times and % a comments gets repeated with a "Semantic". If the percentage is lower than 50% it means that it is not the most frequent so it gets deleted. You are left with duplicates of the most frequent, I remove duplicates and the formulas.
Sub delRows()
Dim lastRow As Long, i As Long, wk As Workbook, repeatDel As Boolean
Set wk = ThisWorkbook
Application.ScreenUpdating = False
With wk.Sheets("Sheet1")
lastRow = .Cells(Rows.Count, 1).End(xlUp).Row
.Cells(2, 3).Formula = "=CONCAT(MID(B2,1,240),A2)"
.Cells(2, 4).Formula = "=COUNTIF(C:C,C2)"
.Cells(2, 5).Formula = "=D2/COUNTIF(B:B,B2)"
'FillDown formulas
.Range("C2:E" & lastRow).FillDown
'When deleting rows, you should loop from bottom to top.
For i = lastRow To 2 Step -1
If .Cells(i, 5) < 0.5 Then
'Delete least frequent
.Cells(i, 5).EntireRow.Delete
repeatDel = True
End If
Next i
'Remove duplicates and Formulas
.Columns("A:E").RemoveDuplicates Columns:=Array(1, 2, 3, 4, 5), Header:=xlYes
.Columns("C:E").ClearContents
End With
Application.ScreenUpdating = True
End Sub
Upvotes: 1