sikorloa
sikorloa

Reputation: 101

Categorize Each String under Group (1 time), based on String content? (Excel)

Complicated problem... let me explain first, maybe there is a better solution rather than using iterative calculations:

(Link to Workbook)

Image showing example (to show what I'm working with)

PROBLEM:

Have 4,000+ Strings and would like to categorize them into pre-determined Groups (based on the String's content).

  1. Each String should only be assigned to one Group. (ie. "55 gallon drum faucet" will be listed under "faucet" column, since it contains the word "faucet".)

  2. Once categorized into a Group, the string won't be categorized under any other Groups. (ie. "55 gallon drum faucet" won't be categorized under "drum" once it's already been categorized under "faucet").

  3. It really doesn't matter which Group each String does under, as long as it's categorized.

Note: (I've almost found a solution using iterative calculation, but it doesn't quite work).

SOLUTION:

The way I approached the problem was:

  1. Count the number of times the String (Column A) was duplicated in the worksheet using the formula:

     Formula: =COUNTIF($E$2:$IA$10000,A3)
    
    • This formula goes in Column C.
  2. Created a formula that will categorize a String underneath a Group based on whether the String contains the Group word (ie. "faucet", "beer", "gallon", "kitchen", etc)... AND has not been used before (ie. Column C, which contains the formula from above).

      Formula: =IF(C3<1,IF(IFERROR(SEARCH("faucet",A3),0)>0,A3,""),"")
    
  3. Drag formula down for all 4,000 strings in Column C, and for each individual "Group" column.

The problem with this approach is that it will do an iterative calculation which will either:

  1. Categorize the String under a Group (but WON'T increase the Times Dup'd field from 0 to 1)...

OR

  1. Increase the "Times Dup'd" field from 0 to 1 BUT will keep the String from being categorized under the Groups column.

Any suggestions on how to address the iterative calculation issue? (I know it keeps calculating back and forth since it's dependent, so will have to settle with 1 "right" solution... I'm wondering if there's any way to create some sort of 'block' so it can only be calculated one way...)

Any help would be greatly appreciated!

Upvotes: 1

Views: 551

Answers (2)

user4039065
user4039065

Reputation:

Run this procedure through your data. It performs all processing within a pair of variant arrays.

Sub byGroup()
    Dim g As Long, s As Long, aSTRs As Variant, aGRPs As Variant

    appTGGL bTGGL:=False

    With Worksheets("Sheet1")
        aSTRs = .Range(.Cells(2, 1), .Cells(Rows.Count, 1).End(xlUp)).Value2
        With .Range(.Cells(1, 5), .Cells(Rows.Count, 1).End(xlUp).Offset(0, Application.Match("zzz", .Rows(1)) - 1))
            .Resize(.Rows.Count, .Columns.Count).Offset(1, 0).ClearContents
            aGRPs = .Cells.Value2
        End With

        For s = LBound(aSTRs, 1) To UBound(aSTRs, 1)
            For g = LBound(aGRPs, 2) To UBound(aGRPs, 2)
                If CBool(InStr(1, aSTRs(s, 1), aGRPs(1, g), vbTextCompare)) Then
                    aGRPs(s + 1, g) = aSTRs(s, 1)
                    Exit For
                End If
            Next g
        Next s

        .Cells(1, 5).Resize(UBound(aGRPs, 1), UBound(aGRPs, 2)) = aGRPs

    End With

    appTGGL
End Sub

Public Sub appTGGL(Optional bTGGL As Boolean = True)
    Debug.Print Timer
    Application.ScreenUpdating = bTGGL
    Application.EnableEvents = bTGGL
    Application.DisplayAlerts = bTGGL
    Application.Calculation = IIf(bTGGL, xlCalculationAutomatic, xlCalculationManual)
End Sub

Elapsed time (excluding your worksheet formula recalculation should be in the 1-2 second range.

Priority of the matching groups is left to right. If you think '55 gallon drum' should be grouped as drum and not gallon then make sure that drum comes before gallon in row 1.

Saving the new macro-enabled workbook as an Excel Binary Workbook (.XLSB) reduces the workbook file size by roughly half.

Upvotes: 2

SincereApathy
SincereApathy

Reputation: 675

I was working on something and Jeeped beat me to the answer. I tried Jeeped's code, but was getting multiple group entries for some of the strings. Here is the code I was working on if it's of any value at this point:

Sub sikorloa()

Dim r As Integer
Dim c As Integer
Dim LastRow As Integer
Dim LastCol As Integer
Dim strng As String
Dim grp As String

Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False

LastRow = Range("A" & Rows.Count).End(xlUp).Row
LastCol = Cells(1, Cells.Columns.Count).End(xlToLeft).Column

For r = 3 To LastRow
    If Cells(r, 1).Value <> "" Then
        strng = Cells(r, 1).Value
        For c = 5 To LastCol
            grp = Cells(1, c).Value
            If InStr(strng, grp) > 0 Then
                Cells(r, c).Value = Cells(r, 1).Value
                Exit For
            End If
        Next c
    End If
Next r

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

End Sub

Upvotes: 2

Related Questions