UD.Cole
UD.Cole

Reputation: 45

Remove Duplicate value in single cell in Excel

I have the following text string in one single Excel cell:

AucklandAucklandarea
or 
WellingtonWellingtonarea

It is clear to see that the word phrases or strings include three words without any delimiter (space, comma, seminoma), such as Auckland, Auckland, area; Just Capital letters and lower letters can be identified what the words are.

My question is how can I delete duplicate word from a single cell or keep the word unique. I have the column which presents data as above described. And I have checked some solutions about Excel VBA, however those solutions provide good ideas but not suit for my case. Otherwise, I have to replace them item by item.

Upvotes: 2

Views: 1359

Answers (4)

VBasic2008
VBasic2008

Reputation: 54777

Remove Capital Duplicates

  • In a string, removes repeating sub strings starting with a capital letter.
  • You can use CapsNoDupes as a UDF in Excel e.g. =CapsNoDupes(A1).
Option Explicit

' Select some cells (ranges) and run this to apply the changes.
' To use a non-contiguous range, e.g. select 'A1:A5' then press and hold 'CTRL'
' and select 'C2' and select 'D4:E6' etc.
Sub TESTremoveCapDupesSelection()
    If TypeName(Selection) = "Range" Then
        Dim rng As Range: Set rng = Selection
        removeCapDupesInRange rng
    End If
End Sub

' A reminder that the range can be non-contiguous.
Sub TESTremoveCapDupesInRange()
    Dim rng As Range: Set rng = Range("A1,B2,C4")
    removeCapDupesInRange rng
End Sub

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose:      Utilizes the 'CapsNoDupes' function in a range.
' Remarks:      The range can be non-contiguous. For a contiguous range,
'               it can be written using an array which would increase
'               efficiency.
'               The capFirst argument determines if the first character
'               of the initial string is to be capitalized. Default is 'True'.
' FLow:         It loops through each cell of each area of the specified range.
'               It checks if the current cell does not contain an error value
'               and if it is of type string. If so, it calls the 'CapsNoDupes'
'               function which will possibly modify the cell value (in place).
' Precedents:   'CapsNoDupes', 'UniqueCapsToArray', 'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub removeCapDupesInRange( _
        rng As Range, _
        Optional ByVal Delimiter As String = " ", _
        Optional ByVal capFirst As Boolean = True)
    If Not rng Is Nothing Then
        Dim aRng As Range
        Dim cel As Range
        Dim Curr As String
        For Each aRng In rng.Areas
            For Each cel In aRng.Cells
                If Not IsError(cel) And VarType(cel) = vbString Then
                    cel.Value = CapsNoDupes(cel.Value, Delimiter, capFirst)
                End If
            Next cel
        Next aRng
    End If
End Sub


Sub TESTCapsNoDupes()
    
    Const s As String = "aucklandAucklandWhatEverAucklandarea"
    
    Debug.Print CapsNoDupes(s, False, ",")
    ' Result: 'auckland,Auckland,What,Ever,area'
    
    Debug.Print CapsNoDupes(s, True, ",")
    ' Result: 'Auckland,What,Ever,area'

End Sub

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose:      Returns the unique sub strings, denoted by capital letters
'               in a specified string, in a new delimited string removing
'               all repeating capitalized sub strings.
' Remarks:      The capFirst argument determines if the first character
'               of the initial string is to be capitalized. Default is 'True'.
' FLow:         Checks each element in the resulting "UniqueCapsToArray"'s
'               array against each other and determines which has more
'               characters. Then it tries to replace the string with less
'               characters in the string with more characters.
'               This is only done if the first character of the two comparing
'               strings is upper case. Finally it concatenates ('joins')
'               the elements of the array into a delimited string (the result).
' Precedents:   'UniqueCapsToArray', 'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function CapsNoDupes( _
    ByVal s As String, _
    Optional ByVal Delimiter As String = " ", _
    Optional ByVal capFirst As Boolean = True) _
As String
    Dim Caps As Variant: Caps = UniqueCapsToArray(s, capFirst)
    If Not IsEmpty(Caps) Then
        Dim i As Long
        Dim j As Long
        For i = 0 To UBound(Caps) - 1
            For j = 1 To UBound(Caps)
                If Left(Caps(i), 1) Like "[A-Z]" _
                        And Left(Caps(j), 1) Like "[A-Z]" Then
                    If Len(Caps(i)) = Len(Caps(j)) Then
                    ElseIf Len(Caps(i)) > Len(Caps(j)) Then
                        Caps(i) = Replace(Caps(i), Caps(j), _
                            "", , , vbBinaryCompare)
                    Else
                        Caps(j) = Replace(Caps(j), Caps(i), _
                            "", , , vbBinaryCompare)
                    End If
                End If
            Next j
        Next i
        CapsNoDupes = Join(Caps, Delimiter)
    End If
End Function


Sub testUniqueCapsToArray()
    
    Const s As String = "aucklandAucklandWhatEverAucklandarea"
    Dim arr As Variant
    
    arr = UniqueCapsToArray(s, False)
    Debug.Print Join(arr, ",")
    ' Result: 'auckland,Auckland,What,Ever,Aucklandarea'
    
    arr = UniqueCapsToArray(s, True)
    Debug.Print Join(arr, ",")
    ' Result: 'Auckland,What,Ever,Aucklandarea'

End Sub

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose:      Returns the unique sub strings, denoted by capital letters
'               in a specified string, in a 1D (zero-based) array.
' Remarks:      The capFirst argument determines if the first character
'               of the initial string is to be capitalized. Default is 'True'.
' Flow:         In the main part of the code ('Case Else'), loops through
'               the characters of the specified string to find an upper case
'               character. If found, writes the string ('word') containing
'               the previous characters (that had not already been written)
'               to the dictionary, which automatically discards any duplicates.
'               It continues looping and finally writes the keys
'               of the dictionary to the resulting 1D array.
' Precedents:   'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function UniqueCapsToArray( _
    ByVal CapDenotedString As String, _
    Optional ByVal capFirst As Boolean = True) _
As Variant
    Select Case Len(CapDenotedString)
        Case 0
            UniqueCapsToArray = VBA.Array("")
        Case 1
            If capFirst Then
                UniqueCapsToArray = VBA.Array(UCase(CapDenotedString))
            Else
                UniqueCapsToArray = VBA.Array(CapDenotedString)
            End If
        Case Else
            Dim s As String
            If capFirst Then
                s = capString(CapDenotedString)
            Else
                s = CapDenotedString
            End If
            Dim cStart As Long: cStart = 1
            Dim i As Long
            With CreateObject("Scripting.Dictionary")
                .CompareMode = vbBinaryCompare
                For i = 2 To Len(s)
                    If Mid(s, i, 1) Like "[A-Z]" Then
                        .Item(Mid(s, cStart, i - cStart)) = Empty
                        cStart = i
                    End If
                Next i
                .Item(Right(s, Len(s) - cStart + 1)) = Empty
                UniqueCapsToArray = .Keys
            End With
    End Select
End Function


Sub TESTcapString()
    Debug.Print capString("aaAB") ' Returns "AaAB".
    Debug.Print capString("AaAB") ' No change.
    Debug.Print capString("2aB") ' No change.
End Sub

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose:      Capitalizes a string i.e. replaces only the first character
'               with the same upper case character.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function capString( _
    ByVal s As String) _
As String
    capString = Replace(s, Left(s, 1), UCase(Left(s, 1)), , 1)
End Function

Upvotes: 2

Dy.Lee
Dy.Lee

Reputation: 7567

Use RegExp in VBA.

Sub setRegexPattern()
    Dim regEx As Object 'New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim Myrange As Range

    
    Set regEx = CreateObject("VBScript.RegExp")
    
    Set Myrange = ActiveSheet.Range("A1", Range("a" & Rows.Count).End(xlUp))
    
    For Each C In Myrange
        
        strPattern = "([A-Z][a-z]+[^A-Z])([A-z]+)(area)"
        
        If strPattern <> "" Then
            strInput = C.Value
            
            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With
            
            If regEx.test(strInput) Then
                C.Offset(0, 1) = regEx.Replace(strInput, "$1 $3")
            Else
                C.Offset(0, 1) = "(Not matched)"
            End If
        End If
    Next
    
End Sub

enter image description here

Upvotes: 1

Terbiy
Terbiy

Reputation: 690

If I had this problem, I would consider using additional tooling: code editor and regular expressions.

The algorithm is the following:

  1. Copy all cells to the code editor.
  2. Use "Find & Replace" in regex and case-sensitive mode:
    1. Text to find: [A-Z][a-z]+([A-Z][a-z]+)(area).
    2. Text to place: \1 \2.
  3. Replace everything.
  4. Copy the result and paste it back to Excel.

Here is the GIF I prepared to show how it looks on my side. enter image description here

Upvotes: 2

Gass
Gass

Reputation: 9344

This function will identify the substring before the second uppercase, it will bold it in the main data set and paste the substring in the next column for further check.

Give it a go, I hope it's useful..

Sub cleanData()

Dim position As Byte
Dim upper_case(1 To 26) As String
Dim counter As Byte
Dim db_row_start, db_row_ends, db_col As Integer
Dim use_row As Integer

'CONFIG
'don't forget to change this values before you try it
'--------------------
db_row_start = 3       'in what row does your data starts?
db_row_ends = 20     'in what row does your data ends?
db_col = 2                 'in what column is your data?
'--------------------

'fill the array upper_case with all upper case characters
For counter = 1 To 26
    upper_case(counter) = Chr(counter + 64)
Next counter

For use_row = db_row_start To db_row_ends
    For counter = 1 To 26
        
        position = _
        InStr(2, Cells(use_row, db_col), upper_case(counter), _
        vbBinaryCompare)
        
        'if a second appercase is found enter the function
        If position > 0 Then
            
            'move substring to the next column
            Cells(use_row, db_col + 1) = _
            Left(Cells(use_row, db_col), position - 1)
            
            'bold the substring copied to the next column
            Cells(use_row, db_col).Characters(1, position - 1). _
            Font.Bold = True
            
            counter = 26
        End If
        
  Next counter
Next use_row

End Sub

Upvotes: 1

Related Questions