Reputation: 45
I have the following text string in one single Excel cell:
AucklandAucklandarea
or
WellingtonWellingtonarea
It is clear to see that the word phrases or strings include three words without any delimiter (space, comma, seminoma), such as Auckland, Auckland, area; Just Capital letters and lower letters can be identified what the words are.
My question is how can I delete duplicate word from a single cell or keep the word unique. I have the column which presents data as above described. And I have checked some solutions about Excel VBA, however those solutions provide good ideas but not suit for my case. Otherwise, I have to replace them item by item.
Upvotes: 2
Views: 1359
Reputation: 54777
CapsNoDupes
as a UDF in Excel
e.g. =CapsNoDupes(A1)
.Option Explicit
' Select some cells (ranges) and run this to apply the changes.
' To use a non-contiguous range, e.g. select 'A1:A5' then press and hold 'CTRL'
' and select 'C2' and select 'D4:E6' etc.
Sub TESTremoveCapDupesSelection()
If TypeName(Selection) = "Range" Then
Dim rng As Range: Set rng = Selection
removeCapDupesInRange rng
End If
End Sub
' A reminder that the range can be non-contiguous.
Sub TESTremoveCapDupesInRange()
Dim rng As Range: Set rng = Range("A1,B2,C4")
removeCapDupesInRange rng
End Sub
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Utilizes the 'CapsNoDupes' function in a range.
' Remarks: The range can be non-contiguous. For a contiguous range,
' it can be written using an array which would increase
' efficiency.
' The capFirst argument determines if the first character
' of the initial string is to be capitalized. Default is 'True'.
' FLow: It loops through each cell of each area of the specified range.
' It checks if the current cell does not contain an error value
' and if it is of type string. If so, it calls the 'CapsNoDupes'
' function which will possibly modify the cell value (in place).
' Precedents: 'CapsNoDupes', 'UniqueCapsToArray', 'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub removeCapDupesInRange( _
rng As Range, _
Optional ByVal Delimiter As String = " ", _
Optional ByVal capFirst As Boolean = True)
If Not rng Is Nothing Then
Dim aRng As Range
Dim cel As Range
Dim Curr As String
For Each aRng In rng.Areas
For Each cel In aRng.Cells
If Not IsError(cel) And VarType(cel) = vbString Then
cel.Value = CapsNoDupes(cel.Value, Delimiter, capFirst)
End If
Next cel
Next aRng
End If
End Sub
Sub TESTCapsNoDupes()
Const s As String = "aucklandAucklandWhatEverAucklandarea"
Debug.Print CapsNoDupes(s, False, ",")
' Result: 'auckland,Auckland,What,Ever,area'
Debug.Print CapsNoDupes(s, True, ",")
' Result: 'Auckland,What,Ever,area'
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Returns the unique sub strings, denoted by capital letters
' in a specified string, in a new delimited string removing
' all repeating capitalized sub strings.
' Remarks: The capFirst argument determines if the first character
' of the initial string is to be capitalized. Default is 'True'.
' FLow: Checks each element in the resulting "UniqueCapsToArray"'s
' array against each other and determines which has more
' characters. Then it tries to replace the string with less
' characters in the string with more characters.
' This is only done if the first character of the two comparing
' strings is upper case. Finally it concatenates ('joins')
' the elements of the array into a delimited string (the result).
' Precedents: 'UniqueCapsToArray', 'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function CapsNoDupes( _
ByVal s As String, _
Optional ByVal Delimiter As String = " ", _
Optional ByVal capFirst As Boolean = True) _
As String
Dim Caps As Variant: Caps = UniqueCapsToArray(s, capFirst)
If Not IsEmpty(Caps) Then
Dim i As Long
Dim j As Long
For i = 0 To UBound(Caps) - 1
For j = 1 To UBound(Caps)
If Left(Caps(i), 1) Like "[A-Z]" _
And Left(Caps(j), 1) Like "[A-Z]" Then
If Len(Caps(i)) = Len(Caps(j)) Then
ElseIf Len(Caps(i)) > Len(Caps(j)) Then
Caps(i) = Replace(Caps(i), Caps(j), _
"", , , vbBinaryCompare)
Else
Caps(j) = Replace(Caps(j), Caps(i), _
"", , , vbBinaryCompare)
End If
End If
Next j
Next i
CapsNoDupes = Join(Caps, Delimiter)
End If
End Function
Sub testUniqueCapsToArray()
Const s As String = "aucklandAucklandWhatEverAucklandarea"
Dim arr As Variant
arr = UniqueCapsToArray(s, False)
Debug.Print Join(arr, ",")
' Result: 'auckland,Auckland,What,Ever,Aucklandarea'
arr = UniqueCapsToArray(s, True)
Debug.Print Join(arr, ",")
' Result: 'Auckland,What,Ever,Aucklandarea'
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Returns the unique sub strings, denoted by capital letters
' in a specified string, in a 1D (zero-based) array.
' Remarks: The capFirst argument determines if the first character
' of the initial string is to be capitalized. Default is 'True'.
' Flow: In the main part of the code ('Case Else'), loops through
' the characters of the specified string to find an upper case
' character. If found, writes the string ('word') containing
' the previous characters (that had not already been written)
' to the dictionary, which automatically discards any duplicates.
' It continues looping and finally writes the keys
' of the dictionary to the resulting 1D array.
' Precedents: 'capString'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function UniqueCapsToArray( _
ByVal CapDenotedString As String, _
Optional ByVal capFirst As Boolean = True) _
As Variant
Select Case Len(CapDenotedString)
Case 0
UniqueCapsToArray = VBA.Array("")
Case 1
If capFirst Then
UniqueCapsToArray = VBA.Array(UCase(CapDenotedString))
Else
UniqueCapsToArray = VBA.Array(CapDenotedString)
End If
Case Else
Dim s As String
If capFirst Then
s = capString(CapDenotedString)
Else
s = CapDenotedString
End If
Dim cStart As Long: cStart = 1
Dim i As Long
With CreateObject("Scripting.Dictionary")
.CompareMode = vbBinaryCompare
For i = 2 To Len(s)
If Mid(s, i, 1) Like "[A-Z]" Then
.Item(Mid(s, cStart, i - cStart)) = Empty
cStart = i
End If
Next i
.Item(Right(s, Len(s) - cStart + 1)) = Empty
UniqueCapsToArray = .Keys
End With
End Select
End Function
Sub TESTcapString()
Debug.Print capString("aaAB") ' Returns "AaAB".
Debug.Print capString("AaAB") ' No change.
Debug.Print capString("2aB") ' No change.
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose: Capitalizes a string i.e. replaces only the first character
' with the same upper case character.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function capString( _
ByVal s As String) _
As String
capString = Replace(s, Left(s, 1), UCase(Left(s, 1)), , 1)
End Function
Upvotes: 2
Reputation: 7567
Use RegExp in VBA.
Sub setRegexPattern()
Dim regEx As Object 'New RegExp
Dim strPattern As String
Dim strInput As String
Dim Myrange As Range
Set regEx = CreateObject("VBScript.RegExp")
Set Myrange = ActiveSheet.Range("A1", Range("a" & Rows.Count).End(xlUp))
For Each C In Myrange
strPattern = "([A-Z][a-z]+[^A-Z])([A-z]+)(area)"
If strPattern <> "" Then
strInput = C.Value
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.test(strInput) Then
C.Offset(0, 1) = regEx.Replace(strInput, "$1 $3")
Else
C.Offset(0, 1) = "(Not matched)"
End If
End If
Next
End Sub
Upvotes: 1
Reputation: 690
If I had this problem, I would consider using additional tooling: code editor and regular expressions.
The algorithm is the following:
[A-Z][a-z]+([A-Z][a-z]+)(area)
.\1 \2
.Here is the GIF I prepared to show how it looks on my side.
Upvotes: 2
Reputation: 9344
This function will identify the substring before the second uppercase, it will bold it in the main data set and paste the substring in the next column for further check.
Give it a go, I hope it's useful..
Sub cleanData()
Dim position As Byte
Dim upper_case(1 To 26) As String
Dim counter As Byte
Dim db_row_start, db_row_ends, db_col As Integer
Dim use_row As Integer
'CONFIG
'don't forget to change this values before you try it
'--------------------
db_row_start = 3 'in what row does your data starts?
db_row_ends = 20 'in what row does your data ends?
db_col = 2 'in what column is your data?
'--------------------
'fill the array upper_case with all upper case characters
For counter = 1 To 26
upper_case(counter) = Chr(counter + 64)
Next counter
For use_row = db_row_start To db_row_ends
For counter = 1 To 26
position = _
InStr(2, Cells(use_row, db_col), upper_case(counter), _
vbBinaryCompare)
'if a second appercase is found enter the function
If position > 0 Then
'move substring to the next column
Cells(use_row, db_col + 1) = _
Left(Cells(use_row, db_col), position - 1)
'bold the substring copied to the next column
Cells(use_row, db_col).Characters(1, position - 1). _
Font.Bold = True
counter = 26
End If
Next counter
Next use_row
End Sub
Upvotes: 1