Reputation: 97
My spreadsheet has a column which can have multiple word or phrase duplicates within a cell.
Here is an example of the type of values:
╔════════════════════════╗
║ Column A ║
╠════════════════════════╣
║ Apple ║
║ Apple Apple ║
║ Apple Pie ║
║ Apple Pie Apple Pie ║
╚════════════════════════╝
I am looking for the best way to delete duplicate values within a cell. Is there any known VBA code or excel formulas I could use? It would need to recognize a either duplicate words or phrases (up to 3 words) The spreadsheet is large and would be a pain to go and manually remove them.
Any ideas , solutions or direction would be great.
Upvotes: 0
Views: 1768
Reputation: 97
I was able to use an Excel function that finds and deletes MOST of the duplicate Names and phrases within the cell. Hope this helps others
=IF(LEN(A1)-LEN(SUBSTITUTE(A1," ",""))>=2,LEFT(A1,FIND(LEFT(A1,FIND(" ",A1,FIND(" ",A1)+1)-1),A1,2)-2),LEFT(A1,FIND(" "&LEFT(A1,FIND(" ",A1)-1),A1,2)-1))
Upvotes: 0
Reputation: 16311
You can use a regex with a backreference to match duplicated words or phrases. The pattern ^(.+)\s*\1$
will match any duplicating phrase with optional whitespace in between.
Const strText = "Apple Pie Apple Pie"
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "^(.+)\s*\1$"
If re.Test(strText) Then
Debug.Print re.Replace(strText, "$1")
End If
Output:
Apple Pie
Upvotes: 1