user3623590
user3623590

Reputation: 11

Highlight all duplicates after first occurence

I have attached the excel sheet for reference. As shown in the "After" sheet, i need the 2nd to nth instance of duplicate keywords to be highlighted with some color.

Can somebody help me with this.Excel File

Upvotes: 1

Views: 151

Answers (1)

Tony Dallimore
Tony Dallimore

Reputation: 12413

This site exists to allow programmers to help other programmers develop their skills. Sometimes significant pieces of code are provided in answers but this is not a free coding site. You have asked for too much in one go and have shown no effort to solve your own problem. These are two major no-nos for this site.

Do you know Excel VBA? If not you are unlikely to understand any answer that includes code. Search for "VB Excel tutorial". There are many to choose from. Try a few and complete one that matches your learning style. I prefer books. I visited a large library and reviewed all the Excel VBA primers. I then bought the one I preferred.

Once you have some VBA under your belt, try to write the macro yourself. When I am unsure how to create a macro, I divide the total problem into steps. I then write a macro that will tackle step 1 and prove it has successfully tackled step 1. I then update the macro to tackle step 2 as well and prove it has successfully tackled steps 1 and 2. The advantages of this technique are:

  • I isolate each problem and
  • I know I am making progress.

If you write a large macro and it does not work, you do not know where to start with the debugging. An advantage of this technique for you, is that if you cannot get step 4, say, to work, you can isolate the relevant code, say what it does and seek help here to make it do what you want. That is the type of question that is answered most quickly.

There is a lot of detail below. Work through it slowly looking up the statements and functions I mention. Come back with questions if necessary but, the more you can work out for yourself, the faster you will develop your skills.

I should warn you that the code below has been typed straight into the answer and has not been checked. I do not believe there are any errors but you should be prepared for the possibility. I have also deliberately omitted parts of some statements to force you to look them up which I believe will help you develop. I hope this gets you started without actually giving you the answer.

These are the steps that I suggest for this problem:


You will need a for-loop that examines each row from 2 to RowLast. How do you find the value of RowLast? I searched for [excel-vba] find last row and found a variety of questions with answers that explained how to do this. Having set RowLast, you can prove you have the correct value by outputting it to the Immediate Window with:

Debug.Print RowLast

Code the loop to obtain cell A of each row within the required worksheet. To prove you are accessing each cell A of each row, I suggest something like:

Debug.Print RowCrnt & " " & .Cells(RowCrnt, "A").Value

You will need to split each cell into the separate words or phrases. In your example the divider is always comma space. Can you rely on it always being comma space or might it be a selection from comma space, space comma ,space comma space and so on? You need to use Split. Look it up. I suggest you split on comma and then use function Trim to remove any spaces. If you split to array CellPart, I suggest you obtain initial proof that you have split the cell correctly with:

Debug.Print Trim(CellPart(0)) & "|" Trim(CellPart(1))

To properly split the cell you will need another For loop that accesses each element of CellPart. Look up UBound to find out how to get the number of the last element of an array. I suggest something like the following. Look up Debug.Print to find the significance of the ";" at the end of line 2.

For InxCP = 0 to ....
  Debug.Print Trim(CellPart(InxCP)) & "|";
Next
Debug.Print

It now gets a little more complicated. You will have to build an array of all the words or phrases. For each element of CellPart, you need to check if it is already present in the array. If it is already present, you want to colour it. If it is not present, you need to add it to the array. You do not know in advance how many entries the array will have to hold. The statement Redim sizes an array while Redim Preserve sizes an array without discarding the current contents.

If you need to build a large array, Redim Preserve can be a slow statement. The interpreter has to find some memory big enough to hold the new, larger array, copy the existing array to this new space and pass the space for the old array to the garbage collector. I prefer to increase the size of my array in large chunks to reduce the number of times I use Redim Preserve. There are other techniques which may be better but I believe this technique is the easiest (not easy just easier than other techniques) to understand:

Dim InxTokenCrntMax As Long     ' Last used entry in Token
Dim Token() As String

' Initialise Token with room for 1000 entries
ReDim Token(1 To 1000)
InxTokenCrntMax = 0

...

' Add new word/phrase to Token
InxTokenCrntMax = InxTokenCrntMax + 1
If InxTokenCrntMax > UBound(Token) Then
  ' Token is full.  Add another 1000 entries.
  ReDim Preserve Token( 1 To 1000 + UBound(Token))
EndIf
Token(InxTokenCrntMax) = NewWordOrPhrase

Earlier I discussed the two outer For loops you need. The outer loop is for each row in the worksheet and the middle loop is for each word/phrase within cell A of a row. For each word/phrase you need to trim off any leading or trailing spaces and then, with an inner loop, match it against the current contents of the array Token. If you get a match, highlight the word/phrase within the cell and exit the current repeat of the inner loop. If you do not get a match, add the word/phrase to Token as shown above. I would use something like:

Dim Found As Boolean
Dim InxTokenCrnt As Long
Dim InxTokenCrntMax As Long     ' Last used entry in Token
Dim RowCrnt As Long
Dim RowLast As Long
Dim Token() As String

...

For RowCrnt = 2 ....
  CellPart = Split( ...
  For InxCP = 0 ...     
    TokenCrnt = Trim(CellPart(InxCP))
    Found = False
    For InxTokenCrnt = 1 To InxTokenCrntMax
      If Token(InxTokenCrnt) = TokenCrnt Then
        ' Highlight TokenCrnt within Cell
        ....
        Found = True
        Exit For
      End If
    Next
    If Not Found Then
      ' Add TokenCrnt to Token
      ...
    End If
  Next
Next

...

' Prove all the words/phrases have been collected into Token
For InxTokenCrnt = 1 To InxTokenCrntMax
  Debug.Print Token(InxTokenCrnt)
Next

The final step is to hightlight the duplicate words/phrases.

  • Turn on the Macro Recorder.
  • Highlight a word/phrase or two within a cell.
  • Switch off the Macro Recorder.

The recorded macro will show you the syntax of the statement that highlights part of a cell. However, this code:

  • Will operate on the selected cell.
  • Will use a number for the start position.
  • Will use a number for the length.

You will need to adjust this code to operate on a specified cell and to use variables for the position and length of the highlighted part.


Best of luck.

Upvotes: 2

Related Questions