Reputation: 3
I found out that an office at my work spent weeks manually going through an Excel spreadsheet containing a database with >500,000 rows looking for duplicate rows matching certain criteria. The duplicates could not simply be erased before being researched, as a single mistake could have potentially lost hundreds of thousands of dollars in lost production. I decided simply flagging them and referencing the originating row would be the best answer in this case. So I decided to look into macros to see how much time could have been saved by using a simple macro instead. I am using this as a programming learning experience, so please no "here's a =function()" answers.
I've written a macro and changed it several times to no avail (most current is below). I wanted to use String variables because there's no telling what has been entered into the cells that will be checked. Here's what I've tried, failed, and learned(?) from this site:
Initially, I tried declaring a variable, and attaching a value from a cell directly to it. e.g. Dim myString As String Set myString = Cells(x, x).Value
However, I kept getting object errors. Thanks to Michael's response here, I learned that you have to use the Range
variable to use Set
.
My next issue has been getting a "type mismatch" error. I'm trying to assign and compare a stored variable against another stored variable, and I'm sure this is causing the issue. I initially tried Dim myRange As Range, myString As String Set myRange = Cells(x, x).Value myString = myRange
. This obviously didn't work, so I tried using the CStr()
"change to string" function to convert the Range variable to the String variable I want. And that's where I'm stuck.
Sub Duplicate()
'Declare the variables
Dim NSNrange, PNrange, KitIDrange As Range
Dim NSN, PN, KitID As String
Dim NSNCheck, PNCheck, KitIDCheck As String
Dim i, j, printColumn, rowCount As Integer
'Set which column we want to print duplicates on, and count the number of rows used
rowCount = ActiveSheet.UsedRange.Rows.Count
printColumn = 9
'Lets get started!
'Clear the duplicate list column for a fresh start
Columns(printColumn).EntireColumn.Delete
'Start on line 2, and grab the cell values for the NSN, Part number and kit ID.
For i = 2 To rowCount
Set NSNrange = Cells(i, 5).Value
Set PNrange = Cells(i, 7).Value
Set KitIDrange = Cells(i, 2).Value
'Change whatever is contained in those cells into a string and store them into their respective containers
NSN = CStr(NSNrange)
PN = CStr(PNrange)
KitID = CStr(KitIDrange)
'Now let's look through the rest of the sheet and find any others that match the 3 variables that we stored above
For j = 2 To rowCount
'To avoid needless checks, we'll check to see if it's already had a duplicate found. If so, we'll just skip to the next row
If Cells(j, printColumn).Value = "" Then
'If the print column is blank, we'll grab the 3 values from the current row to compare against the above variables
Set NSNrange = Cells(j, 5).Value
Set PNrange = Cells(j, 7).Value
Set KitIDrange = Cells(j, 2).Value
'Now we store the contents into their very own container
NSNCheck = CStr(NSNrange)
PNCheck = CStr(PNrange)
KitIDCheck = CStr(KitIDrange)
'Check the initial row with the current row to see if the contents match. If so, print which row it is duplicated on.
If NSN = NSNCheck And PN = PNCheck And KitID = KitIDCheck Then Cells(j, printColumn).Value = "Duplicated on row " & i
End If
Next j
Next i
MsgBox "Search Complete"
End Sub
Upvotes: 0
Views: 2248
Reputation: 2628
You can also use the RemoveDuplicates method.
'Remove duplicates based on the data in columns; 2 "Kit", 5 "NSN", and 7 "PN".
ActiveSheet.UsedRange.RemoveDuplicates Columns:=Array(2, 5, 7), Header:=xlYes
Upvotes: 0
Reputation: 3139
So it's correct that you assign objects with set
(not just range
). A cell
is an object and can be assigned to a range
variable. But when you use methods and properties of objects, in this case .Value
, it does not mean that the return value is a range
object.
So if you need to know what all propertys and methods to, i highly recommend the microsoft documentation.
So when you use .Value
, you get back a variant (depending on the type of the value). In your use-case you can just assign it to a string
, i.e Dim str as string: str = Cells(1,1).Value
. If you just want the cell
as an object that you can reference: Dim cell as Range: Set cell = Cells(1,1)
. Now can can adress all propertys and methods for example: cell.Value
instead of Cells(1,1).Value
.
Just a few other things that are useful to know. In VBA not like in VB.Net, that you better not mix up, if you Dim var1, var2 as String
only var2
is a string
, var1
is a variant
. So its required to specify the type for each variable, Dim var1 as String, var2 as String
.
Another thing that you might want to change is assigning Cells, Range
to a specific Worksheet
. Depending on which module your code is in, it can happen that you code runs on the wrong worksheet. (Also it minimizes errors when other people adjust/run the code), but mainly you just have to change one variable, if you want to refer to another Worksheet
. It can be done with using Worksheet-Object
.
Dim ws as Worksheet
Dim str as String
Set ws = Worksheets(1)
'Now adress methods and properties with ws
str = ws.Cells(1,1).Value
Also note here the object is Worksheet
without an s
. Worksheets
is the collection of the Worksheet
of the current Workbook
.
Upvotes: 0
Reputation: 84465
As you asked for comments in relation to type errors. There are a number of place where confusion could arise
1) Every line where you do multiple declarations on the same line like this:
Dim NSNrange, PNrange, KitIDrange As Range
Only the last variable is explicitly type declared (in this case as a Range
). The others are implicit Variant
. So, I have gone through and put on separate lines and declared them as I believe you may have meant them to be.
2) Using Activesheet
and, in other places, just Cells
or Range
, which implicitly references the Activesheet
, means if you have changed sheets by then you may longer be referring to the sheet you intended. So whilst I have kept Activesheet
in, and used an overarching With Activesheet
statement that then allows me to say .Cells
or .Range
etc, you should change this to using explicit sheet names.
3) Where ever you use the Set
keyword the expectation is your are working with an object (e.g. a Range
). Going by your naming convention I am going to say that you mean
Set NSNrange = Cells(i, 5)
when you say
Set NSNrange = Cells(i, 5).Value
Which sets a range to another range rather than a cell value.
4) I have changed your Integers to Longs. You are working with rows which can go beyond what Integer
type can handle so you risked overflow. Long
is safer.
5) Rather than doing a conversion on the Range
as follows
NSN = CStr(NSNrange)
Where the default property of the range, .Value
, will be taken, as you want a string you can drop the CStr
conversion and just take the .Text
property which will give you the string you want.
6) Rather than the empty string literal ""
comparison, I have used vbNullString
which is faster to assign and to check.
Option Explicit
Sub Duplicate()
Dim NSNrange As Range
Dim PNrange As Range
Dim KitIDrange As Range
Dim NSN As String
Dim PN As String
Dim KitID As String
Dim NSNCheck As String
Dim PNCheck As String
Dim KitIDCheck As String
Dim i As Long
Dim j As Long
Dim printColumn As Long
Dim rowCount As Long
With ActiveSheet
rowCount = .UsedRange.Rows.Count
printColumn = 9
.Columns(printColumn).EntireColumn.Delete
For i = 2 To rowCount
Set NSNrange = .Cells(i, 5)
Set PNrange = .Cells(i, 7)
Set KitIDrange = .Cells(i, 2)
NSN = NSNrange.Text
PN = PNrange.Text
KitID = KitIDrange.Text
For j = 2 To rowCount
If .Cells(j, printColumn).Value = vbNullString Then
Set NSNrange = .Cells(j, 5)
Set PNrange = .Cells(j, 7)
Set KitIDrange = .Cells(j, 2)
NSNCheck = NSNrange.Text
PNCheck = PNrange.Text
KitIDCheck = KitIDrange.Text
If NSN = NSNCheck And PN = PNCheck And KitID = KitIDCheck Then
.Cells(j, printColumn).Value = "Duplicated on row " & i
End If
End If
Next j
Next i
End With
MsgBox "Search Complete"
End Sub
Upvotes: 1