Brian Powell
Brian Powell

Reputation: 3411

mySQL + vba, checking to see if an item already exists in a table

I have a spreadsheet that's used to connect to my mySQL database and make changes to that database. I think the way I'm checking for duplicates in the system is very inefficient, but I'm having a hard time thinking of a more efficient way of doing this. The time it takes to go item by item isn't that long with 100 items, but it's VERY long with 50,000 items, and I'm trying to find a way to shorten the amount of time this process takes.

In Excel

'/ Define last row with data in it
lastRow = Range("C" & Rows.Count).End(xlUp).Row

'/ go through each row, line by line, to check if this part number is in the system
For c = 5 To lastRow Step 1

'/ Based on user input, runs a SELECT query to see if the part number in question is already in the table they're trying to upload into
SQLStr = "SELECT quotePartNumber FROM " & table & " where (quotePartNumber, `debitNumber) IN (('" & partNumber & "', '" & debitNum & "')) LIMIT 1"`

rs.Open SQLStr, conn, adOpenStatic
        ' Dump to spreadsheet
        With Worksheets("DATA").Range("N28:N28")
            .ClearContents
'/ copy SQL output to cell N28, then check if an output exists.  If the part exists in the table already, a part number will fill N28, if the part doesn't exist, N28 will be blank.             
.CopyFromRecordset rs
        End With

'/ If the part already exists, delete this part number from the spreadsheet.
        pnCheck = Range("N28").Value
        If pnCheck <> "" Then
            Range(Cells(c, 1), Cells(c, 11)).Select
            Selection.Delete Shift:=xlUp
            deleted = deleted + 1
            c = c - 1
        End If

        '/ reset rs for the next run through
        rs.Close
        Set rs = Nothing
        Range("N28") = ""
        debit = False


next c

So basically, I go through each line in the spreadsheet, check to see if that part exists already in my table, and if it does, I delete that line from the spreadsheet. Once the list is whittled down to what's NOT in the table already, I upload it (that code is not shown here...)

Is there a more efficient way of checking anywhere from 5,000 to 100,000 part numbers against a table than going line by line?

Upvotes: 0

Views: 907

Answers (1)

mock_blatt
mock_blatt

Reputation: 965

Sorry, misunderstood initially.

How does mySQL respond if you try to upload the table with duplicate IDs? The field should be defined as requiring unique values if that's what it needs, and I think MySQL would just throw a warning and dump the offending rows.

That aside, this is something Excel can do faster than you repeatedly querying, writing, deleting, and updating. Can you dump ALL the table's IDs to your spreadsheet in a different sheet? Then you could use Excel's countif formula and delete anything with more than 1. I'm on a phone so it's hard to be specific but have a countif next to each spreadsheet ID, and count if it's in the range of mySQL IDs. Then filter for > 1 and delete. You could also dump the ids into one column, use an adjacent column to track what was spreadsheet and what was db, use excel's remove duplicates, then delete the db entries. All that's left are the uniques spreadsheet entries.

In general, for performance, selection and modification is slow. Instead of selecting a range and deleting the selection, delete the range outright. Deleting and shifting is slow too, so maybe just flag the offending rows in an adjacent column, then delete them all at once. Also, turn off screenupdating, application.screenupdating = false. Updating the screen is slow too.

Upvotes: 1

Related Questions