Reputation: 143
I am working on an assignment that I have to compare almost half million rows of data in two table. the structure of table is similar and so is the data (ID column is about 99% matched, but I need to compare the rest of the columns, which is about 15 column)
say for instance, in table A the null value is presented by 'N/A', but in table B it's just null. Boolean column in table A present as 'True' and 'False' but '0' and '1' in table B.
The way I approach to this question is to write a cursor that will process data in table A row by row and compare to table B using 'IF EXISTS'
if all the data matched then I insert that row into a table called 'matched' if the data doesn't match then insert to other table called 'unmatched'
my cursor is still running currently it's almost 3 hours now.
want to know if there is other approach to solve my question.
So here is a Sample of two table
create table A (ID int, Name varchar(20), Address varchar(40), City varchar(20),
State varchar(20), Country varchar(20))
create table B (ID int, Name varchar(20), Address varchar(40), City varchar(20),
State varchar(20), Country varchar(20))
Insert into A values(1,'Daniel','12345 marion way', 'Sunnyvale', 'CA', 'USA'),
(2,'Chloe','38660 Hastings Street', 'Fremont', 'CA', 'USA'),
(3,'Charles', '24980 willimet way', 'Hayward', 'CA', 'USA'),
(4,'Tommy', '98765 washington street', 'Fremont', 'CA', 'USA')
Insert into B values(1,'Daniel','12345 marion way', 'Santa Clara', 'CA', 'USA'),
(2,'Chloe',38660 Hastings Street', 'Fremont', 'CA', 'USA'),
(3,'Charles', '24980 willimet way', 'Hayward', 'CA', 'USA')
So I create a cursor for table A and try to match table B, ID:1 the city value is wrong and ID:4 doesn't exist in table B.
Declare compare Cursor for
Select ID, Name, Address, City, State, Country from A
Open Compare
Declare @ID int, @Name varchar(20), @Address varchar(50), @City varchar(20)
,@State varchar(20), @Country varchar(20)
Fetch next from compare into @ID, @Name, @Address, @City, @State, @Country
While @@Fetch_Status = 0
Begin
Begin
IF EXISTS (Select ID, Name, Address, City, State, Country from B
where ID = @ID and Name = @Name and Address = @Address and
City = @City and State = @State and Country = @country)
Insert into Match values (@ID, @Name, @Address, @City, @State, @Country)
Else
Insert into Unmatched values(@ID, @Name, @Address, @City, @State, @Country)
End
Fetch Next from compare into @ID, @Name, @Address, @City, @State, @Country
End
Close compare
deallocate compare
Upvotes: 0
Views: 49
Reputation: 19184
Why are you doing this row by row in a cursor? Why not just perform a bulk insert?
INSERT INTO Match (ID, Name, Address, City, State, Country)
SELECT ID, Name, Address, City, State, Country
FROM A
WHERE Exists (
SELECT 1 FROM B WHERE
A.ID = B.ID and A.Name = B.Name and A.Address = B.Address and
A.City = B.City and A.State = B.State and A.Country = B.country
)
INSERT INTO UnMatched (ID, Name, Address, City, State, Country)
SELECT ID, Name, Address, City, State, Country
FROM A
WHERE NOT EXISTS (
SELECT 1 FROM B WHERE
A.ID = B.ID and A.Name = B.Name and A.Address = B.Address and
A.City = B.City and A.State = B.State and A.Country = B.country
)
Your match logic will need to be refined, but losing the cursor will at least let the solution perform for the first iteration. Make sure that you index fields if possible.
Upvotes: 1