espresso_coffee
espresso_coffee

Reputation: 6110

Update records SQL?

First when I started this project seemed very simple. Two tables, field tbl1_USERMASTERID in Table 1 should be update from field tbl2_USERMASTERID Table 2. After I looked deeply in Table 2, there is no unique ID that I can use as a key to join these two tables. Only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME AND DOB. So I have to find records in Table 1 where:

tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB

and then update USERMASTERID field. I was afraid that this can cause some duplicates and some users will end up with USERMASTERID that does not belong to them. So if I find more than one record based on first,last name and dob those records would not be updated. I would like just to skip and leave them blank. That way I wouldn't populate invalid USERMASTERID. I'm not sure what is the best way to approach this problem, should I use SQL or ColdFusion (my server side language)? Also how to detect more than one matching record?

Here is what I have so far:

UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
  ON tbl1.dob = tbl2.dob
  AND tbl1.fname = tbl2.fname
  AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''

Here is query where I tried to detect duplicates:

SELECT DISTINCT
    tbl1.FName,
    tbl1.LName,
    tbl1.dob,
    COUNT(*) AS count
FROM Table1 AS tbl1
    LEFT OUTER JOIN Table2 AS tbl2
        ON tbl1.dob = tbl2.dob
        AND tbl1.FName = tbl2.first
        AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
    AND LTRIM(RTRIM(tbl1.first)) <> ''
    AND LTRIM(RTRIM(tbl1.last)) <> ''
    AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob

Some data after I tested query above:

First   Last       DOB    Count  
John    Cook    2008-07-11  2
Kate    Witt    2013-06-05  1
Deb     Ruis    2016-01-22  1
Mike    Bennet  2007-01-15  1
Kristy  Cruz    1997-10-20  1
Colin   Jones   2011-10-13  1
Kevin   Smith   2010-02-24  1
Corey   Bruce   2008-04-11  1
Shawn   Maiers  2016-08-28  1
Alenn  Fitchner 1998-05-17  1

If anyone have idea how I can prevent/skip updating duplicate records or how to improve this query please let me know. Thank you.

Upvotes: 0

Views: 89

Answers (3)

SqlZim
SqlZim

Reputation: 38023

You could check for and avoid duplicate matches using with common_table_expression (Transact-SQL) along with row_number()., like so:

with cte as (
select
    t.fname
  , t.lname
  , t.dob
  , t.usermasterid
  , NewUserMasterId = t2.usermasterid
  , rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
  from table1 as t
    inner join table2 as t2 on t.dob = t2.dob
      and t.fname = t2.fname
      and t.lname = t2.lname
      and ltrim(rtrim(t.usermasterid)) = ''
  )

--/* confirm these are the rows you want updated 
select * 
  from cte as t
  where t.NewUserMasterId != ''
    and not exists (
      select 1 
        from cte as i 
        where t.dob = i.dob
          and t.fname = i.fname
          and t.lname = i.lname
          and i.rn>1
      );
--*/

/* update those where only 1 usermasterid matches this record
update t
  set t.usermasterid = t.NewUserMasterId
  from cte as t
  where t.NewUserMasterId != ''
      and not exists (
      select 1 
        from cte as i 
        where t.dob = i.dob
          and t.fname = i.fname
          and t.lname = i.lname
          and i.rn>1
    );
--*/

I use the cte to extract out the sub query for readability. Per the documentation, a common table expression (cte):

Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.

Using row_number() to assign a number for each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having those numbered allows us to check for the existence of duplicates with the not exists() clause with ... and i.rn>1

Upvotes: 1

Dinesh Singh
Dinesh Singh

Reputation: 210

Please try below SQL:

UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
  ON tbl1.dob = tbl2.dob
  AND tbl1.fname = tbl2.fname
  AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
  ON tbl3.dob = tbl2.dob
  AND tbl3.fname = tbl2.fname
  AND tbl3.lname = tbl2.lname
  AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Upvotes: 0

APH
APH

Reputation: 4154

You could use a CTE to filter out the duplicates from Table1 before joining:

; with CTE as (select *
               , count(ID) over (partition by LastName, FirstName, DoB) as IDs 
              from Table1)

update a
set  a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1

This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.

Upvotes: 0

Related Questions