eripey
eripey

Reputation: 365

sql removing multiple records with mutltiple date values

I have the following sample data that I need to clean the records by specific date. For this example, lets use anything older than march 2012 will need to be removed along with other related records.

Author Table

    Author       Book     date
-------------------------------------
    Smith        Bk1      01/01/2012
    Smith        Bk2      02/01/2012
    Smith        Bk3      02/25/2012
    Johnson      Bk1      06/01/2011
    Johnson      Bk2      09/01/2011
    Johnson      Bk3      03/01/2012
    Johnson      Bk4      01/01/2013
    Jones        BK1      01/01/2001
    Jones        BK2      01/01/2002
    Jones        BK3      01/01/2009
    Jones        BK4      08/01/2013

There is also a Book table

    Author       Book     info
-------------------------------------
    Smith        Bk1      asdfadf
    Smith        Bk2      asdfasdf
    Smith        Bk3      asdfadf
    Johnson      Bk1      asdfadf
    Johnson      Bk2      adsfasdf
    Johnson      Bk3      asdasdf
    Johnson      Bk4      asdfasdf
    Jones        BK1      asdfasdf
    Jones        BK2      adsfasdf
    Jones        BK3      adsfasdf
    Jones        BK4      adsfasdf

Above is the data set. So if the Author has a book in the table that is older than 03/01/2013, then All of the records need to be removed. If they have at least one Book newer than the date specified then All records are kept. In this scenario, the Jones records are the only ones to be kept in the table data set. Once we figure which Author has been filtered then we would remove the records from the Book table, and then the Author table.

Thanks for any help, I appreciate it.

Upvotes: 2

Views: 1371

Answers (2)

Hannah Vernon
Hannah Vernon

Reputation: 3472

This appears to do what you want (I've added code to provide a proof-of-concept using the TempDB):

USE tempdb;

/* Create the same tables, and insert test data */
CREATE TABLE Authors
(
    Author nvarchar(255)
    , Book nvarchar(255)
    , BookDate datetime
);

INSERT INTO Authors VALUES ('joe','book1',GetDate()-100); --100 days old
INSERT INTO Authors VALUES ('joe','book2',GetDate()-200); --200 days old
INSERT INTO Authors VALUES ('joe','book3',GetDate()-300); --300 days old
INSERT INTO Authors VALUES ('sam','book4',GetDate()-400); --etc
INSERT INTO Authors VALUES ('sam','book5',GetDate()-500);
INSERT INTO Authors VALUES ('sam','book6',GetDate()-600);

CREATE TABLE Books
(
    Author nvarchar(255)
    , Book nvarchar(255)
    , Info nvarchar(255)
);

INSERT INTO Books VALUES ('joe','book1','asdf');
INSERT INTO Books VALUES ('joe','book2','asdfg');
INSERT INTO Books VALUES ('joe','book3','asdfh');
INSERT INTO Books VALUES ('sam','book4','asdf');
INSERT INTO Books VALUES ('sam','book5','bsdf');
INSERT INTO Books VALUES ('sam','book6','csdf');

/* This statement deletes rows from the dependent table, Books */
DELETE  
FROM Books 
FROM Books LEFT JOIN 
(       /* This selects all Authors with books in the past 300 days */
    SELECT Author 
    FROM Authors
    WHERE BookDate >= GetDate()-300 
) t2 ON Books.Author = t2.Author
WHERE t2.Author IS NULL; /* This ensures rows from Books only get deleted
                                when for Authors that DON'T have books in the
                                past 300 days */

/* This statement deletes rows from the main table, Authors */
DELETE  
FROM Authors 
FROM Authors LEFT JOIN 
(       /* This selects all Authors with books in the past 300 days */
    SELECT Author 
    FROM Authors
    WHERE BookDate >= GetDate()-300 
) t2 ON Authors.Author = t2.Author
WHERE t2.Author IS NULL; /* This ensures rows from Books only get deleted
                                when for Authors that DON'T have books in the
                                past 300 days */

SELECT * FROM Books;
SELECT * FROM Authors;

Only Sam's rows will be deleted since he is the only author with no book newer than 300 days old.

Upvotes: 2

Guttsy
Guttsy

Reputation: 2119

Delete from table 1 first:

DELETE FROM Table1
WHERE Author NOT IN (SELECT DISTINCT Author 
   FROM Table1 WHERE Date >= '2013-03-01')

Then from table 2:

DELETE FROM Table2 WHERE Author NOT IN (SELECT Author FROM Table1)

Not the most beautiful SQL I have ever written, but it should suffice. Something like this wouldn't work if you had normalized tables with foreign keys, but I don't think that is the case right now.

If you insist on running it in the opposite order:

DELETE FROM Table2 WHERE Author NOT IN (
    SELECT DISTINCT Author 
    FROM Table1 WHERE Date >= '2013-03-01')

DELETE FROM Table1
    WHERE Author NOT IN (SELECT DISTINCT Author FROM Table2)

Upvotes: 1

Related Questions