Reputation: 36070
I am creating a time machine in C#. By a time machine I mean a backup of my files that lets me access a specific file as it was at a specific time. The way I am doing it is by scanning all the files inside a directory and storing each file's information in a table named table1. Say that the first time I scan my computer I only have 3 files; my table will then look something like:
ID  FullName  DateModified  DateInsertedToDatabase
1   C:\A      456588731     0
2   C:\B      955588762     0
3   C:\C      854587783     0
Let's say that the next time I perform a backup I have the same 3 files, but I have created a new file X and modified file C. As a result my table should now look like:
ID  FullName  DateModified  DateInsertedToDatabase
1   C:\A      456588731     0
2   C:\B      955588762     0
3   C:\C      854587783     0
4   C:\A      456588731     1
5   C:\B      955588762     1
6   C:\C      111122212     1
7   C:\X      123212321     1
Now I would like to copy file C and file X, because those are the files that have been changed or created. How could I build a query that returns file X and file C? In other words, I want all the rows that have DateInsertedToDatabase = 1 and that don't match (on FullName and DateModified) any row where DateInsertedToDatabase is less than 1.
In case I am not being clear, here is a continuation of my example: say I delete files B and C, modify file X, and create a new file Z. My table should look like:
ID  FullName  DateModified  DateInsertedToDatabase
1   C:\A      456588731     0
2   C:\B      955588762     0
3   C:\C      854587783     0
4   C:\A      456588731     1
5   C:\B      955588762     1
6   C:\C      111122212     1
7   C:\X      123212321     1
8   C:\A      456588731     2
9   C:\X      898989898     2
10  C:\Z      789564545     2
Here I would like to get files X and Z, because file X was modified and file Z was created. I do not want to get file A, because that file already exists with the same DateModified. How could I build that query?
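For context, the scan step described above could be sketched roughly like this. This is only an illustration in Python with the built-in sqlite3 module (my real code is C#); the function name `scan` and the use of nanosecond modification times for DateModified are assumptions made for the example.

```python
import os
import sqlite3

def scan(conn, root):
    """Record every file under root as one new snapshot and return its number."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS table1 ("
        " ID INTEGER PRIMARY KEY,"
        " FullName TEXT NOT NULL,"
        " DateModified INTEGER NOT NULL,"
        " DateInsertedToDatabase INTEGER NOT NULL)"
    )
    # Next snapshot number: 0 for the first scan, then 1, 2, ...
    (snapshot,) = conn.execute(
        "SELECT COALESCE(MAX(DateInsertedToDatabase) + 1, 0) FROM table1"
    ).fetchone()
    rows = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            # Assumption: the modification time in nanoseconds stands in for DateModified.
            rows.append((path, os.stat(path).st_mtime_ns, snapshot))
    conn.executemany(
        "INSERT INTO table1 (FullName, DateModified, DateInsertedToDatabase)"
        " VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return snapshot
```

Each call appends one full listing of the directory, so unchanged files show up again with the same DateModified, exactly as in the tables above.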
Upvotes: 0
Views: 1221
Reputation: 36070
I modified it because I am working with a lot of files: the solution works great, but not for queries dealing with a lot of records. Here is what I worked out.
Let's assume I have these records so far:
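SELECT * FROM table1
WHERE DateInserted = 4
  AND Path NOT IN (
    SELECT Path FROM table1 t1
    WHERE DateInserted = 4
      AND Path IN (SELECT Path FROM table1 WHERE DateInserted < 4)
      -- Note: Path and DateModified are checked independently here, so a changed
      -- file whose new DateModified equals any earlier file's DateModified is
      -- filtered out as well.
      AND DateModified IN (SELECT DateModified FROM table1 WHERE DateInserted < 4)
  )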
and that returns:
This query runs much faster. I will obviously have to replace the 4 with a variable in my code; this is just to illustrate the changes I have made.
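As a rough sketch of replacing the 4 with a variable, here is the same query with a bound parameter. It is written in Python with the built-in sqlite3 module purely for illustration (my actual code is C#); the helper name `changed_paths` is made up, the rows are the sample data from the question, and I select only Path instead of * to keep the output short.

```python
import sqlite3

# The query as posted above, with the literal 4 replaced by the named parameter :snap.
QUERY = """
SELECT Path FROM table1
WHERE DateInserted = :snap
  AND Path NOT IN (
    SELECT Path FROM table1 t1
    WHERE DateInserted = :snap
      AND Path IN (SELECT Path FROM table1 WHERE DateInserted < :snap)
      AND DateModified IN (SELECT DateModified FROM table1 WHERE DateInserted < :snap)
  )
ORDER BY Path
"""

def changed_paths(conn, snapshot):
    """Return the paths the query reports as new or changed in the given snapshot."""
    return [row[0] for row in conn.execute(QUERY, {"snap": snapshot})]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Path TEXT, DateModified INTEGER, DateInserted INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (r"C:\A", 456588731, 0), (r"C:\B", 955588762, 0), (r"C:\C", 854587783, 0),
        (r"C:\A", 456588731, 1), (r"C:\B", 955588762, 1), (r"C:\C", 111122212, 1),
        (r"C:\X", 123212321, 1),
        (r"C:\A", 456588731, 2), (r"C:\X", 898989898, 2), (r"C:\Z", 789564545, 2),
    ],
)

print(changed_paths(conn, 1))
print(changed_paths(conn, 2))
```

On the sample rows this should list C and X for snapshot 1, then X and Z for snapshot 2. Binding the value instead of pasting it into the SQL string also means the same statement text can be reused for every snapshot.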
Upvotes: 0
Reputation: 58741
Phil Sandler's answer works. This does, too:
SELECT FullName
FROM table1
INNER JOIN (SELECT FullName, DateModified
            FROM table1
            WHERE DateInsertedToDatabase = (SELECT MAX(DateInsertedToDatabase) FROM table1)) d
USING (FullName, DateModified)
GROUP BY FullName
HAVING COUNT(1) = 1
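To see why COUNT(1) = 1 picks out the new and changed files, here is a small sketch that runs the same join without the HAVING clause so the per-file counts are visible. It uses Python's built-in sqlite3 module and the sample rows from the question; the in-memory database and the variable names are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ID INTEGER PRIMARY KEY, FullName TEXT,"
    " DateModified INTEGER, DateInsertedToDatabase INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 (FullName, DateModified, DateInsertedToDatabase) VALUES (?, ?, ?)",
    [
        (r"C:\A", 456588731, 0), (r"C:\B", 955588762, 0), (r"C:\C", 854587783, 0),
        (r"C:\A", 456588731, 1), (r"C:\B", 955588762, 1), (r"C:\C", 111122212, 1),
        (r"C:\X", 123212321, 1),
        (r"C:\A", 456588731, 2), (r"C:\X", 898989898, 2), (r"C:\Z", 789564545, 2),
    ],
)

# Same join as above, minus HAVING: for each file in the latest snapshot, count
# how many rows (in any snapshot) share its FullName and DateModified.
counts = dict(conn.execute("""
    SELECT FullName, COUNT(1)
    FROM table1
    INNER JOIN (SELECT FullName, DateModified
                FROM table1
                WHERE DateInsertedToDatabase =
                      (SELECT MAX(DateInsertedToDatabase) FROM table1)) d
    USING (FullName, DateModified)
    GROUP BY FullName
"""))

# A count of 1 means the only match is the latest row itself: new or changed.
changed = sorted(name for name, n in counts.items() if n == 1)
print(counts)
print(changed)
```

File A matches its rows from all three snapshots, while X and Z match only their own latest row, which is what the HAVING clause keeps.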
Upvotes: 0
Reputation: 3974
I don't know SQLite, but I hope this will work anyway. It doesn't use anything fancy.
SELECT t1.*
FROM Table1 t1
LEFT JOIN Table1 t2
  ON t1.FullName = t2.FullName
 AND t1.DateInsertedToDatabase = t2.DateInsertedToDatabase + 1
WHERE t1.DateInsertedToDatabase = (SELECT MAX(DateInsertedToDatabase) FROM Table1)
  AND (t1.DateModified <> t2.DateModified OR t2.FullName IS NULL)
Joining on DateInsertedToDatabase + 1 will join with the previous record. Then you filter for the highest DateInsertedToDatabase and include either records that don't have a match (they are new) or where the modified dates don't match.
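Here is a small sketch of that pairing, run through Python's built-in sqlite3 module on the sample rows from the question (I have only tried to reason it through, so treat it as an illustration). It selects the joined pairs first and applies the final filter in Python so you can see what each latest row was matched with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, FullName TEXT,"
    " DateModified INTEGER, DateInsertedToDatabase INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 (FullName, DateModified, DateInsertedToDatabase) VALUES (?, ?, ?)",
    [
        (r"C:\A", 456588731, 0), (r"C:\B", 955588762, 0), (r"C:\C", 854587783, 0),
        (r"C:\A", 456588731, 1), (r"C:\B", 955588762, 1), (r"C:\C", 111122212, 1),
        (r"C:\X", 123212321, 1),
        (r"C:\A", 456588731, 2), (r"C:\X", 898989898, 2), (r"C:\Z", 789564545, 2),
    ],
)

# Each latest row paired with the same file's row from the previous snapshot
# (the previous DateModified is NULL/None when the file was not there).
pairs = conn.execute("""
    SELECT t1.FullName, t1.DateModified, t2.DateModified
    FROM Table1 t1
    LEFT JOIN Table1 t2
      ON t1.FullName = t2.FullName
     AND t1.DateInsertedToDatabase = t2.DateInsertedToDatabase + 1
    WHERE t1.DateInsertedToDatabase = (SELECT MAX(DateInsertedToDatabase) FROM Table1)
    ORDER BY t1.FullName
""").fetchall()

# The same condition as the last line of the query: no previous row, or a different date.
changed = [name for name, new, old in pairs if old is None or new != old]
print(pairs)
print(changed)
```

Note that this compares against the immediately preceding snapshot only, so a file that was missing from the previous snapshot counts as new even if an older snapshot has it with the same DateModified.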
Upvotes: 0
Reputation: 28046
Hmm, I think I understand. You want to get all files that match on the MAX(DateInsertedToDatabase) but don't have a previous row that also matches their DateModified?
You want to do what I call a "reverse inner join." Basically a left join that filters out anything that would have successfully matched in an inner join. There are other ways it could be done as well (e.g. using subqueries).
This is in T-SQL:
CREATE TABLE #mytemp
(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [FullName] [nvarchar](50) NOT NULL,
    DateModified [nvarchar](9) NOT NULL,
    DateInsertedToDatabase [int] NOT NULL
)
INSERT INTO #mytemp VALUES ('C:\A', '456588731', '0')
INSERT INTO #mytemp VALUES ('C:\B', '955588762', '0')
INSERT INTO #mytemp VALUES ('C:\C', '854587783', '0')
INSERT INTO #mytemp VALUES ('C:\A', '456588731', '1')
INSERT INTO #mytemp VALUES ('C:\B', '955588762', '1')
INSERT INTO #mytemp VALUES ('C:\C', '111122212', '1')
INSERT INTO #mytemp VALUES ('C:\X', '123212321', '1')
INSERT INTO #mytemp VALUES ('C:\A', '456588731', '2')
INSERT INTO #mytemp VALUES ('C:\X', '898989898', '2')
INSERT INTO #mytemp VALUES ('C:\Z', '789564545', '2')
SELECT
    temp1.*
FROM
    #mytemp temp1
    LEFT JOIN #mytemp temp2 ON
        temp1.ID != temp2.ID --don't match on the same two rows
        AND temp1.FullName = temp2.FullName --match based on full name
        AND temp1.DateModified = temp2.DateModified --and date modified
WHERE
    temp1.DateInsertedToDatabase = (SELECT MAX(DateInsertedToDatabase) FROM #mytemp)
    AND temp2.ID IS NULL --filter out rows that would have matched on an INNER JOIN
DROP TABLE #mytemp
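Since I mentioned subqueries as another option, here is a rough sketch of the same idea written with NOT EXISTS and run against SQLite from Python's built-in sqlite3 module. I have not benchmarked it; the index and its name are my own assumption (it may help with a large table), and the rows are the sample data from the question.

```python
import sqlite3

# Keep a latest-snapshot row only if no other row has the same name and date.
QUERY = """
SELECT cur.FullName
FROM table1 cur
WHERE cur.DateInsertedToDatabase = (SELECT MAX(DateInsertedToDatabase) FROM table1)
  AND NOT EXISTS (
        SELECT 1
        FROM table1 prev
        WHERE prev.ID <> cur.ID                      -- don't match the row itself
          AND prev.FullName = cur.FullName           -- same file
          AND prev.DateModified = cur.DateModified)  -- same modified date
ORDER BY cur.FullName
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ID INTEGER PRIMARY KEY, FullName TEXT NOT NULL,"
    " DateModified INTEGER NOT NULL, DateInsertedToDatabase INTEGER NOT NULL)"
)
# Hypothetical index (not part of the original schema) for the correlated lookup.
conn.execute("CREATE INDEX ix_name_modified ON table1 (FullName, DateModified)")
conn.executemany(
    "INSERT INTO table1 (FullName, DateModified, DateInsertedToDatabase) VALUES (?, ?, ?)",
    [
        (r"C:\A", 456588731, 0), (r"C:\B", 955588762, 0), (r"C:\C", 854587783, 0),
        (r"C:\A", 456588731, 1), (r"C:\B", 955588762, 1), (r"C:\C", 111122212, 1),
        (r"C:\X", 123212321, 1),
        (r"C:\A", 456588731, 2), (r"C:\X", 898989898, 2), (r"C:\Z", 789564545, 2),
    ],
)

changed = [row[0] for row in conn.execute(QUERY)]
print(changed)
```

On the sample rows this should return X and Z, the same rows the left join keeps.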
Upvotes: 2