Reputation: 101
I was inserting bulk data from excel to SQL using the bulk insert package which I had created. While inserting data sequentially, I missed a file which had to be inserted, so I stopped the package in middle. However, the package had already added some 2500 rows, I deleted those rows using "delete top(2500) from coulmn_name" and continued with inserting data. Now i fear if I have actually deleted last added records or some random data. Could anyone help me understanding this?
Upvotes: 0
Views: 34
Reputation: 280431
You will need some other way to identify the rows to delete, like having a BatchID or something. People generally use a staging table for this so that large bulk operations don't interfere with normal use and also so that when things go south you can just truncate and start over.
To show that TOP doesn't care about "last":
CREATE TABLE #foo(id int IDENTITY(1,1) PRIMARY KEY, name sysname);
INSERT #foo(name) VALUES('Bob');
INSERT #foo(name) VALUES('Frank');
INSERT #foo(name) VALUES('Aaron');
INSERT #foo(name) VALUES('William');
INSERT #foo(name) VALUES('Mary');
DELETE TOP (2) FROM #foo;
SELECT * FROM #foo;
DROP TABLE #foo;
Results:
id name
-- -------
3 Aaron
4 William
5 Mary
That deleted the first two that were inserted (and just happened to be assigned integer values sequentially). Now try this:
CREATE TABLE #foo(id int PRIMARY KEY, name sysname);
INSERT #foo(id,name) VALUES(3,'Bob');
INSERT #foo(id,name) VALUES(12,'Frank');
INSERT #foo(id,name) VALUES(84,'Aaron');
INSERT #foo(id,name) VALUES(16,'William');
INSERT #foo(id,name) VALUES(7,'Mary');
DELETE TOP (2) FROM #foo;
SELECT * FROM #foo;
DROP TABLE #foo;
Results:
id name
-- -------
12 Frank
16 William
84 Aaron
Different rows were deleted. Now change the CREATE
to:
CREATE TABLE #foo(id int, name sysname UNIQUE);
Same insert + delete yields:
id name
-- -------
84 Aaron
16 William
7 Mary
A table is not ordered. TOP without ORDER BY (in fact any query without ORDER BY) should be considered arbitrary (maybe even thinking about it as random is helpful).
Upvotes: 1
Reputation: 666
You have deleted random data because rows return in order of reading. For instance, if you run the same "select top 5" statement 5-10 times then no guarantees there you'll get the same dataset because some pages or string could be blocked.
Upvotes: 0