Kowser
Kowser

Reputation: 101

Delete data from the inserted row

I was inserting bulk data from excel to SQL using the bulk insert package which I had created. While inserting data sequentially, I missed a file which had to be inserted, so I stopped the package in middle. However, the package had already added some 2500 rows, I deleted those rows using "delete top(2500) from coulmn_name" and continued with inserting data. Now i fear if I have actually deleted last added records or some random data. Could anyone help me understanding this?

Upvotes: 0

Views: 34

Answers (2)

Aaron Bertrand
Aaron Bertrand

Reputation: 280431

You will need some other way to identify the rows to delete, like having a BatchID or something. People generally use a staging table for this so that large bulk operations don't interfere with normal use and also so that when things go south you can just truncate and start over.

To show that TOP doesn't care about "last":

CREATE TABLE #foo(id int IDENTITY(1,1) PRIMARY KEY, name sysname);
INSERT #foo(name) VALUES('Bob');
INSERT #foo(name) VALUES('Frank');
INSERT #foo(name) VALUES('Aaron');
INSERT #foo(name) VALUES('William');
INSERT #foo(name) VALUES('Mary');

DELETE TOP (2) FROM #foo;

SELECT * FROM #foo;

DROP TABLE #foo;

Results:

id  name
--  -------
3   Aaron
4   William
5   Mary

That deleted the first two that were inserted (and just happened to be assigned integer values sequentially). Now try this:

CREATE TABLE #foo(id int PRIMARY KEY, name sysname);
INSERT #foo(id,name) VALUES(3,'Bob');
INSERT #foo(id,name) VALUES(12,'Frank');
INSERT #foo(id,name) VALUES(84,'Aaron');
INSERT #foo(id,name) VALUES(16,'William');
INSERT #foo(id,name) VALUES(7,'Mary');

DELETE TOP (2) FROM #foo;

SELECT * FROM #foo;

DROP TABLE #foo;

Results:

id  name
--  -------
12  Frank
16  William
84  Aaron

Different rows were deleted. Now change the CREATE to:

CREATE TABLE #foo(id int, name sysname UNIQUE);

Same insert + delete yields:

id  name
--  -------
84  Aaron
16  William
7   Mary

A table is not ordered. TOP without ORDER BY (in fact any query without ORDER BY) should be considered arbitrary (maybe even thinking about it as random is helpful).

Upvotes: 1

Meow Meow
Meow Meow

Reputation: 666

You have deleted random data because rows return in order of reading. For instance, if you run the same "select top 5" statement 5-10 times then no guarantees there you'll get the same dataset because some pages or string could be blocked.

Upvotes: 0

Related Questions