Reputation: 3783
I'm working on a batch import API that needs to have a 'preview' function that shows users how many records of each type will be created when a file is imported. Typically there won't be many records in the batch - 10,000 is a realistic upper limit - but the system is currently fairly slow and such an import could take ten minutes or more. Given that we want the batch import to be as close to the real import as possible (ie database constraints/triggers, event handlers firing etc), what is the best approach for implementing this preview function?
Would doing the batch import inside a transaction and then rolling it back be a terrible idea? I'm not exactly sure what the locking behaviour would be given that we are just inserting data...it seems like a simple solution but my spidey sense tells me it's a bad one.
Thanks!
Upvotes: 0
Views: 643
Reputation: 294297
Would doing the batch import inside a transaction and then rolling it back be a terrible idea?
Yes. A 10 minute long transaction would be a disaster. At the very least it pins the log, preventing truncation, causing log growth. But more likely will cause massive outage of the system as everything else blocks behind the locks acquired by this long lasting uncommitted transaction.
If all you need is to display the the user with the estimated count, then doing the real work just to be discarded is a horrible waste of resource. You'll cause 10 minutes outage just to give the estimated count, then rollback, and then cause again the same 10 minutes outage!
I would seriously reconsider the requirement to display this estimated count with all side effects. Basically, there is no way to get it aside from doing the real work, at which point you better do the work only once. Even things like working on a backup database may be unrealistic as most databases have cross-db dependencies (eg. triggers may cause inserts into a different DB and have real, not-so-dry, side effects).
A more realistic requirement would be to display the user a record count of the source file ("There are 9876 record in foo.txt"), which can easily be done by a parsing of the source file(s).
Upvotes: 1