Reputation: 61
I am working on a problem in which I pull a list of file names from a folder and store it in a database table. This process runs every hour, so if a file name read from the folder is already in the table, I don't want a duplicate record: the old record should just be updated, and anything new should be inserted. I am using Spring Data JPA, and I know this can be done automatically with the saveAll method, but I also need another column, "Description", to be set. For a duplicate file it should say that the record was updated, and for a newly inserted file it should say that it is a new record.
I want to know the most efficient way of doing this without using any loop.
Upvotes: 1
Views: 258
Reputation: 9492
Basically, you have an async job, and this job exists in the context of one or more instances of the application. There are a couple of problems you need to look after:
The job that reads the files needs to run on only one instance of the application. For this purpose you should use ShedLock (the @SchedulerLock annotation); search for it.
After you read the file names, you need to verify them against the DB. A couple of variants exist for this procedure:
A) Test each file individually. This costs one SELECT query per file, which may be undesirable.
B) Select all existing files from your DB; your job is then to divide the incoming files into two groups: files that already exist and files that don't.
C) If the number of files is so large that you cannot effectively read them all at once, you can create a second table, "Incoming files", persist all incoming files there, and then perform a JOIN with "SAVED_FILES" to find the files that are already saved.
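Variant B can be sketched roughly as follows. This is a minimal, plain-Java illustration, not a definitive implementation: the class FileUpsertSketch, the FileRecord type, the merge method and the description texts are all hypothetical names made up for the example. The existingByName map stands in for whatever a single "select all existing files" query returns, and the returned list is what you would presumably hand to saveAll. Note that the grouping still iterates in memory, which is cheap; what it avoids is one query per file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

class FileUpsertSketch {

    // Hypothetical stand-in for the JPA entity behind the table.
    static final class FileRecord {
        final String fileName;
        String description;

        FileRecord(String fileName, String description) {
            this.fileName = fileName;
            this.description = description;
        }
    }

    // Splits the incoming names into "already stored" and "new" using one
    // in-memory lookup per name, and sets the description accordingly.
    // existingByName is assumed to have been loaded with a single query.
    static List<FileRecord> merge(Collection<String> incoming,
                                  Map<String, FileRecord> existingByName) {
        List<FileRecord> toSave = new ArrayList<>();
        // LinkedHashSet drops duplicate names within the same folder scan
        // while keeping the order in which they were read.
        for (String name : new LinkedHashSet<>(incoming)) {
            FileRecord record = existingByName.get(name);
            if (record != null) {
                record.description = "record updated"; // duplicate: update in place
            } else {
                record = new FileRecord(name, "new record"); // unseen: insert
            }
            toSave.add(record);
        }
        return toSave;
    }

    public static void main(String[] args) {
        Map<String, FileRecord> existing = new HashMap<>();
        existing.put("a.txt", new FileRecord("a.txt", "new record"));

        List<FileRecord> toSave = merge(Arrays.asList("a.txt", "b.txt"), existing);
        for (FileRecord r : toSave) {
            System.out.println(r.fileName + " -> " + r.description);
        }
    }
}
```

Because existing records are modified rather than recreated, passing them back to the persistence layer should result in updates for them and inserts for the rest, assuming the real entity carries its identifier.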
Upvotes: 1