Reputation: 81
Assume a job instance is currently running and is in the middle of the read/process/write operation on the first 1000 records. During this period, another 1000 records are imported into the database. In this situation, the first job instance should complete (assume no failure) after processing the first 1000 records, and a second instance should then trigger and process the next 1000 records. Is this possible?
Or do I have to leave the responsibility for the remaining 1000 records to the steps (meaning another step instance will start and execute)?
Upvotes: 1
Views: 443
Reputation: 4444
Yes, it is possible. It depends on how you define your select statements.
For instance, if you use a JdbcCursorItemReader, the select is executed at the very beginning, so all rows that are present at that moment are selected and processed. Rows that are added while your batch is running are not part of this selection.
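As a minimal sketch of that behaviour, a cursor-based reader could be defined roughly like this (the `ImportedRecord` POJO, the `records` table and its `id`/`payload` columns are assumptions for illustration, not from your setup):

    import javax.sql.DataSource;

    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
    import org.springframework.jdbc.core.BeanPropertyRowMapper;

    public class CursorReaderConfig {

        // Hypothetical row type for the imported records.
        public static class ImportedRecord {
            private long id;
            private String payload;
            public long getId() { return id; }
            public void setId(long id) { this.id = id; }
            public String getPayload() { return payload; }
            public void setPayload(String payload) { this.payload = payload; }
        }

        // The SQL is executed once when the step opens the reader, so the cursor
        // only ever sees the rows that existed at that point in time.
        public JdbcCursorItemReader<ImportedRecord> cursorReader(DataSource dataSource) {
            return new JdbcCursorItemReaderBuilder<ImportedRecord>()
                    .name("recordCursorReader")
                    .dataSource(dataSource)
                    .sql("SELECT id, payload FROM records ORDER BY id")
                    .rowMapper(new BeanPropertyRowMapper<>(ImportedRecord.class))
                    .build();
        }
    }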
A JdbcPagingItemReader works differently, since it executes a query for every chunk that is processed. It therefore has the potential to select data that was inserted while the batch is running. However, that can be a problem for restartability and for ensuring that all elements are processed. So when using a paging reader, you have to make sure that the query selects the same data for every chunk (the paging reader keeps an internal state that ensures each chunk receives the next set of rows).

You could ensure this by either making part of the where clause depend on a timestamp of the inserted rows, or by adding a state column: in a first step, set the state of all entries available at that moment to something like "toProcess", and then have the reader's query select only those entries. Of course, you will also have to update the state once an entry has been processed. A sketch of that second approach is shown below.
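A rough sketch of the state-column approach, reusing the hypothetical `ImportedRecord` POJO and `records` table from the previous snippet (the `state` column and its "toProcess"/"processed" values are likewise assumptions you would adapt to your schema):

    import java.util.Map;

    import javax.sql.DataSource;

    import org.springframework.batch.item.database.JdbcBatchItemWriter;
    import org.springframework.batch.item.database.JdbcPagingItemReader;
    import org.springframework.batch.item.database.Order;
    import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
    import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
    import org.springframework.jdbc.core.BeanPropertyRowMapper;

    public class PagingReaderConfig {

        // Reads only rows that were marked 'toProcess' before the step started,
        // so rows inserted while the job is running are not picked up by later pages.
        public JdbcPagingItemReader<CursorReaderConfig.ImportedRecord> pagingReader(DataSource dataSource) {
            return new JdbcPagingItemReaderBuilder<CursorReaderConfig.ImportedRecord>()
                    .name("recordPagingReader")
                    .dataSource(dataSource)
                    .selectClause("SELECT id, payload")
                    .fromClause("FROM records")
                    .whereClause("WHERE state = 'toProcess'")
                    .sortKeys(Map.of("id", Order.ASCENDING))
                    .rowMapper(new BeanPropertyRowMapper<>(CursorReaderConfig.ImportedRecord.class))
                    .pageSize(100)
                    .build();
        }

        // Marks each written row so it is not selected again by a later run.
        public JdbcBatchItemWriter<CursorReaderConfig.ImportedRecord> stateUpdatingWriter(DataSource dataSource) {
            return new JdbcBatchItemWriterBuilder<CursorReaderConfig.ImportedRecord>()
                    .dataSource(dataSource)
                    .sql("UPDATE records SET state = 'processed' WHERE id = :id")
                    .beanMapped()
                    .build();
        }
    }

A preceding tasklet step would perform the initial `UPDATE records SET state = 'toProcess' WHERE state IS NULL` (or similar) to freeze the set of rows the paging reader will see for that run.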
Upvotes: 2