Reputation: 155
I have a spring batch job that fetches the data from a Postgres database and after processing writes it on excel sheets. But I want to implement a keyset paging in a reader for a spring batch job. Currently, I am using JpaPagingItemReader which uses limit offset paging but because I am dealing with a large quantity of data, the query which the JpaPagingItemReader uses to fetch the data becomes inefficient as the offset increases. The keyset paging can be used to avoid the limitations of limit offset pagination but I don't know how to implement a reader with keyset pagination. How can I implement it?
Edit: Keyset pagination does not include the offsetting/skipping of records, instead, we will be ordering and tracking a numerical unique identifier in the result and requesting for entries greater than the last unique entry. In this method, the SQL would be something like below ( assuming customer_id is the unique auto-generated identifier for the records )
select * from CUSTOMERS where status = 'ACTIVE' and customer_id > 0 order by customer_id asc limit 100;
-- Second iteration ( size = 100, lastCustomerId = 100 )
select * from CUSTOMERS where status = 'ACTIVE' and customer_id > 100 order by customer_id asc limit 100;
-- Second iteration ( size = 100, lastCustomerId = 200 )
select * from CUSTOMERS where status = 'ACTIVE' and customer_id > 200 order by customer_id asc limit 100;
The points that need to be kept in mind while implementing keyset pagination are:
public class CustomerProcessorService {
public void processCustomers() {
List<Customer> customers = new ArrayList();
long lastCusId = 0;
int size = 100;
while ( true ) {
// Create a PageRequest object that will be passed as Pageable interface to repo
// Note that here we are setting 0 as the offset
PageRequest pageRequest = new PageRequest(0,size);
// Get the lastCusId
lastCusId = getLastCusId(customers);
// Get the data from the database
customers = customerRepository.findByStatusAndCustomerIdGreaterThanOrderByCustomerIdAsc('ACTIVE',lastCusId,pageRequest);
// Check if data is there
if ( customers == null || customers.isEmpty()) {
break;
}
// Do the processing
}
}
public Long getLastCusId(List<Customer> customers) {
// If passed entry is null or empty, return 0 ( This handles the first iteration case )
if ( customers == null || customers.isEmpty())
return 0l;
// Do the logic to sort the customers list by customer_id of each
// Customer object
// Return the last entry
return customers.get(customers.size() -1).getCustomerId();
}
Upvotes: 1
Views: 1393
Reputation: 31600
You should be able to implement your paging logic by extending the AbstractPaginatedDataItemReader
class. This base class handles most of the paging boilerplate and allows you to specify the paging logic in doPageRead
. Here is a quick example, I will let you adapt it accordingly:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.springframework.batch.item.data.AbstractPaginatedDataItemReader;
import org.springframework.data.domain.PageRequest;
public class KeySetPagingItemReader extends AbstractPaginatedDataItemReader<Customer> {
long lastCusId = 0;
int size = 100;
private CustomerRepository customerRepository;
List<Customer> customers = new ArrayList();
public KeySetPagingItemReader(CustomerRepository customerRepository) {
this.customerRepository = customerRepository;
}
@Override
protected Iterator<Customer> doPageRead() {
PageRequest pageRequest = PageRequest.of(0,size);
// Get the lastCusId
lastCusId = getLastCusId(customers);
// Get the data from the database
customers = customerRepository.findByStatusAndCustomerIdGreaterThanOrderByCustomerIdAsc('ACTIVE',lastCusId,pageRequest);
return customers.iterator();
}
public Long getLastCusId(List<Customer> customers) {
// If passed entry is null or empty, return 0 ( This handles the first iteration case )
if ( customers == null || customers.isEmpty())
return 0l;
// Do the logic to sort the customers list by customer_id of each
// Customer object
// Return the last entry
return customers.get(customers.size() -1).getCustomerId();
}
}
Upvotes: 4