ashm

Reputation: 73

paginated queries / iterator recipe

I see this pattern a lot.

On server:

// Get a bounded number of results, along with a resume token to use 
// for the next call. Successive calls yield a "weakly consistent" view of 
// the underlying set that may or may not reflect concurrent updates.
public<T> String getObjects(
        int maxObjects, String resumeToken, List<T> objectsToReturn);

On client:

// An iterator wrapping repeated calls to getObjects(bufferSize, ...)
public<T> Iterator<T> getIterator(int bufferSize);

Most places roll their own versions of these two methods, and the implementations are surprisingly difficult to get right: edge-case bugs are common.

Is there a canonical recipe or library for these queries?

(You can make some simplifying assumptions about the server-side storage, e.g. that T has a natural ordering.)
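
To make the server-side contract concrete, here is a minimal sketch under that assumption. It is not a canonical recipe: PagedStore is a hypothetical name, T is fixed to String so that the last returned element can double as the resume token, and an in-memory ConcurrentSkipListSet stands in for the real storage.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical in-memory store, used only to illustrate the getObjects contract.
class PagedStore {
    // Sorted by natural ordering; its iterators are weakly consistent.
    private final NavigableSet<String> items = new ConcurrentSkipListSet<>();

    void add(String item) {
        items.add(item);
    }

    // Appends up to maxObjects items to objectsToReturn, in natural order,
    // starting strictly after resumeToken (null means "from the beginning").
    // Returns the token for the next call, or null once the set is exhausted.
    String getObjects(int maxObjects, String resumeToken, List<String> objectsToReturn) {
        if (maxObjects <= 0) {
            throw new IllegalArgumentException("maxObjects must be positive");
        }
        // tailSet does not require the token element to still be present,
        // so a concurrently removed element does not break resumption.
        NavigableSet<String> remaining =
                resumeToken == null ? items : items.tailSet(resumeToken, false);
        String last = null;
        int count = 0;
        for (String item : remaining) {
            if (count == maxObjects) {
                return last; // page is full and at least one more item exists
            }
            objectsToReturn.add(item);
            last = item;
            count++;
        }
        return null; // reached the end of the set
    }
}
```

Because the token is a position in the ordering rather than an offset, inserts and deletes between calls do not cause rows to be skipped or repeated among the elements that stay in place.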

Upvotes: 6

Views: 3092

Answers (2)

Israel Solomonovich

Reputation: 21

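Here is something that works for me. It also uses AbstractIterator from the Google Guava library, but takes advantage of Java 8 streams to simplify the implementation. It returns an Iterator over elements of type T. Here executeQuery, getResumeToken and PAGE_SIZE stand for your own query method, token extractor and page size.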

Iterator<List<T>> pagingIterator = new AbstractIterator<List<T>>() {
    private String resumeToken;
    private boolean endOfData;

    @Override
    protected List<T> computeNext() {
        if (endOfData) {
            return endOfData();
        }

        List<T> rows = executeQuery(resumeToken, PAGE_SIZE);

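        if (rows.isEmpty()) {
            // no rows left (also reached when the total is an exact multiple of PAGE_SIZE)
            return endOfData();
        } else if (rows.size() < PAGE_SIZE) {
            // short page: this is the last one, so stop on the next call
            endOfData = true;
        } else {
            // full page: resume after its last row
            resumeToken = getResumeToken(rows.get(PAGE_SIZE - 1));
        }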

        return rows;
    }
};

// flatten Iterator of lists to a stream of single elements
Stream<T> stream = StreamSupport.stream(Spliterators.spliteratorUnknownSize(pagingIterator, 0), false)
    .flatMap(List::stream);

// convert stream to Iterator<T>
return stream.iterator();

It is also possible to return an Iterable by using a method reference. Because a stream can be consumed only once, the resulting Iterable can be iterated only once:

// convert stream to Iterable<T>
return stream::iterator;
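
If you would rather not depend on Guava, the same idea can be sketched with the standard library alone. This is only an illustration, not a drop-in replacement: Page and PagingIterator are hypothetical names, and it assumes the server hands back the next resume token with each page (null meaning there are no more pages) instead of inferring the end from the page size.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical page holder: the items plus the token for the next call.
final class Page<T> {
    final List<T> items;
    final String nextToken; // null means there are no further pages

    Page(List<T> items, String nextToken) {
        this.items = items;
        this.nextToken = nextToken;
    }
}

// Iterates over single elements, fetching a new page only when the
// current one has been drained.
final class PagingIterator<T> implements Iterator<T> {
    private final Function<String, Page<T>> fetchPage; // token -> page; null token = first page
    private Iterator<T> current = Collections.emptyIterator();
    private String nextToken;
    private boolean exhausted;

    PagingIterator(Function<String, Page<T>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    @Override
    public boolean hasNext() {
        // Loop instead of fetching once: a page may be empty and still
        // carry a token pointing at further data.
        while (!current.hasNext() && !exhausted) {
            Page<T> page = fetchPage.apply(nextToken);
            current = page.items.iterator();
            nextToken = page.nextToken;
            exhausted = (nextToken == null);
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

A caller would pass a function such as token -> client.fetch(token, bufferSize), where client.fetch is whatever call returns one page from your server.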

Upvotes: 2

Abhinav Sarkar

Reputation: 23782

Here is one that uses AbstractIterator from the Google Guava library and spring-jdbc to query the database:

public Iterable<T> queryInBatches(
        final String query,
        final Map<String, Integer> paramMap,
        final int pageSize, final Class<T> elementType) {
    return new Iterable<T>() {
        @Override
        public Iterator<T> iterator() {
            final Iterator<List<T>> resultIter = 
                    queryResultIterator(query, paramMap, pageSize, elementType);

            return new AbstractIterator<T>() {
                private Iterator<T> rowSet;

                @Override
                protected T computeNext() {
                    if (rowSet == null) {
                        if (resultIter.hasNext()) {
                            rowSet = resultIter.next().iterator();
                        } else {
                            return endOfData();
                        }
                    }

                    if (rowSet.hasNext()) {
                        return rowSet.next();
                    } else {
                        rowSet = null;
                        return computeNext();
                    }
                }};
        }};
}


private AbstractIterator<List<T>> queryResultIterator(
        final String query, final Map<String, Integer> paramMap, 
        final int pageSize, final Class<T> elementType) {
    return new AbstractIterator<List<T>>() {
        private int page = 0;

        @Override
        protected List<T> computeNext() {
            String sql = String.format(
                    "%s limit %s offset %s", query, pageSize, page++ * pageSize);
            List<T> results = jdbc().queryForList(sql, paramMap, elementType);
            if (!results.isEmpty()) {
                return results;
            } else {
                return endOfData();
            }
        }};
}

AbstractIterator hides most of the complications of writing your own Iterator implementation. You only need to implement the computeNext method, which either returns the next value or calls endOfData to indicate that there are no more values.
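
To give a feel for what is being hidden, here is a simplified illustration of the compute-next pattern written against the plain Iterator interface. It is not Guava's actual source: LookaheadIterator and CountingIterator are hypothetical names, and details such as failure handling inside computeNext are left out.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified illustration of the compute-next pattern; not Guava's implementation.
abstract class LookaheadIterator<T> implements Iterator<T> {
    private enum State { NOT_READY, READY, DONE }

    private State state = State.NOT_READY;
    private T next;

    // Subclasses return the next element, or call endOfData() and return its result.
    protected abstract T computeNext();

    protected final T endOfData() {
        state = State.DONE;
        return null;
    }

    @Override
    public final boolean hasNext() {
        // Compute at most one element ahead, so repeated hasNext() calls are harmless.
        if (state == State.NOT_READY) {
            T candidate = computeNext();
            if (state != State.DONE) { // endOfData() was not called
                next = candidate;
                state = State.READY;
            }
        }
        return state == State.READY;
    }

    @Override
    public final T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        state = State.NOT_READY;
        T result = next;
        next = null; // do not keep a reference to the element just returned
        return result;
    }
}

// Example subclass: yields 1, 2, ..., limit.
class CountingIterator extends LookaheadIterator<Integer> {
    private final int limit;
    private int current = 0;

    CountingIterator(int limit) {
        this.limit = limit;
    }

    @Override
    protected Integer computeNext() {
        if (current < limit) {
            current++;
            return current;
        }
        return endOfData();
    }
}
```

The state tracking in hasNext and next is the part that hand-rolled iterators tend to get wrong, for example by advancing twice when hasNext is called twice in a row.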

Upvotes: 1
