Reputation: 145
I'm working on a server built with Spring. We have a very simple query (get all entities of a single kind) that fails once the number of records passes a certain threshold (~1,000,000).
This is the first approach I tried (adapted from the Google documentation):
PersistenceManager pm = persistenceManagerFactory.getPersistenceManager();
List<T> detachedList = new LinkedList<T>();
try {
    // 'cla' is the Class<T> of the entity kind being queried
    Query query = pm.newQuery(cla);
    query.setRange(0, 200);
    List<T> page = (List<T>) query.execute();
    List<T> results = new ArrayList<T>(page);
    // keep fetching pages of 200 until a short page signals the end
    while (page.size() == 200) {
        // take the cursor from the page just fetched; reusing the same
        // cursor would return the same 200 records on every iteration
        Cursor cursor = JDOCursorHelper.getCursor(page);
        Map<String, Object> extensionMap = new HashMap<String, Object>();
        extensionMap.put(JDOCursorHelper.CURSOR_EXTENSION, cursor);
        query.setExtensions(extensionMap);
        query.setRange(0, 200);
        page = (List<T>) query.execute();
        results.addAll(page);
    }
    for (T item : results) {
        detachedList.add(pm.detachCopy(item));
    }
    return detachedList;
} finally {
    pm.close();
}
After that I tried bypassing the PersistenceManager and using the low-level datastore API directly (I don't have any experience working with Spring or persistence):
JSONArray array = new JSONArray();
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query q = new Query("MobileDevice");
// fetch 1000 entities per datastore round trip
FetchOptions options = FetchOptions.Builder.withChunkSize(1000);
PreparedQuery pq = datastore.prepare(q);
for (Entity result : pq.asIterable(options)) {
    array.put(new MobileDevice(result).toJSON());
}
return new transactionResult(array.toString());
The server threw a timeout exception in both cases. Am I doing something wrong? This feels like it should be a simple operation.
Upvotes: 1
Views: 167
Reputation: 41089
The timeout relates to the duration of the request, not to the number of entities. On auto-scaled App Engine instances you have 60 seconds to complete a request. There is also a limit on how long a datastore query result remains available, so you need to use cursors when the result set is large.
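For example, with the low-level datastore API you can fetch the results in cursor-resumable batches. This is an untested sketch; I'm reusing the MobileDevice kind from your second attempt, and the batch size of 500 is arbitrary:

// fetch all entities of a kind in cursor-resumable batches, so that no
// single datastore query runs long enough for its results to expire
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query q = new Query("MobileDevice");
Cursor cursor = null;
while (true) {
    FetchOptions options = FetchOptions.Builder.withLimit(500);
    if (cursor != null) {
        options = options.startCursor(cursor); // resume after the last batch
    }
    QueryResultList<Entity> batch = datastore.prepare(q).asQueryResultList(options);
    for (Entity e : batch) {
        // process the entity here
    }
    if (batch.size() < 500) {
        break; // a short batch means there are no more results
    }
    cursor = batch.getCursor();
}

Note that this alone doesn't fix your problem: the whole loop still runs inside a single request, so the 60-second limit still applies. It only keeps each individual query fresh.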
The obvious solution is not to return a few million records as a single JSON object, but to let the client (or whoever needs this data) fetch them in manageable chunks, each of which takes well under 60 seconds to retrieve.
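Here is a minimal sketch of such an endpoint as a plain servlet. The cursor parameter name, the page size, and the X-Next-Cursor header are my own conventions, not anything App Engine prescribes:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.json.JSONArray;
import com.google.appengine.api.datastore.*;

// returns one page per request plus a web-safe cursor; the client keeps
// calling with the returned cursor until no cursor comes back
public class DevicesServlet extends HttpServlet {
    private static final int PAGE_SIZE = 500;

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        FetchOptions options = FetchOptions.Builder.withLimit(PAGE_SIZE);
        String cursorParam = req.getParameter("cursor");
        if (cursorParam != null) {
            options = options.startCursor(Cursor.fromWebSafeString(cursorParam));
        }
        QueryResultList<Entity> page =
                datastore.prepare(new Query("MobileDevice")).asQueryResultList(options);

        JSONArray items = new JSONArray();
        for (Entity e : page) {
            items.put(new MobileDevice(e).toJSON());
        }
        if (page.size() == PAGE_SIZE) {
            // hand the client a cursor so it can request the next page
            resp.setHeader("X-Next-Cursor", page.getCursor().toWebSafeString());
        }
        resp.setContentType("application/json");
        resp.getWriter().write(items.toString());
    }
}

The client then just loops: request the page, process it, and repeat with ?cursor=... until the response carries no X-Next-Cursor header.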
Upvotes: 2