Reputation: 12972
I'm trying to execute a fairly complex query against GAE's datastore. Basically, a subscriber can subscribe to news from a specific town or any town (*), a specific country or any country (*), specific ... or any ... (*).
Now, when I want to notify subscribers about news from ZA, I need to find all subscribers that match country=ZA as well as those that match country=*, and likewise for the other fields.
Query<Subscriber> query = ofy().load().type(Subscriber.class);
query = query.filter("searchCategory IN", Arrays.asList("*", category));
query = query.filter("searchCity IN", Arrays.asList("*", city));
query = query.filter("searchSuburb IN", Arrays.asList("*", suburb));
query = query.filter("searchTown IN", Arrays.asList("*", town));
query = query.filter("searchProvince IN", Arrays.asList("*", province));
query = query.filter("searchCountry IN", Arrays.asList("*", country));
query = query.limit(100);
QueryResultIterator<Subscriber> iterator = query.iterator();
while (iterator.hasNext()) {
    Subscriber subscriber = iterator.next();
    System.out.println(iterator.getCursor().toWebSafeString()); // exception here
}
I'm using task queues to notify enormous numbers of subscribers, 100 at a time (anywhere between 20k and 2M results depending on the query), and cursors seemed like the logical way to break the massive result set into manageable chunks. That was until I ran the application and got a NullPointerException when attempting to get the cursor. It turns out that cursors aren't supported when using the IN operator. From the documentation:
Because the NOT_EQUAL and IN operators are implemented with multiple queries, queries that use them do not support cursors, nor do composite queries constructed with the CompositeFilterOperator.or method.
What is the alternative in this situation seeing that I can't use cursors?
Upvotes: 0
Views: 669
Reputation: 2136
I think you can use cursors. According to the documentation:
Some NDB queries don't support query cursors, but you can fix them. If a query uses IN, OR, or !=, then the query results won't work with cursors unless ordered by key. If an application doesn't order the results by key and calls fetch_page(), it gets a BadArgumentError. If User.query(User.name.IN(['Joe', 'Jane'])).order(User.name).fetch_page(N) gets an error, change it to User.query(User.name.IN(['Joe', 'Jane'])).order(User.name, User.key).fetch_page(N)
I also suggest switching to NDB, as it's faster and has automatic caching.
Upvotes: 2
Reputation: 3626
One option is to run each of these queries independently using only equality filters. This would let you fetch one page of entries at a time; however, in addition to the cursor, you would have to remember which query you are running. This also wouldn't handle de-duplication for you, which could be a dealbreaker if you are sending out notifications.
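Here is a rough sketch of how the equality-only fan-out could be organised. It is plain Java with no datastore calls: it only enumerates the filter combinations (each field is matched against either "*" or the specific value, so n fields give 2^n queries) and tracks which combination a task is on. The class name EqualityFanOut, the Progress payload, and the sample value "Gauteng" are made up for illustration. Each combination would then be applied with ordinary equality filters and a cursor, which equality-only queries should support. Note that if every search field holds a single value, the combinations cannot overlap; duplicates only become a concern with multi-valued fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EqualityFanOut {

    /** Wildcard stored on a subscriber that accepts any value for a field. */
    public static final String ANY = "*";

    /**
     * Expands {field -> specific value} into every combination of equality
     * filters, where each field is either the wildcard or the specific value.
     */
    public static List<Map<String, String>> equalityCombinations(Map<String, String> specific) {
        List<String> fields = new ArrayList<>(specific.keySet());
        int n = fields.size();
        List<Map<String, String>> combos = new ArrayList<>();
        // Bit i of the mask decides whether field i uses its specific value.
        for (int mask = 0; mask < (1 << n); mask++) {
            Map<String, String> combo = new LinkedHashMap<>();
            for (int i = 0; i < n; i++) {
                String field = fields.get(i);
                boolean useSpecific = (mask & (1 << i)) != 0;
                combo.put(field, useSpecific ? specific.get(field) : ANY);
            }
            combos.add(combo);
        }
        return combos;
    }

    /** Hypothetical task payload: which combination is running and where it stopped. */
    public static final class Progress {
        public final int comboIndex;
        public final String cursor; // web-safe cursor string, or null to start at the beginning

        public Progress(int comboIndex, String cursor) {
            this.comboIndex = comboIndex;
            this.cursor = cursor;
        }

        /** Stay on the same combination while pages come back full, otherwise move on. */
        public Progress next(String nextCursor, boolean pageWasFull) {
            return pageWasFull
                    ? new Progress(comboIndex, nextCursor)
                    : new Progress(comboIndex + 1, null);
        }
    }

    public static void main(String[] args) {
        Map<String, String> specific = new LinkedHashMap<>();
        specific.put("searchCountry", "ZA");
        specific.put("searchProvince", "Gauteng"); // sample value, not from the question
        List<Map<String, String>> combos = equalityCombinations(specific);
        for (int i = 0; i < combos.size(); i++) {
            System.out.println(i + ": " + combos.get(i));
        }
    }
}
```

Each task would carry a Progress value, run the query for combos.get(comboIndex) starting at the stored cursor, and enqueue the next task with progress.next(...) until comboIndex runs past the end of the list.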
You should consider looking into the Prospective Search API. This is an experimental feature, but it does exactly what you want to do.
The basic idea is that for each user you specify their subscription queries (such as searchCountry=NZ). Then, when you process a news article and send it to the Prospective Search API, it gives you all the subscriptions (queries) that match that article.
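To make the idea concrete, here is a tiny in-memory illustration of matching in that direction: subscriptions are stored up front and each incoming article is checked against them. This is not the Prospective Search API, just a sketch of the concept. The class SubscriptionMatcher and its methods are invented for the example, and it assumes "any value" is expressed by leaving a field out of the subscription.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SubscriptionMatcher {

    /** Registered subscriptions: subscription id -> field values the article must have. */
    private final Map<String, Map<String, String>> subscriptions = new LinkedHashMap<>();

    /** Stores a subscription; a field that is absent means "any value". */
    public void subscribe(String subscriptionId, Map<String, String> requiredFields) {
        subscriptions.put(subscriptionId, new LinkedHashMap<>(requiredFields));
    }

    /** Returns the ids of all subscriptions whose required fields all equal the article's values. */
    public List<String> match(Map<String, String> article) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> sub : subscriptions.entrySet()) {
            boolean allMatch = true;
            for (Map.Entry<String, String> required : sub.getValue().entrySet()) {
                if (!required.getValue().equals(article.get(required.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                matched.add(sub.getKey());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        SubscriptionMatcher matcher = new SubscriptionMatcher();
        matcher.subscribe("sub-nz", Collections.singletonMap("searchCountry", "NZ"));
        matcher.subscribe("sub-za", Collections.singletonMap("searchCountry", "ZA"));
        matcher.subscribe("sub-any", Collections.<String, String>emptyMap());

        Map<String, String> article = new LinkedHashMap<>();
        article.put("searchCountry", "ZA");
        article.put("searchTown", "Pretoria"); // sample value, not from the question
        System.out.println(matcher.match(article));
    }
}
```

The linear scan here is only for illustration; the point of a prospective search service is that it does this matching for you at scale, so you never have to run the wildcard query against the subscribers at all.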
Upvotes: 1