Reputation: 2122
I need to collect some statistics on my entities in the datastore.
As an example, I need to know how many objects of a kind I have, how many objects have some properties set to particular values, etc. In a typical relational DBMS I could use
SELECT COUNT(*) ... WHERE property=<some value>
or
SELECT MAX(<some column>), ... GROUP BY property
etc. But here I cannot find any of these constructs.
Moreover, I cannot load all the objects into memory (e.g. using pm.getExtent(MyCall.class, false)), as I have too many entities (more than 100k).
Do you know any trick to achieve my goal?
Upvotes: 4
Views: 1726
Reputation: 24910
Support for aggregate functions is limited on GAE. This is primarily an artifact of the schema-less nature of BigTable. The alternative is to maintain the aggregate values yourself in separate fields so that you can access them quickly.
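To illustrate the "maintain it yourself" idea, here is a minimal plain-Java sketch. It is only an assumption about how such bookkeeping could look: the class MaintainedStats and its methods are hypothetical, and the in-memory map stands in for wherever you would actually persist the counters (for example a dedicated stats entity updated alongside each write).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: keeps a running count per property value so that
// reading the aggregate never requires scanning all entities.
public class MaintainedStats {
    // In-memory stand-in for persisted counter fields (an assumption of this sketch).
    private final Map<String, Long> countByValue = new HashMap<>();

    /** Call whenever an entity with this property value is stored. */
    public void onInsert(String propertyValue) {
        countByValue.merge(propertyValue, 1L, Long::sum);
    }

    /** Call whenever an entity with this property value is deleted. */
    public void onDelete(String propertyValue) {
        // Returning null from the remapping function drops the entry at zero.
        countByValue.computeIfPresent(propertyValue, (k, v) -> v > 1 ? v - 1 : null);
    }

    /** Reading the aggregate is a single lookup. */
    public long count(String propertyValue) {
        return countByValue.getOrDefault(propertyValue, 0L);
    }
}
```

The trade-off is that every write path has to remember to update the counters, but the read side becomes trivially cheap.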
To do a count, you could do something like this --
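Query q = em.createQuery("SELECT count(p) FROM your.package.Class p");
// Cast to Number rather than Integer: JPA providers commonly return a Long for count().
long count = ((Number) q.getSingleResult()).longValue();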
but the result will probably be capped at 1000, since GAE limits the number of rows fetched to 1000.
Some helpful reading on how to work around these issues --
http://marceloverdijk.blogspot.com/2009/06/google-app-engine-datastore-doubts.html
Is there a way to do aggregate functions on Google App Engine?
Upvotes: 1
Reputation: 35961
Actually, it depends on your specific requirements.
That said, a common approach is to prepare this stats data in the background.
For example, you can run a few tasks using the Queue service, each of which runs a query like select x where x.property == some value, together with a cursor and a sum variable. On the first step, the cursor is empty and sum is zero. You then iterate over the query results, for 1000 items (query limit) or 9 minutes (task limit), incrementing sum on every step. If the scan has not finished, you call the task again with the new cursor and sum values; that is, you add a request for the next step to the queue. A cursor is easily serialized to a string.
When you reach the final step, save the result value somewhere, such as a stats results table.
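The steps above can be sketched roughly as follows. This is plain Java that only simulates the pattern: the list stands in for the datastore, the "cursor" is just an offset encoded as a string, and the loop in countAll stands in for re-enqueuing the task. The names ChunkedCount, State, step and countAll are made up for this illustration; in a real app the page fetch would be a datastore query resumed from a real cursor, and the filter would be part of the query itself.

```java
import java.util.List;

public class ChunkedCount {

    /** State passed from one task run to the next; both fields serialize trivially. */
    public static final class State {
        public final String cursor; // "" on the first step, null once the scan is finished
        public final long sum;

        public State(String cursor, long sum) {
            this.cursor = cursor;
            this.sum = sum;
        }
    }

    /** One task run: resume at state.cursor, scan at most pageSize items, return the new state. */
    public static State step(List<String> values, String wanted, int pageSize, State state) {
        int start = state.cursor.isEmpty() ? 0 : Integer.parseInt(state.cursor);
        int end = Math.min(start + pageSize, values.size());
        long sum = state.sum;
        for (int i = start; i < end; i++) {
            if (wanted.equals(values.get(i))) {
                sum++; // a real datastore query would apply this filter itself
            }
        }
        // A null cursor signals that there is nothing left to scan.
        String next = end < values.size() ? Integer.toString(end) : null;
        return new State(next, sum);
    }

    /** Stand-in for the queue: keeps running steps until the scan reports completion. */
    public static long countAll(List<String> values, String wanted, int pageSize) {
        State state = new State("", 0L);
        while (state.cursor != null) {
            state = step(values, wanted, pageSize, state);
        }
        return state.sum; // a real app would save this into a stats results entity
    }
}
```

Because each step only needs the cursor string and the running sum, the state fits easily into task parameters.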
Take a look at:
Also, this stats/aggregation stuff really depends on your actual task, requirements, and project. There are a few ways to accomplish this, each optimal for different tasks. There is no standard way, as there is in SQL.
Upvotes: 2