Reputation: 5561
I'm building an application on Google App Engine (Java) where users can make posts, and I'm thinking of adding tags to these posts, so I will have something like this:
in entity Post:
public List<Key> tags;
in entity Tag:
public List<Key> posts;
It would be easy to query, for example, all posts with a certain tag, but how could I get all the posts that have every tag in a given list? I could run a query for each tag and then intersect the results, but maybe there is a better way, because that would be slow with a lot of posts.
Another thing that may be more difficult: given a post, get the posts that have tags in common with it, ordered by the number of common tags, so I could find posts "similar" to this one.
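To make that concrete, here is a minimal in-memory sketch of the ordering I mean. It assumes the candidate posts and their tags are already loaded; SimilarPosts, rank, and the String ids are made up for illustration and say nothing about how to do this efficiently on the datastore.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory illustration; no datastore calls are made here.
class SimilarPosts {
    // Returns the ids of candidate posts that share at least one tag with the
    // target, ordered by descending number of shared tags (ties broken by id).
    static List<String> rank(Set<String> targetTags, Map<String, Set<String>> candidates) {
        final Map<String, Integer> shared = new HashMap<String, Integer>();
        for (Map.Entry<String, Set<String>> e : candidates.entrySet()) {
            // Copy before retainAll so the caller's tag sets are not modified.
            Set<String> common = new HashSet<String>(e.getValue());
            common.retainAll(targetTags);
            if (!common.isEmpty()) {
                shared.put(e.getKey(), common.size());
            }
        }
        List<String> ids = new ArrayList<String>(shared.keySet());
        Collections.sort(ids, new Comparator<String>() {
            public int compare(String a, String b) {
                int byCount = shared.get(b).compareTo(shared.get(a));
                return byCount != 0 ? byCount : a.compareTo(b);
            }
        });
        return ids;
    }
}
```

The hard part is getting something equivalent without loading every post into memory first.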
Well, with joins this would be a lot easier, but I'm just starting with App Engine and can't really think of a good way to replace joins.
Thanks!
Upvotes: 10
Views: 2661
Reputation: 4851
See @topchef's blog post on this: Efficient Keyword Search with Relation Index Entities and Objectify for Google Datastore. It talks about implementing search with list properties using Relation Index Entities and Objectify.
Upvotes: 1
Reputation:
You might want to check out this video from Google I/O. Relation Index Entities are what you need; they allow you to remove List<Key> posts on the Tag entity, as well as List<Key> tags on the Post entity.
Upvotes: 1
Reputation: 14187
With this design, I'm afraid your Tag entity could be a bottleneck, especially if you expect some tags to be very common. Three specific issues I can think of are the efficiency of your gets and puts, write contention, and exploding indexes. Take Stack Overflow as an example: there are 14,000 posts tagged "java" right now. With your design, that single Tag entity would hold a list of 14,000 post keys and would have to be rewritten every time someone tags a new post.
Further Reading:
this post touches on some of the issues with large lists
The good news is, some of your requirements would be easily handled by just the Post entity. For example, you could easily find all the posts that have all of a list of tags with a query filter like this:
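// pm is a JDO PersistenceManager; tags is assumed to be a list of tag names.
// On a multi-valued property, each equality filter must match some value in the list.
Query q = pm.newQuery(Post.class);
q.setFilter("tags == 'Java' && tags == 'appengine'");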
For all posts with either java or appengine tags, you would need to do one query for each tag, then combine the results yourself. The datastore doesn't handle OR/IN type operations right now.
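The combining step is plain Java once each query has returned. Here is a minimal sketch, assuming each per-tag query yields a list of post ids (Strings stand in for datastore keys); TagQueryMerge and union are hypothetical names, not part of any App Engine API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any App Engine API.
class TagQueryMerge {
    // Merges the result lists of several single-tag queries into one list,
    // dropping duplicates while keeping first-seen order.
    static <T> List<T> union(List<List<T>> perTagResults) {
        Set<T> merged = new LinkedHashSet<T>();
        for (List<T> results : perTagResults) {
            merged.addAll(results);
        }
        return new ArrayList<T>(merged);
    }

    public static void main(String[] args) {
        // Stand-ins for the ids returned by two separate single-tag queries.
        List<List<String>> both = new ArrayList<List<String>>();
        both.add(Arrays.asList("post1", "post2", "post3")); // tagged java
        both.add(Arrays.asList("post2", "post4"));          // tagged appengine
        System.out.println(union(both)); // prints [post1, post2, post3, post4]
    }
}
```

Deduplication relies on equals and hashCode, so merge keys or ids rather than entity objects unless your entity class defines both.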
Finding related posts sounds tricky. I'll think about that after some coffee.
Upvotes: 5