Snowman

Reputation: 32071

Hosting a small database aside from Google App Engine?

I asked another question about doing large queries in GAE, and the answer was pretty much that it is not possible.

What I want to do is this: from an iOS device, I get the phone numbers of all the user's contacts. So now I have a list of, say, 250 phone numbers. I want to send these phone numbers back to the server and check which of them belong to a User account.

So I need to do a query: query = User.query(User.phone.IN(phones_list))

However, with GAE, this is quite an expensive query. It will cost 250 reads for just this one query, and I expect to do this type of query often.

So I came up with a crazy idea. Why don't I host the phone numbers on another host, in another database, where this type of query is cheaper? Then I can have GAE send an HTTP request to my other server to get the desired info.

So I have two questions:

  1. Are there any databases better suited to these kinds of queries, where they would be cheaper to run? Or will it all be the same as GAE?
  2. Is this overkill? Is it a good idea? Should I suck it up and pay the cost?

Upvotes: 1

Views: 165

Answers (2)

Jay

Reputation: 535

Generalizing slightly on other ideas offered... assuming that all your search keys are unique to a single User (e.g. email, phone, twitter handle, etc.)

At User write time, you can generate a set of SearchIndex(...) entities, one per search key, and persist them. Each SearchIndex stores the key of its User. Then at search time you can construct the keys of the SearchIndex entities you are looking for and make two ndb.get_multi_async calls: the first to get the matching SearchIndex entities, and the second to get the Users associated with those index entities.
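To make the two-step lookup concrete, here is a minimal sketch in plain Python. It does not use the real ndb API: the DATASTORE dict and the put_user, get_multi and find_users helpers are hypothetical stand-ins, and the idea that each SearchIndex is keyed by the search value itself is my assumption about what the answer intends.

```python
# Plain-Python stand-in for the datastore: a dict keyed by entity key.
# In a real app these would be ndb entities fetched with batch gets.
DATASTORE = {}

def put_user(user_id, name, phones):
    """Write a User plus one SearchIndex record per search value."""
    DATASTORE[("User", user_id)] = {"name": name, "phones": phones}
    for phone in phones:
        # Assumption: the index key is derived from the search value itself.
        DATASTORE[("SearchIndex", phone)] = {"user_key": ("User", user_id)}

def get_multi(keys):
    """Batch get by key; returns None for keys that do not exist."""
    return [DATASTORE.get(k) for k in keys]

def find_users(phones):
    # First batch get: index entries for the candidate phone numbers.
    index_hits = get_multi([("SearchIndex", p) for p in phones])
    user_keys = sorted({hit["user_key"] for hit in index_hits if hit})
    # Second batch get: the Users those index entries point to.
    return get_multi(user_keys)

put_user(1, "alice", ["5550001", "5550002"])
put_user(2, "bob", ["5550003"])
matches = find_users(["5550002", "5550003", "5559999"])
names = [u["name"] for u in matches]
```

Both steps are gets by key rather than queries, which is the point of the approach: no IN query is needed, and unknown numbers simply come back as None.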

Upvotes: 0

lucemia

Reputation: 6627

GAE's datastore should be good enough for your service, since your application looks like it could be parallelized very well.

1. Use the phone number as the key_name of User.

Once the phone number is the key_name of User, the following code speeds up the lookup and reduces the number of read operations.

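from google.appengine.api import memcache
# phones: the list of numbers sent by the client (name assumed)
cached = memcache.get_multi(phones)  # dict of the numbers already cached
missing = [p for p in phones if p not in cached]
users = User.get_by_key_name(missing)  # one batch get; None for unknown numbers
found = dict((p, True) for p, u in zip(missing, users) if u is not None)
memcache.set_multi(found)  # cache the numbers found in the datastore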
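The flow above (check memcache, fetch only the misses from the datastore, then cache what was found) can be tried outside App Engine with in-memory stand-ins. This is only a sketch: FakeCache, FakeStore and registered_numbers are hypothetical names that mimic the shape of the memcache and get_by_key_name calls, not the real services.

```python
class FakeCache(object):
    """Stand-in for memcache with the same get_multi/set_multi shape."""
    def __init__(self):
        self.data = {}

    def get_multi(self, keys):
        return {k: self.data[k] for k in keys if k in self.data}

    def set_multi(self, mapping):
        self.data.update(mapping)

class FakeStore(object):
    """Stand-in for the datastore; counts how many keys were read."""
    def __init__(self, registered):
        self.registered = set(registered)
        self.reads = 0

    def get_by_key_name(self, key_names):
        self.reads += len(key_names)
        return [k if k in self.registered else None for k in key_names]

def registered_numbers(phones, cache, store):
    cached = cache.get_multi(phones)
    missing = [p for p in phones if p not in cached]
    fetched = store.get_by_key_name(missing) if missing else []
    # Only numbers that exist in the store are cached, as in the answer.
    found = {p: True for p, hit in zip(missing, fetched) if hit is not None}
    if found:
        cache.set_multi(found)
    cached.update(found)
    return [p for p in phones if p in cached]

cache = FakeCache()
store = FakeStore(["111", "333"])
first = registered_numbers(["111", "222", "333"], cache, store)
reads_after_first = store.reads
second = registered_numbers(["111", "222", "333"], cache, store)
reads_after_second = store.reads
```

On the second call only the number that was never found is read again, so repeated lookups of the same contacts get cheaper as the cache warms up.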

2. Store multiple numbers in one datastore entity.

GAE's operation cost is not directly related to an entity's size, so storing many numbers in one large entity is another way to reduce the operation cost.

For example, store the phone numbers that share the same number_prefix together.

from google.appengine.ext import db

class Number(db.Model):
    # The entity's key_name is the prefix, so it can be fetched by key.
    number_prefix = db.StringProperty()
    numbers = db.StringListProperty(indexed=False)

# Check the numbers 01234567 and 032123124.
buckets = Number.get_by_key_name(["01", "03"])

# Is 01234567 in buckets[0].numbers?
# Is 032123124 in buckets[1].numbers?

This method could be further improved with memcache.
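As a rough illustration of the prefix idea, the sketch below groups the candidate numbers by prefix and counts one entity read per distinct prefix. It runs in plain Python: STORED is a hypothetical stand-in for the stored Number entities, and the two-character PREFIX_LEN is an assumption taken from the "01" and "03" example.

```python
from collections import defaultdict

PREFIX_LEN = 2  # assumed prefix length, matching the "01"/"03" example

def bucket_by_prefix(numbers):
    """Group phone numbers by their leading PREFIX_LEN characters."""
    buckets = defaultdict(list)
    for n in numbers:
        buckets[n[:PREFIX_LEN]].append(n)
    return buckets

# Stand-in for stored Number entities: key_name (prefix) -> numbers list.
STORED = {"01": ["01234567", "01999999"], "03": ["03000000"]}

def check_numbers(candidates):
    wanted = bucket_by_prefix(candidates)
    # One fetch per distinct prefix rather than one per phone number.
    fetched = {prefix: set(STORED.get(prefix, [])) for prefix in wanted}
    result = {n: n in fetched[n[:PREFIX_LEN]] for n in candidates}
    return result, len(wanted)

result, entity_reads = check_numbers(["01234567", "01999999", "032123124"])
```

Here three numbers are checked with two entity reads. How much this saves in practice depends on how the contact numbers spread across prefixes, and a longer prefix gives smaller entities but more reads.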

Upvotes: 1
