avi.elkharrat
avi.elkharrat

Reputation: 6800

How to do an SQL style search query in google cloud datastore?

Google Cloud datastore does not allow to make SQL style search queries like

SELECT * FROM Person WHERE Name LIKE "Rob*"

that would return Rob, Robert, Roberto, Roberta, Roby and so on.

GCP datastore only allows to filter on fields using operators: >, >=, <, <= with rules like:

With these rules, the query

query := datastore.NewQuery("Person").Filter("Name >= ", "Rob").Order("Name")

would not only return all Person whose name starts with Rob, but also all Person whose name is greater than Rob (Ruben, Sarah, Zoe) and all Person whose names starts with a lower case letter.

The present post is a hack that I found in Go to emulate a SQL style search query.

Upvotes: 0

Views: 585

Answers (2)

Jim Morrison
Jim Morrison

Reputation: 2887

Specifically for prefix matches you can use multiple inequality filters on the same property. I.e. from https://cloud.google.com/datastore/docs/concepts/queries "If a query has multiple inequality filters on a given property, an entity will match the query only if at least one of its individual values for the property satisfies all of the filters."

The example on that page is SELECT * FROM Task WHERE tag > 'learn' AND tag < 'math'. Or for your case query := datastore.NewQuery("Person").Filter("Name >= ", "Rob").Filter("Name <= Rob~").Order("Name")

Upvotes: 1

avi.elkharrat
avi.elkharrat

Reputation: 6800

The following solution addresses the issue programmatically. It is coded in go but i believe it could be easily adapted to any language.

I'll first produce the whole snippet and then I shall break it down.

func (store *PersonStore) Search(name string) ([]Person, error) {
    context := context.Background()
    persons := make([]*Person, 0)
    query := datastore.NewQuery("Person").Filter("Name >= ", strings.ToLower(name)).Order("Name")

    keys, err := store.client.GetAll(context, query, &persons)
    if err != nil {
        return nil, fmt.Errorf("unable to search for persons w/ names containing %s - %v", name, err)
    }

    filteredPersons := make([]*Perons, 0)
    for i, p := range persons {
        p.ID = keys[i].ID
        if !strings.Contains(p.Name, strings.ToLower(name)) {
            break
        } else {
            filteredPersons = append(filteredPersons, p)
        }
    }
}

1 - Lower case assumption

In order for this code to work we first need to make a very strong assumption that all names are in lower case. If, for some reason, you cannot make this assumption, you can still partially use this code, but it will be less efficient.

2 - Query the datastore

The first part of the code is dedicated to fetching the datastore for Persons whose name is matching the desired pattern.

    context := context.Background()
    persons := make([]*Person, 0)
    query := datastore.NewQuery("Person").Filter("Name >= ", strings.ToLower(name)).Order("Name")

    keys, err := store.client.GetAll(context, query, &persons)
    if err != nil {
        return nil, fmt.Errorf("unable to search for persons w/ names containing %s - %v", name, err)
    }

We make sure we use strings.ToLower(name) to make sure we won't fetch ALL Persons. Ideally name should be also trimmed, but this is usually done in the front-end, so we ommitted this here.

Once again, this is base on the assumption that all Names are in lower case. If you cannot assume this, you can always use

query := datastore.NewQuery("Person").Filter("Name >= ", name).Order("Name")

You'll just simply get an initial list with (possibly a lot) more Persons.

Finally, we order our fetched list with .Order("Name") as this is going to be the base point of the second part of the code.

3 - Filter on Names

Up to here it was a simple GetAll piece of code. We still need to insert the keys into the Person structures. We need to find a way to optimize that. For this we can rest on the fact that persons and keys list are in exact length and order. So the upcoming for loop starts exactly like a regular insert Key into structure bit.

    for i, p := range persons {
        p.ID = keys[i].ID

The next bit is where the optimization is: since we know that the Persons are ordered by Name we are certain that, as soon as strings.Contains(p.Name, strings.ToLower(name)) is not true we have selected all Persons whose Name matches our criteria, that is, as soon as p.Name doesn't start with Rob anymore but with Roc or Rod or anything lexicographically greater than this.

        if !strings.Contains(p.Name, strings.ToLower(name)) {
            break

We can then escape our loop using a break instruction, hopefuly having only parsed the first few elements of persons list that matche our criteria. These fall into the else statement:

        } else {
            filteredPersons = append(filteredPersons, p)
        }

4 - Filter on Names without the lower case assumption

As I said earlier, if you cannot assume all names are in lower case, you can still use this code, but it will not be optimized, because you will mandatorily have to scan the full persons list returned by the query.

The code should look liske this

    filteredPersons := make([]*Perons, 0)
    for i, p := range persons {
        p.ID = keys[i].ID
        if strings.Contains(p.Name, strings.ToLower(name)) {
            filteredPersons = append(filteredPersons, p)
        }
    }

Upvotes: 0

Related Questions