Reputation: 269
I've managed to save some data to Google Cloud Datastore and now I'm planning on querying it to get meaningful data out.
I've tried to follow Google Cloud documentation as well as I can, but I cannot get add_filter to work properly with floats.
In this query, I'm trying to find all products that have currentPrice over 500, but instead I get all results, just sorted by currentPrice:
from google.cloud import datastore

def create_client(project_id):
    return datastore.Client(project_id)

client = create_client('product-catalog')
query = client.query(kind='Product')
query.add_filter('currentPrice', '>', 500)
for result in query.fetch():
    for key in result:
        print(key+"("+str(type(result[key]))+"): "+str(result[key]))
    print('---')
returns:
title(<class 'str'>): Cheap Oven
url(<class 'str'>): /product/343
currentPrice(<class 'float'>): 109.0
createdAt(<class 'datetime.datetime'>): 2018-03-22 19:52:02.173806+00:00
---
title(<class 'str'>): Regular Oven
url(<class 'str'>): /product/1231
currentPrice(<class 'float'>): 549.0
createdAt(<class 'datetime.datetime'>): 2018-03-21 03:25:24.622558+00:00
---
title(<class 'str'>): Expensive Oven
url(<class 'str'>): /product/4234
currentPrice(<class 'float'>): 2399.0
createdAt(<class 'datetime.datetime'>): 2018-03-23 22:46:01.571207+00:00
---
I was expecting my query to return only Regular Oven and Expensive Oven.
For clarity, I've only included three results, but in my actual data there are over 50,000 products, and I've verified that the query returns all products sorted by currentPrice and that this is not just a coincidence.
Upvotes: 0
Views: 1912
Reputation: 8178
I see you have already found the solution to your issue, but let me provide some more detail about why > 500 and > 500.0 do not return the same results, so that the reported behavior is easier to understand.
For Datastore, Floating-point numbers and Integers are completely different and independent property types. In this case, as explained in the section about how Datastore performs ordering by value type, integers are sorted before floats, in such a way that any value with Float type will be greater than any value with Integer type:
5 < 7 --- 5.0 < 7.0 --- 7 < 5.0
That is the reason why, when you were working with the filter query.add_filter('currentPrice', '>', 500), all entities were being returned: in this case, all the prices (of Float type) were greater than 500 (of Integer type):
500 < 109.0 --- 500 < 549.0 --- 500 < 2399.0
Therefore, when applying filters to queries in Datastore, the filter value should have the same type as the property you are filtering on.
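To make the ordering rule above concrete, here is a small pure-Python sketch that models it. It is only an illustration of the explanation in this answer, restricted to integers and floats; the names TYPE_RANK, datastore_sort_key and matches_greater_than are made up for this example and are not part of the Datastore client library.

```python
# Simplified model of the cross-type ordering described above, restricted to
# integers and floats: every integer sorts before every float.
# The rank values are illustrative only, not Datastore internals.
TYPE_RANK = {int: 0, float: 1}

def datastore_sort_key(value):
    """Return a key that orders by type first, then by value."""
    return (TYPE_RANK[type(value)], value)

def matches_greater_than(stored, bound):
    """Model a '>' filter under the mixed-type ordering."""
    return datastore_sort_key(stored) > datastore_sort_key(bound)

prices = [109.0, 549.0, 2399.0]

# Integer bound: every float outranks it, so nothing is filtered out.
print([p for p in prices if matches_greater_than(p, 500)])
# prints [109.0, 549.0, 2399.0]

# Float bound: same type, so the comparison is purely numeric.
print([p for p in prices if matches_greater_than(p, 500.0)])
# prints [549.0, 2399.0]
```

With the integer bound the type rank alone decides the comparison, which matches the "all results, sorted by currentPrice" behavior reported in the question.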
Upvotes: 2
Reputation: 269
After doing some tests, I found out that the problem was that the query used an integer, while the corresponding stored values were floats. Changing 500 -> 500.0 fixed this.
So this works correctly:
from google.cloud import datastore

def create_client(project_id):
    return datastore.Client(project_id)

client = create_client('product-catalog')
query = client.query(kind='Product')
query.add_filter('currentPrice', '>', 500.0)
for result in query.fetch():
    for key in result:
        print(key+"("+str(type(result[key]))+"): "+str(result[key]))
    print('---')
returns:
title(<class 'str'>): Regular Oven
url(<class 'str'>): /product/1231
currentPrice(<class 'float'>): 549.0
createdAt(<class 'datetime.datetime'>): 2018-03-21 03:25:24.622558+00:00
---
title(<class 'str'>): Expensive Oven
url(<class 'str'>): /product/4234
currentPrice(<class 'float'>): 2399.0
createdAt(<class 'datetime.datetime'>): 2018-03-23 22:46:01.571207+00:00
---
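If the threshold comes from user input or config rather than a literal, one possible way to avoid the same mistake is to coerce it to float before building the filter. This is only a sketch: as_float_bound is a hypothetical helper I made up, not something provided by google-cloud-datastore, and it assumes the property is always stored as a float.

```python
def as_float_bound(value):
    """Hypothetical helper: coerce a numeric filter bound to float.

    Assumes the property being filtered is stored as a float in Datastore.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("expected an int or float, got " + type(value).__name__)
    return float(value)

bound = as_float_bound(500)
print(type(bound).__name__, bound)  # prints: float 500.0

# Intended use with the query from above (not executed here):
# query.add_filter('currentPrice', '>', as_float_bound(500))
```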
Upvotes: 1