Reputation:
I would like to design the following using classes and clusters, and I am looking for the most logical and efficient solution.
Basically, I have 3 very different types of users, so I designed them as classes that extend the abstract User class.
My app relies heavily on geolocation. To give the best user experience in terms of response time (when performing scans, etc.), I'm hesitating between 2 methods:
Having, for each UserType, as many clusters as there are countries, then selecting from the cluster for the relevant country.
                     _______________________
                    | User (abstract class) |
                    |_______________________|
                                ^
                                |
                                |
 ___________________   ___________________   ___________________
| UserType1 (class) | | UserType2 (class) | | UserType3 (class) |
|___________________| |___________________| |___________________|
          |                     |                     |
          |                     |                     |
    US-Cluster_1          US-Cluster_2          US-Cluster_3
    FR-Cluster_1          FR-Cluster_2          FR-Cluster_3
    UK-Cluster_1          UK-Cluster_2          UK-Cluster_3
Having a countryField on each UserType, then selecting users by filtering on it.
                     _______________________
                    | User (abstract class) |
                    |_______________________|
                                ^
                                |
                                |
 ___________________   ___________________   ___________________
| UserType1 (class) | | UserType2 (class) | | UserType3 (class) |
|                   | |                   | |                   |
| - countryField    | | - countryField    | | - countryField    |
|___________________| |___________________| |___________________|
and then: SELECT * FROM UserType1 WHERE countryField = "US"
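To make the two methods concrete, here is a rough model in plain Python. It is not OrientDB code and says nothing about OrientDB's real performance; it only shows the shape of each layout. The sample users and the names `USERS`, `clusters`, `classes`, `select_by_cluster` and `select_by_field` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (user_type, country, name). Not from the original post.
USERS = [
    ("UserType1", "US", "alice"),
    ("UserType1", "FR", "bruno"),
    ("UserType1", "US", "carol"),
    ("UserType2", "UK", "dave"),
]

# Method 1: one bucket (standing in for a cluster) per (user type, country) pair.
clusters = defaultdict(list)
for user_type, country, name in USERS:
    clusters[(user_type, country)].append(name)

# Method 2: one collection per user type, each record carrying a countryField.
classes = defaultdict(list)
for user_type, country, name in USERS:
    classes[user_type].append({"name": name, "countryField": country})


def select_by_cluster(user_type, country):
    """Method 1: read only the bucket for this type and country."""
    return list(clusters.get((user_type, country), []))


def select_by_field(user_type, country):
    """Method 2: scan the whole class and filter on countryField."""
    return [
        record["name"]
        for record in classes.get(user_type, [])
        if record["countryField"] == country
    ]
```

Both `select_by_cluster("UserType1", "US")` and `select_by_field("UserType1", "US")` return `['alice', 'carol']`. The difference is how much data each one touches: method 1 reads one bucket, while method 2 (with no index) walks every record of the class.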
What would be the most efficient and logical way?
Thank you.
Upvotes: 2
Views: 187
Reputation: 147
If the number of records inside a cluster grows into the millions, you will again have trouble retrieving records from it because, according to this thread [1], OrientDB cannot use indexes when you query a cluster directly.
So if the cluster keeps growing and you later want to index another field (say townField) to speed up retrieval, you will not be able to. The only option left would be to cluster again, this time by town.
I would therefore suggest going with the second approach and using indexes effectively, or trying the class-inheritance-based solution that the OrientDB community suggests in that thread [1].
Ref [1] https://github.com/orientechnologies/orientdb/issues/4606
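As a rough illustration of why keeping the fields on the record leaves room for more indexes later, here is a plain-Python sketch. It does not model OrientDB's index engine; the sample records and the names `RECORDS`, `build_index` and `find` are hypothetical, and `townField` is simply the example field mentioned above.

```python
from collections import defaultdict

# Hypothetical records; the values are invented for illustration.
RECORDS = [
    {"name": "alice", "countryField": "US", "townField": "Boston"},
    {"name": "bruno", "countryField": "FR", "townField": "Lyon"},
    {"name": "carol", "countryField": "US", "townField": "Austin"},
    {"name": "erin", "countryField": "US", "townField": "Boston"},
]


def build_index(records, field):
    """Map each value of `field` to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return index


# A second index can be added later without reorganising the stored records.
country_index = build_index(RECORDS, "countryField")
town_index = build_index(RECORDS, "townField")


def find(country, town):
    """Intersect two index lookups instead of scanning every record."""
    hits = set(country_index.get(country, [])) & set(town_index.get(town, []))
    return [RECORDS[position]["name"] for position in sorted(hits)]
```

Here `find("US", "Boston")` returns `['alice', 'erin']`. With data physically split by country only, a filter on town would still have to walk a whole country's bucket.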
Upvotes: 0
Reputation: 565
It partly depends on your record counts and desired response time. In our experience, separating the data into clusters greatly improves query times at the expense of more complexity (managing the clusters, different queries, etc.). We put a couple of million records in each cluster and add some home-made indexes to keep query times low.
You really should generate some test data and store it both ways to test query performance against your requirements. No two use cases are ever the same.
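The "store it both ways and measure" advice can be sketched as a small harness. This is only a plain-Python stand-in: the timings it produces say nothing about OrientDB itself, and the same pattern would need to be pointed at real queries against a real database. The country list, the record count and the names `DATA`, `by_bucket`, `by_filter` and `measure` are assumptions made for the example.

```python
import random
import timeit
from collections import defaultdict

random.seed(0)  # reproducible test data
COUNTRIES = ["US", "FR", "UK"]  # hypothetical sample set
# Generated test records: (country, user_id).
DATA = [(random.choice(COUNTRIES), user_id) for user_id in range(100_000)]

# Layout 1: one bucket per country.
buckets = defaultdict(list)
for country, user_id in DATA:
    buckets[country].append(user_id)


def by_bucket(country):
    """Layout 1: return the pre-separated bucket for this country."""
    return buckets[country]


def by_filter(country):
    """Layout 2 without an index: scan everything and filter on country."""
    return [user_id for c, user_id in DATA if c == country]


def measure(fn, repeat=5):
    """Best wall-clock time, in seconds, over `repeat` single calls of fn("US")."""
    return min(timeit.repeat(lambda: fn("US"), number=1, repeat=repeat))
```

Calling `measure(by_bucket)` and `measure(by_filter)` gives two numbers to compare against a response-time target. Checking that both layouts return the same records before timing them is a cheap sanity test.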
Upvotes: 1