Reputation: 756
Let's say there's some kind of car rental/sales aggregator. It works with many service providers (which supply the cars) and with the customers themselves.
The Django way to model this kind of system would be something like this:
class Vendor(models.Model):
    name = models.CharField(max_length=100)  # max_length is required for CharField


class Car(models.Model):
    vendor = models.ForeignKey(Vendor,
                               blank=False,
                               null=False,
                               related_name='cars',
                               on_delete=models.CASCADE)
    license_plate = models.CharField(max_length=10, blank=False, null=False)
Now we would proceed to add a Customer model, probably with an M2M field pointing to Car through a table holding rental dates or whatever. All vendors would have access only to their own cars even though all the cars sit in one table, they would share one customer table, etc.
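To make the layout concrete, here's roughly what those tables (including the M2M "through" table behind a hypothetical Rental model) would look like at the SQL level. All names are illustrative; sqlite3 is used only for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendor   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE car      (id INTEGER PRIMARY KEY,
                       vendor_id INTEGER NOT NULL REFERENCES vendor(id),
                       license_plate TEXT);
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
-- the 'through' table behind ManyToManyField(Car, through='Rental')
CREATE TABLE rental   (id INTEGER PRIMARY KEY,
                       customer_id INTEGER NOT NULL REFERENCES customer(id),
                       car_id INTEGER NOT NULL REFERENCES car(id),
                       rented_from TEXT,
                       rented_until TEXT);
""")

# every vendor sees only its own rows via the vendor_id FK,
# even though all cars share one table
conn.execute("INSERT INTO vendor VALUES (1, 'Acme'), (2, 'Globex')")
conn.execute("INSERT INTO car VALUES (1, 1, 'AA-111'), (2, 2, 'BB-222')")
acme_cars = conn.execute(
    "SELECT license_plate FROM car WHERE vendor_id = 1").fetchall()
print(acme_cars)  # [('AA-111',)]
```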
Assuming a hypothetical scenario with only a few vendors, say a dozen, but each of them having a lot of cars (in the millions), my questions are:
Would there be any benefit in splitting the Car model into multiple tables (despite Django's "one model, one table" principle, maybe there are benefits on the DB side)?
I mean splitting by vendor, i.e. each vendor getting its own car table.
I kinda think that if each car had some description, like
desc = models.CharField(max_length=500, blank=True)
then splitting could maybe simplify indexing? I'm not sure. I'd be grateful if someone could clarify this.
Anyway, even if there's no real benefit in this, let's say I decided I really need to do it: glue multiple tables to one Django model. Is this even possible? My thinking is to try SQLAlchemy and see where that goes.
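One way to glue per-vendor tables back into a single relation, without ORM tricks, is a UNION ALL view; a Django model with `managed = False` and `db_table` pointing at the view could then read it. A minimal sketch (table and view names are made up, sqlite3 used for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car_acme   (id INTEGER PRIMARY KEY, license_plate TEXT);
CREATE TABLE car_globex (id INTEGER PRIMARY KEY, license_plate TEXT);

-- one view gluing the per-vendor tables into a single relation;
-- an unmanaged Django model with db_table='car_all' could read from it
CREATE VIEW car_all AS
    SELECT id, 'acme'   AS vendor, license_plate FROM car_acme
    UNION ALL
    SELECT id, 'globex' AS vendor, license_plate FROM car_globex;
""")
conn.execute("INSERT INTO car_acme VALUES (1, 'AA-111')")
conn.execute("INSERT INTO car_globex VALUES (1, 'BB-222')")

rows = conn.execute(
    "SELECT vendor, license_plate FROM car_all ORDER BY vendor").fetchall()
print(rows)  # [('acme', 'AA-111'), ('globex', 'BB-222')]
```

Note the view is read-only here; writes would still have to be routed to the right per-vendor table.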
I'd love to see any insights or ideas on how you would approach such a problem, or links if there are articles on similar topics.
Upvotes: 0
Views: 1886
Reputation: 645
DB table partitioning needs a good design and proper queries. There is a Python package, Architect, that works with Django.
A schemaless DB is another approach for massive datasets; Uber is known to use one.
Upvotes: 1
Reputation: 1940
Depends on what you're optimizing for. If you want fast access by some unique field, like the car's VIN for example, an index is helpful. But generally speaking, the database will handle the performance optimization of your queries.
If your table gets really large (billions of records), you could look into what Instagram did with database sharding. That was a cool example of how to split a table across multiple database instances (they used PostgreSQL, but the same approach works with any relational database).
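The core trick in Instagram's scheme, as described in their engineering blog, was packing the shard number into the ID itself (roughly: 41 bits of milliseconds since a custom epoch, 13 bits of logical shard, 10 bits of a per-shard sequence), so any record's shard can be recovered from its ID alone. A rough sketch of that bit layout; the exact constants are assumptions for illustration:

```python
EPOCH_MS = 1_314_220_021_721  # a fixed custom epoch; any constant works

def make_id(now_ms: int, shard_id: int, seq: int) -> int:
    """Pack timestamp (41 bits), shard id (13 bits), sequence (10 bits)."""
    return ((now_ms - EPOCH_MS) << 23) | (shard_id << 10) | (seq % 1024)

def shard_of(record_id: int) -> int:
    """Recover the logical shard from the ID alone -- no lookup table."""
    return (record_id >> 10) & 0x1FFF

rid = make_id(now_ms=1_400_000_000_000, shard_id=1341, seq=7)
print(shard_of(rid))  # 1341
```

IDs built this way also sort roughly by creation time, which keeps index inserts mostly append-only.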
Upvotes: 1