Reputation: 7742
I'm trying to learn about PrestoDB and I have a MySQL database. It is a relatively small database that I'm using to understand how Presto works with JDBC connections.
I have already used Presto to connect to a Hive metastore, and I understand how it works with ORC files: the Presto workers read the data and run the query as needed.
This image is really clear for me:
It is really clear how the parallelization will work with this model.
But with a MySQL connection, how does Presto parallelize reads from a relational database? Does Presto load the tables into the workers and then run the query?
Or does Presto just run the query in MySQL and expose an interface to the result?
Upvotes: 3
Views: 1113
Reputation: 711
Presto creates a single JDBC connection and pulls data from MySQL in a single-threaded fashion.
In the future, Presto will be able to parallelize pulling data from MySQL if the data is partitioned (creating a separate JDBC connection for each partition).
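To make the difference concrete, here is a rough sketch of the two access patterns. It is not Presto code: it uses Python's built-in `sqlite3` module as a stand-in for MySQL, and the `orders` table, the function names, and the id-range partitioning scheme are all assumptions made up for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_db(path, n_rows):
    """Create a small demo table (hypothetical schema, stand-in for a MySQL table)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(i, i * 10) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    conn.close()


def single_connection_scan(path):
    """One connection, one sequential read of the whole table."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT id, amount FROM orders").fetchall()
    finally:
        conn.close()


def partitioned_scan(path, bounds):
    """One connection per partition, each reading a disjoint id range.

    `bounds` is a list of (low_inclusive, high_exclusive) id ranges.
    """

    def scan_range(lo_hi):
        lo, hi = lo_hi
        # Each partition opens its own connection inside its worker thread.
        conn = sqlite3.connect(path)
        try:
            return conn.execute(
                "SELECT id, amount FROM orders WHERE id >= ? AND id < ?",
                (lo, hi),
            ).fetchall()
        finally:
            conn.close()

    with ThreadPoolExecutor(max_workers=len(bounds)) as pool:
        parts = list(pool.map(scan_range, bounds))
    # Flatten the per-partition results into a single row list.
    return [row for part in parts for row in part]
```

For example, after `make_db("demo.db", 100)`, calling `partitioned_scan("demo.db", [(1, 51), (51, 101)])` should return the same rows as `single_connection_scan("demo.db")`, just fetched over two connections instead of one. The partitioned version only helps when the table can be split on some column into ranges that do not overlap.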
Upvotes: 6