Distributing SQL queries over a PostgreSQL cluster

Question

In order to improve the scalability of my database, I would like to have multiple instances of PostgreSQL distributed over multiple machines. (Horizontal sharding)

Most of the operations executed on my database are INSERTs - there are a few occasional SELECTs as well, however the main emphasis is still on the INSERTs and thus the database is growing huge quickly.

For example: imagine we have 3 tables named user, product and activity(user_id, product_id, activity(viewed, bought...), timestamp).

Folks here on the business side use pgAdmin to query the (for now) unique database. What would be a great tool that would allow to have the SQL statements be distributed over the cluster and come back with a reduced set of results without having to to use the UNION keyword in our SELECT queries?

For example, consider the following SQL SELECT statement:

select u.name FROM user AS u INNER JOIN activity AS a ON u.id = a.id INNER JOIN 
product AS p ON p.id = a.id WHERE product.name='blabla';

Ideally nothing would change for the folks using pgAdmin, but under the hood the request would get chopped into smaller "pieces" and distributed to all of the servers in the cluster.

While searching the web, I came across Skype's PL/Proxy but is this project still active? (Same question for Postgres-XC.)

Any help will be much appreciated, thank you.

Distributing SQL queries over a PostgreSQL cluster

Answers (1)

Related Questions