icki
icki

Reputation: 51

How to create read replicas from multiple postgres databases into a single database?

I'd like to preface this by saying I'm not a DBA, so sorry for any gaps in technical knowledge.

I am working within a microservices architecture, where we have about a dozen or applications, each supported by its Postgres database instance (which is in RDS, if that helps). Each of the microservices' databases contains a few tables. It's safe to assume that there's no naming conflicts across any of the schemas/tables, and that there's no sharding of any data across the databases.

One of the issues we keep running into is wanting to analyze/join data across the databases. Right now, we're relying on a 3rd Party tool that caches our data and makes it possible to query across multiple database sources (via the shared cache).

Is it possible to create read-replicas of the schemas/tables from all of our production databases and have them available to query in a single database?

Are there any other ways to configure Postgres or RDS to make joining across our databases possible?

Upvotes: 5

Views: 3435

Answers (1)

user330315
user330315

Reputation:

Is it possible to create read-replicas of the schemas/tables from all of our production databases and have them available to query in a single database?

Yes, that's possible and it's actually quite easy.

Setup one Postgres server that acts as the master.

For each remote server, create a foreign server then you then use to create a foreign table that makes the data accessible from the master server.

If you have multiple tables in multiple server that should be viewed as a single table in the master, you can setup inheritance to make all those tables appear like one. If you can define a "sharding" key that identifies a distinct attribute between those server, you can even make Postgres request the data only from the specific server.

All foreign tables can be joined as if they were local tables. Depending on the kind of query, some (or a lot) of the filter and join criteria can even be pushed down to the remote server to distribute the work.

As the Postgres Foreign Data Wrapper is writeable, you can even update the remote tables from the master server.

If the remote access and joins is too slow, you can create materialized views based on the remote tables to create a local copy of the data. This however means that it's not a real time copy and you have to manage the regular refresh of the tables.

Other (more complicated) options are the BDR project or pglogical. It seems that logical replication will be built into the next Postgres version (to be released a the end of this year).

Or you could use a distributed, shared-nothing system like Postgres-XL (which probably is the most complicated system to setup and maintain)

Upvotes: 7

Related Questions