user4237042
user4237042

Reputation: 145

What is PGDATA in PostgreSQL

New to PostgreSQL, experienced in Oracle. Trying to understand what PGDATA is.

Reading about it, it seems to contain what is necessary to create the initial Database cluster.

I did and install and I am converting some data from Oracle. The path the installation put PGDATA by default is in a file system that is fairly small. So I am looking to move it into a bigger file system.

So my questions are:

  1. Is this akin to the Oracle Data Dictionary?
  2. Do I need to put all my application data in that path or is it best practice not to?
  3. I know I can just take the default and create a new database in the bigger file system, but should I?

Thanks

Upvotes: 14

Views: 34711

Answers (2)

richyen
richyen

Reputation: 9968

I cannot speak authoritatively regarding $PGDATA's relationship to the Oracle Data Directory, but would barely be an oversimplification to say that $PGDATA contains everything pertaining to a specific PostgreSQL cluster. Things like database logs can live in /var/log or syslog, and in some Linux distros, postgresql.conf and pg_hba.conf live outside of $PGDATA, but for the most part, all of the components that keep a database up and running live in $PGDATA. This includes (and not limited to):

  • The tables and the data themselves
  • Indexes
  • Constraints
  • Users/roles
  • Materialized views
  • Namespaces
  • Tablespaces (at least the symlinks to point to where the tablespaces are stored)
  • Any Write-Ahead Log (WAL) information (similar to Oracle's REDO logs)
  • ... and more

To answer your second question about whether you need to store all of the data in $PGDATA, the short answer is "no" -- there are other options of data storage, and you can shape the storage according to the needs of your business. For example, if you do not want to put all your eggs in one basket, you can split the data up between different machines and link them together with Foreign Data Wrappers (FDWs) -- Postgres plays nicely with other database engines (like Mysql, MongoDB, Redis, etc.).

If you wish to keep your data all on one host, but not all in the $PGDATA folder, you can create tablespaces and place those tablespaces in different places on the same host -- so long as Postgres knows the locations of those tablespaces, you will be able to spread the data out. In some cases, this may be beneficial in terms of performance, but there is, however a cost to additional complexity, and you as the DBA will need to bear that responsibility of determining how much complexity you want to shoulder.

Finally, if you are in need of moving $PGDATA into a different partition, it is possible to simply shut down the database cluster (pg_ctl stop or systemctl stop postgresql), move the directory, and start up again. if you cannot afford a downtime, there are other methods of copying/moving the directory using replication, Point-In-Time Recovery (PITR), pg_basebackup, and so on.

Upvotes: 0

Ihor Romanchenko
Ihor Romanchenko

Reputation: 28541

One major difference between oracle and postgres is documentation quality. If you plan to work with postgres - make some time (a day or two) to read it.

You can find full description for postgres files here: http://www.postgresql.org/docs/current/static/storage-file-layout.html

Generally you do not need to move your whole DB if you do not want. You can create a tablespace outside postgres data directory (via linux symlink).

Still you may want to move your whole DB to a bigger (faster?) FS. PGDATA directory contains Write Ahead Log (similar to Oracle REDO log). Slow operations on WAL mean slow DB in general. The size of the WAL files depends on DB setting and activity - you may want to be sure there is enough free space for it.

Upvotes: 11

Related Questions