Introduction to PostgreSQL

PostgreSQL is a general purpose and object-relational database management system, the most advanced open source database system. It was developed based on POSTGRES 4.2 at Berkeley Computer Science Department, University of California. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness. PostgreSQL runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and Windows.

PostgreSQL is free and open source software. It’s source code is available under PostgreSQL license, a liberal open source license.

What Makes PostgreSQL Stand Out?

  • PostgreSQL is the first database management system that implements multi-version concurrency control (MVCC) feature, even before Oracle. The MVCC feature is known as snapshot isolation in Oracle.
  • PostgreSQL is a general-purpose object-relational database management system. It allows us to add custom functions developed using different programming languages such as C/C++, Java, etc.
  • PostgreSQL is designed to be extensible. In PostgreSQL, one can define their own data types, index types, functional languages, etc.

Procedural Languages Support 

PostgreSQL supports four standard procedural languages, which allows the users to write their own code in any of the languages and it can be executed by a PostgreSQL database server. These procedural languages are - PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. Besides, other non-standard procedural languages like PL/PHP, PL/V8, PL/Ruby, PL/Java, etc., are also supported.

Key Features of PostgreSQL

PostgreSQL supports a large part of the SQL standard and offers many modern features including the following −

  • Complex SQL queries

  • SQL Sub-selects

  • Foreign keys

  • Trigger

  • Views

  • Transactions

  • Multiversion concurrency control (MVCC)

  • Streaming Replication (as of 9.0)

  • Hot Standby (as of 9.0)

Specifics of PostgreSQL versions and platforms supported on Delphix are located in the PostgreSQL Compatibility Matrix. 

PostgreSQL Replication Methodology

Delphix uses PostgreSQL streaming replication protocol is used to achieve data replication between Source and Staging database.

Log shipping is a replication technique used by many database management systems. The Master records changes in its transaction log (WAL), and then the log data is shipped from the Master to the Standby, where the log is replayed.

Streaming replication:

  • Use similar database versions on all systems.

  • Configure the systems identically, as far as possible.

  • Provides Integrated security.

  • Reduces replication delay.

Architecture for Streaming Replication  

Below is the high-level architecture diagram showing the data replication between the source and the staging hosts.

Transaction logs - WAL (Write Ahead Logs) are replayed from source to the staging environment to maintain data sync between the two.

  1. The end user application connecting the PostgreSQL source database may perform read/write queries on the database.

  2. The database changes are recorded as WAL segments in the PostgreSQL database under the directory pg_xlog.

  3. Set up the staging PostgreSQL server as Standby Node.

  4. Configure replication security by creating a replication user and specifying the authentication protocol.

  5. Initiate a base backup on secondary.

  6. Configure postgresql.conf file as per the staging environment.

  7. Start the standby server.

The WAL receiver process at the secondary continuously listens for any incoming WAL segments from primary in its receive queue and applies the same on the staging database.

WAL segments are archived when a segment size reaches 16 MB (default).