The PostgreSQL Global Development Group (PGDG) has released version 12 of the popular free PostgreSQL database. As our article for Beta 4 has already indicated, a number of new features, improvements and optimizations have been incorporated into the release. These include among others:
btree-Indexes, the default index type in PostgreSQL, has experienced some optimizations in PostgreSQL 12.
btree indexes used to store duplicates (multiple entries with the same key values) in an unsorted order. This has resulted in suboptimal use of physical representation in these indexes. An optimization now stores these multiple key values in the same order as they are physically stored in the table. This improves disk space utilization and the effort required to manage corresponding btree type indexes. In addition, indexes with multiple indexed columns use an improved physical representation so that their storage utilization is also improved. To take advantage of this in PostgreSQL 12, however, if they were upgraded to the new version using
pg_upgrade via a binary upgrade, these indexes must be recreated or re-indexed.
Insert operations in btree indexes are also accelerated by improved locking.
credativ has contributed an extension for pg_checksums that allows to enable or disable block checksums in stopped PostgreSQL instances. Previously, this could only be done by recreating the physical data representation of the cluster using
pg_checksums now has the option to display a status history on the console with the
--progress parameter. The corresponding code contributions come from the colleagues Michael Banck and Bernd Helmle.
Up to and including PostgreSQL 11, the PostgreSQL Optimizer was unable to optimize common table expressions (also called CTE or WITH queries). If such an expression was used in a query, the CTE was always evaluated and materialized first before the rest of the query was processed. This resulted in expensive execution plans for more complex CTE expressions. The following generic example illustrates this. A join is given with a CTE expression that filters all even numbers from a numeric column:
WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id);
In PostgreSQL 11, using a CTE always leads to a CTE scan that materializes the CTE expression first:
EXPLAIN (ANALYZE, BUFFERS) WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id) ; QUERY PLAN ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Aggregate (cost=2231.12..2231.14 rows=1 width=8) (actual time=48.684..48.684 rows=1 loops=1) Buffers: shared hit=488 CTE t_cte -> Seq Scan on foo (cost=0.00..1943.00 rows=500 width=4) (actual time=0.055..17.146 rows=50000 loops=1) Filter: ((id % 2) = 0) Rows Removed by Filter: 50000 Buffers: shared hit=443 -> Hash Join (cost=270.00..286.88 rows=500 width=0) (actual time=7.297..47.966 rows=5000 loops=1) Hash Cond: (t_cte.id = bar.id) Buffers: shared hit=488 -> CTE Scan on t_cte (cost=0.00..10.00 rows=500 width=4) (actual time=0.063..31.158 rows=50000 loops=1) Buffers: shared hit=443 -> Hash (cost=145.00..145.00 rows=10000 width=4) (actual time=7.191..7.192 rows=10000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 480kB Buffers: shared hit=45 -> Seq Scan on bar (cost=0.00..145.00 rows=10000 width=4) (actual time=0.029..3.031 rows=10000 loops=1) Buffers: shared hit=45 Planning Time: 0.832 ms Execution Time: 50.562 ms (19 rows)
This plan first materializes the
CTE with a
sequential scan with a corresponding filter
(id % 2 = 0). Here no functional index is used, therefore this scan is correspondingly more expensive. Then the result of the
CTE is linked to the table
Hash Join with the corresponding
Join condition. With PostgreSQL 12, the optimizer now has the ability to inline these CTE expressions without prior materialization. The underlying optimized plan in PostgreSQL 12 will look like this:
EXPLAIN (ANALYZE, BUFFERS) WITH t_cte AS (SELECT id FROM foo WHERE id % 2 = 0) SELECT COUNT(*) FROM t_cte JOIN bar USING(id) ; QUERY PLAN ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Aggregate (cost=706.43..706.44 rows=1 width=8) (actual time=9.203..9.203 rows=1 loops=1) Buffers: shared hit=148 -> Merge Join (cost=0.71..706.30 rows=50 width=0) (actual time=0.099..8.771 rows=5000 loops=1) Merge Cond: (foo.id = bar.id) Buffers: shared hit=148 -> Index Only Scan using foo_id_idx on foo (cost=0.29..3550.29 rows=500 width=4) (actual time=0.053..3.490 rows=5001 loops=1) Filter: ((id % 2) = 0) Rows Removed by Filter: 5001 Heap Fetches: 10002 Buffers: shared hit=74 -> Index Only Scan using bar_id_idx on bar (cost=0.29..318.29 rows=10000 width=4) (actual time=0.038..3.186 rows=10000 loops=1) Heap Fetches: 10000 Buffers: shared hit=74 Planning Time: 0.646 ms Execution Time: 9.268 ms (15 rows)
The advantage of this method is that there is no initial materialization of the CTE expression. Instead, the query is executed directly with a
Join. This works for all non-recursive CTE expressions without side effects (for example, CTEs with write statements) and those that are referenced only once per query. The old behavior of the optimizer can be forced with the
WITH ... AS MATERIALIZED ... directive.
Generated Columns in PostgreSQL 12 are materialized columns, which calculate a result based on expressions using existing column values. These are stored with the corresponding result values in the tuple. The advantage is that there is no need to create triggers for subsequent calculation of column values. The following simple example illustrates the new functionality using a price table with net and gross prices:
CREATE TABLE preise(netto numeric, brutto numeric GENERATED ALWAYS AS (netto * 1.19) STORED); INSERT INTO preise VALUES(17.30); INSERT INTO preise VALUES(225); INSERT INTO preise VALUES(247); INSERT INTO preise VALUES(19.15); SELECT * FROM preise; netto │ brutto ───────┼───────── 17.30 │ 20.5870 225 │ 267.75 247 │ 293.93 19.15 │ 22.7885 (4 rows)
brutto is calculated directly from the net price. The keyword
STORED is mandatory. Of course, indexes can also be created on
Generated Columns, but they cannot be part of a primary key. Furthermore, the SQL expression must be unique, i.e. it must return the same result even if the input quantity is the same. Columns declared as
Generated Columns cannot be used explicitly in
UPDATE operations. If a column list is absolutely necessary, the corresponding value can be indirectly referenced with the keyword
Explicit OID columns have historically been a way to create unique column values so that a table row can be uniquely identified database-wide. However, for a long time PostgreSQL has only created these explicitly and considered their basic functionality obsolete. With PostgreSQL the possibility to create such columns explicitly is now finally abolished. This means that it will no longer be possible to specify the
WITH OIDS directive for tables. System tables that have always referenced OID objects uniquely will now return OID values without explicitly specifying OID columns in the result set. Especially older software, which handled catalog queries carelessly, could get problems with a double column output.
Up to and including PostgreSQL 11, database recovery and streaming replication instances were configured via a separate configuration file
With PostgreSQL 12, all configuration work done there now migrates to
recovery.conf file is no longer required. PostgreSQL 12 refuses to start as soon as this file exists. Whether recovery or streaming standby is desired is now decided either by a
recovery.signal file (for recovery) or by a
standby.signal file (for standby systems). The latter has priority if both files are present. The old parameter
standby_mode, which controlled this behavior since then, has been removed.
For automatic deployments of high-availability systems, this means a major change. However, it is now also possible to perform corresponding configuration work almost completely using the
ALTER SYSTEM command.
With PostgreSQL 12 there is now a way to re-create indexes with as few locks as possible. This greatly simplifies one of the most common maintenance tasks in very write-intensive databases. Previously, a combination of
CREATE INDEX CONCURRENTLY and
DROP INDEX CONCURRENTLY had to be used. In doing so, it was also necessary to ensure that index names were reassigned accordingly, if required.
The release notes give an even more detailed overview of all new features and above all incompatibilities with previous PostgreSQL versions.
About the author