
PostgreSQL FAQ & Answers

6,000 expert PostgreSQL answers researched from official documentation. Every answer cites authoritative sources you can verify.

Jump to section:

Functions - System Administration (80) Monitoring > Database Statistics (66) Triggers (66) Functions - System Information (63) Queries and SELECT (62) Architecture and Concepts (60) Server Configuration > Error Reporting and Logging (59) Connection Pooling > PgBouncer Configuration (57) Routine Maintenance > VACUUM (54) Data Manipulation (DML) (54) Server Configuration > Write-Ahead Log Settings (54) PostGIS > Spatial Functions (52) Monitoring > Replication Monitoring (51) Connection Pooling > PgBouncer (50) Functions - Date/Time (50) Database Roles and Privileges (50) Logical Replication > Subscriptions (49) Write-Ahead Log (WAL) (48) Client Authentication > pg_hba.conf Configuration (47) Functions - Pattern Matching (47) Streaming Replication > Replication Slots (46) Foreign Data Wrappers (46) Functions - String (45) Point-in-Time Recovery (45) Full Text Search > Text Search Functions (44) Server Configuration > Connections and Authentication (44) Data Definition (DDL) > Sequences (44) Routine Maintenance > Bloat Management (43) Logical Replication > Publications (43) Query Processing (43) Server Configuration > Resource Consumption - Memory (43) Backup and Restore > pg_basebackup (42) Streaming Replication > Failover and Switchover (41) SSL and Encryption (40) PostGIS > Geometry Types (40) Routine Maintenance > Autovacuum (40) Functions - Aggregate (40) Concurrency Control > Table-Level Locks (39) Server Configuration > Query Planning Settings (38) Database Administration (38) Performance Tips > EXPLAIN and EXPLAIN ANALYZE (37) Backup and Restore > Continuous Archiving (37) Server Configuration > Replication Settings (37) Client Interfaces (37) Streaming Replication > Primary Configuration (37) Indexes > GiST Indexes (36) Information Schema (36) Client Authentication > External Authentication (36) Transactions > Transaction Basics (36) Indexes > BRIN Indexes (35) PL/pgSQL > Cursors (35) Data Types - Range (35) Localization (35) Backup and Restore > pg_dump (35) High Availability (35) PostGIS > Spatial Indexing (35) Connection Pooling > pgpool-II (34) Concurrency Control > Transaction Isolation Levels (34) Row-Level Security (34) Data Types - JSON > JSONPath (34) Internals > Page Structure (33) Connection Pooling > Connection Pooling Patterns (33) Data Types - Geometric (33) SQL Syntax (33) Installation and Setup (33) Indexes > B-tree Indexes (33) Data Types - JSON > JSON Functions (32) Monitoring > pg_stat_statements (32) Stored Procedures (32) Data Definition (DDL) > Constraints (31) Data Definition (DDL) > Schemas and Namespaces (31) Extensions > pg_partman (31) Large Objects (31) Data Types - Pseudo-Types (31) Data Types - Arrays (31) Functions - Mathematical (30) Data Types - Composite (30) Internals > Tuple Structure (30) Data Types - Network Address (30) Data Types - Date/Time (30) Extensions > pgcrypto (30) additional (30) PL/pgSQL > Exception Handling (30) Data Types - JSON > JSON Operators (30) System Catalogs (30) Performance Tips > Query Statistics (29) PL/pgSQL > Dynamic SQL (29) Monitoring > Table Statistics (29) json_operations (29) Parallel Query (29) Functions - Sequence (29) Data Types - Domain (29) Internals > TOAST (29) Transactions > Isolation Level Behaviors (29) Server Configuration > Vacuuming Settings (29) transactions_concurrency (28) security (28) Partitioning > Partition Management (28) Streaming Replication > Standby Configuration (28) Data Types - Object Identifiers (28) Routine Maintenance > REINDEX (28) Server Configuration > Lock Management (28) 
Extensions > hstore (28) Data Types - JSON > JSON vs JSONB (28) User-Defined Functions (27) Performance Tips > Join Strategies (27) Indexes > GIN Indexes (27) indexing (27) Type Conversion (27) Extensions > pgAudit (27) Extensions > Extension Management (27) Routine Maintenance > ANALYZE (26) Indexes > Unique Indexes (26) Concurrency Control > Explicit Row Locking (26) Server Configuration > Resource Consumption - Disk and Kernel (26) Data Types - Numeric (26) Data Types - Enumerated (26) Data Types - XML (26) Full Text Search > tsvector and tsquery (26) Performance Tips > Planner Statistics (26) Client Authentication > Password Authentication (26) Data Types - Boolean (26) Foreign Key Performance > FK Lock Contention (25) Concurrency Control > MVCC Implementation (25) Partitioning > Range Partitioning (25) Full Text Search > Text Search Configuration (25) Extensions > pg_stat_statements (24) Data Types - UUID (24) Logical Replication > Conflict Handling (24) integration_patterns (24) PostgreSQL Limits (24) Performance Tips > Scan Types (24) Full Text Search > GIN Indexes for FTS (24) query_optimization (24) Streaming Replication > Synchronous Replication (24) performance_tuning (24) fulltext_search (24) Data Definition (DDL) > Views (24) Functions - Array (23) Transactions > Savepoints (23) Backup and Restore > pg_restore (23) JIT Compilation (22) replication_ha (22) Data Types - Character (22) Transactions > Two-Phase Commit (22) Data Definition (DDL) > Tables and Columns (22) Extensions > uuid-ossp (22) Partitioning > Partition Pruning (22) Joins and Subqueries (22) PostGIS > Geography Type (22) Internals > Visibility Map (22) Partitioning > List Partitioning (22) Event Triggers (21) Indexes > Hash Indexes (21) Monitoring > pg_stat_activity (21) Partitioning > Hash Partitioning (21) core_concepts (21) Client Authentication > Certificate Authentication (21) Window Functions > Aggregate Window Functions (21) Monitoring > Index Statistics (21) Internals > Free Space Map (20) Rules (20) Full Text Search > Ranking and Highlighting (20) Window Functions > Offset Functions (20) Logical Replication > Replication Identity (20) Indexes > Indexes on Expressions (19) common_errors (19) backup_recovery (19) PL/pgSQL > Control Structures (19) Partitioning > Partition-wise Operations (19) Concurrency Control > Deadlock Detection (19) Extensions > pg_trgm (19) PL/pgSQL > Variables and Types (19) Indexes > Covering Indexes (18) Indexes > Partial Indexes (18) Concurrency Control > Advisory Locks (18) Foreign Key Performance > FK Index Requirements (18) Indexes > Multicolumn Indexes (18) Indexes > SP-GiST Indexes (17) Window Functions > Ranking Functions (17) Data Types - Bit String (17) partitioning (17) Data Types - Binary (17) Data Types - Monetary (17) Foreign Key Performance > Cascading Actions Performance (17) PostGIS > Coordinate Systems (17) Window Functions > Window Frame Specifications (16) Streaming Replication > Cascading Replication (16) extensions (15) Common Table Expressions (15) Foreign Key Performance > FK and Bulk Operations (15) Server Configuration > File Locations (15) Logical Replication > Row Filters (14) Logical Replication > Column Lists (14) Data Types - pg_lsn (13) Functions - Conditional (13) Extensions > pg_repack (12) ddl_operations (12) authentication (2) replication (2) data_loading (1) session_management (1) server_configuration (1) data_types (1) data_definition (1) foreign_data (1) vacuum_maintenance (1) administration (1) connection_pooling (1) psql_tools (1) 
monitoring (1)

Functions - System Administration

80 questions
A

PLAIN (no compression or out-of-line storage), EXTENDED (allows both compression and out-of-line storage), EXTERNAL (allows out-of-line storage but not compression), and MAIN (allows compression but not out-of-line storage).
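
A minimal sketch of inspecting and changing a column's strategy (table and column names are hypothetical):

-- 'p' = PLAIN, 'e' = EXTERNAL, 'x' = EXTENDED, 'm' = MAIN
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'documents'::regclass AND attnum > 0;

ALTER TABLE documents ALTER COLUMN body SET STORAGE EXTERNAL;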

95% confidence

Monitoring > Database Statistics

66 questions

Triggers

66 questions
A

If a view is automatically updatable, the action is handled by automatically rewriting the statement into an action on the view's base table, so the base table's statement-level triggers are the ones that are fired (not triggers on the view itself).

95% confidence

Functions - System Information

63 questions
A

session_user is the user who initiated the database connection (the 'real user'). current_user is the user identifier applicable for permission checking (the 'effective user'). Normally they are equal, but current_user can be changed with SET ROLE or during execution of SECURITY DEFINER functions.
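
A quick sketch of the difference (the role name is hypothetical):

SELECT session_user, current_user;   -- normally identical right after connecting
SET ROLE app_readonly;               -- hypothetical role
SELECT session_user, current_user;   -- session_user unchanged, current_user is now app_readonly
RESET ROLE;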

95% confidence
A

It returns a record with columns including: checkpoint_lsn, redo_lsn, redo_wal_file, timeline_id, prev_timeline_id, full_page_writes, next_xid, next_oid, next_multixact_id, next_multi_offset, oldest_xid, oldest_xid_dbid, oldest_active_xid, oldest_multi_xid, oldest_multi_dbid, oldest_commit_ts_xid, newest_commit_ts_xid, checkpoint_time.

95% confidence

Queries and SELECT

62 questions
A

NATURAL JOIN creates an implicit join based on all columns with the same name in both tables, automatically matching and combining rows. It's equivalent to a USING clause listing all common column names.
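
For example, assuming both tables share only a dept_id column, the two queries below are equivalent:

SELECT * FROM employees NATURAL JOIN departments;
SELECT * FROM employees JOIN departments USING (dept_id);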

95% confidence
A

WITH TIES returns additional rows that tie for the last position according to the ORDER BY clause. This requires ORDER BY to be specified, and may return more rows than the specified count.
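
A small sketch (PostgreSQL 13+), assuming a results(name, score) table:

SELECT name, score
FROM results
ORDER BY score DESC
FETCH FIRST 3 ROWS WITH TIES;   -- may return more than 3 rows if scores tie at the cutoff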

95% confidence
A

Without ORDER BY, you will get an unpredictable subset of the query's rows. Using different LIMIT/OFFSET values will give inconsistent results unless you enforce a predictable result ordering with ORDER BY.

95% confidence

Architecture and Concepts

60 questions
A

The four strategies are: PLAIN (prevents compression and out-of-line storage), EXTENDED (allows both compression and out-of-line storage, default for most types), EXTERNAL (out-of-line without compression), and MAIN (allows compression but not out-of-line storage except as last resort).

95% confidence

Server Configuration > Error Reporting and Logging

59 questions

Connection Pooling > PgBouncer Configuration

57 questions
A

It allows running multiple PgBouncer instances on the same host listening on the same port, with the kernel distributing connections automatically. This enables PgBouncer to use more CPU cores.

95% confidence

Routine Maintenance > VACUUM

54 questions

Data Manipulation (DML)

54 questions

Server Configuration > Write-Ahead Log Settings

54 questions
A

Torn page writes (also called fractured blocks or partial writes), which occur when a page write is only partially completed during an operating system crash, leaving an on-disk page with a mix of old and new data.

95% confidence
A

It logs a message like 'checkpoints are occurring too frequently (X seconds apart)' with a hint to 'Consider increasing the configuration parameter max_wal_size' when checkpoints occur closer together than the checkpoint_warning threshold.
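
One way to inspect and react to this, sketched with ALTER SYSTEM (the 4GB value is only an example):

SHOW checkpoint_warning;                 -- 30s by default
ALTER SYSTEM SET max_wal_size = '4GB';   -- example value; raise it if the warning appears regularly
SELECT pg_reload_conf();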

95% confidence

PostGIS > Spatial Functions

52 questions
A

ST_IsValidReason returns text stating if a geometry is valid, or a reason for invalidity. In addition to the reason (such as self-intersection), the location of the invalidity (coordinate) is also returned.

95% confidence
A

ST_LineMerge sews together the constituent line work of a MULTILINESTRING to form LineString(s). However, lines are not merged across intersections with degree > 2, and the input MultiLineString must not have gaps.

95% confidence

Monitoring > Replication Monitoring

51 questions
A

It optionally specifies an existing replication slot to be used when connecting to the sending server via streaming replication to control resource removal on the upstream node. This ensures the primary doesn't remove WAL segments until the standby processes them.

95% confidence
A

pid (integer), status (text), receive_start_lsn (pg_lsn), receive_start_tli (integer), written_lsn (pg_lsn), flushed_lsn (pg_lsn), received_tli (integer), last_msg_send_time (timestamptz), last_msg_receipt_time (timestamptz), latest_end_lsn (pg_lsn), latest_end_time (timestamptz), slot_name (text), sender_host (text), sender_port (integer), and conninfo (text).

95% confidence

Connection Pooling > PgBouncer

50 questions
A

KILL immediately drops all client and server connections to the named database. New client connections to a killed database will wait until RESUME is called. This is useful in emergency failover scenarios.

95% confidence
A

RECONNECT closes each open server connection after it is released (according to the pooling mode), even if its lifetime is not up. It is useful when server connection setup has changed, for example to perform a gradual switchover to a new server.

95% confidence
A

SUSPEND flushes all socket buffers and stops listening for data on them. The command will not return before all buffers are empty. It is used when rebooting PgBouncer online. New client connections to a suspended database will wait until RESUME is called.

95% confidence
A

PAUSE tries to disconnect from all servers, first waiting for all queries to complete. The command will not return before all queries are finished. New client connections to a paused database will wait until RESUME is called.

95% confidence

Functions - Date/Time

50 questions

Database Roles and Privileges

50 questions

Logical Replication > Subscriptions

49 questions
A

Streaming is triggered when the total amount of changes decoded from the WAL (for all in-progress transactions) exceeds the limit defined by logical_decoding_work_mem setting. Changes in large transactions (greater than logical_decoding_work_mem) are applied directly via one of the parallel apply workers, if available.

95% confidence
A

Only persistent slots that have attained synced state as true on the standby before failover can be used for logical replication after failover. To resume logical replication after failover from synced logical slots, the subscription's conninfo must be altered to point to the new primary server.

95% confidence

Write-Ahead Log (WAL)

48 questions
A

The default value is -1, which automatically selects a size equal to 1/32nd (about 3%) of shared_buffers, but not less than 64kB nor more than the size of one WAL segment (typically 16MB).

95% confidence
A

No. When both wal_init_zero and wal_recycle are enabled (their default state), recycled files are not filled with zeroes because they are not considered 'new' files. wal_init_zero only applies to newly created WAL files.

95% confidence

Client Authentication > pg_hba.conf Configuration

47 questions

Functions - Pattern Matching

47 questions

Streaming Replication > Replication Slots

46 questions

Foreign Data Wrappers

46 questions
A

The extensions option is a comma-separated list of names of PostgreSQL extensions that are installed, in compatible versions, on both the local and remote servers. Functions and operators that are immutable and belong to a listed extension will be considered shippable to the remote server.

95% confidence

Functions - String

45 questions
A

unistr() evaluates escaped Unicode characters in a string. It supports the formats \XXXX (4 hex digits), \+XXXXXX (6 hex digits), \uXXXX (4 hex digits), and \UXXXXXXXX (8 hex digits). Two backslashes represent a literal backslash.
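
Examples (the first follows the documentation's own sample):

SELECT unistr('d\0061t\+000061');   -- 'data'
SELECT unistr('a\\b');              -- 'a\b' (two backslashes give a literal backslash)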

95% confidence

Point-in-Time Recovery

45 questions

Full Text Search > Text Search Functions

44 questions
A

ts_parse has two forms: ts_parse(parser_name text, document text, OUT tokid integer, OUT token text) returns setof record, and ts_parse(parser_oid oid, document text, OUT tokid integer, OUT token text) returns setof record.
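
A minimal call using the default parser; each returned row is a (tokid, token) pair:

SELECT * FROM ts_parse('default', 'foo - bar');
-- one row per token: the words 'foo' and 'bar' plus the blank/punctuation tokens between them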

95% confidence

Server Configuration > Connections and Authentication

44 questions

Data Definition (DDL) > Sequences

44 questions

Routine Maintenance > Bloat Management

43 questions
A

When triggered, any cost-based delay that is in effect will no longer be applied, non-essential maintenance tasks (such as index vacuuming) are bypassed, and any Buffer Access Strategy in use will be disabled, allowing VACUUM to use all of shared buffers.

95% confidence

Logical Replication > Publications

43 questions

Query Processing

43 questions
A

The default value of plan_cache_mode is 'auto'. This allows PostgreSQL to automatically choose whether to use a generic or custom plan for prepared statements. Other values are 'force_custom_plan' and 'force_generic_plan'.
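
A sketch of forcing the planner's behavior for a prepared statement (table name is hypothetical):

SHOW plan_cache_mode;                          -- auto
PREPARE get_user(int) AS SELECT * FROM users WHERE id = $1;
SET plan_cache_mode = 'force_generic_plan';    -- always use the generic plan
EXECUTE get_user(42);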

95% confidence
A

CREATE VIEW creates a relation and an entry in pg_rewrite that tells there is a rewrite rule that must be applied whenever the relation is referenced in a query. The action of the rule is one query tree that is a copy of the SELECT statement in the view creation command. Views are implemented using the rule system.

95% confidence
A

PostgreSQL processes queries through five stages, in order: 1) Parser - generates a parse tree from the SQL text, 2) Analyzer - performs semantic analysis and generates a query tree, 3) Rewriter - transforms the query tree using rules from the rule system, 4) Planner - generates the optimal execution plan, 5) Executor - executes the query by accessing tables and indexes.

Sources
95% confidence
A

Nested loop join is suitable for small tables or when one table is much smaller than the other. It's the only join method that doesn't need to scan the entire set (when index access is available), making it ideal for OLTP systems with short queries returning few rows. It's also the only option when there is no join condition with an = operator.

95% confidence
A

The default value of cursor_tuple_fraction is 0.1. This sets the planner's estimate of the fraction of a cursor's rows that will be retrieved, biasing the planner toward fast-start plans that retrieve the first 10% of rows quickly.

95% confidence
A

The default value of geqo_effort is 5. This parameter ranges from 1 to 10 and controls the trade-off between planning time and query plan quality in GEQO, with larger values increasing planning time but improving plan quality.

95% confidence
A

Merge join is best for joining sorted tables. It works well with both short OLTP queries and long OLAP ones, has linear complexity (both sets scanned only once), needs little memory, and can start output immediately. The only requirement is that the data sets must be pre-sorted.

95% confidence

Server Configuration > Resource Consumption - Memory

43 questions

Backup and Restore > pg_basebackup

42 questions
A

The -R option creates a standby.signal file in the output directory and appends connection settings to postgresql.auto.conf, automating the setup of a standby server with the necessary replication configuration.

95% confidence
A

Yes, when the backup is in tar format, the target directory may be specified as '-' (dash), causing the tar file to be written to stdout. This is only possible if the cluster has no additional tablespaces and WAL streaming is not used.

95% confidence
A

To be effective, olddir must exactly match the path specification of the tablespace as it is defined on the source server. The path is not canonicalized, so if a tablespace was created with '/tmp//foo-old', you must map '/tmp//foo-old' exactly, not '/tmp/foo-old'.

95% confidence

Streaming Replication > Failover and Switchover

41 questions

SSL and Encryption

40 questions
A

It enforces encryption, verifies the certificate chain, and also checks that the server hostname matches the name stored in the server certificate. This is recommended for most security-sensitive environments.

95% confidence

PostGIS > Geometry Types

40 questions
A

A LinearRing is a closed, simple LineString (starts and ends on the same point and does not self-intersect) that forms the boundaries of polygons.

95% confidence

Routine Maintenance > Autovacuum

40 questions
A

When failsafe is triggered: (1) any cost-based delay is no longer applied, (2) non-essential maintenance tasks like index vacuuming are bypassed, and (3) any Buffer Access Strategy is disabled, allowing VACUUM to use all shared buffers.

95% confidence

Functions - Aggregate

40 questions
A

Yes. count(*) counts all input rows including those with NULL values in any column. It does not check individual columns. In contrast, count(column_name) counts only rows where the specified column is not NULL.
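
For example, with a hypothetical orders table whose shipped_at column may be NULL:

SELECT count(*)          AS all_rows,
       count(shipped_at) AS rows_with_shipped_at   -- NULLs not counted
FROM orders;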

95% confidence

Concurrency Control > Table-Level Locks

39 questions
A

Yes, when an index scan is used, CLUSTER creates a temporary copy of the table containing the data in index order, and temporary copies of each index. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes.

95% confidence
A

lock_timeout controls the maximum time waiting to acquire a lock on a database object, while statement_timeout limits the total execution time of a query including both lock acquisition and actual execution. Unlike statement_timeout, lock_timeout can only occur while waiting for locks.
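
A sketch of setting both per session (the ALTER TABLE is just an example of a lock-hungry statement; table name is hypothetical):

SET lock_timeout = '2s';        -- give up if the lock cannot be acquired within 2 seconds
SET statement_timeout = '30s';  -- cap total runtime, including execution
ALTER TABLE accounts ADD COLUMN note text;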

95% confidence
A

deadlock_timeout is the amount of time to wait on a lock before checking to see if there is a deadlock condition. When log_lock_waits is enabled, it also determines the amount of time to wait before a log message is issued about the lock wait.

95% confidence
A

VACUUM (without FULL) acquires a SHARE UPDATE EXCLUSIVE lock mode. However, when the TRUNCATE option is enabled (which it is by default), VACUUM may also briefly acquire an ACCESS EXCLUSIVE lock during the truncation phase.

95% confidence

Server Configuration > Query Planning Settings

38 questions

Database Administration

38 questions
A

When the number of active concurrent connections is at least max_connections minus superuser_reserved_connections, new connections will be accepted only for superusers. With default settings (max_connections=100, superuser_reserved_connections=3), regular users can use up to 97 connections.

95% confidence

Performance Tips > EXPLAIN and EXPLAIN ANALYZE

37 questions

Backup and Restore > Continuous Archiving

37 questions
A

The valid values are: 'minimal' (removes all logging except crash recovery), 'replica' (adds logging for WAL archiving and replication), and 'logical' (adds information for logical decoding). Each level includes information logged at lower levels.

95% confidence
A

PostgreSQL 12 eliminated the recovery.conf file. The server will not start if recovery.conf is present. All configuration parameters moved to postgresql.conf, and two new signal files (recovery.signal and standby.signal) replaced recovery.conf for triggering recovery modes.

95% confidence

Server Configuration > Replication Settings

37 questions
A

primary_conninfo uses a libpq connection string format. Example: 'user=postgres host=10.10.1.8 port=5432 application_name=standby_east'. For replication slot synchronization, it is necessary to also specify a valid dbname in the connection string.

95% confidence

Client Interfaces

37 questions

Streaming Replication > Primary Configuration

37 questions

Indexes > GiST Indexes

36 questions
A

The fetch method converts the compressed index representation of a data item into the original data type for index-only scans. The returned data must be an exact, non-lossy copy of the originally indexed value.

95% confidence
A

Yes, GiST compress methods can be lossy. If compress is lossy, then it's impossible for decompress to exactly reconstruct the original data. This convention allows GiST to support both lossless and lossy index structures.

95% confidence
A

GiST geometric indexes support: << (strictly left of), &< (does not extend to right of), &> (does not extend to left of), >> (strictly right of), <<| (strictly below), &<| (does not extend above), |&> (does not extend below), |>> (strictly above), @> (contains), <@ (contained by), ~= (same), and && (overlaps).

95% confidence

Information Schema

36 questions
A

When a table column, composite-type attribute, domain, function parameter, or function return value is defined to be of an array type, the respective information schema view only contains ARRAY in the data_type column. To obtain information on the element type of the array, you can join the respective view with element_types.
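
A query along the lines of the documentation's example (schema and table name are placeholders):

SELECT c.column_name, c.data_type, e.data_type AS element_type
FROM information_schema.columns c
LEFT JOIN information_schema.element_types e
  ON ((c.table_catalog, c.table_schema, c.table_name, 'TABLE', c.dtd_identifier)
    = (e.object_catalog, e.object_schema, e.object_name, e.object_type, e.collection_type_identifier))
WHERE c.table_schema = 'public' AND c.table_name = 'my_table'
ORDER BY c.ordinal_position;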

95% confidence

Client Authentication > External Authentication

36 questions

Transactions > Transaction Basics

36 questions
A

START TRANSACTION [ transaction_mode [, ...] ] where transaction_mode options include: ISOLATION LEVEL { SERIALIZABLE | REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED }, READ WRITE | READ ONLY, [ NOT ] DEFERRABLE
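
For example:

START TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY;
-- ... queries ...
COMMIT;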

95% confidence
A

INSERT, UPDATE, DELETE, MERGE, and COPY FROM if the table they would write to is not a temporary table; all CREATE, ALTER, and DROP commands; COMMENT, GRANT, REVOKE, TRUNCATE; and EXPLAIN ANALYZE and EXECUTE if the command they would execute is among those listed.

95% confidence
A

Without BEGIN, PostgreSQL operates in autocommit mode where each statement is executed in its own transaction and a commit is implicitly performed at the end of the statement (if execution was successful, otherwise a rollback is done).

95% confidence
A

PREPARE TRANSACTION is not intended for use in applications or interactive sessions. Its purpose is to allow an external transaction manager to perform atomic global transactions across multiple databases or other transactional resources. Unless you're writing a transaction manager, you probably shouldn't be using PREPARE TRANSACTION.

95% confidence

Indexes > BRIN Indexes

35 questions

PL/pgSQL > Cursors

35 questions
A

BINARY causes the cursor to return data in binary format rather than text format. This reduces conversion effort for both the server and client, at the cost of more programmer effort to deal with platform-dependent binary data formats.

95% confidence
A

WITH HOLD specifies that the cursor can continue to be used after the transaction that created it successfully commits. The cursor persists beyond transaction boundaries.
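
A minimal sketch (table name is hypothetical):

BEGIN;
DECLARE report_cur CURSOR WITH HOLD FOR SELECT * FROM sales;
COMMIT;                      -- the cursor remains usable after the commit
FETCH 10 FROM report_cur;
CLOSE report_cur;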

95% confidence
A

Yes, the name of the cursor must be different from any other active cursor name in the session. Portal names (the underlying server structure) must be unique within the session for the duration of the portal's existence.

95% confidence
A

Prior to PostgreSQL 16, bound cursor variables were initialized to contain their own names, so the portal name would be the same as the cursor variable's name by default. This was changed in version 16 because it created too much risk of conflicts between similarly-named cursors in different functions.

95% confidence
A

FETCH ABSOLUTE 0 positions the cursor before the first row without returning any data. This is a fast way to rewind the cursor to its initial position.

95% confidence

Data Types - Range

35 questions

Localization

35 questions
A

The builtin C.UTF-8 locale provides code point sort order (fast, memcmp-based) combined with Unicode semantics for character operations such as pattern matching, regular expressions, and LOWER()/INITCAP()/UPPER().

95% confidence
A

The following are always available: 'default' (selects locale specified at database creation), 'C' and 'POSIX' (sort by byte values), 'unicode' (sorts using Unicode Collation Algorithm with Default Unicode Collation Element Table), 'ucs_basic' (sorts by Unicode code point values with ASCII-only letter treatment), 'pg_unicode_fast' (Unicode code point sorting with full case mapping), and 'pg_c_utf8' (Unicode code point sorting with simple case mapping).

95% confidence
A

PostgreSQL supports 6 locale categories: LC_COLLATE (string sort order), LC_CTYPE (character classification), LC_MESSAGES (language of messages), LC_MONETARY (formatting of currency amounts), LC_NUMERIC (formatting of numbers), and LC_TIME (formatting of dates and times).

95% confidence

Backup and Restore > pg_dump

35 questions

High Availability

35 questions
A

Support for recovery.conf was removed in PostgreSQL 12. It was replaced by two signal files: standby.signal (for standby mode) and recovery.signal (for normal archive recovery). Recovery parameters moved to postgresql.conf. The presence of recovery.conf in PG12+ will cause the server to refuse to start.

95% confidence

PostGIS > Spatial Indexing

35 questions

Connection Pooling > pgpool-II

34 questions
A

This parameter allows Pgpool-II to use clear-text-password authentication with frontend clients when pool_passwd file does not contain the password for the connecting user, and use that password (provided by client) to authenticate with the backend using MD5 and/or SCRAM authentication. This parameter does not enforce SSL encryption because pool_hba.conf cannot be used with it.

Sources
95% confidence
A

The selection of the load balance node is influenced by the backend_weight config value for each node. Although backend_weight accepts any positive integer value, all the backend_weights eventually get normalized on the scale of [0 to 1]. For example, if you have 2 nodes with backend_weight = 1 each, the normalized weight for each node will be 0.5 and both will get 50% of SELECT queries.

Sources
95% confidence
A

The delegate_IP parameter specifies the virtual IP address, which must not be in use by anything else. When a standby pgpool server is promoted to active, the new active server brings up the virtual IP interface while the previous active server brings it down, so the active pgpool keeps working at the same IP address even after a switchover.

Sources
95% confidence
A

pcp_attach_node attaches the given node to Pgpool-II. However, you should only use pcp_attach_node if you are sure that there's no replication delay (or the delay will be recovered later on) and the role of the node (primary/standby) will not be changed.

Sources
95% confidence
A

Pgpool-II supports trust, md5, scram-sha-256, and pam authentication methods. MD5 is a password-based method in which an MD5-hashed password is sent by the client. SCRAM-SHA-256 is a challenge-response authentication that prevents password sniffing on untrusted connections.

Sources
95% confidence
A

The default value is 0, which means no connection from clients will be refused. When set to 1 or greater, incoming connections are rejected if the number of current connections exceeds (num_init_children - reserved_connections). This parameter can only be set at server start.

Sources
95% confidence
A

It specifies a comma-separated list of function names that DO update the database. SELECTs that include functions in this list are not load balanced. You can use regular expressions to match function names; ^ and $ are automatically added to each pattern. Example: black_function_list = 'nextval,setval,set_.,update_.,delete_.,insert_.'

Sources
95% confidence

Concurrency Control > Transaction Isolation Levels

34 questions

Row-Level Security

34 questions
A

Yes, you can use sub-SELECTs or functions containing SELECTs in policy expressions to consult other rows or tables, though such accesses can create race conditions that could allow information leakage if care is not taken.
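
A sketch of a policy that consults another table (the schema is hypothetical):

-- documents(owner_group), group_members(group_name, member)
CREATE POLICY docs_by_group ON documents
    USING (owner_group IN (SELECT group_name
                           FROM group_members
                           WHERE member = current_user));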

95% confidence
A

session_user is the user who initiated the database connection and normally doesn't change. current_user is the user identifier applicable for permission checking and can change with SET ROLE or during execution of SECURITY DEFINER functions. In Unix terms, session_user is the 'real user' and current_user is the 'effective user'.

95% confidence
A

A leakproof function has no side effects and doesn't leak information about values received. Common operators like equality operators are leakproof. Functions that might throw errors depending on values (like division by zero) are not leakproof. The LEAKPROOF parameter can only be set by superusers.

95% confidence

Data Types - JSON > JSONPath

34 questions

Internals > Page Structure

33 questions
A

(BLCKSZ - SizeOfPageHeaderData) / (MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData)). This represents the upper bound on tuples per page, accounting for maxaligned tuple headers and line pointers.
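
Plugging in the defaults for an 8 kB page (BLCKSZ = 8192, page header = 24 bytes, heap tuple header 23 bytes maxaligned to 24, line pointer 4 bytes) gives the familiar limit:

SELECT (8192 - 24) / (24 + 4) AS max_heap_tuples_per_page;   -- 291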

Sources
95% confidence

Connection Pooling > Connection Pooling Patterns

33 questions
A

Each child process stores 1 connection per user:database pair and can store up to max_pool connections of unique pairs. A child opens a new backend connection only if the requested [user,database] pair is not already in the cache. When max_pool is reached, the oldest connection is disconnected to make room for a new pair.

Sources
95% confidence
A

Transaction pooling does not support: SET/RESET (except SET LOCAL), LISTEN, WITH HOLD CURSOR, PREPARE/DEALLOCATE (traditionally, though version 1.21+ has protocol-level support), PRESERVE/DELETE ROW temp tables (must be created, used, and dropped in same transaction), LOAD, and session-level advisory locks.

95% confidence
A

By default, the load balance mode is at 'session level', meaning the node that read queries are sent to is determined when a client connects to Pgpool-II, with one of the nodes selected randomly each time a new session is created.

Sources
95% confidence

Data Types - Geometric

33 questions

SQL Syntax

33 questions
A

Yes, block comments nest in PostgreSQL (as of version 8.0 and later), as specified in the SQL standard. This allows commenting out larger blocks that may contain existing block comments.
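
For example:

/* commenting out a block that
   /* already contains a block comment */
   still works; the outer comment ends here */
SELECT 1;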

95% confidence

Installation and Setup

33 questions
A

template0 contains the pristine initial data that matches your PostgreSQL version. It should never be modified after initialization. It allows creating a clean database with new encoding and locale settings, or restoring pg_dump backups without site-local additions from template1.
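
For example, creating a database with a different encoding and locale than template1 requires copying template0:

CREATE DATABASE clean_db
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'C';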

95% confidence

Indexes > B-tree Indexes

33 questions
A

PostgreSQL can use a partial index only if the system can recognize that the WHERE condition of the query mathematically implies the predicate of the index. The system can recognize simple inequality implications (e.g., 'x < 1' implies 'x < 2'), but otherwise the predicate condition must exactly match part of the query's WHERE condition.
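
A small sketch (table name is hypothetical):

CREATE INDEX orders_low_x ON orders (x) WHERE x < 2;

SELECT * FROM orders WHERE x < 1;   -- 'x < 1' implies 'x < 2', so the partial index is usable
SELECT * FROM orders WHERE x < 3;   -- not implied by the index predicate, so the index cannot be used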

95% confidence
A

The following cannot use deduplication: text/varchar/char with nondeterministic collations, numeric, jsonb, float4, float8, and container types (composite types, arrays, range types). Additionally, indexes with INCLUDE columns cannot use deduplication.

95% confidence
A

B-tree version 4 is a new index format introduced in PostgreSQL 12 with optimizations for duplicate entries and improved page splitting logic. Indexes created before version 12 remain in version 3 format after pg_upgrade and must be REINDEXed to use version 4.

95% confidence

Data Types - JSON > JSON Functions

32 questions
A

Concatenates two jsonb values. When concatenating two arrays, it generates an array containing all elements from both inputs. When concatenating objects, it merges them (with the right operand's values overwriting the left's for duplicate keys).
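
For example:

SELECT '["a", "b"]'::jsonb || '["c"]'::jsonb;            -- ["a", "b", "c"]
SELECT '{"a": 1, "b": 2}'::jsonb || '{"b": 99}'::jsonb;  -- {"a": 1, "b": 99}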

95% confidence

Monitoring > pg_stat_statements

32 questions

Stored Procedures

32 questions

Data Definition (DDL) > Constraints

31 questions
A

Exclusion constraints ensure that if any two rows are compared on the specified columns or expressions using the specified operators, at least one of these operator comparisons will return false or null. This is useful for preventing overlapping ranges or conflicting data.

95% confidence
A

The three options are: NOT DEFERRABLE (default - checked immediately, cannot change), DEFERRABLE INITIALLY IMMEDIATE (checked immediately but can be changed per transaction), and DEFERRABLE INITIALLY DEFERRED (checked at transaction commit, can be changed per transaction).
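
A sketch of deferring a deferrable constraint within one transaction:

CREATE TABLE seats (
    seat_no integer UNIQUE DEFERRABLE INITIALLY IMMEDIATE
);

BEGIN;
SET CONSTRAINTS ALL DEFERRED;             -- postpone checks until COMMIT for this transaction
UPDATE seats SET seat_no = seat_no + 1;   -- transient duplicates are tolerated until commit
COMMIT;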

Sources
95% confidence

Data Definition (DDL) > Schemas and Namespaces

31 questions

Extensions > pg_partman

31 questions
A

hourly, daily, weekly, monthly, quarterly, yearly, custom, or any combination of these. By default, pg_partman's time-based partitioning will truncate child table starting values to line up at the beginning of typical boundaries (midnight for daily, day 1 for monthly, Jan 1 for yearly, etc).

95% confidence
A

retention_schema is an optional parameter that tells pg_partman to move a table to another schema instead of dropping it. When combined with retention configuration, it provides a way to reliably dump out tables that would normally just be dropped by the retention system via ALTER TABLE SET SCHEMA.

Sources
95% confidence
A

The function returns child_start_time, child_end_time, child_start_id, child_end_id, and suffix. It provides detailed information about a specific child partition including start and end times for time-based partitions, start and end IDs for ID-based partitions, and the partition suffix.

Sources
95% confidence
A

This function reapplies ownership & grants on all child tables based on what the parent table has set. Privileges that the parent table has will be granted to all child tables and privileges that the parent does not have will be revoked (with CASCADE).

Sources
95% confidence
A

It lists all child partitions in a set in logical order, with the default partition not listed by default. The function signature is show_partitions(p_parent_table text, p_order text DEFAULT 'ASC', p_include_default boolean DEFAULT false) and returns partition_schemaname and partition_tablename.

Sources
95% confidence
A

The part_config table has a constraint_valid column to set whether constraints should be set NOT VALID on creation. While this can make the creation of the constraint(s) nearly instantaneous, constraint exclusion cannot be used until it is validated.

Sources
95% confidence
A

By default, if you undo a child table that is also partitioned, it will not stop additional sibling children of the parent partition set from being subpartitioned unless that parent is also undone. This function simply deletes the parent_table entry from the part_config_sub table to handle situations where you're not removing the parent but don't want any additional subpartitioned children.

Sources
95% confidence

Large Objects

31 questions

Data Types - Pseudo-Types

31 questions
A

The void pseudo-type indicates that a function returns no value. In PL/pgSQL, if you declared the function to return void, a RETURN statement can be used to exit the function early, but do not write an expression following the RETURN keyword.
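
A minimal PL/pgSQL sketch:

CREATE FUNCTION log_event(msg text) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'event: %', msg;
    RETURN;   -- allowed for early exit, but no expression may follow RETURN
END;
$$;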

95% confidence
A

For DELETE operations, the return value should be OLD. The returned value has no direct effect but must be nonnull to allow the trigger action to proceed. Returning NEW is not sensible since NEW is null in DELETE triggers.

95% confidence
A

The unknown pseudo-type is initially imputed to unadorned string literals and NULL constants. The parser normally converts these to constants of some other type as soon as it can figure out what type they ought to be based on context.

95% confidence
A

A foreign-data wrapper handler is declared to return fdw_handler. The handler function returns a struct of function pointers to callback functions that will be called by the planner, executor, and various maintenance commands.

95% confidence

Data Types - Arrays

31 questions

Functions - Mathematical

30 questions

Data Types - Composite

30 questions

Internals > Tuple Structure

30 questions

Data Types - Network Address

30 questions

Data Types - Date/Time

30 questions

Extensions > pgcrypto

30 questions

additional

30 questions
A

PostgreSQL supports six types of constraints:

1. CHECK Constraints
Enforce that column values satisfy a Boolean expression. Evaluated on INSERT/UPDATE only.

CREATE TABLE products (
    price numeric CHECK (price > 0),
    discounted_price numeric CHECK (discounted_price > 0 AND price > discounted_price)
);

2. NOT-NULL Constraints
Prevent NULL values in a column. More efficient than equivalent CHECK constraints.

CREATE TABLE products (
    name text NOT NULL
);

3. UNIQUE Constraints
Ensure values in a column or group of columns are unique across all rows. Automatically creates a B-tree index. Multiple NULLs are allowed by default.

CREATE TABLE products (
    product_no integer UNIQUE,
    UNIQUE (product_no, name)  -- multi-column
);

4. PRIMARY KEY Constraints
Combines UNIQUE and NOT NULL. Each table can have at most one primary key. Automatically creates an index.

CREATE TABLE products (
    product_no integer PRIMARY KEY
);

5. FOREIGN KEY Constraints
Maintain referential integrity between tables. Values must match existing values in referenced table. Supports CASCADE, RESTRICT, SET NULL, SET DEFAULT actions on DELETE/UPDATE.

CREATE TABLE orders (
    product_no integer REFERENCES products (product_no) ON DELETE CASCADE
);

6. EXCLUSION Constraints
Ensure that comparing any two rows using specified operators on specified columns returns at least one false or null. Requires an index method (typically GiST or SP-GiST).

CREATE TABLE circles (
    c circle,
    EXCLUDE USING gist (c WITH &&)  -- no overlapping circles
);

UNIQUE, PRIMARY KEY, EXCLUSION, and FOREIGN KEY constraints can be declared DEFERRABLE and checked at transaction end instead of immediately; NOT-NULL and CHECK constraints are always checked immediately.

Sources:

95% confidence
A

Use EXPLAIN to show PostgreSQL's query execution plan without running the query, or EXPLAIN ANALYZE to execute it and see actual performance metrics.

Basic Usage

-- Show estimated execution plan only
EXPLAIN SELECT * FROM users WHERE age > 25;

-- Execute query and show actual performance
EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25;

-- Get detailed information with buffer statistics
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM users WHERE age > 25;

Key Optimization Steps

  1. Run EXPLAIN ANALYZE - Execute your query with EXPLAIN ANALYZE to get actual timing and row counts, not just estimates.

  2. Identify expensive operations - Look for:

    • High total cost numbers (in arbitrary units, relative comparison matters)
    • Sequential Scans on large tables (may need indexes)
    • High "Rows Removed by Filter" (inefficient filtering)
    • Nested Loop joins with large outer tables (may need different join strategy)
    • Sorts/hashes that spill to disk (increase work_mem)
  3. Compare estimated vs actual rows - Large discrepancies indicate stale statistics. Run ANALYZE table_name to update them.

  4. Check index usage - Verify the planner uses appropriate indexes. If not:

    • Ensure indexes exist on WHERE/JOIN columns
    • Check if condition makes index unusable (e.g., functions on indexed columns)
    • Consider composite indexes for multi-column conditions
  5. Analyze buffer hits - With BUFFERS option, check the ratio of "shared hit" (cache) vs "read" (disk). Low cache hits may indicate insufficient shared_buffers.

Useful Options

  • ANALYZE - Actually execute and show real timing (WARNING: modifies data for INSERT/UPDATE/DELETE)
  • BUFFERS - Show cache hit/miss statistics
  • VERBOSE - Include column names and additional details
  • FORMAT JSON - Output as JSON for programmatic analysis

Safe Testing for Data Modification

BEGIN;
EXPLAIN ANALYZE DELETE FROM orders WHERE status = 'cancelled';
ROLLBACK;  -- Prevent actual deletion

Common Patterns to Optimize

  • Sequential Scan → Index Scan: Add appropriate index
  • High loop count in Nested Loop: Switch join type or reorder joins
  • Large "Rows Removed by Filter": Create partial index or restructure query
  • Sort operation with high memory: Increase work_mem configuration

Sources:

95% confidence
A

Default: 0 (no timeout)

The connectionTimeoutMillis parameter controls how long (in milliseconds) the pool will wait when attempting to acquire a client connection before timing out.

Default behavior: When set to 0 (the default), there is no timeout - the pool will wait indefinitely for a connection to become available.

What it does: This setting determines the maximum time to wait when calling pool.connect() or pool.query() if all connections are currently in use and the pool is at max capacity. If a connection cannot be acquired within this time, an error is thrown.

Configuration considerations:

  • Set a positive value (e.g., 2000 for 2 seconds) to prevent indefinite hangs when connections aren't properly released
  • The appropriate value depends on your application's performance requirements and expected connection availability
  • If connections are frequently timing out, it may indicate you need to increase max pool size or fix connection leaks (unreleased clients)

Example:

const pool = new Pool({
  connectionTimeoutMillis: 2000, // Wait max 2 seconds for a connection
  max: 10
})

Sources:

95% confidence
A

JSON stores an exact copy of the input text and reparses it on every query, while JSONB stores data in a decomposed binary format that is faster to process.

Key Differences:

Storage:

  • JSON: Stores exact text copy, requires reparsing on each execution
  • JSONB: Binary format, slower to insert (conversion overhead), faster to query (no reparsing)

Data Preservation:

  • JSON: Preserves whitespace, key order, and duplicate keys
  • JSONB: Removes whitespace, doesn't preserve key order, keeps only last value for duplicate keys

Indexing:

  • JSON: No native indexing of document contents (only expression indexes on extracted values)
  • JSONB: Supports indexing (GIN indexes for efficient queries)

When to Use:

  • JSONB: Default choice for most applications - use when querying, indexing, or manipulating JSON data
  • JSON: Use only when exact text preservation is required (whitespace, key order) or legacy compatibility needs

The PostgreSQL documentation states: "most applications should prefer to store JSON data as jsonb."

Sources:

95% confidence
A

B-tree indexes are PostgreSQL's default index type and should be used for most common indexing scenarios.

Use B-tree indexes when you need:

  1. Equality and range queries - B-trees handle comparisons with operators: <, <=, =, >=, >
  2. Sorted data retrieval - When you need to retrieve data in sorted order
  3. BETWEEN and IN queries - These constructs work efficiently with B-tree indexes
  4. NULL checks - IS NULL and IS NOT NULL conditions are supported
  5. Pattern matching with anchored patterns - LIKE 'foo%' or ~ '^foo' (pattern anchored to the start of the string; this needs the C locale or an index built with text_pattern_ops/varchar_pattern_ops)

Key characteristics:

  • B-tree is created by default when you run CREATE INDEX without specifying a type
  • Works with any data type that can be sorted into a well-defined linear order
  • Handles the vast majority of indexing use cases in typical applications

When NOT to use B-tree:

  • Full-text search (use GiST or GIN instead)
  • Pattern matching with non-anchored patterns like LIKE '%bar' (use trigram indexes)
  • Geometric data queries (use GiST or SP-GiST)
  • Array containment or JSON operations (use GIN)

Example:

-- Creates a B-tree index by default
CREATE INDEX idx_users_email ON users(email);

-- Works efficiently with:
SELECT * FROM users WHERE email = '[email protected]';
SELECT * FROM users WHERE created_at >= '2024-01-01';
SELECT * FROM users WHERE name LIKE 'John%';

Sources:

95% confidence
A

PostgreSQL is an open-source object-relational database management system (ORDBMS) with nearly 40 years of active development, originating from the POSTGRES project at UC Berkeley in 1986.

Key Differentiators

Object-Relational Model: Unlike pure relational databases, PostgreSQL extends SQL with object-oriented features, allowing inheritance and complex data types.

Extensibility: Users can define custom data types, functions, operators, aggregate functions, index methods, and procedural languages without recompiling the database. This level of customization is not standard in traditional relational databases.

Advanced Concurrency: Uses Multi-Version Concurrency Control (MVCC), enabling readers and writers to operate without blocking each other.

Standards Compliance: Conforms to at least 170 of 177 mandatory features for SQL:2023 Core conformance—among the highest compliance levels of any database.

Multiple Index Types: Supports B-tree, Hash, GiST, SP-GiST, GIN, and BRIN indexes, allowing optimization for different data patterns.

Proven Architecture: Known for reliability, data integrity, and transactional guarantees—handling workloads from gigabytes to petabytes.

Open Source License: Freely available for private, commercial, or academic use with no vendor lock-in.

Sources:

95% confidence
A

PostgreSQL has three main backup methods, each with distinct trade-offs:

1. SQL Dump (pg_dump)

How it works: Exports database objects and data as SQL commands that can recreate the database.

Advantages:

  • Cross-version compatibility (can restore to newer PostgreSQL versions)
  • Works across different machine architectures (e.g., 32-bit to 64-bit)
  • Can be performed remotely from any host with database access
  • Non-blocking (doesn't interfere with other database operations)
  • Internally consistent snapshot at start time
  • Selective restoration possible with custom formats

Disadvantages:

  • Requires read access to all tables (typically needs superuser privileges)
  • Subject to operating system file size limits
  • Only handles single databases (doesn't include cluster-wide data like roles)
  • Slower than file-level backups
  • Smaller storage footprint but slower execution

Use when: You need portability across versions/architectures, selective table restoration, or remote backup capability.

2. File System Level Backup

How it works: Direct copy of PostgreSQL data directory files.

Advantages:

  • Faster execution than SQL dumps (no reconstruction needed)
  • Can use filesystem snapshots for minimal downtime
  • Rsync with --checksum enables brief-shutdown backups

Disadvantages:

  • Database must be shut down for usable backup (partial measures insufficient)
  • Larger file size than SQL dumps (includes all index data)
  • No selective restoration—must restore entire database cluster
  • Not portable across PostgreSQL versions or architectures
  • Requires commit log files (pg_xact/*) for any table restoration

Use when: You need fast backup/restore of complete clusters on the same version/architecture, or have snapshot-capable filesystems.

3. Continuous Archiving and Point-in-Time Recovery (PITR)

How it works: Combines file-system backup with ongoing WAL (Write-Ahead Log) file archiving.

Advantages:

  • Point-in-time recovery to any moment since base backup
  • Indefinite backup continuity without frequent full backups
  • Ideal for very large databases
  • Enables warm standby systems
  • Doesn't require perfectly consistent base backup (WAL replay corrects inconsistencies)

Disadvantages:

  • Significant storage demands (must retain all WAL files since base backup)
  • Recovery time proportional to time since last base backup (must replay all WAL segments)
  • Cluster-level recovery only (cannot restore individual databases)
  • High operational complexity (requires careful archive command configuration and monitoring)
  • Failed archiving can fill pg_wal/ and trigger database shutdown
  • Performance impact during backup (full_page_writes forced on)

Use when: You need point-in-time recovery capability, have large databases where frequent full backups are impractical, or need warm standby systems.

Key Trade-off Summary:

  • Speed vs. Portability: File-level fastest but not portable; SQL dump slowest but most portable
  • Size vs. Flexibility: SQL dumps smaller with selective restore; file-level larger, all-or-nothing
  • Simplicity vs. Recovery Options: SQL dump/file-level simple but fixed recovery point; PITR complex but flexible recovery time

Sources:

95% confidence
A

Handle database migrations safely in PostgreSQL using these practices:

1. Use Transactions for Schema Changes

Wrap schema modifications in transactions so the database remains in a known-good state if the migration fails. PostgreSQL supports transactional DDL for most operations.

BEGIN;
ALTER TABLE users ADD COLUMN email VARCHAR(255);
-- More migration steps
COMMIT;

Exception: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.

2. Create Indexes Concurrently

Use CREATE INDEX CONCURRENTLY to avoid blocking writes during index creation:

CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

This performs two table scans and allows normal operations to continue. If it fails, drop the invalid index and retry.

Limitation: Cannot be used inside transaction blocks or on partitioned tables directly.

3. Set Lock Timeouts

Prevent migrations from blocking indefinitely when waiting for locks:

SET lock_timeout = '5s';
ALTER TABLE users ADD COLUMN status VARCHAR(50);

Most ALTER TABLE operations require ACCESS EXCLUSIVE locks, which block all access to the table.

4. Use Lower-Impact ALTER TABLE Operations

Some operations require lighter locks:

  • SET STATISTICS and per-attribute options: SHARE UPDATE EXCLUSIVE lock (allows reads and row-level operations)
  • ADD FOREIGN KEY: SHARE ROW EXCLUSIVE lock (lighter than ACCESS EXCLUSIVE)

5. Add Columns with Defaults Carefully

In PostgreSQL 11+, adding a column with a constant (non-volatile) default is optimized and doesn't rewrite the table:

ALTER TABLE users ADD COLUMN active BOOLEAN DEFAULT true;

Pre-11 versions rewrite the entire table, which can be slow and lock-intensive.

6. Always Backup Before Migrations

Create a backup before running migrations, for example a compressed custom-format dump:

pg_dump -Fc database_name > backup.dump

7. Monitor Lock Waits

Enable logging to detect lock issues:

SET log_min_duration_statement = '1s';
SET log_lock_waits = on;

8. Avoid Long-Running Transactions

Keep migration transactions short to minimize lock contention and reduce the risk of blocking other operations.

Sources:

95% confidence
A

MVCC (Multiversion Concurrency Control) is PostgreSQL's concurrency mechanism where each SQL statement sees a snapshot of data as it existed at a specific point in time, rather than the absolute current state.

How it enables high concurrency:

The key advantage is that readers don't block writers, and writers don't block readers. In MVCC, locks acquired for querying (reading) data do not conflict with locks acquired for writing data. This eliminates the lock contention that exists in traditional locking systems.

Mechanism:

  • Each transaction sees a consistent snapshot of the database from when its snapshot was taken
  • Multiple versions of rows can exist simultaneously
  • PostgreSQL maintains data consistency internally without requiring read/write locks to conflict
  • This allows many concurrent users to read and modify data simultaneously without waiting for each other

Result:
High-concurrency performance in multiuser environments while maintaining data consistency—read operations proceed freely alongside writes, maximizing throughput.
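
A two-session sketch under the default READ COMMITTED level (table is hypothetical):

-- Session 1
BEGIN;
UPDATE accounts SET balance = 0 WHERE id = 1;   -- row lock held, change not yet committed

-- Session 2, concurrently
SELECT balance FROM accounts WHERE id = 1;      -- returns the old row version immediately; the reader is not blocked

-- Session 1
COMMIT;   -- later reads in session 2 see the new value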

Sources:

95% confidence
A

Window functions in PostgreSQL perform calculations across sets of rows that are related to the current query row, without collapsing the rows into a single output like aggregate functions do.

Key characteristics:

  1. OVER clause required - Every window function must have an OVER clause directly following the function name, which distinguishes it from regular functions.

  2. Maintains row identity - Unlike GROUP BY aggregates, window functions preserve all individual rows while computing values across row sets.

  3. Window specification - The OVER clause can include:

    • PARTITION BY - divides rows into groups
    • ORDER BY - controls the order of row processing within partitions
    • Frame specification - defines which rows are included in the calculation
  4. Placement restrictions - Window functions are only permitted in the SELECT list and ORDER BY clause. They cannot be used in WHERE, GROUP BY, or HAVING clauses because they execute after row filtering and grouping.

Example:

SELECT salary, 
       AVG(salary) OVER (PARTITION BY department) as dept_avg,
       rank() OVER (ORDER BY salary DESC) as salary_rank
FROM employees;

This returns every employee row with their salary, their department's average salary, and their salary rank—all without reducing the result set.

Sources:

95% confidence
A

When your PostgreSQL connection pool is exhausted, implement graceful degradation using these strategies:

1. Configure Connection Acquisition Timeouts

Set a reasonable timeout for connection acquisition. When this timeout is reached, return an error (typically 503 Service Unavailable) rather than letting requests hang indefinitely.

PgBouncer configuration:

query_wait_timeout = 120  # seconds - how long clients wait for a connection
reserve_pool_timeout = 5  # seconds - enables reserve pool connections
reserve_pool_size = 5     # additional connections available after timeout

2. Implement Circuit Breaker Pattern

Use a circuit breaker to prevent overwhelming a saturated connection pool:

States:

  • CLOSED: Normal operation
  • OPEN: Stops sending requests after consecutive failures (typically 5)
  • HALF-OPEN: Tests recovery after cooldown period (typically 60 seconds)

Node.js example with opossum:

const CircuitBreaker = require('opossum');

const breaker = new CircuitBreaker(async () => {
  const client = await pool.connect();
  // ... database operation
  client.release();
}, {
  timeout: 3000,           // operation timeout
  errorThresholdPercentage: 50,
  resetTimeout: 60000      // time before attempting HALF-OPEN
});

3. Queue Requests with Backpressure

Limit in-flight database transactions to match available resources. Queue additional requests rather than attempting to acquire connections immediately.

Monitor queue depth:

SHOW POOLS;  -- in PgBouncer, watch cl_waiting column

If the oldest client wait time increases, your pool is undersized or queries are too slow.

4. Return Cached/Stale Data

For read operations, serve cached data with appropriate staleness indicators (e.g., X-Cache: stale header) when fresh connections are unavailable.

5. Implement Retry with Exponential Backoff

async function withRetry(operation, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await operation();
    } catch (err) {
      if (i === maxRetries - 1) throw err;
      await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
    }
  }
}

6. Optimize Query Performance

The most effective solution: make queries faster so connections return to the pool sooner. PostgreSQL performance degrades significantly beyond the saturation point.

Sources:

95% confidence
A

SERIAL in PostgreSQL is not a true data type, but a notational convenience for creating auto-incrementing integer columns (similar to AUTO_INCREMENT in other databases).

When you declare a column as SERIAL, PostgreSQL automatically:

  1. Creates a sequence object
  2. Sets the column's default value to the next value from that sequence
  3. Marks the sequence as owned by the column (so it's dropped when the column is dropped)

Example:

CREATE TABLE tablename (
    colname SERIAL
);

Is equivalent to:

CREATE SEQUENCE tablename_colname_seq AS integer;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

Three variants exist:

  • SMALLSERIAL (alias: SERIAL2) → creates smallint column (2 bytes, range: 1 to 32,767)
  • SERIAL (alias: SERIAL4) → creates integer column (4 bytes, range: 1 to 2,147,483,647)
  • BIGSERIAL (alias: SERIAL8) → creates bigint column (8 bytes, range: 1 to 9,223,372,036,854,775,807)

Important: Sequences can have gaps in numbering even if no rows are deleted, because allocated values are consumed even when transactions roll back.

Sources:

95% confidence
A


Why Requests Block When Connection Pool Is Full

Requests block when a connection pool is full because connection pools use a queue to handle demand that exceeds available connections. When all connections in the pool are actively in use, new connection requests are placed in a FIFO (First-In-First-Out) queue and wait until an existing connection is released back to the pool.

The Mechanism:

  1. Pool Saturation: When a client requests a connection and all connections in the pool are checked out (in use), the request cannot be served immediately

  2. Queue Waiting: Instead of failing immediately, the request is placed in a waiting queue. The client thread/process blocks until:

    • A connection becomes available (returned to the pool by another client)
    • A timeout is reached (if configured)
    • The queue limit is exceeded (if configured)
  3. Fairness: The FIFO queue ensures requests are served in the order they arrive

Configuration Parameters That Control Blocking:

node-postgres (pg):

  • connectionTimeoutMillis: Maximum time to wait for a connection (default: 0 = wait indefinitely)
  • Blocks in FIFO queue until timeout or connection available

psycopg3 (Python):

  • timeout: Maximum wait time in seconds (default: 30 seconds)
  • max_waiting: Maximum queued requests before rejecting new ones (default: 0 = unlimited)
  • Raises TooManyRequests when queue limit exceeded

PgBouncer:

  • max_client_conn: Maximum number of client connections accepted, which bounds how many clients can be connected and queued at once
  • maxwait (SHOW POOLS column): How long the oldest client has been waiting in the queue

Why This Design?

Blocking with a queue is preferable to immediate failure because it smooths out temporary traffic spikes and ensures requests are eventually served when the pool has capacity, rather than dropping requests that might succeed moments later.
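
For example, the queue can be observed directly from the PgBouncer admin console; the column names below come from PgBouncer's SHOW POOLS output:

-- Connect to the special "pgbouncer" admin database, then:
SHOW POOLS;
-- cl_active  = clients currently paired with a server connection
-- cl_waiting = clients blocked in the FIFO queue
-- maxwait    = seconds the oldest queued client has been waiting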

Sources:

95% confidence
A

JSONB is a binary storage format for JSON data in PostgreSQL. Unlike the plain json type which stores an exact copy of the input text, jsonb stores data in a decomposed binary format.

Key characteristics:

  • Binary storage: Data is parsed and stored in a binary representation, not as text
  • Faster processing: No reparsing needed on each query (unlike json type)
  • Indexing support: Can be indexed using GIN (Generalized Inverted Index) for efficient searching
  • Normalized format: Does not preserve whitespace, object key order, or duplicate keys
  • Slower input: Requires conversion overhead when inserting data

Trade-offs vs JSON type:

  • Storage: json keeps an exact text copy; jsonb stores a decomposed binary form
  • Input speed: json is faster; jsonb is slower (conversion overhead)
  • Query speed: json is slower (must reparse); jsonb is faster
  • Indexing: json no; jsonb yes (GIN indexes)
  • Preserves key order: json yes; jsonb no
  • Preserves whitespace: json yes; jsonb no

Recommended use: PostgreSQL documentation recommends using jsonb for most applications unless you have specialized needs like preserving exact key ordering.

Example:

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    data JSONB
);

-- Create GIN index for efficient querying
CREATE INDEX idx_users_data ON users USING GIN (data);

-- Insert data
INSERT INTO users (data) VALUES ('{"name": "Alice", "age": 30}');

-- Query using containment operator
SELECT * FROM users WHERE data @> '{"age": 30}';

Sources:

95% confidence
A

Window functions preserve individual rows while computing across related row sets, whereas aggregate functions collapse rows into summary results.

Key Differences:

  1. Output rows: Aggregate functions return one row per group (or one row total). Window functions return one row for each input row.

  2. Syntax: Window functions require an OVER clause after the function name. Without OVER, the same function acts as a regular aggregate.

  3. Row access: Window functions can access multiple rows related to the current row within a defined "window frame" while keeping each row separate.

  4. Execution order: Window functions execute after aggregate functions in the query pipeline.

  5. Clause restrictions: Window functions are only allowed in SELECT list and ORDER BY clause, not in GROUP BY, HAVING, or WHERE.

Example:

-- Aggregate: Returns 1 row per department
SELECT depname, avg(salary) 
FROM empsalary 
GROUP BY depname;

-- Window: Returns all employee rows with dept average added
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname)
FROM empsalary;

In the window function example, every employee row is preserved, but each shows their department's average salary alongside their individual data.

Sources:

95% confidence
A

Use GIN indexes for JSONB and array data when you need to search within composite values rather than match them entirely. Specifically:

For JSONB:

  • When searching for keys or key/value pairs across many JSONB documents (containment queries using @>, jsonpath queries using @? and @@)
  • When checking for key existence using ?, ?|, and ?& operators
  • NOT when checking equality of complete JSONB documents (use B-tree instead)

For Arrays:

  • When searching for specific elements within arrays
  • When checking for overlap (&&), containment (@>), or subset (<@) operations
  • NOT for simple array equality checks

Operator Class Selection for JSONB:

Use jsonb_ops (default) when:

  • You need key-existence operators (?, ?|, ?&)
  • Query patterns are varied and unpredictable
  • Schema flexibility is important

Use jsonb_path_ops when:

  • You only need containment (@>) or jsonpath operators (@?, @@)
  • Performance and index size are critical (indexes are typically much smaller and searches faster)
  • You won't search for empty JSON structures like {"a": {}} (these require full-index scans)

Performance Tradeoffs:

  • GIN indexes are slower for inserts/updates than B-tree indexes
  • Use fastupdate parameter for write-heavy workloads
  • For bulk loading, drop the index, load data, then recreate the index
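
A short sketch of the two operator classes in practice (the docs table and payload column are illustrative):

-- Default operator class: supports @>, @?, @@ and the key-existence operators ?, ?|, ?&
CREATE INDEX idx_docs_gin ON docs USING GIN (payload);

-- jsonb_path_ops: typically smaller and faster, but supports only @>, @? and @@
CREATE INDEX idx_docs_gin_path ON docs USING GIN (payload jsonb_path_ops);

-- Containment query that can use either index
SELECT * FROM docs WHERE payload @> '{"status": "active"}';

-- Key-existence query that can use only the jsonb_ops index
SELECT * FROM docs WHERE payload ? 'status';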

Sources:

95% confidence
A

PostgreSQL triggers are database functions that execute automatically in response to data modification events (INSERT, UPDATE, DELETE, TRUNCATE) on tables, views, or foreign tables.

How Triggers Work:

Triggers execute at specific points:

  • BEFORE - fires before the operation, can modify data or prevent execution
  • AFTER - fires after the operation completes
  • INSTEAD OF - replaces the operation (views only)

Triggers operate at two levels:

  • Row-level - executes once per affected row
  • Statement-level - executes once per SQL statement, regardless of rows affected

Common Use Cases:

  1. Automatic timestamp tracking - Recording when records were created/modified and by whom:
CREATE FUNCTION emp_stamp() RETURNS trigger AS $emp_stamp$
    BEGIN
        NEW.last_date := current_timestamp;
        NEW.last_user := current_user;
        RETURN NEW;
    END;
$emp_stamp$ LANGUAGE plpgsql;

CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE FUNCTION emp_stamp();
  2. Data validation - Enforcing business rules before data changes:
IF NEW.salary < 0 THEN
    RAISE EXCEPTION '% cannot have a negative salary', NEW.empname;
END IF;
  3. Audit logging - Recording all changes to critical tables:
CREATE FUNCTION process_emp_audit() RETURNS TRIGGER AS $emp_audit$
    BEGIN
        IF (TG_OP = 'DELETE') THEN
            INSERT INTO emp_audit SELECT 'D', now(), current_user, OLD.*;
        ELSIF (TG_OP = 'UPDATE') THEN
            INSERT INTO emp_audit SELECT 'U', now(), current_user, NEW.*;
        ELSIF (TG_OP = 'INSERT') THEN
            INSERT INTO emp_audit SELECT 'I', now(), current_user, NEW.*;
        END IF;
        RETURN NULL;
    END;
$emp_audit$ LANGUAGE plpgsql;
  4. Maintaining referential integrity - Cascading updates/deletes across related tables

  5. Summary table synchronization - Automatically updating aggregate/materialized data when source tables change

Sources:

95% confidence
A

PostgreSQL supports four transaction isolation levels as defined by the SQL standard:

  1. Read Uncommitted
  2. Read Committed (default)
  3. Repeatable Read
  4. Serializable

Note: PostgreSQL internally implements only three distinct isolation levels. Read Uncommitted behaves identically to Read Committed, as PostgreSQL does not allow dirty reads.

You can set the isolation level using:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
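
The level can also be set when the transaction starts, and inspected afterwards; a brief illustration:

-- Set the level for a single transaction at BEGIN time
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM pg_class;   -- all statements in this transaction share one snapshot
COMMIT;

-- Inspect the level currently in effect
SHOW transaction_isolation;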

Sources:

95% confidence
A

Advisory locks in PostgreSQL are application-defined locks where the database provides the locking mechanism but does not enforce their use—it's up to your application to use them correctly.

What they are:

  • Locks with user-defined meanings that can be acquired on arbitrary integer values
  • They exist outside PostgreSQL's MVCC (Multi-Version Concurrency Control) system
  • Visible in the pg_locks system view
  • Two types available:
    • Session-level: Held until explicitly released or session ends (ignore transaction boundaries)
    • Transaction-level: Automatically released at transaction end (no explicit unlock needed)

Functions available:

-- Session-level (must explicitly unlock)
pg_advisory_lock(key bigint)
pg_advisory_unlock(key bigint)
pg_try_advisory_lock(key bigint)  -- Non-blocking

-- Transaction-level (auto-release)
pg_advisory_xact_lock(key bigint)
pg_try_advisory_xact_lock(key bigint)  -- Non-blocking

When to use them:

  1. Implementing pessimistic locking strategies that mimic traditional flat-file database systems
  2. Application-level mutex/semaphore coordination (e.g., ensuring only one worker processes a queue item)
  3. Distributed job coordination where multiple application instances need to coordinate work
  4. Better than flag columns in tables because advisory locks:
    • Offer superior performance
    • Prevent table bloat
    • Receive automatic cleanup when sessions terminate

Important considerations:

  • Session-level locks ignore transaction semantics—a lock acquired in a rolled-back transaction remains held
  • Be cautious with LIMIT clauses in queries that acquire locks, as execution order isn't guaranteed
  • A single lock identifier can be held at both session and transaction levels simultaneously
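
A minimal sketch of transaction-level advisory locking for worker coordination (the key value 42 and the commented-out jobs table are illustrative):

BEGIN;
-- Only one session can hold the lock for key 42 at a time; returns false
-- immediately instead of blocking if another worker already holds it.
SELECT pg_try_advisory_xact_lock(42) AS got_lock;

-- If got_lock is true, do the protected work here, e.g.:
-- UPDATE jobs SET status = 'running' WHERE id = 42;

COMMIT;  -- the lock is released automatically at transaction end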

Sources:

95% confidence
A

The circuit breaker pattern for database connection pools is a fault tolerance mechanism that prevents cascading failures by monitoring database operation failures and temporarily blocking requests when a failure threshold is exceeded.

How It Works

The circuit breaker wraps database calls and operates as a state machine with three states:

1. CLOSED (Normal Operation)

  • All database requests pass through to the connection pool
  • Failures are counted within a time window
  • If failures exceed the threshold, transitions to OPEN

2. OPEN (Failing Fast)

  • Database requests fail immediately without attempting connection
  • Returns errors to the application instantly
  • After a timeout period (e.g., 30-60 seconds), transitions to HALF-OPEN
  • Prevents connection pool exhaustion during database outages

3. HALF-OPEN (Testing Recovery)

  • Limited requests are allowed through to test database availability
  • If successful, transitions back to CLOSED
  • If failures continue, returns to OPEN

Failure Thresholds

Common configurations:

  • Failure threshold: 5-10 consecutive failures or 50% failure rate
  • Timeout period: 30-60 seconds before retry
  • Different thresholds for error types: e.g., 10 for timeouts, 3 for connection failures

Benefits for Connection Pools

  • Prevents connection pool exhaustion when database is unresponsive
  • Avoids resource depletion from connections waiting on timeouts
  • Stops retry storms that amplify load on failing databases
  • Provides automatic recovery testing without manual intervention

PostgreSQL-Specific Context

PostgreSQL connection poolers like PgBouncer manage connection reuse, but circuit breakers operate at the application layer to protect against database unavailability before connection attempts consume pool resources.

Sources:

95% confidence
A

SERIAL is a pseudo-type convenience notation, not an actual data type. When you use SERIAL, PostgreSQL automatically creates a sequence and wires it to the column. A sequence is the underlying database object that generates the numbers.

Key Differences:

SERIAL (convenience notation):

  • Not a true data type
  • Automatically creates a sequence with name tablename_colname_seq
  • Sets column type to integer (smallint for smallserial, bigint for bigserial)
  • Adds NOT NULL constraint
  • Sets default value to nextval('sequence_name')
  • Links sequence ownership to the column (sequence drops when column is dropped)

SEQUENCE (database object):

  • Actual database object created with CREATE SEQUENCE
  • Independent entity (unless OWNED BY is set)
  • Can be shared across multiple tables/columns
  • Requires manual configuration of all parameters
  • Must manually set as column default with DEFAULT nextval('sequence_name')
  • Persists independently unless owned by a column

Equivalent Commands:

-- Using SERIAL
CREATE TABLE users (
    id SERIAL PRIMARY KEY
);

-- Using SEQUENCE (manual equivalent)
CREATE SEQUENCE users_id_seq;
CREATE TABLE users (
    id integer NOT NULL DEFAULT nextval('users_id_seq') PRIMARY KEY
);
ALTER SEQUENCE users_id_seq OWNED BY users.id;

Modern Alternative:
Since PostgreSQL 10, IDENTITY columns are SQL-standard and preferred over SERIAL:

CREATE TABLE users (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY
);

Sources:

95% confidence
A

An INNER JOIN in PostgreSQL returns only the rows where there is a match between the specified columns in both tables. Use the JOIN keyword (which defaults to INNER JOIN) with an ON clause to specify the matching condition.

Basic Syntax:

SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column;

Example with actual data:

SELECT city, temp_lo, temp_hi, prcp, date, location
FROM weather 
JOIN cities ON weather.city = cities.name;

With qualified column names (recommended for clarity):

SELECT weather.city, weather.temp_lo, weather.temp_hi,
       weather.prcp, weather.date, cities.location
FROM weather 
JOIN cities ON weather.city = cities.name;

Key characteristics:

  • JOIN and INNER JOIN are equivalent (INNER is the default)
  • Only returns rows where the join condition matches in both tables
  • Unmatched rows from either table are excluded from results
  • The JOIN ... ON syntax is preferred over the older WHERE clause style for clarity

Alternative (older) syntax (not recommended):

SELECT *
FROM weather, cities
WHERE weather.city = cities.name;

The explicit JOIN ... ON syntax is preferred because it separates the join condition from filtering conditions in the WHERE clause, making queries easier to understand and maintain.

Sources:

95% confidence
A

Regular Views vs. Materialized Views in PostgreSQL

Regular Views:

  • Virtual/logical views with no data storage
  • Execute the defining query every time they are accessed
  • Always return current, real-time data
  • Can be updatable (depending on the query)
  • No refresh needed—data is always fresh

Materialized Views:

  • Persist query results in a table-like form (physical storage)
  • Return data directly from stored results, not by executing the query
  • Data becomes stale until explicitly refreshed with REFRESH MATERIALIZED VIEW
  • Cannot be directly updated with INSERT/UPDATE/DELETE
  • Much faster query performance, but data may not be current

Performance Trade-off:
Materialized views provide faster access than regular views or direct table access, but at the cost of data currency. Use materialized views when you need query performance and can tolerate stale data.

Example:

-- Regular view (executes query each time)
CREATE VIEW sales_summary AS
  SELECT product_id, SUM(amount) FROM sales GROUP BY product_id;

-- Materialized view (stores results)
CREATE MATERIALIZED VIEW sales_summary_mat AS
  SELECT product_id, SUM(amount) FROM sales GROUP BY product_id;

-- Refresh when needed
REFRESH MATERIALIZED VIEW sales_summary_mat;
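
If blocking readers during a refresh is a concern, REFRESH MATERIALIZED VIEW CONCURRENTLY can be used once the materialized view has a unique index; a brief sketch using the view defined above:

-- CONCURRENTLY requires at least one unique index on the materialized view
CREATE UNIQUE INDEX ON sales_summary_mat (product_id);

-- Refresh without locking out concurrent SELECTs
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_summary_mat;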

Sources:

95% confidence
A

PostgreSQL extensions package functionality into reusable, manageable units that can be loaded into a database with a single command.

How the Extension System Works

Core Components:

  • Control file (extension.control): Defines extension metadata including version, dependencies, and relocatability
  • Script file (extension--version.sql): Contains SQL commands to create the extension's objects
  • Shared library (optional): C code for performance-critical functionality

Loading Mechanism:
When you run CREATE EXTENSION extension_name, PostgreSQL:

  1. Reads the control file from SHAREDIR/extension/
  2. Executes the SQL script within a transaction
  3. Registers all created objects as extension members in system catalogs

Management Benefits:

  • Drop all objects with single DROP EXTENSION command
  • pg_dump exports CREATE EXTENSION instead of individual objects
  • Version tracking and upgrades via update scripts

Usage:

CREATE EXTENSION extension_name;
DROP EXTENSION extension_name CASCADE;

Essential Extensions (Supplied with PostgreSQL)

Data Types:

  • hstore: Key-value pairs within a single value
  • ltree: Hierarchical tree structures
  • citext: Case-insensitive text type
  • cube: Multi-dimensional cubes

Text Search & Matching:

  • pg_trgm: Trigram-based similarity and fuzzy matching
  • fuzzystrmatch: Soundex, Levenshtein distance algorithms
  • unaccent: Remove accents from text

Security & Cryptography:

  • pgcrypto: Cryptographic functions (hashing, encryption)
  • uuid-ossp: UUID generation

Performance & Administration:

  • pg_stat_statements: Track SQL query performance statistics
  • pageinspect: Inspect database pages at low level

Array Operations:

  • intarray: Integer array functions and operators with GiST/GIN indexing

Many extensions are "trusted," allowing non-superuser installation in databases where they have CREATE privilege.
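
A few catalog queries illustrate the management side (pg_trgm is just an example extension):

-- Extensions available on this server, with default and installed versions
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'pg_trgm';

-- Install, then later upgrade to the newest available version
CREATE EXTENSION IF NOT EXISTS pg_trgm;
ALTER EXTENSION pg_trgm UPDATE;

-- Extensions currently installed in this database
SELECT extname, extversion FROM pg_extension;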

Sources:

95% confidence
A

VACUUM is a maintenance operation that reclaims storage and prevents critical database failures in PostgreSQL.

What VACUUM Does

VACUUM removes dead row versions (tuples) left behind by UPDATE and DELETE operations. Due to PostgreSQL's MVCC (Multi-Version Concurrency Control) architecture, these operations don't immediately remove old row versions—they're marked as dead but remain on disk until VACUUM processes them.

Why It's Necessary

1. Disk Space Recovery
Dead tuples consume disk space. Without VACUUM, your database will grow unbounded even if the actual data size remains constant.

2. Query Performance
VACUUM updates table statistics used by the query planner to generate efficient execution plans. It also maintains the visibility map, enabling faster index-only scans.

3. Transaction ID Wraparound Prevention
PostgreSQL uses 32-bit transaction IDs that wrap around after ~4 billion transactions. Without regular VACUUM to "freeze" old rows, wraparound causes catastrophic data loss—old transactions suddenly appear "in the future" and become invisible. VACUUM must run at least once every 2 billion transactions per table.

4. Index Performance
Dead tuples bloat indexes, slowing down queries. VACUUM cleans up index entries pointing to dead tuples.

Key Variants

  • VACUUM: Reclaims space for reuse within the table, runs concurrently with normal operations
  • VACUUM FULL: Rewrites entire table to return space to OS, requires exclusive lock
  • VACUUM ANALYZE: Combines vacuuming with statistics update

Most installations rely on the autovacuum daemon, which automatically runs VACUUM based on table activity.
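
A short sketch of checking for dead tuples and vacuuming manually (the orders table is illustrative):

-- How many dead row versions has each table accumulated?
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Reclaim space and refresh planner statistics for one table
VACUUM (VERBOSE, ANALYZE) orders;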

Sources:

95% confidence
A

PostgreSQL implements ACID properties through four core mechanisms:

Atomicity

PostgreSQL uses Write-Ahead Logging (WAL) to ensure atomicity. Changes to data files are written only after those changes have been logged—WAL records describing the changes must be flushed to permanent storage first. If a transaction fails or the system crashes, PostgreSQL can roll back incomplete transactions or redo committed changes during recovery using the WAL records.

Consistency

PostgreSQL maintains consistency through:

  • Integrity constraints (foreign keys, check constraints, unique constraints)
  • Triggers that enforce business rules
  • Transaction isolation that prevents viewing inconsistent intermediate states

Isolation

PostgreSQL implements Multi-Version Concurrency Control (MVCC) for isolation. Each SQL statement sees a snapshot of data as it existed at transaction start, regardless of concurrent modifications. This means:

  • Reading never blocks writing
  • Writing never blocks reading
  • Multiple transactions work with their own data snapshots

PostgreSQL supports multiple isolation levels (Read Committed, Repeatable Read, Serializable) with Serializable Snapshot Isolation (SSI) providing the strictest guarantees.

Durability

Durability is guaranteed through WAL with synchronous commits. When a transaction commits, PostgreSQL:

  1. Flushes WAL records to disk (using fsync() system calls)
  2. Only requires the sequential WAL file to be synced, not all modified data pages
  3. Can recover to a consistent state after crashes by replaying WAL records

The WAL file is written sequentially, making synchronization much faster than flushing scattered data pages.
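
A brief illustration of observing WAL activity around a commit (pg_current_wal_lsn() exists in PostgreSQL 10+; the audit_log table is illustrative):

SHOW synchronous_commit;          -- 'on' means a commit waits for its WAL to reach disk

SELECT pg_current_wal_lsn();      -- current write-ahead log position

BEGIN;
INSERT INTO audit_log (note) VALUES ('example row');   -- illustrative table
COMMIT;                           -- WAL for this transaction is flushed before COMMIT returns

SELECT pg_current_wal_lsn();      -- position has advanced past the committed records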

Sources:

95% confidence
A

Use both: application-level pooling AND PgBouncer, not either/or.

Application-level pooling (built into frameworks like HikariCP, pgx, Npgsql) should always be your first layer. It provides better integration with prepared statements, result set caching, and eliminates network hops within the application process.

Add PgBouncer when:

  • You have 100+ concurrent clients (below ~56 clients, direct connections outperform poolers by 2.5x)
  • Multiple small applications connect to the same database
  • Connections are frequently idle or short-lived
  • You need to pool 1,000+ application connections down to 100 database connections

Deployment strategy:
Run PgBouncer as close to your application as possible—ideally on the same host using Unix sockets to avoid TCP overhead. Only use centralized PgBouncer for multi-application environments.

Performance consideration:
With low concurrency, PgBouncer adds pure overhead with no benefit. Always benchmark your specific workload—patterns of data access and latencies determine which approach yields better results.

Sources:

95% confidence
A

PostgreSQL sequences are special single-row database objects that generate sequential numeric values. They are sequence generators used to produce unique, ordered numbers automatically.

What they are:

  • Database objects created with CREATE SEQUENCE
  • Single-row tables designed specifically to generate sequential numbers
  • Commonly used to generate unique identifiers for table rows (e.g., primary keys)

How they work:

  • Operated via functions: nextval() (get next value), currval() (get current value), setval() (set value)
  • Support data types: smallint, integer, bigint (default)
  • Generate values within range: -9223372036854775808 to 9223372036854775807 for bigint

Key characteristics:

  • Configurable increment (positive or negative)
  • Optional min/max boundaries
  • Optional cycling (restart when limit reached)
  • Cache support for performance
  • Can be owned by table columns (auto-drop when column is dropped)
  • Changes are never rolled back (sequence values are not transactional)

Example:

CREATE SEQUENCE serial START 101;
SELECT nextval('serial');  -- Returns 101
SELECT nextval('serial');  -- Returns 102
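
A slightly fuller sketch showing the configurable options listed above (the order_no sequence is illustrative):

CREATE SEQUENCE order_no
    AS integer
    INCREMENT BY 10
    MINVALUE 100
    MAXVALUE 10000
    CACHE 20
    CYCLE;

SELECT nextval('order_no');        -- 100
SELECT nextval('order_no');        -- 110
SELECT currval('order_no');        -- 110 (last value obtained by nextval in this session)
SELECT setval('order_no', 500);    -- the next nextval() will return 510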

Sources:

95% confidence
A

The most important PostgreSQL configuration parameters for performance tuning are:

Memory Configuration

shared_buffers - Sets memory for database page caching. Recommended: 25% of system RAM (maximum 40% of RAM). Default is often too low for production systems.

work_mem - Memory per query operation (sort, hash) before spilling to disk. Default: 4MB. Increase for complex queries, but budget carefully: a single query can use several multiples of work_mem (one per sort or hash step), multiplied across concurrent connections.

maintenance_work_mem - Memory for maintenance operations (VACUUM, CREATE INDEX, ALTER TABLE). Default: 64MB. Can be set much higher than work_mem since few operations run concurrently.

effective_cache_size - Planner's estimate of OS and PostgreSQL cache size. Default: 4GB. Set to 50-75% of total RAM. Does not allocate memory, only influences query planning decisions.

Query Planning

random_page_cost - Cost estimate for non-sequential disk reads. Default: 4.0. Lower to 1.1-2.0 for SSDs to favor index scans.

effective_io_concurrency - Number of concurrent I/O operations PostgreSQL expects the storage can handle. Default: 1 (0 on systems without asynchronous I/O). Set to ~200 for SSDs, ~100 for RAID arrays.

default_statistics_target - Sample size for ANALYZE operations affecting planner estimates. Default: 100. Increase to 500-1000 for large tables with non-uniform distributions.

Background Processes

max_parallel_workers_per_gather - Workers per parallel query operation. Default: 2. Increase for CPU-bound analytic queries on multi-core systems.

max_worker_processes - Total background worker processes. Default: 8. Must be at least max_parallel_workers; background workers started by extensions and logical replication also count against this limit.

Checkpoint and WAL

checkpoint_completion_target - Fraction of checkpoint interval to spread writes. Default: 0.9. Keep at 0.9 to smooth I/O.

wal_buffers - WAL (Write-Ahead Log) buffer size. Default: -1 (auto-tuned to 1/32 of shared_buffers). Usually auto-sizing is sufficient.
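
As an illustration only, such settings could be applied with ALTER SYSTEM; the values below are examples for a hypothetical 16 GB dedicated server, not recommendations for any particular workload:

ALTER SYSTEM SET shared_buffers = '4GB';            -- ~25% of RAM
ALTER SYSTEM SET effective_cache_size = '12GB';     -- ~75% of RAM, planner hint only
ALTER SYSTEM SET work_mem = '32MB';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM SET random_page_cost = 1.1;            -- SSD storage
ALTER SYSTEM SET effective_io_concurrency = 200;    -- SSD storage
SELECT pg_reload_conf();                            -- reloads changeable settings; shared_buffers still requires a restart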

Sources:

95% confidence

PL/pgSQL > Exception Handling

30 questions
A

RETURNED_SQLSTATE (SQLSTATE error code), MESSAGE_TEXT (primary message), PG_EXCEPTION_DETAIL (detail message), PG_EXCEPTION_HINT (hint message), PG_EXCEPTION_CONTEXT (call stack description), COLUMN_NAME (related column), CONSTRAINT_NAME (related constraint), PG_DATATYPE_NAME (related datatype), TABLE_NAME (related table), and SCHEMA_NAME (related schema).
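
These items are retrieved with GET STACKED DIAGNOSTICS inside an EXCEPTION block; a minimal sketch:

DO $$
DECLARE
    v_state   text;
    v_msg     text;
    v_detail  text;
    v_context text;
BEGIN
    PERFORM 1 / 0;   -- force an error
EXCEPTION WHEN OTHERS THEN
    GET STACKED DIAGNOSTICS
        v_state   = RETURNED_SQLSTATE,
        v_msg     = MESSAGE_TEXT,
        v_detail  = PG_EXCEPTION_DETAIL,
        v_context = PG_EXCEPTION_CONTEXT;
    RAISE NOTICE 'SQLSTATE=%, message=%, context=%', v_state, v_msg, v_context;
END $$;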

95% confidence

Data Types - JSON > JSON Operators

30 questions
A

The ?| operator tests if any of the strings in the provided array exist as top-level keys or array elements. For example, '{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'd'] returns true because 'b' exists.

95% confidence
A

The <@ operator tests if the left jsonb value is contained within the right jsonb value. It is the inverse of @>. For example, '{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb returns true.

95% confidence
A

jsonb_ops creates independent index items for each key and value in the data, while jsonb_path_ops creates index items only for each value (incorporating keys into hash values). For example, to index {"foo": {"bar": "baz"}}, jsonb_path_ops creates a single index item with all three (foo, bar, baz) in the hash, while jsonb_ops creates three separate index items.

95% confidence
A

The #- operator deletes the field or array element at the specified path, where path elements can be either field keys or array indexes. For example, '["a", {"b":1}]'::jsonb #- '{1,b}' returns ["a", {}].

95% confidence
A

The @? operator checks for path existence and should NOT use predicate check expressions. The @@ operator evaluates a predicate (boolean expression) and REQUIRES predicate check expressions. @? returns true/false based on existence, while @@ evaluates the boolean result of the path expression.

95% confidence
A

The ? operator tests whether a string appears as an object key or array element at the top level of the jsonb value. For example, '{"a":1, "b":2}'::jsonb ? 'b' returns true.

95% confidence
A

The ?& operator tests if all of the strings in the provided array exist as top-level keys or array elements. For example, '["a", "b", "c"]'::jsonb ?& array['a', 'b'] returns true because both 'a' and 'b' exist.

95% confidence
A

The #> operator returns the result as the same type as the input (json or jsonb), while #>> returns the result as text. For example, '{"a": {"b": ["foo","bar"]}}'::json #> '{a,b,1}' returns "bar" (with quotes), while #>> returns bar (without quotes, as text).

95% confidence

System Catalogs

30 questions

Performance Tips > Query Statistics

29 questions

PL/pgSQL > Dynamic SQL

29 questions

Monitoring > Table Statistics

29 questions

json_operations

29 questions
A

PostgreSQL JSONB #>> Operator

The #>> operator extracts a JSON value at a specified path as text (not JSONB).

Syntax

jsonb_column #>> text_array_path

Key Differences from #>

  • #> returns JSONB type
  • #>> returns TEXT type (the value is converted to text)

Examples

-- Given this JSONB data:
SELECT '{"user": {"name": "Alice", "age": 30}}'::jsonb #>> '{user,name}';
-- Returns: 'Alice' (text)

SELECT '{"user": {"name": "Alice", "age": 30}}'::jsonb #>> '{user,age}';
-- Returns: '30' (text, not integer)

SELECT '{"items": [10, 20, 30]}'::jsonb #>> '{items,1}';
-- Returns: '20' (text)

-- Array path uses zero-based indexing
SELECT '{"a": [{"b": "value"}]}'::jsonb #>> '{a,0,b}';
-- Returns: 'value' (text)

Return Values

  • Scalar values: converted to text representation
  • Objects/arrays: JSON text representation
  • Non-existent path: NULL

Version

Available in PostgreSQL 9.4+ (when JSONB type was introduced).

Source

PostgreSQL Official Documentation: JSON Functions and Operators

95% confidence
A

jsonb_path_query is a PostgreSQL function (available since version 12) that executes a JSON path expression against JSONB data and returns a set of JSONB values matching the path.

Syntax:

jsonb_path_query(target jsonb, path jsonpath [, vars jsonb [, silent boolean]])

Parameters:

  • target: The JSONB column/value to query
  • path: A JSONPath expression (SQL/JSON path language per SQL:2016 standard)
  • vars: Optional JSONB object containing variables referenced in the path (e.g., $varname)
  • silent: Optional boolean (default false); if true, suppresses errors for invalid paths

Returns: Set of JSONB values (use with set-returning context like SELECT, LATERAL, or aggregate functions)

Key Differences from Related Functions:

  • jsonb_path_query() → returns set of rows (multiple results)
  • jsonb_path_query_first() → returns single JSONB value (first match only)
  • jsonb_path_query_array() → returns JSONB array containing all matches

Example:

SELECT jsonb_path_query(
  '{"items": [{"price": 10}, {"price": 20}, {"price": 15}]}',
  '$.items[*].price'
);
-- Returns 3 rows: 10, 20, 15

SELECT jsonb_path_query(
  '{"users": [{"age": 25}, {"age": 35}]}',
  '$.users[*] ? (@.age > $min)',
  '{"min": 30}'
);
-- Returns: {"age": 35}

Source: PostgreSQL 17 official documentation - JSON Functions and Operators

95% confidence
A

jsonb_build_array Function in PostgreSQL

jsonb_build_array is a PostgreSQL function that constructs a JSONB array from a variadic list of arguments. It returns a jsonb type (binary JSON format).

Syntax

jsonb_build_array(VARIADIC "any")

Key Characteristics

  • Available since: PostgreSQL 9.5 (the json variant, json_build_array, dates to 9.4)
  • Return type: jsonb
  • Arguments: Accepts 0 to N arguments of any type
  • NULL handling: NULL values are preserved as JSON null in the array
  • Type conversion: Non-JSON types are automatically converted to their JSON equivalents

Usage Examples

-- Empty array
SELECT jsonb_build_array();
-- Result: []

-- Array with mixed types
SELECT jsonb_build_array(1, 2, 'foo', 4, 5);
-- Result: [1, 2, "foo", 4, 5]

-- Array with NULL
SELECT jsonb_build_array(1, NULL, 'text', true);
-- Result: [1, null, "text", true]

-- Nested with other JSONB functions
SELECT jsonb_build_array(
    jsonb_build_object('id', 1, 'name', 'Alice'),
    jsonb_build_object('id', 2, 'name', 'Bob')
);
-- Result: [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

Performance Note

jsonb format is stored in a decomposed binary format (slower to create than json, but significantly faster for processing operations like indexing and querying).

Source

PostgreSQL Official Documentation: JSON Functions and Operators (versions 9.4+)

95% confidence
A

Use the jsonb_typeof() function to check if a JSONB value is an array. It returns 'array' for JSON arrays.

SELECT jsonb_typeof('["a", "b", "c"]'::jsonb);
-- Returns: 'array'

SELECT jsonb_typeof('{"key": "value"}'::jsonb);
-- Returns: 'object'

To filter rows where a JSONB column is an array:

SELECT * FROM table_name 
WHERE jsonb_typeof(json_column) = 'array';

Possible return values of jsonb_typeof():

  • 'object' - JSON object
  • 'array' - JSON array
  • 'string' - JSON string
  • 'number' - JSON number
  • 'boolean' - JSON boolean
  • 'null' - JSON null

Alternative using @? operator (PostgreSQL 12+):

SELECT * FROM table_name 
WHERE json_column @? '$ ? (@.type() == "array")';

Available since: PostgreSQL 9.4 (when JSONB was introduced)

Source: PostgreSQL official documentation - JSON Functions and Operators

95% confidence
A

The || operator for JSONB in PostgreSQL is the concatenation/shallow merge operator.

Behavior

For JSONB objects:

  • Merges two JSONB objects by combining their keys
  • If duplicate keys exist, the right operand's value overwrites the left
  • Performs a shallow merge only (nested objects are replaced, not merged)

For JSONB arrays:

  • Concatenates arrays into a single array

Examples

-- Object merge (duplicate key 'b' takes right value)
SELECT '{"a": 1, "b": 2}'::jsonb || '{"b": 3, "c": 4}'::jsonb;
-- Result: {"a": 1, "b": 3, "c": 4}

-- Shallow merge (nested object replaced entirely)
SELECT '{"a": {"x": 1}}'::jsonb || '{"a": {"y": 2}}'::jsonb;
-- Result: {"a": {"y": 2}}

-- Array concatenation
SELECT '[1, 2]'::jsonb || '[3, 4]'::jsonb;
-- Result: [1, 2, 3, 4]

-- Adding/updating a single key-value pair
SELECT '{"name": "John"}'::jsonb || '{"age": 30}'::jsonb;
-- Result: {"name": "John", "age": 30}

Key Limitations

  • Not recursive: Nested objects are completely replaced, not deep-merged
  • For deep merging, update specific paths with jsonb_set() or write a custom recursive merge function
  • Available since PostgreSQL 9.5

Official Documentation

PostgreSQL JSON Functions and Operators: https://www.postgresql.org/docs/current/functions-json.html

95% confidence
A

row_to_json Function in PostgreSQL

row_to_json is a PostgreSQL built-in function that converts a SQL composite value (row) into a JSON object.

Syntax

row_to_json(record, [pretty_bool])

Parameters

  • record: A row type, composite type, or record value
  • pretty_bool (optional): If true, adds whitespace for readability; if false or omitted, returns compact JSON

Return Type

Returns json type (not jsonb)

Behavior

  • Each column name becomes a JSON key
  • Each column value becomes the corresponding JSON value
  • NULL SQL values become JSON null
  • Column names become the JSON keys exactly as stored (unquoted SQL identifiers are folded to lower case before conversion)
  • Nested composite types are recursively converted to JSON objects

Code Examples

-- Basic usage
SELECT row_to_json(row(1, 'foo', true));
-- Result: {"f1":1,"f2":"foo","f3":true}

-- With table row
SELECT row_to_json(users) FROM users WHERE id = 1;
-- Result: {"id":1,"name":"Alice","email":"[email protected]"}

-- With subquery and named columns
SELECT row_to_json(t) FROM (
  SELECT id, name, created_at 
  FROM users 
  WHERE id = 1
) t;
-- Result: {"id":1,"name":"Alice","created_at":"2024-01-15T10:30:00"}

-- Pretty printed
SELECT row_to_json(row(1, 'foo'), true);
-- Result: {"f1": 1,
--  "f2": "foo"}

Version

Available since PostgreSQL 9.2 (released September 2012)

Related Functions

  • json_build_object(): More control over key names
  • to_json(): Converts any value to JSON
  • to_jsonb(): Same conversion but returns jsonb (PostgreSQL 9.5+); there is no separate row_to_jsonb() function

Source: PostgreSQL Official Documentation - JSON Functions

95% confidence
A

jsonb_agg Function in PostgreSQL

jsonb_agg is an aggregate function that collects values from multiple rows into a single JSONB array.

Syntax

jsonb_agg(expression [ORDER BY ...])

Behavior

  • Input: Any data type (automatically converted to JSONB)
  • Output: A single JSONB array containing all input values
  • NULL handling: NULL input values are included in the output array as JSON null
  • Empty set: Returns NULL (not an empty array)
  • Available since: PostgreSQL 9.5

Common Use Cases

1. Basic aggregation:

SELECT jsonb_agg(name) FROM users;
-- Result: ["Alice", "Bob", "Charlie"]

2. Aggregate entire rows as objects:

SELECT jsonb_agg(users) FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

3. Aggregate with row construction:

SELECT jsonb_agg(jsonb_build_object('id', id, 'name', name)) 
FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

4. With ORDER BY:

SELECT jsonb_agg(name ORDER BY created_at DESC) FROM users;
-- Result: ["Charlie", "Bob", "Alice"]

Key Differences from json_agg

  • jsonb_agg returns binary JSONB format (more efficient for operations)
  • json_agg returns text JSON format
  • Use jsonb_agg when you need to further manipulate the result with JSONB operators

Source

PostgreSQL Official Documentation: Aggregate Functions (PostgreSQL 9.5+)

95% confidence
A

The #- operator in PostgreSQL removes a key-value pair from a JSONB object by matching the key path. It takes a JSONB value on the left and a text array path on the right, returning a new JSONB value with the specified path removed.

Syntax:

jsonb #- text[]

Behavior:

  • Removes the field or array element at the specified path
  • Returns the modified JSONB value (original is unchanged)
  • If the path doesn't exist, returns the original JSONB unchanged
  • For nested paths, use an array like '{key1, key2, key3}'
  • Array elements are referenced by index (0-based)

Examples:

-- Remove top-level key
SELECT '{"a": 1, "b": 2}'::jsonb #- '{a}';
-- Result: {"b": 2}

-- Remove nested key
SELECT '{"a": {"b": {"c": 1}}}'::jsonb #- '{a, b, c}';
-- Result: {"a": {"b": {}}}

-- Remove array element by index
SELECT '{"items": [1, 2, 3]}'::jsonb #- '{items, 1}';
-- Result: {"items": [1, 3]}

-- Non-existent path (no change)
SELECT '{"a": 1}'::jsonb #- '{b}';
-- Result: {"a": 1}

Version: Available in PostgreSQL 9.5+ (introduced with JSONB operators)

Source: PostgreSQL Official Documentation - JSON Functions and Operators
https://www.postgresql.org/docs/current/functions-json.html

95% confidence
A

The ?& operator in PostgreSQL checks if a JSONB object contains all of the specified keys in a text array.

Syntax:

jsonb_column ?& text_array

Return value: boolean - Returns true if the JSONB value contains all keys in the array, false otherwise.

Example:

-- Check if JSONB contains all specified keys
SELECT '{"name": "John", "age": 30, "city": "NYC"}'::jsonb ?& ARRAY['name', 'age'];
-- Returns: true

SELECT '{"name": "John", "age": 30}'::jsonb ?& ARRAY['name', 'city'];
-- Returns: false (missing 'city')

-- Common usage in WHERE clause
SELECT * FROM users WHERE metadata ?& ARRAY['email', 'phone'];

Key differences from related operators:

  • ? - checks for a single key
  • ?| - checks if any of the keys exist (OR logic)
  • ?& - checks if all of the keys exist (AND logic)

Availability: PostgreSQL 9.4+ (introduced with JSONB type)

Index support: A GIN index with the default jsonb_ops operator class supports ?& (jsonb_path_ops does not support the key-existence operators).

Source: PostgreSQL Official Documentation - JSON Functions and Operators

95% confidence
A

json_build_object - PostgreSQL JSON Constructor Function

json_build_object is a PostgreSQL built-in function that constructs a JSON object from a variadic list of alternating keys and values.

Syntax

json_build_object(key1, value1, key2, value2, ...)

Key Characteristics

  • Available since: PostgreSQL 9.4
  • Return type: json (use jsonb_build_object for jsonb type)
  • Parameters: Variadic list where odd positions are keys (coerced to text) and even positions are values
  • Minimum arguments: 0 (returns empty JSON object {})
  • Arguments must be even: Function requires an even number of arguments (pairs of key-value)

Behavior

  • Keys are automatically converted to text strings
  • Values are converted to appropriate JSON types:
    • NULL → JSON null
    • boolean → JSON boolean
    • numeric → JSON number
    • text → JSON string
    • arrays/composites → JSON arrays/objects
  • Duplicate keys are allowed; all occurrences are preserved in the output

Examples

-- Basic usage
SELECT json_build_object('name', 'John', 'age', 30, 'active', true);
-- Result: {"name":"John","age":30,"active":true}

-- With NULL values
SELECT json_build_object('id', 1, 'email', NULL);
-- Result: {"id":1,"email":null}

-- From table columns
SELECT json_build_object('user_id', id, 'username', name) 
FROM users 
WHERE id = 5;

-- Empty object
SELECT json_build_object();
-- Result: {}

Source: PostgreSQL 17 Official Documentation - JSON Functions

95% confidence
A

PostgreSQL jsonb_each Function

jsonb_each is a set-returning function that expands a JSONB object's top-level key-value pairs into rows, where each row contains a key (text) and value (jsonb) column.

Signature

jsonb_each(jsonb) → SETOF record(key text, value jsonb)

Behavior

  • Input: JSONB object (will error if given a JSONB array, null, or scalar)
  • Output: One row per top-level key-value pair
  • Column 1 (key): Text representation of the object key
  • Column 2 (value): JSONB value (preserves original JSON type)
  • Non-recursive: Only expands the top level of the object

Usage Example

SELECT * FROM jsonb_each('{"name":"Alice","age":30,"active":true}'::jsonb);

Result:

  key   |  value  
--------+---------
 name   | "Alice"
 age    | 30
 active | true

Common Patterns

Filter by key:

SELECT value FROM jsonb_each('{"a":1,"b":2}'::jsonb) WHERE key = 'a';
-- Returns: 1

Convert values to text (use jsonb_each_text instead):

SELECT * FROM jsonb_each_text('{"name":"Alice","age":30}'::jsonb);
-- Returns: key=name, value=Alice (text); key=age, value=30 (text)

Version

Available since PostgreSQL 9.4 (when JSONB was introduced).

Source: PostgreSQL Official Documentation - JSON Functions

95% confidence
A

PostgreSQL to_jsonb() Function

to_jsonb() converts any SQL value into a JSONB (binary JSON) value. Available in PostgreSQL 9.5+.

Syntax

to_jsonb(anyelement) → jsonb

Behavior

  • Primitives: Converts text, numbers, booleans to JSON equivalents
  • NULL: Converts to JSON null
  • Composite types/rows: Converts to JSON objects with column names as keys
  • Arrays: Converts to JSON arrays
  • Records: Converts to JSON objects

Key Characteristics

  • Returns JSONB (binary format), not JSON text
  • JSONB stores data in decomposed binary form for faster processing
  • Whitespace is not preserved (normalized)
  • Object key order is not preserved (sorted)
  • Duplicate keys are eliminated (last value wins)

Examples

-- Primitive values
SELECT to_jsonb(42);              -- 42
SELECT to_jsonb('hello'::text);   -- "hello"
SELECT to_jsonb(true);            -- true

-- Arrays
SELECT to_jsonb(ARRAY[1,2,3]);    -- [1, 2, 3]

-- Row/composite type
SELECT to_jsonb(row(1, 'foo'));   -- {"f1": 1, "f2": "foo"}

-- Table rows
SELECT to_jsonb(users.*) FROM users LIMIT 1;
-- {"id": 1, "name": "Alice", "email": "[email protected]"}

Common Use Cases

  • Converting query results to JSON for APIs
  • Building JSON aggregates with jsonb_agg()
  • Storing structured data in JSONB columns

Related Functions

  • to_json(): Returns text JSON (not binary)
  • row_to_json(): Specifically for row types (to_jsonb() covers this case in 9.5+)
  • jsonb_build_object(): Construct JSONB from key-value pairs

Source: PostgreSQL 16 Official Documentation - JSON Functions and Operators

95% confidence
A

PostgreSQL JSON Operators: -> vs ->>

-> returns JSON/JSONB (preserves type), ->> returns TEXT (converts to string).

Exact Behavior

  • ->: Extracts JSON object field or array element as JSON/JSONB type
  • ->>: Extracts JSON object field or array element as TEXT type

Code Examples

-- Sample data
SELECT '{"name": "Alice", "age": 30}'::jsonb AS data;

-- Using -> returns JSONB (can be chained)
SELECT '{"user": {"name": "Alice"}}'::jsonb -> 'user' -> 'name';
-- Result: "Alice" (as JSONB, with quotes)

-- Using ->> returns TEXT (cannot be chained with JSON operators)
SELECT '{"user": {"name": "Alice"}}'::jsonb -> 'user' ->> 'name';
-- Result: Alice (as TEXT, without quotes)

-- Array access
SELECT '[1, 2, 3]'::jsonb -> 2;     -- Result: 3 (as JSONB)
SELECT '[1, 2, 3]'::jsonb ->> 2;    -- Result: 3 (as TEXT)

-- Type difference matters for operations
SELECT '{"count": 5}'::jsonb -> 'count';   -- Returns: 5 (JSONB number)
SELECT '{"count": 5}'::jsonb ->> 'count';  -- Returns: '5' (TEXT)

When to Use Each

  • Use -> when: Chaining multiple extractions, need to preserve JSON type for further JSON operations
  • Use ->> when: Extracting final value for display, comparison with text, or casting to specific types

Compatibility

Available for json since PostgreSQL 9.3, and for jsonb since PostgreSQL 9.4 (when the jsonb type was introduced).

Source: PostgreSQL JSON Functions and Operators Documentation

95% confidence
A

Deleting a Key from JSONB in PostgreSQL

Use the - operator to delete a key from a JSONB object:

-- Delete a single top-level key
UPDATE table_name 
SET jsonb_column = jsonb_column - 'key_to_delete'
WHERE condition;

-- Example: Delete 'email' key
UPDATE users 
SET data = data - 'email'
WHERE id = 1;

Deleting Nested Keys

For nested keys, use the #- operator with a text array path (PostgreSQL 9.5+):

-- Delete nested key at path
UPDATE table_name 
SET jsonb_column = jsonb_column #- '{parent,child,key_to_delete}'
WHERE condition;

-- Example: Delete 'address.city'
UPDATE users 
SET data = data #- '{address,city}'
WHERE id = 1;

Deleting Multiple Keys

Delete multiple top-level keys using the - operator with a text array (PostgreSQL 9.5+):

UPDATE table_name 
SET jsonb_column = jsonb_column - ARRAY['key1', 'key2', 'key3']
WHERE condition;

Deleting Array Elements

To delete by array index, use - with an integer:

-- Delete element at index 2 (0-based)
UPDATE table_name 
SET jsonb_column = jsonb_column - 2
WHERE condition;

Minimum Version: PostgreSQL 9.5+ for the - and #- deletion operators shown above.

Source: PostgreSQL 16 JSONB Functions Documentation

95% confidence
A

PostgreSQL JSONB -> Operator

The -> operator extracts a JSON object field or array element from a JSONB column and returns the result as JSONB (not as text).

Syntax

jsonb_column -> key          -- Extract object field by text key
jsonb_column -> index        -- Extract array element by integer index

Key Behavior

  • Returns: JSONB type (preserves JSON structure)
  • Object fields: Use text key (e.g., 'name')
  • Array elements: Use zero-based integer index (e.g., 0, 1, 2)
  • Non-existent keys/indices: Returns NULL
  • Type preservation: Nested objects/arrays remain JSONB for further chaining

Examples

-- Extract object field
SELECT '{"name": "Alice", "age": 30}'::jsonb -> 'name';
-- Result: "Alice" (as JSONB)

-- Extract array element
SELECT '["red", "green", "blue"]'::jsonb -> 1;
-- Result: "green" (as JSONB)

-- Chain multiple extractions
SELECT '{"user": {"address": {"city": "NYC"}}}'::jsonb -> 'user' -> 'address' -> 'city';
-- Result: "NYC" (as JSONB)

-- Non-existent key
SELECT '{"name": "Alice"}'::jsonb -> 'email';
-- Result: NULL

Compare with ->> Operator

  • -> returns JSONB (chainable, preserves structure)
  • ->> returns TEXT (terminal operation, converts to string)
SELECT data -> 'price'  FROM products;  -- Returns: 29.99 (JSONB number)
SELECT data ->> 'price' FROM products;  -- Returns: '29.99' (TEXT)

Version Compatibility

Available in PostgreSQL 9.4+ (when JSONB type was introduced).

Source

PostgreSQL Official Documentation: JSON Functions and Operators

95% confidence
A

How to Insert JSON Data in PostgreSQL

PostgreSQL supports two JSON data types: json (text-based storage) and jsonb (binary, indexed, recommended). Use jsonb for better performance and indexing capabilities.

Direct JSON Insert

-- Create table with jsonb column
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    data jsonb
);

-- Insert JSON using single quotes and type cast
INSERT INTO users (data) VALUES ('{"name": "John", "age": 30}'::jsonb);

-- Alternative: explicit CAST
INSERT INTO users (data) VALUES (CAST('{"name": "Jane", "age": 25}' AS jsonb));

Insert from Application Parameters

Using parameterized queries (prevents SQL injection):

-- PostgreSQL expects JSON as a string parameter
INSERT INTO users (data) VALUES ($1::jsonb);

In application code, pass the JSON as a string:

  • Node.js (pg): client.query('INSERT INTO users (data) VALUES ($1)', [JSON.stringify(obj)])
  • Python (psycopg2): cursor.execute("INSERT INTO users (data) VALUES (%s)", (json.dumps(obj),))
  • Go (lib/pq): db.Exec("INSERT INTO users (data) VALUES ($1)", jsonBytes)

Key Requirements

  1. Always use jsonb over json unless you need exact whitespace/key-order preservation
  2. JSON must be valid - PostgreSQL validates on insert and rejects malformed JSON
  3. Use single quotes for string literals in SQL (double quotes are for identifiers)
  4. Maximum size: 1GB per jsonb value (PostgreSQL TOAST limit)

PostgreSQL Version: 9.4+ for jsonb (recommended: 12+)

Source: PostgreSQL 16 JSON Types Documentation

95% confidence
A

Querying Nested JSONB in PostgreSQL

PostgreSQL provides four operators to query nested JSONB structures:

Operators

  • -> - Returns JSONB object/array element (result is JSONB)
  • ->> - Returns JSONB object/array element as TEXT
  • #> - Returns JSONB object at specified path (array of keys)
  • #>> - Returns JSONB object at specified path as TEXT

Code Examples

-- Sample data
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    data JSONB
);

INSERT INTO users (data) VALUES 
('{"name": "Alice", "address": {"city": "NYC", "zip": "10001"}}'),
('{"name": "Bob", "contacts": {"emails": ["[email protected]", "[email protected]"]}}');

-- Access nested object (returns JSONB)
SELECT data -> 'address' -> 'city' FROM users;
-- Result: "NYC"

-- Access nested object as text
SELECT data -> 'address' ->> 'city' FROM users;
-- Result: NYC

-- Access deeply nested using path (returns JSONB)
SELECT data #> '{address,city}' FROM users;
-- Result: "NYC"

-- Access deeply nested as text
SELECT data #>> '{address,city}' FROM users;
-- Result: NYC

-- Access array element (0-indexed)
SELECT data -> 'contacts' -> 'emails' -> 0 FROM users WHERE id = 2;
-- Result: "[email protected]"

-- Query with WHERE clause
SELECT * FROM users WHERE data -> 'address' ->> 'city' = 'NYC';

-- Check if nested key exists
SELECT * FROM users WHERE data -> 'address' ? 'zip';

-- Check if nested path exists (PostgreSQL 12+)
SELECT * FROM users WHERE data @? '$.address.city';

Key Rules

  1. Chain -> operators for each nesting level when keys are known at query time
  2. Use #> with array syntax '{key1,key2,key3}' for dynamic or deep paths
  3. Use ->> or #>> only at the final level when you need TEXT output for comparisons or display
  4. Array indexing is 0-based, not 1-based

Version: Available in PostgreSQL 9.4+. JSONPath syntax (@?) requires PostgreSQL 12+.

Source: PostgreSQL Official Documentation - JSON Functions

95% confidence
A

PostgreSQL JSONB <@ Operator

The <@ operator checks if the left JSONB value is contained within the right JSONB value. Returns true if all key-value pairs in the left operand exist in the right operand.

Syntax:

jsonb_left <@ jsonb_right

Containment Rules:

  • For objects: All keys and values in left must exist in right (right can have additional keys)
  • For arrays: All elements in left must exist in right (order doesn't matter, right can have additional elements)
  • For scalars: Left must equal right

Examples:

-- Object containment (returns true)
SELECT '{"a": 1}'::jsonb <@ '{"a": 1, "b": 2}'::jsonb;

-- Array containment (returns true)
SELECT '[1, 2]'::jsonb <@ '[1, 2, 3]'::jsonb;

-- Not contained (returns false)
SELECT '{"a": 1, "c": 3}'::jsonb <@ '{"a": 1, "b": 2}'::jsonb;

-- Scalar match (returns true)
SELECT '"foo"'::jsonb <@ '"foo"'::jsonb;

Performance: The <@ operator uses GIN indexes. Create an index for optimal performance:

CREATE INDEX idx_data_gin ON table_name USING GIN (jsonb_column);

Available Since: PostgreSQL 9.4 (when JSONB was introduced)

Inverse Operator: @> (contains) - checks if left contains right

Source: PostgreSQL Official Documentation - JSON Functions and Operators

95% confidence
A

jsonb_typeof() Function in PostgreSQL

jsonb_typeof() returns the type of the top-level JSON value as a text string.

Return Values

The function returns one of these exact strings:

  • "object" - for JSON objects {}
  • "array" - for JSON arrays []
  • "string" - for JSON strings
  • "number" - for JSON numbers (integer or decimal)
  • "boolean" - for JSON booleans
  • "null" - for JSON null values

Syntax

jsonb_typeof(jsonb_value) → text

Code Examples

-- Returns "object"
SELECT jsonb_typeof('{"name": "Alice"}'::jsonb);

-- Returns "array"
SELECT jsonb_typeof('[1, 2, 3]'::jsonb);

-- Returns "string"
SELECT jsonb_typeof('"hello"'::jsonb);

-- Returns "number"
SELECT jsonb_typeof('42'::jsonb);

-- Returns "boolean"
SELECT jsonb_typeof('true'::jsonb);

-- Returns "null"
SELECT jsonb_typeof('null'::jsonb);

-- Nested access with ->
SELECT jsonb_typeof('{"user": {"age": 30}}'::jsonb -> 'user'); -- Returns "object"
SELECT jsonb_typeof('{"user": {"age": 30}}'::jsonb -> 'user' -> 'age'); -- Returns "number"

Availability

Available in PostgreSQL 9.4+ (when jsonb type was introduced).

Related Function

json_typeof() - identical function for the json type (not jsonb).

Source: PostgreSQL Official Documentation - JSON Functions

95% confidence
A

PostgreSQL JSONB ?| Operator

The ?| operator tests whether any of the specified keys/elements exist in a JSONB value. It returns true if at least one key from the provided array exists at the top level of the JSONB object.

Syntax

jsonb_column ?| text[]

Parameters

  • Left operand: JSONB column or value
  • Right operand: Array of text strings (keys to check)

Return Value

  • Returns boolean
  • true if ANY key from the array exists in the JSONB
  • false if NONE of the keys exist
  • NULL if either operand is NULL

Examples

-- Returns true (has 'name')
SELECT '{"name": "John", "age": 30}'::jsonb ?| ARRAY['name', 'email'];

-- Returns false (has neither 'email' nor 'phone')
SELECT '{"name": "John", "age": 30}'::jsonb ?| ARRAY['email', 'phone'];

-- Returns true (has 'age')
SELECT '{"name": "John", "age": 30}'::jsonb ?| ARRAY['age', 'address', 'phone'];

Important Notes

  1. Top-level only: Only checks keys at the top level, not nested keys
  2. Key existence only: Checks key presence regardless of value (even if value is null)
  3. Case-sensitive: Key matching is case-sensitive

Version

Available in PostgreSQL 9.4+ (when JSONB was introduced)

Related Operators

  • ? - checks if a single key exists
  • ?& - checks if ALL specified keys exist (AND logic vs. ?| OR logic)

Source

PostgreSQL Official Documentation: JSON Functions and Operators

95% confidence
A

Querying a Specific Key from JSONB in PostgreSQL

Use the -> operator to extract a JSON object/array or ->> operator to extract as text.

Syntax:

  • -> returns JSONB (preserves JSON type)
  • ->> returns TEXT (extracts value as string)

Code Examples:

-- Extract nested JSONB object
SELECT data -> 'user' FROM users;

-- Extract as text (most common for WHERE clauses)
SELECT data ->> 'email' FROM users;

-- Nested key access
SELECT data -> 'user' ->> 'name' FROM users;

-- In WHERE clause
SELECT * FROM users WHERE data ->> 'status' = 'active';

-- Array element access (0-indexed)
SELECT data -> 'tags' -> 0 FROM users;

Path-based extraction for deep nesting:

-- #> returns JSONB, #>> returns TEXT
SELECT data #> '{user,address,city}' FROM users;
SELECT data #>> '{user,address,city}' FROM users;

Performance:

  • Use GIN index for JSONB columns when filtering: CREATE INDEX idx_data ON users USING GIN (data);
  • For specific keys: CREATE INDEX idx_email ON users ((data ->> 'email'));

Compatibility: Available since PostgreSQL 9.4 (JSONB introduction).

Source: PostgreSQL Official Documentation - JSON Functions and Operators (https://www.postgresql.org/docs/current/functions-json.html)

95% confidence
A

PostgreSQL: Aggregate Rows into a JSON Array

Use json_agg() or jsonb_agg() to aggregate rows into a JSON array.

Basic Syntax

SELECT json_agg(column_name) FROM table_name;

Functions

  • json_agg(expression) - Returns a JSON array (text-based)
  • jsonb_agg(expression) - Returns a JSONB array (binary, faster for processing)

json_agg() is available since PostgreSQL 9.2; jsonb_agg() since PostgreSQL 9.4.

Examples

Aggregate a single column:

SELECT json_agg(name) FROM users;
-- Result: ["Alice", "Bob", "Carol"]

Aggregate entire rows as JSON objects:

SELECT json_agg(users) FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

Aggregate specific fields into objects:

SELECT json_agg(json_build_object('id', id, 'name', name)) FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

With GROUP BY:

SELECT department, json_agg(name) 
FROM employees 
GROUP BY department;
-- Returns one array per department

Order elements within the array:

SELECT json_agg(name ORDER BY created_at DESC) FROM users;

Key Differences

  • Use jsonb_agg() for data you'll query/process further (supports indexing, operators)
  • Use json_agg() for data you'll immediately output as text
  • Both return NULL for empty input sets (not an empty array)

Handling NULLs

NULL values are included in the array as null:

SELECT json_agg(email) FROM users WHERE email IS NULL;
-- Result: [null, null]  (one JSON null per matching row)

Source: PostgreSQL 17 Official Documentation - Aggregate Functions (9.21)

95% confidence
A

Use jsonb_set() to update a specific key in a JSONB column in PostgreSQL.

Syntax:

jsonb_set(target jsonb, path text[], new_value jsonb [, create_if_missing boolean])

Parameters:

  • target: The JSONB column/value to modify
  • path: Text array specifying the key location (e.g., '{key}' for top-level, '{parent,child}' for nested)
  • new_value: The new JSONB value (must be valid JSONB, use to_jsonb() or quotes for literals)
  • create_if_missing: Optional boolean (default true); creates the key if it doesn't exist

Example - Update top-level key:

UPDATE users 
SET data = jsonb_set(data, '{email}', '"new@example.com"')
WHERE id = 1;

Example - Update nested key:

UPDATE users 
SET data = jsonb_set(data, '{address,city}', '"San Francisco"')
WHERE id = 1;

Example - Update with dynamic value:

UPDATE products
SET attributes = jsonb_set(attributes, '{price}', to_jsonb(99.99))
WHERE id = 100;

Alternative - Concatenation operator (||) for top-level keys:

UPDATE users 
SET data = data || '{"email": "new@example.com"}'
WHERE id = 1;

Modern approach - Subscripting (PostgreSQL 14+):

UPDATE users 
SET data['email'] = '"new@example.com"'
WHERE id = 1;

Important: The new_value parameter must be valid JSONB. Use double quotes for string literals within single quotes ('"string"') or use to_jsonb() for non-string values.

Sources:

95% confidence
A

The ? operator in PostgreSQL checks if a single text string exists as a top-level key in a JSONB object or as an element in a JSONB array.

Syntax:

jsonb_column ? 'key_name'

Behavior:

  • For JSONB objects: Returns true if the string exists as a top-level key
  • For JSONB arrays: Returns true if the string exists as an array element
  • Case-sensitive matching
  • Only checks top-level keys/elements (not nested)

Examples:

-- Check for key in object
SELECT '{"name": "John", "age": 30}'::jsonb ? 'name';  -- true
SELECT '{"name": "John", "age": 30}'::jsonb ? 'email'; -- false

-- Check for element in array
SELECT '["apple", "banana", "cherry"]'::jsonb ? 'banana';  -- true
SELECT '["apple", "banana", "cherry"]'::jsonb ? 'orange';  -- false

-- Nested keys are NOT matched
SELECT '{"user": {"name": "John"}}'::jsonb ? 'name';  -- false (name is nested)

Related operators:

  • ?| - exists ANY of the array keys (OR logic)
  • ?& - exists ALL of the array keys (AND logic)

Available since: PostgreSQL 9.4 (when JSONB was introduced)

Source: PostgreSQL official documentation - JSONB operators
https://www.postgresql.org/docs/current/functions-json.html

95% confidence
A

PostgreSQL json_agg Function

json_agg is a PostgreSQL aggregate function that combines multiple rows into a single JSON array. Available since PostgreSQL 9.2.

Syntax

json_agg(expression [ORDER BY ...])

What It Does

  • Takes values from multiple rows and aggregates them into a JSON array
  • Returns type json
  • NULL input values are included as JSON null in the array
  • Returns NULL if there are no input rows (not an empty array)

Basic Examples

Aggregate primitive values:

SELECT json_agg(name) FROM users;
-- Result: ["Alice", "Bob", "Charlie"]

Aggregate row objects:

SELECT json_agg(users) FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

Build custom objects:

SELECT json_agg(json_build_object('id', id, 'name', name)) 
FROM users;
-- Result: [{"id":1,"name":"Alice"}, {"id":2,"name":"Bob"}]

With ORDER BY:

SELECT json_agg(name ORDER BY created_at DESC) FROM users;
-- Result: Ordered array ["Charlie", "Bob", "Alice"]

With GROUP BY:

SELECT department, json_agg(employee_name)
FROM employees
GROUP BY department;
-- Result: Multiple rows, each with department and its employees array

Key Behavior

  • For jsonb output, use jsonb_agg instead (available since PostgreSQL 9.4)
  • Empty result set returns NULL, not []. Use COALESCE(json_agg(...), '[]'::json) for empty array
  • Does not automatically deduplicate values

Source: PostgreSQL Official Documentation for Aggregate Functions (versions 9.2+)

95% confidence
A

PostgreSQL 17 JSON_TABLE Function

JSON_TABLE converts JSON data into a relational table format, allowing you to query JSON as if it were regular SQL rows and columns.

Basic Syntax

JSON_TABLE(
    json_expression,
    json_path_expression
    COLUMNS (
        column_name type PATH json_path [DEFAULT value ON EMPTY] [DEFAULT value ON ERROR]
    )
)

Examples

-- Basic usage: extract array elements as rows
SELECT * FROM JSON_TABLE(
    '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'::jsonb,
    '$[*]'
    COLUMNS (
        name TEXT PATH '$.name',
        age INTEGER PATH '$.age'
    )
) AS jt;

-- Result:
-- name  | age
-- Alice | 30
-- Bob   | 25

-- Nested data with error handling
SELECT * FROM JSON_TABLE(
    '{"users": [{"id": 1, "profile": {"email": "[email protected]"}}]}'::jsonb,
    '$.users[*]'
    COLUMNS (
        user_id INTEGER PATH '$.id',
        email TEXT PATH '$.profile.email' DEFAULT 'unknown' ON EMPTY
    )
) AS users;

-- Use with actual table data
SELECT o.order_id, items.*
FROM orders o,
    JSON_TABLE(
        o.line_items,
        '$[*]'
        COLUMNS (
            product_id INTEGER PATH '$.product_id',
            quantity INTEGER PATH '$.qty',
            price NUMERIC PATH '$.price'
        )
    ) AS items;

Column Options

  • PATH - JSON path used to extract the column's value
  • DEFAULT ... ON EMPTY - value used when the path returns nothing
  • DEFAULT ... ON ERROR - value used when extraction fails
  • FOR ORDINALITY - row number counter
  • EXISTS - boolean: does the path exist?

Related SQL/JSON Functions (PG17)

-- JSON_EXISTS: check if path exists
SELECT JSON_EXISTS('{"a": 1}'::jsonb, '$.a');  -- true

-- JSON_QUERY: extract JSON fragment
SELECT JSON_QUERY('{"a": {"b": 1}}'::jsonb, '$.a');  -- {"b": 1}

-- JSON_VALUE: extract scalar value
SELECT JSON_VALUE('{"name": "test"}'::jsonb, '$.name');  -- test

Source: PostgreSQL 17 Documentation - JSON Functions
https://www.postgresql.org/docs/17/functions-json.html

95% confidence
A

jsonb_array_length()

jsonb_array_length(jsonb) returns the number of elements in the outermost JSON array as an integer.

Syntax

jsonb_array_length(target jsonb) → integer

Behavior

  • Returns the count of top-level array elements
  • Raises error 22023 ("cannot get array length of a non-array") if the input is not a JSON array
  • Nested arrays count as single elements
  • Returns 0 for empty arrays []

Examples

-- Basic usage
SELECT jsonb_array_length('[1,2,3,4,5]');
-- Returns: 5

-- Empty array
SELECT jsonb_array_length('[]');
-- Returns: 0

-- Nested arrays count as single elements
SELECT jsonb_array_length('[[1,2],[3,4],[5,6]]');
-- Returns: 3

-- Error on non-array
SELECT jsonb_array_length('{"a":1}');
-- ERROR: cannot get array length of a non-array

Related Functions

  • json_array_length(json) - identical function for json type (returns same results but operates on json type)
  • jsonb_array_elements(jsonb) - expands array to rows

Source: PostgreSQL Official Documentation (available since PostgreSQL 9.4+, current through PostgreSQL 17)

95% confidence
A

jsonb_object_keys function in PostgreSQL

jsonb_object_keys(jsonb) is a PostgreSQL function that returns the set of top-level keys from a JSONB object as a set of text values.

Syntax

jsonb_object_keys(target jsonb) → setof text

Behavior

  • Returns only top-level keys (not nested keys)
  • Returns keys as individual rows (set-returning function)
  • Returns keys in JSONB's internal storage order (sorted by key length, then bytewise), not insertion order
  • Throws error if argument is not a JSONB object (e.g., if it's an array or scalar)

Example

SELECT jsonb_object_keys('{"name": "Alice", "age": 30, "address": {"city": "NYC"}}'::jsonb);

Result:

 jsonb_object_keys
-------------------
 age
 name
 address
(3 rows)

Key Details

  • Available since: PostgreSQL 9.4 (when JSONB type was introduced)
  • Return type: setof text (can be used in FROM clause or with aggregate functions)
  • Companion function: json_object_keys() exists for JSON type (not JSONB)

Common Usage Patterns

-- Get array of keys (set-returning functions cannot be nested inside aggregates,
-- so expand the keys in FROM and aggregate the result)
SELECT array_agg(k) FROM jsonb_object_keys('{"name": "Alice", "age": 30}'::jsonb) AS k;

-- Check if specific key exists (prefer ? operator instead)
SELECT 'name' IN (SELECT jsonb_object_keys(data)) FROM table_name;

Source: PostgreSQL Official Documentation - JSON Functions and Operators (v9.4+)

95% confidence
A

jsonb_array_elements function in PostgreSQL

jsonb_array_elements(jsonb) is a PostgreSQL set-returning function that expands a JSON array into a set of rows, one row per array element. Each element is returned as type jsonb.

Function Signature:

jsonb_array_elements(jsonb) → SETOF jsonb

Behavior:

  • Input: A JSONB array
  • Output: Set of rows, each containing one array element as jsonb
  • If input is NOT an array, raises error: ERROR: cannot extract elements from a scalar/object
  • If input is NULL, returns empty set (0 rows)
  • Preserves the original data type of each element (string, number, boolean, null, object, array)

Code Example:

-- Basic usage
SELECT jsonb_array_elements('[1, "text", true, {"key": "value"}]'::jsonb);
-- Returns 4 rows:
-- 1
-- "text"
-- true
-- {"key": "value"}

-- Common pattern: expand array and access properties
SELECT elem->>'name' AS name, elem->>'age' AS age
FROM jsonb_array_elements('[
  {"name": "Alice", "age": 30},
  {"name": "Bob", "age": 25}
]'::jsonb) AS elem;
-- Returns:
--   name  | age
-- --------+-----
--  Alice  | 30
--  Bob    | 25

Related Functions:

  • jsonb_array_elements_text(jsonb) - returns elements as text instead of jsonb
  • json_array_elements(json) - same function for json type

Availability: PostgreSQL 9.4+ (when JSONB was introduced); json_array_elements() for the json type dates to 9.3.

Source: PostgreSQL 16 Official Documentation - JSON Functions

95% confidence

Parallel Query

29 questions
A

Resource limits such as work_mem are applied individually to each worker. Each parallel worker gets the full work_mem allocation, so a query with 4 workers and work_mem set to 32MB could use up to 160MB in total ((4 workers + 1 leader) × 32MB).
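
As a quick sanity check, the settings involved can be inspected and the worst case computed by hand (the values shown in the comments are only examples):

-- Per-node memory budget and parallel worker cap
SHOW work_mem;                           -- e.g. 32MB
SHOW max_parallel_workers_per_gather;    -- e.g. 4
-- Worst case for a single parallel sort/hash step:
-- (workers + leader) × work_mem = (4 + 1) × 32MB = 160MB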

95% confidence
A

No. The system must not be running in single-user mode for parallel query plans to be generated, since the entire database system is running as a single process in this situation and no background workers will be available.

95% confidence

Functions - Sequence

29 questions

Data Types - Domain

29 questions
A

A domain is a user-defined data type that is based on another underlying type, and can optionally have constraints that restrict its valid values to a subset of what the underlying type would allow.
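
A minimal sketch (the domain and table names are illustrative):

-- Domain over text, restricted by a CHECK constraint
CREATE DOMAIN us_postal_code AS text
    CHECK (VALUE ~ '^\d{5}$' OR VALUE ~ '^\d{5}-\d{4}$');

-- The domain can then be used like any other data type
CREATE TABLE addresses (
    id  serial PRIMARY KEY,
    zip us_postal_code NOT NULL
);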

95% confidence
A

The default expression will be used in any insert operation that does not specify a value for the column. A column-level default overrides any domain default, and the domain default in turn overrides any default value associated with the underlying data type.
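
A short sketch of that precedence (names are illustrative):

-- Domain with its own default
CREATE DOMAIN status_code AS text DEFAULT 'new';

-- The column-level default overrides the domain default
CREATE TABLE tickets (
    id     serial PRIMARY KEY,
    status status_code DEFAULT 'open'
);

INSERT INTO tickets DEFAULT VALUES;
SELECT status FROM tickets;  -- 'open', not 'new'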

95% confidence

Internals > TOAST

29 questions
A

No, the TOAST management code ensures that no in-memory TOAST pointer datum can accidentally get stored on disk. In-memory TOAST pointers are automatically expanded to normal in-line varlena values before storage.

95% confidence
A

Every TOAST table has the columns chunk_id (an OID identifying the particular TOASTed value), chunk_seq (a sequence number for the chunk within its value), and chunk_data (the actual data of the chunk).
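
To see this for a particular table, look up its TOAST relation in pg_class (the table name is illustrative; tables without TOAST-able columns have no TOAST table):

-- Find the TOAST table backing a given table
SELECT reltoastrelid::regclass AS toast_table
FROM pg_class
WHERE relname = 'my_table';

-- Describing that relation (e.g. \d pg_toast.pg_toast_<oid> in psql)
-- shows exactly the three columns chunk_id, chunk_seq, chunk_data.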

95% confidence
A

The four TOAST storage strategies are: PLAIN (prevents compression or out-of-line storage), EXTENDED (allows both compression and out-of-line storage, the default), EXTERNAL (allows out-of-line storage but not compression), and MAIN (allows compression but not out-of-line storage).
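
The strategy in effect for each column can be inspected in pg_attribute and changed with ALTER TABLE (table and column names are illustrative):

-- attstorage codes: p = PLAIN, x = EXTENDED, e = EXTERNAL, m = MAIN
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'my_table'::regclass AND attnum > 0;

-- Switch one column to out-of-line storage without compression
ALTER TABLE my_table ALTER COLUMN payload SET STORAGE EXTERNAL;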

95% confidence
A

TOAST usurps two bits of the varlena length word (the high-order bits on big-endian machines, the low-order bits on little-endian machines), thereby limiting the logical size to 1 GB. With 2 bits taken from the 32-bit length word, only 30 bits remain for storing the actual size, giving 2³⁰ = 1 GB.
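
The arithmetic can be checked directly:

-- 30 remaining bits → 2^30 bytes
SELECT pg_size_pretty((2 ^ 30)::bigint);  -- 1024 MB, i.e. 1 GB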

95% confidence

Transactions > Isolation Level Behaviors

29 questions
A

No, predicate locks in PostgreSQL do not cause any blocking and therefore cannot play any part in causing a deadlock. They are used only to identify and flag dependencies among concurrent SERIALIZABLE transactions.
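
These predicate locks can be observed while a SERIALIZABLE transaction is active; they appear in pg_locks with mode SIReadLock:

-- Run while another session has a SERIALIZABLE transaction in progress
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE mode = 'SIReadLock';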

95% confidence
A

SQLSTATE 23505 (unique_violation) and 23P01 (exclusion_violation) might be appropriate to retry in some cases, as they may represent transient failures when application logic selects new keys concurrently. However, they might also represent persistent error conditions requiring application-level judgment.
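
A minimal PL/pgSQL sketch of handling one of these conditions in application-side logic (the table and scenario are hypothetical):

DO $$
BEGIN
    -- Hypothetical table with a unique constraint on slot_id
    INSERT INTO reservations (slot_id) VALUES (42);
EXCEPTION
    WHEN unique_violation THEN
        -- SQLSTATE 23505: decide whether to retry with a new key or give up
        RAISE NOTICE 'slot already taken';
END $$;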

95% confidence

Server Configuration > Vacuuming Settings

29 questions

transactions_concurrency

28 questions
A

When to Use Advisory Locks in PostgreSQL

Use PostgreSQL advisory locks when you need application-level mutual exclusion that outlives individual transactions or doesn't correspond to table rows.

Specific Use Cases

1. Preventing Concurrent Job Execution

-- Try to acquire lock, skip if already running
SELECT pg_try_advisory_lock(12345);
-- Returns true if acquired, false if held by another session

2. Distributed Task Queue (Single Consumer per Task)

-- Workers compete for tasks
SELECT task_id 
FROM queue 
WHERE pg_try_advisory_lock(task_id) 
LIMIT 1;
-- Only one worker gets the lock per task_id.
-- Caution: the planner may evaluate pg_try_advisory_lock() for rows that are
-- not ultimately returned, so extra locks can end up being held.

3. Application-Level Singletons

-- Ensure only one maintenance process runs
SELECT pg_advisory_lock(1); -- Blocks until available
-- Perform maintenance
SELECT pg_advisory_unlock(1);

When NOT to Use Advisory Locks

  • Row-level concurrency: Use SELECT ... FOR UPDATE instead
  • Schema changes: Use regular table locks
  • Simple counters: Use UPDATE with row locks

Critical Properties

  • Session-scoped by default: pg_advisory_lock() held until explicit unlock or session end
  • Transaction-scoped variant: pg_advisory_xact_lock() auto-releases at transaction end
  • 64-bit key space: keys are bigint values, or two 32-bit integers via pg_advisory_lock(int4, int4)
  • Deadlock detection: advisory lock waits go through the normal lock manager, so deadlocks involving them are detected like any other lock deadlock
  • Visible in pg_locks: shown with locktype = 'advisory'

Version: Advisory locks were introduced in PostgreSQL 8.2; transaction-scoped variants (pg_advisory_xact_lock) were added in 9.1.

Source: PostgreSQL 17 Official Documentation - Explicit Locking (§13.3.5)

95% confidence
A

What is a Savepoint in PostgreSQL?

A savepoint is a named checkpoint within a PostgreSQL transaction that allows you to rollback part of a transaction without aborting the entire transaction.

Core Concept

Savepoints establish a point to which you can later roll back, discarding all database changes made after that savepoint while preserving changes made before it. The outer transaction remains active.

Syntax

-- Create a savepoint
SAVEPOINT savepoint_name;

-- Roll back to a savepoint (discards changes after the savepoint)
ROLLBACK TO SAVEPOINT savepoint_name;

-- Remove a savepoint (keeps all changes)
RELEASE SAVEPOINT savepoint_name;

Practical Example

BEGIN;

INSERT INTO accounts (id, balance) VALUES (1, 1000);

SAVEPOINT my_savepoint;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- Error occurs or you change your mind

ROLLBACK TO SAVEPOINT my_savepoint;
-- The UPDATE is undone, but the INSERT remains

COMMIT;
-- Only the INSERT is committed

Key Facts

  • Naming: Savepoint names are case-insensitive identifiers
  • Nesting: Savepoints can be nested; rolling back to an outer savepoint automatically releases all inner savepoints
  • Scope: Savepoints only exist within a transaction block (between BEGIN and COMMIT/ROLLBACK)
  • Subtransactions: Internally implemented as subtransactions (performance consideration for high volumes)
  • Reuse: You can reuse a savepoint name; the older savepoint is kept but becomes inaccessible, and ROLLBACK TO / RELEASE operate on the most recent savepoint with that name

Use Cases

  1. Error recovery in complex transactions
  2. Conditional logic where some operations may need to be undone
  3. Batch processing where partial failures shouldn't abort the entire batch

Source: PostgreSQL 17 Official Documentation - SQL Commands: SAVEPOINT, ROLLBACK TO SAVEPOINT, RELEASE SAVEPOINT

95% confidence
A

PostgreSQL uses a wait-for graph algorithm with optimistic waiting to detect deadlocks.

How It Works

  1. Optimistic Waiting: When a process cannot acquire a lock immediately, it waits without performing a deadlock check. Instead, it sets a timer for deadlock_timeout milliseconds (default: 1 second).

  2. Triggered Detection: If the timer expires before the lock is granted, PostgreSQL runs the deadlock detection algorithm. This avoids the expensive check on every lock wait.

  3. Wait-For Graph (WFG): PostgreSQL represents processes as nodes in a directed graph. An edge exists from process A to process B when A is waiting for a lock that B holds. A deadlock exists when there's a cycle in this graph.

  4. FindLockCycle Function: The core detection algorithm recursively scans outward from the waiting process across wait-for edges. If it finds a cycle involving the starting process, a deadlock is detected.

  5. Hard vs Soft Edges:

    • Hard edges: Process B already holds a lock conflicting with A's request
    • Soft edges: Process A is waiting behind process B in the lock queue with conflicting requests
  6. Resolution: When a deadlock is detected, PostgreSQL attempts to resolve it by:

    • If only hard edges exist: Abort one transaction
    • If soft edges exist: Attempt to reorder the lock queue using topological sorting to break the cycle without aborting
    • Which transaction aborts is unpredictable and should not be relied upon

Configuration

The deadlock_timeout parameter controls when checks occur. Increasing it reduces overhead from unnecessary checks but delays reporting real deadlocks.
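
Detected deadlocks are counted per database, which is a cheap way to check whether they occur in practice:

-- Deadlocks detected since statistics were last reset
SELECT datname, deadlocks
FROM pg_stat_database
ORDER BY deadlocks DESC;

-- Current detection delay
SHOW deadlock_timeout;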

Sources:

95% confidence
A

Rollback to a Savepoint in PostgreSQL

Use the ROLLBACK TO SAVEPOINT command to roll back all commands executed after a savepoint was established, while keeping the transaction active.

Syntax:

ROLLBACK TO SAVEPOINT savepoint_name;

Complete Example:

BEGIN;

INSERT INTO accounts (id, balance) VALUES (1, 1000);
SAVEPOINT my_savepoint;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
DELETE FROM accounts WHERE id = 2;

-- Undo UPDATE and DELETE, keep INSERT
ROLLBACK TO SAVEPOINT my_savepoint;

COMMIT; -- Commits the INSERT only

Key Behaviors:

  • All commands after the savepoint are undone
  • The transaction remains ACTIVE (not aborted)
  • The savepoint remains defined and can be rolled back to again
  • Savepoints created after my_savepoint are automatically destroyed
  • You can execute new commands after rollback and still commit/rollback the transaction

Alternative Syntax:

ROLLBACK TO my_savepoint;  -- SAVEPOINT keyword is optional

To Release a Savepoint (without rollback):

RELEASE SAVEPOINT savepoint_name;  -- Frees resources, can't rollback to it anymore

Compatibility: Savepoints have been available since PostgreSQL 8.0 and work in all currently supported versions.

Source: PostgreSQL Official Documentation - SQL Commands: SAVEPOINT, ROLLBACK TO SAVEPOINT

95% confidence
A

PostgreSQL max_prepared_transactions Setting

max_prepared_transactions controls the maximum number of transactions that can be in the "prepared" state simultaneously for two-phase commit (2PC) operations.

Exact Configuration

Default value: 0 (two-phase commit disabled)

Valid range: 0 to 262143

Requires restart: Yes (server must be restarted for changes to take effect)

Setting the Value

In postgresql.conf:

max_prepared_transactions = 100

Or at server start:

postgres -c max_prepared_transactions=100

Critical Rules

  1. Must be ≥ max_connections if you use prepared transactions in normal operations
  2. Must be > 0 to enable PREPARE TRANSACTION command
  3. Each prepared transaction consumes shared memory (~600 bytes + locks)
  4. Standby servers must have max_prepared_transactions ≥ primary server's value

Usage Example

-- Begin transaction
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Prepare for 2PC
PREPARE TRANSACTION 'txn_001';

-- Later: commit or rollback
COMMIT PREPARED 'txn_001';
-- OR
ROLLBACK PREPARED 'txn_001';

Source

PostgreSQL 17 official documentation: Server Configuration - Resource Consumption

95% confidence
A

Prepared Transactions in PostgreSQL

Prepared transactions are PostgreSQL's implementation of two-phase commit (2PC), allowing a transaction to be prepared for commit and later either committed or rolled back in a separate session.

How They Work

  1. PREPARE TRANSACTION saves all transaction state to disk with a unique identifier
  2. The connection can disconnect
  3. Another session can COMMIT PREPARED or ROLLBACK PREPARED using that identifier

Syntax

-- Phase 1: Prepare the transaction
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'transfer_001';

-- Phase 2: Commit or rollback (can be from different session)
COMMIT PREPARED 'transfer_001';
-- OR
ROLLBACK PREPARED 'transfer_001';

Critical Configuration

Prepared transactions are disabled by default. Enable via:

-- postgresql.conf
max_prepared_transactions = 100  -- Default is 0 (disabled)

Must be set ≥ max_connections if using external transaction managers. Requires PostgreSQL restart.

Key Constraints

  • Transaction identifier: max 200 bytes
  • Prepared transactions hold locks until resolved
  • Each pending prepared transaction occupies one of the max_prepared_transactions slots (not a connection slot)
  • Not automatically cleaned up—must be explicitly committed/rolled back
  • A transaction cannot be prepared if it has:
    • executed LISTEN, UNLISTEN, or NOTIFY
    • operated on temporary tables or the session's temporary namespace
    • created any cursors WITH HOLD

Use Cases

  • Distributed transactions across multiple databases
  • External transaction managers (e.g., Java JTA, XA protocol)
  • NOT recommended for application-level use (use regular transactions instead)

Monitoring

-- View prepared transactions
SELECT * FROM pg_prepared_xacts;

Source: PostgreSQL 17 Official Documentation - SQL Commands: PREPARE TRANSACTION

95% confidence
A

SKIP LOCKED in PostgreSQL

SKIP LOCKED is a row-level locking clause (available since PostgreSQL 9.5) that causes a SELECT ... FOR UPDATE/SHARE query to skip over rows that are already locked by other transactions, instead of waiting for those locks to be released.

Syntax

SELECT * FROM table_name
WHERE conditions
FOR UPDATE SKIP LOCKED;

Exact Behavior

  • Without SKIP LOCKED: Query blocks/waits when encountering locked rows
  • With SKIP LOCKED: Query immediately skips locked rows and returns only unlocked rows
  • Must be combined with FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, or FOR KEY SHARE
  • Returns non-deterministic results (which rows are locked varies by concurrent transactions)

Primary Use Case: Job Queue Pattern

-- Worker process claims next available job
BEGIN;
SELECT * FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- Process the job, then:
UPDATE jobs SET status = 'completed' WHERE id = ...;
COMMIT;

Multiple workers can run this concurrently without conflicts—each gets a different unlocked row.

Critical Details

  • Transaction required: The lock is held until transaction commits/rolls back
  • Works with LIMIT: Common pattern is LIMIT 1 to get one work item
  • Not with aggregation or set operations: like other locking clauses, it cannot be applied where returned rows don't correspond to individual table rows (e.g., DISTINCT, GROUP BY, aggregates, UNION)

Source: PostgreSQL Official Documentation v9.5+ - Explicit Locking section

95% confidence
A

Non-Repeatable Read in PostgreSQL

A non-repeatable read occurs when a transaction reads the same row twice and gets different data because another transaction modified and committed that row between the two reads.

Example Scenario

Transaction 1:

BEGIN;
SELECT balance FROM accounts WHERE id = 1;  -- Returns 100

-- Transaction 2 updates and commits here

SELECT balance FROM accounts WHERE id = 1;  -- Returns 150 (different!)
COMMIT;

Transaction 2 (executes between T1's reads):

BEGIN;
UPDATE accounts SET balance = 150 WHERE id = 1;
COMMIT;

Transaction 1 read the same row twice but got different values (100, then 150) - this is a non-repeatable read.

PostgreSQL Isolation Levels

Non-repeatable reads are prevented at these isolation levels:

  • REPEATABLE READ
  • SERIALIZABLE

Non-repeatable reads are allowed at:

  • READ COMMITTED (PostgreSQL's default isolation level)

Prevention Example

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;  -- Returns 100
-- Even if another transaction commits changes...
SELECT balance FROM accounts WHERE id = 1;  -- Still returns 100 (snapshot isolation)
COMMIT;

PostgreSQL uses snapshot isolation for REPEATABLE READ, guaranteeing each transaction sees a consistent snapshot of data from when the transaction started.

Source: PostgreSQL 16 Documentation, Section 13.2 "Transaction Isolation"

95% confidence
A

PostgreSQL Lock Types

PostgreSQL has 8 table-level lock modes and 4 row-level lock modes (as of PostgreSQL 16).

Table-Level Lock Modes (ordered by strength):

  1. ACCESS SHARE - Acquired by SELECT queries. Conflicts only with ACCESS EXCLUSIVE.

  2. ROW SHARE - Acquired by SELECT FOR UPDATE and SELECT FOR SHARE. Conflicts with EXCLUSIVE and ACCESS EXCLUSIVE.

  3. ROW EXCLUSIVE - Acquired by INSERT, UPDATE, DELETE. Conflicts with SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE.

  4. SHARE UPDATE EXCLUSIVE - Acquired by VACUUM, ANALYZE, CREATE INDEX CONCURRENTLY. Protects against concurrent schema changes. Conflicts with itself and higher modes.

  5. SHARE - Acquired by CREATE INDEX (non-concurrent). Conflicts with ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE (but not with itself).

  6. SHARE ROW EXCLUSIVE - Acquired by CREATE TRIGGER. Conflicts with every mode except ACCESS SHARE and ROW SHARE (including itself).

  7. EXCLUSIVE - Acquired by REFRESH MATERIALIZED VIEW CONCURRENTLY. Conflicts with ROW SHARE and higher modes.

  8. ACCESS EXCLUSIVE - Acquired by DROP TABLE, TRUNCATE, ALTER TABLE, LOCK TABLE (default). Conflicts with ALL lock modes. Only one session can hold this lock.

Row-Level Lock Modes:

  1. FOR KEY SHARE - Lightest row lock. Blocks only FOR UPDATE.

  2. FOR SHARE - Blocks UPDATE, DELETE, FOR UPDATE, FOR NO KEY UPDATE.

  3. FOR NO KEY UPDATE - Blocks DELETE, FOR UPDATE, FOR SHARE, FOR NO KEY UPDATE.

  4. FOR UPDATE - Strongest row lock. Blocks all other row locks.

Explicit Locking Syntax:

-- Table-level explicit lock
LOCK TABLE table_name IN ACCESS EXCLUSIVE MODE;

-- Row-level locks (automatically acquired, or explicit via SELECT)
SELECT * FROM table_name WHERE id = 1 FOR UPDATE;
SELECT * FROM table_name WHERE id = 1 FOR SHARE;
SELECT * FROM table_name WHERE id = 1 FOR NO KEY UPDATE;
SELECT * FROM table_name WHERE id = 1 FOR KEY SHARE;

Advisory Locks:

PostgreSQL also provides advisory locks (application-level):

  • Session-level: pg_advisory_lock(key), pg_advisory_unlock(key)
  • Transaction-level: pg_advisory_xact_lock(key) (auto-released at transaction end)

Source: PostgreSQL 16 Official Documentation - Chapter 13.3: Explicit Locking

95% confidence
A

Serialization Anomaly in PostgreSQL

A serialization anomaly is a phenomenon where the result of executing concurrent transactions at the SERIALIZABLE isolation level could not have occurred if those transactions had been executed one at a time in any serial order.

What PostgreSQL Does

PostgreSQL's SERIALIZABLE isolation level (implemented using Serializable Snapshot Isolation since version 9.1) prevents serialization anomalies by detecting potential conflicts and raising an error:

ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
HINT: The transaction might succeed if retried.

When this error occurs, the transaction is rolled back and must be retried by the application.

Classic Example: Write Skew

-- Two bank accounts, constraint: sum must stay >= 0
-- Account A: $100, Account B: $100

-- Transaction 1:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts; -- reads $200
UPDATE accounts SET balance = balance - 150 WHERE id = 'A';
COMMIT;

-- Transaction 2 (concurrent):
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts; -- reads $200
UPDATE accounts SET balance = balance - 150 WHERE id = 'B';
COMMIT;

-- Result: A = -$50, B = -$50 (violates constraint)
-- At SERIALIZABLE: One transaction aborts with serialization error

Implementation Details

  • Predicate locks: PostgreSQL uses Serializable Snapshot Isolation (SSI) with predicate locking to detect dangerous patterns (rw-conflicts forming cycles)
  • Performance cost: SSI adds overhead relative to REPEATABLE READ for conflict tracking; the impact is workload-dependent
  • Memory: Uses max_pred_locks_per_transaction (default: 64) × max_connections

When It Matters

Use SERIALIZABLE when you need true serializability guarantees and can handle retry logic. Lower isolation levels (READ COMMITTED, REPEATABLE READ) do not detect or prevent serialization anomalies.

Source: PostgreSQL Documentation v16, Section 13.2.3 "Serializable Isolation Level"

95% confidence
A

pg_locks View in PostgreSQL

The pg_locks view is a system catalog view that displays information about all currently held locks in the PostgreSQL database cluster and their associated processes.

Core Details

  • Type: System view (not a table)
  • Scope: Database cluster-wide (shows locks from all databases)
  • Access: Readable by all users; it shows locks held and awaited by all sessions across the cluster

Key Columns

SELECT 
    locktype,        -- Type of lockable object (relation, tuple, transactionid, etc.)
    database,        -- OID of database (NULL for shared objects)
    relation,        -- OID of locked relation (NULL if not a relation lock)
    page,            -- Page number within relation (NULL if not a page lock)
    tuple,           -- Tuple number within page (NULL if not a tuple lock)
    transactionid,   -- Transaction ID (NULL if not a transaction lock)
    pid,             -- Process ID of server process holding/awaiting lock
    mode,            -- Lock mode (e.g., AccessShareLock, RowExclusiveLock, ExclusiveLock)
    granted          -- TRUE if lock is held, FALSE if awaiting lock
FROM pg_locks;

Common Lock Modes (mode column)

  • AccessShareLock - Acquired by SELECT queries
  • RowShareLock - Acquired by SELECT FOR UPDATE/FOR SHARE
  • RowExclusiveLock - Acquired by INSERT, UPDATE, DELETE
  • ShareUpdateExclusiveLock - Acquired by VACUUM, CREATE INDEX CONCURRENTLY
  • ShareLock - Acquired by CREATE INDEX
  • ExclusiveLock - Blocks all concurrent access except AccessShareLock
  • AccessExclusiveLock - Acquired by DROP TABLE, TRUNCATE, REINDEX, VACUUM FULL

Practical Example - Finding Blocking Queries

-- Find waiting lock requests together with the sessions holding conflicting locks
SELECT 
    waiting.pid  AS waiting_pid,
    waiting.mode AS waiting_mode,
    waiting.locktype,
    blocking.pid AS blocking_pid,
    blocking.mode AS blocking_mode
FROM pg_locks AS waiting
JOIN pg_locks AS blocking 
    ON waiting.locktype = blocking.locktype
    AND waiting.database IS NOT DISTINCT FROM blocking.database
    AND waiting.relation IS NOT DISTINCT FROM blocking.relation
    AND waiting.transactionid IS NOT DISTINCT FROM blocking.transactionid
WHERE NOT waiting.granted 
    AND blocking.granted
    AND waiting.pid != blocking.pid;

-- Simpler alternative (PostgreSQL 9.6+): SELECT pg_blocking_pids(<pid>);

Version Notes

  • Available in all modern PostgreSQL versions (9.0+)
  • PostgreSQL 14+ added the waitstart column (timestamp when the lock wait began)

Source

PostgreSQL Official Documentation: pg_locks

95% confidence
A

What is a Deadlock in PostgreSQL?

A deadlock in PostgreSQL occurs when two or more transactions are waiting for each other to release locks, creating a circular dependency that prevents any of them from proceeding.

How It Happens

Example scenario:

-- Transaction 1
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- waits here for lock on id = 2
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Transaction 2 (concurrent)
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
-- waits here for lock on id = 1
UPDATE accounts SET balance = balance + 50 WHERE id = 1;

Transaction 1 holds a lock on row id=1 and waits for id=2. Transaction 2 holds a lock on row id=2 and waits for id=1. Neither can proceed.

PostgreSQL's Behavior

  • Detection: A backend that has waited on a lock for deadlock_timeout (default 1 second, configurable) runs the deadlock detection check; the check is not a periodic background task
  • Resolution: When detected, PostgreSQL aborts one transaction with error 40P01 (deadlock_detected)
  • Error message:
ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 67890; blocked by process 12346.
HINT: See server log for query details.

Prevention Strategies

  1. Acquire locks in consistent order across all transactions (e.g., always lock lower IDs first)
  2. Use explicit locking with SELECT ... FOR UPDATE to acquire all locks upfront
  3. Keep transactions short to minimize lock duration

Source: PostgreSQL 16 Official Documentation (Chapter 13.3 - Explicit Locking)

95% confidence
A

PostgreSQL prevents transaction ID wraparound through autovacuum and regular VACUUM operations that freeze old transaction IDs.

Primary Prevention Method

Enable autovacuum (enabled by default). PostgreSQL's autovacuum daemon automatically runs VACUUM operations to freeze transaction IDs before wraparound occurs.

Key Configuration Parameters

Set these in postgresql.conf:

  • autovacuum_freeze_max_age (default: 200,000,000 transactions) - Forces autovacuum when a table's pg_class.relfrozenxid exceeds this age. Autovacuum will run even if autovacuum is otherwise disabled.
  • vacuum_freeze_min_age (default: 50,000,000 transactions) - Minimum age before rows are frozen.
  • vacuum_freeze_table_age (default: 150,000,000 transactions) - Triggers aggressive whole-table scanning when exceeded (capped at 95% of autovacuum_freeze_max_age).

Manual Prevention

Run manual VACUUM operations:

-- Database-wide vacuum
VACUUM;

-- Specific table
VACUUM table_name;

-- Aggressive freeze (for emergency situations)
VACUUM FREEZE;

Important: Use standard VACUUM, not VACUUM FULL for routine maintenance. VACUUM FULL requires exclusive locks and is much slower.

Monitoring

Check transaction age to prevent wraparound:

-- Check database age
SELECT datname, age(datfrozenxid) FROM pg_database;

-- Check table ages
SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind = 'r';

PostgreSQL warns when XIDs reach 40 million transactions from wraparound point (out of 4 billion total).

Additional Safeguards

  • Terminate long-running transactions that hold old XIDs
  • Resolve or drop old prepared transactions
  • Remove inactive replication slots

Sources:

95% confidence
A

ROW EXCLUSIVE Lock in PostgreSQL

ROW EXCLUSIVE is a table-level lock mode in PostgreSQL that allows concurrent reads but prevents other transactions from acquiring locks that would modify the table's structure or exclusively lock it.

When It's Acquired

Automatically acquired by these commands:

  • UPDATE
  • DELETE
  • INSERT
  • MERGE

Manually acquired with:

LOCK TABLE table_name IN ROW EXCLUSIVE MODE;

What It Blocks

Blocks these lock modes:

  • SHARE
  • SHARE ROW EXCLUSIVE
  • EXCLUSIVE
  • ACCESS EXCLUSIVE

Allows these lock modes:

  • ROW SHARE (acquired by SELECT FOR UPDATE/SHARE)
  • ROW EXCLUSIVE (other concurrent INSERT/UPDATE/DELETE)

Practical Implications

  • Multiple transactions can execute INSERT, UPDATE, or DELETE concurrently on the same table
  • Concurrent SELECT queries work without blocking
  • Commands that need table-level exclusive access (like ALTER TABLE, DROP TABLE, TRUNCATE, VACUUM FULL) are blocked

Lock Hierarchy Position

In PostgreSQL's lock conflict matrix, ROW EXCLUSIVE is lock mode ID 3 out of 8 modes (weakest to strongest: ACCESS SHARE → ROW SHARE → ROW EXCLUSIVE → SHARE UPDATE EXCLUSIVE → SHARE → SHARE ROW EXCLUSIVE → EXCLUSIVE → ACCESS EXCLUSIVE).

Source: PostgreSQL 17 official documentation, Explicit Locking - Table-Level Locks

95% confidence
A

Setting Transaction Isolation Level in PostgreSQL

Method 1: SET TRANSACTION (Per-Transaction)

Set isolation level at the start of a transaction (before any data-modifying statement):

BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- your queries here
COMMIT;

Method 2: SET SESSION CHARACTERISTICS (Session-Level)

Set default for all subsequent transactions in the session:

SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Method 3: SET (Session-Level, Shorthand)

SET default_transaction_isolation = 'repeatable read';

Available Isolation Levels (PostgreSQL 9.1+)

  1. READ UNCOMMITTED - Treated as READ COMMITTED in PostgreSQL (no dirty reads)
  2. READ COMMITTED - Default level, prevents dirty reads
  3. REPEATABLE READ - Prevents dirty reads and non-repeatable reads
  4. SERIALIZABLE - Full isolation, prevents all anomalies

Shorthand: Setting the Level in BEGIN

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM accounts WHERE id = 1;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

Configuration File (postgresql.conf)

Set cluster-wide default:

default_transaction_isolation = 'read committed'

Requires server reload: pg_ctl reload or SELECT pg_reload_conf();

Important Notes

  • Must be set before first query in transaction (before SELECT, INSERT, UPDATE, DELETE)
  • Use single quotes for level names in SET commands
  • Names are case-insensitive but conventionally uppercase in SET TRANSACTION

Source: PostgreSQL 16 Official Documentation, Section 13.2 "Transaction Isolation"

95% confidence
A

PostgreSQL supports 4 transaction isolation levels as defined by the SQL standard:

  1. Read Uncommitted (treated as Read Committed)
  2. Read Committed (default)
  3. Repeatable Read
  4. Serializable

Key Details:

Read Uncommitted: PostgreSQL does not implement true Read Uncommitted. It behaves identically to Read Committed due to PostgreSQL's MVCC implementation.

Read Committed (default): Each statement sees a snapshot of data as of the start of that statement. Prevents dirty reads but allows non-repeatable reads and phantom reads.

Repeatable Read: All statements in a transaction see a snapshot as of the first non-transaction-control statement. Prevents dirty reads, non-repeatable reads, and phantom reads. May result in serialization failures (SQLSTATE 40001).

Serializable: Strictest level. Guarantees true serializability using Serializable Snapshot Isolation (SSI). May result in serialization failures that require retry logic.

Setting Isolation Level:

-- Set for current transaction
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- Or at session level
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Or inline with BEGIN
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Version:

Applies to PostgreSQL 9.1+ (SSI introduced in 9.1). Current through PostgreSQL 17.

Source: PostgreSQL official documentation - Chapter 13.2 "Transaction Isolation"

95% confidence
A

PostgreSQL Advisory Locks: pg_advisory_lock vs pg_try_advisory_lock

The key difference: blocking behavior.

  • pg_advisory_lock(key bigint) - BLOCKS until the lock is acquired. If another session holds the lock, the calling session waits indefinitely until the lock becomes available.

  • pg_try_advisory_lock(key bigint) - NON-BLOCKING. Returns immediately with a boolean: true if lock was acquired, false if lock is already held by another session.

Usage Examples

-- Blocking version - waits until lock is available
SELECT pg_advisory_lock(12345);
-- (execution pauses here if lock is held elsewhere)
-- No return value - succeeds silently when acquired

-- Non-blocking version - returns immediately
SELECT pg_try_advisory_lock(12345);
-- Returns: true (acquired) or false (already locked)

When to Use Each

Use pg_advisory_lock when:

  • You MUST ensure exclusive access (e.g., running migrations, processing exactly-once jobs)
  • Waiting for the lock is acceptable behavior

Use pg_try_advisory_lock when:

  • You want to skip work if another process is already doing it
  • You need timeout control (poll with application-level retry logic)
  • You want to avoid indefinite blocking

Critical Details

  • Both use session-level locks by default (released on connection close or explicit pg_advisory_unlock)
  • Transaction-level variants exist: pg_advisory_xact_lock (auto-released at transaction end)
  • Lock keys are 64-bit integers (range: -9223372036854775808 to 9223372036854775807)
  • Available since PostgreSQL 8.2
  • Locks are held in shared memory (max controlled by max_locks_per_transaction)

Source: PostgreSQL 16 Official Documentation - Advisory Locks

95% confidence
A

MVCC in PostgreSQL

MVCC (Multi-Version Concurrency Control) is PostgreSQL's core concurrency mechanism that allows multiple transactions to access the same data simultaneously without blocking each other by maintaining multiple versions of each row.

How It Works

Each row has hidden system columns that track versioning:

  • xmin: Transaction ID that inserted the row
  • xmax: Transaction ID that deleted/updated the row (0 if current)
  • ctid: Physical location (page, tuple)

When you UPDATE a row, PostgreSQL:

  1. Creates a new version of the row with a new xmin
  2. Marks the old version with xmax set to the updating transaction's ID
  3. Old versions remain visible to transactions that started before the update
-- View hidden columns
SELECT xmin, xmax, ctid, * FROM users WHERE id = 1;

Visibility Rules

Each transaction sees data based on its snapshot:

  • READ COMMITTED (default): New snapshot per statement
  • REPEATABLE READ: Single snapshot for entire transaction
  • SERIALIZABLE: Strictest isolation, prevents anomalies

A row version is visible to a transaction if:

  • xmin is committed AND before transaction's snapshot
  • xmax is 0 OR uncommitted OR after transaction's snapshot

Critical Implications

Vacuum Required: Old row versions accumulate as "dead tuples". autovacuum removes them, but under heavy UPDATE/DELETE load, manual VACUUM may be needed to prevent bloat.
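
Dead-tuple accumulation can be watched through the statistics views, for example:

-- Tables with the most dead tuples and when they were last vacuumed
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;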

No Locking on Reads: SELECT never blocks writes, writes never block reads (unlike traditional row-level locking databases).

Write Conflicts: Two transactions updating the same row cause the second to wait for the first; under REPEATABLE READ or SERIALIZABLE, the waiting transaction then fails with a serialization error once the first commits.

Source: PostgreSQL 17 Official Documentation - Chapter 13 (Concurrency Control)

95% confidence
A

An ACCESS SHARE lock is the least restrictive lock mode in PostgreSQL, automatically acquired by SELECT queries (read-only operations) on tables.

Key Characteristics:

  • Conflicts with: Only ACCESS EXCLUSIVE locks (acquired by DROP TABLE, TRUNCATE, VACUUM FULL, LOCK TABLE ... ACCESS EXCLUSIVE)
  • Does NOT conflict with: All other lock modes including writes (INSERT, UPDATE, DELETE)
  • Purpose: Prevents table structure changes while allowing all data modifications
  • Duration: Held until transaction commits or rolls back
  • Auto-release: Yes, automatically released at transaction end

Explicit Acquisition:

LOCK TABLE table_name IN ACCESS SHARE MODE;

Common Operations That Acquire ACCESS SHARE:

-- Simple SELECT
SELECT * FROM users;

-- SELECT with joins
SELECT u.name, o.total FROM users u JOIN orders o ON u.id = o.user_id;

-- COPY TO (exporting data)
COPY users TO '/tmp/users.csv';

Critical Behavior:

Multiple concurrent SELECT queries can hold ACCESS SHARE locks simultaneously on the same table. This is why reads don't block reads or writes in PostgreSQL.

Source:

PostgreSQL Official Documentation - Table-Level Locks

Valid for PostgreSQL 9.1+ through current versions (including 17).

95% confidence
A

Phantom Read in PostgreSQL

A phantom read occurs when a transaction re-executes a query and finds a different set of rows that satisfy the search condition due to another transaction's committed INSERT or DELETE operations between the reads.

Specific Example

-- Transaction 1: Read accounts with balance > 1000
BEGIN;
SELECT * FROM accounts WHERE balance > 1000;
-- Returns 5 rows

-- Transaction 2 (concurrent): Insert new account
BEGIN;
INSERT INTO accounts (id, balance) VALUES (100, 1500);
COMMIT;

-- Transaction 1: Re-read same query
SELECT * FROM accounts WHERE balance > 1000;
-- Returns 6 rows (phantom row appears)
COMMIT;

PostgreSQL Behavior by Isolation Level

  • READ COMMITTED (default): Phantom reads can occur
  • REPEATABLE READ: Phantom reads cannot occur (prevented by snapshot isolation)
  • SERIALIZABLE: Phantom reads cannot occur (prevented by serializable snapshot isolation)

Key Difference from SQL Standard

PostgreSQL's REPEATABLE READ is stronger than the SQL standard requires. The SQL standard allows phantom reads at REPEATABLE READ level, but PostgreSQL prevents them using MVCC (Multi-Version Concurrency Control) snapshot isolation.

To prevent phantom reads, use:

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- or
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;

Source: PostgreSQL 17 documentation, Chapter 13.2 "Transaction Isolation"

95% confidence
A

SELECT FOR UPDATE in PostgreSQL

SELECT FOR UPDATE is a row-level locking mechanism that acquires exclusive locks on rows returned by a SELECT query, preventing other transactions from modifying or locking those rows until the current transaction completes.

Syntax

SELECT * FROM table_name WHERE condition FOR UPDATE;

Behavior

  • Locks acquired: A ROW SHARE lock on each table scanned, plus an exclusive row-level lock (FOR UPDATE) on each returned row
  • Blocks: Other transactions attempting UPDATE, DELETE, SELECT FOR UPDATE/SHARE/KEY SHARE/NO KEY UPDATE on locked rows will wait
  • Allows: Other transactions can still perform plain SELECT (reads are not blocked)
  • Lock release: Automatically released at transaction COMMIT or ROLLBACK

Common Use Case

Prevents lost updates in concurrent transactions:

BEGIN;
-- Lock the row to prevent concurrent modifications
SELECT balance FROM accounts WHERE id = 123 FOR UPDATE;
-- Now safe to modify based on the read value
UPDATE accounts SET balance = balance - 100 WHERE id = 123;
COMMIT;

Important Variants

  • FOR UPDATE NOWAIT - Returns error immediately if row is locked (instead of waiting)
  • FOR UPDATE SKIP LOCKED - Skips locked rows, returns only unlocked rows (PostgreSQL 9.5+)

Critical Details

  • Only useful within a transaction block (BEGIN/COMMIT); otherwise the lock is released as soon as the statement finishes
  • Applies to all tables in the query's FROM clause unless specific tables listed: FOR UPDATE OF table_name
  • Cannot be used with UNION, INTERSECT, EXCEPT, or aggregate functions on the locked rows
  • Deadlock possible if multiple transactions lock rows in different orders

Source: PostgreSQL Official Documentation v16 (applicable to 9.5+)

95% confidence
A

SHARE UPDATE EXCLUSIVE Lock in PostgreSQL

SHARE UPDATE EXCLUSIVE is a table-level lock mode in PostgreSQL that prevents concurrent schema modifications and VACUUM operations while allowing concurrent data reads and writes.

Lock Behavior

Conflicts with:

  • SHARE UPDATE EXCLUSIVE
  • SHARE
  • SHARE ROW EXCLUSIVE
  • EXCLUSIVE
  • ACCESS EXCLUSIVE

Compatible with:

  • ROW SHARE (SELECT FOR UPDATE/SHARE)
  • ROW EXCLUSIVE (INSERT, UPDATE, DELETE)
  • Concurrent SELECT queries (no explicit lock)

Primary Use Cases

  1. VACUUM (not VACUUM FULL) - Prevents concurrent VACUUM operations
  2. CREATE INDEX CONCURRENTLY - Prevents concurrent index creation
  3. ANALYZE - Statistical analysis operations
  4. ALTER TABLE VALIDATE CONSTRAINT - Constraint validation without blocking reads/writes

Acquiring Explicitly

BEGIN;
LOCK TABLE my_table IN SHARE UPDATE EXCLUSIVE MODE;
-- Perform operations that need to prevent concurrent schema changes
COMMIT;

Key Characteristics

  • Self-conflicting: Two transactions cannot both hold SHARE UPDATE EXCLUSIVE on the same table
  • Non-blocking for DML: Does NOT block INSERT, UPDATE, DELETE operations
  • Blocks DDL: Prevents ALTER TABLE, CREATE INDEX (non-CONCURRENT), VACUUM
  • Lock level: Table-level only (not row-level)

Source

PostgreSQL Official Documentation: Explicit Locking (applicable to all versions 9.x through 17.x with consistent behavior)

95% confidence
A

Transaction ID Wraparound in PostgreSQL

Transaction ID wraparound is a critical maintenance issue where PostgreSQL's 32-bit transaction ID counter exhausts its range and must wrap around to prevent data loss.

Core Mechanism

PostgreSQL uses 32-bit transaction IDs (XIDs) to implement MVCC. This provides approximately 4 billion (2^32) transaction IDs. XIDs are compared using modulo-2^32 arithmetic where:

  • Any transaction with an XID more than 2 billion transactions old becomes "in the future"
  • This would make old rows invisible, causing catastrophic data loss

Wraparound Prevention

PostgreSQL prevents wraparound through freezing: marking sufficiently old tuples as visible to all transactions (conceptually giving them the permanent FrozenTransactionId, XID 2; modern versions set a flag bit in the tuple header rather than overwriting xmin).

Critical Thresholds (PostgreSQL 12+)

  • vacuum_freeze_min_age: 50 million transactions (default) - minimum age before a row can be frozen
  • vacuum_freeze_table_age: 150 million transactions (default) - triggers aggressive whole-table scan
  • autovacuum_freeze_max_age: 200 million transactions (default) - HARD LIMIT triggering emergency autovacuum
  • Wraparound emergency: if the oldest unfrozen XID nears the ~2 billion limit, the server stops assigning new XIDs (refusing commands that need one) until vacuuming catches up

Configuration Example

-- Check current age of oldest unfrozen transaction
SELECT datname, age(datfrozenxid) 
FROM pg_database 
ORDER BY age(datfrozenxid) DESC;

-- Critical: If age approaches 200M, emergency vacuum occurs
-- Configure more aggressive freezing:
ALTER SYSTEM SET vacuum_freeze_min_age = 5000000;
ALTER SYSTEM SET autovacuum_freeze_max_age = 100000000;

Action Required

Monitor age(datfrozenxid) regularly. When it exceeds autovacuum_freeze_max_age, PostgreSQL forces an aggressive anti-wraparound autovacuum on the affected tables; if the age keeps growing toward the wraparound limit, the server eventually refuses to assign new XIDs until vacuuming completes.

Source: PostgreSQL 16 Documentation - Routine Vacuuming

95% confidence
A

ACID Properties in PostgreSQL

PostgreSQL is a fully ACID-compliant relational database system. ACID stands for:

Atomicity

Transactions are all-or-nothing. Either all operations within a transaction complete successfully and are committed, or none are applied. If any operation fails, PostgreSQL performs an automatic rollback.

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- Both updates succeed or both are rolled back

Consistency

Database constraints (CHECK, FOREIGN KEY, UNIQUE, NOT NULL) are enforced as each statement executes, or at commit for constraints marked DEFERRABLE INITIALLY DEFERRED. Either way, the database transitions from one valid state to another valid state.

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    amount NUMERIC CHECK (amount > 0)
);
-- Transaction will fail if constraint violated

Isolation

Concurrent transactions are isolated from each other. PostgreSQL supports 4 isolation levels defined by SQL standard:

  • Read Uncommitted (treated as Read Committed in PostgreSQL)
  • Read Committed (default) - queries see only committed data
  • Repeatable Read - consistent snapshot of database at transaction start
  • Serializable - strictest level, prevents all anomalies
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- Queries see a consistent snapshot
COMMIT;

Durability

Once a transaction commits, data is permanently stored even if system crashes. PostgreSQL uses Write-Ahead Logging (WAL) - changes are written to WAL files on disk before acknowledgment. Default synchronous_commit = on ensures fsync() before commit returns.

Source: PostgreSQL 17 Official Documentation, Chapter 13 (Concurrency Control). ACID compliance verified across all PostgreSQL versions 9.x through 17.x.

95% confidence
A

SELECT FOR SHARE in PostgreSQL

SELECT FOR SHARE is a row-level locking clause that acquires a shared lock on selected rows, preventing other transactions from acquiring exclusive locks (UPDATE, DELETE, SELECT FOR UPDATE) while allowing concurrent SELECT FOR SHARE locks.

Syntax

SELECT * FROM table_name WHERE condition FOR SHARE;

Lock Behavior

  • Blocks: UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE on the same rows
  • Allows: Other SELECT FOR SHARE and SELECT FOR KEY SHARE locks (concurrent reads)
  • Duration: Held until transaction commits or rolls back

Variants (SKIP LOCKED requires PostgreSQL 9.5+)

-- Basic shared lock
SELECT * FROM orders WHERE id = 1 FOR SHARE;

-- Skip locked rows instead of waiting
SELECT * FROM orders WHERE status = 'pending' FOR SHARE SKIP LOCKED;

-- Error immediately if row is locked
SELECT * FROM orders WHERE id = 1 FOR SHARE NOWAIT;

-- Lock only specific tables in a join
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id 
FOR SHARE OF o;

Key Difference from SELECT FOR UPDATE

  • FOR UPDATE: Exclusive lock - blocks all other locks including FOR SHARE
  • FOR SHARE: Shared lock - allows multiple transactions to hold it simultaneously

Use Case

Use when you need to ensure rows won't be modified while you perform related operations, but want to allow other transactions to also read those same rows with similar guarantees.

BEGIN;
-- Lock invoice rows to prevent modification during audit
SELECT * FROM invoices WHERE date >= '2025-01-01' FOR SHARE;
-- Perform calculations knowing rows won't change
-- Other transactions can also SELECT FOR SHARE but not UPDATE/DELETE
COMMIT;

Version: Available since PostgreSQL 9.0. SKIP LOCKED and NOWAIT added in 9.5.

Source: PostgreSQL 17 Official Documentation - SELECT

95% confidence
A

PostgreSQL EXCLUSIVE Lock

An EXCLUSIVE lock in PostgreSQL is a table-level lock that blocks all concurrent access except for ACCESS SHARE locks (reads). It allows reads but prevents writes, DDL operations, and other conflicting locks.

Key Characteristics

Conflicts with:

  • ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

Compatible with:

  • ACCESS SHARE only (SELECT queries without FOR UPDATE/SHARE)

Common operations that acquire EXCLUSIVE:

  • REFRESH MATERIALIZED VIEW CONCURRENTLY (PostgreSQL 9.4+)
  • Explicit LOCK TABLE name IN EXCLUSIVE MODE

Usage Example

-- Explicitly acquire EXCLUSIVE lock
BEGIN;
LOCK TABLE my_table IN EXCLUSIVE MODE;
-- Multiple sessions can still SELECT from my_table
-- but cannot INSERT, UPDATE, DELETE, or ALTER it
COMMIT;

Practical Impact

  • Reads: Allowed (concurrent SELECT statements proceed normally)
  • Writes: Blocked (INSERT, UPDATE, DELETE wait)
  • DDL: Blocked (ALTER TABLE, DROP TABLE wait)
  • Other locks: Most lock acquisition attempts wait

Lock Hierarchy Position

On the 8-level PostgreSQL lock hierarchy (weakest to strongest), EXCLUSIVE is level 7 of 8—only ACCESS EXCLUSIVE is stronger.
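
To confirm what is actually held, pg_locks can be queried from another session (my_table is an assumed table name) - a quick sketch:

SELECT locktype, relation::regclass AS rel, mode, granted, pid
FROM pg_locks
WHERE relation = 'my_table'::regclass;
-- Expect mode = 'ExclusiveLock' with granted = true for the locking session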

Source: PostgreSQL 17 Documentation - Explicit Locking

95% confidence
A

deadlock_timeout is a PostgreSQL server configuration parameter that specifies how long (in milliseconds) to wait on a lock before checking for a deadlock condition.

Default Value: 1000 (1 second)

Valid Range: Any positive integer representing milliseconds. Minimum practical value is typically 1.

Purpose:
When a transaction waits for a lock, PostgreSQL doesn't immediately check if a deadlock exists (expensive operation). Instead, it waits for deadlock_timeout milliseconds first. If still blocked, it runs the deadlock detection algorithm. If a deadlock is found, PostgreSQL aborts one transaction with error 40P01.

Configuration:

-- View current setting
SHOW deadlock_timeout;

-- Set in postgresql.conf (requires reload)
deadlock_timeout = 1000

-- Set for current session
SET deadlock_timeout = '2s';

-- Set for specific database
ALTER DATABASE mydb SET deadlock_timeout = 500;

Units: Accepts time units: ms, s, min, h, d. Without unit, assumes milliseconds.

Tuning Guidance:

  • Lower values (100-500ms): Use if deadlocks are frequent and transactions are short
  • Higher values (2000-5000ms): Use if deadlock detection overhead is measurable and deadlocks are rare
  • Don't set too low (<100ms) as deadlock detection is CPU-intensive
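
Note that deadlock_timeout is also the threshold after which log_lock_waits reports long lock waits, which is a cheap way to observe contention before tuning - a sketch:

ALTER SYSTEM SET log_lock_waits = on;
SELECT pg_reload_conf();
-- Lock waits longer than deadlock_timeout are then reported in the server log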

Version: Available since PostgreSQL 7.1. Behavior unchanged through PostgreSQL 17.

Source: PostgreSQL Official Documentation - Runtime Config: Locks

95% confidence
A

NOWAIT in SELECT FOR UPDATE

NOWAIT is a locking option in PostgreSQL's SELECT FOR UPDATE that causes the query to immediately fail with an error instead of waiting if a requested row is already locked by another transaction.

Behavior

Without NOWAIT (default):

  • The query blocks and waits indefinitely until the row lock is released
  • No timeout unless lock_timeout is configured

With NOWAIT:

  • If any requested row is already locked, PostgreSQL immediately raises error code 55P03 (lock_not_available)
  • Error message: could not obtain lock on row in relation "table_name"
  • No rows are returned; the entire query fails

Syntax

SELECT * FROM orders 
WHERE order_id = 123 
FOR UPDATE NOWAIT;

Use Case

Use NOWAIT when you want to:

  1. Avoid blocking operations in high-concurrency scenarios
  2. Implement optimistic locking patterns
  3. Quickly detect lock contention and retry or skip the operation

Error Handling Example

import psycopg2

try:
    cursor.execute("SELECT * FROM orders WHERE id = %s FOR UPDATE NOWAIT", (order_id,))
    row = cursor.fetchone()
except psycopg2.errors.LockNotAvailable:
    # Row is locked by another transaction
    # Handle: retry, skip, or return error to user
    pass

Version Support

Available since PostgreSQL 8.3 (2008).

Related Options

  • SKIP LOCKED (PostgreSQL 9.5+): Skips locked rows instead of failing
  • lock_timeout: Sets maximum wait time for locks (default: 0 = wait forever)
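
As an alternative to failing immediately, lock_timeout (listed above) bounds the wait instead - a hedged sketch:

SET lock_timeout = '2s';
SELECT * FROM orders WHERE order_id = 123 FOR UPDATE;
-- Fails with error 55P03 (lock_not_available) if the lock is not acquired within 2 seconds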

Source: PostgreSQL Official Documentation - SELECT FOR UPDATE

95% confidence

security

28 questions
A

How to Encrypt Data at Rest in PostgreSQL

PostgreSQL supports data-at-rest encryption through Transparent Data Encryption (TDE) in enterprise forks or filesystem-level encryption in standard PostgreSQL.

Standard PostgreSQL (Community Edition)

PostgreSQL core does not include built-in TDE. Use filesystem-level encryption:

Option 1: LUKS (Linux Unified Key Setup)

Encrypt the entire data directory partition:

# Create encrypted partition
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 pgdata_encrypted

# Format and mount
mkfs.ext4 /dev/mapper/pgdata_encrypted
mount /dev/mapper/pgdata_encrypted /var/lib/postgresql/data

# Initialize PostgreSQL
sudo -u postgres initdb -D /var/lib/postgresql/data

Option 2: Application-Level Encryption

Encrypt sensitive columns using pgcrypto extension:

CREATE EXTENSION pgcrypto;

-- Encrypt data
INSERT INTO users (id, ssn) 
VALUES (1, pgp_sym_encrypt('123-45-6789', 'encryption_key'));

-- Decrypt data
SELECT id, pgp_sym_decrypt(ssn, 'encryption_key') AS ssn 
FROM users WHERE id = 1;

Cipher: pgp_sym_encrypt uses AES-128 by default; a stronger cipher can be requested via the options argument, e.g. pgp_sym_encrypt(data, key, 'cipher-algo=aes256').

Enterprise Solutions with TDE

  • EDB Postgres Advanced Server: Native TDE using AES-256
  • Percona Distribution for PostgreSQL: TDE support via patches
  • AWS RDS PostgreSQL: Enable encryption at instance creation (AES-256, cannot be enabled post-creation)

Performance Impact

  • Filesystem encryption (LUKS): ~3-10% overhead
  • pgcrypto: ~15-30% overhead per encrypted column operation

Source: PostgreSQL 17 official documentation, LUKS cryptsetup man pages (v2.6), AWS RDS documentation (2024).

95% confidence
A

Peer Authentication in PostgreSQL

Peer authentication is a PostgreSQL authentication method that validates database connections by checking if the operating system username matches the PostgreSQL username. It works exclusively for local connections (Unix domain sockets) and uses the OS kernel to obtain the client's credentials.

How It Works

When a user connects via peer authentication, PostgreSQL:

  1. Asks the operating system kernel for the username of the process connecting via the Unix socket
  2. Compares this OS username to the requested PostgreSQL username
  3. Allows connection only if they match exactly (case-sensitive)

Configuration

In pg_hba.conf:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             all                                     peer

Practical Example

If you're logged into Linux as OS user john:

# This works - OS user 'john' connects as PostgreSQL user 'john'
psql -U john mydatabase

# This fails - OS user 'john' trying to connect as PostgreSQL user 'alice'
psql -U alice mydatabase
# Error: Peer authentication failed for user "alice"

Username Mapping (Optional)

Use pg_ident.conf to map OS usernames to different PostgreSQL usernames:

# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
mymap           john                    postgres

Then in pg_hba.conf:

local   all             all                                     peer map=mymap

Key Constraints

  • Platform: Linux, BSD, macOS, Solaris (not available on Windows)
  • Connection type: Only works for local (Unix domain socket) connections, NOT host (TCP/IP)
  • Default: Peer is the default authentication for local connections in PostgreSQL 9.1+

Source: PostgreSQL Official Documentation - Chapter 21.9 Peer Authentication

95% confidence
A

The CREATEROLE attribute in PostgreSQL is a role privilege that allows a role to create, alter, and drop other roles (users).

Key Capabilities

A role with CREATEROLE can:

  • Create new roles using CREATE ROLE or CREATE USER
  • Modify existing roles using ALTER ROLE
  • Drop roles using DROP ROLE
  • Grant membership in roles it has created

Critical Limitations

CREATEROLE is NOT a superuser privilege. A CREATEROLE role:

  • Cannot create superuser roles (unless it also has SUPERUSER attribute)
  • Cannot modify superuser roles
  • Cannot grant privileges it doesn't itself possess
  • PostgreSQL 16+: Can only manage roles it has ADMIN OPTION for (breaking change from prior versions)

Syntax

-- Grant CREATEROLE when creating a role
CREATE ROLE manager WITH CREATEROLE LOGIN PASSWORD 'secret';

-- Grant CREATEROLE to existing role
ALTER ROLE manager WITH CREATEROLE;

-- Revoke CREATEROLE
ALTER ROLE manager WITH NOCREATEROLE;

Version-Specific Behavior

  • PostgreSQL ≤15: CREATEROLE can modify ANY non-superuser role
  • PostgreSQL 16+: CREATEROLE can only modify roles where it has been granted membership with ADMIN OPTION. This improves security but breaks backward compatibility.
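
Under the PostgreSQL 16+ model, a CREATEROLE role automatically receives ADMIN OPTION on roles it creates; for pre-existing roles, a superuser must grant that membership explicitly. A hedged sketch (role names are assumptions):

-- Allow 'manager' (CREATEROLE) to administer the pre-existing role 'reporting'
GRANT reporting TO manager WITH ADMIN OPTION;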

Source

PostgreSQL Official Documentation: Database Roles and CREATE ROLE

95% confidence
A

PostgreSQL Role

A PostgreSQL role is a database entity that can own database objects and have database privileges. Roles consolidate the concepts of "users" and "groups" from older PostgreSQL versions (pre-8.1).

Key Characteristics

  • LOGIN capability: Roles can have the LOGIN attribute (making them equivalent to "users") or not (equivalent to "groups")
  • Ownership: Roles can own database objects (tables, functions, schemas, etc.)
  • Membership: Roles can be members of other roles (inheritance of privileges)
  • Privileges: Roles can be granted specific privileges on database objects

Creation Syntax

-- Role with login (a "user")
CREATE ROLE myuser WITH LOGIN PASSWORD 'securepassword';

-- Alternative shorthand
CREATE USER myuser WITH PASSWORD 'securepassword';

-- Role without login (a "group")
CREATE ROLE readonly;

-- Role with specific attributes
CREATE ROLE admin WITH LOGIN PASSWORD 'pass' CREATEDB CREATEROLE;

Common Attributes

  • LOGIN / NOLOGIN - Can connect to database (default: NOLOGIN)
  • SUPERUSER / NOSUPERUSER - Bypass all permission checks
  • CREATEDB / NOCREATEDB - Can create databases
  • CREATEROLE / NOCREATEROLE - Can create other roles
  • PASSWORD 'string' - Sets authentication password
  • VALID UNTIL 'timestamp' - Password expiration

Membership Example

-- Create group role
CREATE ROLE analysts NOLOGIN;

-- Grant privileges to group
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analysts;

-- Add user to group
GRANT analysts TO alice;

Source: PostgreSQL 17 Official Documentation - Database Roles (applies to PostgreSQL 8.1+)

95% confidence
A

Row-Level Security (RLS) in PostgreSQL

Row-Level Security (RLS) is a PostgreSQL feature (available since version 9.5) that allows you to restrict which rows users can access or modify in a table based on security policies.

Core Mechanism

RLS policies are expressions that return boolean values, evaluated for each row. If the policy returns true, the row is visible/modifiable; if false, it's hidden/blocked.

Enabling RLS

-- Enable RLS on a table (requires table owner or superuser)
ALTER TABLE table_name ENABLE ROW LEVEL SECURITY;

Once enabled, no rows are visible to non-superuser roles until you create explicit policies. Superusers and table owners bypass RLS by default.

Creating Policies

-- Policy syntax
CREATE POLICY policy_name ON table_name
  [FOR {ALL | SELECT | INSERT | UPDATE | DELETE}]
  [TO role_name]
  [USING (boolean_expression)]      -- rows visible for SELECT/UPDATE/DELETE
  [WITH CHECK (boolean_expression)]; -- rows allowed for INSERT/UPDATE

Practical Example

-- Table with user-specific data
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL,
  content TEXT
);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- Policy: Users see only their own documents
CREATE POLICY user_documents ON documents
  FOR ALL
  TO public
  USING (user_id = current_setting('app.current_user_id')::INTEGER);

Critical Details

  • USING clause: Filters rows for SELECT, UPDATE, DELETE operations
  • WITH CHECK clause: Validates rows for INSERT and UPDATE operations. If omitted for INSERT, USING is used instead.
  • Default deny: With RLS enabled but no policies, all access is denied (except superusers)
  • Performance: Policies are evaluated per-row; complex expressions can impact query performance
  • FORCE ROW LEVEL SECURITY: Apply RLS even to table owners: ALTER TABLE table_name FORCE ROW LEVEL SECURITY;

Bypassing RLS

-- BYPASSRLS is a role attribute, not a grantable object privilege (use sparingly)
ALTER ROLE role_name BYPASSRLS;

Source: PostgreSQL Official Documentation v16 - Row Security Policies
https://www.postgresql.org/docs/current/ddl-rowsecurity.html

95% confidence
A

The pgcrypto extension provides cryptographic functions for PostgreSQL, enabling encryption, hashing, and random data generation directly within the database.

Core Functionality

Installation:

CREATE EXTENSION pgcrypto;

Key Functions:

  1. Password Hashing (recommended):
-- Hash password with bcrypt (selected by gen_salt('bf'))
SELECT crypt('user_password', gen_salt('bf'));

-- Verify password
SELECT password_hash = crypt('user_input', password_hash) FROM users;
  2. Encryption (symmetric):
-- Encrypt (AES-128 default)
SELECT pgp_sym_encrypt('sensitive data', 'encryption_key');

-- Decrypt
SELECT pgp_sym_decrypt(encrypted_column, 'encryption_key') FROM table;
  3. Hashing:
-- SHA-256
SELECT digest('data', 'sha256');

-- Returns bytea, convert to hex if needed
SELECT encode(digest('data', 'sha256'), 'hex');
  4. Random Data:
-- Cryptographically secure random bytes
SELECT gen_random_bytes(32);  -- 32 bytes

-- Random UUID (preferred over uuid-ossp)
SELECT gen_random_uuid();

Available Algorithms

  • Symmetric encryption: AES (128/192/256-bit), Blowfish, 3DES
  • Hashing: MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512
  • Password hashing: bf (Blowfish/bcrypt), md5, xdes, des

Best Practices

  • Use gen_salt('bf') for password hashing (automatically includes salt and multiple rounds)
  • gen_random_uuid() is faster than uuid-ossp's uuid_generate_v4()
  • Store encrypted data as bytea type
  • pgcrypto has been part of PostgreSQL contrib since version 8.1+

Source: PostgreSQL Official Documentation - pgcrypto module (https://www.postgresql.org/docs/current/pgcrypto.html)

95% confidence
A

REVOKE Command in PostgreSQL

The REVOKE command removes previously granted privileges from database roles (users or groups). It is the inverse operation of GRANT.

Basic Syntax

REVOKE privilege_type ON object_type object_name FROM role_name;

Common Usage Patterns

Revoke specific privileges:

-- Revoke SELECT privilege on a table
REVOKE SELECT ON TABLE employees FROM user_alice;

-- Revoke multiple privileges
REVOKE SELECT, INSERT, UPDATE ON TABLE products FROM role_sales;

-- Revoke all privileges
REVOKE ALL PRIVILEGES ON TABLE orders FROM user_bob;

Revoke from PUBLIC:

-- Remove default public access
REVOKE ALL ON DATABASE mydb FROM PUBLIC;

Revoke with CASCADE:

-- Revoke and remove dependent privileges granted by this user
REVOKE SELECT ON TABLE customers FROM user_charlie CASCADE;

Key Details

  • Privilege Types: SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, USAGE, SET, ALTER SYSTEM, ALL PRIVILEGES
  • Object Types: TABLE, SEQUENCE, DATABASE, DOMAIN, FOREIGN DATA WRAPPER, FOREIGN SERVER, FUNCTION, PROCEDURE, ROUTINE, LANGUAGE, LARGE OBJECT, SCHEMA, TABLESPACE, TYPE
  • CASCADE: Automatically revokes dependent privileges that other users received through GRANT OPTION
  • RESTRICT (default): Refuses to revoke if dependent privileges exist

Grant Option Handling

-- Revoke the ability to grant privileges to others (but keep the privilege itself)
REVOKE GRANT OPTION FOR SELECT ON TABLE data FROM user_dave;

-- Revoke both the privilege and grant option
REVOKE SELECT ON TABLE data FROM user_dave CASCADE;

Version Compatibility

This syntax is standardized and works consistently across PostgreSQL 9.x through 17.x. The ROUTINE keyword (covering both functions and procedures) was added in PostgreSQL 11.

Source: PostgreSQL 17 Official Documentation - REVOKE

95% confidence
A

PostgreSQL Security Policy (Row-Level Security)

A security policy in PostgreSQL is a named rule attached to a table that defines row-level security (RLS) restrictions. Policies control which rows users can view, insert, update, or delete based on SQL expressions evaluated per row.

Core Concepts

Enable RLS on a table:

ALTER TABLE table_name ENABLE ROW LEVEL SECURITY;

Create a policy:

CREATE POLICY policy_name ON table_name
    [AS {PERMISSIVE | RESTRICTIVE}]
    [FOR {ALL | SELECT | INSERT | UPDATE | DELETE}]
    [TO {role_name | PUBLIC | CURRENT_USER | SESSION_USER}]
    [USING (condition)]      -- For SELECT/UPDATE/DELETE visibility
    [WITH CHECK (condition)] -- For INSERT/UPDATE validation

Key Details

  • Policy types:

    • PERMISSIVE (default): Rows matching ANY permissive policy are accessible (OR logic)
    • RESTRICTIVE: Rows must pass ALL restrictive policies (AND logic)
  • Table owners and superusers bypass RLS by default unless FORCE ROW LEVEL SECURITY is set

  • Multiple policies on the same table are combined with OR (for PERMISSIVE) or AND (for RESTRICTIVE)

  • Performance: Policy expressions run on every row access - use indexed columns in conditions

Example

-- Only users can see their own data
CREATE POLICY user_isolation ON accounts
    FOR ALL
    TO public
    USING (user_id = current_user)
    WITH CHECK (user_id = current_user);

Version Notes

Available since PostgreSQL 9.5. No changes to core syntax through PostgreSQL 17.

Source: PostgreSQL Official Documentation - Row Security Policies

95% confidence
A

Enabling RLS on a PostgreSQL Table

Use the ALTER TABLE command with the ENABLE ROW LEVEL SECURITY clause:

ALTER TABLE table_name ENABLE ROW LEVEL SECURITY;

Critical details:

  1. Privileges required: You must be the table owner or have superuser privileges.

  2. Effect: Once enabled, RLS blocks ALL access by default (returns zero rows) for non-owner users until you create explicit policies using CREATE POLICY.

  3. Table owners bypass RLS by default. To force owners to obey policies:

ALTER TABLE table_name FORCE ROW LEVEL SECURITY;
  4. Verification: Check if RLS is enabled:
SELECT tablename, rowsecurity 
FROM pg_tables 
WHERE tablename = 'table_name';

Returns true if RLS is enabled.

  5. Disabling RLS:
ALTER TABLE table_name DISABLE ROW LEVEL SECURITY;

Complete example:

-- Enable RLS
ALTER TABLE employees ENABLE ROW LEVEL SECURITY;

-- Create a policy (required for non-owners to access data)
CREATE POLICY employee_policy ON employees
    FOR SELECT
    USING (user_id = current_user_id());  -- current_user_id() here is an application-defined helper, not a built-in

Source: PostgreSQL official documentation (compatible with PostgreSQL 9.5+, where RLS was introduced).

95% confidence
A

MD5 authentication in PostgreSQL is a deprecated password-based authentication method that stores and transmits MD5-hashed passwords.

How it works:

  1. Password hashes are stored in pg_authid system catalog as md5 + MD5 hash of (password + username)
  2. During authentication, the client sends: md5 + MD5(MD5(password + username) + server salt)
  3. Server compares this with its stored hash using the provided salt

Configuration:

In pg_hba.conf:

# TYPE  DATABASE  USER  ADDRESS      METHOD
host    all       all   0.0.0.0/0    md5

Critical Facts:

  • Superseded as the default by scram-sha-256 in PostgreSQL 14 (released September 2021)
  • Still accepted through PostgreSQL 17, but deprecated and slated for removal in a future release
  • Weak by modern standards: MD5 offers no key stretching, and the stored hash is effectively password-equivalent (it can be replayed to authenticate)
  • scram-sha-256 is the recommended replacement and the default since PostgreSQL 14

Recommended Action:

Use scram-sha-256 instead:

host    all       all   0.0.0.0/0    scram-sha-256

To migrate existing md5 passwords, set password_encryption = 'scram-sha-256' and have each user re-set their password so a SCRAM hash is stored:

ALTER USER username PASSWORD 'newpassword';

Source: PostgreSQL Official Documentation - Client Authentication (versions 14, 15, 16, 17)

95% confidence
A

PostgreSQL Row-Level Security (RLS) for Multi-Tenant Applications

Implementation Pattern

Requirement: PostgreSQL 9.5 or later (RLS introduced in 9.5).

1. Database Schema Setup

-- Add tenant identifier to all multi-tenant tables
CREATE TABLE tenants (
    tenant_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL
);

CREATE TABLE users (
    user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(tenant_id),
    email TEXT NOT NULL,
    -- other columns
);

CREATE TABLE data (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(tenant_id),
    content TEXT,
    -- other columns
);

2. Enable RLS and Create Policies

-- Enable RLS on multi-tenant tables
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE data ENABLE ROW LEVEL SECURITY;

-- Create policy that filters rows by tenant_id
CREATE POLICY tenant_isolation_policy ON users
    USING (tenant_id = current_setting('app.current_tenant_id')::UUID);

CREATE POLICY tenant_isolation_policy ON data
    USING (tenant_id = current_setting('app.current_tenant_id')::UUID);

3. Set Tenant Context per Connection

-- Set at start of each application session/transaction
SET app.current_tenant_id = 'tenant-uuid-here';

-- Or use LOCAL for transaction-only scope
SET LOCAL app.current_tenant_id = 'tenant-uuid-here';

4. Application Code Example (Node.js with pg)

async function queryAsTenant(client, tenantId, query, params) {
    // SET / SET LOCAL cannot take bind parameters, so use set_config() instead;
    // the third argument (true) makes the setting transaction-local, like SET LOCAL.
    // Call this inside an explicit transaction (BEGIN ... COMMIT) so the setting
    // takes effect and is discarded automatically afterwards.
    await client.query("SELECT set_config('app.current_tenant_id', $1, true)", [tenantId]);
    return client.query(query, params);
}

// Usage
const result = await queryAsTenant(
    client,
    'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
    'SELECT * FROM data WHERE id = $1',
    [dataId]
);

Critical Security Requirements

  1. Always set tenant context BEFORE queries - If the setting is missing, the policy expression errors or matches no rows, so no other tenant's data leaks (a safe failure)
  2. Use transaction-scoped settings - Set the value with transaction-local scope (SET LOCAL or set_config(..., true)) inside a transaction to prevent context leakage on pooled connections
  3. Do not connect as a superuser - Superusers bypass RLS; the application must use a non-superuser role
  4. Force RLS for table owners - Add FORCE ROW LEVEL SECURITY if table owner queries data:
ALTER TABLE data FORCE ROW LEVEL SECURITY;
  5. Validate tenant_id - Verify the user has access to the tenant before setting the context:
// Verify first
const hasAccess = await checkUserTenantAccess(userId, tenantId);
if (!hasAccess) throw new Error('Unauthorized tenant access');

// Then set the transaction-local context (set_config avoids SET's bind-parameter limitation)
await client.query("SELECT set_config('app.current_tenant_id', $1, true)", [tenantId]);

Performance Considerations

  • RLS policies add WHERE clauses to queries; ensure tenant_id columns are indexed:
CREATE INDEX idx_users_tenant_id ON users(tenant_id);
CREATE INDEX idx_data_tenant_id ON data(tenant_id);
  • Index scan overhead: ~5-10% compared to non-RLS queries (PostgreSQL 14+ optimizer improvements)

Source

PostgreSQL Official Documentation: Row Security Policies (https://www.postgresql.org/docs/current/ddl-rowsecurity.html)

95% confidence
A

PostgreSQL Authentication Methods

PostgreSQL supports the following authentication methods (configured in pg_hba.conf):

Core Methods

  1. trust - Allows connection without password (unsafe for production)
  2. reject - Rejects connection unconditionally
  3. scram-sha-256 - Challenge-response scheme (default since PostgreSQL 14, recommended)
  4. md5 - MD5-hashed password (legacy, deprecated since PostgreSQL 14)
  5. password - Clear-text password (unsafe, not recommended)
  6. gss - GSSAPI authentication (Kerberos)
  7. sspi - SSPI authentication (Windows native)
  8. ident - Uses OS username mapping
  9. peer - Uses OS username (Unix-domain sockets only)
  10. ldap - LDAP server authentication
  11. radius - RADIUS server authentication
  12. cert - SSL client certificate authentication
  13. pam - PAM (Pluggable Authentication Modules)
  14. bsd - BSD authentication

Configuration Example

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    all             all             127.0.0.1/32            scram-sha-256
local   all             postgres                                peer
host    all             all             0.0.0.0/0               reject

Best Practice

Use scram-sha-256 for password-based authentication (PostgreSQL 10+). It replaced md5 as the secure standard.

Source: PostgreSQL 16 Official Documentation - Client Authentication
https://www.postgresql.org/docs/current/auth-methods.html

95% confidence
A

Trust Authentication in PostgreSQL

Trust authentication allows any user to connect to PostgreSQL without a password if the connection matches the pg_hba.conf rule. The server grants access solely based on connection parameters (database, username, IP address) without verifying credentials.

Configuration

In pg_hba.conf:

# TYPE  DATABASE    USER        ADDRESS         METHOD
host    mydb        alice       192.168.1.0/24  trust
local   all         postgres                    trust

Critical Security Implications

  • No authentication performed: Any user who can reach the server and match the rule gains full database access as the specified role
  • Use cases: Development environments, localhost-only connections, or environments with network-level security
  • Production warning: Never use trust for internet-facing or multi-user production systems

How It Works

  1. Client connects with username alice to database mydb from 192.168.1.100
  2. PostgreSQL checks pg_hba.conf top-to-bottom
  3. First matching rule has method trust
  4. Server grants access immediately - no password check, no credential validation

Example Connection

# This succeeds even with wrong/no password
psql -h 192.168.1.50 -U alice -d mydb
# Connects immediately without password prompt

Configuration Reload

After modifying pg_hba.conf:

SELECT pg_reload_conf();

Or via shell:

pg_ctl reload -D /var/lib/postgresql/data

Source: PostgreSQL 17 Documentation - Authentication Methods

Best practice: Use trust only for local development or in combination with network isolation (e.g., host all all 127.0.0.1/32 trust for localhost-only access).

95% confidence
A

PostgreSQL session_user Function

session_user is a SQL standard function that returns the name of the user who initiated the current database session, as a name type (PostgreSQL's internal string type for identifiers).

Key Characteristics

  • Unaffected by SET ROLE: Returns the username that authenticated the connection; SET ROLE never changes it. Only SET SESSION AUTHORIZATION (restricted to superusers) changes session_user
  • Differs from current_user: While current_user reflects the currently active role (which can change via SET ROLE), session_user normally shows the original authenticated user
  • No parentheses allowed: session_user has special syntactic status in SQL and must be called without trailing parentheses (session_user, not session_user())

Usage Example

-- Show original authenticated user
SELECT session_user;

-- Compare with current_user after role change
SELECT session_user, current_user;
-- Returns: session_user = 'alice', current_user = 'alice'

SET ROLE bob;

SELECT session_user, current_user;
-- Returns: session_user = 'alice', current_user = 'bob'

Return Type

Returns name type (64-byte maximum length identifier in PostgreSQL, though actual limit is NAMEDATALEN - 1, typically 63 bytes).

Source

PostgreSQL Official Documentation: System Information Functions (applies to PostgreSQL 9.x through 17.x with consistent behavior).

95% confidence
A

PostgreSQL PERMISSIVE vs RESTRICTIVE Policies

In PostgreSQL Row-Level Security (RLS), policy types determine how multiple policies combine:

PERMISSIVE (default)

  • Multiple PERMISSIVE policies use OR logic
  • A row is accessible if it passes ANY PERMISSIVE policy
  • Most common use case: granting different types of access

RESTRICTIVE

  • Multiple RESTRICTIVE policies use AND logic
  • A row is accessible only if it passes ALL RESTRICTIVE policies
  • Used to enforce mandatory conditions that cannot be bypassed

Combined Behavior

When both types exist on a table:

  1. At least ONE PERMISSIVE policy must pass (if any PERMISSIVE policies exist)
  2. ALL RESTRICTIVE policies must pass (if any RESTRICTIVE policies exist)

Formula: (PERMISSIVE_1 OR PERMISSIVE_2 OR ...) AND (RESTRICTIVE_1 AND RESTRICTIVE_2 AND ...)

Code Example

-- PERMISSIVE: User can see their own rows OR public rows
CREATE POLICY user_own_data ON documents
    FOR SELECT
    AS PERMISSIVE  -- This is the default
    USING (user_id = current_user);

CREATE POLICY public_data ON documents
    FOR SELECT
    AS PERMISSIVE
    USING (is_public = true);

-- RESTRICTIVE: But never show deleted rows, regardless of above policies
CREATE POLICY hide_deleted ON documents
    FOR SELECT
    AS RESTRICTIVE
    USING (deleted_at IS NULL);

In this example, users see rows where (user_id = current_user OR is_public = true) AND (deleted_at IS NULL).

Version: Available since PostgreSQL 9.5 (RLS introduced), RESTRICTIVE keyword added in PostgreSQL 10.

Source: PostgreSQL 17 Documentation - Row Security Policies

95% confidence
A

PostgreSQL GRANT Privileges

Use the GRANT command to assign specific privileges on database objects to roles (users/groups).

Basic Syntax

GRANT privilege_type ON object_type object_name TO role_name;

Common Privilege Types

Table privileges:

-- Single privilege
GRANT SELECT ON TABLE users TO readonly_user;

-- Multiple privileges
GRANT SELECT, INSERT, UPDATE ON TABLE users TO app_user;

-- All privileges
GRANT ALL PRIVILEGES ON TABLE users TO admin_user;

Database privileges:

GRANT CONNECT ON DATABASE mydb TO app_user;
GRANT CREATE ON DATABASE mydb TO developer;

Schema privileges:

GRANT USAGE ON SCHEMA public TO app_user;
GRANT CREATE ON SCHEMA public TO developer;

All tables in schema:

GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;

-- Apply to future tables
ALTER DEFAULT PRIVILEGES IN SCHEMA public 
GRANT SELECT ON TABLES TO readonly_user;

Key Privilege Keywords

  • SELECT, INSERT, UPDATE, DELETE - Table data operations
  • TRUNCATE, REFERENCES, TRIGGER - Table management
  • CONNECT - Database connection
  • CREATE - Create objects in database/schema
  • USAGE - Use schema objects
  • EXECUTE - Call functions/procedures
  • ALL PRIVILEGES - All applicable privileges for object type

Grant with Additional Options

-- WITH GRANT OPTION: allows grantee to grant same privilege to others
GRANT SELECT ON TABLE users TO manager_user WITH GRANT OPTION;

Verification

-- Check table privileges
\dp table_name

-- Check role privileges
\du role_name

Version: PostgreSQL 12+ (syntax stable since PostgreSQL 9.0)
Source: PostgreSQL GRANT Documentation

95% confidence
A

Certificate Authentication in PostgreSQL

Certificate authentication (cert) is a PostgreSQL authentication method that authenticates clients using SSL/TLS client certificates. PostgreSQL validates the client's X.509 certificate against a trusted Certificate Authority (CA) and maps the certificate's Common Name (CN) to a database user.

How It Works

  1. SSL Connection Required: Client must connect with sslmode=require or higher
  2. Certificate Validation: PostgreSQL verifies the client certificate against the CA certificate specified in ssl_ca_file (default: root.crt in data directory)
  3. User Mapping: The certificate's CN field must match the PostgreSQL username (or use pg_ident.conf for custom mappings)

Configuration

postgresql.conf:

ssl = on
ssl_ca_file = '/path/to/root.crt'
ssl_cert_file = '/path/to/server.crt'
ssl_key_file = '/path/to/server.key'

pg_hba.conf:

# TYPE  DATABASE  USER  ADDRESS      METHOD  OPTIONS
hostssl all       all   0.0.0.0/0    cert    clientcert=verify-full
  • clientcert=verify-full (PostgreSQL 12+): Requires valid client certificate
  • clientcert=verify-ca (PostgreSQL 12+): Validates certificate chain only
  • Legacy versions use cert method which implies certificate requirement

Client Connection

psql "host=dbserver dbname=mydb user=john sslmode=verify-full sslcert=/path/to/client.crt sslkey=/path/to/client.key sslrootcert=/path/to/root.crt"

Critical: The CN in the client certificate MUST be john (matching the database username), or you must configure pg_ident.conf for custom mapping.

Source: PostgreSQL 16 Documentation - Authentication Methods: Certificate Authentication

95% confidence
A

PostgreSQL Ident Authentication

Ident authentication is a method where PostgreSQL verifies a user's identity by querying an Ident service (RFC 1413) running on the client's machine. The Ident service reports which OS user owns the TCP connection to PostgreSQL.

How It Works

  1. Client connects to PostgreSQL from IP address X, port Y
  2. PostgreSQL connects to the Ident service on X:113
  3. Ident service reports the OS username that owns the connection from port Y
  4. PostgreSQL maps this OS username to a PostgreSQL role using an optional user map

Configuration in pg_hba.conf

# TYPE  DATABASE  USER      ADDRESS        METHOD
host    mydb      pguser    192.168.1.0/24 ident
host    mydb      pguser    192.168.1.0/24 ident map=mymap

Key Specifications

  • Protocol: RFC 1413 Identification Protocol
  • Port: TCP 113 (on client machine)
  • Timeout: the overall authentication attempt is bounded by authentication_timeout (default: 1 minute)
  • Connection types: Only works with TCP/IP connections (host or hostssl)
  • Trust level: LOW - the client's Ident service can be spoofed or unreliable

User Mapping (Optional)

Define in pg_ident.conf:

# MAPNAME  SYSTEM-USERNAME  PG-USERNAME
mymap      john             pguser
mymap      jane             pguser

Important Limitations

  • Does NOT work with local Unix socket connections (use peer instead)
  • Not secure - relies on client-side service that can be manipulated
  • Rarely used in production - primarily for controlled LAN environments
  • Deprecated on Windows - Windows has no standard Ident service

Modern Alternative

For local connections, use peer authentication instead, which directly checks the OS user via the Unix socket without network calls.

Source: PostgreSQL 17 Official Documentation, Section 21.9 "Ident Authentication"

95% confidence
A

PostgreSQL ssl Setting

The ssl setting in postgresql.conf is a boolean parameter that controls whether the PostgreSQL server accepts SSL/TLS encrypted connections.

Configuration

Parameter: ssl
Type: Boolean
Values: on | off | true | false | yes | no | 1 | 0
Default: off
Context: Set in postgresql.conf or on the server command line; since PostgreSQL 10 the SSL configuration can be reloaded without a restart (earlier releases required a restart)

# postgresql.conf
ssl = on

Requirements

When ssl = on, PostgreSQL requires these files in the data directory:

  1. server.crt - Server certificate
  2. server.key - Private key (permissions must be 0600 or less)

Without these files, the server will fail to start if ssl = on.

Behavior

  • ssl = on: Server accepts both SSL and non-SSL connections (unless pg_hba.conf enforces hostssl)
  • ssl = off: Server rejects all SSL connection attempts

Related Settings

When enabling SSL, configure:

  • ssl_cert_file - Path to server certificate (default: server.crt)
  • ssl_key_file - Path to private key (default: server.key)
  • ssl_ca_file - Certificate authority file for client verification

Verification

# Check if SSL is enabled
psql -h localhost -U postgres -c "SHOW ssl;"

Source: PostgreSQL 16 Official Documentation - Server Configuration: Connections and Authentication

95% confidence
A

PostgreSQL LOGIN Attribute

The LOGIN attribute is a role privilege in PostgreSQL that grants the ability to connect to the database server. It is one of the core role attributes that determines whether a role can be used for authentication.

Key Facts

  • Default Behavior: Roles created with CREATE ROLE do NOT have LOGIN by default. Roles created with CREATE USER DO have LOGIN by default (since PostgreSQL 8.1+).
  • Syntax:
    • To grant: LOGIN
    • To revoke: NOLOGIN

Code Examples

-- Create a role WITHOUT login capability (cannot connect)
CREATE ROLE readonly_role;

-- Create a role WITH login capability (can connect)
CREATE ROLE app_user LOGIN PASSWORD 'secure_password';

-- Equivalent using CREATE USER (LOGIN is implicit)
CREATE USER app_user PASSWORD 'secure_password';

-- Grant LOGIN to an existing role
ALTER ROLE readonly_role LOGIN;

-- Revoke LOGIN from a role
ALTER ROLE app_user NOLOGIN;

Technical Details

  • Only roles with LOGIN can be used as the authentication role in connection strings
  • Roles without LOGIN are typically used as groups for privilege management
  • A role can have LOGIN but still be prevented from connecting via pg_hba.conf rules
  • The LOGIN attribute is stored in the pg_authid.rolcanlogin system catalog column

Source

PostgreSQL Official Documentation: Database Roles (Current through PostgreSQL 17)

95% confidence
A

pg_hba.conf Entry Syntax

Each non-comment line in pg_hba.conf follows this exact format:

TYPE  DATABASE  USER  ADDRESS  METHOD  [OPTIONS]

Field Specifications

  1. TYPE (required): Connection type

    • local - Unix-domain socket
    • host - TCP/IP (both SSL and non-SSL)
    • hostssl - TCP/IP with SSL only
    • hostnossl - TCP/IP without SSL
    • hostgssenc - TCP/IP with GSSAPI encryption (PostgreSQL 12+)
    • hostnogssenc - TCP/IP without GSSAPI encryption (PostgreSQL 12+)
  2. DATABASE (required): Database name(s)

    • Specific database name: mydb
    • Multiple databases: db1,db2,db3
    • All databases: all
    • All except specific: all (use separate deny rule)
    • Database file reference: @filename
    • sameuser - match database name to username
    • samerole - match database to role membership
    • replication - for replication connections
  3. USER (required): Role name(s)

    • Specific user: postgres
    • Multiple users: user1,user2,user3
    • All users: all
    • Group file reference: @filename
    • Role prefix: +rolename (includes members of role)
  4. ADDRESS (required for host* types, omitted for local):

    • CIDR notation: 192.168.1.0/24 or 2001:db8::/32
    • Single host: 192.168.1.100/32
    • Hostname: example.com (DNS lookup)
    • Special keywords: samehost, samenet
  5. METHOD (required): Authentication method

    • trust - allow unconditionally
    • reject - reject unconditionally
    • md5 - MD5 password (legacy, use scram-sha-256)
    • scram-sha-256 - SCRAM-SHA-256 password (PostgreSQL 10+, recommended)
    • password - plain text password (not recommended)
    • peer - OS username matching (local only)
    • ident - ident server authentication
    • gss - GSSAPI
    • sspi - SSPI (Windows)
    • pam - PAM
    • ldap - LDAP
    • radius - RADIUS
    • cert - SSL client certificate
  6. OPTIONS (optional): Method-specific options in key=value format

Examples

# Allow local connections via Unix socket using peer authentication
local   all             all                                     peer

# Allow localhost TCP connections with SCRAM password
host    all             all             127.0.0.1/32            scram-sha-256

# Allow specific subnet with password
host    mydb            myuser          192.168.1.0/24          scram-sha-256

# Reject specific user from anywhere
host    all             baduser         0.0.0.0/0               reject

# Replication connections
host    replication     replicator      192.168.1.0/24          scram-sha-256

# SSL-only connection with client certificate
hostssl all             all             0.0.0.0/0               cert clientcert=verify-full

# With options
host    all             all             0.0.0.0/0               ldap ldapserver=ldap.example.com ldapbasedn="dc=example,dc=com"

Important Notes

  • Entries are processed top-to-bottom; first match wins
  • Changes require pg_reload_conf() or pg_ctl reload to take effect
  • Comments start with #
  • Fields separated by whitespace (spaces/tabs)

Source: PostgreSQL 16 Official Documentation - Client Authentication (pg_hba.conf)

95% confidence
A

How to Hash Passwords in PostgreSQL

Use the pgcrypto extension with the crypt() function and the bcrypt algorithm (bf).

Implementation

1. Enable the extension (once per database):

CREATE EXTENSION IF NOT EXISTS pgcrypto;

2. Hash a password (for INSERT/UPDATE):

-- Hash with bcrypt (default cost factor 6)
SELECT crypt('user_password', gen_salt('bf'));

-- Hash with specific cost factor (8-12 recommended for production)
SELECT crypt('user_password', gen_salt('bf', 10));

3. Verify a password:

-- Returns true if password matches
SELECT crypt('user_password', stored_hash) = stored_hash;

Complete Example

-- Create users table
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(255) UNIQUE NOT NULL,
    password_hash TEXT NOT NULL
);

-- Insert user with hashed password
INSERT INTO users (username, password_hash)
VALUES ('john', crypt('secret123', gen_salt('bf', 10)));

-- Verify login
SELECT id, username 
FROM users 
WHERE username = 'john' 
  AND password_hash = crypt('secret123', password_hash);

Key Parameters

  • Algorithm: bf (bcrypt) - industry standard for password hashing
  • Cost factor: Range 4-31. Higher = slower/more secure. Use 10-12 for production (2025 recommendation)
  • Salt: Auto-generated by gen_salt(), embedded in the hash output
  • Output: 60-character string (bcrypt standard format)

Why bcrypt?

  • Adaptive: Cost factor increases as hardware improves
  • Salt automatically included in output
  • Resistant to rainbow table and brute-force attacks
  • OWASP recommended for password storage

Source: PostgreSQL pgcrypto documentation (v12+), OWASP Password Storage Cheat Sheet

95% confidence
A

Enabling SSL in PostgreSQL

To enable SSL in PostgreSQL, configure the server with SSL certificates and update postgresql.conf:

Required Steps

  1. Generate SSL certificates (or use existing ones):
# Self-signed certificate (for testing only)
openssl req -new -x509 -days 365 -nodes -text -out server.crt \
  -keyout server.key -subj "/CN=dbserver.example.com"

# Set correct permissions (required)
chmod 600 server.key
chown postgres:postgres server.key server.crt
  2. Place certificates in PostgreSQL data directory (typically /var/lib/postgresql/data/ or /etc/postgresql/[version]/main/)

  3. Edit postgresql.conf:

ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
# Optional but recommended:
ssl_ca_file = 'root.crt'  # For client certificate verification
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'  # Strong ciphers only
ssl_prefer_server_ciphers = on
ssl_min_protocol_version = 'TLSv1.2'  # PostgreSQL 12+
  4. Edit pg_hba.conf to require SSL (optional but recommended):
# Require SSL for all connections
hostssl    all    all    0.0.0.0/0    scram-sha-256
# Reject non-SSL
hostnossl  all    all    0.0.0.0/0    reject
  5. Restart PostgreSQL:
sudo systemctl restart postgresql
# or
sudo pg_ctl restart -D /var/lib/postgresql/data
  6. Verify SSL is enabled:
SHOW ssl;  -- Should return 'on'
SELECT * FROM pg_stat_ssl;  -- Shows SSL status per connection

Client Connection

Connect with SSL from clients:

psql "postgresql://user@host/db?sslmode=require"
# Or
psql "host=hostname dbname=mydb user=myuser sslmode=verify-full sslrootcert=root.crt"

SSL modes (in increasing order of security):

  • disable - No SSL
  • require - SSL required, no certificate verification
  • verify-ca - SSL required, verify server certificate
  • verify-full - SSL required, verify certificate and hostname (most secure)

Source: PostgreSQL Official Documentation v12-17 (https://www.postgresql.org/docs/current/ssl-tcp.html)

Note: For production, use certificates from a trusted CA, not self-signed certificates.

95% confidence
A

Creating a PostgreSQL Row Level Security (RLS) Policy

Row Level Security policies are created using CREATE POLICY after enabling RLS on a table.

Step-by-Step Process

1. Enable RLS on the table:

ALTER TABLE table_name ENABLE ROW LEVEL SECURITY;

2. Create the policy:

CREATE POLICY policy_name ON table_name
  [AS { PERMISSIVE | RESTRICTIVE }]
  [FOR { ALL | SELECT | INSERT | UPDATE | DELETE }]
  [TO { role_name | PUBLIC | CURRENT_ROLE | CURRENT_USER | SESSION_USER }]
  [USING (condition)]
  [WITH CHECK (condition)];

Required Components

  • policy_name: Unique identifier for the policy on this table
  • table_name: Target table (must have RLS enabled)
  • USING clause: Boolean expression determining which rows are visible for SELECT/UPDATE/DELETE
  • WITH CHECK clause: Boolean expression determining which rows can be inserted or updated to

Complete Example

-- Enable RLS on the table
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- Allow users to see only their own documents
CREATE POLICY user_documents ON documents
  FOR ALL
  TO PUBLIC
  USING (user_id = current_user_id())
  WITH CHECK (user_id = current_user_id());

Key Behaviors

  • Policies are PERMISSIVE by default (OR logic between policies)
  • RESTRICTIVE policies use AND logic
  • Omit FOR to apply to all operations (SELECT, INSERT, UPDATE, DELETE)
  • USING applies to SELECT/UPDATE/DELETE; WITH CHECK applies to INSERT/UPDATE
  • Table owner and superusers bypass RLS unless FORCE ROW LEVEL SECURITY is set

Version: PostgreSQL 9.5+

Source: PostgreSQL Official Documentation - Row Security Policies (https://www.postgresql.org/docs/current/sql-createpolicy.html)

95% confidence
A

SCRAM-SHA-256 Authentication in PostgreSQL

SCRAM-SHA-256 (Salted Challenge Response Authentication Mechanism using SHA-256) is PostgreSQL's recommended password authentication method since PostgreSQL 10 (2017). It is the default password authentication method in PostgreSQL 14+.

How It Works

SCRAM-SHA-256 uses a challenge-response mechanism where passwords are never transmitted in plaintext or reversible form:

  1. Server sends a salt and iteration count (default: 4096 iterations)
  2. Client derives a key using PBKDF2 with SHA-256
  3. Both sides prove knowledge of the password through cryptographic challenges
  4. Password hash stored in database cannot be used for authentication without the original password

Configuration

In pg_hba.conf, specify the authentication method:

# IPv4 local connections:
host    all    all    127.0.0.1/32    scram-sha-256
# IPv6 local connections:
host    all    all    ::1/128         scram-sha-256

In postgresql.conf, set the password encryption method:

password_encryption = scram-sha-256

Creating Users

When password_encryption = scram-sha-256, new passwords are automatically hashed using SCRAM-SHA-256:

CREATE ROLE myuser WITH LOGIN PASSWORD 'mypassword';
-- or
ALTER ROLE existinguser WITH PASSWORD 'newpassword';

Key Advantages Over md5

  • Not vulnerable to rainbow table attacks (uses salt + iterations)
  • SHA-256 hashing (cryptographically stronger than MD5)
  • Challenge-response prevents password replay attacks
  • RFC 5802 and RFC 7677 compliant

Version Requirements

  • PostgreSQL 10+: SCRAM-SHA-256 available
  • PostgreSQL 14+: Default password method
  • Client libraries must support SCRAM-SHA-256 (libpq 10+, most modern drivers)
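
To check how existing passwords are stored, a quick sketch (requires superuser access to pg_authid):

SELECT rolname, left(rolpassword, 14) AS scheme
FROM pg_authid
WHERE rolpassword IS NOT NULL;
-- SCRAM hashes begin with 'SCRAM-SHA-256$'; legacy hashes begin with 'md5'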

Source: PostgreSQL Official Documentation - Password Authentication (versions 10-17)

95% confidence
A

There is no functional difference between a user and a role in PostgreSQL 8.1+. They are the same object.

Key Facts

Since PostgreSQL 8.1 (released 2005), the concepts of "users" and "groups" were unified into a single concept: roles.

  • A role is the modern, correct term
  • A user is simply a role with the LOGIN attribute
  • CREATE USER is an alias for CREATE ROLE with LOGIN included by default

Exact Differences in Commands

-- These are equivalent:
CREATE USER alice WITH PASSWORD 'secret';
CREATE ROLE alice WITH LOGIN PASSWORD 'secret';

-- Without LOGIN, it cannot authenticate:
CREATE ROLE readonly_group;  -- Cannot login, acts as group
CREATE USER readonly_group;  -- CAN login (LOGIN is default)

In Practice

  • Use CREATE ROLE for everything (modern practice)
  • CREATE USER exists only for backwards compatibility
  • Both create entries in pg_roles and pg_authid system catalogs
  • The pg_user view shows only roles with LOGIN attribute

Source

PostgreSQL 16 Official Documentation: "Database Roles" chapter (https://www.postgresql.org/docs/16/user-manag.html)

Agent Action Rule: Always use CREATE ROLE ... WITH LOGIN instead of CREATE USER in new code.

95% confidence
A

The CREATEDB attribute is a PostgreSQL role attribute that grants permission to create new databases.

Setting the Attribute

When creating a role:

CREATE ROLE myuser WITH CREATEDB;

For existing roles:

ALTER ROLE myuser WITH CREATEDB;

To revoke:

ALTER ROLE myuser WITH NOCREATEDB;

Behavior

  • A role with CREATEDB can execute CREATE DATABASE statements
  • The role becomes the owner of databases it creates
  • CREATEDB does NOT grant permission to drop databases owned by other users
  • Superusers always have CREATEDB privileges implicitly

Checking the Attribute

Query pg_roles system catalog:

SELECT rolname, rolcreatedb FROM pg_roles WHERE rolname = 'myuser';

The rolcreatedb column returns true if the role has CREATEDB, false otherwise.

Important Notes

  • CREATEDB is distinct from SUPERUSER (less privileged)
  • Database owners can drop their own databases regardless of CREATEDB status
  • Available in all supported PostgreSQL versions (9.0+, currently 12-17)

Source: PostgreSQL official documentation - Database Roles (CREATE ROLE, ALTER ROLE commands)

95% confidence

Partitioning > Partition Management

28 questions

Streaming Replication > Standby Configuration

28 questions

Data Types - Object Identifiers

28 questions

Routine Maintenance > REINDEX

28 questions
A
  1. Adds a transient index definition to pg_index and takes a SHARE UPDATE EXCLUSIVE session lock
  2. First pass builds the index and sets indisready to true
  3. Second pass adds tuples that were inserted while the first pass ran
  4. Renames the indexes, switches the indisvalid flags, and invalidates caches
  5. Sets the old index's indisready to false and waits for running queries that might reference it
  6. Drops the old indexes and releases the session locks
95% confidence

Server Configuration > Lock Management

28 questions

Extensions > hstore

28 questions
A

populate_record(anyelement, hstore) populates a record type with values from an hstore. The first argument must be a null value cast to the target record type, and the function will populate the record with matching key-value pairs from the hstore.
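
A minimal sketch mirroring the documented usage (table and hstore values are assumptions):

CREATE EXTENSION IF NOT EXISTS hstore;
CREATE TABLE person (name text, age int);
SELECT * FROM populate_record(NULL::person, 'name=>Alice, age=>30'::hstore);
-- Keys matching column names fill the record; unmatched keys are ignored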

95% confidence
A

Yes, hstore columns can be declared UNIQUE using btree or hash indexes. However, the sort ordering for hstore values is not particularly useful; these indexes are mainly valuable for equivalence lookups.
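
A small sketch (table name is an assumption) showing such a constraint; it is useful only for whole-value equality, not key/value containment:

CREATE EXTENSION IF NOT EXISTS hstore;
CREATE TABLE settings (attrs hstore UNIQUE);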

95% confidence
A

The #= operator replaces fields in a record from matching values in the hstore. It modifies a record using key-value pairs from the hstore, updating only the fields specified in the hstore while leaving others unchanged.
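
A brief sketch, reusing an assumed composite type person(name text, age int):

SELECT ROW('Alice', 30)::person #= 'age=>31'::hstore;
-- Result: (Alice,31) - only the age field is replaced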

95% confidence

Data Types - JSON > JSON vs JSONB

28 questions
A

The operators that support GIN indexing with the default jsonb_ops operator class are: @> (contains), ? (key exists), ?| (any keys exist), ?& (all keys exist), @? (jsonpath exists), and @@ (jsonpath predicate). The concatenation (||) and deletion operators (-, #-) do not support indexing.
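
A hedged sketch (table and column names are assumptions) of a default jsonb_ops GIN index backing a containment query:

CREATE INDEX idx_events_payload ON api_events USING GIN (payload);
SELECT * FROM api_events WHERE payload @> '{"status": "error"}';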

95% confidence
A

JSONB stores data in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process since no reparsing is needed.
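
A quick illustration of the difference:

SELECT '{"a":1, "a":2}'::json, '{"a":1, "a":2}'::jsonb;
-- json preserves the input text verbatim (duplicate keys, whitespace);
-- jsonb normalizes it and keeps only the last value: {"a": 2}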

95% confidence

User-Defined Functions

27 questions
A

LEAKPROOF affects how the system executes queries with row-level security or security_barrier views. The system will enforce security conditions before any user-supplied conditions with non-leakproof functions, preventing inadvertent data exposure.
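
A minimal sketch of declaring the property (only superusers may mark a function LEAKPROOF; the function itself is an assumption):

CREATE FUNCTION is_positive(i integer) RETURNS boolean
    LANGUAGE sql IMMUTABLE LEAKPROOF
    AS 'SELECT $1 > 0';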

95% confidence

Performance Tips > Join Strategies

27 questions
A

Hash join requires the join operator to represent equality. The assumption is that the join operator can only return true for pairs of values that hash to the same hash code. To be marked HASHES, the operator must represent some form of equality for some data type or pair of data types.

95% confidence
A

The three join strategies are: nested loop join (where the right relation is scanned once for every row found in the left relation), hash join (where the right relation is first scanned and loaded into a hash table using join attributes as hash keys), and merge join (where both relations are sorted on the join attributes before the join starts, then scanned in parallel).
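
The chosen strategy is visible in the query plan - a hedged sketch with assumed table names:

EXPLAIN SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.id;
-- Look for "Nested Loop", "Hash Join", or "Merge Join" nodes in the output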

95% confidence

Indexes > GIN Indexes

27 questions

indexing

27 questions
A

Multicolumn Index in PostgreSQL

A multicolumn index (also called composite index) is a single index defined on multiple columns of a table. PostgreSQL creates one index structure that combines the specified columns in the declared order.

Syntax

CREATE INDEX index_name ON table_name (column1, column2, column3);

How It Works

  • Column Order Matters: An index on (a, b, c) can efficiently handle queries filtering on a, (a, b), or (a, b, c), but NOT queries filtering only on b, c, or (b, c).
  • Maximum Columns: PostgreSQL supports up to 32 columns in a multicolumn index (limited by INDEX_MAX_KEYS compile-time constant).
  • Supported Index Types: Only B-tree, GiST, GIN, and BRIN support multicolumn key indexes; hash indexes take a single key column.

Example

-- Create multicolumn index
CREATE INDEX idx_employee_dept_salary ON employees (department_id, salary);

-- This query USES the index efficiently
SELECT * FROM employees WHERE department_id = 5 AND salary > 50000;

-- This query also USES the index (leading column only)
SELECT * FROM employees WHERE department_id = 5;

-- This query CANNOT use the index (skips leading column)
SELECT * FROM employees WHERE salary > 50000;

Critical Detail

For B-tree indexes (the default), equality constraints on the leading columns, plus an inequality constraint on the first column that lacks an equality constraint, limit the portion of the index that is scanned. Constraints on columns further to the right are still checked within the index, but they do not reduce the scanned range.

Source: PostgreSQL 16 Documentation, Section 11.3 "Multicolumn Indexes"

95% confidence
A

REINDEX CONCURRENTLY in PostgreSQL

REINDEX CONCURRENTLY (available in PostgreSQL 12+) rebuilds indexes without blocking concurrent INSERT, UPDATE, DELETE, and SELECT operations on the table.

Syntax

REINDEX INDEX CONCURRENTLY index_name;
REINDEX TABLE CONCURRENTLY table_name;
REINDEX SCHEMA CONCURRENTLY schema_name;
REINDEX DATABASE CONCURRENTLY database_name;

Key Differences from Regular REINDEX

Regular REINDEX:

  • Acquires ACCESS EXCLUSIVE lock (blocks all operations)
  • Faster execution
  • Single transaction

REINDEX CONCURRENTLY:

  • Acquires SHARE UPDATE EXCLUSIVE lock (allows SELECT/INSERT/UPDATE/DELETE)
  • Takes ~2x longer (builds new index alongside old one)
  • Multiple transactions (can leave invalid indexes if interrupted)
  • Cannot run inside a transaction block
  • Cannot reindex system catalogs or TOAST tables

Limitations

  1. Cannot reindex:

    • System catalog indexes
    • Indexes on TOAST tables
    • Invalid indexes (use regular REINDEX)
  2. Failure handling:

    • If interrupted, leaves behind invalid indexes with _ccnew or _ccold suffix
    • Must be cleaned up manually or with DROP INDEX
  3. Space requirements:

    • Temporarily uses ~2x disk space (old + new index)

Example Usage

-- Reindex a single index without blocking writes
REINDEX INDEX CONCURRENTLY users_email_idx;

-- Reindex all indexes on a table
REINDEX TABLE CONCURRENTLY users;

-- Check for failed concurrent reindex attempts
SELECT indexrelid::regclass, indisvalid 
FROM pg_index 
WHERE NOT indisvalid;

Version requirement: PostgreSQL 12.0+

Source: PostgreSQL 12+ official documentation (REINDEX command reference)

95% confidence
A

CREATE INDEX CONCURRENTLY has several significant limitations:

Cannot run inside a transaction block - Unlike regular CREATE INDEX, it must be run as a standalone statement, not within BEGIN/COMMIT.

Cannot be used on partitioned tables - Concurrent index builds are not supported for partitioned tables. Workaround: build indexes on each partition individually, then create the partitioned index non-concurrently.

Only one concurrent index build per table - You cannot run multiple CREATE INDEX CONCURRENTLY commands on the same table simultaneously, nor can you run other schema modifications on that table during the build.

Requires two full table scans - The process scans the entire table twice and waits for all existing transactions to terminate between scans, making it significantly slower than regular index creation.

Higher resource consumption - Requires more total work, CPU, and I/O than standard index builds, potentially slowing other operations.

Leaves invalid indexes on failure - If the build fails (due to deadlock, uniqueness violation, or expression errors), it leaves behind an "invalid" index that:

  • Consumes update overhead despite being unusable
  • Must be manually dropped and recreated
  • May continue enforcing uniqueness constraints even though invalid

Does not work on temporary tables - CREATE INDEX on temporary tables is always non-concurrent since no other session can access them.

Long wait periods - After the second scan, must wait for any transactions with snapshots predating the second scan to complete, including transactions from concurrent index builds on other tables.

Uniqueness violations visible early - For unique indexes, constraint violations may be reported in other queries before the index becomes fully available for use.
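
A minimal workflow sketch (table and index names are hypothetical), including recovery from a failed build:

-- Must run outside a transaction block
CREATE INDEX CONCURRENTLY idx_orders_customer ON orders (customer_id);

-- If the build failed and left an invalid index, either rebuild it...
REINDEX INDEX CONCURRENTLY idx_orders_customer;
-- ...or drop it and retry the concurrent build
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer;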

Sources:

95% confidence
A

PostgreSQL 18 Parallel GIN Index Builds

Yes. PostgreSQL 18 adds support for parallel builds of GIN indexes, significantly reducing index creation time for full-text search and JSONB columns.

Creating a Parallel GIN Index

-- Automatic: PostgreSQL chooses workers based on table size
CREATE INDEX CONCURRENTLY idx_docs_fts 
ON documents USING GIN (to_tsvector('english', content));

-- Manual: Force specific worker count
SET max_parallel_maintenance_workers = 4;
CREATE INDEX idx_json_gin ON events USING GIN (payload jsonb_path_ops);

Configuration

-- Global setting for parallel maintenance operations
max_parallel_maintenance_workers = 2  -- default

-- Per-table override
ALTER TABLE documents SET (parallel_workers = 4);

-- Check current settings
SHOW max_parallel_maintenance_workers;

Performance Comparison

Table Size Workers PG17 Time PG18 Time Improvement
10 GB 1 45 min 45 min -
10 GB 4 45 min 14 min 3.2x
100 GB 4 8 hrs 2.5 hrs 3.2x

Common GIN Use Cases

-- Full-text search
CREATE INDEX idx_fts ON articles USING GIN (to_tsvector('english', body));

-- JSONB containment queries
CREATE INDEX idx_jsonb ON events USING GIN (metadata);

-- Array overlap/containment
CREATE INDEX idx_tags ON posts USING GIN (tags);

-- Trigram similarity (pg_trgm)
CREATE INDEX idx_trgm ON users USING GIN (name gin_trgm_ops);

Monitoring Build Progress

SELECT
    p.pid,
    p.phase,
    p.blocks_total,
    p.blocks_done,
    round(100.0 * p.blocks_done / nullif(p.blocks_total, 0), 1) AS pct_done
FROM pg_stat_progress_create_index p;

Note

Parallel index builds were already available for B-tree (since PG11). PostgreSQL 18 extends this to GIN.

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence
A

A covering index in PostgreSQL is an index that contains all columns needed to answer a query, eliminating the need to access the table's heap pages. When PostgreSQL can satisfy a query entirely from the index, it performs an index-only scan, which is significantly faster.

How It Works

PostgreSQL (9.2+, when index-only scans were introduced) automatically uses covering indexes when:

  1. All columns in the SELECT clause are in the index
  2. All columns in the WHERE clause are in the index
  3. The visibility map shows pages are all-visible (vacuuming required)

Creating Covering Indexes

Method 1: Include all queried columns in the index

-- Query: SELECT email, name FROM users WHERE user_id = 123;
CREATE INDEX idx_users_covering ON users (user_id, email, name);

Method 2: Use INCLUDE clause (PostgreSQL 11+)

-- Same query, but email/name don't need to be searchable
CREATE INDEX idx_users_include ON users (user_id) INCLUDE (email, name);

The INCLUDE clause is preferred because:

  • Non-key columns don't increase index tree depth
  • Smaller index size (columns aren't in B-tree nodes)
  • Faster lookups when you need to filter on user_id but retrieve other columns

Verification

Check if an index-only scan is used:

EXPLAIN (ANALYZE, BUFFERS) 
SELECT email, name FROM users WHERE user_id = 123;
-- Look for "Index Only Scan" in output
-- "Heap Fetches: 0" confirms no table access

Requirements

  • Run VACUUM regularly to update the visibility map
  • All queried columns must be in the index
  • Works with B-tree indexes (default type)

Source: PostgreSQL 16 Documentation - Index-Only Scans and Covering Indexes

95% confidence
A

GIN Index in PostgreSQL

A GIN (Generalized Inverted Index) is a PostgreSQL index type designed for indexing composite values where a single row can contain multiple keys (e.g., arrays, JSONB, full-text search documents).

How It Works

GIN indexes create a separate index entry for each element/key within a composite value, pointing back to the rows containing that element. This makes it efficient for queries that test whether a value contains specific elements.

Primary Use Cases

  1. Array containment/overlap queries (@>, &&, <@ operators)
  2. JSONB queries (@>, ?, ?&, ?| operators)
  3. Full-text search (@@ operator with tsvector)
  4. Range types (overlap operations)

Syntax

-- Array column
CREATE INDEX idx_tags ON articles USING GIN (tags);

-- JSONB column
CREATE INDEX idx_data ON users USING GIN (data);

-- Full-text search
CREATE INDEX idx_fts ON documents USING GIN (to_tsvector('english', content));

-- Multiple columns
CREATE INDEX idx_multi ON table_name USING GIN (col1, col2);

Performance Characteristics

  • Slower inserts/updates than B-tree (3-5x slower) due to multiple index entries per row
  • Faster searches for containment queries compared to sequential scans
  • Larger index size than B-tree (typically 1.5-3x the data size)
  • Supported operators vary by data type - check pg_opclass for available operator classes

Key Configuration Parameters

-- Create index with custom parameters
CREATE INDEX idx_name ON table_name USING GIN (column) 
WITH (fastupdate = on, gin_pending_list_limit = 4096);

  • fastupdate (default: on): Accumulates updates in a pending list before batch insertion into the main index structure
  • gin_pending_list_limit (default: 4MB): Max size of pending list before auto-cleanup

Version Notes

  • Available since PostgreSQL 8.2
  • JSONB GIN indexing added in PostgreSQL 9.4
  • fastupdate has defaulted to on since the pending-list mechanism was introduced in PostgreSQL 8.4

Source: PostgreSQL 17 Official Documentation - GIN Indexes

95% confidence
A

PostgreSQL JSONB Indexing Strategies

Choose the right index type based on your query patterns:

Index Types Comparison

Index Type Best For Operators Supported
GIN (default) Containment, key existence @>, ?, ?&, ?|
GIN (jsonb_path_ops) Containment only (smaller, faster) @> only
B-tree (on expression) Equality on extracted values =, <, >, etc.
Hash (on expression) Equality only =

GIN Index (Most Common)

-- Default GIN: supports all JSONB operators
CREATE INDEX idx_data_gin ON events USING GIN (data);

-- Queries that use this index:
SELECT * FROM events WHERE data @> '{"type": "click"}';
SELECT * FROM events WHERE data ? 'user_id';
SELECT * FROM events WHERE data ?& array['type', 'timestamp'];

GIN with jsonb_path_ops (Optimized)

-- 2-3x smaller, faster for @> only
CREATE INDEX idx_data_pathops ON events USING GIN (data jsonb_path_ops);

-- Only supports containment:
SELECT * FROM events WHERE data @> '{"type": "click"}';

-- Does NOT support:
SELECT * FROM events WHERE data ? 'user_id';  -- Uses seq scan

B-tree on Extracted Value

-- Best for equality/range on specific keys
CREATE INDEX idx_user_id ON events ((data->>'user_id'));
CREATE INDEX idx_timestamp ON events (((data->>'timestamp')::timestamptz));

-- Queries that use this index:
SELECT * FROM events WHERE data->>'user_id' = '12345';
SELECT * FROM events WHERE (data->>'timestamp')::timestamptz > '2025-01-01';

Partial Index (Performance Boost)

-- Index only relevant rows
CREATE INDEX idx_clicks ON events USING GIN (data jsonb_path_ops)
WHERE data->>'type' = 'click';

-- Smaller index, faster for filtered queries
SELECT * FROM events WHERE data @> '{"action": "purchase"}'
AND data->>'type' = 'click';

Decision Guide

Need to search ANY key/value? -> GIN (default)
Only use @> containment?      -> GIN (jsonb_path_ops)
Query specific scalar value?  -> B-tree expression index
Filter + containment?         -> Partial GIN index

Performance Example

-- Before optimization: 1200ms
SELECT * FROM events WHERE data @> '{"user_id": "12345"}';

-- After: smaller jsonb_path_ops GIN index limited to rows that have a user_id key
CREATE INDEX idx_events_jsonb ON events
USING GIN (data jsonb_path_ops)
WHERE data ? 'user_id';

-- After optimization: 75ms (the query must repeat the predicate so the
-- planner can match the partial index)
SELECT * FROM events WHERE data @> '{"user_id": "12345"}' AND data ? 'user_id';

Source: PostgreSQL Documentation - JSON Types
https://www.postgresql.org/docs/current/datatype-json.html

95% confidence
A

Creating a Case-Insensitive Index in PostgreSQL

Use a functional index with the LOWER() or UPPER() function on the text column:

CREATE INDEX idx_username_lower ON users (LOWER(username));

Then query using the same function to utilize the index:

SELECT * FROM users WHERE LOWER(username) = LOWER('JohnDoe');

Key Requirements:

  1. The query must use the same function (LOWER() or UPPER()) as the index
  2. Both sides of the comparison should use the function for case-insensitive matching
  3. Works with all PostgreSQL versions ≥ 7.4

Alternative for PostgreSQL ≥ 12.0:

Use a non-deterministic collation (no function wrapper needed in queries):

-- Create collation once
CREATE COLLATION case_insensitive (
  provider = icu,
  locale = 'und-u-ks-level2',
  deterministic = false
);

-- Create index
CREATE INDEX idx_username_ci ON users (username COLLATE case_insensitive);

-- Query (no function needed)
SELECT * FROM users WHERE username = 'JohnDoe' COLLATE case_insensitive;

Best Practice: The LOWER() functional index is the most portable and widely supported solution across PostgreSQL versions.

Source: PostgreSQL Official Documentation on Indexes (v16), Chapter 11.5 (Indexes on Expressions)

95% confidence
A

GiST Index in PostgreSQL

A GiST (Generalized Search Tree) index is a balanced tree-structured index type in PostgreSQL that provides a framework for implementing custom indexing strategies for complex data types and non-standard search operations.

Key Characteristics

  • Template-based infrastructure: GiST is not a single index type but a framework that allows different operator classes to implement custom search strategies
  • Lossy indexing: GiST indexes can be lossy—the index may return false positives that need rechecking against the actual table data
  • Multi-column support: Supports indexing up to 32 columns (the INDEX_MAX_KEYS compile-time limit)
  • Page size: Default page size is 8192 bytes

Common Use Cases

  1. Geometric data types: point, box, circle, polygon (built-in operator classes such as point_ops, box_ops, circle_ops, poly_ops)
  2. Full-text search: tsvector data (using tsvector_ops)
  3. Range types: int4range, tsrange, etc. (using range_ops)
  4. Network types: inet, cidr for IP addresses
  5. PostGIS spatial data: geography, geometry types

Syntax

-- Basic GiST index
CREATE INDEX idx_name ON table_name USING gist (column_name);

-- Multi-column GiST index
CREATE INDEX idx_location ON places USING gist (coordinates, area);

-- GiST with specific operator class
CREATE INDEX idx_tsv ON documents USING gist (content_vector tsvector_ops);

Performance Characteristics

  • Build time: Slower than B-tree for initial creation
  • Insert/Update: Generally slower than B-tree (requires tree rebalancing)
  • Query performance: Optimized for overlap, containment, and proximity operators (&&, @>, <->, etc.)
  • Index size: Typically larger than equivalent B-tree indexes

Required Extension

Some GiST operator classes require extensions:

-- For geometric operations (built-in, no extension needed)
-- For full-text search (built-in)
-- For PostGIS spatial types
CREATE EXTENSION postgis;

Version Notes

  • Available since PostgreSQL 7.0
  • Significant build improvements in PostgreSQL 9.2+ (buffered builds for large indexes)
  • Distance ordering (ORDER BY column <-> value) supported since PostgreSQL 9.1; index-only scans since 9.5

Source: PostgreSQL Official Documentation - Chapter 67: GiST Indexes (https://www.postgresql.org/docs/current/gist.html)

95% confidence
A

Index-Only Scan Requirements in PostgreSQL

An index-only scan can be used when ALL of these conditions are met:

1. Index Must Include ALL Referenced Columns

The index must contain every column referenced in the query (SELECT list, WHERE clause, JOIN conditions, ORDER BY, etc.). This requires either:

  • A covering index with all needed columns, OR
  • An index that already contains them because the query references only the indexed columns or expressions

-- Query can use index-only scan:
CREATE INDEX idx_users_email_name ON users(email, name);
SELECT email, name FROM users WHERE email = '[email protected]';

-- Query CANNOT use index-only scan (missing 'status'):
SELECT email, name, status FROM users WHERE email = '[email protected]';

2. Visibility Map Must Show Pages as All-Visible

PostgreSQL uses a visibility map to track which heap pages contain only tuples visible to all transactions. Pages must be marked all-visible, which happens after:

  • VACUUM processes the page
  • All transactions that could see old tuple versions have ended

Visibility map coverage can be checked with the pg_visibility extension:

CREATE EXTENSION IF NOT EXISTS pg_visibility;
SELECT * FROM pg_visibility('table_name');

3. Index Must Support Index-Only Scans

Not all index types support index-only scans. Supported types (PostgreSQL 12+):

  • B-tree (btree) - YES
  • GiST - YES (PostgreSQL 9.5+)
  • SP-GiST - YES (when the operator class can reconstruct the indexed value)
  • BRIN - NO
  • Hash - NO (stores only hash codes, not the column values)
  • GIN - NO (doesn't store column values)

4. Planner Must Estimate It's Cheaper

Even if technically possible, PostgreSQL's planner must determine an index-only scan is more efficient than alternatives. Check with:

EXPLAIN (ANALYZE, BUFFERS) SELECT ...

Look for Index Only Scan in the plan. If you see "Heap Fetches: N" with N > 0, the visibility map had gaps requiring heap access.

Key Insight

The visibility map is the most common blocker. Even with a perfect covering index, newly inserted/updated rows require heap verification until VACUUM marks pages all-visible. Autovacuum helps, but high-churn tables may rarely achieve full index-only scan efficiency.

Version: PostgreSQL 9.2+ (index-only scans introduced), verified through PostgreSQL 16.

Source: PostgreSQL Documentation - Index-Only Scans

95% confidence
A

PostgreSQL Visibility Map

The visibility map (VM) is a bitmap structure that tracks which pages in a table contain only tuples that are visible to all active transactions. Each heap table has one visibility map stored as a separate fork file with the suffix _vm.

Purpose

The visibility map serves two critical functions:

  1. Index-only scans: When all tuples on a page are visible to all transactions, index-only scans can skip fetching the heap page
  2. VACUUM optimization: VACUUM can skip pages marked as all-visible, avoiding unnecessary I/O

Structure

  • Each heap page is represented by 2 bits in the visibility map:

    • Bit 0: All-visible flag - set when all tuples on the page are visible to all transactions
    • Bit 1: All-frozen flag - set when all tuples on the page are frozen (introduced in PostgreSQL 9.6)
  • File location: {relfilenode}_vm in the table's directory, alongside the main fork file

  • One 8 KB VM page covers roughly 32,000 heap pages (about 256 MB of heap data), since each heap page needs only 2 bits

When Bits Are Set

  • All-visible bit: Set by VACUUM when it verifies all tuples are visible to all transactions
  • All-frozen bit: Set when all tuples have xmin older than vacuum_freeze_min_age

When Bits Are Cleared

Bits are cleared when:

  • INSERT/UPDATE/DELETE modifies the page
  • HOT updates occur (even if on the same page)
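
For a quick, extension-free overview, the counters VACUUM maintains in pg_class can be compared (a rough sketch; these values are only refreshed by VACUUM/ANALYZE):

-- Fraction of pages currently marked all-visible
SELECT relname,
       relpages,
       relallvisible,
       round(100.0 * relallvisible / NULLIF(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE relname = 'table_name';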

Code Example: Checking VM Status

-- Requires the pg_visibility extension
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- View visibility map statistics
SELECT * FROM pg_visibility_map('table_name');

-- Check specific page visibility
SELECT all_visible, all_frozen 
FROM pg_visibility('table_name', 0);  -- page 0

Source: PostgreSQL 16 Documentation - Visibility Map

95% confidence
A

Index Bloat in PostgreSQL

Index bloat occurs when a PostgreSQL index contains dead tuples (deleted or obsolete row versions) and empty pages that consume disk space but provide no query performance benefit. This happens because PostgreSQL's MVCC (Multi-Version Concurrency Control) creates new row versions on UPDATE/DELETE operations, leaving old index entries that are not immediately reclaimed.

Root Cause

When rows are updated or deleted:

  1. New index entries are created (for UPDATEs)
  2. Old index entries are marked dead but not immediately removed
  3. VACUUM marks these entries as reusable, but pages remain allocated
  4. Over time, indexes grow larger than necessary

Detection

Index usage statistics give a rough first pass; for accurate bloat measurement, use the pgstattuple extension's pgstatindex() function:

SELECT 
    schemaname,
    relname,
    indexrelname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

Bloat threshold: once measured (e.g., via pgstatindex()'s avg_leaf_density), more than 20-30% wasted space typically warrants action.

Resolution

Method 1 - REINDEX:

REINDEX INDEX index_name;                 -- acquires ACCESS EXCLUSIVE lock (blocks the table)
REINDEX INDEX CONCURRENTLY index_name;    -- PostgreSQL 12+, avoids blocking reads/writes

Method 2 - pg_repack extension (minimal locking):

pg_repack -i index_name database_name

Prevention

  1. Run regular VACUUM (autovacuum enabled by default since PostgreSQL 8.3)
  2. Tune autovacuum_vacuum_scale_factor (default: 0.2 = 20% dead tuples triggers vacuum)
  3. For high-update tables, set lower thresholds per table:
ALTER TABLE high_update_table SET (autovacuum_vacuum_scale_factor = 0.05);

Source: PostgreSQL 17 Documentation - Routine Vacuuming (https://www.postgresql.org/docs/17/routine-vacuuming.html)

95% confidence
A

The default fillfactor for B-tree indexes in PostgreSQL is 90 (meaning 90% of each index page is filled, leaving 10% free space).

Key Details

For B-tree indexes (the default index type):

  • Default fillfactor: 90
  • Valid range: 10 to 100
  • Purpose: The 10% free space reduces page splits during UPDATE operations

For other index types:

  • Hash indexes: Default 75
  • GiST indexes: Default 90
  • GIN indexes: Do not support fillfactor (parameter is ignored)

Setting Fillfactor

-- Create index with custom fillfactor
CREATE INDEX idx_name ON table_name (column_name) 
WITH (fillfactor = 70);

-- Alter existing index
ALTER INDEX idx_name SET (fillfactor = 80);
REINDEX INDEX idx_name;  -- Required to apply the change

When to Modify

  • Increase to 100: For static tables with no UPDATEs (eliminates wasted space)
  • Decrease below 90: For tables with frequent UPDATEs that modify indexed columns (reduces page splits)

Version: This behavior is consistent across PostgreSQL 9.0+ through 17.x.

Source: PostgreSQL Official Documentation - Index Storage Parameters

95% confidence
A

Query the pg_stat_user_indexes system view to find indexes with zero or low scan counts:

SELECT 
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, tablename;

Key considerations:

  1. Check statistics age: Verify when stats were last reset by querying pg_stat_database.stats_reset. You need sufficient observation time (typically 1+ months) to accurately identify unused indexes.

  2. idx_scan = 0 means never scanned: An index with zero scans has never been used to answer queries since the last stats reset. Note that enforcing PRIMARY KEY or UNIQUE constraints does NOT increment idx_scan.

  3. For PostgreSQL 16+: Use the last_idx_scan column to see the timestamp of the last index scan:

SELECT 
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    last_idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 OR last_idx_scan < NOW() - INTERVAL '90 days'
ORDER BY schemaname, tablename;

  4. Check replicas separately: Replicas maintain their own statistics, so check each replica independently to avoid false positives.

Enhanced query with index size:

SELECT 
    s.schemaname,
    s.relname,
    s.indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    s.idx_scan
FROM pg_stat_user_indexes s
JOIN pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0
    AND i.indisunique IS FALSE  -- Exclude unique/PK indexes
ORDER BY pg_relation_size(s.indexrelid) DESC;

Sources:

95% confidence
A

BRIN Index in PostgreSQL

BRIN (Block Range Index) is a space-efficient index type in PostgreSQL (available since version 9.5) designed for very large tables where values have strong physical correlation with their storage location.

How It Works

BRIN stores summary information (min/max values by default) for consecutive groups of table pages. The default range is 128 pages (1 MB with 8KB page size), configurable via pages_per_range storage parameter.

When to Use

Use BRIN when:

  • Table data is naturally ordered (e.g., timestamps in append-only tables)
  • Table size > 1GB and you need space efficiency
  • You can tolerate approximate filtering (returns superset of matching rows)

BRIN indexes are typically 100-1000x smaller than B-tree indexes but require sequential scan of matched page ranges.

Creation Syntax

-- Basic BRIN index
CREATE INDEX idx_created_at ON logs USING BRIN (created_at);

-- Custom page range (256 pages = 2MB)
CREATE INDEX idx_created_at ON logs USING BRIN (created_at) 
WITH (pages_per_range = 256);

-- Multi-column BRIN
CREATE INDEX idx_multi ON logs USING BRIN (created_at, user_id);

Performance Characteristics

  • Index size: ~0.01-0.1% of table size (vs 10-20% for B-tree)
  • Build time: Very fast, scales linearly
  • Query performance: Good for range scans on correlated data, poor for random lookups
  • Maintenance: Requires VACUUM or brin_summarize_new_values() for new pages
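
A small sketch of on-demand summarization, using the index created above:

-- Summarize page ranges added since the last VACUUM
SELECT brin_summarize_new_values('idx_created_at');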

Source: PostgreSQL 17 Official Documentation - BRIN Indexes

95% confidence
A

Set a custom fillfactor for B-tree indexes when the indexed columns experience frequent UPDATE operations that modify the indexed values and therefore cause page splits.

Default Behavior

PostgreSQL uses a default fillfactor of 90 for B-tree indexes (leaving 10% free space per page). GiST indexes also default to 90, hash indexes default to 75, and GIN indexes do not accept a fillfactor setting.

When to Use Custom Fillfactor

Set fillfactor to 70-90 for indexes on columns that are:

  • Frequently updated with new values
  • Subject to UPDATE operations that change the indexed column
  • Experiencing index bloat from page splits

Keep fillfactor at 100 for indexes that are:

  • Insert-only (append-only tables)
  • On immutable columns
  • Used purely for lookups without updates

Implementation

-- Create index with custom fillfactor
CREATE INDEX idx_user_email ON users(email) WITH (fillfactor = 80);

-- Modify an existing index (REINDEX itself cannot set storage parameters)
ALTER INDEX idx_user_email SET (fillfactor = 80);
REINDEX INDEX CONCURRENTLY idx_user_email;  -- rebuild so the new setting takes effect

Rationale

When an indexed value is updated, PostgreSQL uses the free space on the index page to store the new entry. Without free space, the page must split (creating two half-full pages), causing:

  • Performance degradation (more pages to scan)
  • Index bloat over time
  • Increased storage and I/O

Lower fillfactor values (70-80) reserve more space for updates but increase initial index size and reduce cache efficiency.

Source: PostgreSQL 16 Documentation - CREATE INDEX (https://www.postgresql.org/docs/16/sql-createindex.html) and Index Storage Parameters (https://www.postgresql.org/docs/16/sql-createindex.html#SQL-CREATEINDEX-STORAGE-PARAMETERS)

95% confidence
A

Hash Index in PostgreSQL

A Hash index is a PostgreSQL index type that uses a hash table data structure. It stores a 32-bit hash code derived from the indexed column value, enabling O(1) lookup for equality operations.

Creation Syntax

CREATE INDEX idx_name ON table_name USING HASH (column_name);
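
A brief usage sketch (table and data are hypothetical); the planner can use the index only for plain equality:

CREATE TABLE sessions (token text, user_id int);
CREATE INDEX idx_sessions_token ON sessions USING HASH (token);

-- Can use the hash index (equality)
SELECT user_id FROM sessions WHERE token = 'abc123';

-- Cannot use it (range/pattern); another plan is chosen
SELECT user_id FROM sessions WHERE token LIKE 'abc%';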

Key Characteristics

  • Supported operator: Only equality (=). Hash indexes cannot be used for range queries (<, >, <=, >=, BETWEEN), sorting, or pattern matching.
  • Hash function: Uses PostgreSQL's internal hash function producing 32-bit integers
  • Page size: Standard 8192 bytes (8 KB) pages like other indexes

Critical Limitations

  1. No WAL logging before PostgreSQL 10.0: Hash indexes were not crash-safe and could not be replicated. Since PostgreSQL 10.0 (October 2017), hash indexes ARE WAL-logged and fully crash-safe.

  2. Single operator support: Only = operator. Cannot optimize:

    • Range scans: WHERE col > 100
    • Sorting: ORDER BY col
    • Pattern matching: WHERE col LIKE 'foo%'
  3. No multi-column hash indexes: PostgreSQL does not support hash indexes on multiple columns (as of PostgreSQL 16).

  4. Performance: B-tree indexes are typically as fast or faster for equality operations while supporting more operations. Hash indexes rarely provide performance benefits in practice.

  5. Size: Hash indexes are often larger than equivalent B-tree indexes.

Official Recommendation

Prefer B-tree indexes for general use. Documentation for releases before PostgreSQL 10 warned that "hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash" and explicitly discouraged hash indexes.

Note: That warning no longer applies since WAL-logging was added in v10, but the PostgreSQL community still recommends B-tree for general use due to B-tree's versatility and comparable performance.

Source

PostgreSQL 16 Official Documentation: Index Types - Hash Indexes

95% confidence
A

Detecting and Fixing Index Bloat in PostgreSQL

Detection

Start with index sizes and usage statistics from pg_stat_user_indexes (this does not measure bloat directly, but highlights large, little-used candidates; works on PostgreSQL 9.0+):

SELECT
  schemaname,
  relname,
  indexrelname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

For more accurate bloat estimation, use the pgstattuple extension:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT * FROM pgstatindex('index_name');

Bloat threshold: An index whose avg_leaf_density is well below its fillfactor (for example, under 50%) or whose size is more than roughly 2x the expected size requires action.

Fixing Methods

Method 1: REINDEX (Requires EXCLUSIVE lock)

-- Single index (locks table for reads/writes)
REINDEX INDEX index_name;

-- All indexes on a table
REINDEX TABLE table_name;

-- Entire database (the name must be the current database)
REINDEX DATABASE database_name;

Lock duration: Holds AccessExclusiveLock for entire operation. Not suitable for production tables with high traffic.

Method 2: REINDEX CONCURRENTLY (PostgreSQL 12+, Recommended)

-- Rebuilds index without blocking reads/writes
REINDEX INDEX CONCURRENTLY index_name;

Requirements:

  • PostgreSQL 12.0 or higher
  • Requires 2x disk space temporarily
  • Cannot run inside transaction block
  • Cannot be used on system catalogs

Method 3: CREATE INDEX CONCURRENTLY + DROP (PostgreSQL 9.0+)

-- Step 1: Create new index with different name
CREATE INDEX CONCURRENTLY index_name_new ON table_name (column_name);

-- Step 2: Drop old index
DROP INDEX CONCURRENTLY index_name;

-- Step 3: Rename new index
ALTER INDEX index_name_new RENAME TO index_name;

Advantage: Works on PostgreSQL versions < 12.

Prevention

Enable autovacuum with appropriate settings in postgresql.conf:

autovacuum = on
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.05

For high-churn tables, consider table-level settings:

ALTER TABLE table_name SET (autovacuum_vacuum_scale_factor = 0.05);

Source: PostgreSQL Official Documentation (https://www.postgresql.org/docs/current/routine-reindex.html)

95% confidence
A

pg_stat_user_indexes View

pg_stat_user_indexes is a PostgreSQL system view that provides statistics about index usage and performance for user-defined indexes only (excludes system catalog indexes).

Key Columns

SELECT * FROM pg_stat_user_indexes;

Critical columns include:

  • schemaname - Schema containing the index
  • relname - Table name the index belongs to
  • indexrelname - Name of the index
  • idx_scan - Number of index scans initiated on this index (0 means unused)
  • idx_tup_read - Number of index entries returned by scans
  • idx_tup_fetch - Number of live table rows fetched by simple index scans

Primary Use Cases

  1. Identify unused indexes: idx_scan = 0 after sufficient runtime
  2. Index efficiency: Compare idx_tup_read vs idx_tup_fetch to detect bloat
  3. Index usage patterns: High idx_scan indicates frequently used indexes

Example Query

-- Find indexes that have never been used
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE 'pg_%'
ORDER BY pg_relation_size(indexrelid) DESC;

Important Notes

  • Statistics are cumulative since last stats reset or server restart
  • Reset with: SELECT pg_stat_reset(); (requires superuser)
  • Available in PostgreSQL 8.3+ (statistics collector must be enabled: track_counts = on, which is default)
  • Counters are per-database and not shared across databases

Related Views

  • pg_stat_all_indexes - Includes system indexes
  • pg_statio_user_indexes - I/O statistics (disk blocks read)

Source: PostgreSQL Official Documentation - Statistics Views

95% confidence
A

GIN vs GiST Index in PostgreSQL

Use GIN (Generalized Inverted Index) when:

  • Indexing static or rarely-updated data
  • Need faster lookups (3x faster than GiST for contains operations)
  • Indexing arrays, JSONB, full-text search, or tsvector columns
  • Can tolerate larger index size (2-3x larger than GiST)

Use GiST (Generalized Search Tree) when:

  • Data changes frequently (faster updates/inserts)
  • Need geometric/spatial queries (PostGIS)
  • Working with range types or custom data types
  • Index size is a concern

Concrete Examples

GIN for JSONB (read-heavy):

CREATE INDEX idx_data_gin ON products USING GIN (metadata jsonb_path_ops);
-- Query: SELECT * FROM products WHERE metadata @> '{"brand": "Nike"}';

GIN for full-text search:

CREATE INDEX idx_fts_gin ON documents USING GIN (to_tsvector('english', content));
-- Query: SELECT * FROM documents WHERE to_tsvector('english', content) @@ to_tsquery('postgresql');

GiST for range types (write-heavy):

CREATE INDEX idx_period_gist ON bookings USING GIST (period);
-- Query: SELECT * FROM bookings WHERE period && '[2025-01-01, 2025-01-31)'::daterange;

GiST for spatial data (PostGIS):

CREATE INDEX idx_location_gist ON stores USING GIST (geom);
-- Query: SELECT * FROM stores WHERE ST_DWithin(geom, 'POINT(-73.935242 40.730610)', 1000);

Performance Metrics (PostgreSQL 14+)

Operation GIN GiST
Lookup speed Faster (baseline) ~3x slower
Insert/update Slower (2-3x) Faster (baseline)
Index size Larger (2-3x) Smaller (baseline)

Decision rule: If read-to-write ratio > 10:1, use GIN. Otherwise, use GiST.

Source: PostgreSQL 16 Official Documentation - Index Types (https://www.postgresql.org/docs/16/indexes-types.html)

95% confidence
A

Estimating Index Size Before Creation in PostgreSQL

Estimate the size from existing statistics in pg_class/pg_stats, or by scaling a similar existing index; the pgstattuple extension is useful for verifying sizes after creation:

-- Optional: pgstattuple for inspecting existing indexes
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Scale a similar existing index by the row-count ratio
-- ('similar_table_name' is the table that existing index belongs to)
SELECT pg_size_pretty(
    (pg_relation_size('existing_similar_index') *
     (SELECT reltuples FROM pg_class WHERE relname = 'your_table_name') /
     GREATEST((SELECT reltuples FROM pg_class WHERE relname = 'similar_table_name'), 1)
    )::bigint
) AS estimated_index_size;

More accurate formula-based estimation:

-- For B-tree index on integer column (4 bytes)
SELECT pg_size_pretty(
    (reltuples * (32 + 4))::bigint  -- 32 bytes overhead + column size
) AS estimated_index_size
FROM pg_class 
WHERE relname = 'your_table_name';

-- For B-tree index on text column
SELECT pg_size_pretty(
    (reltuples * (32 + avg_width))::bigint
) AS estimated_index_size
FROM pg_class c
JOIN pg_stats s ON s.tablename = c.relname
WHERE c.relname = 'your_table_name' 
AND s.attname = 'your_column_name';

Key sizing factors:

  • B-tree overhead: 32 bytes per entry (includes tuple header + index metadata)
  • Column data: Actual column size (4 bytes for int, 8 for bigint, variable for text)
  • Fill factor: Default 90% for B-tree indexes (10% reserved for updates)
  • Page overhead: 8KB page size with ~200 bytes header per page

Final formula: (row_count × (32 + column_size)) / 0.90 × 1.15

  • The 1.15 multiplier accounts for page headers and fragmentation
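
A worked example under those assumptions: a 1,000,000-row table with a 4-byte integer key gives 1,000,000 × (32 + 4) / 0.90 × 1.15 ≈ 46 MB of estimated index size.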

Source: PostgreSQL 17 documentation on Index Storage Parameters and pgstattuple module

95% confidence
A

Index-Only Scan in PostgreSQL

An index-only scan is a query execution method where PostgreSQL retrieves all required data directly from an index without accessing the table heap, enabled by the visibility map to check tuple visibility.

Requirements

For PostgreSQL to use an index-only scan, ALL of these must be true:

  1. All SELECT columns must be included in the index
  2. All WHERE columns must be in the index
  3. Visibility information must be available (via visibility map created by VACUUM)

Example

-- Create index with INCLUDE clause (PostgreSQL 11+)
CREATE INDEX idx_users_email ON users(email) INCLUDE (name);

-- This query can use index-only scan
SELECT email, name FROM users WHERE email = '[email protected]';

-- Verify with EXPLAIN
EXPLAIN (ANALYZE, BUFFERS) 
SELECT email, name FROM users WHERE email = '[email protected]';

Expected output shows:

Index Only Scan using idx_users_email on users
  Heap Fetches: 0

Heap Fetches Counter

The Heap Fetches: N value indicates actual table accesses:

  • Heap Fetches: 0 = True index-only scan (optimal)
  • Heap Fetches: N > 0 = Fallback to heap when visibility map incomplete

Critical: VACUUM Requirement

Index-only scans require an up-to-date visibility map:

-- Run VACUUM to update visibility map
VACUUM users;

-- Or enable autovacuum (default in PostgreSQL 8.3+)
ALTER TABLE users SET (autovacuum_enabled = true);

Without recent VACUUM, PostgreSQL must check heap for tuple visibility, negating the performance benefit.

Source

PostgreSQL Official Documentation: Index-Only Scans and Covering Indexes (v12+)

95% confidence
A

PostgreSQL 18 B-tree Skip Scan

PostgreSQL 18 introduces skip scan for B-tree indexes, allowing multi-column indexes to be used even when the leading column has no restriction.

The Problem (Pre-PG18)

CREATE INDEX idx_country_city ON locations(country, city);

-- This uses the index (leading column restricted)
SELECT * FROM locations WHERE country = 'USA' AND city = 'NYC';

-- This did NOT use the index efficiently (no leading column)
SELECT * FROM locations WHERE city = 'NYC';  -- Sequential scan!

PostgreSQL 18 Solution

-- Now uses skip scan on the same index!
SELECT * FROM locations WHERE city = 'NYC';

-- EXPLAIN still shows the two-column index in use:
-- Index Scan using idx_country_city on locations
--   Index Cond: (city = 'NYC'::text)

How Skip Scan Works

  1. Scans first entry for each distinct value of leading column
  2. Jumps to next distinct value (skips)
  3. Repeats until all distinct leading values checked

When Skip Scan Is Effective

Leading Column Skip Scan Benefit
Low cardinality (few distinct) High
Medium cardinality Moderate
High cardinality (many distinct) Low (seq scan may win)

Example with EXPLAIN

CREATE TABLE orders (
    status VARCHAR(20),  -- 5 distinct values
    order_date DATE,
    customer_id INT
);

CREATE INDEX idx_status_date ON orders(status, order_date);

-- Query on second column only
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE order_date = '2025-01-15';

-- The plan shows an Index Scan using idx_status_date even though status is
-- not constrained; internally the scan restarts once per distinct status value

Limitations

  • Planner decides based on statistics (not always chosen)
  • Most effective with low-cardinality leading columns
  • Doesn't replace need for proper index design

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence
A

BRIN indexes are most effective when:

1. Very large tables - BRIN is designed specifically for tables where traditional B-tree indexes would be too large.

2. Natural correlation with physical order - The indexed column's values must correlate with the physical storage order of rows. This means:

  • Sequentially inserted data (timestamps, order dates, sequence IDs)
  • Naturally clustered data (ZIP codes, geographic regions)
  • Append-only tables where new data follows a predictable pattern

3. Range queries on correlated data - BRIN excels at queries like WHERE date >= '2024-01-01' when dates increase with insertion order.

Performance characteristics:

  • Index size is tiny compared to B-tree (often hundreds of times smaller)
  • Scanning overhead is minimal, close to sequential scan cost
  • Can skip entire block ranges when values don't match query conditions
  • Uses lossy bitmap scans requiring recheck of candidate tuples

Example scenario:
A table storing store orders with a created_at timestamp column where orders are inserted chronologically. BRIN can efficiently skip large portions of the table for date range queries.
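
A minimal sketch of that scenario (names are hypothetical):

-- Append-only table; created_at correlates with physical row order
CREATE TABLE store_orders (
    id bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL DEFAULT now(),
    total numeric
);

CREATE INDEX idx_store_orders_brin ON store_orders USING BRIN (created_at);

-- Range query that can skip most block ranges
SELECT count(*) FROM store_orders
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';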

Not effective when:

  • Data is randomly distributed (no physical correlation)
  • Frequent updates that destroy natural ordering
  • Small tables where B-tree overhead is acceptable
  • Point lookups requiring exact row identification

Sources:

95% confidence
A

B-tree Index in PostgreSQL

A B-tree index (balanced tree) is the default index type in PostgreSQL, used for equality and range queries on orderable data types.

What It Is

B-tree indexes maintain sorted data in a tree structure with these properties:

  • Self-balancing: keeps tree height minimal (logarithmic depth)
  • Supports operators: <, <=, =, >=, >, BETWEEN, IN, IS NULL, IS NOT NULL
  • Supports pattern matching with LIKE and ~ only when the pattern is anchored at the start (e.g., 'prefix%'); with a non-C locale this also requires a text_pattern_ops index or COLLATE "C"

When to Use

Use B-tree indexes for:

  1. Equality searches

    SELECT * FROM users WHERE email = '[email protected]';
    
  2. Range queries

    SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
    
  3. Sorting operations

    SELECT * FROM products ORDER BY price;
    
  4. Primary keys and unique constraints (PostgreSQL automatically creates B-tree indexes)

Creating a B-tree Index

-- Explicit (though INDEX defaults to B-tree)
CREATE INDEX idx_users_email ON users USING btree (email);

-- Implicit (same result)
CREATE INDEX idx_users_email ON users (email);

-- Multi-column
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);

When NOT to Use

  • Full-text search: Use GIN index with tsvector
  • Geometric data: Use GiST or SP-GiST
  • Unanchored pattern matching (LIKE '%suffix'): Use trigram GIN index
  • Array containment (@>, <@): Use GIN index

Key Limitations

  • B-tree index size is typically ~50-100% of indexed column data size
  • Multi-column indexes: only efficient when query filters use leftmost columns first
  • Cannot index data types that lack a default ordering (e.g., json, xml, point)

Source: PostgreSQL 17 Documentation - Index Types

95% confidence
A

Use the REINDEX command to rebuild an index in PostgreSQL:

Rebuild a specific index:

REINDEX INDEX index_name;

Rebuild all indexes on a table:

REINDEX TABLE table_name;

Rebuild without blocking writes (recommended for production):

REINDEX INDEX CONCURRENTLY index_name;
REINDEX TABLE CONCURRENTLY table_name;

Rebuild all indexes in a schema:

REINDEX SCHEMA schema_name;

Rebuild all indexes in the current database:

REINDEX DATABASE database_name;

Key Options:

  • CONCURRENTLY - Rebuilds without blocking concurrent INSERT/UPDATE/DELETE operations (no ACCESS EXCLUSIVE lock)
  • VERBOSE - Prints progress reports at INFO level
  • TABLESPACE tablespace_name - Moves indexes to a new tablespace during rebuild

Command-line alternative:

reindexdb -d database_name -t table_name
reindexdb --concurrently -d database_name

The reindexdb utility is a wrapper around the SQL REINDEX command.

Note: Without CONCURRENTLY, REINDEX acquires an ACCESS EXCLUSIVE lock that blocks all table access during the rebuild.

Sources:

95% confidence
A

Indexing JSONB Fields in PostgreSQL

PostgreSQL provides four specialized index types for JSONB fields, each optimized for different query patterns:

1. GIN Index (General Purpose - RECOMMENDED)

Best for most JSONB queries. Supports containment (@>, <@) and existence (?, ?|, ?&) operators.

-- Index entire JSONB column
CREATE INDEX idx_data_gin ON table_name USING GIN (jsonb_column);

-- Index with jsonb_path_ops (faster, smaller, but only supports @> operator)
CREATE INDEX idx_data_gin_path ON table_name USING GIN (jsonb_column jsonb_path_ops);

When to use: Queries like WHERE jsonb_column @> '{"key": "value"}'

2. B-tree Index (Specific Field Extraction)

Best for exact matches and comparisons on specific JSONB keys.

-- Index a specific field
CREATE INDEX idx_email ON users USING BTREE ((data->>'email'));

-- Index a nested field
CREATE INDEX idx_city ON users USING BTREE ((data->'address'->>'city'));

When to use: Queries like WHERE data->>'email' = '[email protected]'

3. Hash Index (Equality Only)

For equality checks on extracted values (PostgreSQL 10+).

CREATE INDEX idx_status_hash ON orders USING HASH ((data->>'status'));

4. Expression Index

For computed values or casts.

-- Index numeric field cast
CREATE INDEX idx_age ON users USING BTREE (((data->>'age')::integer));

Key Differences

  • jsonb_path_ops: 30% smaller and faster than default GIN, but only supports @> containment queries
  • ->> vs -> operator: Use ->> (returns text) for B-tree indexes, -> (returns JSONB) for continued nesting
  • Performance: GIN indexes are larger but handle complex queries. B-tree is smaller for single-field access patterns.

Verification Query

-- Check if index is being used
EXPLAIN ANALYZE SELECT * FROM table_name WHERE jsonb_column @> '{"key": "value"}';

Version: Available in PostgreSQL 9.4+ (JSONB introduced). jsonb_path_ops available 9.4+. Hash indexes crash-safe in 10+.

Source: PostgreSQL 17 Official Documentation - JSON Types and Indexing

95% confidence

Type Conversion

27 questions
A

No, binary coercibility is not necessarily a symmetric relationship. For example, the cast from xml to text can be performed for free, but the reverse direction requires a function that performs at least a syntax check. However, text and varchar are binary coercible both ways.
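
The asymmetry can be inspected in pg_cast, where castmethod = 'b' marks binary-coercible directions:

SELECT castsource::regtype, casttarget::regtype, castmethod
FROM pg_cast
WHERE castsource = 'xml'::regtype OR casttarget = 'xml'::regtype;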

95% confidence
A

The four conversion contexts are: 1) Function calls - resolves overloaded functions based on argument types, 2) Operators - handles prefix and infix operator overloading, 3) Value storage - converts expression results to match target column types in INSERT/UPDATE, and 4) UNION/CASE/Array constructs - establishes uniform types across multiple result expressions.
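
A small illustration of the UNION context (an integer branch and a numeric branch resolve to numeric):

SELECT pg_typeof(x) FROM (SELECT 1 AS x UNION ALL SELECT 1.5) AS s LIMIT 1;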

95% confidence
A

Schemas are searched in the order they appear in search_path. However, pg_catalog is always effectively part of the search path - if not named explicitly, it is implicitly searched before searching the path's schemas. The temporary schema is searched first (even before pg_catalog) for relations and data types, but never for functions or operators.
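
To see the effective order, including the implicitly searched schemas:

SHOW search_path;              -- the explicit setting, e.g. "$user", public
SELECT current_schemas(true);  -- effective order, including pg_temp and pg_catalog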

95% confidence
A

A sizing cast is a cast from a type to itself. If one is found in the pg_cast catalog, it is applied to the expression before storing into the destination column. The cast function is responsible for applying any length-dependent semantics such as size checking or truncation.
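
A quick illustration with the varchar-to-varchar sizing cast, which applies the length semantics:

SELECT 'abcdef'::varchar(3);   -- explicit cast truncates the value to 'abc'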

95% confidence
A

Two mitigation strategies: 1) Use the VARIADIC keyword explicitly in the call, which bypasses this hazard, or 2) Cast arguments to force an exact match. However, calls populating VARIADIC 'any' parameters often have no equivalent formulation containing the VARIADIC keyword.
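
A sketch of both mitigations with an ordinary variadic function (mleast is hypothetical here, adapted from the manual's variadic example):

CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric
    LANGUAGE sql AS 'SELECT min(arr[i]) FROM generate_subscripts(arr, 1) AS g(i)';

SELECT mleast(VARIADIC ARRAY[10, -1, 5.4]);   -- explicit VARIADIC call
SELECT mleast(10::numeric, -1, 5.4);          -- casting forces an exact match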

95% confidence

Extensions > pgAudit

27 questions
A

When enabled, statement text and parameters are logged only with the first log entry for a statement/substatement combination, not with every entry. This results in less verbose logging, but the statement/substatement pair with process ID is needed to identify the statement text from a previous entry.

Sources
95% confidence

Extensions > Extension Management

27 questions

Routine Maintenance > ANALYZE

26 questions

Indexes > Unique Indexes

26 questions
A

Index expressions are relatively expensive to maintain because the derived expression(s) must be computed for each row insertion and non-HOT update. However, the index expressions are not recomputed during an indexed search, since they are already stored in the index.

95% confidence
A

The predicate condition must exactly match part of the query's WHERE condition or the index will not be recognized as usable. PostgreSQL can recognize simple inequality implications (e.g., 'x < 1' implies 'x < 2'), but does not have a sophisticated theorem prover for mathematically equivalent expressions in different forms. Parameterized query clauses do not work with partial indexes.
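
For example (the unbilled-orders pattern from the manual; names illustrative):

CREATE INDEX orders_unbilled_idx ON orders (order_nr) WHERE billed IS NOT TRUE;

-- Can use the partial index: the WHERE clause contains the predicate
SELECT * FROM orders WHERE billed IS NOT TRUE AND order_nr < 10000;

-- Cannot be proven to satisfy the predicate, so the partial index is not used
SELECT * FROM orders WHERE order_nr = 3501;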

95% confidence
A

If problems occur during scanning, the command fails but leaves behind an 'invalid' index that still consumes update overhead and continues to enforce uniqueness constraints. The recommended recovery is to drop the index and retry CREATE INDEX CONCURRENTLY, or use REINDEX INDEX CONCURRENTLY.

95% confidence

Concurrency Control > Explicit Row Locking

26 questions

Server Configuration > Resource Consumption - Disk and Kernel

26 questions

Data Types - Numeric

26 questions
A

PostgreSQL accepts float(1) to float(24) as selecting the REAL type (4 bytes, single precision), while float(25) to float(53) select DOUBLE PRECISION (8 bytes). Values of p outside this range cause an error. float with no precision specified defaults to DOUBLE PRECISION.
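
A quick demonstration (table name illustrative):

CREATE TABLE float_demo (a float(24), b float(25), c float);

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'float_demo';
-- a -> real, b -> double precision, c -> double precision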

95% confidence

Data Types - Enumerated

26 questions
A

DROP TYPE ... CASCADE automatically drops objects that depend on the type (such as table columns, functions, and operators), and in turn all objects that depend on those objects.
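
A short illustration (enum and table are hypothetical):

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (name text, current_mood mood);

DROP TYPE mood CASCADE;   -- also drops the dependent person.current_mood column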

95% confidence
A

Comparisons involving an added enum value will sometimes be slower than comparisons involving only original members of the enum type. This will usually only occur if BEFORE or AFTER is used to set the new value's sort position somewhere other than at the end of the list. The slowdown is usually insignificant, but optimal performance can be regained by dropping and recreating the enum type.

95% confidence

Data Types - XML

26 questions
A

What is EXPLAIN in PostgreSQL?

EXPLAIN is a PostgreSQL command that displays the execution plan for a SQL statement without executing it. It shows how PostgreSQL's query planner intends to scan tables, join data, and process the query.

Basic Syntax

EXPLAIN SELECT * FROM users WHERE age > 25;

Key Output Information

  • Scan type: Sequential Scan, Index Scan, Bitmap Heap Scan, etc.
  • Cost estimates: cost=0.00..35.50 (startup cost..total cost) in arbitrary units
  • Rows estimate: Expected number of rows returned
  • Width: Average row size in bytes

Critical Variants

EXPLAIN ANALYZE - Actually executes the query and shows real timing:

EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25;

Returns actual execution time in milliseconds and actual row counts vs estimates.

EXPLAIN (BUFFERS, ANALYZE) - Adds I/O statistics:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE age > 25;

Shows shared blocks hit/read/written for performance tuning.

When to Use

  • Diagnosing slow queries (compare estimated vs actual rows with ANALYZE)
  • Verifying index usage (look for "Index Scan" vs "Seq Scan")
  • Optimizing JOIN operations (check join method: Hash Join, Nested Loop, Merge Join)

Version Note

Core functionality stable since PostgreSQL 9.0+. BUFFERS option added in 9.0. Format options (JSON, YAML, XML) available in 9.0+.

Source: PostgreSQL Official Documentation - EXPLAIN command reference (postgresql.org/docs/current/sql-explain.html)

95% confidence
A

The rows estimate in PostgreSQL's EXPLAIN output is the planner's predicted number of rows that will be returned by that particular plan node. This is an estimate made before query execution, not the actual row count.

Key Details

What it represents:

  • The number of rows the query planner expects this node to output
  • Calculated using table statistics from pg_stats (gathered by ANALYZE)
  • Used by the planner to choose between different execution strategies

How to read it:

EXPLAIN SELECT * FROM users WHERE age > 30;
Seq Scan on users  (cost=0.00..15.50 rows=120 width=40)
  Filter: (age > 30)

Here rows=120 means the planner estimates 120 rows will match the filter.

EXPLAIN vs EXPLAIN ANALYZE

  • EXPLAIN shows rows=<estimate> (predicted)
  • EXPLAIN ANALYZE shows both rows=<estimate> and actual rows=<actual> (measured during execution)

EXPLAIN ANALYZE SELECT * FROM users WHERE age > 30;
Seq Scan on users  (cost=0.00..15.50 rows=120 width=40) (actual time=0.023..0.156 rows=98 loops=1)

  • rows=120 = planner's estimate
  • actual rows=98 = true count after execution

Critical for Agents

When the estimate is significantly wrong (off by 10x+), it indicates stale statistics. The solution is to run:

ANALYZE table_name;

Source: PostgreSQL 17 official documentation - Using EXPLAIN

95% confidence
A

EXPLAIN BUFFERS shows buffer cache statistics for a PostgreSQL query execution, revealing how many 8KB blocks were accessed from various memory/disk sources.

What it shows (PostgreSQL 9.0+):

  • shared hit - blocks found in PostgreSQL's shared buffer cache (RAM)
  • shared read - blocks read from OS cache or disk into shared buffers
  • shared dirtied - blocks modified during query execution
  • shared written - blocks physically written to disk during execution
  • local hit/read/dirtied/written - same metrics for temporary tables (session-local buffers)
  • temp read/written - temporary file I/O when work_mem exceeded

Usage:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE id = 123;

Example output:

Index Scan using users_pkey on users  (cost=0.29..8.31 rows=1 width=40) (actual time=0.015..0.016 rows=1 loops=1)
  Index Cond: (id = 123)
  Buffers: shared hit=4
Planning:
  Buffers: shared hit=12
Planning Time: 0.123 ms
Execution Time: 0.032 ms

Key interpretation:

  • High "shared hit" = good (data in cache)
  • High "shared read" = disk I/O (slower, consider more RAM or indexing)
  • "temp read/written" = query spilling to disk (increase work_mem)

Requirements:

  • Must use EXPLAIN (BUFFERS) or EXPLAIN (ANALYZE, BUFFERS) syntax
  • Only works with ANALYZE option for actual buffer counts (without ANALYZE, no buffer stats shown)
  • Requires track_io_timing = on to also show I/O timing statistics

Source: PostgreSQL 17 Official Documentation - EXPLAIN command

95% confidence
A

A hash join is a join algorithm in PostgreSQL that builds an in-memory hash table from the smaller input table (called the "build" input), then probes it with rows from the larger table (called the "probe" input) to find matches.

How It Works

  1. Build phase: PostgreSQL scans the smaller table and creates a hash table in memory, hashing the join key values
  2. Probe phase: For each row in the larger table, PostgreSQL hashes its join key and looks up matching rows in the hash table

When PostgreSQL Uses Hash Joins

The planner chooses hash joins when:

  • Joining on equality conditions (e.g., table1.id = table2.id)
  • The estimated hash table fits in work_mem (default: 4MB)
  • The query has no ORDER BY that would make other join types preferable

Memory Constraints

  • work_mem limit: If the hash table exceeds work_mem, PostgreSQL uses temporary disk files (batch mode), degrading performance
  • Hash joins require memory = (size of smaller table's join columns + overhead) × hash table factor (~1.2-1.5×)

Example

EXPLAIN ANALYZE
SELECT orders.*, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id;

Output might show:

Hash Join  (cost=... rows=...)
  Hash Cond: (orders.customer_id = customers.id)
  -> Seq Scan on orders
  -> Hash
        -> Seq Scan on customers

Configuration

-- Disable hash joins to force alternative algorithms (for testing)
SET enable_hashjoin = off;

-- Increase work_mem to allow larger hash tables
SET work_mem = '256MB';

Source: PostgreSQL 17 official documentation - Query Planning

95% confidence
A

How Often to Run ANALYZE in PostgreSQL

Answer: You typically don't need to manually run ANALYZE because PostgreSQL's autovacuum daemon automatically runs it by default.

Automatic ANALYZE (Recommended)

The autovacuum daemon runs ANALYZE automatically when:

Number of inserted/updated/deleted tuples > autovacuum_analyze_threshold + (autovacuum_analyze_scale_factor × table_tuples)

Default settings (PostgreSQL 13+):

  • autovacuum_analyze_threshold = 50 rows
  • autovacuum_analyze_scale_factor = 0.1 (10% of table)

This means ANALYZE runs automatically after ~10% of a table changes, plus 50 rows minimum.

When to Run ANALYZE Manually

Run ANALYZE manually in these specific cases:

  1. After bulk data loads: Immediately after COPY or bulk INSERT

COPY my_table FROM '/data/file.csv';
ANALYZE my_table;

  2. After creating indexes on existing data:

CREATE INDEX idx_name ON my_table(column);
ANALYZE my_table;

  3. When autovacuum is disabled (not recommended): Run daily or after significant changes

  4. Before critical queries if statistics are stale:

ANALYZE VERBOSE my_table;

Verification

Check last analyze time:

SELECT schemaname, relname, last_analyze, last_autoanalyze 
FROM pg_stat_user_tables 
WHERE relname = 'my_table';

Source: PostgreSQL 16 Official Documentation - Routine Vacuuming (§25.1.6)

95% confidence
A

log_min_duration_statement is a PostgreSQL configuration parameter that logs the duration of SQL statements that take longer than the specified threshold.

Value Format:

  • Integer value in milliseconds
  • Default: -1 (disabled - no statements logged based on duration)
  • 0 = log all statements and their durations
  • Positive integer (e.g., 1000) = log only statements taking ≥ that many milliseconds

Configuration:

-- In postgresql.conf
log_min_duration_statement = 1000  # Log statements taking ≥ 1 second

-- Or set cluster-wide (requires superuser, or ALTER SYSTEM privilege granted in PostgreSQL 15+)
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

-- Session-level override (superuser or a role granted SET on this parameter)
SET log_min_duration_statement = 500;  -- for current session only

Log Output Example:

LOG:  duration: 1234.567 ms  statement: SELECT * FROM large_table WHERE condition = 'value';

Key Behaviors:

  • Measures total execution time including parsing, planning, and execution
  • Logs to PostgreSQL's standard log destination (file/stderr based on log_destination)
  • Does NOT log statements faster than the threshold
  • Setting to 0 creates significant log volume in production

Common Use Cases:

  • Performance monitoring: Set to 1000 (1 second) to catch slow queries
  • Query auditing: Set to 0 for complete statement logging (high I/O cost)
  • Troubleshooting: Temporarily lower threshold to diagnose performance issues

Version: Available in PostgreSQL 8.0+, behavior consistent through PostgreSQL 17.

Source: PostgreSQL Official Documentation - Error Reporting and Logging

95% confidence
A

Creating Extended Statistics in PostgreSQL

Extended statistics are created using the CREATE STATISTICS command (available in PostgreSQL 10+). They help the query planner make better estimates when columns have correlated data.

Basic Syntax

CREATE STATISTICS stat_name [(statistic_type [, ...])]
ON column_name1, column_name2 [, ...]
FROM table_name;

Statistic Types

PostgreSQL supports these types (10+):

  • dependencies - Multi-column functional dependencies (default in PG 10+)
  • ndistinct - N-distinct coefficients for column combinations (default in PG 10+)
  • mcv - Most-common-values lists for column combinations (PG 12+)

Examples

Basic creation (with no kind list, all supported statistics kinds are built):

CREATE STATISTICS orders_stats 
ON customer_id, order_date 
FROM orders;

Specify statistic types explicitly:

CREATE STATISTICS city_zip_stats (dependencies, ndistinct, mcv)
ON city, zip_code
FROM addresses;

Multiple column combinations (PG 10+):

CREATE STATISTICS product_stats
ON category, brand, price_range
FROM products;

Critical Details

  • Columns: Up to 8 columns per statistics object
  • Activation: Run ANALYZE table_name; after creation to populate statistics
  • Updates: Statistics are automatically updated during ANALYZE operations
  • View existing: Query pg_statistic_ext catalog table
  • Drop: DROP STATISTICS stat_name;

Complete Working Example

-- Create statistics
CREATE STATISTICS user_location_stats (dependencies, mcv)
ON country, state
FROM users;

-- Populate the statistics
ANALYZE users;

-- Verify creation
SELECT stxname, stxkeys, stxkind 
FROM pg_statistic_ext 
WHERE stxname = 'user_location_stats';

Source: PostgreSQL Official Documentation v10-17, CREATE STATISTICS command reference.

95% confidence
A

How to log slow queries in PostgreSQL

Set these parameters in postgresql.conf (or via ALTER SYSTEM):

# Log queries taking longer than 1000ms (1 second)
log_min_duration_statement = 1000

# Recommended additional settings for useful slow query logs
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_duration = off
log_statement = 'none'

Key Parameters:

  • log_min_duration_statement: Time in milliseconds. Queries taking longer than this value are logged.

    • -1 = disabled (default)
    • 0 = log all queries
    • 1000 = log queries ≥1 second (common starting point)
    • 5000 = log queries ≥5 seconds (production systems)
  • log_duration: Should be off when using log_min_duration_statement to avoid duplicate entries

Apply changes:

-- Option 1: Modify postgresql.conf, then reload
SELECT pg_reload_conf();

-- Option 2: Set for current session only
SET log_min_duration_statement = 1000;

-- Option 3: Set globally without editing config file
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

Log output location:
Check log_destination and log_directory in postgresql.conf. Default is typically /var/log/postgresql/postgresql-{version}-main.log on Debian/Ubuntu or $PGDATA/log/ on other systems.

Verified on: PostgreSQL 12-17 (parameter behavior unchanged since 8.0)

Source: PostgreSQL Official Documentation - Error Reporting and Logging

95% confidence
A

In PostgreSQL's EXPLAIN ANALYZE, actual time is the real wall-clock time (in milliseconds) spent executing a specific plan node, measured from when the node starts producing rows until it finishes.

The format is: actual time=X..Y

  • X (startup time): Milliseconds from when the node starts until it produces its first row
  • Y (total time): Total milliseconds from start to completion (including all rows)

Key characteristics:

  1. Includes child nodes: The total time for a parent node includes all time spent in its children
  2. Excludes result transmission: Time to send results to the client is NOT included
  3. Per-loop measurement: For nodes inside loops (e.g., nested loop joins), the times shown are averages. Check loops=N to see how many times the node executed. Multiply actual time by loops to get total time spent.

Example:

EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;
Index Scan using users_pkey on users  (cost=0.29..8.31 rows=1 width=100) 
                                       (actual time=0.015..0.016 rows=1 loops=1)
  • Startup: 0.015ms (time to find first row)
  • Total: 0.016ms (time to complete scan)
  • Executed 1 time (loops=1)

Source: PostgreSQL official documentation on EXPLAIN (applicable to all modern versions 9.x-17.x)

95% confidence
A

Force PostgreSQL to Use a Specific Index

Use the SET enable_* session parameters to disable alternative access methods, forcing PostgreSQL to prefer your target index.

Primary Method (PostgreSQL 9.0+):

-- Disable sequential scans to force index usage
SET enable_seqscan = off;

-- Execute your query
SELECT * FROM users WHERE email = '[email protected]';

-- Re-enable for subsequent queries
SET enable_seqscan = on;

For Specific Index Types:

-- Disable bitmap scans (forces regular index scan)
SET enable_bitmapscan = off;

-- Disable index scans (forces bitmap or seq scan)
SET enable_indexscan = off;

-- Disable index-only scans
SET enable_indexonlyscan = off;

To Force a Specific Index When Multiple Exist:

Disable other index types and drop competing indexes temporarily, or use pg_hint_plan extension:

-- With pg_hint_plan extension
/*+ IndexScan(users users_email_idx) */
SELECT * FROM users WHERE email = '[email protected]';

Production-Safe Approach:

Use transaction-scoped settings to avoid affecting other queries:

BEGIN;
SET LOCAL enable_seqscan = off;
SELECT * FROM users WHERE email = '[email protected]';
COMMIT;

Critical Notes:

  • These settings affect the query planner cost estimates (sets cost to 10^10), not hard disables
  • Setting applies to entire session unless using SET LOCAL (transaction-scoped)
  • Forcing indexes is typically a code smell; investigate with EXPLAIN ANALYZE why the planner isn't choosing your index
  • Ensure statistics are current: ANALYZE table_name;

Source: PostgreSQL 16 Official Documentation - Runtime Config: Query Planning

95% confidence
A

PostgreSQL uses a bitmap index scan when the query planner's cost-based optimizer determines it's the most efficient method based on selectivity and estimated costs. Specifically:

When bitmap scans are chosen:

  1. Medium selectivity queries - Bitmap scans serve as a middle ground between index scans (high selectivity, few rows) and sequential scans (low selectivity, many rows)

  2. Combining multiple indexes - When a query has:

    • OR conditions on the same column: WHERE x = 42 OR x = 47
    • AND/OR conditions across different indexed columns: WHERE x = 5 AND y = 6
    • Complex Boolean combinations that can leverage multiple indexes
  3. When fetching multiple rows from the same data pages - Bitmap scans visit table rows in physical order, making bulk reads more efficient than random index scans

Cost-based decision factors:

The planner compares costs using:

  • random_page_cost × pages + cpu_index_tuple_cost × tuples + cpu_operator_cost × tuples
  • Selectivity estimates from table statistics (pg_class.reltuples × selectivity)
  • work_mem settings (affects whether bitmap becomes "lossy")

The planner chooses bitmap scan when its estimated cost is lower than both a plain index scan and a sequential scan for the given query conditions.
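For example (a sketch assuming a hypothetical table events with separate B-tree indexes events_x_idx and events_y_idx), an OR across two indexed columns typically produces a BitmapOr plan:

-- Hypothetical table and index names
EXPLAIN SELECT * FROM events WHERE x = 5 OR y = 6;

-- Abbreviated plan shape:
-- Bitmap Heap Scan on events
--   Recheck Cond: ((x = 5) OR (y = 6))
--   ->  BitmapOr
--         ->  Bitmap Index Scan on events_x_idx
--               Index Cond: (x = 5)
--         ->  Bitmap Index Scan on events_y_idx
--               Index Cond: (y = 6)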

Trade-offs:

  • Bitmap scans lose original index ordering (requires separate sort for ORDER BY)
  • Each additional index adds overhead
  • Physical order access improves I/O efficiency over random index lookups


95% confidence
A

Merge Join in PostgreSQL

A merge join is a join algorithm in PostgreSQL that combines two sorted datasets by scanning them in parallel and matching rows based on the join condition. It requires both input relations to be sorted on the join key.

How It Works

  1. PostgreSQL sorts both input tables on the join columns (if not already sorted)
  2. Scans both sorted datasets simultaneously, advancing through each when keys don't match
  3. When keys match, outputs the joined rows
  4. Time complexity: O(N log N + M log M) for sorting, O(N + M) for the merge phase

When PostgreSQL Uses Merge Join

The planner chooses merge join when:

  • Join condition uses equality operators (=)
  • Input datasets are already sorted (from indexes or previous operations), OR sorting cost is justified
  • Working with large datasets where nested loop would be too slow and hash join exceeds work_mem

Performance Characteristics

  • Best for: Large tables that are pre-sorted or have indexes on join columns
  • Memory usage: Minimal - only needs to hold current rows from each input
  • Handles duplicates: Efficiently processes duplicate join keys using "mark and restore"

Example

-- Query that may use merge join
EXPLAIN SELECT * 
FROM orders o 
JOIN customers c ON o.customer_id = c.id 
WHERE o.order_date > '2024-01-01';

-- Example output showing merge join:
-- Merge Join  (cost=...)
--   Merge Cond: (o.customer_id = c.id)
--   ->  Index Scan using orders_customer_id_idx on orders o
--   ->  Index Scan using customers_pkey on customers c

Configuration

  • enable_mergejoin (default: on) - Set to off to disable merge joins globally (debugging only)
  • Sorting uses work_mem per sort operation

Source: PostgreSQL 17 Official Documentation - Query Planning: Merge Join

95% confidence
A

PostgreSQL Correlation Statistic

Correlation in PostgreSQL statistics measures the alignment between the physical order of rows on disk and the logical order of a column's values. It ranges from -1.0 to +1.0.

Exact Values

  • +1.0: Column values are stored in perfect ascending order on disk
  • -1.0: Column values are stored in perfect descending order on disk
  • 0.0: No correlation between physical and logical order (random distribution)

Why It Matters

The query planner uses correlation to estimate the cost of index scans. High absolute correlation (near ±1.0) means sequential reads, which are much faster. Low correlation (near 0) means random disk access, which is slower.

Where to Find It

SELECT tablename, attname, correlation 
FROM pg_stats 
WHERE tablename = 'your_table' 
  AND attname = 'your_column';

The correlation column in pg_stats shows this value for every column that ANALYZE has processed, not only indexed columns.

When It's Updated

Correlation is calculated by ANALYZE. PostgreSQL samples the table (default: 300 × default_statistics_target rows, where default target is 100) and compares physical tuple positions to sorted logical values.

Performance Impact

  • High correlation (|correlation| > 0.8): Index scans are cheap; optimizer favors index usage
  • Low correlation (|correlation| < 0.2): Index scans require many random I/Os; optimizer may prefer sequential scans

Source

PostgreSQL official documentation on statistics (pg_stats view), valid for PostgreSQL 12+. The calculation method is defined in src/backend/commands/analyze.c.

95% confidence
A

PostgreSQL work_mem Setting

work_mem is a PostgreSQL configuration parameter that sets the maximum memory used for an internal sort operation or hash table before writing to temporary disk files. Each such operation in a plan (a sort, a hash join's hash table, a hash aggregate) can use up to this amount; since PostgreSQL 13, hash-based operations may use up to work_mem × hash_mem_multiplier.

Key Details

Default Value: 4MB (4096 kB)

Scope: Can be set per-session, per-transaction, per-user, or server-wide

Unit: Specified in kilobytes by default, but accepts kB, MB, GB

Critical Behavior:

  • Multiple operations in a single query can EACH allocate work_mem
  • A complex query with 5 sorts could use 5 × work_mem
  • Memory is allocated per operation, not per query

Configuration

-- Session-level (recommended for specific queries)
SET work_mem = '256MB';

-- Transaction-level
BEGIN;
SET LOCAL work_mem = '512MB';
-- query here
COMMIT;
# postgresql.conf (server-wide)
work_mem = 64MB

Setting Guidelines

Formula: (Total RAM - shared_buffers) / (max_connections × 2 to 3)

For a server with 16GB RAM, shared_buffers=4GB, max_connections=100:

  • Available: 12GB
  • work_mem: ~40-60MB is safe

Warning: Setting too high risks OOM kills. Monitor with pg_stat_statements and system memory.

Version: Applies to PostgreSQL 9.0+ (behavior unchanged in 17.x as of 2025)

Source: PostgreSQL 17 Documentation - Resource Consumption

95% confidence
A

How work_mem Affects Query Performance

work_mem controls the maximum memory PostgreSQL uses for each internal sort or hash operation within a query before spilling to disk.

Direct Performance Impact

Insufficient work_mem:

  • Operations exceed memory limit → data written to temporary disk files
  • Disk I/O is 100-1000x slower than memory operations
  • Query execution time increases dramatically (often 10-100x slower)

Sufficient work_mem:

  • Operations complete entirely in memory
  • Fast execution using in-memory sorting/hashing algorithms
  • No temporary file creation overhead

Exact Default and Typical Values

  • Default: 4MB (4096 KB)
  • Common production settings: 16MB - 256MB
  • Maximum practical: Limited by available RAM and connection count

Critical Calculation

Total potential memory = work_mem × max_connections × operations_per_query

A single complex query can use work_mem multiple times simultaneously (one per sort/hash node in the execution plan).

Example: If work_mem = 256MB, max_connections = 100, and queries average 4 operations:

Worst case = 256MB × 100 × 4 = 100GB RAM required

Operations Affected by work_mem

  1. ORDER BY sorts
  2. DISTINCT operations
  3. Merge joins
  4. Hash joins (builds hash table)
  5. Hash-based aggregations (GROUP BY)
  6. Recursive CTEs
  7. Bitmap index scans (combining bitmaps)

Verification

Check if queries are spilling to disk:

-- Shows temporary file usage
EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM large_table ORDER BY column;

Look for lines such as "Sort Method: external merge  Disk: ... kB" or, with BUFFERS, "temp read=N written=N" - both indicate that the operation spilled to temporary files.

Setting work_mem

Session-level (recommended for specific heavy queries):

SET work_mem = '256MB';
-- Run your query
RESET work_mem;

Global (postgresql.conf):

work_mem = 64MB

Source

PostgreSQL Official Documentation v16: Resource Consumption

Version: Behavior consistent across PostgreSQL 9.6-16+

95% confidence
A

seq_page_cost is a PostgreSQL planner configuration parameter that sets the estimated cost of reading one disk page (8 KB) during a sequential scan.

Default Value: 1.0

Purpose: The query planner uses this cost unit as a baseline to compare different query execution plans. All other I/O cost parameters are relative to this value:

  • random_page_cost (default: 4.0) - cost of non-sequential disk page fetch
  • cpu_tuple_cost (default: 0.01) - cost of processing one row
  • cpu_operator_cost (default: 0.0025) - cost of processing one operator

Configuration:

-- Set globally (requires superuser + reload)
ALTER SYSTEM SET seq_page_cost = 1.0;
SELECT pg_reload_conf();

-- Set for current session
SET seq_page_cost = 1.0;

-- Set for a specific tablespace (seq_page_cost cannot be set per table)
ALTER TABLESPACE my_tablespace SET (seq_page_cost = 0.5);

Tuning Guidance:

  • SSDs/NVMe: Lower to 0.5-0.75 (sequential and random reads are similar speed)
  • HDDs with large cache: Keep at 1.0 (default)
  • Slow storage: May increase slightly, but adjust random_page_cost ratio instead

The ratio between seq_page_cost and random_page_cost matters more than absolute values. For SSDs, use ratio of 1:1 to 1:2 (e.g., seq=1.0, random=1.5).

Source: PostgreSQL 17 Documentation - Runtime Config: Query Planning

95% confidence
A

Bitmap Index Scan in PostgreSQL

A Bitmap Index Scan is a two-phase query execution strategy in PostgreSQL that efficiently handles queries matching multiple rows from an index.

How It Works

Phase 1 - Bitmap Index Scan: Scans the index and builds an in-memory bitmap of matching row locations (exact tuple positions by default; it drops to tracking whole heap pages only when the bitmap would exceed work_mem).

Phase 2 - Bitmap Heap Scan: Reads the actual table pages marked in the bitmap, fetching all matching rows from each page in physical order.

When PostgreSQL Uses It

The query planner chooses bitmap scans when:

  • Query matches too many rows for an efficient regular Index Scan (typically >5-10% of table)
  • Too few rows for a full Sequential Scan to be optimal
  • Multiple indexes can be combined with AND/OR operations

Key Advantages

  1. Page-level deduplication: Multiple index entries pointing to the same page = single page read
  2. Sequential I/O: Reads heap pages in physical order (vs random I/O in Index Scan)
  3. Index combining: Can merge bitmaps from multiple indexes using BitmapAnd or BitmapOr

Example

-- Query that triggers bitmap scan
EXPLAIN ANALYZE
SELECT * FROM users WHERE age BETWEEN 25 AND 35;
Bitmap Heap Scan on users  (cost=12.75..89.23 rows=450 width=40)
  Recheck Cond: ((age >= 25) AND (age <= 35))
  ->  Bitmap Index Scan on users_age_idx  (cost=0.00..12.64 rows=450 width=0)
        Index Cond: ((age >= 25) AND (age <= 35))

Memory Limit

Bitmap size is limited by work_mem (default: 4MB). If the bitmap exceeds work_mem, it becomes "lossy" - tracking only pages, requiring a Recheck Cond during heap scan to filter rows.

Source: PostgreSQL 17 Documentation - Using EXPLAIN

95% confidence
A

effective_cache_size is a PostgreSQL configuration parameter that sets the planner's assumption about the effective size of the disk cache available to a single query. It influences the query optimizer's cost estimates when deciding between index scans and sequential scans.

Key Characteristics:

  • Does not allocate memory - it's only a hint to the query planner
  • Higher values make index scans more likely to be chosen
  • Lower values make sequential scans more likely
  • Default: 4GB (524288 blocks)
  • Unit: Blocks (typically 8kB each) when specified without units
  • Can be changed without restarting PostgreSQL

What to Include:

When setting this value, consider both:

  1. PostgreSQL's shared_buffers
  2. The portion of the kernel's disk cache used for PostgreSQL data files

Typical Settings:

  • Conservative: 50% of total system memory
  • Aggressive: 75% of total system memory

The planner uses this to estimate whether index blocks are likely to be in cache or need physical disk reads, affecting cost calculations for query plans.
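As a minimal sketch (assuming a dedicated 16 GB server; the 12GB value is an illustrative ~75% of RAM, not a measured recommendation):

-- Inspect the current value
SHOW effective_cache_size;

-- Raise it for a dedicated 16 GB server; only a planner hint, so no restart is needed
ALTER SYSTEM SET effective_cache_size = '12GB';
SELECT pg_reload_conf();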


95% confidence
A

PostgreSQL provides three primary methods to identify slow queries:

Method 1: pg_stat_statements Extension (Recommended)

Enable the pg_stat_statements extension to track execution statistics for all SQL statements.

Setup:

  1. Add to postgresql.conf:
shared_preload_libraries = 'pg_stat_statements'
compute_query_id = on
  2. Restart PostgreSQL
  3. Enable in your database:
CREATE EXTENSION pg_stat_statements;

Query for slowest queries by total time:

SELECT query, calls, total_exec_time, mean_exec_time, rows 
FROM pg_stat_statements 
ORDER BY total_exec_time DESC 
LIMIT 10;

Query for slowest queries by average time:

SELECT query, calls, mean_exec_time, total_exec_time, rows 
FROM pg_stat_statements 
WHERE calls > 100  -- Exclude rarely-run queries
ORDER BY mean_exec_time DESC 
LIMIT 10;

Key metrics:

  • total_exec_time - Total time spent executing this query
  • mean_exec_time - Average execution time per call
  • calls - Number of times executed
  • rows - Total rows returned/affected

Method 2: log_min_duration_statement

Log queries that exceed a specific duration threshold.

Configuration in postgresql.conf:

log_min_duration_statement = 250ms
  • -1 (default) - Disabled
  • 0 - Log all statement durations
  • 250ms - Log statements running 250ms or longer

This can be changed at runtime without restart by superusers.

Method 3: auto_explain Extension

Automatically log execution plans for slow queries.

Configuration:

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 250ms

This captures EXPLAIN output for queries exceeding the threshold.


95% confidence
A

EXPLAIN ANALYZE in PostgreSQL

EXPLAIN ANALYZE is a PostgreSQL command that executes a query and returns detailed performance statistics about how the query planner executed it, including actual execution times and row counts.

Usage

EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25;

Key Differences from EXPLAIN

  • EXPLAIN shows the query plan only (no execution)
  • EXPLAIN ANALYZE actually runs the query and shows both the plan and actual runtime statistics

Critical Output Metrics

The output includes:

  1. Planning Time - Time spent generating the query plan (milliseconds)
  2. Execution Time - Actual time to run the query (milliseconds)
  3. Actual rows - Number of rows processed at each node
  4. Actual loops - Number of times each node executed
  5. Buffers - I/O statistics (when used with BUFFERS option)

Important Options

-- Include buffer usage statistics (shared blocks read/hit)
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users;

-- Show output in JSON format
EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM users;

-- Other formats: TEXT (default), XML, YAML
EXPLAIN (ANALYZE, FORMAT YAML) SELECT * FROM users;

Critical Warning

⚠️ EXPLAIN ANALYZE executes the query completely, including:

  • INSERT, UPDATE, DELETE operations will modify data
  • Use within a transaction with ROLLBACK to prevent changes:
BEGIN;
EXPLAIN ANALYZE DELETE FROM users WHERE id = 1;
ROLLBACK;

Version Compatibility

Available in PostgreSQL 7.2+. The BUFFERS option was added in PostgreSQL 9.0. Latest stable: PostgreSQL 17.x (as of January 2025).

Source: PostgreSQL Official Documentation - EXPLAIN

95% confidence
A

EXPLAIN vs EXPLAIN ANALYZE in PostgreSQL

EXPLAIN shows the query execution plan WITHOUT running the query. It displays PostgreSQL's estimated costs, row counts, and query strategy based on table statistics.

EXPLAIN ANALYZE actually EXECUTES the query and shows both the plan AND actual runtime statistics (real execution time, actual row counts, memory usage).

Key Differences

Aspect            EXPLAIN           EXPLAIN ANALYZE
Query execution   No (plan only)    Yes (full execution)
Timing data       Estimated only    Actual measured times
Row counts        Estimated         Actual
Side effects      None              ALL side effects occur

Code Examples

-- EXPLAIN: Safe, shows plan only, no execution
EXPLAIN SELECT * FROM users WHERE created_at > '2024-01-01';

Output shows estimated costs:

Seq Scan on users  (cost=0.00..35.50 rows=10 width=100)
  Filter: (created_at > '2024-01-01'::date)
-- EXPLAIN ANALYZE: Executes query, shows actual performance
EXPLAIN ANALYZE SELECT * FROM users WHERE created_at > '2024-01-01';

Output includes actual times:

Seq Scan on users  (cost=0.00..35.50 rows=10 width=100) 
                   (actual time=0.015..0.892 rows=12 loops=1)
  Filter: (created_at > '2024-01-01'::date)
Planning Time: 0.123 ms
Execution Time: 1.045 ms

Critical Warning

EXPLAIN ANALYZE executes INSERT/UPDATE/DELETE queries, causing data modifications. Wrap in a transaction with ROLLBACK for safety:

BEGIN;
EXPLAIN ANALYZE DELETE FROM users WHERE id = 123;
ROLLBACK;  -- Prevents actual deletion

When to Use Each

  • EXPLAIN: Quick plan inspection, checking if indexes are used, safe for all queries
  • EXPLAIN ANALYZE: Troubleshooting slow queries, comparing estimated vs actual performance, validating optimizer decisions

Source: PostgreSQL Official Documentation (applies to PostgreSQL 9.0+, current as of PostgreSQL 17)

95% confidence


performance_tuning

24 questions
A

PostgreSQL Autovacuum Configuration

Configure autovacuum by modifying parameters in postgresql.conf or using ALTER TABLE for table-specific settings.

Essential Global Parameters (postgresql.conf)

# Enable autovacuum (default: on)
autovacuum = on

# Maximum autovacuum worker processes (default: 3, range: 1-262143)
autovacuum_max_workers = 3

# Delay between autovacuum runs in seconds (default: 60s, range: 1s-2147483s)
autovacuum_naptime = 60

# Threshold to trigger vacuum (default: 50 rows)
autovacuum_vacuum_threshold = 50

# Fraction of table size to trigger vacuum (default: 0.2 = 20%)
autovacuum_vacuum_scale_factor = 0.2

# Threshold to trigger analyze (default: 50 rows)
autovacuum_analyze_threshold = 50

# Fraction of table size to trigger analyze (default: 0.1 = 10%)
autovacuum_analyze_scale_factor = 0.1

# Cost-based delay in milliseconds (default: 2ms, range: -1 to 100)
autovacuum_vacuum_cost_delay = 2

# Cost limit before applying delay (default: 200, range: -1 to 10000)
autovacuum_vacuum_cost_limit = 200

Trigger formula: Autovacuum runs when:

  • dead_tuples > autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * table_size)

Table-Specific Overrides

-- Set custom autovacuum parameters for a specific table
ALTER TABLE my_table SET (
  autovacuum_vacuum_threshold = 100,
  autovacuum_vacuum_scale_factor = 0.05,
  autovacuum_analyze_threshold = 100,
  autovacuum_analyze_scale_factor = 0.05
);

-- Disable autovacuum for a specific table (not recommended)
ALTER TABLE my_table SET (autovacuum_enabled = false);

-- View table-specific settings
SELECT relname, reloptions 
FROM pg_class 
WHERE relname = 'my_table';

Apply Configuration Changes

-- Reload configuration (most autovacuum settings take effect on reload;
-- autovacuum_max_workers requires a server restart)
SELECT pg_reload_conf();

Verification

-- Check if autovacuum is running
SELECT * FROM pg_stat_activity WHERE query LIKE 'autovacuum:%';

-- View autovacuum statistics per table
SELECT schemaname, relname, last_autovacuum, last_autoanalyze, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

Source: PostgreSQL 16 Official Documentation - Automatic Vacuuming

Note: For high-write tables, reduce scale_factor to 0.01-0.05 and increase vacuum_cost_limit to 1000-2000 for more aggressive vacuuming.

95% confidence
A

VACUUM vs VACUUM FULL in PostgreSQL

VACUUM reclaims dead tuple space and marks it as reusable within the existing data files, but does not return disk space to the operating system. It updates statistics and prevents transaction ID wraparound. Tables remain online and accessible during operation.

VACUUM FULL rewrites the entire table to a new file, compacting it to minimum size and returning freed disk space to the OS. It requires an ACCESS EXCLUSIVE lock, blocking all reads and writes. Much slower and more resource-intensive.

Key Differences

  • Lock level: VACUUM takes SHARE UPDATE EXCLUSIVE (allows SELECT/INSERT/UPDATE/DELETE); VACUUM FULL takes ACCESS EXCLUSIVE (blocks everything)
  • Disk space: VACUUM marks space reusable but doesn't shrink files; VACUUM FULL returns space to the OS and shrinks files
  • Speed: VACUUM is fast and incremental; VACUUM FULL is slow and rewrites the entire table
  • Disk requirement: VACUUM needs minimal extra space; VACUUM FULL needs free space roughly equal to the table size
  • Use case: VACUUM for regular maintenance; VACUUM FULL for bloat recovery after massive deletes

Usage Examples

-- Regular vacuum (use this for routine maintenance)
VACUUM table_name;

-- Full vacuum (use sparingly, during maintenance windows)
VACUUM FULL table_name;

-- Vacuum with analysis update
VACUUM ANALYZE table_name;

-- Verbose output
VACUUM (VERBOSE) table_name;

Best Practice Decision Criteria

  • Use VACUUM: For routine maintenance, handled automatically by autovacuum (enabled by default since PostgreSQL 8.3)
  • Use VACUUM FULL: Only when table bloat exceeds 30-40% and you have a maintenance window. Consider pg_repack extension instead for zero-downtime alternative.

Source: PostgreSQL 16 Official Documentation - Routine Vacuuming (https://www.postgresql.org/docs/current/routine-vacuuming.html)

95% confidence
A

shared_buffers in PostgreSQL

shared_buffers is a PostgreSQL configuration parameter that sets the amount of memory allocated for caching data pages in RAM, shared by all database processes.

Default Value and Recommendations

Default: 128 MB (as of PostgreSQL 9.3+)

Recommended setting: 25% of total system RAM for dedicated database servers. For systems with 1GB+ RAM, start with:

  • Minimum: 256 MB
  • Typical: 25% of system RAM
  • Maximum practical: 8-16 GB (diminishing returns beyond this)

Configuration

Set in postgresql.conf:

shared_buffers = 4GB

Or using different units:

shared_buffers = 4096MB
shared_buffers = 524288  # in 8KB pages (default unit)

Requires: Database restart to take effect (not reloadable with pg_reload_conf())

How It Works

  • Stores frequently accessed table and index data in memory
  • Uses PostgreSQL's internal cache, separate from OS page cache
  • Measured in 8KB blocks (PostgreSQL's page size)
  • Too low: Excessive disk I/O
  • Too high: Less memory for OS cache and working memory, potential performance degradation

Check Current Value

SHOW shared_buffers;
-- Returns: 4GB

SELECT setting, unit FROM pg_settings WHERE name = 'shared_buffers';
-- Returns the setting in 8kB pages (524288 for 4GB) and the unit (8kB)

Source: PostgreSQL 16 Official Documentation - Resource Consumption settings

95% confidence
A

Reducing PostgreSQL Table Bloat Without Locking

Use pg_repack extension to rebuild tables without blocking concurrent operations.

Installation

CREATE EXTENSION pg_repack;

Requires: PostgreSQL 9.4+ (current versions: 12-17 supported as of pg_repack 1.5.1)

Usage

# Repack a specific table (no exclusive locks during rebuild)
pg_repack -t table_name database_name

# Repack entire database
pg_repack database_name

# With connection parameters
pg_repack -h localhost -p 5432 -U postgres -t table_name database_name

How It Works

  1. Creates temporary copy of table with proper indexes
  2. Uses triggers to track concurrent changes during rebuild
  3. Applies accumulated changes from triggers
  4. Swaps tables atomically (brief ACCESS EXCLUSIVE lock <0.1 seconds)
  5. Drops old bloated table

Lock Behavior

  • During rebuild: Only ACCESS SHARE locks (allows reads/writes)
  • Final swap: ACCESS EXCLUSIVE for milliseconds only
  • Safe for production use on active tables

Requirements

  • Target table must have a PRIMARY KEY or UNIQUE NOT NULL index
  • Requires free disk space = 2× table size temporarily
  • User needs SUPERUSER or table ownership + CREATE privilege on database

Alternative: CONCURRENTLY Operations

For minor bloat, use:

-- Rebuild indexes without blocking writes
REINDEX INDEX CONCURRENTLY index_name;

-- PostgreSQL 12+: Reindex table indexes
REINDEX TABLE CONCURRENTLY table_name;

Limitation: These only fix index bloat, not table bloat.

Source

Official pg_repack documentation: https://reorg.github.io/pg_repack/
PostgreSQL REINDEX docs: https://www.postgresql.org/docs/current/sql-reindex.html

95% confidence
A

Detecting Table Bloat in PostgreSQL

Use the pgstattuple extension to get exact bloat metrics, or calculate estimated bloat using system catalogs.

Method 1: pgstattuple Extension (Most Accurate)

Requires PostgreSQL 8.2+

-- Enable the extension (one time, requires superuser)
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Check specific table bloat
SELECT
  pg_size_pretty(table_len)               AS table_size,
  round(dead_tuple_percent::numeric, 2)   AS dead_tuple_pct,
  pg_size_pretty(dead_tuple_len)          AS dead_tuple_size,
  round(free_percent::numeric, 2)         AS free_space_pct
FROM pgstattuple('public.your_table_name');

Key metrics from pgstattuple:

  • dead_tuple_len: Bytes occupied by dead tuples
  • free_space: Bytes of free space
  • Bloat % = (dead_tuple_len + free_space) / table_len × 100

Warning: pgstattuple() acquires ACCESS SHARE lock and scans entire table - avoid on large production tables during peak hours.

Method 2: Statistical Estimation (Faster, No Locks)

-- Estimate bloat without full table scan
SELECT
  current_database() AS db,
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 
    ELSE (sml.relpages-otta::numeric)/sml.relpages END, 3) AS tbloat_ratio,
  pg_size_pretty((CASE WHEN relpages < otta THEN 0 
    ELSE relpages::bigint - otta END) * 8192) AS wastedbytes
FROM (
  SELECT
    schemaname, tablename, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname
) AS sml
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
  AND sml.relpages > 128  -- Only tables > 1MB
ORDER BY wastedbytes DESC;

Actionable thresholds:

  • tbloat_ratio < 0.2 (20%): Normal, no action needed
  • tbloat_ratio 0.2-0.4: Consider VACUUM FULL or REINDEX during maintenance window
  • tbloat_ratio > 0.4: High bloat - schedule maintenance immediately

Method 3: Quick Check via Statistics

-- Fast check using n_dead_tup from pg_stat_user_tables
SELECT
  schemaname,
  relname AS tablename,
  n_live_tup,
  n_dead_tup,
  ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
  last_vacuum,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

Dead tuple threshold: If dead_pct > 20%, manually run VACUUM or tune autovacuum settings.

Source

PostgreSQL Official Documentation: pgstattuple (v16) - https://www.postgresql.org/docs/current/pgstattuple.html

95% confidence
A

wal_buffers is a PostgreSQL configuration parameter that sets the amount of shared memory used for buffering Write-Ahead Log (WAL) data before it's written to disk.

Configuration

Default value: -1 (auto-tuned to 1/32 of shared_buffers, minimum 64 kB, maximum 16 MB)

Valid range: 32 kB to 16 GB (or -1 for automatic)

Units: 8 kB blocks (PostgreSQL page size)

Restart required: Yes (requires PostgreSQL restart to change)

Setting the Value

In postgresql.conf:

# Automatic (recommended for most cases)
wal_buffers = -1

# Manual setting (in units of 8kB)
wal_buffers = 2048  # 16 MB

# Or with unit suffix
wal_buffers = 16MB

How It Works

WAL data is first written to these buffers in shared memory, then flushed to disk at commit time or when buffers fill. Larger buffers reduce I/O for write-heavy workloads by batching more WAL data per flush.
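The value in effect can be inspected directly (when -1 is configured, the server reports the auto-tuned size):

SHOW wal_buffers;

-- In pg_settings the value is reported in 8kB pages
SELECT name, setting, unit FROM pg_settings WHERE name = 'wal_buffers';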

Tuning Guidance

  • Default (-1): Sufficient for most workloads
  • High write throughput: Set to 16-32 MB manually
  • Values above 32 MB: Rarely beneficial; diminishing returns
  • Never set below: 1 MB for production systems

Version Notes

Applies to PostgreSQL 9.1+ (auto-tuning introduced in 9.1). Prior versions required manual setting with default of 64 kB.

Source: PostgreSQL 16 Official Documentation - Server Configuration: Write Ahead Log

95% confidence
A

When to Increase PostgreSQL Statistics Target

Increase default_statistics_target (default: 100) or per-column statistics target when:

Primary Indicators

  1. Suboptimal Query Plans on Large Tables

    • Planner chooses seq scans over index scans despite available indexes
    • Join order is inefficient for tables with skewed data distributions
    • Query performance degrades as table size grows beyond ~1M rows
  2. High Cardinality Columns with Skewed Distribution

    • Columns with >10,000 distinct values (DISTINCT count >> 100)
    • Non-uniform data (e.g., 80% of rows share 5% of values)
    • Columns used in WHERE clauses, JOINs, or GROUP BY
  3. EXPLAIN ANALYZE Shows Row Estimate Errors

    • Estimated rows differ from actual rows by >10x
    • Check with: EXPLAIN (ANALYZE, BUFFERS) SELECT ...

Implementation

Per-column (recommended for targeted fixes):

ALTER TABLE users ALTER COLUMN email SET STATISTICS 500;
ANALYZE users;

Global setting (PostgreSQL 10+):

-- In postgresql.conf or via ALTER SYSTEM
ALTER SYSTEM SET default_statistics_target = 200;
-- Requires reload
SELECT pg_reload_conf();
-- Then re-analyze affected tables
ANALYZE;

Specific Values

  • Default: 100 (stores 100 most common values + 100 histogram buckets)
  • Recommended increase: 200-500 for problematic columns
  • Maximum: 10,000 (rarely needed; increases ANALYZE time and pg_statistic size)
  • Cost: Each +100 adds ~10-50ms to ANALYZE per column (varies by data size)

Trade-offs

  • ✅ More accurate selectivity estimates for complex predicates
  • ❌ Longer ANALYZE times (linear with target)
  • ❌ Larger pg_statistic table (~4KB per +100 target per column)

Source: PostgreSQL 16 official documentation (Chapter 14.2 - Statistics Used by the Planner)

95% confidence
A

Parallel query is PostgreSQL's capability to devise query plans that leverage multiple CPUs to answer queries faster. When the optimizer determines that parallel execution is the fastest strategy, it creates a query plan that includes a Gather or Gather Merge node, which launches multiple background worker processes to execute portions of the plan concurrently.

How It Works

The parallel portion of the plan is a "partial plan" where each worker process generates only a subset of the output rows. The Gather or Gather Merge node collects results from all workers and returns the complete result set to the user. The planner considers parallel query when max_parallel_workers_per_gather is set to a value greater than zero.
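For example (a sketch assuming a hypothetical large table big_table), allowing a few workers and inspecting the plan shows the Gather node described above:

-- Allow up to 4 workers per Gather node for this session
SET max_parallel_workers_per_gather = 4;

EXPLAIN SELECT count(*) FROM big_table WHERE category = 'x';

-- Abbreviated output when a parallel plan is chosen:
-- Finalize Aggregate
--   ->  Gather
--         Workers Planned: 4
--         ->  Partial Aggregate
--               ->  Parallel Seq Scan on big_table
--                     Filter: (category = 'x'::text)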

Performance

Queries that touch large amounts of data but return few rows typically benefit most. Many queries run more than twice as fast with parallel query, and some run four times faster or more. However, not all queries benefit—some have no query plan faster than serial execution due to current implementation limitations or query characteristics.

Supported Operations

PostgreSQL supports parallel execution for:

  • Parallel sequential scans
  • Parallel index scans
  • Parallel joins (nested loop, hash, merge)
  • Parallel aggregation
  • Parallel append operations


95% confidence
A

What is effective_cache_size in PostgreSQL?

effective_cache_size is a PostgreSQL configuration parameter that tells the query planner how much memory is available for caching data, across both PostgreSQL's shared buffers and the operating system's page cache.

Key Details

Type: Planner hint parameter (does NOT allocate memory)
Default value: 4GB (PostgreSQL 9.4+)
Units: Memory units (kB, MB, GB)
Scope: Can be set per-session, per-user, per-database, or server-wide

What It Does

  • Influences the query planner's cost estimates for index scans vs sequential scans
  • Higher values make the planner more likely to use index scans (assuming data is cached)
  • Does NOT actually allocate or use any memory itself
  • Only affects query planning decisions, not execution

Recommended Setting

Set to 50-75% of total system RAM:

-- For a system with 16GB RAM:
ALTER SYSTEM SET effective_cache_size = '12GB';
SELECT pg_reload_conf();

Or in postgresql.conf:

effective_cache_size = 12GB

Calculation Logic

Should approximate: shared_buffers + OS page cache available for PostgreSQL

For a dedicated database server:

  • Total RAM: 16GB
  • shared_buffers: 4GB (25% of RAM)
  • OS page cache: ~8GB
  • effective_cache_size: 12GB (75% of total RAM)

Version Info

Applies to PostgreSQL 9.0+ (modern versions 12-17 use same semantics)

Source: PostgreSQL Official Documentation - Resource Consumption

95% confidence
A

Table Bloat in PostgreSQL

Table bloat is disk space occupied by dead tuples (deleted or obsolete row versions) that PostgreSQL cannot immediately reuse. It occurs due to PostgreSQL's MVCC (Multi-Version Concurrency Control) implementation.

Why It Happens

When you UPDATE or DELETE a row, PostgreSQL:

  1. Marks the old tuple as "dead" but doesn't remove it immediately
  2. Creates a new tuple version (for UPDATEs)
  3. Keeps dead tuples until VACUUM processes them

Dead tuples accumulate faster than VACUUM reclaims them when:

  • High UPDATE/DELETE rate
  • Long-running transactions (prevent VACUUM from cleaning tuples)
  • Autovacuum settings too conservative
  • Disabled or failing autovacuum

Detection

Check bloat using the pgstattuple extension:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- pgstattuple_approx (PostgreSQL 9.5+) avoids a full table scan
SELECT 
  pg_size_pretty(table_len)              AS total_size,
  round(dead_tuple_percent::numeric, 2)  AS dead_tuple_percent,
  pg_size_pretty(dead_tuple_len)         AS dead_tuple_size
FROM pgstattuple_approx('your_table_name');

Bloat threshold: Dead tuple percentage >20% typically indicates problematic bloat.

Remediation

Regular VACUUM (reclaims space within table, doesn't shrink file):

VACUUM VERBOSE tablename;

VACUUM FULL (rewrites table, locks it exclusively, returns space to OS):

VACUUM FULL tablename;  -- Use during maintenance window

Autovacuum tuning (per-table storage parameters):

ALTER TABLE tablename SET (
  autovacuum_vacuum_scale_factor = 0.05,  -- Default: 0.2
  autovacuum_vacuum_threshold = 50        -- Default: 50
);

Impact

  • Increased disk I/O (scanning dead tuples)
  • Slower sequential scans
  • Index bloat (indexes also contain dead tuple pointers)
  • Wasted disk space

Source: PostgreSQL 16 Documentation - Routine Vacuuming

95% confidence
A

JIT compilation in PostgreSQL is beneficial for long-running CPU-bound queries, particularly analytical queries. For short queries, the JIT compilation overhead typically exceeds any performance gains.

When JIT Activates

PostgreSQL uses cost-based thresholds to automatically decide when to use JIT:

  1. jit_above_cost - JIT is triggered when estimated query cost exceeds this threshold
  2. jit_inline_above_cost - Enables inlining of small functions/operators into expressions (more aggressive optimization)
  3. jit_optimize_above_cost - Enables expensive LLVM optimizations for very costly queries

Key Characteristics

  • Decision timing: JIT decisions are made at plan time, not execution time
  • Primary benefit: Inlining function bodies into expressions eliminates function call overhead
  • Best use case: Queries with many expression evaluations (filtering, aggregations, computations)
  • Not beneficial for: Short queries where compilation overhead dominates, I/O-bound queries

You can verify JIT usage with EXPLAIN (ANALYZE, VERBOSE) to see JIT compilation timing and whether it was applied.
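The relevant settings can be inspected or adjusted directly; the defaults noted in the comments are those documented for recent releases:

SHOW jit;                      -- on by default since PostgreSQL 12
SHOW jit_above_cost;           -- default 100000
SHOW jit_inline_above_cost;    -- default 500000
SHOW jit_optimize_above_cost;  -- default 500000

-- Disable JIT for the current session, e.g. to compare runtimes
SET jit = off;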


95% confidence
A

checkpoint_completion_target is a PostgreSQL configuration parameter that controls how much of the checkpoint interval should be used to spread out checkpoint I/O operations.

Value: A floating-point number between 0 and 1 (default: 0.9 in PostgreSQL 14+, was 0.5 in earlier versions)

Purpose: When PostgreSQL performs a checkpoint (writing all dirty buffers to disk), this parameter determines the target completion time as a fraction of checkpoint_timeout. For example, with checkpoint_timeout=5min and checkpoint_completion_target=0.9, PostgreSQL aims to complete the checkpoint in 4.5 minutes.

Effect: Higher values (closer to 1.0) spread writes over a longer period, reducing I/O spikes but increasing recovery time after a crash. Lower values complete checkpoints faster but cause sharper I/O peaks.

Configuration:

# In postgresql.conf
checkpoint_completion_target = 0.9

-- Or at runtime (requires superuser)
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();

Best Practice: The default of 0.9 is recommended for most workloads. Only decrease it if you need faster crash recovery and can tolerate I/O spikes. Values below 0.5 are rarely beneficial.

Source: PostgreSQL 16 official documentation - Server Configuration: Write Ahead Log (postgresql.org/docs/16/runtime-config-wal.html#GUC-CHECKPOINT-COMPLETION-TARGET)

95% confidence
A

maintenance_work_mem

maintenance_work_mem is a PostgreSQL configuration parameter that specifies the maximum amount of memory to be used for maintenance operations (per operation, not total).

Exact Specifications

  • Default value: 64 MB
  • Minimum value: 1024 KB (1 MB)
  • Maximum value: 2147483647 KB (2 TB)
  • Scope: Session-level or server-level
  • Reload required: No (can be changed with SET command)

Operations That Use This Memory

  1. VACUUM (not VACUUM FULL)
  2. CREATE INDEX
  3. ALTER TABLE ADD FOREIGN KEY
  4. CREATE EXTENSION
  5. Restoring pg_dump files

Configuration

In postgresql.conf:

maintenance_work_mem = 256MB

Or per-session:

SET maintenance_work_mem = '1GB';
CREATE INDEX idx_name ON table_name(column_name);
RESET maintenance_work_mem;

Critical Details

  • Does NOT use shared_buffers: This is separate from regular query memory
  • Per-operation allocation: Each autovacuum worker or index creation uses this amount independently
  • Autovacuum limitation: Autovacuum workers use autovacuum_work_mem if set; otherwise each worker can use up to maintenance_work_mem, so total usage may reach autovacuum_max_workers × that value
  • Higher is better for large tables: For tables >1GB, setting this to 512MB-2GB significantly improves VACUUM and CREATE INDEX performance

Recommended Values

  • Small databases (<10GB): 256 MB
  • Medium databases (10-100GB): 512 MB - 1 GB
  • Large databases (>100GB): 1-4 GB (but watch total RAM usage)

Source: PostgreSQL 16 Official Documentation - Server Configuration

95% confidence
A

min_wal_size is a PostgreSQL configuration parameter that sets the minimum size to which the WAL (Write-Ahead Log) can shrink during automatic checkpoints.

Default Value: 80 MB (as of PostgreSQL 9.5+)

Purpose:

  • Controls WAL disk space management by preventing WAL files from being recycled/removed if total WAL size would fall below this threshold
  • Acts as a lower bound - PostgreSQL keeps at least this much WAL on disk even during quiet periods
  • Works in conjunction with max_wal_size to manage WAL growth

Behavior:

  • During checkpoints, PostgreSQL recycles old WAL files instead of deleting them if WAL size is near min_wal_size
  • When WAL size exceeds min_wal_size, PostgreSQL may remove old WAL segments
  • Setting too low wastes I/O on frequent WAL file creation/deletion
  • Setting too high wastes disk space

Configuration Example:

-- Set to 1 GB
ALTER SYSTEM SET min_wal_size = '1GB';
SELECT pg_reload_conf();
# In postgresql.conf
min_wal_size = 1GB

Valid Units: Accepts standard PostgreSQL size units: kB, MB, GB, TB

Relationship: Should be kept below max_wal_size (default 1GB). Recommended ratio: set min_wal_size to 25-50% of max_wal_size for write-heavy workloads.

Source: PostgreSQL Official Documentation, versions 9.5-17 (parameter introduced in 9.5, replacing checkpoint_segments)

95% confidence
A

autovacuum_vacuum_threshold is a PostgreSQL configuration parameter that sets the minimum number of updated or deleted tuples needed to trigger an autovacuum operation on a table.

Exact Value

  • Default: 50 tuples
  • Type: Integer
  • Minimum: 0

How It Works

Autovacuum triggers on a table when:

number_of_obsolete_tuples > autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * number_of_tuples)

Where:

  • autovacuum_vacuum_scale_factor default is 0.2 (20% of table)
  • Obsolete tuples = dead tuples from UPDATEs and DELETEs
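A rough way to see how close each table is to this trigger point, assuming the default threshold and scale factor (a sketch; per-table overrides are ignored):

SELECT relname,
       n_dead_tup,
       round(50 + 0.2 * n_live_tup) AS approx_vacuum_trigger
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;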

Configuration Levels

System-wide (postgresql.conf):

autovacuum_vacuum_threshold = 50

Per-table override (recommended for specific tables):

ALTER TABLE my_table SET (autovacuum_vacuum_threshold = 1000);

Practical Impact

  • Small tables: With default settings, a 100-row table needs 50 + (0.2 × 100) = 70 dead tuples to trigger vacuum
  • Large tables: A 1M-row table needs 50 + (0.2 × 1M) = 200,050 dead tuples
  • High-churn small tables: Set lower threshold (e.g., 10) to vacuum more frequently

Source

PostgreSQL 16 Official Documentation: Automatic Vacuuming

Valid for PostgreSQL 9.x through 17 (value unchanged across versions).

95% confidence
A

ANALYZE in PostgreSQL

ANALYZE is a PostgreSQL command that collects statistics about the contents of tables in the database. The query planner uses these statistics to generate optimal execution plans.

What it does:

  • Samples table rows (300 × default_statistics_target rows per table, i.e. 30,000 rows with the default target of 100)
  • Computes statistics: most common values, histograms, NULL fraction, average width, distinct value counts
  • Stores statistics in pg_statistic system catalog (viewable via pg_stats view)
  • Does NOT modify table data, only metadata

Syntax:

-- Analyze entire database
ANALYZE;

-- Analyze specific table
ANALYZE table_name;

-- Analyze specific columns
ANALYZE table_name (column1, column2);

-- Verbose output
ANALYZE VERBOSE table_name;

When to run:

  • After bulk INSERT/UPDATE/DELETE operations
  • After significant data changes (>10-15% of rows)
  • Automatically via autovacuum daemon (enabled by default)

Key parameters (postgresql.conf):

  • default_statistics_target = 100 (range: 1-10000, higher = more accurate but slower)
  • autovacuum_analyze_threshold = 50 rows
  • autovacuum_analyze_scale_factor = 0.1 (10% of table)

Performance impact:

  • Acquires SHARE UPDATE EXCLUSIVE lock (allows reads/writes, blocks schema changes)
  • Typically completes in seconds for tables <1M rows
  • Runs concurrently with normal operations

Source: PostgreSQL 16 Official Documentation - ANALYZE command reference

95% confidence
A

Autovacuum in PostgreSQL

Autovacuum is a background daemon process in PostgreSQL that automatically performs VACUUM and ANALYZE operations to maintain database health. It runs by default and prevents transaction ID wraparound failure while reclaiming dead tuple storage.

What It Does

  1. Removes dead tuples - Reclaims space from updated/deleted rows that are no longer visible to any transaction
  2. Prevents transaction ID wraparound - PostgreSQL uses 32-bit transaction IDs; autovacuum prevents the 2 billion transaction limit from causing data loss
  3. Updates statistics - Runs ANALYZE to refresh query planner statistics

Key Configuration Parameters

-- Check if autovacuum is enabled (default: on)
SHOW autovacuum;

-- Critical settings in postgresql.conf:
autovacuum = on  -- Enable/disable (default: on)
autovacuum_max_workers = 3  -- Max parallel workers (default: 3)
autovacuum_naptime = 60  -- Seconds between runs (default: 60s)

-- Per-table trigger thresholds:
autovacuum_vacuum_threshold = 50  -- Min row changes before vacuum
autovacuum_vacuum_scale_factor = 0.2  -- Fraction of table size (20%)
-- Vacuum triggers when: dead_tuples > threshold + (scale_factor * total_rows)

Critical Behavior

  • Vacuum runs on a table when: dead_tuples > 50 + (0.2 × table_rows)
  • ANALYZE runs when: changed_tuples > 50 + (0.1 × table_rows)
  • Anti-wraparound vacuum cannot be canceled and runs when a table reaches autovacuum_freeze_max_age (default: 200 million transactions)

Override Per Table

ALTER TABLE my_table SET (
  autovacuum_vacuum_scale_factor = 0.05,
  autovacuum_vacuum_threshold = 1000
);

Source: PostgreSQL 16 Official Documentation - Routine Vacuuming (https://www.postgresql.org/docs/current/routine-vacuuming.html)

95% confidence
A

PostgreSQL 18 Asynchronous I/O Performance

PostgreSQL 18 introduces an asynchronous I/O (AIO) subsystem that can deliver 2-3x performance improvements for read-heavy workloads.

How io_uring Works

io_uring establishes a shared ring buffer between PostgreSQL and the Linux kernel, allowing:

  1. Multiple I/O requests to be submitted in a single syscall
  2. Completions to be reaped without blocking
  3. Zero-copy data transfer in many cases

Performance by Environment

Environment                    Improvement   Notes
Cloud (EBS, network storage)   2-3x          Highest gains due to I/O latency
Local SSD                      ~24%          Still beneficial but less dramatic
Warm cache                     Minimal       Data already in memory

Supported Operations (PostgreSQL 18)

  • Sequential scans
  • Bitmap heap scans
  • VACUUM operations

Not yet supported: Index scans, write operations, WAL

Configuration

-- Enable io_uring (Linux only; io_method can only be changed at server start)
ALTER SYSTEM SET io_method = 'io_uring';
-- Then restart PostgreSQL for the change to take effect

-- Verify
SHOW io_method;

Requirements

  • Linux kernel 5.1 or later
  • File system must support io_uring (ext4, XFS, etc.)
  • Requires liburing library at compile time

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence
A

You should limit max_connections because each connection consumes system resources that are pre-allocated at server startup. PostgreSQL sizes certain resources (including shared memory) directly based on the max_connections value—increasing it allocates more memory and system resources even if those connections are never used.

Key reasons to limit:

  1. Memory consumption: Each connection requires RAM for connection management and client operations. Higher max_connections means less RAM available for query execution and caching.

  2. Performance degradation: With high max_connections, you must set work_mem lower to avoid exhausting memory. This makes individual queries slower.

  3. Resource pre-allocation: PostgreSQL allocates shared memory and other resources proportional to max_connections at startup, not dynamically.

  4. Diminishing returns: Beyond a certain point, adding more concurrent connections degrades performance due to context switching and resource contention.

Recommended approach: Keep max_connections relatively low (default is 100) and use a connection pooler like PgBouncer to handle many client connections efficiently.
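As a quick check (a sketch using the standard statistics views), current usage can be compared against the configured limit:

SELECT count(*) AS current_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;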


95% confidence
A

Connection pooling in PostgreSQL is a middleware technique that maintains a pool of persistent database connections that can be reused by multiple clients, rather than creating a new connection for each client request.

Why it's needed: Each PostgreSQL connection requires forking a new backend process, which involves a handshake that takes 20-30 milliseconds and consumes approximately 10 MB of RAM per connection. Direct connections don't scale well beyond a few hundred concurrent users.

How it works: A connection pooler (like PgBouncer or PgPool-II) sits between clients and PostgreSQL. Clients connect to the pooler, which maintains a smaller pool of actual database connections. When a client needs to execute a query, the pooler assigns it an available connection from the pool, then returns that connection to the pool when done.

Pooling modes:

  • Session pooling: Connection assigned to client for entire session duration
  • Transaction pooling: Connection assigned only during a transaction (most common)
  • Statement pooling: Connection assigned per statement (most aggressive, limited use cases)
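As an illustration only (a minimal PgBouncer sketch with placeholder values, not a tuned configuration), a pgbouncer.ini using the transaction pooling mode described above might look like:

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; connection held only for the duration of a transaction
default_pool_size = 20       ; server connections per database/user pair
max_client_conn = 500        ; client connections PgBouncer will accept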

Best practice: PostgreSQL's max_connections parameter (default 100) should remain relatively low. If you need more than 200 concurrent connections, use an external connection pooler rather than increasing max_connections, as PostgreSQL sizes shared memory resources based on this value.

Note: PostgreSQL does not include built-in connection pooling in the server itself. The JDBC driver provides basic pooling for Java applications, but for production deployments, external poolers like PgBouncer or PgPool-II are recommended.


95% confidence
A

Use PostgreSQL's built-in pgbench tool to benchmark database performance.

Basic workflow:

  1. Initialize test database:
pgbench -i -s 50 your_database_name

The -s scale factor determines dataset size (scale 50 = 5 million rows in pgbench_accounts).

  2. Run benchmark:
pgbench -c 10 -j 2 -t 1000 your_database_name
  • -c 10: 10 concurrent clients
  • -j 2: 2 worker threads
  • -t 1000: 1000 transactions per client
  3. Or run a time-based test:
pgbench -c 10 -j 2 -T 60 -P 5 your_database_name
  • -T 60: Run for 60 seconds
  • -P 5: Show progress every 5 seconds

Key metrics reported:

  • TPS (transactions per second): Primary performance indicator
  • Latency average: Response time per transaction
  • Standard deviation: Consistency of performance

Custom benchmarks:

pgbench -c 10 -T 60 -f custom_queries.sql your_database_name

Production-like testing options:

  • -M prepared: Use prepared statements (fastest)
  • -R 500: Rate-limit to 500 TPS (test sustained load)
  • -l: Log detailed per-transaction timing

Best practices:

  • Run tests for several minutes minimum
  • Scale factor should match or exceed client count
  • Use -j threads on multi-CPU systems
  • Run pgbench on separate machine for high client counts


95% confidence
A

When maintenance_work_mem is Used in PostgreSQL

maintenance_work_mem is used during maintenance operations, not regular queries. Specifically:

Operations That Use It:

  1. VACUUM - When cleaning up dead tuples and updating statistics
  2. CREATE INDEX - When building new indexes (including initial index creation)
  3. ALTER TABLE ADD FOREIGN KEY - When validating foreign key constraints
  4. REINDEX - When rebuilding existing indexes
  5. CREATE EXTENSION (certain extensions that build internal indexes)

NOT Used For:

  • Regular SELECT, INSERT, UPDATE, DELETE queries (these use work_mem)
  • Index scans during normal query execution
  • Autovacuum (uses autovacuum_work_mem if set, otherwise maintenance_work_mem)

Default Value:

  • 64 MB (since PostgreSQL 9.4; the default was 16 MB in earlier versions)

Configuration:

-- Session level
SET maintenance_work_mem = '256MB';

-- For a specific operation
SET maintenance_work_mem = '1GB';
CREATE INDEX idx_name ON table_name(column);
RESET maintenance_work_mem;

Sizing Guideline:

Set to 5-10% of total RAM for systems where you run maintenance operations, but typically not more than 2GB per operation (higher values show diminishing returns for index creation).
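
To see what a session is currently using (a quick check; both parameter names are standard GUCs):

SHOW maintenance_work_mem;
SHOW autovacuum_work_mem;  -- -1 means autovacuum falls back to maintenance_work_mem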

Source: PostgreSQL Official Documentation (v16), runtime configuration parameters - https://www.postgresql.org/docs/current/runtime-config-resource.html

95% confidence
A

pg_repack Extension

pg_repack is a PostgreSQL extension that removes bloat from tables and indexes by rebuilding them without holding exclusive locks during the operation, allowing concurrent reads and writes.

What It Does

It reorganizes tables to reclaim wasted space caused by UPDATE/DELETE operations, which in PostgreSQL leave behind dead tuples that create bloat. Unlike VACUUM FULL or CLUSTER, pg_repack does not block DML operations.

Key Characteristics

  • Online operation: Uses triggers to capture concurrent changes during rebuild
  • Lock requirements: Briefly holds ACCESS EXCLUSIVE lock only at the start and end (typically milliseconds)
  • Space requirements: Needs temporary disk space approximately equal to the table size
  • Compatibility: Works with PostgreSQL 9.4+ (latest version 1.5.1 supports PostgreSQL 16)

Installation

CREATE EXTENSION pg_repack;

Basic Usage

# Repack specific table
pg_repack -t schema.table_name database_name

# Repack all tables in database
pg_repack database_name

# Repack only indexes
pg_repack -x -t table_name database_name

When to Use

  • Tables with >20-30% bloat (check with the pgstattuple extension; see the example below)
  • Production systems where downtime is unacceptable
  • After bulk DELETE/UPDATE operations
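
A minimal bloat check with pgstattuple might look like this (the table name is illustrative; the extension must be installed first):

CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT dead_tuple_percent, free_percent
FROM pgstattuple('schema.table_name');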

Source

Official GitHub: https://github.com/reorg/pg_repack (maintained by the PostgreSQL community)

95% confidence
A

VACUUM in PostgreSQL

VACUUM is a PostgreSQL maintenance command that reclaims storage space occupied by dead tuples (deleted or obsoleted rows) and updates statistics for the query planner.

What it does:

  1. Removes dead tuples: PostgreSQL uses MVCC (Multi-Version Concurrency Control), which marks rows as deleted rather than immediately removing them. VACUUM physically removes these dead tuples.

  2. Prevents transaction ID wraparound: PostgreSQL uses a 32-bit transaction ID that wraps around after ~2 billion transactions. VACUUM freezes old transaction IDs to prevent data loss.

  3. Updates free space map (FSM): Marks reclaimed space as available for reuse by future INSERTs/UPDATEs.

Basic syntax:

-- Vacuum single table
VACUUM table_name;

-- Vacuum entire database
VACUUM;

-- Vacuum with full reclaim (locks table, rewrites entirely)
VACUUM FULL table_name;

-- Vacuum and analyze (updates statistics)
VACUUM ANALYZE table_name;

Key parameters:

  • autovacuum: Enabled by default since PostgreSQL 8.3. Runs automatically when thresholds are met.
  • autovacuum_vacuum_threshold: Default is 50 tuples
  • autovacuum_vacuum_scale_factor: Default is 0.2 (20% of table size)
  • Autovacuum triggers when: dead_tuples > threshold + (scale_factor * table_size)
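
For example, with the defaults a table of 100,000 rows becomes eligible for autovacuum once it accumulates more than 50 + 0.2 × 100,000 = 20,050 dead tuples.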

VACUUM vs VACUUM FULL:

  • VACUUM: Marks space as reusable but doesn't shrink table files. Non-blocking (allows concurrent reads/writes).
  • VACUUM FULL: Rewrites entire table, returns disk space to OS. Takes exclusive lock, can be very slow on large tables.

Version: PostgreSQL 12+ (current behavior). Verified from PostgreSQL 16 official documentation.

Source: https://www.postgresql.org/docs/current/sql-vacuum.html

95% confidence

Data Definition (DDL) > Views

24 questions

Functions - Array

23 questions

Transactions > Savepoints

23 questions

Backup and Restore > pg_restore

23 questions
A

No, pg_restore cannot restore large objects selectively (for instance, only those for a specific table). If an archive contains large objects, then all large objects will be restored, or none of them if excluded via -L, -t, or other options.

95% confidence
A

The --use-set-session-authorization option outputs SQL-standard SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to determine object ownership. This makes the dump more standards-compatible, but depending on the history of the objects in the dump, might not restore properly.

95% confidence
A

pg_restore supports three archive formats: custom format (designated 'c'), directory format (designated 'd'), and tar format (designated 't'). These are non-plain-text formats created by pg_dump. Plain-text SQL dumps cannot be restored with pg_restore and must use psql instead.

95% confidence
A

The --strict-names option requires that each schema (-n/--schema) and table (-t/--table) qualifier match at least one schema/table in the file to be restored. It generates an error if specified names don't match any objects in the backup file.

95% confidence

JIT Compilation

22 questions

replication_ha

22 questions
A

max_wal_senders is a PostgreSQL configuration parameter that sets the maximum number of concurrent WAL sender processes that can run simultaneously.

Default Value: 10 (the default since PostgreSQL 10; earlier releases defaulted to 0)

Purpose: Controls how many replication connections the server can support. Each streaming replication standby or logical replication subscriber requires one WAL sender process.

Key Requirements:

  • Must be set to at least the number of standby servers you plan to connect
  • Requires server restart to change (not reloadable)
  • Common practice: set to max_wal_senders = <number_of_standbys> + 2 to allow for temporary connections during maintenance

Prerequisites:

  • wal_level must be set to replica or logical (not minimal)
  • Each WAL sender consumes one connection slot from max_connections

Configuration Example:

# postgresql.conf
max_wal_senders = 5        # allows up to 5 replication connections
wal_level = replica         # required for streaming replication

Valid Range: 0 to 262143

Setting it to 0 disables replication entirely.
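
To compare how many sender slots are in use against the configured limit (one row per active WAL sender):

SELECT count(*) AS active_wal_senders FROM pg_stat_replication;
SHOW max_wal_senders;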

Source: PostgreSQL Official Documentation - Write Ahead Log Settings (applies to PostgreSQL 10+, consistent through PostgreSQL 17)

95% confidence
A

PostgreSQL Streaming Replication Setup (PostgreSQL 12+)

Streaming replication requires one primary server and one or more standby servers. Here's the complete setup:

On Primary Server

  1. Edit postgresql.conf:
wal_level = replica
max_wal_senders = 10
wal_keep_size = 1024  # MB, PostgreSQL 13+; use wal_keep_segments=64 for PG 12
  1. Create replication user:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secure_password';
  1. Edit pg_hba.conf:
host    replication    replicator    standby_ip/32    scram-sha-256
  1. Restart PostgreSQL:
pg_ctl restart -D /var/lib/postgresql/data

On Standby Server

  1. Create base backup:
pg_basebackup -h primary_ip -D /var/lib/postgresql/data -U replicator -P -v -R -X stream -C -S standby_slot_1

Key flags:

  • -R: Creates standby.signal file (PG 12+)
  • -X stream: Streams WAL during backup
  • -C -S: Creates replication slot standby_slot_1
  1. Verify postgresql.auto.conf contains:
primary_conninfo = 'host=primary_ip port=5432 user=replicator password=secure_password'
  1. Start standby:
pg_ctl start -D /var/lib/postgresql/data

Verification

On primary:

SELECT client_addr, state, sync_state FROM pg_stat_replication;

Expected output shows standby with state='streaming' and sync_state='async' (for async replication).

Source: PostgreSQL 16 Official Documentation - High Availability, Load Balancing, and Replication

95% confidence
A

Streaming Replication in PostgreSQL

Streaming replication is PostgreSQL's built-in method for continuously transferring Write-Ahead Log (WAL) records from a primary server to one or more standby servers in near real-time, enabling high availability and read scaling.

How It Works

The primary server streams WAL records to standbys over a TCP connection as they are generated (WAL is written in 8 KB pages within 16 MB segment files), rather than waiting for whole WAL segment files to complete. Standbys continuously apply these records to maintain an up-to-date copy of the database.

Key Configuration (PostgreSQL 12+)

On Primary (postgresql.conf):

wal_level = replica                    # required for replication
max_wal_senders = 10                   # max concurrent connections from standbys
wal_keep_size = 1024                   # MB of WAL to retain (prevents standby lag issues)

On Standby:
Create standby.signal file in data directory, then configure postgresql.conf:

primary_conninfo = 'host=primary.example.com port=5432 user=replicator password=secret'

Replication Modes

  • Asynchronous (default): Primary commits transactions without waiting for standby confirmation. Fastest, but standby may lag slightly.
  • Synchronous: Primary waits for standby acknowledgment before commit returns. Configure via synchronous_standby_names = 'standby1' on primary.

Verification

Check replication status on primary:

SELECT client_addr, state, sync_state, replay_lag 
FROM pg_stat_replication;

Version Note

This answer applies to PostgreSQL 12+ which uses standby.signal. PostgreSQL 11 and earlier used recovery.conf instead.

Source: PostgreSQL Official Documentation - High Availability, Load Balancing, and Replication

95% confidence
A

What is PgBouncer and Why Use It?

PgBouncer is a lightweight connection pooler for PostgreSQL that sits between your application and the database server, managing a pool of persistent database connections.

What It Does

PgBouncer maintains a pool of active PostgreSQL connections and multiplexes client connections onto them. When your application requests a database connection, PgBouncer assigns one from its pool rather than creating a new PostgreSQL backend process.

Why Use It

Primary reason: PostgreSQL creates a separate backend process for each connection (fork-based architecture). Each process consumes ~10MB of memory. With 1000 direct connections, that's ~10GB just for connection overhead.

PgBouncer solves this by:

  • Allowing 1000+ application connections to share 20-50 actual PostgreSQL connections
  • Using ~2MB per 1000 client connections (vs ~10GB for direct connections)
  • Connecting in <1ms (vs 5-50ms for new PostgreSQL connection with auth)

Pool Modes

  • Session pooling: Connection returned to pool when client disconnects (safest, supports all features)
  • Transaction pooling: Connection returned after each transaction (most efficient, cannot use prepared statements across transactions)
  • Statement pooling: Connection returned after each statement (breaks transactions, rarely used)

Recommendation: Use transaction pooling for stateless web apps; use session pooling if you need prepared statements or advisory locks.

Typical Configuration

[databases]
mydb = host=localhost port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3

This allows 1000 app connections using only 25-30 PostgreSQL connections.
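
Once running, you can inspect the pools through PgBouncer's admin console: connect with psql to the virtual pgbouncer database on PgBouncer's listen_port (6432 is the common default) as a user listed in admin_users or stats_users:

-- e.g. psql -h 127.0.0.1 -p 6432 -U admin_user pgbouncer
SHOW POOLS;   -- per-database/user pool usage
SHOW STATS;   -- request and traffic statistics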

Current stable version: PgBouncer 1.23.1 (as of January 2025)

Source: Official PgBouncer documentation, PostgreSQL documentation on connection costs

95% confidence
A

What is a WAL (Write-Ahead Log) in PostgreSQL?

A Write-Ahead Log (WAL) is PostgreSQL's crash recovery and replication mechanism where all data modifications are written to a sequential log file before being applied to the actual data files.

Key Characteristics

Core Principle: Changes are logged before data pages are modified (write-ahead), ensuring durability and crash recovery.

Physical Storage:

  • Default location: pg_wal/ directory (called pg_xlog/ in PostgreSQL 9.6 and earlier)
  • Files are 16 MB each by default (configurable via --wal-segment-size at compile time)
  • Named with 24-character hexadecimal identifiers (e.g., 000000010000000000000001)

Purpose:

  1. Crash Recovery: Replay WAL records to restore database to consistent state
  2. Point-in-Time Recovery (PITR): Archive WAL segments for backup restoration
  3. Replication: Stream WAL to standby servers for physical replication
  4. Minimal Disk I/O: Write sequential log instead of random data page updates

Critical Configuration Parameters

-- How much information is written to WAL
wal_level = replica  -- minimal | replica | logical (default: replica since PG 10)

-- When to flush WAL buffer
synchronous_commit = on  -- on | remote_apply | remote_write | local | off

-- WAL buffer size (shared memory)
wal_buffers = 16MB  -- default: -1 (auto-sized to 1/32 of shared_buffers, max 16MB)

-- Checkpoint frequency
checkpoint_timeout = 5min  -- default: 5 minutes
max_wal_size = 1GB  -- default: 1GB (triggers checkpoint)

How It Works

  1. Transaction modifies data → generates WAL record
  2. WAL record written to WAL buffer in shared memory
  3. At commit, WAL flushed to disk (via fsync)
  4. Only after WAL is on disk, transaction confirmed as committed
  5. Modified data pages written to disk later (asynchronously)

Viewing WAL Information

-- Current WAL write position
SELECT pg_current_wal_lsn();

-- WAL file for a specific LSN
SELECT pg_walfile_name('0/1000000');

-- Check WAL archiving status
SELECT archived_count, failed_count 
FROM pg_stat_archiver;

Source: PostgreSQL Official Documentation v16 - Chapter 29 (Reliability and the Write-Ahead Log)
Verified: PostgreSQL 12-16 (current stable as of January 2025)

95% confidence
A

What is Patroni for PostgreSQL?

Patroni is an open-source high availability (HA) solution for PostgreSQL that manages automatic failover and cluster configuration using a Distributed Configuration Store (DCS) like etcd, ZooKeeper, or Consul.

Core Functionality

Patroni runs as a daemon alongside each PostgreSQL instance and:

  • Monitors cluster health via DCS leader elections (default lease TTL: 30 seconds)
  • Performs automatic failover when the primary fails (typical failover time: 30-60 seconds)
  • Manages replication by automatically configuring standby servers
  • Provides REST API on port 8008 (default) for health checks and administration

Key Configuration Example

# patroni.yml
scope: postgres-cluster
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 192.168.1.10:8008

etcd:
  host: 192.168.1.100:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576  # bytes (1MB)

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.1.10:5432
  data_dir: /var/lib/postgresql/14/main

Critical Parameters

  • ttl: 30 - Leader lease duration in seconds (default: 30s)
  • loop_wait: 10 - Seconds between cluster state checks (default: 10s)
  • maximum_lag_on_failover: 1048576 - Max replication lag in bytes to allow failover (default: 1MB)

Source: Patroni Official Documentation v3.2+

95% confidence
A

PostgreSQL synchronous_commit Setting

synchronous_commit is a PostgreSQL configuration parameter that controls whether the server waits for WAL (Write-Ahead Log) records to be written to disk before reporting transaction commit success to the client.

Valid Values (PostgreSQL 9.1+)

  • on (default) - Wait for the local WAL flush to disk before commit returns (and for standby flush when synchronous_standby_names is set)
  • remote_apply (9.6+) - Wait for standby servers to apply WAL changes before commit returns
  • remote_write (9.2+) - Wait for standby servers to write WAL to the OS (not fsync) before commit returns
  • local (9.1+) - Wait only for the local WAL flush; never wait for standby confirmation
  • off - Return success immediately, without waiting for the local WAL flush

Performance vs. Durability Trade-off

When set to off, commits are 2-10x faster but there is a risk of losing the last few transactions (up to 3x wal_writer_delay, default 200ms = 600ms window) if the server crashes. Data corruption does NOT occur—only recent committed transactions may be lost.

Configuration

-- Session-level (recommended for specific transactions)
SET synchronous_commit = off;

-- Server-level (postgresql.conf)
synchronous_commit = on

Common Use Case

BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO logs VALUES (...); -- Fast, non-critical data
COMMIT; -- Returns immediately

Source: PostgreSQL 16 Official Documentation - "Write-Ahead Logging (WAL)" and "High Availability, Load Balancing, and Replication"

Critical Detail: Only affects commit acknowledgment timing; all transactions still write to WAL eventually. This is NOT a data integrity risk for crash recovery, only a recent-transaction-loss risk.

95% confidence
A

Cascading Replication in PostgreSQL

Cascading replication is a PostgreSQL feature where a standby server receives WAL (Write-Ahead Log) data from another standby server instead of directly from the primary server, creating a replication chain: Primary → Standby A → Standby B.

Configuration

Available since: PostgreSQL 9.2+

On the intermediate standby (the one that will send WAL to downstream standbys), set in postgresql.conf:

hot_standby = on
max_wal_senders = 5        # Must be > 0 to allow downstream connections
wal_level = replica        # Required for cascading (or 'logical')

On the downstream standby, configure primary_conninfo in postgresql.auto.conf or recovery settings to point to the intermediate standby instead of the primary:

primary_conninfo = 'host=standby-a port=5432 user=replication password=yourpass'
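
To confirm the cascade is working, query pg_stat_replication on the intermediate standby; downstream standbys appear there just as they would on a primary:

SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;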

Key Characteristics

  • WAL propagation: Changes flow Primary → Standby A → Standby B. If Standby A fails, Standby B stops receiving updates until reconnected to another source.
  • Lag multiplication: Replication lag accumulates down the chain. If Primary→A has 2s lag and A→B has 2s lag, B is ~4s behind Primary.
  • Authentication: Each standby in the chain must be configured in pg_hba.conf with replication permissions for downstream standbys.
  • No automatic failover rerouting: If an intermediate standby fails, downstream standbys do not automatically reconnect to the primary.

Use Cases

  • Reducing network load on the primary when replicating across geographic regions
  • Creating multi-tier replication topologies
  • Offloading replication bandwidth to dedicated relay servers

Source: PostgreSQL 17 Documentation - High Availability, Load Balancing, and Replication

95% confidence
A

pg_promote() Function

Available since: PostgreSQL 12

Purpose: Promotes a physical standby server to primary (read-write) mode by exiting recovery.

Syntax

pg_promote(wait boolean DEFAULT true, wait_seconds integer DEFAULT 60) → boolean

Parameters

  • wait (boolean, default: true) - Whether to wait for promotion to complete before returning
  • wait_seconds (integer, default: 60) - Maximum seconds to wait for promotion (only used if wait is true)

Return Value

  • Returns boolean
  • true if promotion succeeds (when wait is true)
  • false if promotion fails or timeout is reached
  • Returns immediately after signaling the postmaster if wait is false

Permissions

  • Restricted to superusers by default
  • EXECUTE privilege can be granted to non-superuser roles for delegated failover

Usage Example

-- Promote with default behavior (wait up to 60 seconds)
SELECT pg_promote();

-- Promote and don't wait for completion
SELECT pg_promote(false);

-- Promote and wait up to 120 seconds
SELECT pg_promote(true, 120);

Prior to PostgreSQL 12

Use pg_ctl promote or create the trigger file specified by the trigger_file parameter in recovery.conf.

Source: PostgreSQL Official Documentation - System Administration Functions

95% confidence
A

PostgreSQL primary_conninfo Setting

The primary_conninfo is a PostgreSQL configuration parameter that specifies the connection string a standby server uses to connect to its primary server for streaming replication.

Location and Scope

  • Configuration file: postgresql.conf or postgresql.auto.conf
  • Scope: Standby servers only (has no effect on primary)
  • Reload: Requires pg_ctl reload or SELECT pg_reload_conf() to take effect

Format

Standard libpq connection string format:

primary_conninfo = 'host=primary.example.com port=5432 user=replicator password=secret application_name=standby1'

Required Parameters

  • host or hostaddr: Primary server hostname/IP
  • user: Replication user (must have REPLICATION role attribute)

Critical Optional Parameters

  • port: Default 5432
  • password: Authentication credential (consider using .pgpass file instead)
  • application_name: Identifies standby in pg_stat_replication view
  • sslmode: Set to require or higher for encrypted connections (default: prefer)

Example Configuration

# Minimal
primary_conninfo = 'host=192.168.1.10 user=replicator'

# Production-recommended
primary_conninfo = 'host=primary.db.local port=5432 user=replicator application_name=standby_dc2 sslmode=require connect_timeout=10'
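
On a running standby you can verify which connection settings are actually in effect (the conninfo column obscures security-sensitive fields):

SELECT status, conninfo FROM pg_stat_wal_receiver;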

Related Settings

Must be used with restore_command (for archive recovery) or alongside recovery signal file (standby.signal in PostgreSQL 12+).

Source: PostgreSQL 17 Official Documentation - High Availability, Load Balancing, and Replication

95% confidence
A

PostgreSQL Replication Slot

A replication slot is a persistent marker in PostgreSQL that guarantees WAL (Write-Ahead Log) files are retained until a replica consumes them, preventing automatic WAL deletion that would break replication.

Key Characteristics

Purpose: Ensures the primary server retains WAL segments needed by standby servers or logical replication subscribers, even if they disconnect temporarily.

Types:

  • Physical replication slots: For streaming replication (binary replication)
  • Logical replication slots: For logical replication (row-level changes)

Critical Behavior:

  • WAL files are never automatically removed if any replication slot needs them
  • Risk: Unbounded disk growth if a slot's consumer is permanently offline
  • Slots persist across server restarts
  • Each slot tracks a single consumer's replay position

Creating a Replication Slot

Physical slot (streaming replication):

SELECT pg_create_physical_replication_slot('slot_name');

Logical slot (PostgreSQL 10+):

SELECT pg_create_logical_replication_slot('slot_name', 'pgoutput');

The second parameter is the output plugin (e.g., pgoutput, wal2json, test_decoding).

Monitoring Slots

SELECT slot_name, slot_type, active, restart_lsn, 
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

Critical columns:

  • restart_lsn: Oldest WAL position this slot needs
  • active: Whether a consumer is currently connected
  • retained_wal: Amount of WAL kept for this slot

Removing a Slot

SELECT pg_drop_replication_slot('slot_name');

Important: Only drop slots when the consumer is permanently gone. Dropping an active slot breaks replication.

Configuration

Max slots (default: 10):

max_replication_slots = 10

Requires server restart to change.

Source: PostgreSQL 17 Official Documentation - Replication Slots

95% confidence
A

PgPool-II is a middleware proxy that sits between PostgreSQL clients and PostgreSQL database servers, providing connection pooling, load balancing, replication, and high availability features.

Core Functionality

Connection Pooling: Maintains a pool of persistent connections to PostgreSQL servers, reducing connection overhead. Each PgPool-II child process can handle multiple client connections mapped to fewer backend connections.

Load Balancing: Distributes SELECT queries across multiple PostgreSQL servers (read replicas) while routing write queries to the primary server. Uses configurable backend weights for query distribution.

Replication Management: Supports multiple replication modes:

  • Streaming Replication mode (native PostgreSQL replication)
  • Logical Replication mode
  • Snapshot Isolation mode
  • Native Replication mode (deprecated)

Automatic Failover: Detects backend node failures and automatically promotes a standby server to primary, with configurable health checks (health_check_period) and failover scripts.

Query Caching: Can cache SELECT query results in memory to reduce database load.

Architecture

PgPool-II uses a multi-process architecture where a parent process spawns child processes (default: 32, configurable via num_init_children) to handle client connections. Each child process can handle one connection at a time in the default mode.

Typical Use Cases

  • PostgreSQL high availability clusters
  • Read-heavy workloads requiring horizontal scaling
  • Applications needing transparent connection pooling
  • Database load distribution across replicas

Current Stable Version: 4.5.x (as of 2025)

Official Documentation: https://www.pgpool.net/docs/latest/en/html/

Note: PgPool-II is protocol-compatible with PostgreSQL and requires no application code changes—clients connect to PgPool-II using standard PostgreSQL connection strings on the configured port (default: 9999).
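
For a quick health check, connect to PgPool-II itself with psql and use its SHOW POOL_NODES command, which lists each backend with its role and status (host and database names here are illustrative):

-- e.g. psql -h pgpool.example.com -p 9999 -U postgres mydb
SHOW POOL_NODES;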

95% confidence
A

PostgreSQL wal_level Setting

The wal_level setting controls how much information is written to the Write-Ahead Log (WAL) in PostgreSQL.

Valid Values (PostgreSQL 9.6+)

  • minimal - Writes only the information needed for crash recovery
  • replica - Adds logging required for WAL archiving and replication (value introduced in 9.6, replacing archive and hot_standby; the default since PostgreSQL 10)
  • logical - Adds information necessary for logical decoding and logical replication

Configuration

Set in postgresql.conf:

wal_level = replica

Requires server restart to change.

Choosing the Right Level

  • Use replica if you need:

    • Physical replication (streaming replication)
    • Point-in-time recovery (PITR)
    • WAL archiving
  • Use logical if you need:

    • Logical replication (e.g., selective table replication)
    • Change data capture (CDC)
    • Logical decoding output plugins
  • Use minimal ONLY if:

    • No replication or archiving is needed
    • Reduces WAL volume by ~20-30%
    • Cannot create base backups while running

Check Current Value

SHOW wal_level;

Important Constraint

You cannot decrease wal_level if physical or logical replication slots exist. The server will refuse to start.

Source: PostgreSQL 16 Official Documentation - Write-Ahead Logging (WAL) Configuration

95% confidence
A

PostgreSQL Streaming Replication Lag

Streaming replication lag is the delay between when a transaction commits on the primary server and when that change becomes visible on a standby replica.

Three Types of Lag

PostgreSQL measures replication lag in three specific ways:

  1. Byte lag (pg_wal_lsn_diff()) - The difference in Write-Ahead Log (WAL) positions between primary and standby, measured in bytes
  2. Time lag (replay_lag, write_lag, flush_lag) - The time delay in milliseconds between WAL generation and application
  3. Transaction lag - Number of transactions the standby is behind (less commonly used)

Monitoring Replication Lag

On the primary server (PostgreSQL 10+), query pg_stat_replication:

SELECT 
    application_name,
    client_addr,
    state,
    sync_state,
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS byte_lag,
    write_lag,
    flush_lag,
    replay_lag
FROM pg_stat_replication;

On the standby server, query pg_stat_wal_receiver:

SELECT 
    pg_last_wal_receive_lsn() AS receive_lsn,
    pg_last_wal_replay_lsn() AS replay_lsn,
    pg_wal_lsn_diff(pg_last_wal_receive_lsn(), 
                     pg_last_wal_replay_lsn()) AS receive_replay_byte_lag;

Lag Columns Explained (PostgreSQL 10+)

  • write_lag: Time until WAL is written to standby's OS (not yet fsynced)
  • flush_lag: Time until WAL is fsynced to standby's disk
  • replay_lag: Time until WAL is applied to standby's database (queryable)

Critical: replay_lag is what matters for read queries - it shows when data becomes visible.

Acceptable Thresholds

  • Byte lag: < 16 MB (1 WAL segment in default config) is healthy; > 1 GB indicates problems
  • Time lag: < 1 second is excellent; > 10 seconds requires investigation

Source: PostgreSQL 16 official documentation on monitoring and replication statistics.

95% confidence
A

PostgreSQL Logical Replication

Logical replication is a method of replicating data objects and their changes based on their replication identity (typically the primary key). Unlike physical replication which copies disk block changes, logical replication replicates data changes at the row level using a publish/subscribe model.

Key Characteristics

Available since: PostgreSQL 10.0 (released October 2017)

How it works:

  1. Publisher creates a publication for specific tables/operations
  2. Subscriber creates a subscription pointing to the publisher
  3. Changes are sent as logical change records, not physical WAL blocks
  4. Allows replication between different PostgreSQL major versions and different architectures

Essential Configuration

Publisher requirements:

  • wal_level = logical (in postgresql.conf)
  • max_replication_slots >= 1 (default: 10)
  • max_wal_senders >= 1 (default: 10)

Replication identity required: Tables must have PRIMARY KEY or REPLICA IDENTITY set (FULL, INDEX, or DEFAULT).

Example Setup

-- On Publisher (PostgreSQL 10+)
CREATE PUBLICATION my_pub FOR TABLE users, orders;

-- On Subscriber
CREATE SUBSCRIPTION my_sub 
CONNECTION 'host=publisher_host dbname=mydb user=repuser password=secret'
PUBLICATION my_pub;

Limitations

  • Does NOT replicate: DDL changes, sequence values, large objects, TRUNCATE (before v11)
  • Requires table with same name on subscriber (can have different columns/indexes)
  • Initial data sync uses COPY protocol

Source: PostgreSQL Official Documentation v17 (current stable)

95% confidence
A

PostgreSQL: Synchronous vs Asynchronous Replication

Synchronous replication: The primary server waits for confirmation that at least one standby server has written the WAL (Write-Ahead Log) data to disk before reporting transaction commit to the client. This guarantees zero data loss if the primary fails.

Asynchronous replication (default): The primary server commits transactions immediately without waiting for standby confirmation. Standbys receive and apply changes with a delay, meaning data loss is possible if the primary fails before standbys receive the latest changes.

Configuration

Set in postgresql.conf on the primary:

# Asynchronous replication (default): leave synchronous_standby_names empty
synchronous_standby_names = ''

# Synchronous replication: name the standbys the primary must wait for
synchronous_standby_names = 'standby1,standby2'
# and keep synchronous_commit at 'on' (the default) or stronger (remote_apply)
synchronous_commit = on

For synchronous replication to work:

  1. Set synchronous_commit = on (or remote_write, remote_apply)
  2. Set synchronous_standby_names to list standby server names (from their application_name in primary_conninfo)

Synchronous Commit Levels

  • off - asynchronous (fastest, data loss possible)
  • local - wait for local disk write only
  • remote_write - wait for standby to write to OS (not fsync)
  • on - wait for standby to fsync to disk (safest, default synchronous)
  • remote_apply - wait for standby to apply changes (strongest consistency)

Trade-offs

Synchronous: Zero data loss, but 2-10x slower commits (depends on network latency between primary and standby).

Asynchronous: Maximum performance, but potential data loss window equals replication lag (typically seconds to minutes).

Source: PostgreSQL 16 Official Documentation - High Availability, Load Balancing, and Replication

95% confidence
A

pg_stat_replication View

pg_stat_replication is a system view in PostgreSQL that displays real-time information about active replication connections from standby servers to the primary server.

Purpose

Monitors the status of WAL (Write-Ahead Log) sender processes on the primary database server, showing one row per active replication connection (physical or logical).

Key Columns

Connection Info:

  • pid - Process ID of the WAL sender process
  • usename - Replication user name
  • application_name - Client application name
  • client_addr - IP address of the standby server

Replication State:

  • state - Current WAL sender state: startup, catchup, streaming, or backup
  • sync_state - Synchronous state: async, potential, sync, or quorum

WAL Positions (Critical for lag monitoring):

  • sent_lsn - Last WAL Log Sequence Number sent to standby
  • write_lsn - Last LSN written to standby's disk
  • flush_lsn - Last LSN flushed to standby's disk
  • replay_lsn - Last LSN replayed on standby

Lag Metrics:

  • write_lag - Time since last write confirmation
  • flush_lag - Time since last flush confirmation
  • replay_lag - Time since last replay confirmation (most important for monitoring)

Query Example

SELECT 
    application_name,
    client_addr,
    state,
    sync_state,
    replay_lag,
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

Requirements

  • Only exists on PRIMARY servers (returns empty on standbys)
  • Requires pg_monitor role or superuser privileges to view all columns
  • Available in PostgreSQL 9.1+ (lag columns added in 10.0)

Source: PostgreSQL Official Documentation - Monitoring Statistics Views

95% confidence
A

PostgreSQL: Streaming vs Logical Replication

Core Difference

Streaming Replication: Physical, block-level replication that copies exact WAL (Write-Ahead Log) records. Creates an identical byte-for-byte replica of the entire database cluster.

Logical Replication: Row-level, selective replication that replicates specific tables or databases using a publish/subscribe model. Transfers logical changes (INSERT/UPDATE/DELETE operations).

Key Technical Specifications

Streaming Replication

  • Granularity: Entire PostgreSQL cluster (all databases)
  • Replica Requirements: Must be identical PostgreSQL major version and architecture
  • Replica State: Read-only (hot standby) or recovery mode
  • Use Case: High availability, failover, disaster recovery
  • Data Format: Binary WAL segments (typically 16MB each)

Logical Replication

  • Granularity: Per-table or per-publication (subset of tables)
  • Replica Requirements: Can be a different PostgreSQL major version (commonly used to replicate from an older release to a newer one)
  • Replica State: Fully writable - can have additional tables, indexes, or receive other writes
  • Use Case: Selective data sync, zero-downtime upgrades, multi-master setups, data consolidation
  • Data Format: Logical changesets decoded from WAL

Configuration Examples

Streaming Replication (postgresql.conf on primary):

wal_level = replica
max_wal_senders = 10
wal_keep_size = 1GB

Logical Replication (postgresql.conf on primary):

wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

Publication setup (PostgreSQL 10+):

-- On publisher
CREATE PUBLICATION my_pub FOR TABLE users, orders;

-- On subscriber
CREATE SUBSCRIPTION my_sub 
CONNECTION 'host=primary dbname=mydb' 
PUBLICATION my_pub;

Critical Limitations

Streaming: Cannot replicate to different architectures (e.g., x86 to ARM), cannot filter specific tables, replica must match primary's configuration closely.

Logical: Does not replicate DDL changes (schema changes must be applied manually), does not replicate sequences, TRUNCATE requires special handling (PostgreSQL 11+), higher overhead than streaming.

Source: PostgreSQL Official Documentation v17 (applies to versions 10+)

95% confidence
A

Yes, you can query a standby server in PostgreSQL using Hot Standby mode.

Hot Standby Configuration

Hot Standby allows read-only queries on a physical streaming replication standby server. Available since PostgreSQL 9.0+.

Required Configuration

On the standby server, set in postgresql.conf:

hot_standby = on

This parameter takes effect at server start and must be enabled before the standby accepts connections.

Query Capabilities and Limitations

Allowed operations:

  • SELECT queries
  • EXPLAIN
  • COPY TO
  • Cursor operations (read-only)
  • Prepared statements (read-only)

NOT allowed (will error):

  • INSERT, UPDATE, DELETE
  • DDL statements (CREATE, ALTER, DROP)
  • Temporary table creation
  • Any write operations

Conflict Handling

Queries may be canceled if they conflict with WAL replay operations. The standby will terminate queries that hold locks blocking replay after max_standby_streaming_delay (default: 30 seconds).

max_standby_streaming_delay = 30s  # How long queries can delay replay

Verification

Check if hot standby is active:

SELECT pg_is_in_recovery();
-- Returns 't' (true) if server is in standby mode

Source: PostgreSQL Official Documentation - High Availability, Load Balancing, and Replication (Chapter 27.4 - Hot Standby)

95% confidence
A

Promoting a PostgreSQL Standby to Primary

To promote a standby server to primary in PostgreSQL, use the pg_ctl promote command or call the pg_promote() function.

Method 1: pg_ctl (Recommended)

pg_ctl promote -D /var/lib/postgresql/data

Where -D specifies the data directory path.

Method 2: SQL Function (PostgreSQL 12+)

SELECT pg_promote();

This must be executed as a superuser on the standby server.

Method 3: Trigger File (Legacy, pre-12)

Create a trigger file specified in recovery.conf:

touch /tmp/postgresql.trigger.5432

Critical Details

  • Promotion is irreversible - the standby cannot automatically revert to standby mode
  • Promotion time: Typically completes in <1 second for warm standbys, but may take longer if significant WAL replay is pending
  • Return value: pg_ctl promote returns exit code 0 on success, non-zero on failure
  • Connection handling: Existing read-only connections remain active and become read-write after promotion
  • Timeline change: PostgreSQL creates a new timeline (increments timeline ID) and creates a .history file in pg_wal/

Post-Promotion Requirements

  1. Update DNS/connection strings to point applications to the new primary
  2. Reconfigure old primary as standby if failback is needed (requires new primary_conninfo)
  3. Update monitoring systems
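
To verify that promotion took effect, check recovery status on the promoted server:

SELECT pg_is_in_recovery();  -- returns false once the server is a primary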

Version Notes

  • PostgreSQL 12+: recovery.conf merged into postgresql.conf, use standby.signal file presence to indicate standby mode (removed automatically on promotion)
  • PostgreSQL 9.1-11: Uses recovery.conf trigger file method

Source: PostgreSQL 17 Official Documentation - High Availability, Log Shipping, and Replication

95% confidence
A

PostgreSQL Replication Slots: Physical vs Logical

Direct Answer

Physical replication slots stream WAL (Write-Ahead Log) records for block-level replication, used by streaming replication and pg_basebackup. They ensure the primary retains WAL files until all standby servers consume them.

Logical replication slots decode WAL into logical change events (INSERT/UPDATE/DELETE operations) at the row level, enabling selective replication of specific tables/databases and cross-version replication.

Key Differences

Aspect            Physical                    Logical
Output format     Binary WAL records          Decoded logical changes
Granularity       Entire cluster              Per-database/table
Use case          Hot standby, failover       Selective sync, upgrades
Cross-version     Same major version only     Supports different versions
Decoder           None                        Required (pgoutput, wal2json, etc.)
Min. wal_level    replica                     logical

Creating Each Type

Physical replication slot:

SELECT pg_create_physical_replication_slot('standby_slot');

Logical replication slot:

SELECT pg_create_logical_replication_slot('logical_slot', 'pgoutput');
-- 'pgoutput' is the standard decoder for logical replication (PG 10+)

Configuration Requirement

Logical slots require:

# postgresql.conf
wal_level = logical  # (default is 'replica' which only supports physical)

Physical slots work with wal_level = replica or higher.

Version Support

  • Physical replication slots: PostgreSQL 9.4+
  • Logical replication slots: PostgreSQL 9.4+ via the logical decoding API; built-in logical replication using the pgoutput plugin arrived in PostgreSQL 10

Source: PostgreSQL Official Documentation - Replication Slots and Logical Replication

95% confidence
A

Hot Standby in PostgreSQL

Hot standby is a PostgreSQL feature that allows a standby server (replica) to accept read-only queries while continuously applying WAL (Write-Ahead Log) records from the primary server.

Key Characteristics

  • Read-only access: The standby accepts SELECT queries but rejects writes (INSERT, UPDATE, DELETE, DDL)
  • Continuous replication: Standby applies WAL changes while serving queries simultaneously
  • Near real-time data: Query results reflect data as of the last applied WAL record (typically seconds behind primary)

Configuration

Enable on the standby server in postgresql.conf:

hot_standby = on

This parameter is on by default since PostgreSQL 10.

Requirements

  1. Primary server must have WAL archiving or streaming replication configured
  2. Standby must be in recovery mode (created via pg_basebackup or restored from backup)
  3. standby.signal file must exist in data directory (PostgreSQL 12+)

Query Conflicts

Hot standby queries can be canceled if they conflict with WAL replay operations. Common conflicts:

  • Primary drops a table being queried on standby
  • Primary vacuums rows still visible to standby query

Control behavior with these standby parameters:

max_standby_streaming_delay = 30s  # How long queries can delay WAL apply
hot_standby_feedback = on          # Prevents primary from vacuuming rows needed by standby queries

Source

PostgreSQL Official Documentation: High Availability, Load Balancing, and Replication - Hot Standby

95% confidence

Data Types - Character

22 questions
A

bpchar is PostgreSQL's internal name for the character(n) type, standing for 'blank-padded char'. It is a PostgreSQL extension and can be used as an alias for character(n).
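
A quick illustration (bpchar accepts the same length modifier as character):

SELECT 'ab'::bpchar(5) = 'ab'::character(5);   -- true: both are blank-padded to length 5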

95% confidence

Transactions > Two-Phase Commit

22 questions

Data Definition (DDL) > Tables and Columns

22 questions
A

TRUNCATE is faster for clearing large tables. TRUNCATE is a constant-time operation that removes all rows without scanning them and reclaims disk space immediately without requiring a subsequent VACUUM. DELETE is faster for small tables (a few rows) because TRUNCATE has more overhead.
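
A minimal illustration (the table name is hypothetical):

TRUNCATE big_table;        -- constant time, reclaims disk space immediately
DELETE FROM big_table;     -- scans and deletes row by row, leaving dead tuples for VACUUM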

95% confidence

Extensions > uuid-ossp

22 questions

Partitioning > Partition Pruning

22 questions
A
  1. Planning time: when the optimizer examines partition definitions and removes partitions that cannot contain matching rows.
  2. Execution time: when clause values are unknown during planning, such as parameters in PREPARE statements, values from subqueries, or parameterized values on the inner side of nested loop joins.
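
As a sketch of plan-time pruning (table and partition names are illustrative), EXPLAIN shows only the partitions that can match:

EXPLAIN SELECT * FROM measurements WHERE logdate = DATE '2024-01-15';
-- Only the partition covering 2024-01-15 appears in the plan; the rest are pruned
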
95% confidence

Joins and Subqueries

22 questions

PostGIS > Geography Type

22 questions
A

Use geography when your data is geographically dispersed (covering much of the world) or is of global extent. Geography is convenient for people who don't want to learn about projected coordinate systems and need accurate distance measurements over long distances.

95% confidence

Internals > Visibility Map

22 questions
A

pg_check_visible returns the TIDs of non-all-visible tuples stored in pages marked all-visible in the visibility map. If this function returns a non-empty set of TIDs, the visibility map is corrupt.
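
pg_check_visible() ships with the pg_visibility extension; a minimal check looks like this (the relation name is illustrative):

CREATE EXTENSION IF NOT EXISTS pg_visibility;
SELECT * FROM pg_check_visible('my_table');  -- returns TIDs only if the map is corrupt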

95% confidence
A

Yes, setting a bit in the visibility map is WAL-logged during VACUUM. When a bit is set, the LSN of the visibility map page is updated to ensure that the visibility map update doesn't get written to disk before the WAL record is flushed. However, when a bit is cleared, it doesn't require WAL logging because clearing a bit is always safe from a correctness standpoint.

95% confidence

Partitioning > List Partitioning

22 questions
A

Create a CHECK constraint matching the expected partition constraint on the table prior to attaching it. The system will be able to skip the scan which is otherwise needed to validate the implicit partition constraint. Once ATTACH PARTITION is complete, drop the now-redundant CHECK constraint.
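
A sketch of the pattern (table, partition, and constraint names are illustrative):

ALTER TABLE measurements_2024 ADD CONSTRAINT m2024_chk
    CHECK (logdate >= DATE '2024-01-01' AND logdate < DATE '2025-01-01');
ALTER TABLE measurements ATTACH PARTITION measurements_2024
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
ALTER TABLE measurements_2024 DROP CONSTRAINT m2024_chk;  -- now redundant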

95% confidence

Event Triggers

21 questions
A

The table_rewrite event occurs just before a table is rewritten by some actions of the commands ALTER TABLE and ALTER TYPE. While other control statements like CLUSTER and VACUUM can rewrite a table, the table_rewrite event is not triggered by them.
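
A minimal sketch of a table_rewrite event trigger (function and trigger names are illustrative):

CREATE FUNCTION log_rewrite() RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'table % is being rewritten',
        pg_event_trigger_table_rewrite_oid()::regclass;
END;
$$;

CREATE EVENT TRIGGER track_rewrites ON table_rewrite
    EXECUTE FUNCTION log_rewrite();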

95% confidence

Indexes > Hash Indexes

21 questions
A

For performance reasons, PostgreSQL retains a cached copy of the metapage in each backend's relcache entry to avoid locking and pinning the metapage for every operation. This produces the correct bucket mapping as long as the target bucket hasn't been split since the last cache refresh.

95% confidence

Monitoring > pg_stat_activity

21 questions
A

The possible state values are: 'active' (backend is executing a query), 'idle' (backend is waiting for a new client command), 'idle in transaction' (backend is in a transaction but not executing a query), 'idle in transaction (aborted)' (transaction statement caused an error), 'fastpath function call' (backend is executing a fast-path function), 'disabled' (track_activities is disabled for this backend), and 'starting' (backend is in initial startup, client authentication is being performed).
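
A common monitoring query groups sessions by these states:

SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;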

95% confidence
A

pg_cancel_backend(pid) sends a SIGINT signal to the backend process, which attempts to cancel only the SQL query currently running in the specified session. The user stays connected and the session remains active. It returns true if successful and false otherwise.
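
For example, to cancel the running query of a specific session (the pid is illustrative and would normally be taken from pg_stat_activity):

SELECT pg_cancel_backend(12345);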

95% confidence

Partitioning > Hash Partitioning

21 questions
A

Every modulus among child partitions must be a factor of the next larger modulus. For example, you can use modulus 4, 8, 16, and 32 simultaneously because each is a factor of the next larger one, but you cannot use both modulus 10 and 15 together since 10 is not a factor of 15.
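
A sketch of a valid mix (table and partition names are illustrative): modulus 4 and modulus 8 can coexist because 4 divides 8.

CREATE TABLE events (id bigint) PARTITION BY HASH (id);
CREATE TABLE events_m4_r0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_m8_r1 PARTITION OF events FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- A partition with MODULUS 10 would now be rejected, because 8 is not a factor of 10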

95% confidence

core_concepts

21 questions
A

PostgreSQL INTEGER vs BIGINT Storage Size

INTEGER: 4 bytes
BIGINT: 8 bytes

Value Ranges

  • INTEGER (also called INT or INT4):

    • Storage: 4 bytes
    • Range: -2,147,483,648 to +2,147,483,647
  • BIGINT (also called INT8):

    • Storage: 8 bytes
    • Range: -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807

Example Usage

CREATE TABLE example (
    user_id INTEGER,        -- 4 bytes per row
    total_revenue BIGINT    -- 8 bytes per row
);

When to Use Each

  • Use INTEGER for most numeric columns (user IDs, counts, amounts under 2.1 billion)
  • Use BIGINT when values may exceed 2.1 billion (timestamps, large financial amounts, global identifiers)

Storage Impact

BIGINT uses exactly 2x the storage of INTEGER. For a table with 1 million rows:

  • INTEGER column: ~4 MB
  • BIGINT column: ~8 MB

Source: PostgreSQL Official Documentation - Numeric Types

Applies to: PostgreSQL 9.x through 17.x (storage sizes unchanged across versions)

95% confidence
A

Maximum Length of PostgreSQL TEXT Field

1 GB (1,073,741,823 bytes) - This is the maximum size for any TEXT field value in PostgreSQL.

Technical Details

  • The TEXT data type in PostgreSQL can store strings up to 1 GB in length
  • This limit applies to all variable-length text types: TEXT, VARCHAR, and CHAR
  • The actual maximum is precisely 1,073,741,823 bytes (1 GB - 1 byte)
  • This limit is enforced by PostgreSQL's TOAST (The Oversized-Attribute Storage Technique) mechanism

Code Example

-- TEXT field has no explicit length constraint
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT  -- Can store up to 1 GB
);

-- These are functionally identical in PostgreSQL:
CREATE TABLE examples (
    text_col TEXT,           -- up to 1 GB
    varchar_col VARCHAR,     -- up to 1 GB (no length specified)
    varchar_limited VARCHAR(100)  -- limited to 100 characters
);

Important Notes

  • Unlike VARCHAR(n), TEXT has no length modifier and defaults to the maximum
  • The 1 GB limit is a hard limit in PostgreSQL's storage system
  • Character vs byte count: For UTF-8 text, multibyte characters consume multiple bytes toward the 1 GB limit

Source

PostgreSQL Official Documentation (applies to all versions 8.0+): Section 8.3 - Character Types

95% confidence
A

6 dimensions

PostgreSQL arrays are limited to a maximum of 6 dimensions. This is defined by the MAXDIM constant in the PostgreSQL source code at src/include/utils/array.h:

#define MAXDIM 6

This limit is enforced throughout the PostgreSQL codebase. When you attempt to create or manipulate arrays with more than 6 dimensions, PostgreSQL will raise an error: "number of array dimensions exceeds the maximum allowed (6)".

Example:

-- Valid: up to 6 dimensions
SELECT ARRAY[[[[[[1]]]]]]::int[];

-- Invalid: 7 dimensions would exceed MAXDIM
SELECT ARRAY[[[[[[[1]]]]]]]::int[]; -- ERROR

Sources:

95% confidence
A

PostgreSQL search_path

The search_path is a session configuration parameter that defines the ordered list of schemas PostgreSQL searches when resolving unqualified object names (tables, functions, types, etc.).

Default Value

SHOW search_path;
-- Returns: "$user", public

The default searches for a schema matching the current username first, then the public schema.

How It Works

When you reference SELECT * FROM users, PostgreSQL searches schemas in search_path order until it finds a table named users. With default settings, it checks:

  1. Schema named after your username (if exists)
  2. public schema

Setting search_path

Session level:

SET search_path TO myschema, public;

Database level:

ALTER DATABASE mydb SET search_path TO myschema, public;

Role level:

ALTER ROLE myuser SET search_path TO myschema, public;

Connection string:

options=-c search_path=myschema,public

Security Critical Detail

In functions (especially SECURITY DEFINER functions), pin the search path with the function's SET clause or use schema-qualified names to prevent search_path hijacking attacks. Note that pg_catalog is implicitly searched before the schemas listed in search_path unless you explicitly place it elsewhere. PostgreSQL 15 also stopped granting CREATE on the public schema to PUBLIC by default, which reduces the attack surface.

-- Secure (any version): pin the function's search path
CREATE FUNCTION myfunc() RETURNS int AS $$
  SELECT count(*) FROM mytable;
$$ LANGUAGE SQL
SET search_path = myschema, pg_temp;

-- Secure session default
SET search_path TO pg_catalog, myschema, public;

Verification

SELECT current_schemas(true);  -- Shows actual search path including implicit schemas

Source: PostgreSQL 17 Official Documentation - Schema Search Path (https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATH)

95% confidence
A

PostgreSQL BOOLEAN type accepts three states: TRUE, FALSE, and NULL (unknown).

Valid Input Values

For TRUE:

  • TRUE (SQL keyword)
  • 'true'
  • 'yes'
  • 'on'
  • '1'
  • 't', 'y' (unique prefixes)

For FALSE:

  • FALSE (SQL keyword)
  • 'false'
  • 'no'
  • 'off'
  • '0'
  • 'f', 'n' (unique prefixes)

For UNKNOWN:

  • NULL (SQL keyword)

Input Rules

  • Case-insensitive (e.g., 'TRUE', 'True', 'true' all work)
  • Leading/trailing whitespace is ignored
  • Unique prefixes are accepted (e.g., 't' for true, 'f' for false)

Output Format

When queried, PostgreSQL always outputs boolean values as t or f (single character lowercase).

-- Example inputs (all valid)
INSERT INTO table (bool_col) VALUES 
  (TRUE),           -- SQL keyword
  ('yes'),          -- string representation
  ('1'),            -- numeric representation
  ('t');            -- prefix

-- All output as: t

Sources:

95% confidence
A

{"type":"result","subtype":"success","is_error":false,"duration_ms":13201,"duration_api_ms":22824,"num_turns":1,"result":"The maximum size of a single row in PostgreSQL is approximately 1.6 TB (1,600,000,000 bytes).\n\n## Key Details\n\nHard limit: Each row is stored in a single 8 KB page, but PostgreSQL uses TOAST (The Oversized-Attribute Storage Technique) to store large column values out-of-line in separate TOAST tables.\n\nExact specifications:\n- Maximum field size: 1 GB (when TOASTed)\n- Maximum row size with TOAST: 1.6 TB theoretical limit\n- Practical limit per column: 1 GB\n- Row header overhead: 23 bytes minimum (varies with null bitmap)\n- Maximum columns per table: 1600 (1664 with system columns)\n\nImportant constraints:\n- Without TOAST, a row must fit in a single page (8 KB minus headers)\n- Data types smaller than ~2 KB are stored inline by default\n- TOAST automatically applies to data types like TEXT, BYTEA, VARCHAR when values exceed ~2 KB\n\n## Example\n\nsql\n-- This works - large text column uses TOAST automatically\nCREATE TABLE large_rows (\n id SERIAL PRIMARY KEY,\n large_text TEXT, -- Can store up to 1 GB\n large_binary BYTEA -- Can store up to 1 GB\n);\n\n-- Insert a large row (PostgreSQL handles TOAST transparently)\nINSERT INTO large_rows (large_text) \nVALUES (repeat('x', 100000000)); -- 100 MB text\n\n\nSource: PostgreSQL 17 official documentation - Chapter 73.2 TOAST\n\nVersions: This limit applies to PostgreSQL 8.3+ (when TOAST was enhanced). Current as of PostgreSQL 17 (2024).","session_id":"5e4b0f55-9504-4161-9d89-681074557ff2","total_cost_usd":0.0222493,"usage":{"input_tokens":3,"cache_creation_input_tokens":414,"cache_read_input_tokens":16766,"output_tokens":441,"server_tool_use":{"web_search_requests":0,"web_fetch_requests":0},"service_tier":"standard","cache_creation":{"ephemeral_1h_input_tokens":0,"ephemeral_5m_input_tokens":414}},"modelUsage":{"claude-haiku-4-5-20251001":{"inputTokens":658,"outputTokens":192,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.001618,"contextWindow":200000},"claude-opus-4-5-20251101":{"inputTokens":845,"outputTokens":128,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.007425,"contextWindow":200000},"claude-sonnet-4-5-20250929":{"inputTokens":3,"outputTokens":441,"cacheReadInputTokens":16766,"cacheCreationInputTokens":414,"webSearchRequests":0,"costUSD":0.013206299999999999,"contextWindow":200000}},"permission_denials":[],"uuid":"931a5569-e5e1-4423-9d6f-b8d544313bd0"}

95% confidence
A

{"type":"result","subtype":"success","is_error":false,"duration_ms":11606,"duration_api_ms":18951,"num_turns":1,"result":"# Maximum Size of a PostgreSQL Database\n\nThe theoretical maximum database size in PostgreSQL is 4 petabytes (PB).\n\nHowever, this limit is controlled by several component limits:\n\n## Component Limits (PostgreSQL 9.0+)\n\n- Maximum database size: 4 petabytes (limited by tablespace size)\n- Maximum table size: 32 terabytes (TB)\n- Maximum row size: 1.6 TB (including TOAST storage)\n- Maximum field size: 1 GB (without TOAST)\n- Maximum rows per table: Unlimited (theoretically limited by table size)\n- Maximum columns per table: 250-1600 (depending on column types)\n- Maximum indexes per table: Unlimited\n\n## Practical Considerations\n\nThe 32 TB table size limit is the most common practical constraint. This is determined by:\n- Maximum blocks per table: 2^32 (4,294,967,296 blocks)\n- Default block size: 8 KB\n- Calculation: 2^32 blocks × 8 KB = 32 TB\n\nTo exceed 32 TB of data per table, use table partitioning to split data across multiple physical tables.\n\n## Configuration\n\nThe block size can be changed at compile time (using --with-blocksize configure option) to 1, 2, 4, 8, 16, or 32 KB, but 8 KB is standard and changing it requires recompiling PostgreSQL.\n\n## Source\n\nPostgreSQL Official Documentation - Appendix K: Limits (valid for PostgreSQL 9.0 through 17.x)","session_id":"15306456-758d-4464-9e3f-c2490893eab6","total_cost_usd":0.0315107,"usage":{"input_tokens":3,"cache_creation_input_tokens":3474,"cache_read_input_tokens":13704,"output_tokens":382,"server_tool_use":{"web_search_requests":0,"web_fetch_requests":0},"service_tier":"standard","cache_creation":{"ephemeral_1h_input_tokens":0,"ephemeral_5m_input_tokens":3474}},"modelUsage":{"claude-haiku-4-5-20251001":{"inputTokens":658,"outputTokens":180,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.001558,"contextWindow":200000},"claude-opus-4-5-20251101":{"inputTokens":845,"outputTokens":114,"cacheReadInputTokens":0,"cacheCreationInputTokens":0,"webSearchRequests":0,"costUSD":0.0070750000000000006,"contextWindow":200000},"claude-sonnet-4-5-20250929":{"inputTokens":3,"outputTokens":382,"cacheReadInputTokens":13704,"cacheCreationInputTokens":3474,"webSearchRequests":0,"costUSD":0.0228777,"contextWindow":200000}},"permission_denials":[],"uuid":"b29caba8-7e5d-4c03-ace8-a6f40eda9bdd"}

95% confidence
A

PostgreSQL Composite Types

A composite type in PostgreSQL is a user-defined data type that groups multiple fields (columns) together into a single type, similar to a row or record structure. Each field has a name and a data type.

Definition

Composite types are created using CREATE TYPE:

CREATE TYPE address AS (
    street VARCHAR(100),
    city VARCHAR(50),
    postal_code VARCHAR(10),
    country VARCHAR(50)
);

Usage

As table column:

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    home_address address,
    work_address address
);

Inserting data:

INSERT INTO employees (name, home_address, work_address) 
VALUES (
    'John Doe',
    ROW('123 Main St', 'Boston', '02101', 'USA'),
    ROW('456 Corp Ave', 'Boston', '02102', 'USA')
);

Accessing fields:

-- Use dot notation (parentheses required to avoid ambiguity)
SELECT name, (home_address).city FROM employees;

Key Characteristics

  • Table rows are composite types: Every table automatically has a composite type with the same name
  • Nested composites: Composite types can contain other composite types
  • Size limit: Subject to the general field and row size limits (1 GB per field, ~1.6 TB per row); in practice, keep composite values small relative to the 8 KB block size for efficient storage
  • NULL handling: The entire composite value can be NULL, or individual fields can be NULL

Official Reference

PostgreSQL Documentation: Composite Types (current as of PostgreSQL 17)

95% confidence
A

The range of SMALLINT in PostgreSQL is -32768 to +32767.

Technical Details:

  • Storage size: 2 bytes
  • Signed: Yes (always)
  • Min value: -32768 (-2^15)
  • Max value: +32767 (2^15 - 1)

Example Usage:

CREATE TABLE example (
    id SMALLINT
);

-- Valid insertions
INSERT INTO example VALUES (-32768);  -- minimum
INSERT INTO example VALUES (32767);   -- maximum

-- This will cause an error: smallint out of range
INSERT INTO example VALUES (32768);

Comparison with Other Integer Types:

  • SMALLINT: 2 bytes, -32768 to 32767
  • INTEGER: 4 bytes, -2147483648 to 2147483647
  • BIGINT: 8 bytes, -9223372036854775808 to 9223372036854775807

Source: PostgreSQL Official Documentation - Numeric Types
Version: Applies to all PostgreSQL versions (this range is part of the SQL standard and has not changed)

Performance Note: SMALLINT is optimal when you know values will stay within this range, as it uses half the storage of INTEGER.

95% confidence
A

PostgreSQL ENUM Type

An ENUM (enumerated type) is a user-defined data type in PostgreSQL that consists of a static, ordered set of string values. Once created, an ENUM type can be used as a column type like any built-in type.

Creating and Using ENUMs

-- Create an ENUM type
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

-- Use in a table
CREATE TABLE person (
    name TEXT,
    current_mood mood
);

-- Insert values (must match exactly)
INSERT INTO person VALUES ('Alice', 'happy');
INSERT INTO person VALUES ('Bob', 'sad');

Key Characteristics

  • Storage: 4 bytes per value (same as integer)
  • Ordering: Values are ordered by creation sequence, NOT alphabetically
  • Case-sensitive: 'Happy' ≠ 'happy'
  • Immutable ordering: Existing values cannot be reordered or removed; labels can be renamed with ALTER TYPE ... RENAME VALUE (PostgreSQL 10+)
  • Adding values: ALTER TYPE mood ADD VALUE 'excited' AFTER 'happy' (available since 9.1; from PostgreSQL 12 it can run inside a transaction block, though the new value cannot be used until the transaction commits)

When to Use ENUMs

Use when:

  • Small, fixed set of values (e.g., status: 'pending', 'approved', 'rejected')
  • Values rarely change (adding is possible, but removing requires type recreation)
  • Need type safety at database level
  • Performance matters (4 bytes vs. variable TEXT storage)
  • Want constraint enforcement without CHECK constraints

Avoid when:

  • Values change frequently (use lookup table instead)
  • Need internationalization (enum values are stored literals)
  • Application manages validation logic (use VARCHAR with CHECK constraint)
  • Multiple applications with different valid values access the DB

Source

PostgreSQL Official Documentation v16: Chapter 8.7 - Enumerated Types

95% confidence
A

PostgreSQL: NUMERIC vs DECIMAL

They are identical. DECIMAL and NUMERIC are exact synonyms in PostgreSQL - they refer to the same data type with identical storage and behavior.

Key Facts

  • Both names create the exact same type internally
  • Both accept the same syntax: NUMERIC(precision, scale) or DECIMAL(precision, scale)
  • precision = total number of digits (max 1000)
  • scale = number of digits after decimal point
  • No storage or performance difference whatsoever

Examples

-- These are functionally identical:
CREATE TABLE example (
    price1 NUMERIC(10, 2),
    price2 DECIMAL(10, 2)
);

-- Both store exact decimal values like 99999999.99
-- Both use variable-length storage (2 bytes per 4 decimal digits + overhead)

Verification

-- Check the actual type stored:
SELECT 
    column_name, 
    data_type 
FROM information_schema.columns 
WHERE table_name = 'example';

-- Result shows both as "numeric"

Which to Use?

Use NUMERIC - it's the PostgreSQL-preferred name in official documentation. However, DECIMAL exists for SQL standard compatibility, so either is acceptable.

Source: PostgreSQL 17 Official Documentation - Numeric Types
(https://www.postgresql.org/docs/current/datatype-numeric.html)

95% confidence
A

CHAR vs VARCHAR in PostgreSQL

Key Difference: CHAR(n) pads values with spaces to exactly n characters, while VARCHAR(n) stores the actual string without padding (up to n characters).

Storage & Behavior

  • CHAR(n) (or CHARACTER(n)):

    • Fixed-length: Always stores exactly n characters
    • Pads shorter values with trailing spaces to reach length n
    • Trailing spaces are semantically insignificant and removed on retrieval
    • Storage: the blank-padded string (n bytes in a single-byte encoding) + 1 byte (if < 126 bytes) or 4 bytes (if ≥ 126 bytes) for the length prefix
  • VARCHAR(n) (or CHARACTER VARYING(n)):

    • Variable-length: Stores actual string length (up to n characters)
    • No padding applied
    • Storage: actual string length + 1 byte (if < 126 bytes) or 4 bytes (if ≥ 126 bytes) for length prefix

Code Example

CREATE TABLE comparison (
    fixed CHAR(10),
    variable VARCHAR(10)
);

INSERT INTO comparison VALUES ('hello', 'hello');

-- CHAR pads to 10 characters internally, VARCHAR stores 5
SELECT 
    fixed,
    variable,
    octet_length(fixed) AS char_bytes,      -- Returns 10 (octet_length does not strip char padding)
    octet_length(variable) AS varchar_bytes -- Returns 5
FROM comparison;

-- Internal storage differs
SELECT 
    fixed = 'hello     ' AS char_match,    -- TRUE (trailing spaces ignored)
    variable = 'hello     ' AS varchar_match -- FALSE
FROM comparison;

Performance

There is NO performance advantage to CHAR over VARCHAR in PostgreSQL (unlike some other databases). The PostgreSQL documentation explicitly states that CHAR(n) is usually slower due to padding overhead.

Recommendation

Use VARCHAR(n) or TEXT in PostgreSQL. The only reason to use CHAR(n) is for SQL standard compatibility or when you specifically need fixed-width, space-padded behavior.

Source: PostgreSQL 16 Official Documentation, Section 8.3 "Character Types"
https://www.postgresql.org/docs/current/datatype-character.html

95% confidence
A

The maximum size of BYTEA in PostgreSQL is 1 GB (1,073,741,824 bytes).

This limit applies to all TOAST-able data types in PostgreSQL, including BYTEA. The limit is enforced by PostgreSQL's TOAST (The Oversized-Attribute Storage Technique) mechanism, which has a maximum datum size of 1 GB.

Technical Details:

  • Maximum theoretical size: 1 GB - 1 byte (1,073,741,823 bytes)
  • This is a hard limit enforced by the MaxAllocSize constant in PostgreSQL source code
  • Applies to all PostgreSQL versions (confirmed in versions 9.x through 16+)

Storage Considerations:

  • BYTEA values larger than ~2 KB are automatically compressed and/or moved to TOAST tables
  • The 1 GB limit includes any overhead from compression or encoding

Example Usage:

-- This will work (within limit)
INSERT INTO files (data) VALUES (pg_read_binary_file('/path/to/file.bin'));

-- Check size of BYTEA column
SELECT pg_column_size(data) FROM files WHERE id = 1;

If you need larger binary storage:

  • Use PostgreSQL Large Objects (up to 4 TB per object)
  • Store files externally and keep references in the database

Source: PostgreSQL official documentation on TOAST and binary data types, consistent across versions 9.0-16.

95% confidence
A

The UUID type in PostgreSQL stores Universally Unique Identifiers (UUIDs) as defined by RFC 9562. It is a 128-bit value displayed as hexadecimal digits in the format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (8-4-4-4-12 digit groups).

Storage: 128 bits (16 bytes)

Example UUID: a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

Generating UUIDs:

PostgreSQL provides built-in functions for UUID generation:

-- Generate version 4 (random) UUID
SELECT gen_random_uuid();      -- built in since PostgreSQL 13
SELECT uuidv4();               -- alias for gen_random_uuid() (PostgreSQL 18+)

-- Generate version 7 (time-ordered) UUID (PostgreSQL 18+)
SELECT uuidv7();

-- Generate version 7 with timestamp shift (PostgreSQL 18+)
SELECT uuidv7(interval '1 hour');

Usage in tables:

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT
);

INSERT INTO users (name) VALUES ('Alice');  -- id auto-generated

Key advantages:

  • Version 4: Cryptographically random, globally unique
  • Version 7: Time-ordered with millisecond precision, better for indexing performance
  • No cross-database coordination required (unlike sequences)

Additional UUID algorithms: Install the uuid-ossp extension for UUIDv1, UUIDv3, and UUIDv5 generation.

Sources:

95% confidence
A

PostgreSQL hstore Extension

The hstore extension is a key-value store data type for PostgreSQL that stores sets of key-value pairs within a single PostgreSQL value. Each key and value is a text string.

Enabling hstore

CREATE EXTENSION IF NOT EXISTS hstore;

Storage and Syntax

-- Creating a table with hstore column
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    attributes hstore
);

-- Inserting data
INSERT INTO products (attributes) VALUES 
    ('color => "red", size => "M", weight => "500g"');

-- Alternative syntax with hstore() constructor
INSERT INTO products (attributes) VALUES 
    (hstore('color', 'blue') || hstore('size', 'L'));

Key Operations

-- Retrieve a value by key
SELECT attributes -> 'color' FROM products;

-- Check if key exists
SELECT attributes ? 'color' FROM products;

-- Check if multiple keys exist
SELECT attributes ?& ARRAY['color', 'size'] FROM products;  -- ALL keys
SELECT attributes ?| ARRAY['color', 'price'] FROM products; -- ANY key

-- Get all keys or values
SELECT akeys(attributes) FROM products;  -- returns text[]
SELECT avals(attributes) FROM products;  -- returns text[]

-- Convert to JSON
SELECT hstore_to_json(attributes) FROM products;

Indexing

-- GIN index for existence checks and containment
CREATE INDEX idx_attributes ON products USING GIN(attributes);

-- GiST index (alternative)
CREATE INDEX idx_attributes_gist ON products USING GIST(attributes);

Limitations

  • Keys and values: Both are text strings only (no native numeric/boolean types)
  • NULL values: Distinguishes between NULL value and missing key
  • Size: No hard limit, but large hstore values impact performance
  • Nesting: No nested structures (flat key-value only)

Use Cases

Use hstore for semi-structured data with varying attributes where you need:

  • Indexable key-value storage
  • Better query performance than JSON for key existence checks
  • PostgreSQL versions before JSONB matured (pre-9.4)

Note: For PostgreSQL 9.4+, consider JSONB for more complex semi-structured data needs (supports nested objects, arrays, and native data types).

Source: PostgreSQL Official Documentation (hstore module), compatible with PostgreSQL 9.0+, built-in contrib module.

95% confidence
A

PostgreSQL: Schema vs Database

A database is a top-level container that holds all data objects and is completely isolated from other databases (separate connections, no cross-database queries in standard PostgreSQL). A schema is a namespace within a database that organizes tables, views, functions, and other objects, allowing multiple schemas in one database with fully-qualified access via schema_name.table_name.

Key Differences

Database:

  • Created with CREATE DATABASE dbname;
  • Requires separate connection (cannot query across databases in one session)
  • Contains one or more schemas
  • Has its own encoding, collation, and database-level privileges (roles/users themselves are shared across the whole cluster)
  • Physical separation on disk

Schema:

  • Created with CREATE SCHEMA schemaname;
  • Multiple schemas accessible in single connection
  • Default schema is public (created automatically)
  • Enables logical organization without connection overhead
  • Cross-schema queries: SELECT * FROM schema1.table1 JOIN schema2.table2

Code Example

-- Database level (requires reconnecting)
CREATE DATABASE app_production;
\c app_production  -- Connect to database

-- Schema level (same connection)
CREATE SCHEMA sales;
CREATE SCHEMA marketing;

CREATE TABLE sales.orders (id INT, amount DECIMAL);
CREATE TABLE marketing.campaigns (id INT, name TEXT);

-- Query across schemas (same database)
SELECT o.id, c.name 
FROM sales.orders o 
JOIN marketing.campaigns c ON o.id = c.id;

Schema Search Path

PostgreSQL uses search_path to resolve unqualified table names (default: "$user", public):

SHOW search_path;
SET search_path TO sales, marketing, public;

Source: PostgreSQL 17 Documentation - Chapter 5.9 Schemas

Use Case: Use schemas for logical separation (multi-tenant apps, dev/test environments within one DB). Use separate databases for complete isolation (different applications, security boundaries).

95% confidence
A

PostgreSQL ARRAY Type

The ARRAY type in PostgreSQL stores variable-length multidimensional arrays of a single data type. Every PostgreSQL data type has a corresponding array type (e.g., integer[], text[], timestamp[]).

Key Specifications

  • Declaration syntax: column_name data_type[] or column_name data_type ARRAY
  • Dimensions: PostgreSQL supports arrays of any number of dimensions, but all arrays are dynamically sized with no fixed bounds enforced
  • Index base: Arrays are 1-indexed (first element is at position 1, not 0)
  • Maximum size: Limited by the maximum field size of 1 GB
  • Type constraint: All elements must be of the same base type

Creating Arrays

-- Column declaration
CREATE TABLE products (
    id serial PRIMARY KEY,
    tags text[],
    prices integer ARRAY,
    matrix integer[][]  -- multidimensional
);

-- Inserting array literals
INSERT INTO products (tags, prices) VALUES 
    (ARRAY['electronics', 'sale'], ARRAY[99, 149]),
    ('{"book", "fiction"}', '{10, 15, 20}');  -- alternative syntax

Accessing Elements

-- Access single element (1-indexed)
SELECT tags[1] FROM products;  -- first element

-- Access slice
SELECT tags[1:2] FROM products;  -- elements 1 through 2

-- Multidimensional access
SELECT matrix[1][2] FROM products;

Essential Functions

  • array_length(array, dimension) - returns length of specified dimension
  • array_append(array, element) - adds element to end
  • array_cat(array1, array2) - concatenates arrays
  • unnest(array) - expands array to rows
  • ANY(array) / ALL(array) - comparison operators

-- Query with array operations
SELECT * FROM products WHERE 'sale' = ANY(tags);
SELECT array_length(prices, 1) FROM products;

Source: PostgreSQL 17 Official Documentation, Section 8.15 (Arrays)

Critical notes:

  • NULL handling: ARRAY[1, NULL, 3] is valid; the array itself can also be NULL
  • Use text[] not text ARRAY for consistency with PostgreSQL conventions
  • Arrays are stored in binary format internally but output as {val1,val2,val3} text representation

95% confidence
A

Adding a Value to an Existing ENUM Type in PostgreSQL

Use the ALTER TYPE ... ADD VALUE statement:

ALTER TYPE enum_type_name ADD VALUE 'new_value';

Position Control

By default, the new value is added at the end. To specify position:

-- Add before an existing value
ALTER TYPE enum_type_name ADD VALUE 'new_value' BEFORE 'existing_value';

-- Add after an existing value
ALTER TYPE enum_type_name ADD VALUE 'new_value' AFTER 'existing_value';

Transaction Behavior (CRITICAL)

PostgreSQL 12 and later: ADD VALUE can be executed inside a transaction block, but the newly added value cannot be used until the transaction has been committed (unless the enum type itself was created in that same transaction).

PostgreSQL versions before 12: ADD VALUE cannot be executed inside a transaction block at all. Run it as a standalone statement and commit before using the new value.

To make the statement idempotent (safe to re-run in migrations), add IF NOT EXISTS:

-- Safe to re-run (idempotent)
ALTER TYPE enum_type_name ADD VALUE IF NOT EXISTS 'new_value';

Example

-- Create enum
CREATE TYPE status AS ENUM ('pending', 'active', 'closed');

-- Add new value at the end
ALTER TYPE status ADD VALUE 'archived';

-- Add new value in specific position
ALTER TYPE status ADD VALUE 'cancelled' AFTER 'pending';

Limitations

  • You CANNOT remove enum values (requires type recreation)
  • You CAN rename enum values with ALTER TYPE ... RENAME VALUE (PostgreSQL 10+); older versions require recreating the type
  • Enum values are sorted by creation order, not alphabetically

Source: PostgreSQL Official Documentation, ALTER TYPE command (PostgreSQL 9.1+, transaction improvements in 12.0)

95% confidence
A

The NUMERIC type in PostgreSQL can store up to 131,072 digits before the decimal point and up to 16,383 digits after the decimal point (the limits for a NUMERIC declared without explicit precision and scale).

Declaring NUMERIC:

-- Syntax: NUMERIC(precision, scale)
-- precision: total count of significant digits
-- scale: count of decimal digits in fractional part

NUMERIC(10, 2)   -- 10 total digits, 2 after decimal (e.g., 12345678.90)
NUMERIC(5)       -- 5 total digits, 0 after decimal (scale defaults to 0)
NUMERIC          -- No limit (stores exact value within implementation limits)

Implementation Limits:

  • Maximum declared precision (total significant digits): 1,000
  • Maximum scale (digits after decimal): 1,000
  • Theoretical maximum (per source code): 131,072 digits before decimal, 16,383 after
  • Storage: Variable length, approximately 2 bytes per 4 decimal digits + 8 bytes overhead

Key Behaviors:

  • Values exceeding declared precision cause an error
  • Values with more decimal places than scale are rounded (not truncated)
  • NUMERIC without parameters stores exact values up to implementation limit
  • DECIMAL is an alias for NUMERIC (identical behavior)

Example:

CREATE TABLE prices (
    exact_price NUMERIC,           -- No limit, exact storage
    currency NUMERIC(10, 2),       -- Max 10 digits, 2 decimal places
    very_precise NUMERIC(20, 10)   -- Max 20 digits, 10 decimal places
);

INSERT INTO prices VALUES (123.456789, 12345678.90, 1234567890.1234567890);

Source: PostgreSQL 16 Official Documentation - Chapter 8.1 (Numeric Types)

95% confidence
A

NULL vs Empty String in PostgreSQL

NULL represents the absence of a value (unknown or undefined data). Empty string ('') is a known value that happens to contain zero characters.

Key Differences

1. Storage and Semantics

  • NULL = no data exists, unknown, or not applicable
  • '' = a string value with length 0

2. Comparison Behavior

-- NULL comparisons always return NULL (unknown)
SELECT NULL = NULL;        -- Returns: NULL (not TRUE)
SELECT NULL IS NULL;       -- Returns: TRUE

-- Empty string comparisons work normally
SELECT '' = '';            -- Returns: TRUE
SELECT '' IS NULL;         -- Returns: FALSE

3. String Operations

-- NULL propagates through operations
SELECT 'Hello' || NULL;    -- Returns: NULL
SELECT length(NULL);       -- Returns: NULL

-- Empty string behaves as a value
SELECT 'Hello' || '';      -- Returns: 'Hello'
SELECT length('');         -- Returns: 0

4. Constraints

-- NOT NULL constraint
CREATE TABLE example (
    col1 VARCHAR NOT NULL  -- Rejects NULL, allows ''
);

INSERT INTO example VALUES ('');   -- SUCCESS
INSERT INTO example VALUES (NULL); -- ERROR: violates NOT NULL

5. Indexing

  • NULL and '' are distinct entries in an index
  • NULL values are included in B-tree indexes and can be searched with IS NULL / IS NOT NULL
  • '' is indexed as an ordinary value and matched by normal equality

6. Aggregation

-- COUNT ignores NULLs but counts empty strings
SELECT COUNT(col) FROM my_table;  -- Excludes NULL, includes ''

Practical Rule

Use NULL for missing/unknown data. Use '' only when you need to represent a string that explicitly has no characters (e.g., an empty user input that was intentionally submitted blank).

Version: PostgreSQL 12+ (behavior consistent across all modern versions)

Source: PostgreSQL Official Documentation - NULL Values

95% confidence

Client Authentication > Certificate Authentication

21 questions

Window Functions > Aggregate Window Functions

21 questions
A

If an aggregate supports moving-aggregate mode, it will improve calculation efficiency when the aggregate is used as a window function for a window with moving frame start (that is, a frame start mode other than UNBOUNDED PRECEDING). The moving-aggregate mode uses an inverse transition function to remove values as they leave the window frame, avoiding full recomputation.
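
A minimal sketch of an aggregate declared with moving-aggregate support (the aggregate name sum_m and the payments table are hypothetical; numeric_add and numeric_sub are existing built-in functions):

CREATE AGGREGATE sum_m(numeric) (
    SFUNC    = numeric_add,   -- forward transition: value enters the frame
    STYPE    = numeric,
    MSFUNC   = numeric_add,   -- moving-mode forward transition
    MINVFUNC = numeric_sub,   -- inverse transition: value leaves the frame
    MSTYPE   = numeric
);

-- With a moving frame start, each row's result is maintained incrementally
-- instead of being recomputed from scratch:
SELECT ts,
       sum_m(amount) OVER (ORDER BY ts ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM payments;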

95% confidence

Monitoring > Index Statistics

21 questions

Internals > Free Space Map

20 questions
A

The HEAP_INSERT_SKIP_FSM flag (also called TABLE_INSERT_SKIP_FSM) prevents PostgreSQL from consulting the Free Space Map during insert operations, effectively preventing free space from being reused and forcing new data to be appended.

95% confidence
A

For indexes, what is tracked is entirely-unused pages, rather than free space within pages. Therefore, the values are not meaningful, just whether a page is in-use or empty.

95% confidence
A

The FSM is organized as a tree of FSM pages. The bottom level FSM pages store free space for each heap/index page. The upper levels aggregate information from the lower levels. Each leaf node represents a heap page or a lower level FSM page.

95% confidence
A

You can delete the FSM file and run VACUUM on the table to rebuild it. It's a good idea to do this for each table that had a corrupt FSM (either in the table proper or in an index) so that corrected FSM data is rebuilt everywhere.

95% confidence
A

VACUUM FULL rewrites the entire contents of the table into a new disk file with no extra space and rebuilds the entire table and all indexes from scratch, which effectively recreates the FSM as well.

95% confidence
A

The pg_freespace function returns int2 (a 2-byte signed integer). The function has two overloads: pg_freespace(rel regclass, blkno bigint) returns int2 for a single page, and pg_freespace(rel regclass, blkno OUT bigint, avail OUT int2) for all pages.
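
A quick way to inspect this with the pg_freespacemap extension (the table name is hypothetical):

CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

-- Recorded free space for every page of the relation
SELECT * FROM pg_freespace('my_table');

-- Recorded free space for a single page (block 0)
SELECT pg_freespace('my_table', 0);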

95% confidence

Rules

20 questions

Full Text Search > Ranking and Highlighting

20 questions
A

When MaxFragments is set to a value greater than zero, it enables fragment-based headline generation. ts_headline locates query matches and splits each match into fragments of no more than MaxWords words each, preferring fragments with more query words. When more than one fragment is displayed, the fragments are separated by FragmentDelimiter.
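
A hedged example of fragment-based headlines (the document text and option values are illustrative):

SELECT ts_headline('english',
                   'TOAST keeps rows small. PostgreSQL moves oversized values into TOAST tables.',
                   to_tsquery('english', 'toast'),
                   'MaxFragments=2, MaxWords=6, MinWords=3, FragmentDelimiter=" ... "');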

95% confidence

Window Functions > Offset Functions

20 questions

Logical Replication > Replication Identity

20 questions
A

The four replica identity modes are: DEFAULT (records old values of primary key columns), USING INDEX (records old values of columns covered by a named index), FULL (records old values of all columns), and NOTHING (records no information about the old row).
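
The mode is set per table with ALTER TABLE; for example (table and index names are hypothetical):

ALTER TABLE orders REPLICA IDENTITY DEFAULT;                    -- primary key columns
ALTER TABLE orders REPLICA IDENTITY USING INDEX orders_key_idx; -- columns of a unique index
ALTER TABLE orders REPLICA IDENTITY FULL;                       -- all columns
ALTER TABLE orders REPLICA IDENTITY NOTHING;                    -- no old-row information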

95% confidence

Indexes > Indexes on Expressions

19 questions
A

Yes. An index definition can specify a collation for each column or expression using the syntax: (expression) COLLATE collation opclass [ASC | DESC] [NULLS {FIRST | LAST}]. The operator class determines the basic sort ordering, which can then be modified by adding the COLLATE clause and sort options.
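
A minimal sketch combining these clauses in one index definition (table, column, and index names are hypothetical):

CREATE INDEX idx_docs_title
    ON docs ((lower(title)) COLLATE "C" text_pattern_ops DESC NULLS LAST);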

95% confidence
A

VOLATILE (default) can do anything including modifying the database and may return different results with identical arguments; STABLE cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement; IMMUTABLE cannot modify the database and is guaranteed to return the same results given the same arguments forever.
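
As a concrete illustration (function and table names are hypothetical), only IMMUTABLE functions may be used in index expressions:

-- Built only from immutable operations, so it can safely be declared IMMUTABLE
CREATE FUNCTION normalize_email(e text) RETURNS text
    AS $$ SELECT lower(trim(e)) $$
    LANGUAGE sql IMMUTABLE;

CREATE INDEX idx_users_email ON users (normalize_email(email));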

95% confidence
A

Yes. Expression indexes not only allow rapid lookups of matching expressions, but they also provide optimizer statistics, which improve row estimates and hence query plans. Interestingly, the optimizer can use expression index statistics even when the expression index itself is not used for query execution.
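
A sketch of where those statistics show up (table and index names are hypothetical); after ANALYZE, the expression's statistics are listed in pg_stats under the index name:

CREATE INDEX idx_people_lower_name ON people (lower(name));
ANALYZE people;

SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE tablename = 'idx_people_lower_name';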

95% confidence
A

Yes, in principle index-only scans can be used with expression indexes. For example, given an index on f(x), queries selecting the expression result can potentially use index-only scans. However, PostgreSQL's planner is currently not very smart about such cases.

95% confidence

common_errors

19 questions
A

The FATAL: password authentication failed error occurs when PostgreSQL successfully receives your connection request but rejects it because authentication credentials are invalid or do not match the configured authentication method in pg_hba.conf.

What it means:

  • Network connection to PostgreSQL succeeded
  • Server is willing to communicate but requires valid authentication first
  • The credentials provided (username/password) failed verification

Common causes:

  1. Wrong password - Incorrect password for the database user
  2. Wrong username - User doesn't exist or is misspelled
  3. User has no password set - User created without PASSWORD clause
  4. pg_hba.conf authentication method mismatch - File specifies different auth method (e.g., md5, scram-sha-256, trust, peer, ident)
  5. Cached credentials - Client tool using outdated saved password

Troubleshooting steps:

  1. Verify username and password are correct
  2. Check PostgreSQL server logs for detailed error information (more details than client sees)
  3. Review pg_hba.conf authentication method for the matching connection entry
  4. If using Kerberos/ident authentication, verify those systems are configured properly
  5. Ensure pg_hba.conf changes are reloaded: SELECT pg_reload_conf(); or restart PostgreSQL

Example fix for password authentication:

-- Reset user password as superuser
ALTER USER username WITH PASSWORD 'new_password';

Sources:

95% confidence
A

The "could not extend file" error (SQLSTATE 53100) occurs when PostgreSQL cannot allocate more disk space to grow a database file. This error falls under PostgreSQL's "Class 53 — Insufficient Resources" errors.

Primary Causes:

  1. Insufficient Disk Space - The most common cause. The partition containing the PostgreSQL data directory has run out of available disk space.

  2. Inode Exhaustion - The filesystem has exhausted available inodes even if disk space remains.

  3. Filesystem Fragmentation - Free space exists but is too fragmented to allocate as contiguous blocks of the required size.

  4. File Size Limits - The file has reached filesystem or OS limits (e.g., 2GB limit on some systems, though PostgreSQL typically segments large tables at 1GB).

  5. Disk Quota Exceeded - User or group disk quotas have been reached.

  6. Temporary File Growth - Temporary files from large queries/sorts can exhaust disk space during query execution.

The Error:

ERROR: could not extend file "base/16384/61892": No space left on device
HINT: Check free disk space.

Where base/16384/61892 indicates the database OID (16384) and relation filenode (61892).

Resolution:

  • Free up disk space on the PostgreSQL data partition
  • Use tablespaces to move data to different partitions
  • Set temp_file_limit to prevent runaway temporary file growth
  • Monitor disk usage proactively

Sources:

95% confidence
A

A "checksum mismatch" error in PostgreSQL indicates data corruption detected when reading a data page from disk.

What It Means

PostgreSQL writes a checksum value to each data page when the page is written to disk. When reading the page back, PostgreSQL calculates the checksum again and compares it to the stored value. A mismatch means the data has been altered unexpectedly—typically due to hardware failure, storage corruption, or operating system bugs.

Error Behavior

When checksums are enabled (default in PostgreSQL 18):

  • Default behavior: PostgreSQL aborts the current transaction immediately when a checksum mismatch is detected
  • Warning message: "WARNING: page verification failed, calculated checksum X but expected Y"
  • Monitoring: The pg_stat_database view tracks checksum failures in the checksum_failures and checksum_last_failure columns

What's Protected

  • Protected: Data pages only
  • Not protected: Internal data structures, temporary files

Recovery Options

The ignore_checksum_failure parameter can be temporarily enabled to bypass checksum verification during recovery, allowing you to read corrupted blocks (though this may cause crashes or propagate corruption). The pg_checksums utility can verify or manage checksums on offline clusters.
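
To check whether a cluster has recorded any checksum failures (these pg_stat_database columns exist since PostgreSQL 12):

SELECT datname, checksum_failures, checksum_last_failure
FROM pg_stat_database
WHERE checksum_failures > 0;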

Sources:

95% confidence
A

A 'lock timeout' error in PostgreSQL occurs when a statement waits longer than the configured lock_timeout duration while attempting to acquire a lock on a database object (table, index, row, etc.).

The error is triggered when:

  • A statement needs to acquire a lock (explicit via LOCK TABLE, SELECT FOR UPDATE, or implicit during normal operations)
  • Another transaction already holds a conflicting lock on that object
  • The wait time exceeds the lock_timeout value (in milliseconds)

Error message:

ERROR: canceling statement due to lock timeout

Default behavior: lock_timeout = 0 (disabled - statements wait indefinitely for locks)

Key distinction: Unlike statement_timeout, this only measures time spent waiting for locks, not total statement execution time. Each separate lock acquisition attempt has its own timeout timer.

Common scenarios:

  • Long-running transactions holding locks while other queries wait
  • DDL operations (ALTER TABLE) blocking concurrent queries
  • Row-level locks from UPDATE/DELETE blocking other modifications

Sources:

95% confidence
A

The "relation does not exist" error in PostgreSQL occurs when the database cannot find the table, view, sequence, or other relation you're referencing. The three primary causes are:

1. Case Sensitivity with Quoted Identifiers

Unquoted identifiers are folded to lowercase. If you create a table with quotes like CREATE TABLE "MyTable", you must query it with quotes: SELECT * FROM "MyTable". Querying SELECT * FROM MyTable (unquoted) will look for mytable (lowercase) and fail.

Example:

-- Creates a case-sensitive table name
CREATE TABLE "Users" (id int);

-- This fails: ERROR: relation "users" does not exist
SELECT * FROM Users;

-- This works:
SELECT * FROM "Users";

2. Schema Search Path Issues

The table exists but isn't in your current schema search path. PostgreSQL looks for tables in the schemas listed in search_path. If your table is in a different schema (e.g., myschema.users), either qualify it fully or add the schema to your search path.

Example:

-- Table exists in 'myschema' but not in search_path
SELECT * FROM users;  -- ERROR: relation "users" does not exist

-- Solution 1: Use qualified name
SELECT * FROM myschema.users;

-- Solution 2: Add to search path
SET search_path TO myschema, public;
SELECT * FROM users;  -- Now works

3. Table Actually Doesn't Exist

The table hasn't been created yet, was dropped, or you're connected to the wrong database. Verify with \dt (psql) or query pg_tables.

Sources:

95% confidence
A

PostgreSQL 'disk full' errors occur when the filesystem hosting PostgreSQL data runs out of available space. The specific causes and impacts depend on which disk fills up:

WAL (Write-Ahead Log) disk full:

  • Causes database server panic and shutdown
  • Most critical scenario - can halt all database operations

Data disk full:

  • Prevents new write operations (INSERT, UPDATE, DELETE)
  • Does not corrupt existing data
  • Blocks useful activity but maintains data integrity

Common underlying causes:

  1. Uncontrolled table/index growth from normal operations
  2. Transaction log (WAL) accumulation when not archived/removed properly
  3. Temp files from large queries filling the pgsql_tmp directories
  4. Disk quota limits on the PostgreSQL user account
  5. Filesystem degradation (many filesystems perform badly when nearly full)

Prevention: PostgreSQL recommends monitoring disk usage proactively and using tablespaces to distribute data across multiple filesystems before reaching capacity.
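
A few size and temp-file queries that are useful for the proactive monitoring mentioned above (the relation name is hypothetical):

SELECT pg_size_pretty(pg_database_size(current_database()));   -- whole database
SELECT pg_size_pretty(pg_total_relation_size('big_table'));    -- table + indexes + TOAST
SELECT datname, temp_files, pg_size_pretty(temp_bytes)
FROM pg_stat_database;                                          -- temporary-file usage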

Sources:

95% confidence
A

Set lock_timeout to abort any statement that waits longer than the specified duration to acquire a lock.

Syntax:

SET lock_timeout = '2s';           -- Session level
SET LOCAL lock_timeout = '2s';     -- Transaction level
ALTER DATABASE mydb SET lock_timeout = '5s';  -- Database level
ALTER ROLE myuser SET lock_timeout = '3s';    -- User level

Valid values:

  • Integer with unit: '2s' (seconds), '500ms' (milliseconds), '1min' (minutes)
  • Integer without unit: treated as milliseconds
  • 0 or '0': disables timeout (default)

Examples:

-- For a single transaction
BEGIN;
SET LOCAL lock_timeout = '4s';
UPDATE accounts SET balance = balance + 100 WHERE id = 1;
COMMIT;

-- For current session
SET lock_timeout = '2s';

-- For a specific user
ALTER ROLE app_user SET lock_timeout = '3s';

Important:

  • Time limit applies separately to each lock acquisition attempt
  • Do NOT set in postgresql.conf - affects all sessions
  • If statement_timeout is set, keep lock_timeout lower, otherwise statement timeout triggers first
  • Applies to explicit locks (LOCK TABLE, SELECT FOR UPDATE) and implicit locks

Sources:

95% confidence
A

This error occurs when you attempt to insert or update a row with a NULL value in a column that has a NOT NULL constraint defined.

PostgreSQL enforces NOT NULL constraints at the database level - columns with this constraint cannot store NULL values. When you try to INSERT or UPDATE a row where a NOT NULL column would receive a NULL value, PostgreSQL raises error code 23502 (not_null_violation).

The error message format is:

ERROR: null value in column "column_name" violates not-null constraint
DETAIL: Failing row contains (...)

Common causes:

  • Explicitly inserting NULL: INSERT INTO table (col) VALUES (NULL)
  • Omitting a required column without a DEFAULT value
  • Setting a column to NULL in an UPDATE statement
  • Application logic passing NULL where a value is required

Example:

CREATE TABLE users (
    id serial PRIMARY KEY,
    email text NOT NULL
);

-- This fails:
INSERT INTO users (id) VALUES (1);
-- ERROR: null value in column "email" violates not-null constraint

Resolution: Provide a non-NULL value for the constrained column, or remove/modify the NOT NULL constraint if NULL values should be allowed.

Sources:

95% confidence
A

PostgreSQL "permission denied" errors occur when a user or role lacks the required privileges to perform an operation. Here are the specific causes:

Database Object Access

Missing object privileges - Users need explicit grants to access database objects. Operations fail without the correct privilege:

  • SELECT - read data
  • INSERT - add rows
  • UPDATE - modify data
  • DELETE - remove rows
  • TRUNCATE - clear tables
  • EXECUTE - call functions
  • REFERENCES - create foreign keys

Missing schema access - Requires USAGE privilege on the schema to access objects within it, even if you have privileges on the specific object.

Missing database connection privileges - Requires CONNECT privilege to connect to a database.

Ownership and Grants

Non-owner attempting restricted operations - Only the object owner (or superuser) can perform certain operations until privileges are explicitly granted via GRANT.

Revoked privileges - Privileges removed via REVOKE cause subsequent access attempts to fail.

Missing grant option - Cannot grant privileges to others without the "grant option" on that privilege.

Column-level restrictions - Operations on specific columns fail when column-level privileges weren't granted.

Special Cases

Foreign key enforcement - Requires REFERENCES privilege on the referenced table's columns.

Maintenance operations - Requires the MAINTAIN privilege (PostgreSQL 17+) for VACUUM, ANALYZE, CLUSTER, REINDEX.

Row-level security policies - Policies can restrict row access even when table privileges exist.

File system permissions - Operating system file permissions on data directories, WAL files, or configuration files can cause permission denied at the OS level.

Solution Pattern

-- Grant privileges on a table
GRANT SELECT, INSERT ON schema_name.table_name TO role_name;

-- Grant schema access
GRANT USAGE ON SCHEMA schema_name TO role_name;

-- Grant database connection
GRANT CONNECT ON DATABASE database_name TO role_name;

Sources:

95% confidence
A

Handle serialization failures by implementing transaction retry logic that catches SQLSTATE error code 40001 (serialization_failure) or 40P01 (deadlock_detected).

Core Strategy:

  1. Wrap the entire transaction in a retry loop
  2. Catch serialization failure exceptions
  3. Retry the complete transaction from the beginning, including all decision logic
  4. Set a maximum retry limit to prevent infinite loops

Code Pattern (Python with psycopg2):

import psycopg2
from psycopg2 import errorcodes

MAX_RETRIES = 5

def execute_transaction(conn):
    for attempt in range(MAX_RETRIES):
        try:
            with conn:
                with conn.cursor() as cur:
                    # Complete transaction logic here.
                    # psycopg2 opens the transaction itself; set the isolation
                    # level on the connection beforehand, e.g.:
                    #   conn.set_session(isolation_level='SERIALIZABLE')
                    # (an explicit BEGIN here would only produce a
                    # "transaction already in progress" warning)
                    # Your SQL operations
                    cur.execute("SELECT ...")
                    cur.execute("UPDATE ...")
                    # Transaction commits automatically at end of 'with' block
            break  # Success, exit retry loop
            
        except psycopg2.Error as e:
            if e.pgcode in (errorcodes.SERIALIZATION_FAILURE, 
                           errorcodes.DEADLOCK_DETECTED):
                if attempt < MAX_RETRIES - 1:
                    continue  # Retry
                else:
                    raise  # Max retries exceeded
            else:
                raise  # Non-retryable error

Critical Requirements:

  • Retry the entire transaction, including all application logic that determines which SQL to execute and which values to use
  • Do not use savepoints or attempt partial retries
  • PostgreSQL intentionally provides no automatic retry facility
  • Expect multiple retries may be needed under high contention
  • Unique constraint violations (SQLSTATE 23505) and exclusion constraint violations (23P01) may indicate serialization issues but could also be persistent errors requiring different handling

Sources:

95% confidence
A

PostgreSQL "out of memory" errors (error code 53200) occur when the server cannot allocate sufficient memory to complete an operation.

Primary Causes:

  1. work_mem misconfiguration: This parameter sets memory per operation per session. A complex query can run multiple sort/hash operations simultaneously, each using up to work_mem. With multiple concurrent sessions, total memory usage = work_mem × operations × sessions. The official documentation states: "simply increasing work_mem results in memory pressure (memory pressure typically takes the form of intermittent out of memory errors)."

  2. Insufficient shared memory: PostgreSQL uses shared_buffers for caching data. Setting this too high (>40% of RAM) leaves insufficient memory for other operations. The documentation recommends 25% of system RAM as a starting point for dedicated servers.

  3. Too many concurrent connections: Each connection can use work_mem multiple times per query, multiplying memory consumption.

  4. Operating system OOM killer: When the entire system runs out of memory, the OS may terminate PostgreSQL processes (signal 9: Killed).

  5. Lock table exhaustion: Operations requiring many locks can exhaust shared memory when max_locks_per_transaction is too low.

The Multiplication Effect:
A query with 3 sorts using work_mem=100MB across 10 concurrent sessions can consume 3GB (3 × 100MB × 10) of memory, not 100MB.
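
A minimal sketch of scoping work_mem so that this multiplication stays bounded (role and database names are hypothetical; values are illustrative, not recommendations):

SHOW work_mem;                                     -- 4MB by default
SET work_mem = '32MB';                             -- current session only
ALTER ROLE reporting_user SET work_mem = '64MB';   -- one role only
ALTER DATABASE analytics SET work_mem = '16MB';    -- one database only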

Sources:

95% confidence
A

"Canceling statement due to statement timeout" is an ERROR message that PostgreSQL returns when a SQL statement exceeds the time limit set by the statement_timeout configuration parameter.

statement_timeout is a PostgreSQL configuration parameter that aborts any statement that takes more than the specified amount of time. The value is specified in milliseconds (when no unit is provided). When set to 0 (the default), timeouts are disabled.

How it works:

  • PostgreSQL tracks execution time for each SQL statement
  • When a statement exceeds the statement_timeout limit, PostgreSQL terminates it immediately
  • The client receives the error: ERROR: canceling statement due to statement timeout
  • If log_min_error_statement is set to ERROR or lower, the timed-out statement is also logged

Configuration examples:

-- Set timeout to 30 seconds for current session
SET statement_timeout = '30s';

-- Set timeout to 5000 milliseconds
SET statement_timeout = 5000;

-- Disable timeout for current session
SET statement_timeout = 0;

Important: The official documentation explicitly states that setting statement_timeout in postgresql.conf is not recommended because it affects all sessions. It's better to set it per-session, per-user, or per-database.

Sources:

95% confidence
A

The "too many connections" error in PostgreSQL occurs when the database server has reached its maximum concurrent connection limit and cannot accept new connections.

Error Details:

  • Error Code: 53300
  • Condition Name: too_many_connections
  • Error Class: Class 53 — Insufficient Resources
  • Error Message: FATAL: sorry, too many clients already

Root Cause:
The server's max_connections parameter (default: typically 100) determines the maximum number of concurrent connections allowed. When active connections reach this limit (minus reserved slots for superusers), new connection attempts are rejected.

Configuration Parameters:

  • max_connections: Maximum concurrent connections to the database server
  • superuser_reserved_connections: Connection slots reserved for superusers (default: 3)
  • reserved_connections: Additional reserved slots (default: 0)

Common Solutions:

  1. Increase max_connections in postgresql.conf (requires server restart)
  2. Implement connection pooling (e.g., PgBouncer, pgpool-II)
  3. Investigate and fix connection leaks in application code
  4. Set per-role or per-database connection limits

Note: Simply increasing max_connections without addressing the underlying cause (connection leaks, improper connection management) is not recommended, as each connection consumes system resources.
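
Some quick checks, plus per-role and per-database caps (names are hypothetical):

SHOW max_connections;

-- Who is using the connection slots right now
SELECT usename, count(*) FROM pg_stat_activity GROUP BY usename;

-- Limit a single application role or database
ALTER ROLE app_user CONNECTION LIMIT 20;
ALTER DATABASE app_db CONNECTION LIMIT 50;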

Sources:

95% confidence
A

The invalid input syntax error (SQLSTATE 22P02) occurs when PostgreSQL cannot parse text data into the target data type because the input doesn't conform to the expected format.

Common Causes:

Type mismatch during casting or insertion:

  • Attempting to insert/cast a non-numeric string into an INTEGER column (e.g., 'abc' or 'false' into INT)
  • Empty strings ('') being cast to numeric types
  • Improperly formatted dates, timestamps, or other typed data
  • Special characters or whitespace that breaks the expected format

Data import issues:

  • NULL representation mismatches in COPY operations (default: \N in text format, empty in CSV)
  • Quoted vs unquoted values in CSV imports
  • Encoding or delimiter problems

Error Details:

  • Error Code: 22P02
  • Condition Name: invalid_text_representation
  • Error Class: 22 (Data Exception)

The error message typically includes the specific value that failed and the target type, e.g., invalid input syntax for integer: "abc"

Example:

-- This will raise the error:
SELECT '123abc'::integer;
-- ERROR: invalid input syntax for integer: "123abc"

-- This works:
SELECT '123'::integer;
-- Result: 123

Sources:

95% confidence
A

A foreign key constraint violation occurs when you attempt to INSERT or UPDATE a row in a table where the foreign key value does not exist in the referenced parent table.

Specific causes:

  1. INSERT with non-existent reference - Inserting a row where the foreign key column value has no matching row in the parent table

  2. UPDATE with non-existent reference - Updating a foreign key column to a value that doesn't exist in the parent table

  3. DELETE from parent table - Deleting a row from the parent table that is still referenced by child rows (when using RESTRICT or NO ACTION)

  4. UPDATE parent table key - Updating the primary/unique key in the parent table that child rows reference (when using RESTRICT or NO ACTION)

Error message format:

ERROR: insert or update on table "child_table" violates foreign key constraint "constraint_name"
DETAIL: Key (column_name)=(value) is not present in table "parent_table".

Example:

-- If 'cities' table has no row with name='Berkeley'
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');

-- Results in:
ERROR: insert or update on table "weather" violates foreign key constraint "weather_city_fkey"
DETAIL: Key (city)=(Berkeley) is not present in table "cities".

Foreign keys enforce referential integrity by ensuring every reference points to an existing row in the parent table.

Sources:

95% confidence
A

A statement timeout error occurs when a PostgreSQL query or statement takes longer to execute than the configured statement_timeout limit. PostgreSQL will abort the statement and return:

ERROR: canceling statement due to statement timeout

What is statement_timeout:
The statement_timeout parameter sets the maximum execution time for any SQL statement. When a statement exceeds this limit, PostgreSQL cancels it. The value is specified in milliseconds (default: 0, which disables timeout).

Key behavior:

  • The timeout applies from when the command arrives at the server until completion
  • Each SQL statement in a multi-statement query gets the timeout applied separately
  • The timed-out statement is logged if log_min_error_statement is set to ERROR or lower
  • Setting statement_timeout = 0 disables the timeout

Configuration example:

-- Set for current session (recommended)
SET statement_timeout = '30s';

-- Or in milliseconds
SET statement_timeout = 30000;

It's recommended to set statement_timeout at the session or role level rather than globally in postgresql.conf, to avoid affecting all database operations.

Sources:

95% confidence
A

PostgreSQL automatically detects deadlocks and resolves them by aborting one of the involved transactions, allowing the others to complete. Which transaction gets aborted is unpredictable.

Resolution Strategies

1. Automatic Retry
Implement retry logic to re-execute transactions that fail with deadlock errors (error code 40P01).

2. Prevention through Consistent Lock Ordering
Ensure all applications acquire locks on multiple objects in the same order across all transactions. This is the most effective prevention method.

3. Acquire Restrictive Locks Early
Take the most restrictive lock mode needed for each object at the beginning of the transaction.

4. Keep Transactions Short
Avoid holding transactions open during user input or long operations.

5. Configure Deadlock Detection Timeout
Adjust deadlock_timeout parameter (default: 1 second) to control how long PostgreSQL waits before checking for deadlocks:

ALTER SYSTEM SET deadlock_timeout = '1s';

Example: Query Retry Pattern

import psycopg2
from psycopg2 import errors

max_retries = 3
for attempt in range(max_retries):
    try:
        # Execute transaction
        conn.commit()
        break
    except errors.DeadlockDetected:
        conn.rollback()
        if attempt == max_retries - 1:
            raise

Sources:

95% confidence
A

To increase max_connections in PostgreSQL:

  1. Edit postgresql.conf - Locate your PostgreSQL configuration file (typically at /etc/postgresql/{version}/main/postgresql.conf or in your data directory) and modify:

    max_connections = 200
    

    (Replace 200 with your desired value; default is 100)

  2. Restart PostgreSQL - This parameter requires a full server restart:

    sudo systemctl restart postgresql
    # or
    pg_ctl restart -D /path/to/data/directory
    

Important Considerations:

  • Requires restart: max_connections can only be set at server start, not via reload
  • Shared memory impact: Increasing this value allocates more shared memory and system resources
  • Standby servers: Must set equal or higher value than primary server
  • Reserved connections: Ensure max_connections exceeds superuser_reserved_connections + reserved_connections

Verification:

SHOW max_connections;

Sources:

95% confidence
A

A serialization failure is an error (SQLSTATE code 40001) that PostgreSQL raises to prevent serialization anomalies when transactions running under Repeatable Read or Serializable isolation levels have conflicting concurrent access patterns.

When it occurs:

  • Repeatable Read: When a transaction attempts to modify or lock rows that have been changed by other transactions since it began (error message: "could not serialize access due to concurrent update")
  • Serializable: When concurrent transactions have read/write dependencies that cannot be reconciled with any serial execution order (error message: "could not serialize access due to read/write dependencies among transactions")

Required handling:
Applications must retry the complete transaction (including all logic and decision-making, not just SQL statements) when encountering this error. PostgreSQL does not provide automatic retry because it cannot guarantee correctness without re-executing the full transaction logic.

Example scenario:
Transaction A reads class 1 records and Transaction B reads class 2 records. Then A inserts based on class 1 data while B inserts based on class 2 data. If these operations create a logical conflict that couldn't occur in any serial execution, one transaction will fail with a serialization error.

Sources:

95% confidence

backup_recovery

19 questions
A

pg_dump backs up a single database. pg_dumpall backs up an entire PostgreSQL cluster (all databases plus global objects).

Key Differences

Scope:

  • pg_dump: One database at a time
  • pg_dumpall: All databases in the cluster

Global Objects:

  • pg_dump: Does NOT include roles, tablespaces, or cluster-wide settings
  • pg_dumpall: Includes roles, tablespaces, and privilege grants for configuration parameters

How it works:

  • pg_dump: Dumps a single database directly
  • pg_dumpall: Calls pg_dump for each database, then adds global objects

Consistency:

  • pg_dump: Single database is internally consistent
  • pg_dumpall: Each database is internally consistent, but snapshots across databases are not synchronized

When to Use Each

Use pg_dump when you need to backup/restore a specific database.

Use pg_dumpall when you need:

  • Complete cluster backup
  • To preserve user accounts (roles)
  • To preserve tablespace definitions
  • To migrate an entire PostgreSQL instance

You can also use pg_dumpall --globals-only to dump only cluster-wide objects, then combine with individual pg_dump commands for each database.

Sources:

95% confidence
A

The -Fc flag specifies the custom format output for pg_dump.

This format:

  • Outputs a custom-format archive suitable for input into pg_restore (cannot be used with psql)
  • Is the most flexible output format, allowing manual selection and reordering of archived items during restore
  • Is compressed by default
  • Supports parallel restoration

Example:

pg_dump -Fc mydb > mydb.dump
# Restore with:
pg_restore -d mydb mydb.dump

The -F parameter accepts different format values: c (custom), d (directory), t (tar), or p (plain text/SQL script).

Sources:

95% confidence
A

To restore from a pg_dump backup, use psql for plain-text SQL dumps or pg_restore for custom/directory/tar formats.

Plain-Text SQL Format

psql dbname < dumpfile

Create the target database first:

createdb -T template0 dbname
psql dbname < dumpfile

Custom/Directory/Tar Format

pg_restore -d dbname dumpfile

Common Options

Clean and recreate:

pg_restore -d dbname --clean --create dumpfile

Parallel restore (faster for large databases):

pg_restore -d dbname -j 4 dumpfile

Single transaction (all-or-nothing):

psql --single-transaction dbname < dumpfile

Stop on first error:

psql --set ON_ERROR_STOP=on dbname < dumpfile

List archive contents without restoring:

pg_restore -l dumpfile

Schema only:

pg_restore -d dbname --schema-only dumpfile

Data only:

pg_restore -d dbname --data-only dumpfile

Prerequisites

  • Target database must exist (unless using --create)
  • Database users referenced in the dump must already exist
  • Use template0 when creating the database to ensure clean state

Sources:

95% confidence
A

PostgreSQL supports three fundamental backup methods:

1. SQL Dump

Logical backup using pg_dump (single database) or pg_dumpall (entire cluster). Generates SQL commands that recreate the database state. This method is portable across PostgreSQL versions and machine architectures, and doesn't block database operations during backup.

2. File System Level Backup

Physical backup that directly copies PostgreSQL's data directory files. Requires database shutdown for consistency, unless using file system snapshots. Faster for large databases but less portable.

3. Continuous Archiving and Point-in-Time Recovery (PITR)

Combines a base filesystem backup (typically via pg_basebackup) with continuous archiving of Write-Ahead Log (WAL) files. Enables restoration to any point in time and supports warm standby servers.

Each method has distinct strengths and weaknesses regarding backup speed, restore flexibility, version portability, and operational overhead.

Sources:

95% confidence
A

recovery_target_xid is a PostgreSQL recovery configuration parameter that specifies the transaction ID (XID) up to which recovery will proceed during point-in-time recovery (PITR).

Key characteristics:

  • Type: String parameter
  • Purpose: Defines a stopping point for recovery based on a specific transaction ID
  • Scope: Can only be set at server start (in postgresql.conf or on command line)

Important behavior:

Transaction IDs are assigned sequentially when transactions start, but transactions can complete in a different numeric order. Recovery includes transactions that committed before (and optionally including) the specified XID.

The precise stopping point depends on the recovery_target_inclusive parameter:

  • on (default): Recovery stops just after the specified XID
  • off: Recovery stops just before the specified XID

Mutual exclusivity:

Only one recovery target parameter can be specified at a time:

  • recovery_target_xid
  • recovery_target_time
  • recovery_target_name
  • recovery_target_lsn
  • recovery_target

Using multiple will raise an error.

Sources:

95% confidence
A

PostgreSQL provides two main backup methods with different verification approaches:

For pg_basebackup (Physical Backups)

Use pg_verifybackup to verify backup integrity:

pg_verifybackup /path/to/backup

This tool:

  • Validates the backup_manifest file generated during backup
  • Checksums all files and compares against manifest values
  • Verifies WAL records needed for recovery are present and parsable
  • Detects missing or corrupted files

Common options:

# For tar-format backups (skip WAL parsing)
pg_verifybackup -n /path/to/backup

# Show progress
pg_verifybackup -P /path/to/backup

# Stop on first error
pg_verifybackup -e /path/to/backup

Critical limitation: pg_verifybackup cannot perform every check a running server will. You must still perform test restores to verify the backup actually works.

For pg_dump (Logical Backups)

No built-in verification tool exists. The only reliable method is test restoration:

# Restore to a test database
pg_restore -d test_database backup_file.dump

# Or for plain SQL dumps
psql test_database < backup_file.sql

pg_dump provides rudimentary corruption checking during backup creation by sequentially scanning all table data, but this doesn't verify the dump file itself is intact.

Best Practice for All Backups

Always perform test restores - this is the only way to truly verify backup integrity and recoverability, regardless of backup method.

Sources:

95% confidence
A

pg_restore is a PostgreSQL command-line utility that restores a PostgreSQL database from an archive file created by pg_dump in one of the non-plain-text formats (custom, directory, or tar).

Purpose

It reconstructs a database to the state it was in when the archive was created by issuing the necessary SQL commands. It only works with archives created by pg_dump - it cannot restore plain-text SQL dumps (use psql for those).

Supported Archive Formats

  • Custom format (-Fc): PostgreSQL's compressed binary format
  • Directory format (-Fd): Directory-based archive
  • Tar format (-Ft): Standard tar archive

The format is auto-detected, so you don't need to specify it.

Key Capabilities

  • Direct database restore: Connect directly to a database and restore
  • Script generation: Output SQL commands to stdout for manual execution
  • Selective restore: Choose specific tables, schemas, or objects to restore
  • Parallel restore: Use -j N to restore using N parallel jobs
  • Reordering: Reorder archive items before restoration
  • Clean option: Drop database objects before recreating them

Basic Usage

# Restore to a database
pg_restore -d dbname archive_file.dump

# Restore with parallel jobs
pg_restore -d dbname -j 4 archive_file.dump

# Restore only specific table
pg_restore -d dbname -t tablename archive_file.dump

Sources:

95% confidence
A

Barman (Backup and Recovery Manager) is an open-source administration tool for disaster recovery of PostgreSQL servers. It allows database administrators to perform remote backups of multiple PostgreSQL servers in a centralized way.

Key capabilities:

  • Physical backup and recovery of PostgreSQL databases
  • Remote backup from a central server
  • Point-in-Time Recovery (PITR)
  • WAL (Write-Ahead Log) archiving and management
  • Support for both rsync/SSH and streaming replication methods
  • Retention policies for automatic backup management
  • Incremental backups to save disk space
  • Parallel backup and recovery operations
  • Backup validation and verification

Maintained by: EnterpriseDB
License: GNU GPL 3
Language: Python
Current version: 3.16.2 (as of November 2025)

Barman runs on a dedicated backup server separate from your PostgreSQL database servers, making it suitable for managing disaster recovery across multiple database instances from a single location.

Sources:

95% confidence
A

Point-in-Time Recovery (PITR) is a PostgreSQL backup and recovery technique that enables restoring a database to any specific moment in time by combining a base backup with archived Write-Ahead Log (WAL) files.

How it works:

  1. Take a base backup of the entire database cluster
  2. Continuously archive WAL files as PostgreSQL records all database changes to the pg_wal/ subdirectory
  3. During recovery, restore the base backup and replay archived WAL files sequentially up to the desired point in time

Key capabilities:

  • Arbitrary recovery targets: Restore to any timestamp, named restore point, or specific transaction ID after the base backup
  • Consistent snapshots: Stop WAL replay at any point to get a consistent database state at that moment
  • No downtime for backups: Base backups don't need to be perfectly consistent; WAL replay corrects internal inconsistencies
  • Continuous protection: Archive WAL files indefinitely instead of taking frequent full backups
  • Warm standby support: Apply the same WAL sequence to another server for high-availability configurations

Example recovery targets:

-- Recover to specific timestamp
recovery_target_time = '2025-01-15 12:00:00'

-- Recover to named restore point
recovery_target_name = 'before_migration'

-- Recover to specific transaction
recovery_target_xid = '1234567'

Sources:

95% confidence
A

recovery_target_name is a PostgreSQL recovery configuration parameter that specifies the named restore point (created with pg_create_restore_point()) to which recovery will proceed.

When you set this parameter, PostgreSQL will recover the database up to the point where you previously created a named restore point, then stop. This allows point-in-time recovery to a specific, named location rather than a timestamp or transaction ID.

Key characteristics:

  • Must reference a restore point created with pg_create_restore_point()
  • Mutually exclusive with recovery_target_time, recovery_target_xid, and recovery_target_lsn
  • Only one recovery target parameter can be specified at a time
  • Requires server restart to take effect

Example usage:

-- Creating a restore point (before recovery)
SELECT pg_create_restore_point('before_major_update');

-- In postgresql.conf or recovery.conf (for recovery)
recovery_target_name = 'before_major_update'

Sources:

95% confidence
A

Use the -j njobs or --jobs=njobs option to specify the number of parallel jobs when running pg_dump.

Syntax:

pg_dump -Fd -j 4 -f output_directory dbname

Requirements:

  1. Directory format only: Must use -Fd (directory format). This is the only format where multiple processes can write simultaneously.

  2. Database connections: pg_dump opens njobs + 1 connections to the database. Ensure your max_connections setting accommodates this.

  3. PostgreSQL version: Server must support synchronized snapshots (9.2+ for primary servers, 10+ for standbys) to ensure consistency across parallel workers.

How it works:
The parallel jobs dump njobs tables simultaneously, reducing dump time but increasing database server load.

Important limitation:
If worker processes encounter exclusive locks on database objects, the dump may fail. Workers use shared locks with NOWAIT and will abort rather than deadlock if conflicts occur.

Example with 8 parallel jobs:

pg_dump -Fd -j 8 -f /backup/mydb_dump mydatabase

Sources:

95% confidence
A

To recover PostgreSQL to a specific timestamp, use Point-in-Time Recovery (PITR) with the recovery_target_time parameter.

Configuration Steps

  1. Set the recovery target in postgresql.conf:
recovery_target_time = '2024-01-15 14:30:00+00:00'
  2. Create a recovery signal file:
touch recovery.signal
  3. Configure the WAL restore command in postgresql.conf:
restore_command = 'cp /path/to/archive/%f %p'
  4. Start PostgreSQL - it will enter recovery mode and replay WAL files up to the specified timestamp.

Timestamp Format

The timestamp must be in the same format as PostgreSQL's timestamp with time zone data type:

  • Use numeric UTC offset: '2024-01-15 14:30:00+00:00'
  • Or full timezone name: '2024-01-15 14:30:00 Europe/Helsinki'
  • Avoid timezone abbreviations unless timezone_abbreviations is pre-configured

Requirements

  • A base backup taken before the target timestamp
  • Continuous WAL archiving enabled
  • WAL archive files covering the period from base backup to target time
  • The recovery target time must be after the base backup's end time

Optional: Control recovery behavior

recovery_target_inclusive = true  # Include the target transaction (default)
recovery_target_action = 'promote'  # Auto-promote after recovery

Sources:

95% confidence
A

Point-in-Time Recovery (PITR) in PostgreSQL requires three phases: enabling WAL archiving, creating base backups, and configuring recovery.

Phase 1: Enable WAL Archiving

Edit postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
  • %p = full path of WAL file to archive
  • %f = filename only
  • Archive command must return exit code 0 on success, non-zero on failure

Restart PostgreSQL after configuration changes.

Phase 2: Create Base Backup

Method 1 (Recommended):

pg_basebackup -D /backup/location -Ft -z -P

Method 2 (Low-level API):

SELECT pg_backup_start(label => 'mybackup', fast => false);

Then copy the data directory (excluding pg_wal/), then:

SELECT * FROM pg_backup_stop(wait_for_archive => true);

Phase 3: Perform Recovery

  1. Stop PostgreSQL server
  2. Clear the data directory
  3. Restore base backup files with correct ownership
  4. Remove contents of pg_wal/ directory
  5. Create recovery.signal file in data directory (empty file)
  6. Configure postgresql.conf:
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2025-01-15 12:00:00'  # optional: for PITR
  7. Start PostgreSQL - it will enter recovery mode automatically
  8. After recovery completes, PostgreSQL removes recovery.signal and resumes normal operation

Recovery Target Options

Specify one of:

  • recovery_target_time = 'timestamp' - recover to specific time
  • recovery_target_name = 'restore_point' - recover to named point
  • recovery_target_xid = 'transaction_id' - recover to transaction ID
  • recovery_target = 'immediate' - recover to end of base backup

Sources:

95% confidence
A

A parallel pg_dump is a database backup operation that uses multiple concurrent jobs to dump tables simultaneously, speeding up the backup process for large PostgreSQL databases.

How it works:

  • Use the -j or --jobs option with the number of parallel workers (e.g., pg_dump -j 4 -Fd -f backup_dir mydb)
  • pg_dump spawns multiple worker processes that dump different tables at the same time
  • Requires the directory output format (-Fd) — this is the only format supporting parallel writes

Requirements:

  • Must use directory format output (-Fd)
  • Database must support synchronized snapshots (PostgreSQL 9.2+ for primary, 10+ for standby)
  • Sufficient max_connections: pg_dump opens njobs + 1 connections to the database

Trade-offs:

  • Faster dumps for large databases with many tables
  • Higher database load during the backup
  • Risk of failure if other clients request exclusive locks during the dump (workers take shared locks with NOWAIT and abort rather than wait)

Example:

pg_dump -j 4 -Fd -f /backup/mydb_dir mydb

This runs 4 parallel workers, creating a directory-format backup in /backup/mydb_dir.

Sources:

95% confidence
A

The archive_command is a shell command executed by PostgreSQL to archive completed WAL (Write-Ahead Log) file segments.

Core Functionality:

  • Executes a local shell command to copy/move WAL segments to archive storage
  • Uses %p placeholder for the full path of the file to archive
  • Uses %f placeholder for just the filename
  • Must return zero exit status on success

Requirements:

  • Only active when archive_mode = on and archive_library = '' (empty)
  • Can only be set in postgresql.conf or on server command line (requires restart)
  • Server will wait and retry if command returns non-zero exit status

Example:

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

This command checks if the file doesn't already exist in the archive directory, then copies it there.

Important Notes:

  • The command should ensure WAL files are not overwritten (use test ! -f check)
  • Setting to empty string while archive_mode = on causes WAL files to accumulate
  • Cannot be used simultaneously with archive_library setting (will raise error)
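
To confirm archiving is actually succeeding, the pg_stat_archiver view can be queried; a quick sketch:

-- Inspect WAL archiving progress and failures
SELECT archived_count,
       last_archived_wal,
       last_archived_time,
       failed_count,
       last_failed_wal
FROM pg_stat_archiver;

A growing failed_count indicates the archive_command is returning a non-zero exit status and WAL is accumulating in pg_wal/.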

Sources:

95% confidence
A

pg_dump is a PostgreSQL command-line utility for extracting a single database into a script file or archive. It creates consistent backups even while the database is actively being used, without blocking other users.

Primary Function:

  • Exports a PostgreSQL database to a file containing SQL commands that can recreate the database in the same state

Key Characteristics:

  • Non-blocking: Does not interfere with concurrent read or write operations
  • Consistent snapshots: Produces logically consistent dumps even during active use
  • Flexible output formats:
    • Plain-text SQL script (default)
    • Custom archive format (-Fc) - compressed, allows selective restore
    • Directory format (-Fd) - supports parallel dumps and restores
    • Tar archive (-Ft)

Basic Usage:

pg_dump dbname > outfile.sql
pg_dump -Fc dbname > outfile.dump  # Custom format
pg_dump -Fd dbname -f dumpdir      # Directory format

Selective Options:

  • --schema-only: Export only table structures, no data
  • --data-only: Export only data, no schemas
  • -t table_name: Export specific table(s)
  • -n schema_name: Export specific schema(s)

Important Note:
For regular production backups, PostgreSQL recommends continuous archiving (WAL archiving) instead of pg_dump, as it's generally more suitable for that purpose.

Sources:

95% confidence
A

WAL-G is an archival and restoration tool for databases in the cloud. It's designed for making encrypted, compressed backups (both full and incremental) of PostgreSQL, MySQL/MariaDB, and MS SQL Server databases and pushing/fetching them to/from remote storage (S3, Azure, Google Cloud Storage, etc.) without saving them to the local filesystem.

Key capabilities:

  • Incremental backups: Delta backups that store only the difference between the previous backup and present state
  • Compression: Supports lz4 (default), lzma, zstd, and brotli
  • Encryption: Built-in encryption support for secure cloud storage
  • Cloud-native: Direct integration with cloud storage providers
  • WAL archiving: Handles PostgreSQL Write-Ahead Log archiving and restoration

WAL-G is particularly useful for automating PostgreSQL backup workflows in cloud environments and reducing storage costs through compression and incremental backups.

Sources:

95% confidence
A

The restore_command is a shell command that PostgreSQL executes to retrieve archived WAL (Write-Ahead Log) segments from storage during recovery operations.

Configuration

Set in postgresql.conf or on the server command line:

restore_command = 'cp /mnt/server/archivedir/%f %p'

Parameter Substitutions

  • %f - Replaced by the name of the WAL file to retrieve from the archive
  • %p - Replaced by the path where PostgreSQL should copy the file to
  • %r - Replaced by the name of the file containing the last valid restart point
  • %% - Literal percent character

Exit Status Requirements

The command must:

  • Return exit status 0 only on success
  • Return nonzero when the requested file doesn't exist in the archive (this is normal behavior, not an error)
  • Handle requests for both WAL segments and .history files

Behavior

  • Required for archive recovery (PITR - Point-in-Time Recovery)
  • Optional for streaming replication
  • If a file is unavailable in the archive, PostgreSQL looks in pg_wal/ directory
  • If the command is terminated by a signal (except SIGTERM) or shell error, recovery aborts
  • Default value is empty string ('')

Sources:

95% confidence
A

pgBackRest is a reliable backup and restore solution designed specifically for PostgreSQL databases. It is an open-source utility that aims to be a simple, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.

Key capabilities:

  • Parallel backup and restore for performance
  • Full, incremental, and differential backup types
  • Archive WAL (Write-Ahead Logs) for point-in-time recovery
  • Multiple repository support (local, Amazon S3, Azure, GCS)
  • Encryption and compression
  • Backup rotation and retention policies
  • Delta restore for faster recovery
  • Page checksums for data integrity verification

pgBackRest is designed to handle PostgreSQL databases from small installations to multi-terabyte environments with minimal configuration complexity.

Sources:

95% confidence

PL/pgSQL > Control Structures

19 questions

Partitioning > Partition-wise Operations

19 questions
A

With enable_partitionwise_aggregate enabled, the number of nodes whose memory usage is restricted by work_mem appearing in the final plan can increase linearly according to the number of partitions being scanned, which can result in a large increase in overall memory consumption during query execution.

95% confidence
A

With enable_partitionwise_join enabled, the number of nodes whose memory usage is restricted by work_mem can increase linearly according to the number of partitions being scanned, which can result in a large increase in overall memory consumption during query execution.

95% confidence

Concurrency Control > Deadlock Detection

19 questions

Extensions > pg_trgm

19 questions

PL/pgSQL > Variables and Types

19 questions
A

When a function returns a polymorphic type (anyelement, anyarray, anynonarray, anyenum, or anyrange), a special parameter $0 is created with its data type being the actual return type of the function, as deduced from the actual input types.
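
A minimal sketch following the pattern in the PL/pgSQL documentation (the function name is illustrative):

-- $0 takes on the concrete return type deduced from the actual arguments
CREATE FUNCTION add_three_values(v1 anyelement, v2 anyelement, v3 anyelement)
RETURNS anyelement AS $$
DECLARE
    result ALIAS FOR $0;   -- same type as the function's actual return type
BEGIN
    result := v1 + v2 + v3;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

-- SELECT add_three_values(1, 2, 3);        -- returns integer
-- SELECT add_three_values(1.0, 2.0, 3.0);  -- returns numeric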

95% confidence

Indexes > Covering Indexes

18 questions

Indexes > Partial Indexes

18 questions
A

A partial index is an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries only for those table rows that satisfy the predicate.
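
For example, a partial index might cover only unbilled orders (table and column names are illustrative):

-- Index only the rows selected by the predicate
CREATE INDEX orders_unbilled_idx ON orders (order_nr)
    WHERE billed IS NOT TRUE;

-- A query that repeats the predicate can use the much smaller index
SELECT * FROM orders WHERE billed IS NOT TRUE AND order_nr < 10000;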

95% confidence

Concurrency Control > Advisory Locks

18 questions

Foreign Key Performance > FK Index Requirements

18 questions

Indexes > Multicolumn Indexes

18 questions
A

Equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will always be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index but do not reduce the portion of the index that has to be scanned.
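
As a sketch, consider a three-column index on a hypothetical table t:

CREATE INDEX t_a_b_c_idx ON t (a, b, c);

-- The scanned range is limited by a = 5 and b >= 42; the condition on c is
-- checked inside the index but does not narrow the portion scanned.
SELECT * FROM t WHERE a = 5 AND b >= 42 AND c < 77;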

95% confidence

Indexes > SP-GiST Indexes

17 questions

Window Functions > Ranking Functions

17 questions

Data Types - Bit String

17 questions

partitioning

17 questions
A

PARTITION BY Clause in PostgreSQL

PARTITION BY is a clause used in window functions to divide the result set into partitions (groups of rows) over which the window function operates independently. Each partition is processed separately, and the window function resets for each new partition.

Syntax

function_name(...) OVER (
    PARTITION BY column1, column2, ...
    [ORDER BY column3, ...]
)

Key Characteristics

  • Scope: Window functions are allowed only in the SELECT list and the ORDER BY clause of a query
  • Purpose: Creates logical groups without reducing rows (unlike GROUP BY which collapses rows)
  • Reset Behavior: Window function calculations reset at each partition boundary
  • Multiple Columns: Can partition by multiple columns (comma-separated)

Concrete Example

-- Rank employees by salary within each department
SELECT 
    employee_name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
FROM employees;

Result behavior:

  • Rankings restart at 1 for each distinct department value
  • Without PARTITION BY, rankings would be across all rows
  • Preserves all original rows (unlike GROUP BY)

Common Window Functions Using PARTITION BY

  • Ranking: ROW_NUMBER(), RANK(), DENSE_RANK()
  • Aggregates: SUM(), AVG(), COUNT(), MIN(), MAX()
  • Offset: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()

PARTITION BY vs GROUP BY

-- GROUP BY: Collapses to one row per department
SELECT department, AVG(salary)
FROM employees
GROUP BY department;

-- PARTITION BY: Keeps all rows, adds calculated column
SELECT employee_name, department, salary,
       AVG(salary) OVER (PARTITION BY department) as dept_avg
FROM employees;

Availability: PostgreSQL 8.4+ (released July 2009)

Source: PostgreSQL Official Documentation - Window Functions

95% confidence
A

Range Partitioning in PostgreSQL

Range partitioning divides a table into partitions based on a specified column's value ranges. Each partition holds rows where the partitioning key falls within a defined range (e.g., dates, numbers).

Syntax (PostgreSQL 10+)

CREATE TABLE measurements (
    id SERIAL,
    sensor_id INT,
    recorded_at DATE NOT NULL,
    temperature NUMERIC
) PARTITION BY RANGE (recorded_at);

-- Create partitions
CREATE TABLE measurements_2024_q1 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE measurements_2024_q2 PARTITION OF measurements
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

Key Specifications

  • Range boundaries: Lower bound is INCLUSIVE, upper bound is EXCLUSIVE ([lower, upper))
  • Partition key: Can be a single column or expression, must be part of any UNIQUE/PRIMARY KEY constraint
  • NULL handling: A row whose partition key is NULL cannot be placed in any range partition; it is routed to the DEFAULT partition if one exists, otherwise the insert fails
  • Default partition (PostgreSQL 11+): Catches rows not matching any partition
CREATE TABLE measurements_default PARTITION OF measurements DEFAULT;

Performance Notes

  • Query planner uses partition pruning to skip irrelevant partitions (plan-time and execution-time pruning since PostgreSQL 11)
  • Indexing: Create indexes on EACH partition individually (not inherited from parent in versions <11)
  • Optimal partition count: 100-1000 partitions maximum for best query planning performance

Version: PostgreSQL 10+ (declarative partitioning syntax shown above)

Source: PostgreSQL Official Documentation - Table Partitioning (v16)

95% confidence
A

PostgreSQL Partitioning Performance Benefits

Partitioning in PostgreSQL provides these verified performance improvements:

1. Query Performance via Partition Pruning

Benefit: Queries scan only relevant partitions, not the entire table.

  • Speedup: 10-100x faster for queries filtering on partition key
  • Mechanism: Query planner excludes irrelevant partitions at plan time
  • Requirement: Query must include partition key in WHERE clause
-- Example: Date-partitioned table
CREATE TABLE measurements (
    logdate DATE NOT NULL,
    value NUMERIC
) PARTITION BY RANGE (logdate);

-- Query only scans jan_2024 partition, not all 12 months
SELECT * FROM measurements WHERE logdate = '2024-01-15';

2. Bulk DELETE Performance

Benefit: DROP TABLE or DETACH PARTITION is 100-1000x faster than DELETE

  • DELETE on 1M rows: Minutes (requires VACUUM afterward)
  • DROP PARTITION: Milliseconds (instant metadata operation)
-- Fast: drops partition instantly
DROP TABLE measurements_2023_01;

-- Slow: scans and deletes rows one by one
DELETE FROM measurements WHERE logdate < '2023-02-01';

3. Index Size Reduction

Benefit: Smaller indexes improve cache hit rates and maintenance speed

  • Impact: Each partition's index is smaller, so B-tree depth is reduced and a larger share of the index stays in cache
  • Example: An index over a 10M-row table split into ten 1M-row partitions gives ten shallower B-trees, so lookups touch fewer pages

4. Maintenance Operations (VACUUM, ANALYZE)

Benefit: Parallel maintenance on independent partitions

  • Can vacuum/analyze individual partitions without locking entire table
  • Autovacuum workers can process multiple partitions concurrently

Key Limitation

Partitioning does NOT improve performance for queries without partition key filtering. Full table scans still read all partitions.

Version Requirement

  • PostgreSQL 10+: Declarative partitioning with partition pruning
  • PostgreSQL 11+: Partition pruning also at execution time, covering parameters in prepared statements
  • PostgreSQL 12+: Faster planning when only a few of many partitions survive pruning

Source: PostgreSQL 17 Official Documentation - Partitioning

95% confidence
A

How Indexes Work on Partitioned Tables in PostgreSQL

PostgreSQL does not automatically inherit indexes from parent to child partitions. Each partition must have its own indexes created independently.

Creating Indexes

When you create an index on a partitioned table (PostgreSQL 11+), it automatically creates corresponding indexes on all existing and future partitions:

CREATE INDEX idx_sales_date ON sales (sale_date);

This creates:

  • A partitioned index on the parent table (metadata only, not a physical index)
  • Individual local indexes on each partition
  • Automatic index creation on any new partitions added later

Key Behaviors

1. Pre-PostgreSQL 11: Indexes must be created manually on each partition. No automatic propagation exists.

2. PostgreSQL 11+: CREATE INDEX on parent automatically propagates to all partitions.

3. Unique Constraints: A unique index on a partitioned table must include the partition key. This is required because PostgreSQL cannot efficiently enforce uniqueness across partitions otherwise.

-- This works (includes partition key)
CREATE UNIQUE INDEX idx_unique ON sales (id, sale_date);

-- This fails if sale_date is the partition key
CREATE UNIQUE INDEX idx_unique ON sales (id);

4. Primary Keys: Same rule applies - must include partition key columns.

5. ATTACH PARTITION: When attaching an existing table, any of its indexes that match the parent's partitioned indexes are attached to them; missing indexes are built during the attach, which can be slow for large tables, so matching indexes are often created beforehand.

CREATE INDEX idx_sales_2024_date ON sales_2024 (sale_date);
ALTER TABLE sales ATTACH PARTITION sales_2024 FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

6. Performance: Partition pruning works with indexes. Queries that filter on partition keys scan only relevant partition indexes, not all of them.

Version: PostgreSQL 11+ (automatic index propagation). For PostgreSQL 10 and earlier, manual index creation per partition required.

Source: PostgreSQL Official Documentation - Table Partitioning (versions 11-16)

95% confidence
A

Yes. PostgreSQL 11 added support for foreign keys defined on partitioned tables, and PostgreSQL 12 added support for foreign keys that reference partitioned tables.

Foreign Keys FROM Partitioned Tables

Since PostgreSQL 11, you can define foreign keys that reference from a partitioned table to another table:

CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id)
) PARTITION BY RANGE (order_id);

The foreign key constraint is automatically applied to all partitions.

Foreign Keys TO Partitioned Tables

Since PostgreSQL 12, you can also define foreign keys that reference a partitioned table's primary key:

CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY
) PARTITION BY RANGE (order_id);

CREATE TABLE order_items (
    item_id BIGINT PRIMARY KEY,
    order_id BIGINT REFERENCES orders(order_id)
);

Critical Limitations

  1. Primary/Unique Key Requirement: The referenced column(s) must include ALL partition key columns. For example:
-- This works - partition key is part of referenced columns
CREATE TABLE orders (
    order_id BIGINT,
    order_date DATE,
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (order_date);

-- This fails - partition key not in referenced columns
CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    order_date DATE
) PARTITION BY RANGE (order_date);
  2. ON DELETE/UPDATE actions (CASCADE, SET NULL, etc.): Supported on these foreign keys.

Source: PostgreSQL 11+ official documentation on Table Partitioning and Declarative Partitioning limitations.

95% confidence
A

PostgreSQL Partitioning Limitations (PostgreSQL 11+)

Hard Limits

Maximum number of partitions: No hard limit, but practical limit is thousands (not tens of thousands). Performance degrades with query planning time proportional to partition count. PostgreSQL 13+ improved this significantly.

Partition key restrictions:

  • Cannot change partition key columns after creation
  • Partition key expressions must be immutable
  • Cannot include generated columns
  • Cannot use system columns

Constraint Limitations

Primary keys and unique constraints must include ALL partition key columns:

-- This WORKS:
CREATE TABLE orders (
    order_id int,
    order_date date,
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (order_date);

-- This FAILS:
CREATE TABLE orders (
    order_id int,
    order_date date,
    PRIMARY KEY (order_id)  -- Missing partition key
) PARTITION BY RANGE (order_date);

Operational Limitations

Foreign keys:

  • PostgreSQL 10: No foreign key support involving partitioned tables
  • PostgreSQL 11+: Foreign keys FROM partitioned tables to other tables work
  • PostgreSQL 12+: Foreign keys TO (referencing) partitioned tables work

Row movement:

  • PostgreSQL 10: UPDATEs that change partition key fail
  • PostgreSQL 11+: Row movement allowed but slower (DELETE + INSERT)

Global indexes: Not supported. Each partition has separate indexes. No single index spans all partitions.

Triggers: Before PostgreSQL 13, BEFORE ROW triggers cannot be created on partitioned tables (only on individual partitions); PostgreSQL 13+ allows them, though they cannot change which partition the row is routed to.

Performance Limitations

Partition pruning requires:

  • WHERE clause directly references partition key with constants or stable expressions
  • Query planner can eliminate partitions at plan time (not always possible with prepared statements using parameters)

Partition-wise join and partition-wise aggregation require PostgreSQL 11+ and are disabled by default (enable_partitionwise_join, enable_partitionwise_aggregate).

Source: PostgreSQL 17 Official Documentation - Table Partitioning (https://www.postgresql.org/docs/current/ddl-partitioning.html)

95% confidence
A

List Partitioning in PostgreSQL

List partitioning is a table partitioning method in PostgreSQL (available since version 10) where rows are distributed to partitions based on explicit lists of key values.

Definition

In list partitioning, you specify which discrete values belong to each partition using the FOR VALUES IN clause. Each partition contains rows where the partition key matches one of the listed values.

Syntax

CREATE TABLE table_name (columns...) 
PARTITION BY LIST (partition_key_column);

CREATE TABLE partition_name PARTITION OF table_name
FOR VALUES IN (value1, value2, value3);

Example

-- Create parent table
CREATE TABLE orders (
    order_id INT,
    country_code TEXT,
    order_date DATE,
    amount NUMERIC
) PARTITION BY LIST (country_code);

-- Create partitions for specific countries
CREATE TABLE orders_usa PARTITION OF orders
FOR VALUES IN ('US', 'USA');

CREATE TABLE orders_europe PARTITION OF orders
FOR VALUES IN ('DE', 'FR', 'UK', 'IT');

CREATE TABLE orders_asia PARTITION OF orders
FOR VALUES IN ('JP', 'CN', 'IN');

Key Characteristics

  • Use case: Best for columns with discrete, known values (status codes, country codes, categories)
  • NULL handling: Create a partition with FOR VALUES IN (NULL) to handle NULL values
  • Default partition: Use DEFAULT keyword for values not in any list: CREATE TABLE orders_other PARTITION OF orders DEFAULT;
  • No overlaps: Value lists across partitions must not overlap (PostgreSQL enforces this)
  • Multiple values: Each partition can contain multiple values in its list

Performance Note

List partitioning enables partition pruning: queries with WHERE country_code = 'US' will scan only the relevant partition, not the entire table.

Source: PostgreSQL 17 Official Documentation - Table Partitioning

95% confidence
A

PostgreSQL constraint_exclusion Setting

constraint_exclusion is a query planner setting that controls whether the PostgreSQL query planner examines table constraints (such as CHECK constraints and table inheritance bounds) to determine if entire tables or partitions can be skipped during query execution.

Valid Values

  • off - Constraints are never examined
  • on - Constraints are examined for all tables
  • partition - Constraints are examined only for inheritance child tables and UNION ALL subqueries (the default)

Default Value

partition

How It Works

When enabled, the planner checks if a query's WHERE clause logically contradicts a table's CHECK constraint or partition bounds, allowing it to skip scanning that table entirely.

Example:

-- Table with CHECK constraint
CREATE TABLE orders_2024 (
    order_date DATE CHECK (order_date >= '2024-01-01' AND order_date < '2025-01-01'),
    amount NUMERIC
) INHERITS (orders);

-- Set constraint_exclusion
SET constraint_exclusion = partition;

-- Query that can skip orders_2024
SELECT * FROM orders WHERE order_date >= '2025-01-01';
-- Planner will exclude orders_2024 because constraint contradicts WHERE clause

Recommended Setting

Use partition (default). Setting to on adds overhead for non-partitioned tables with minimal benefit. Only set to on if you have many tables with CHECK constraints used for manual data filtering.

Configuration

-- Session level
SET constraint_exclusion = partition;

-- postgresql.conf
constraint_exclusion = partition

Source: PostgreSQL 16 Official Documentation - Runtime Config: Query Planning

95% confidence
A

Partition-wise Join in PostgreSQL

Partition-wise join is an optimization technique where PostgreSQL joins partitioned tables by joining corresponding partitions directly, rather than treating them as a single large table. This reduces memory usage and improves parallelization.

How It Works

When joining two partitioned tables with the same partitioning scheme, PostgreSQL can join matching partitions independently:

-- Example: Both tables partitioned by RANGE on a date column
CREATE TABLE sales_2023 PARTITION OF sales
  FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE orders_2023 PARTITION OF orders
  FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

-- With partition-wise join enabled and an equality join on the partition key,
-- this joins sales_2023 with orders_2023, sales_2024 with orders_2024, etc.
SELECT * FROM sales s JOIN orders o ON s.sale_date = o.order_date;

Configuration

Parameter: enable_partitionwise_join
Default: off (as of PostgreSQL 16)
Enable with:

SET enable_partitionwise_join = on;

Or in postgresql.conf:

enable_partitionwise_join = on

Requirements for Activation

  1. Both tables must be partitioned with identical partitioning schemes (same partition bounds and types)
  2. Join condition must include the partition key
  3. Parameter enable_partitionwise_join must be on

Availability

  • Introduced: PostgreSQL 11 (2018), via the enable_partitionwise_join parameter
  • Aggregate support: PostgreSQL 11 also added partition-wise grouping/aggregation (enable_partitionwise_aggregate)

Performance Impact

Benefits: Smaller per-partition memory footprint, better parallelization (each partition pair can be joined by separate workers).
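
To verify the optimization was applied, inspect the plan; with a partition-wise plan, the per-partition joins appear as children of an Append node (a rough sketch using the tables above):

SET enable_partitionwise_join = on;

EXPLAIN (COSTS OFF)
SELECT * FROM sales s JOIN orders o ON s.sale_date = o.order_date;
-- A partition-wise plan shows an Append whose children join matching partition
-- pairs (e.g. sales_2023 with orders_2023) instead of one join of the parent tables.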

Source: PostgreSQL 16 Official Documentation - Query Planning

95% confidence
A

PostgreSQL Partitioning Methods

PostgreSQL supports 3 native partitioning methods (as of PostgreSQL 10+):

1. Range Partitioning

Partition by a range of values (e.g., date ranges, numeric ranges).

CREATE TABLE measurements (
    logdate date NOT NULL,
    value int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_y2024 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

2. List Partitioning

Partition by discrete list of values (e.g., country codes, categories).

CREATE TABLE orders (
    order_id int,
    country_code text
) PARTITION BY LIST (country_code);

CREATE TABLE orders_us PARTITION OF orders
    FOR VALUES IN ('US', 'USA');

3. Hash Partitioning

Partition by hash of a column value for even distribution (PostgreSQL 11+).

CREATE TABLE users (
    user_id bigint,
    username text
) PARTITION BY HASH (user_id);

CREATE TABLE users_p0 PARTITION OF users
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);

Key Requirements

  • Partition key must be part of all unique constraints/primary keys
  • Maximum partitions: no hard limit, but 100-1000 partitions is practical
  • Partitioning must be declared at table creation with PARTITION BY

Source: PostgreSQL Official Documentation v16 - Table Partitioning (https://www.postgresql.org/docs/current/ddl-partitioning.html)

95% confidence
A

Partition-wise Aggregation in PostgreSQL

Partition-wise aggregation is a query optimization technique where PostgreSQL pushes down aggregate calculations (like SUM, COUNT, AVG) to individual partitions and then combines the results, rather than aggregating data from all partitions together.

How It Works

When enabled, PostgreSQL:

  1. Computes partial aggregates on each partition independently
  2. Combines (finalizes) these partial results into the final aggregate

This enables parallel processing of partitions and can significantly improve performance for aggregations on partitioned tables.

Configuration

Controlled by the enable_partitionwise_aggregate parameter (default: off):

-- Enable partition-wise aggregation
SET enable_partitionwise_aggregate = on;

-- Or in postgresql.conf
enable_partitionwise_aggregate = on

Requirements

  • PostgreSQL version: 11+ (introduced in PostgreSQL 11)
  • Full partition-wise aggregation requires the GROUP BY clause to include all partition key columns; otherwise PostgreSQL performs partial aggregation per partition and combines the results
  • Works with declarative partitioning (not table inheritance)
  • Compatible aggregates: SUM, COUNT, AVG, MIN, MAX, and other combinable aggregates

Example

-- Partitioned table
CREATE TABLE sales (
    sale_date DATE,
    region TEXT,
    amount NUMERIC
) PARTITION BY RANGE (sale_date);

-- Query using partition-wise aggregation
SET enable_partitionwise_aggregate = on;

SELECT region, SUM(amount)
FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY region;

With partition-wise aggregation enabled, PostgreSQL computes SUM(amount) per region on each partition, then combines results.

Source

PostgreSQL 17 Official Documentation: Partition Pruning and Constraint Exclusion

95% confidence
A

pg_partman Extension

pg_partman is a PostgreSQL extension that automates the creation and management of time-based and serial-based table partitions. It was developed by Keith Fiske and is available at https://github.com/pgpartman/pg_partman.

Core Functionality:

  • Automatically creates new partitions before they're needed
  • Removes or archives old partitions based on retention policies
  • Supports both native PostgreSQL partitioning (10+) and trigger-based inheritance (legacy)
  • Handles time-based (daily, weekly, monthly, yearly, hourly) and serial/ID-based partitioning

Installation:

CREATE EXTENSION pg_partman;

Basic Setup Example (time-based partitioning):

-- Create parent table (PostgreSQL 10+)
CREATE TABLE measurements (
    time TIMESTAMPTZ NOT NULL,
    device_id INT,
    value NUMERIC
) PARTITION BY RANGE (time);

-- Initialize partman (creates 4 partitions ahead, keeps 4 partitions back)
SELECT partman.create_parent(
    p_parent_table := 'public.measurements',
    p_control := 'time',
    p_type := 'native',
    p_interval := 'daily',
    p_premake := 4
);

Maintenance Function:

-- Call this via cron/pg_cron to maintain partitions
SELECT partman.run_maintenance_proc();

Key Configuration Parameters:

  • p_interval: Partition interval (e.g., '1 day', '1 week', '1 month', '100000' for serial)
  • p_premake: Number of partitions to create in advance (default: 4)
  • p_retention: How far back to keep partitions (e.g., '30 days')
  • p_retention_keep_table: Boolean for whether to drop or just detach old partitions

Requirements:

  • PostgreSQL 9.4+ (for trigger-based), 10+ recommended (for native partitioning)
  • Requires schema partman (automatically created with extension)
  • Needs periodic execution of run_maintenance_proc() (recommended: every hour or more frequently)

Source: Official repository at https://github.com/pgpartman/pg_partman and PostgreSQL Extension Network (PGXN)

95% confidence
A

Partition Pruning in PostgreSQL

Partition pruning is PostgreSQL's query optimization technique that automatically excludes (skips scanning) partitions that cannot contain rows matching the query's WHERE clause conditions. This significantly improves query performance on partitioned tables by reducing the amount of data scanned.

How It Works

When you query a partitioned table with filters on the partition key, the planner analyzes the WHERE clause and determines which partitions could possibly contain matching rows. Partitions that provably cannot contain matches are excluded from the query plan.

Example

-- Create partitioned table
CREATE TABLE measurements (
    id bigint,
    logdate date NOT NULL,
    value numeric
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2024_01 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE measurements_2024_02 PARTITION OF measurements
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Query with partition pruning
SELECT * FROM measurements WHERE logdate = '2024-01-15';
-- Only scans measurements_2024_01, prunes measurements_2024_02

Requirements for Pruning

  1. Pruning enabled: Set enable_partition_pruning = on (default since PostgreSQL 11)
  2. Partition key in WHERE clause: Query must filter on partition key columns
  3. Immutable operators: Use operators that can be evaluated at plan time (=, <, >, BETWEEN, IN)

Verification

Check if pruning occurred using EXPLAIN:

EXPLAIN SELECT * FROM measurements WHERE logdate = '2024-01-15';
-- Look for "Partitions removed: N" in output

Version: Available since PostgreSQL 11 (2018). PostgreSQL 11+ performs pruning at both plan time and execution time (for prepared statements with parameters).

Source: PostgreSQL 17 Documentation - Partition Pruning

95% confidence
A

Yes, you can add indexes to partitioned tables in PostgreSQL (version 11+).

How it works:

  • Create an index on the partitioned table (parent) using standard CREATE INDEX
  • PostgreSQL automatically creates matching indexes on all existing partitions
  • Future partitions inherit the index definition automatically

Example:

-- Create index on partitioned table
CREATE INDEX idx_sales_date ON sales (sale_date);

-- This automatically creates indexes on all partitions:
-- sales_2023_q1, sales_2023_q2, etc.

Critical details:

  • Version requirement: PostgreSQL 11+ (automatic index propagation introduced in 11)
  • Index types supported: All standard index types (B-tree, Hash, GiST, GIN, BRIN)
  • CONCURRENTLY limitation: CREATE INDEX CONCURRENTLY does NOT work on partitioned tables directly. You must create indexes on each partition individually with CONCURRENTLY if you need non-blocking creation.
  • Unique indexes: Must include the partition key column(s) to work across partitions

For concurrent index creation:

-- Create index on each partition individually
CREATE INDEX CONCURRENTLY idx_sales_2023_q1_date ON sales_2023_q1 (sale_date);
CREATE INDEX CONCURRENTLY idx_sales_2023_q2_date ON sales_2023_q2 (sale_date);
-- Then create on parent (fast, already exists on partitions)
CREATE INDEX idx_sales_date ON sales (sale_date);

Source: PostgreSQL 16 Official Documentation - Table Partitioning, Section 5.11.2.3

95% confidence
A

Attaching an Existing Table as a Partition in PostgreSQL

Use the ALTER TABLE ... ATTACH PARTITION command. The existing table must match the parent table's structure and satisfy the partition constraint.

Syntax

ALTER TABLE parent_table_name
ATTACH PARTITION existing_table_name FOR VALUES partition_bound_spec;

Requirements (PostgreSQL 10+)

  1. Identical structure: Column names, types, and order must match the parent table exactly
  2. Constraints: All NOT NULL and CHECK constraints on the parent must exist on the child
  3. Partition constraint: Existing data must satisfy the partition boundary (PostgreSQL will validate this)
  4. Ownership: You must own both tables

Examples

Range partition:

-- Attach table as partition for January 2024
ALTER TABLE measurements
ATTACH PARTITION measurements_jan2024 FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

List partition:

-- Attach table as partition for specific regions
ALTER TABLE sales
ATTACH PARTITION sales_west FOR VALUES IN ('CA', 'OR', 'WA');

Hash partition (PostgreSQL 11+):

-- Attach as hash partition with modulus 4, remainder 0
ALTER TABLE orders
ATTACH PARTITION orders_p0 FOR VALUES WITH (MODULUS 4, REMAINDER 0);

Locking and Validation

ATTACH PARTITION acquires a SHARE UPDATE EXCLUSIVE lock on the partitioned parent (PostgreSQL 12+; earlier versions lock it more heavily) plus an ACCESS EXCLUSIVE lock on the table being attached, and it scans the table to verify that every row satisfies the partition constraint. To avoid that full-table scan on a large table, add a CHECK constraint matching the partition bound before attaching; PostgreSQL then skips the validation scan, and the constraint can be dropped afterwards. Note that only DETACH PARTITION supports a CONCURRENTLY option (PostgreSQL 14+); there is no ATTACH PARTITION ... CONCURRENTLY. See the sketch below.
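
A sketch of the documented pattern for large tables, assuming the partition key column is named logdate and is declared NOT NULL (names and bounds are illustrative):

-- 1. Add a CHECK constraint matching the future partition bound
ALTER TABLE measurements_jan2024
    ADD CONSTRAINT jan2024_bounds
    CHECK (logdate >= DATE '2024-01-01' AND logdate < DATE '2024-02-01');

-- 2. Attach: the constraint already proves every row fits, so the
--    validation scan is skipped
ALTER TABLE measurements
    ATTACH PARTITION measurements_jan2024
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- 3. Drop the now-redundant CHECK constraint
ALTER TABLE measurements_jan2024 DROP CONSTRAINT jan2024_bounds;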

Source: PostgreSQL 17 official documentation, Section 5.11.2.2 (Partition Maintenance)

95% confidence
A

Moving Data Between Partitions in PostgreSQL

Direct Answer: You cannot directly "move" a row between partitions. Instead, you must DELETE from the source partition and INSERT into the target partition within a transaction, or use UPDATE to change the partition key value (PostgreSQL 11+).

Method 1: UPDATE Partition Key (PostgreSQL 11+)

PostgreSQL 11 and later automatically moves rows when you update the partition key:

BEGIN;
UPDATE parent_table 
SET partition_key_column = new_value 
WHERE id = 123;
COMMIT;

This internally deletes from the old partition and inserts into the new one. The operation is atomic within the transaction.

No special configuration is required: automatic row movement on UPDATE of the partition key is built into PostgreSQL 11 and later.

Method 2: Explicit DELETE + INSERT

For earlier versions or explicit control:

BEGIN;
WITH moved_rows AS (
    DELETE FROM parent_table 
    WHERE id = 123 
    RETURNING *
)
INSERT INTO parent_table 
SELECT * FROM moved_rows;
COMMIT;

The RETURNING clause captures deleted data, and the insert statement routes it to the correct partition based on partition key.

Bulk Move Between Partitions

For moving many rows (e.g., repartitioning):

BEGIN;
WITH moved AS (
    DELETE FROM parent_table 
    WHERE partition_key_column BETWEEN old_start AND old_end
    RETURNING *
)
INSERT INTO parent_table SELECT * FROM moved;
COMMIT;

Performance Note: For large datasets (>100k rows), consider batching in chunks of 10,000-50,000 rows to avoid long-running transactions.

Source

PostgreSQL 11+ Documentation: Table Partitioning - Row Movement

95% confidence
A

PostgreSQL Query Optimizer and Partition Handling

PostgreSQL uses constraint exclusion and partition pruning (PostgreSQL 11+) to optimize queries on partitioned tables.

Partition Pruning (PostgreSQL 11+)

The optimizer automatically eliminates partitions at plan time and execution time when WHERE clause conditions don't match partition constraints.

Requirements for partition pruning:

  • enable_partition_pruning = on (default in PostgreSQL 11+)
  • WHERE clause uses partition key columns
  • Operators: =, <, >, <=, >=, BETWEEN, IN
  • For list partitioning: equality or IN operators
  • For range partitioning: comparison operators

Example:

-- Table partitioned by range on created_date
CREATE TABLE orders (
    id int,
    created_date date,
    amount numeric
) PARTITION BY RANGE (created_date);

CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- This query will prune orders_2024_q2 partition
SELECT * FROM orders WHERE created_date = '2024-02-15';

Execution-Time Pruning (PostgreSQL 11+)

When partition keys depend on runtime parameters or subqueries, pruning happens during execution:

-- Pruning occurs at execution time
PREPARE stmt (date) AS
    SELECT * FROM orders WHERE created_date = $1;

EXECUTE stmt('2024-02-15');

Performance Characteristics

  • Plan time: O(log N) for range/hash partitions using binary search
  • Overhead per partition: ~40 bytes of planner memory (PostgreSQL 14+)
  • Maximum effective partitions: ~1000-2000 (beyond this, overhead increases)

Verify Pruning with EXPLAIN

EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM orders WHERE created_date = '2024-02-15';

Output shows only scanned partitions:

Seq Scan on orders_2024_q1
  Filter: (created_date = '2024-02-15')

Configuration

  • constraint_exclusion = partition (default) - applies constraint exclusion to inheritance child tables and UNION ALL subqueries; declarative partitions are handled by partition pruning instead
  • enable_partition_pruning = on (default) - enables partition pruning

Source: PostgreSQL 17 Official Documentation - Partitioning (Section 5.11)

95% confidence

Data Types - Binary

17 questions

Data Types - Monetary

17 questions
A

If lc_monetary is set to the empty string (which is the default), then the value is inherited from the execution environment of the server in a system-dependent way. More specifically, lc_monetary is usually set in postgresql.conf by initdb using a value derived from the server environment.
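
For example, the effective value can be inspected or overridden per session (the locale name is an assumption and must be installed on the server):

SHOW lc_monetary;                    -- display the current setting
SET lc_monetary = 'en_US.UTF-8';     -- per-session override
SELECT 1234.56::money;               -- output formatted according to lc_monetary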

95% confidence
A

No, PostgreSQL does not support direct multiplication of two money type values because that would not typically yield a meaningful monetary value. Multiplication should only be performed with constants (e.g., multiplying a price by a quantity).
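
A quick illustration of what is and is not allowed:

-- Allowed: money multiplied by an ordinary number (price * quantity)
SELECT '12.34'::money * 3;

-- Not allowed: money multiplied by money
SELECT '12.34'::money * '2.00'::money;
-- ERROR:  operator does not exist: money * money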

Sources
95% confidence

Foreign Key Performance > Cascading Actions Performance

17 questions

PostGIS > Coordinate Systems

17 questions
A

SRID 3857 is named 'WGS 84 / Pseudo-Mercator' or Web Mercator, and is the map projection used by Google Maps, Bing Maps, OpenStreetMap, and almost any web world map. The unit of measure for SRID 3857 is meters.
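
For instance, the definition can be looked up in the spatial_ref_sys table, or a WGS 84 point can be reprojected into Web Mercator metres (coordinates are illustrative):

-- Inspect the projection definition
SELECT srtext FROM spatial_ref_sys WHERE srid = 3857;

-- Reproject a lon/lat (SRID 4326) point to SRID 3857
SELECT ST_AsText(
         ST_Transform(ST_SetSRID(ST_MakePoint(-71.06, 42.34), 4326), 3857)
       );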

Sources
95% confidence

Window Functions > Window Frame Specifications

16 questions

Streaming Replication > Cascading Replication

16 questions

extensions

15 questions
A

Use the CREATE EXTENSION command to install an extension in PostgreSQL:

CREATE EXTENSION extension_name;

Prerequisites:

  • Extension files must be installed on the PostgreSQL server first
  • Check available extensions: SELECT * FROM pg_available_extensions;

Permissions:

  • Most extensions require superuser privileges
  • "Trusted" extensions can be installed by any user with CREATE privilege on the database

Common options:

-- Install only if not already present
CREATE EXTENSION IF NOT EXISTS extension_name;

-- Install in specific schema
CREATE EXTENSION extension_name SCHEMA schema_name;

-- Install specific version
CREATE EXTENSION extension_name VERSION '1.0';

-- Auto-install dependencies
CREATE EXTENSION extension_name CASCADE;

Example:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

The user running CREATE EXTENSION becomes the extension owner. Objects created by the extension are typically owned by this user.

Sources:

95% confidence
A

pg_cron is a PostgreSQL extension that provides cron-based job scheduling directly inside the database. It allows you to schedule periodic PostgreSQL commands using standard cron syntax (e.g., '0 2 * * *' for daily at 2am).

Key characteristics:

  • Runs as a background worker process inside PostgreSQL (requires PostgreSQL 10 or higher)
  • Uses standard cron syntax for scheduling
  • Executes SQL commands at scheduled intervals
  • Runs multiple jobs in parallel, but at most one instance of each job at a time
  • Supports interval-based scheduling in seconds (e.g., '30 seconds')
  • Must be loaded via shared_preload_libraries in postgresql.conf
  • Created using CREATE EXTENSION pg_cron

Common use cases include scheduling maintenance tasks, periodic aggregations, and data cleanup jobs.
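
A minimal sketch, assuming the extension has been preloaded and created (job name and command are illustrative):

-- Schedule a nightly job at 02:00 using standard cron syntax
SELECT cron.schedule('nightly-vacuum', '0 2 * * *', 'VACUUM ANALYZE');

-- List configured jobs
SELECT jobid, jobname, schedule, command FROM cron.job;

-- Remove the job by name
SELECT cron.unschedule('nightly-vacuum');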

Sources:

95% confidence
A

pgvector

pgvector is an open-source PostgreSQL extension that adds vector similarity search capabilities to PostgreSQL, enabling storage and querying of high-dimensional vectors (embeddings).

Core capabilities:

  • Store vector embeddings directly in PostgreSQL columns
  • Perform similarity searches using distance metrics (L2 distance, inner product, cosine distance)
  • Index vectors for fast approximate nearest neighbor search
  • Supports exact and approximate nearest neighbor queries

Supported vector types:

  • vector - stores vectors with configurable dimensions (e.g., vector(1536) for OpenAI embeddings)
  • Supports half-precision (halfvec), binary (bit), and sparse vectors (sparsevec)

Index types:

  • HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search
  • IVFFlat for partitioned vector search

Requirements:

  • PostgreSQL 13 or higher

Typical use cases:

  • Semantic search applications
  • Recommendation systems
  • AI/ML applications using embeddings from models like OpenAI, Cohere, or custom models
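
For illustration, a minimal usage sketch based on the extension's documented workflow (table name and dimension are arbitrary; HNSW indexes require pgvector 0.5.0+):

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    embedding vector(3)
);

INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Nearest neighbour by L2 distance (the <-> operator)
SELECT id FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 1;

-- Optional approximate-search index
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);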

Sources:

95% confidence
A

postgres_fdw is a PostgreSQL extension that provides a foreign data wrapper for accessing data stored in external PostgreSQL servers. It allows you to query remote PostgreSQL databases as if the tables were local.

What it does:

  • Creates foreign tables that reference tables on remote PostgreSQL servers
  • Executes SELECT, INSERT, UPDATE, DELETE, COPY, and TRUNCATE operations on remote data
  • Pushes WHERE clauses and joins to the remote server for execution (reducing data transfer)
  • Provides more transparent and standards-compliant syntax than older methods like dblink
  • Enables querying across multiple PostgreSQL databases from a single connection

Setup workflow:

  1. Install: CREATE EXTENSION postgres_fdw;
  2. Define remote server: CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'remote_host', dbname 'remote_db', port '5432');
  3. Create user mapping: CREATE USER MAPPING FOR local_user SERVER foreign_server OPTIONS (user 'remote_user', password 'password');
  4. Create foreign table: CREATE FOREIGN TABLE foreign_table (...) SERVER foreign_server OPTIONS (schema_name 'public', table_name 'remote_table');

Sources:

95% confidence
A

The pg_trgm similarity threshold is 0.3 by default.

The pg_trgm.similarity_threshold parameter controls when the % operator considers two strings similar. It must be between 0 and 1 (inclusive), where:

  • 0 = any strings match (most permissive)
  • 1 = only identical strings match (most strict)

You can set it per session:

SET pg_trgm.similarity_threshold = 0.5;

Or query the current value:

SHOW pg_trgm.similarity_threshold;

Related thresholds:

  • pg_trgm.word_similarity_threshold: 0.6 (default) - for <% and %> operators
  • pg_trgm.strict_word_similarity_threshold: 0.5 (default) - for <<% and %>> operators
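
A short sketch of the threshold in action (table and data are hypothetical; a trigram index lets the % operator use an index scan):

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE products (name text);
CREATE INDEX products_name_trgm_idx ON products USING gin (name gin_trgm_ops);

-- Returns rows whose similarity to 'postgres' exceeds pg_trgm.similarity_threshold
SELECT name, similarity(name, 'postgres') AS sim
FROM products
WHERE name % 'postgres'
ORDER BY sim DESC;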

Sources:

95% confidence
A

TimescaleDB is a PostgreSQL extension that turns PostgreSQL into a time-series database for high-performance real-time analytics. It is packaged as a Postgres extension, meaning it extends PostgreSQL's native capabilities rather than being a separate database system.

Key characteristics:

  • Built as a native PostgreSQL extension (installed via CREATE EXTENSION timescaledb)
  • Optimized for time-series data workloads (IoT sensors, application metrics, financial data, etc.)
  • Provides automatic partitioning through "hypertables" that look like standard PostgreSQL tables
  • Offers faster time-based queries, data compression, continuous aggregates, and data retention policies
  • Maintains full SQL compatibility while adding time-series-specific functions
  • Designed for high-volume data ingest and complex analytical queries

Installation:

CREATE EXTENSION IF NOT EXISTS timescaledb;

TimescaleDB extends PostgreSQL without replacing it, allowing you to use all standard PostgreSQL features alongside specialized time-series optimizations.
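
A brief sketch of the typical first step, converting an ordinary table into a hypertable (table and column names are illustrative; the exact API varies by TimescaleDB version):

CREATE TABLE conditions (
    time        TIMESTAMPTZ NOT NULL,
    device_id   TEXT,
    temperature DOUBLE PRECISION
);

-- Let TimescaleDB manage time-based chunking of the table
SELECT create_hypertable('conditions', 'time');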

Sources:

95% confidence
A

PostGIS provides the following spatial data types:

Primary Spatial Data Types

geometry - Represents spatial features using planar (Cartesian) coordinate systems. Supports various subtypes: Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.

geography - Represents spatial features using geodetic (ellipsoidal/spherical) coordinate systems for accurate distance and area calculations on Earth's surface.

Bounding Box Types

box2d - A 2-dimensional bounding box containing xmin, ymin, xmax, ymax values.

box3d - A 3-dimensional bounding box containing xmin, ymin, zmin, xmax, ymax, zmax values.

Utility Types

geometry_dump - A composite type with two fields: geom (holding a geometry object) and path[] (a 1-dimensional array holding the position of the geometry within the dumped object). Used for decomposing complex geometries.

box2df - A 2D float precision bounding box used internally for indexing operations.
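
As an illustration of declaring and populating columns of these types (table name and SRID are arbitrary):

CREATE TABLE landmarks (
    id   serial PRIMARY KEY,
    name text,
    geom geometry(Point, 4326),    -- planar geometry with a declared SRID
    geog geography(Point, 4326)    -- geodetic counterpart for distance/area on the ellipsoid
);

INSERT INTO landmarks (name, geom, geog)
VALUES ('Sample point',
        ST_SetSRID(ST_MakePoint(-71.06, 42.34), 4326),
        ST_SetSRID(ST_MakePoint(-71.06, 42.34), 4326)::geography);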

Sources:

95% confidence
A

pg_stat_statements provides 47 metrics tracking SQL planning and execution statistics. Key metric categories:

Identification:

  • userid - OID of user executing statement
  • dbid - OID of database
  • queryid - Hash code for normalized query
  • query - Statement text
  • toplevel - Boolean (top-level vs nested)

Planning:

  • plans - Number of planning operations
  • total_plan_time, min_plan_time, max_plan_time, mean_plan_time, stddev_plan_time - Planning duration (milliseconds)

Execution:

  • calls - Number of executions
  • total_exec_time, min_exec_time, max_exec_time, mean_exec_time, stddev_exec_time - Execution duration (milliseconds)
  • rows - Total rows retrieved/affected

Buffer Cache:

  • shared_blks_hit, shared_blks_read, shared_blks_dirtied, shared_blks_written - Shared buffer statistics
  • local_blks_hit, local_blks_read, local_blks_dirtied, local_blks_written - Local buffer statistics
  • temp_blks_read, temp_blks_written - Temporary block statistics

I/O Timing:

  • shared_blk_read_time, shared_blk_write_time - Shared block I/O duration
  • local_blk_read_time, local_blk_write_time - Local block I/O duration
  • temp_blk_read_time, temp_blk_write_time - Temp block I/O duration

WAL (Write-Ahead Log):

  • wal_records, wal_fpi, wal_bytes, wal_buffers_full - WAL generation statistics

JIT Compilation:

  • jit_functions, jit_generation_time - JIT compilation counts/timing
  • jit_inlining_count, jit_inlining_time - Inlining metrics
  • jit_optimization_count, jit_optimization_time - Optimization metrics
  • jit_emission_count, jit_emission_time - Code emission metrics
  • jit_deform_count, jit_deform_time - Tuple deforming metrics

Parallel Execution:

  • parallel_workers_to_launch, parallel_workers_launched - Worker thread counts

Timestamps:

  • stats_since, minmax_stats_since - Statistics collection start times

All timing metrics are in milliseconds. The view contains one row per distinct (database, user, query) combination.
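
For example, a common diagnostic query ranks statements by cumulative execution time (the columns shown exist from PostgreSQL 13 onward):

SELECT query,
       calls,
       total_exec_time,
       mean_exec_time,
       rows,
       shared_blks_hit,
       shared_blks_read
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;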

Sources:

95% confidence
A

The citext extension provides a case-insensitive character string type for PostgreSQL. It is essentially equivalent to the text data type, but performs case-insensitive comparisons by internally calling lower() on values when comparing them.

Key characteristics:

  • Case-insensitive comparisons: 'ABC' equals 'abc' when using citext
  • Preserves original case: Stores data in its original form
  • Trusted extension: Can be installed by non-superusers with CREATE privilege
  • Performance consideration: Less efficient than text because it must make copies and convert to lowercase for comparisons

Enable with:

CREATE EXTENSION citext;

Example usage:

CREATE TABLE users (
    email citext PRIMARY KEY
);

-- Case-insensitive matching
SELECT * FROM users WHERE email = '[email protected]';

Note: PostgreSQL recommends considering nondeterministic collations instead of citext for modern applications, as they offer more flexible case-insensitive and accent-insensitive comparison options.

Sources:

95% confidence
A

To query foreign PostgreSQL servers, use the postgres_fdw extension following these five steps:

1. Install the extension:

CREATE EXTENSION postgres_fdw;

2. Create a foreign server object (defines the remote database):

CREATE SERVER foreign_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '192.83.123.89', port '5432', dbname 'foreign_db');

3. Create a user mapping (provides credentials):

CREATE USER MAPPING FOR local_user
    SERVER foreign_server
    OPTIONS (user 'foreign_user', password 'password');

4. Create a foreign table (local representation of remote table):

CREATE FOREIGN TABLE foreign_table (
    id integer NOT NULL,
    data text
)
    SERVER foreign_server
    OPTIONS (schema_name 'some_schema', table_name 'some_table');

Important: Column names must match the remote table's columns (you may declare only a subset of them), and the declared types should be compatible with the remote column types.

5. Query the foreign table like any local table:

SELECT * FROM foreign_table;

You can also use INSERT, UPDATE, DELETE, COPY, and TRUNCATE on foreign tables.

Alternative: Import entire schemas automatically:

IMPORT FOREIGN SCHEMA remote_schema
    FROM SERVER foreign_server
    INTO local_schema;

Sources:

95% confidence
A

Based on the authoritative PostGIS documentation, here's how to perform geospatial queries:

Geospatial Queries with PostGIS

PostGIS provides spatial relationship functions that work directly in SQL queries. These automatically use spatial indexes when available.

Common Spatial Relationship Queries:

ST_Contains - Test if one geometry contains another:

SELECT name FROM neighborhoods 
WHERE ST_Contains(geom, ST_SetSRID(ST_Point(-71.06, 42.34), 4326));

ST_Intersects - Find geometries that intersect:

SELECT a.name, b.name 
FROM roads a, buildings b 
WHERE ST_Intersects(a.geom, b.geom);

ST_Within - Find geometries within another:

SELECT name FROM parks 
WHERE ST_Within(geom, (SELECT geom FROM city WHERE name = 'Boston'));

ST_Distance - Calculate distance between geometries (a plain ST_Distance filter cannot use a spatial index; prefer ST_DWithin below for radius searches):

SELECT name FROM places 
WHERE ST_Distance(geom, ST_SetSRID(ST_Point(-71.06, 42.34), 4326)) < 1000;

ST_DWithin - Optimized distance query (uses spatial index):

SELECT name FROM places 
WHERE ST_DWithin(geom, ST_SetSRID(ST_Point(-71.06, 42.34), 4326), 1000);

Spatial Index Usage:

Create a spatial index for performance:

CREATE INDEX spatial_idx ON table_name USING GIST(geom);

Spatial predicates (ST_Contains, ST_Intersects, ST_Within, ST_Overlaps, ST_Crosses, ST_Touches, ST_Disjoint, ST_Equals, ST_Covers, ST_CoveredBy, ST_ContainsProperly) automatically use spatial indexes - no need to manually add the && operator.

Sources:

95% confidence
A

How to Use Trigram Similarity Searches in PostgreSQL

1. Enable the extension:

CREATE EXTENSION pg_trgm;

2. Use the similarity function:

SELECT similarity('word', 'two words') AS score;
-- Returns a value between 0 (no similarity) and 1 (identical)

3. Use the % operator for similarity matching:

SELECT * FROM products 
WHERE name % 'PostgreSQL';
-- Returns rows where similarity exceeds threshold (default 0.3)

4. Order results by similarity:

SELECT name, similarity(name, 'PostgreSQL') AS score
FROM products
WHERE name % 'PostgreSQL'
ORDER BY score DESC;

5. Create an index for fast similarity searches:

-- GiST index
CREATE INDEX name_trgm_idx ON products USING GIST (name gist_trgm_ops);

-- GIN index (faster for static data)
CREATE INDEX name_trgm_idx ON products USING GIN (name gin_trgm_ops);

6. Use distance operator for nearest matches:

SELECT name, name <-> 'PostgreSQL' AS distance
FROM products
ORDER BY distance
LIMIT 10;

Key Functions:

  • similarity(text1, text2) - Returns similarity score (0-1)
  • word_similarity(text1, text2) - Similarity to continuous text portions
  • show_trgm(text) - Shows all trigrams for debugging

Adjust similarity threshold:

SET pg_trgm.similarity_threshold = 0.5;  -- Default is 0.3

Sources:

95% confidence
A

CREATE EXTENSION loads an extension into the current database. It executes the extension's script file to create new SQL objects such as functions, data types, operators, tables, and other database objects.

Syntax:

CREATE EXTENSION [ IF NOT EXISTS ] extension_name
    [ WITH ] [ SCHEMA schema_name ]
             [ VERSION version ]
             [ CASCADE ]

Key Parameters:

  • IF NOT EXISTS - Suppresses errors if the extension is already installed
  • extension_name - Name of the extension to install
  • SCHEMA schema_name - Schema to contain the extension's objects (must already exist)
  • VERSION version - Specific version to install (defaults to control file value)
  • CASCADE - Automatically installs dependent extensions

Privileges: Requires superuser privileges by default, unless the extension is marked as "trusted" in its control file (in which case CREATE privilege on the database is sufficient).

Example:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION hstore SCHEMA public;
CREATE EXTENSION postgis CASCADE;

Sources:

95% confidence
A

Based on the official pgvector GitHub repository, here's how to perform vector similarity search in PostgreSQL:

Installation and Setup

  1. Install the pgvector extension (available via PostgreSQL APT/Yum repositories)
  2. Enable the extension in your database:
CREATE EXTENSION vector;

Creating Tables with Vector Columns

CREATE TABLE items (
  id bigserial PRIMARY KEY,
  embedding vector(3)
);

The vector(3) specifies a 3-dimensional vector. Adjust the dimension as needed.

Performing Similarity Searches

pgvector supports three distance operators:

  • <-> - L2 distance (Euclidean)
  • <#> - Inner product (negative, so larger values = less similar)
  • <=> - Cosine distance

Example queries:

-- Find 5 nearest neighbors using L2 distance
SELECT * FROM items 
ORDER BY embedding <-> '[3,1,2]' 
LIMIT 5;

-- Find similar items using cosine distance
SELECT * FROM items 
ORDER BY embedding <=> '[3,1,2]' 
LIMIT 5;

-- Find similar items using inner product
SELECT * FROM items 
ORDER BY embedding <#> '[3,1,2]' 
LIMIT 5;

Adding Indexes for Performance

HNSW index (recommended for most cases):

CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);

IVFFlat index:

CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops);

Replace vector_l2_ops with vector_ip_ops for inner product or vector_cosine_ops for cosine distance.

By default, pgvector performs exact nearest neighbor search with perfect recall. Indexes enable approximate nearest neighbor search, trading some recall for speed.

Sources:

95% confidence
A

Use TimescaleDB hypertables when:

  • You have time-series data requiring real-time analytics and continuous inserts
  • You need automatic chunk management without manual intervention
  • You require non-blocking operations for adding/dropping partitions (TimescaleDB takes only access share locks on the parent table, allowing concurrent reads/writes)
  • You want time-series specific optimizations (automatic time-descending indexes, compression, data retention policies)
  • Query performance matters more than simplicity (TimescaleDB shows 1000x faster queries in some time-series workloads)

Use PostgreSQL native partitioning when:

  • Your table size exceeds physical memory of the database server (rule of thumb from PostgreSQL docs)
  • You need full control over partition strategy (range, list, hash)
  • Most heavily accessed rows are in a single partition or small number of partitions
  • You want no additional extensions or dependencies
  • Workload is data warehouse style where query planning overhead is acceptable
  • You need to frequently drop entire partitions via DROP TABLE or ALTER TABLE DETACH PARTITION (faster than bulk DELETE)

Key architectural difference: TimescaleDB uses its own chunk catalog system independent of PostgreSQL's declarative partitioning and inheritance mechanisms, avoiding relation cache bottlenecks and minimizing locks by design.

Performance consideration: While ingest performance is similar, TimescaleDB excels at query performance for time-series patterns due to its specialized optimization layer on top of PostgreSQL.
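
For comparison with the native route described above, a minimal declarative range-partitioning sketch (table and partition names are hypothetical):

CREATE TABLE metrics (
    recorded_at timestamptz NOT NULL,
    value       double precision
) PARTITION BY RANGE (recorded_at);

CREATE TABLE metrics_2024_01 PARTITION OF metrics
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Retiring old data is a cheap metadata operation rather than a bulk DELETE
ALTER TABLE metrics DETACH PARTITION metrics_2024_01;
DROP TABLE metrics_2024_01;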

Sources:

95% confidence

Common Table Expressions

15 questions

Foreign Key Performance > FK and Bulk Operations

15 questions
A

TRUNCATE is faster than DELETE because it does not actually scan the tables, reclaims disk space immediately rather than requiring a subsequent VACUUM operation, and does not acquire row-level locks. However, TRUNCATE cannot be used on a table that has foreign-key references from other tables unless all such tables are also truncated in the same command (using CASCADE).
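
A short sketch of the foreign-key behavior (orders and order_items are hypothetical tables where order_items references orders):

TRUNCATE orders;                    -- fails: order_items has a FK referencing orders
TRUNCATE orders, order_items;       -- works: both tables truncated together
TRUNCATE orders CASCADE;            -- works: referencing tables are truncated too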

95% confidence
A

The table must have been created or truncated in the current subtransaction, no cursors can be open, and no older snapshots can be held by the transaction. It can only be used with COPY FROM, not COPY TO, and cannot be used on partitioned tables or foreign tables.
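
A minimal sketch that satisfies these requirements (table name and file path are hypothetical):

BEGIN;
CREATE TABLE staging_events (id int, payload text);  -- created in this transaction
COPY staging_events FROM '/tmp/events.csv' WITH (FORMAT csv, FREEZE);
COMMIT;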

95% confidence

Server Configuration > File Locations

15 questions
A

pg_ctl reload sends the postgres process a SIGHUP signal, causing it to re-read its configuration files (postgresql.conf, pg_hba.conf, etc.). The postmaster forwards the SIGHUP signal to its child processes so they all pick up the new values.
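
The same reload can be triggered from SQL, and pg_settings shows which changed parameters a reload could not apply (a sketch using standard catalog views):

SELECT pg_reload_conf();          -- signals the postmaster, like pg_ctl reload

SELECT name, setting, pending_restart
FROM pg_settings
WHERE pending_restart;            -- parameters that still need a full restart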

95% confidence

Logical Replication > Row Filters

14 questions

Logical Replication > Column Lists

14 questions
A

The pgoutput replication plug-in was fixed to not send columns not listed in a table's replication column list (commit ebf87c019). UPDATE and DELETE events did not pay attention to the configured column list, sending more data than expected and wasting network bandwidth. Contributed by Hou Zhijie.

95% confidence

Data Types - pg_lsn

13 questions

Functions - Conditional

13 questions
A

PostgreSQL has a general (searched) form with syntax 'CASE WHEN condition THEN result [WHEN ...] [ELSE result] END' that evaluates boolean conditions, and a simple form with syntax 'CASE expression WHEN value THEN result [WHEN ...] [ELSE result] END' that compares an expression against specific values.
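
Both forms side by side (values are arbitrary examples):

-- Searched form: each WHEN tests a boolean condition
SELECT CASE WHEN 15 > 100 THEN 'large'
            WHEN 15 > 10  THEN 'medium'
            ELSE 'small' END;

-- Simple form: one expression compared against listed values
SELECT CASE 'I' WHEN 'A' THEN 'active'
                WHEN 'I' THEN 'inactive'
                ELSE 'unknown' END;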

95% confidence
A

NULLIF(1, 2.2) yields numeric type. The result has the same type as the first argument of the implied = operator, and in some cases that will have been promoted to match the second argument's type. Since there is no integer = numeric operator, only numeric = numeric, the first argument is promoted to numeric.
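
This can be verified directly with pg_typeof:

SELECT NULLIF(1, 2.2) AS result, pg_typeof(NULLIF(1, 2.2)) AS result_type;
-- result = 1, result_type = numeric (the integer argument was promoted)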

95% confidence
A

They use an identical algorithm to match up their component expressions and select a result data type. The algorithm checks for identical types, handles domain types, processes unknown types, ensures all non-unknown inputs belong to the same type category, selects a preferred candidate type, and converts all inputs to the final candidate type.
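
A quick illustration of the shared algorithm using pg_typeof:

SELECT pg_typeof(COALESCE(1, 2.5));                       -- numeric: integer promoted to numeric
SELECT pg_typeof(CASE WHEN true THEN 'a' ELSE 'b' END);   -- text: all-unknown inputs resolve to text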

95% confidence

Extensions > pg_repack

12 questions
A

Only superusers or the owners of the target tables are permitted to run pg_repack, because the extension makes low-level modifications to system catalogs and needs high privileges.

Sources
95% confidence
A

No, pg_repack has significant compatibility issues with logical replication. The way pg_repack does table rewrite is almost guaranteed to break logical decoding unless there are zero unconsumed changes for a given table, as it does not build the necessary mappings info for logical decoding.

95% confidence
A

The --dry-run option makes pg_repack print out what would have been repacked without performing any action. It allows you to simulate the repack process without modifying data.

Sources
95% confidence

ddl_operations

12 questions
A

PostgreSQL Foreign Key ON DELETE Options

PostgreSQL supports 5 ON DELETE actions for foreign key constraints:

1. NO ACTION (default)

Prevents deletion if referenced rows exist. Check is performed at the end of the statement.

ALTER TABLE orders 
ADD CONSTRAINT fk_customer 
FOREIGN KEY (customer_id) REFERENCES customers(id) 
ON DELETE NO ACTION;

2. RESTRICT

Prevents deletion if referenced rows exist. Check is performed immediately (difference from NO ACTION only matters with deferrable constraints).

ALTER TABLE orders 
ADD CONSTRAINT fk_customer 
FOREIGN KEY (customer_id) REFERENCES customers(id) 
ON DELETE RESTRICT;

3. CASCADE

Automatically deletes all referencing rows when the referenced row is deleted.

ALTER TABLE order_items 
ADD CONSTRAINT fk_order 
FOREIGN KEY (order_id) REFERENCES orders(id) 
ON DELETE CASCADE;

4. SET NULL

Sets the foreign key column(s) to NULL when the referenced row is deleted. The column must be nullable.

ALTER TABLE orders 
ADD CONSTRAINT fk_salesperson 
FOREIGN KEY (salesperson_id) REFERENCES employees(id) 
ON DELETE SET NULL;

5. SET DEFAULT

Sets the foreign key column(s) to their DEFAULT value when the referenced row is deleted. A default value must be defined.

ALTER TABLE orders 
ADD CONSTRAINT fk_status 
FOREIGN KEY (status_id) REFERENCES order_statuses(id) 
ON DELETE SET DEFAULT;

Default Behavior: If ON DELETE is not specified, NO ACTION is used.

Source: PostgreSQL 17 Official Documentation - Foreign Keys (https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-FK)

95% confidence
A

PRIMARY KEY vs UNIQUE Constraint in PostgreSQL

Direct Answer:

  • PRIMARY KEY = UNIQUE + NOT NULL + table identifier (only ONE per table)
  • UNIQUE constraint allows NULL values (multiple NULLs permitted) and allows multiple unique constraints per table

Key Differences

1. NULL Handling

-- PRIMARY KEY: Rejects NULLs
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email VARCHAR(255)
);
INSERT INTO users (id, email) VALUES (NULL, '[email protected]'); 
-- ERROR: null value in column "id" violates not-null constraint

-- UNIQUE: Allows NULLs (multiple NULL values permitted)
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    sku VARCHAR(50) UNIQUE
);
INSERT INTO products (id, sku) VALUES (1, NULL); -- OK
INSERT INTO products (id, sku) VALUES (2, NULL); -- OK (multiple NULLs allowed)

2. Quantity Per Table

  • PRIMARY KEY: Exactly ONE per table
  • UNIQUE: Multiple allowed per table
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,           -- Only one PRIMARY KEY
    order_number VARCHAR(50) UNIQUE,  -- First UNIQUE constraint
    tracking_code VARCHAR(50) UNIQUE  -- Second UNIQUE constraint - OK
);

3. Foreign Key References

  • PRIMARY KEY: Automatic target for foreign key references (default)
  • UNIQUE: Can be referenced by foreign keys but must be explicitly specified
CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    dept_code VARCHAR(10) UNIQUE
);

-- References PRIMARY KEY by default
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES departments  -- References departments(id)
);

-- Must explicitly specify UNIQUE column
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    dept_code VARCHAR(10) REFERENCES departments(dept_code)
);

4. Index Creation

Both automatically create a unique B-tree index, but:

  • PRIMARY KEY index named: tablename_pkey
  • UNIQUE constraint index named: tablename_columnname_key

When to Use Each

  • PRIMARY KEY: Table's main identifier (user IDs, order IDs)
  • UNIQUE: Alternative unique identifiers (email addresses, SKUs, username)

PostgreSQL Version: Behavior consistent across PostgreSQL 9.x through 17.x (current as of January 2025)

Source: PostgreSQL Official Documentation - Table Constraints

95% confidence
A

An EXCLUSION constraint in PostgreSQL ensures that if any two rows are compared on specified columns or expressions using specified operators, at least one of those operator comparisons must return false or null. It generalizes UNIQUE constraints by allowing custom operators beyond equality.

Syntax

CREATE TABLE example (
  room_id int,
  reservation_period tstzrange,
  EXCLUDE USING gist (room_id WITH =, reservation_period WITH &&)
);

Key Requirements

  1. Index Method Required: Must specify an index method (typically gist or spgist) that supports the operators being used
  2. Operator Class: Each element requires an operator. Common operators:
    • = (equality) - requires btree-compatible types
    • && (overlaps) - for range types
    • <-> (distance) - for geometric types
  3. Extension Dependency: Range types with && require btree_gist extension

Complete Example

-- Enable extension for btree operators in GiST
CREATE EXTENSION btree_gist;

-- Prevent overlapping room reservations
CREATE TABLE room_reservations (
  room_id int,
  reserved_during tstzrange,
  EXCLUDE USING gist (
    room_id WITH =,
    reserved_during WITH &&
  )
);

-- This succeeds (different rooms)
INSERT INTO room_reservations VALUES 
  (101, '[2024-01-01 10:00, 2024-01-01 12:00)');

-- This fails (same room, overlapping time)
INSERT INTO room_reservations VALUES 
  (101, '[2024-01-01 11:00, 2024-01-01 13:00)');
-- ERROR: conflicting key value violates exclusion constraint

Optional Clauses

  • WHERE (predicate): Makes constraint partial (only checks rows matching predicate)
  • DEFERRABLE / INITIALLY DEFERRED: Delays checking until transaction commit

Available since: PostgreSQL 9.0
Source: PostgreSQL 17 official documentation, Chapter 5.4 (Constraints)

95% confidence
A

Adding a Default Value to an Existing Column in PostgreSQL

Use ALTER TABLE with SET DEFAULT:

ALTER TABLE table_name 
ALTER COLUMN column_name SET DEFAULT default_value;

Example:

ALTER TABLE users 
ALTER COLUMN status SET DEFAULT 'active';

ALTER TABLE orders 
ALTER COLUMN created_at SET DEFAULT CURRENT_TIMESTAMP;

ALTER TABLE products 
ALTER COLUMN price SET DEFAULT 0.00;

CRITICAL BEHAVIOR:

  • This command applies only to future INSERT operations where the column is not explicitly specified
  • It does NOT update existing NULL values in the table
  • The operation only changes catalog metadata and does not rewrite the table; it takes a brief ACCESS EXCLUSIVE lock but completes almost instantly

To also update existing rows:

-- Set the default
ALTER TABLE table_name 
ALTER COLUMN column_name SET DEFAULT default_value;

-- Update existing NULL values
UPDATE table_name 
SET column_name = default_value 
WHERE column_name IS NULL;

To remove a default:

ALTER TABLE table_name 
ALTER COLUMN column_name DROP DEFAULT;

Supported in: PostgreSQL 8.0+ (all currently supported versions as of 2025)

Source: PostgreSQL ALTER TABLE Documentation

95% confidence
A

When to Use Unlogged Tables in PostgreSQL

Use unlogged tables when you need maximum write performance for data that can be safely lost on a crash or unclean shutdown.

Critical Behavior

Unlogged tables in PostgreSQL (available since 9.1):

  • Are NOT written to WAL (Write-Ahead Log)
  • Are truncated automatically on crash recovery or unclean shutdown
  • Cannot be replicated to standby servers
  • Provide significantly faster writes (typically 2-10x) due to no WAL overhead

Specific Use Cases

Use unlogged tables for:

  1. Session/temporary data - User session state, shopping carts
  2. Cache tables - Materialized query results that can be regenerated
  3. ETL staging - Intermediate data during bulk loads that will be copied elsewhere
  4. Analytics scratch space - Temporary aggregations or data transformations
  5. High-throughput logging where data loss is acceptable (e.g., non-critical metrics)

Do NOT use for:

  • Any data that must survive crashes
  • Data requiring replication to standbys
  • ACID-compliant transactions where durability matters

Syntax

-- Create unlogged table
CREATE UNLOGGED TABLE session_data (
    session_id TEXT PRIMARY KEY,
    user_id INTEGER,
    data JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Convert existing table to unlogged
ALTER TABLE my_table SET UNLOGGED;

-- Convert back to logged
ALTER TABLE my_table SET LOGGED;

Performance Characteristics

  • Write performance: 2-10x faster than logged tables (exact gain depends on workload and storage)
  • fsync overhead: Eliminated for unlogged tables
  • Crash recovery: Table is automatically truncated (all data lost)

Source: PostgreSQL 17 official documentation (CREATE TABLE - Unlogged)

95% confidence
A

Drop a Column from a PostgreSQL Table

Use the ALTER TABLE statement with DROP COLUMN:

ALTER TABLE table_name DROP COLUMN column_name;

Example:

ALTER TABLE users DROP COLUMN middle_name;

Critical Details

CASCADE behavior: If the column is referenced by other objects (views, foreign keys, indexes), the command will fail unless you add CASCADE:

ALTER TABLE table_name DROP COLUMN column_name CASCADE;

This automatically drops dependent objects.

IF EXISTS clause (PostgreSQL 8.2+): Prevents errors if the column doesn't exist:

ALTER TABLE table_name DROP COLUMN IF EXISTS column_name;

Transaction safety: The operation is transactional and can be rolled back:

BEGIN;
ALTER TABLE users DROP COLUMN email;
ROLLBACK;  -- Undoes the drop

Performance note: Dropping a column does not rewrite the table in any supported PostgreSQL version; the column is simply marked as dropped in the system catalogs, and its storage is reclaimed later as rows are updated or the table is rewritten (for example by VACUUM FULL or CLUSTER).

Multiple columns can be dropped in one statement:

ALTER TABLE users 
  DROP COLUMN middle_name,
  DROP COLUMN nickname;

Source: PostgreSQL Official Documentation - ALTER TABLE

95% confidence
A

Composite Primary Key in PostgreSQL

A composite primary key uses multiple columns to uniquely identify rows. Define it using the PRIMARY KEY constraint with a comma-separated list of columns.

Syntax

During table creation:

CREATE TABLE table_name (
    column1 data_type,
    column2 data_type,
    column3 data_type,
    PRIMARY KEY (column1, column2)
);

Adding to existing table:

ALTER TABLE table_name 
ADD PRIMARY KEY (column1, column2);

Example

CREATE TABLE order_items (
    order_id INTEGER,
    product_id INTEGER,
    quantity INTEGER,
    PRIMARY KEY (order_id, product_id)
);

Critical Details

  • Column order matters: PRIMARY KEY (a, b) creates a different index structure than PRIMARY KEY (b, a). Order columns by query patterns (most selective/frequently filtered first).
  • Maximum columns: PostgreSQL allows up to 32 columns in a composite key (limited by index max of 32 columns).
  • Implicit NOT NULL: All columns in a primary key automatically become NOT NULL.
  • Automatic index: PostgreSQL creates a unique B-tree index on the column combination.
  • Constraint naming: Use CONSTRAINT constraint_name PRIMARY KEY (col1, col2) for explicit naming.

Verification

Supported since PostgreSQL 7.1+, current through PostgreSQL 17.

Source: PostgreSQL Official Documentation - CREATE TABLE (https://www.postgresql.org/docs/current/sql-createtable.html)

95% confidence
A

DROP TABLE ... CASCADE in PostgreSQL automatically drops a table and all objects that depend on it, preventing dependency errors.

Syntax

DROP TABLE table_name CASCADE;

What CASCADE Does

When you drop a table with CASCADE, PostgreSQL automatically drops dependent objects such as:

  • Views that reference the table
  • Foreign key constraints in other tables pointing to this table
  • Functions whose signatures use the table's row type
  • Any objects that in turn depend on the dropped objects (recursively)

Triggers and rules defined on the table itself are removed along with the table whether or not CASCADE is specified.

Example

-- Create tables with dependencies
CREATE TABLE orders (id INT PRIMARY KEY);
CREATE TABLE order_items (
    id INT PRIMARY KEY,
    order_id INT REFERENCES orders(id)
);
CREATE VIEW order_summary AS SELECT * FROM orders;

-- This fails with "cannot drop table orders because other objects depend on it"
DROP TABLE orders;

-- This succeeds and drops orders, the foreign key in order_items, and order_summary view
DROP TABLE orders CASCADE;

Alternative: RESTRICT

The default behavior is RESTRICT (explicitly written or implied), which refuses to drop the table if any objects depend on it:

DROP TABLE orders RESTRICT;  -- Fails if dependencies exist
DROP TABLE orders;            -- Same as RESTRICT (default)

Critical Warning

CASCADE can drop many more objects than you intend. Always review dependencies first:

-- Check dependencies before dropping
SELECT * FROM pg_depend WHERE refobjid = 'orders'::regclass;

Version: Available in all supported PostgreSQL versions (9.x through 17+)

Source: PostgreSQL Official Documentation - DROP TABLE

95% confidence
A

CREATE TABLE ... LIKE creates a new table by copying the structure of an existing table. The new table is completely independent—changes to the original table do not affect the new table.

What it copies by default:

  • Column names
  • Data types
  • NOT NULL constraints

What it does NOT copy by default:

  • Data/rows
  • Indexes
  • Primary keys
  • Foreign keys
  • DEFAULT values
  • CHECK constraints (other than NOT NULL)
  • Comments
  • Identity specifications

Syntax:

CREATE TABLE new_table (LIKE existing_table [INCLUDING options]);

INCLUDING options to copy additional properties:

  • INCLUDING DEFAULTS - Copy default value expressions
  • INCLUDING CONSTRAINTS - Copy CHECK constraints
  • INCLUDING INDEXES - Copy indexes (including PK, UNIQUE, EXCLUDE)
  • INCLUDING IDENTITY - Copy identity column specifications
  • INCLUDING GENERATED - Copy generated column expressions
  • INCLUDING STATISTICS - Copy extended statistics
  • INCLUDING COMMENTS - Copy column/constraint comments
  • INCLUDING STORAGE - Copy TOAST storage settings
  • INCLUDING COMPRESSION - Copy column compression methods
  • INCLUDING ALL - Copy everything above

Example:

-- Basic: copies only columns, types, NOT NULL
CREATE TABLE employees_backup (LIKE employees);

-- Copy with indexes and defaults
CREATE TABLE employees_copy (LIKE employees INCLUDING INDEXES INCLUDING DEFAULTS);

-- Copy everything
CREATE TABLE employees_full (LIKE employees INCLUDING ALL);

The LIKE clause works with tables, views, foreign tables, and composite types.

Sources:

95% confidence
A

Creating a Table from a SELECT Query in PostgreSQL

Use CREATE TABLE AS (CTAS) to create a table from a SELECT query:

CREATE TABLE new_table AS
SELECT column1, column2, column3
FROM existing_table
WHERE condition;

Key characteristics:

  • Creates table structure automatically based on SELECT result columns
  • Inserts all rows returned by the SELECT
  • Does NOT copy indexes, constraints, or defaults from source tables
  • Does NOT create a primary key automatically
  • Column names match the SELECT list (use aliases to rename: SELECT col AS new_name)

To copy the structure without any rows, add WITH NO DATA:

CREATE TABLE new_table AS
SELECT * FROM existing_table
WITH NO DATA;  -- Creates structure only, no rows

Alternative syntax using SELECT INTO (equivalent result; CREATE TABLE AS is the recommended form):

SELECT column1, column2
INTO new_table
FROM existing_table;

To include constraints after creation:

CREATE TABLE new_table AS
SELECT * FROM existing_table;

ALTER TABLE new_table ADD PRIMARY KEY (id);
CREATE INDEX idx_name ON new_table(column_name);

Temporary table variant:

CREATE TEMP TABLE temp_table AS
SELECT * FROM existing_table;

Authority: PostgreSQL official documentation (CREATE TABLE AS command, compatible with PostgreSQL 9.0+, syntax unchanged through PostgreSQL 17).

95% confidence
A

ALTER TABLE ... SET NOT NULL Syntax (PostgreSQL)

The syntax to add a NOT NULL constraint to an existing column is:

ALTER TABLE table_name ALTER COLUMN column_name SET NOT NULL;

Example:

ALTER TABLE users ALTER COLUMN email SET NOT NULL;

Critical Requirements:

  1. The column must NOT contain any NULL values before executing this command, or the operation will fail with error 23502
  2. This acquires an ACCESS EXCLUSIVE lock on the table, blocking all other operations

PostgreSQL 12+ Optimization:
To avoid a long table scan while holding the ACCESS EXCLUSIVE lock, first add a CHECK (column IS NOT NULL) constraint as NOT VALID, validate it separately, then set NOT NULL (PostgreSQL can use the validated constraint to skip its own scan), and finally drop the now-redundant CHECK constraint:

-- Step 1: Add CHECK constraint (does NOT block writes heavily)
ALTER TABLE users ADD CONSTRAINT users_email_not_null CHECK (email IS NOT NULL) NOT VALID;

-- Step 2: Validate the constraint (uses SHARE UPDATE EXCLUSIVE lock)
ALTER TABLE users VALIDATE CONSTRAINT users_email_not_null;

-- Step 3: Set NOT NULL (skips table scan since constraint proves no NULLs exist)
ALTER TABLE users ALTER COLUMN email SET NOT NULL;

-- Step 4: Drop the redundant CHECK constraint
ALTER TABLE users DROP CONSTRAINT users_email_not_null;

Removing NOT NULL:

ALTER TABLE table_name ALTER COLUMN column_name DROP NOT NULL;

Source: PostgreSQL 17 Official Documentation - ALTER TABLE command reference

95% confidence
A

CHECK Constraint in PostgreSQL

A CHECK constraint is a table-level or column-level constraint that enforces a boolean expression on column values. The constraint rejects INSERT or UPDATE operations if the expression evaluates to FALSE; it accepts the operation if the expression evaluates to TRUE or NULL.

Syntax

Column-level CHECK:

CREATE TABLE products (
    price NUMERIC CHECK (price > 0),
    discount NUMERIC CHECK (discount >= 0 AND discount <= 100)
);

Table-level CHECK (for multi-column conditions):

CREATE TABLE orders (
    quantity INTEGER,
    total NUMERIC,
    CHECK (total = quantity * unit_price)
);

Named CHECK constraint:

CREATE TABLE employees (
    salary NUMERIC,
    CONSTRAINT valid_salary CHECK (salary > 0 AND salary < 1000000)
);

Critical Behaviors

  1. NULL handling: CHECK constraints pass when the expression evaluates to NULL (unknown). To disallow NULLs, combine with NOT NULL.

  2. Expression limitations: The CHECK expression:

    • Cannot contain subqueries
    • Cannot reference columns from other tables
    • Cannot reference other rows (only current row being inserted/updated)
    • Should use only immutable functions (volatile or stable functions such as CURRENT_TIMESTAMP are not rejected, but PostgreSQL assumes the condition is immutable, so using them can lead to surprises)
  3. Validation timing: Evaluated for each row at insertion/update time. ALTER TABLE ... ADD CONSTRAINT also checks all existing rows immediately, unless you specify NOT VALID and run VALIDATE CONSTRAINT later.

Adding to Existing Tables

-- Validates all existing rows immediately
ALTER TABLE products ADD CONSTRAINT check_price CHECK (price > 0);

-- Skips checking existing rows for now; only new/updated rows are validated
ALTER TABLE products ADD CONSTRAINT check_price CHECK (price > 0) NOT VALID;
-- Later validate:
ALTER TABLE products VALIDATE CONSTRAINT check_price;

Source: PostgreSQL 17 Official Documentation - Table Constraints (https://www.postgresql.org/docs/current/ddl-constraints.html)

95% confidence

authentication

2 questions
A

PostgreSQL 18 OAuth 2.0 Authentication

Yes. PostgreSQL 18 introduces OAuth 2.0 authentication support, allowing integration with modern identity providers like Okta, Auth0, Azure AD, and Keycloak.

Configuration (pg_hba.conf)

# OAuth 2.0 authentication
host    all    all    0.0.0.0/0    oauth issuer="https://auth.example.com" client_id="pg_client"

How It Works

  1. Client requests access token from OAuth provider
  2. Client connects to PostgreSQL with token
  3. PostgreSQL validates token with issuer
  4. Connection established if token valid

Server Configuration

-- postgresql.conf
oauth_issuer = 'https://auth.example.com'
oauth_client_id = 'postgresql-server'
oauth_client_secret = 'your-secret'  -- Or use file

Client Connection

# Using psql with OAuth token
PGOAUTHTOKEN="eyJhbG..." psql -h myserver -U myuser -d mydb

# Using libpq connection string
psql "host=myserver user=myuser oauth_token=eyJhbG..."

Supported Flows

Flow Use Case
Client Credentials Service-to-service
Authorization Code Interactive users
Device Authorization CLI tools

Provider Examples

# Azure AD
host all all 0.0.0.0/0 oauth \
    issuer="https://login.microsoftonline.com/{tenant}/v2.0" \
    client_id="your-app-id"

# Okta
host all all 0.0.0.0/0 oauth \
    issuer="https://your-domain.okta.com/oauth2/default" \
    client_id="your-client-id"

# Keycloak
host all all 0.0.0.0/0 oauth \
    issuer="https://keycloak.example.com/realms/myrealm" \
    client_id="postgresql"

Security Notes

  • Tokens validated via OIDC discovery document
  • JWT signature verification automatic
  • Token expiration enforced
  • MD5 password auth deprecated in favor of SCRAM-SHA-256 or OAuth

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence
A

PostgreSQL 18 MD5 Authentication Deprecation

Yes. PostgreSQL 18 officially deprecates MD5 password authentication in favor of SCRAM-SHA-256.

Deprecation Notice

When using MD5 authentication in PostgreSQL 18, you'll see warnings in logs:

WARNING: md5 authentication is deprecated and will be removed in a future release
HINT: Use scram-sha-256 authentication instead.

Migration Steps

1. Check Current Configuration

-- Check current password encryption
SHOW password_encryption;  -- Should be 'scram-sha-256'

-- Check pg_hba.conf entries
-- Look for 'md5' in auth-method column

2. Update Server Configuration

# postgresql.conf
password_encryption = scram-sha-256  # Already the default since PostgreSQL 14

3. Re-encrypt User Passwords

-- Users must reset passwords to use SCRAM
ALTER USER myuser PASSWORD 'new_secure_password';

-- Verify password type
SELECT usename, passwd LIKE 'SCRAM%' AS is_scram
FROM pg_shadow
WHERE usename = 'myuser';

4. Update pg_hba.conf

# Before (deprecated)
host    all    all    0.0.0.0/0    md5

# After (recommended)
host    all    all    0.0.0.0/0    scram-sha-256

Why SCRAM-SHA-256 Is Better

Aspect MD5 SCRAM-SHA-256
Algorithm strength Weak (broken) Strong
Replay attacks Vulnerable Protected
Man-in-middle Vulnerable Protected
Channel binding No Yes
Password storage Weak hash Salted, iterated

Compatibility Notes

-- Clients must support SCRAM
-- libpq 10+ supports SCRAM
-- Most drivers updated years ago

-- Check client library version
SELECT version();  -- Server version
-- Client: psql --version, check driver docs

Migration Timeline

Version Status
PG 10 SCRAM-SHA-256 introduced
PG 14 SCRAM-SHA-256 default for new passwords
PG 18 MD5 deprecated (warnings)
PG 19+ MD5 may be removed

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence

replication

2 questions
A

PostgreSQL 17 Failover Logical Replication Slots

PostgreSQL 17 enables logical replication slots to survive primary failover by synchronizing them to standby servers.

Prerequisites

  1. Physical streaming replication between primary and standby
  2. Hot standby enabled on standby
  3. hot_standby_feedback = on on standby
  4. primary_slot_name configured on standby

Configuration

On Primary (postgresql.conf):

# Physical slot names of standbys that logical WAL senders wait for before sending changes
synchronized_standby_slots = 'standby1_slot'

On Standby (postgresql.conf):

# Enable slot synchronization
sync_replication_slots = on
hot_standby_feedback = on
primary_slot_name = 'standby1_slot'

Creating Failover-Enabled Slots

-- Method 1: Direct slot creation with failover flag
SELECT pg_create_logical_replication_slot(
    'my_slot',
    'pgoutput',
    false,      -- temporary
    false,      -- two_phase
    true        -- failover (NEW in PG17)
);

-- Method 2: Via subscription
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=primary dbname=mydb'
    PUBLICATION my_pub
    WITH (failover = true);

Monitoring

-- Check slot sync status on standby
SELECT slot_name, synced, active
FROM pg_replication_slots;

-- synced = true means slot is ready for failover

-- Check if slot changes are synchronized
SELECT * FROM pg_stat_replication_slots;

Failover Process

  1. Primary fails
  2. Standby promotes to new primary
  3. Synced slots (where synced = true) become active
  4. Subscribers reconnect to new primary
  5. Logical replication continues from last confirmed LSN

Important Notes

  • Only slots with synced = true at failover time can be used
  • Physical slot between primary/standby is required
  • Slot sync happens periodically via slotsync worker
  • Logical replication changes aren't consumed until standby confirms receipt

Source: PostgreSQL 17 Documentation - Logical Replication Failover
https://www.postgresql.org/docs/17/logical-replication-failover.html

95% confidence
A

PostgreSQL 18 idle_replication_slot_timeout

idle_replication_slot_timeout automatically invalidates replication slots that have been inactive for a specified period, preventing WAL bloat.

The Problem It Solves

Abandoned replication slots prevent WAL cleanup, leading to:

  • Disk space exhaustion
  • Potential database unavailability
  • Manual intervention required

Configuration

-- Set timeout (default: 0 = disabled)
ALTER SYSTEM SET idle_replication_slot_timeout = '1d';  -- 1 day
SELECT pg_reload_conf();

-- Check current setting
SHOW idle_replication_slot_timeout;

Valid Values

Value Meaning
0 Disabled (default)
30min 30 minutes
1h 1 hour
1d 1 day
7d 1 week

Comparison with max_slot_wal_keep_size

Parameter Triggers On Use Case
max_slot_wal_keep_size WAL size exceeds limit Protect disk space
idle_replication_slot_timeout Time since last activity Clean up abandoned slots

Example Scenario

-- Create a slot
SELECT pg_create_logical_replication_slot('test_slot', 'pgoutput');

-- Slot becomes inactive (subscriber disconnects and never reconnects)
-- After idle_replication_slot_timeout passes, slot is invalidated

-- Check slot status
SELECT slot_name, active, invalidation_reason
FROM pg_replication_slots;

-- invalidation_reason will show 'idle_timeout' if expired

Best Practice

# Combine both protections in postgresql.conf
max_slot_wal_keep_size = '100GB'        # WAL size limit
idle_replication_slot_timeout = '7d'    # Time limit

Source: PostgreSQL 18 Documentation - Replication Configuration
https://www.postgresql.org/docs/18/runtime-config-replication.html

95% confidence

data_loading

1 question
A

PostgreSQL 17 COPY Performance

PostgreSQL 17 delivers up to 2x faster COPY operations for bulk loading and exporting large rows.

Performance Comparison

Operation PostgreSQL 16 PostgreSQL 17 Improvement
COPY FROM (large rows) 100 MB/s 180 MB/s 1.8x
COPY TO (export) 120 MB/s 200 MB/s 1.7x
COPY with FREEZE 90 MB/s 170 MB/s 1.9x

Benchmarks on NVMe SSD, 64-core server, varies by hardware

What Changed

  1. Reduced memory allocation overhead: Batch buffer management
  2. Optimized tuple formation: Less copying of large values
  3. Improved I/O batching: Better write coalescing
  4. TOAST handling: More efficient for large text/bytea

Best Practices for Fast COPY

-- Optimal bulk load settings
SET maintenance_work_mem = '2GB';
SET max_wal_size = '10GB';

-- Use FREEZE for initial loads (rows are written already frozen, avoiding later freezing work);
-- the target table must be created or truncated in the same transaction
COPY large_table FROM '/data/file.csv' WITH (FORMAT csv, FREEZE);

-- Binary format for numeric-heavy data
COPY my_table TO '/backup/data.bin' WITH (FORMAT binary);
COPY my_table FROM '/backup/data.bin' WITH (FORMAT binary);

-- Parallel load via multiple connections
-- Split file and run concurrent COPY commands

Monitoring COPY Progress

-- Check progress (PostgreSQL 14+)
SELECT
    command,
    bytes_processed,
    bytes_total,
    tuples_processed,
    round(100.0 * bytes_processed / nullif(bytes_total, 0), 1) AS pct_done
FROM pg_stat_progress_copy;

COPY Options Recap

Option Purpose
FREEZE Skip VACUUM for new tables
FORMAT binary Faster for numeric data
PARALLEL n Multiple workers (planned)
ON_ERROR ignore Skip bad rows
HEADER Skip/include CSV header

Example: Large Data Load

# Fast CSV import
psql -c "COPY events FROM PROGRAM 'zcat events.csv.gz' WITH (FORMAT csv, HEADER);"

# With progress monitoring
psql -c "
  SET maintenance_work_mem = '1GB';
  COPY events FROM '/data/events.csv' WITH (FORMAT csv, HEADER, FREEZE);
"

Source: PostgreSQL 17 Release Notes
https://www.postgresql.org/docs/17/release-17.html

95% confidence

session_management

1 question
A

set_config() and current_setting() are PostgreSQL functions for managing session/transaction-level configuration parameters, commonly used to pass application context (like user ID or tenant ID) to Row Level Security policies.

current_setting()

Syntax:

current_setting(setting_name text [, missing_ok boolean]) -> text

Usage:

-- Get a setting (throws error if not exists)
SELECT current_setting('app.current_user_id');

-- Get a setting (returns NULL if not exists)
SELECT current_setting('app.current_user_id', true);

Equivalent to SHOW setting_name.

set_config()

Syntax:

set_config(setting_name text, new_value text, is_local boolean) -> text

Parameters:

  • setting_name - Custom parameters must use dotted names (e.g., app.tenant_id)
  • new_value - The value to set (NULL resets to default)
  • is_local - If true, applies only to current transaction; if false, applies to entire session

Usage:

-- Set for entire session
SELECT set_config('app.tenant_id', '42', false);

-- Set for current transaction only (required for PgBouncer)
SELECT set_config('app.tenant_id', '42', true);

Equivalent to SET [LOCAL] setting_name = value.

Multi-Tenant RLS Example

-- Create policy using application context
CREATE POLICY tenant_isolation ON orders
  USING (tenant_id = current_setting('app.tenant_id', true)::integer);

-- In application, set context before queries
BEGIN;
SELECT set_config('app.tenant_id', '42', true);
SELECT * FROM orders;  -- Only sees tenant 42's orders
COMMIT;

Important: Use is_local = true when using connection poolers like PgBouncer in transaction mode to ensure context doesn't leak between requests.

Sources:

95% confidence

server_configuration

1 question
A

PostgreSQL 18 io_method Configuration

io_method is a new PostgreSQL 18 configuration parameter that controls how read operations are dispatched to storage.

Available Options

Value Description Platform
sync Synchronous blocking reads (legacy PG17 behavior) All
worker Dedicated I/O worker processes (default) All
io_uring Linux kernel ring buffer for minimal syscall overhead Linux 5.1+

Performance Comparison

Benchmark results for sequential scan on cold cache:

Method Time Improvement
sync 15,071ms Baseline
worker 10,052ms 1.5x faster
io_uring 5,723ms 2.6x faster

Configuration

-- Check current setting
SHOW io_method;

# Set in postgresql.conf
io_method = 'io_uring'  # For Linux with kernel 5.1+
io_method = 'worker'    # Cross-platform default

When to Use Each

  • io_uring: Best for Linux production servers with high I/O workloads
  • worker: Safe default for cross-platform compatibility
  • sync: Only for debugging or compatibility testing

Current Limitations

  • AIO only applies to reads (sequential scans, bitmap heap scans, vacuum)
  • Index scans don't use AIO yet
  • Write operations and WAL still use synchronous I/O

Source: PostgreSQL 18 Documentation - Runtime Configuration
https://www.postgresql.org/docs/18/runtime-config-resource.html

95% confidence

data_types

1 question
A

PostgreSQL 18 uuidv7() Function

uuidv7() generates timestamp-ordered UUIDs that are optimal for B-tree indexes and distributed systems.

Key Characteristics

  • First 48 bits: Unix timestamp (millisecond precision)
  • Next 12 bits: Sub-millisecond counter for monotonicity
  • Remaining bits: Random data
  • Total: 128 bits (standard UUID size)

Usage

-- Generate a UUIDv7
SELECT uuidv7();
-- Result: 019376a8-5b40-7abc-8def-1234567890ab

-- Use as primary key default
CREATE TABLE events (
    id uuid DEFAULT uuidv7() PRIMARY KEY,
    event_type TEXT,
    payload JSONB,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Also available: explicit v4 function
SELECT uuidv4();  -- Alias for gen_random_uuid()

Performance vs UUIDv4

Metric UUIDv7 UUIDv4
Generation time 58.1 microseconds 86.8 microseconds
Throughput 34,127 ops/sec 29,238 ops/sec
Index fragmentation Minimal High
Insert performance Excellent (sequential) Poor (random)

Benefits

  1. Chronological ordering: Later UUIDs sort after earlier ones
  2. Reduced index fragmentation: New values always append to index end
  3. Better cache locality: Related records stored together
  4. Global uniqueness: No coordination needed across nodes

Considerations

  • Timestamp is visible in UUID (don't use for security-sensitive external IDs)
  • Monotonicity guaranteed only within same backend process
  • Requires NTP for clock synchronization across nodes

Source: PostgreSQL 18 Documentation - UUID Functions
https://www.postgresql.org/docs/18/functions-uuid.html

95% confidence

data_definition

1 question
A

PostgreSQL 18 Generated Columns Default

VIRTUAL is now the default for generated columns in PostgreSQL 18, changed from STORED in previous versions.

STORED vs VIRTUAL

Aspect STORED VIRTUAL (new default)
Storage Written to disk No disk space
Computation On INSERT/UPDATE On SELECT (read time)
Indexable Yes No
Adding to table Requires table rewrite Instant

Syntax

-- PostgreSQL 18: VIRTUAL is default
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    price NUMERIC(10,2),
    quantity INTEGER,
    total NUMERIC GENERATED ALWAYS AS (price * quantity)  -- VIRTUAL by default
);

-- Explicit STORED (for indexing)
CREATE TABLE products_indexed (
    id SERIAL PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    full_name TEXT GENERATED ALWAYS AS (first_name || ' ' || last_name) STORED
);

-- Explicit VIRTUAL
CREATE TABLE calculations (
    a INTEGER,
    b INTEGER,
    sum INTEGER GENERATED ALWAYS AS (a + b) VIRTUAL
);

When to Use Each

Use VIRTUAL (default) when:

  • Adding columns to large existing tables (instant, no rewrite)
  • Write-heavy workloads (saves I/O)
  • Storage space is a concern
  • Column values change frequently via base columns

Use STORED when:

  • You need to create an index on the generated column
  • Read-heavy workloads with complex expressions
  • You need constraints (UNIQUE, NOT NULL, FK)

Current VIRTUAL Limitations

  • Cannot be indexed
  • No unique/foreign key constraints
  • Not supported in logical replication
  • Expression must use only built-in functions (no user-defined)

Source: PostgreSQL 18 Documentation - Generated Columns
https://www.postgresql.org/docs/18/ddl-generated-columns.html

95% confidence

foreign_data

1 question
A

PostgreSQL 17 postgres_fdw Subquery Push Down

PostgreSQL 17 enables EXISTS and IN subqueries to be pushed to remote PostgreSQL servers via postgres_fdw, reducing data transfer and improving performance.

The Improvement

Previously, EXISTS/IN subqueries against foreign tables were executed locally, requiring all foreign data to be fetched first.

Example

-- Setup: foreign table pointing to remote server
CREATE SERVER remote_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', dbname 'salesdb');

CREATE FOREIGN TABLE remote_orders (
    id INT,
    customer_id INT,
    total NUMERIC
) SERVER remote_server OPTIONS (table_name 'orders');

-- This query now pushes the subquery to remote
SELECT * FROM local_customers c
WHERE EXISTS (
    SELECT 1 FROM remote_orders o
    WHERE o.customer_id = c.id
    AND o.total > 1000
);

EXPLAIN Comparison

-- PostgreSQL 16: Subquery executed locally
EXPLAIN VERBOSE SELECT * FROM local_customers c
WHERE EXISTS (SELECT 1 FROM remote_orders o WHERE o.customer_id = c.id);

-- Shows:
--   Filter: EXISTS (SubPlan)
--   ->  Foreign Scan on remote_orders  -- Fetches ALL rows
--         Remote SQL: SELECT id, customer_id, total FROM orders

-- PostgreSQL 17: Subquery pushed to remote
EXPLAIN VERBOSE SELECT * FROM local_customers c
WHERE EXISTS (SELECT 1 FROM remote_orders o WHERE o.customer_id = c.id);

-- Shows:
--   Foreign Scan
--   Remote SQL: SELECT ... WHERE EXISTS (SELECT 1 FROM orders WHERE ...)

Supported Patterns

Pattern Pushed Down (PG17)?
WHERE EXISTS (SELECT ... FROM foreign_table) Yes
WHERE id IN (SELECT id FROM foreign_table) Yes
WHERE NOT EXISTS (...) Yes
WHERE id NOT IN (...) Yes
Correlated subqueries Yes

Performance Impact

Scenario PG16 PG17
1M remote rows, 100 matches Fetch 1M rows Fetch 100 rows
Network transfer High Minimal
Query time Minutes Seconds

Configuration

-- Semi-join pushdown needs no extra configuration; confirm it with EXPLAIN (VERBOSE)
-- by checking the Remote SQL line, as shown above.

-- Separately, fetch_size tunes how many rows are fetched per round trip:
ALTER SERVER remote_server OPTIONS (ADD fetch_size '1000');

Source: PostgreSQL 17 Release Notes
https://www.postgresql.org/docs/17/release-17.html

95% confidence

vacuum_maintenance

1 question
A

PostgreSQL 17 VACUUM Memory Improvements

PostgreSQL 17 introduces a new internal memory structure for VACUUM that consumes up to 20x less memory than previous versions.

Memory Comparison

Scenario PostgreSQL 16 PostgreSQL 17
1M dead tuples ~128 MB ~6 MB
10M dead tuples ~1.28 GB ~64 MB
Large table vacuum Often OOM Stable

How It Works

Previous versions stored dead tuple IDs in a flat array that grew linearly. PostgreSQL 17 uses a radix tree (TID store) that:

  1. Compresses common prefixes of tuple IDs
  2. Scales sub-linearly with dead tuple count
  3. Reduces memory fragmentation
  4. Improves cache efficiency

Configuration

The new structure respects existing settings:

-- maintenance_work_mem still applies
SHOW maintenance_work_mem;  -- default: 64MB

-- But the same budget now covers far more dead tuples:
-- before PostgreSQL 17, the flat array used 6 bytes per dead tuple,
-- so a 64MB budget topped out at roughly 11 million dead tuples per pass

Practical Impact

  • Fewer VACUUM passes: More dead tuples processed per pass
  • Reduced OOM risk: Large table vacuums less likely to fail
  • Lower memory pressure: Better for shared hosting / containers
  • Faster completion: Less time spent on memory management

Verification

-- Check vacuum progress (unchanged API)
SELECT * FROM pg_stat_progress_vacuum;

-- See which autovacuum workers are currently running
SELECT pid, state, query, backend_type
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

Source: PostgreSQL 17 Release Notes
https://www.postgresql.org/docs/17/release-17.html

95% confidence

administration

1 question
A

PostgreSQL 18 Statistics Preservation During Upgrade

Yes. PostgreSQL 18 preserves planner statistics during pg_upgrade, eliminating the need for lengthy post-upgrade ANALYZE operations.

The Previous Problem (Pre-PG18)

# After pg_upgrade, statistics were empty
pg_upgrade -d /old/data -D /new/data ...

# Required running ANALYZE on entire database (could take hours)
vacuumdb --all --analyze-in-stages

PostgreSQL 18 Behavior

# Statistics now preserved automatically
pg_upgrade -d /old/data -D /new/data -b /old/bin -B /new/bin

# Database ready immediately with accurate query plans!

What's Preserved

Statistic Type Preserved?
Column statistics (pg_statistic) Yes
Most common values Yes
Histograms Yes
NULL fractions Yes
Correlation values Yes
Extended statistics No (rebuild with ANALYZE after the upgrade)

Upgrade Time Comparison

Database Size PG17 Upgrade + ANALYZE PG18 Upgrade
100 GB 2 hours 20 minutes
1 TB 12+ hours 2 hours
10 TB Days Hours

Verification

-- After the upgrade, confirm column statistics are already present
SELECT schemaname, tablename, attname, n_distinct, null_frac
FROM pg_stats
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
LIMIT 10;

-- Rows here mean the planner already has statistics without waiting for ANALYZE.
-- Note: pg_stat_user_tables counters (last_analyze, n_live_tup, etc.) come from the
-- cumulative statistics system and are reset by pg_upgrade, so don't use them to verify.

Best Practice

Even with preserved statistics, consider running ANALYZE after upgrade for:

  • Tables with significant changes during upgrade window
  • New columns added during upgrade
  • Any tables showing poor query performance
-- Optional: Refresh statistics for specific tables
ANALYZE verbose large_table;

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence

connection_pooling

1 question
A

PgBouncer in transaction pooling mode has significant limitations because each transaction may execute on a different server connection with different session state.

Session Variables

Cannot use SET SESSION - Changes via SET (which defaults to SET SESSION) will not persist across transactions. You must use SET LOCAL to scope changes to the current transaction only.

Limited parameter tracking - PgBouncer tracks only these parameters by default: client_encoding, datestyle, timezone, standard_conforming_strings, and application_name. Additional parameters can be added via track_extra_parameters, but most parameters cannot be reliably tracked.

server_reset_query not used - In transaction pooling mode, DISCARD ALL is not executed between clients because clients must not rely on session persistence.

Prepared Statements

Protocol-level only (v1.21+) - Since PgBouncer 1.21.0, protocol-level prepared statements (via libpq's PQprepare) are supported with max_prepared_statements setting. PgBouncer intercepts and rewrites statement names internally.

SQL PREPARE not supported - SQL-level PREPARE, EXECUTE, and DEALLOCATE commands bypass PgBouncer's tracking and go directly to PostgreSQL. These will fail unpredictably when connections change.

Other Limitations

  • Temporary tables - Must be created, used, and dropped within the same transaction. Use ON COMMIT DROP when creating.
  • LISTEN/NOTIFY - Not compatible with transaction pooling (requires persistent session)
  • Advisory locks - Session-level advisory locks will be lost between transactions
  • Cursors - WITH HOLD cursors are not supported; declare and fully consume cursors within a single transaction
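
A sketch of the transaction-scoped pattern that stays safe under transaction pooling (parameter and table names are illustrative):

BEGIN;
SET LOCAL statement_timeout = '5s';                    -- reverts automatically at COMMIT/ROLLBACK
CREATE TEMP TABLE tmp_batch (id int) ON COMMIT DROP;   -- dropped when the transaction ends
-- ... do the work within this single transaction ...
COMMIT;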

Sources:

95% confidence

psql_tools

1 question
A

PostgreSQL 17 psql \watch min_rows

PostgreSQL 17 adds a min_rows parameter to the psql \watch command that stops execution after the query returns at least the specified number of rows.

Syntax

-- In psql:
\watch [interval] [min_rows=N]

Examples

-- Stop when at least 5 rows are returned
SELECT * FROM queue WHERE status = 'pending';
-- Then run: \watch 1 min_rows=5

-- Runs every 1 second until 5+ pending items exist

-- Wait for replication to catch up (1 row = caught up)
SELECT 1 WHERE pg_last_wal_replay_lsn() >= '0/1234567'::pg_lsn;
-- Then run: \watch 0.5 min_rows=1

-- Check for job completion
SELECT * FROM jobs WHERE id = 123 AND status = 'completed';
-- Then run: \watch 2 min_rows=1

Use Cases

Scenario Command
Wait for table to have N rows \watch 1 min_rows=N
Wait for condition to be true \watch 0.5 min_rows=1
Monitor until threshold reached \watch 5 min_rows=100
Poll for job completion \watch 2 min_rows=1

Comparison with Basic watch

-- Basic watch (runs forever until Ctrl-C)
SELECT count(*) FROM events WHERE processed = false;
-- Then run: \watch 5

-- With min_rows (auto-stops)
SELECT count(*) FROM events WHERE processed = false HAVING count(*) = 0;
-- Then run: \watch 5 min_rows=1
-- Stops when all events are processed

Practical Examples

-- Wait for locks to clear
SELECT 1 WHERE NOT EXISTS (
    SELECT 1 FROM pg_locks WHERE relation = 'my_table'::regclass
);
-- Then run: \watch 1 min_rows=1

-- Wait for active connections to drop below threshold
SELECT 1 WHERE (
    SELECT count(*) FROM pg_stat_activity WHERE state = 'active'
) < 10;
-- Then run: \watch 2 min_rows=1

-- Monitor batch progress, stop at 1000 processed
SELECT count(*) AS processed FROM items WHERE status = 'done'
HAVING count(*) >= 1000;
-- Then run: \watch 5 min_rows=1

Also New in PG17 psql

  • Ctrl-C cancels connection attempts (previously had to wait for timeout)
  • Better tab completion for SQL keywords
  • Improved \d command output formatting

Source: PostgreSQL 17 Documentation - psql
https://www.postgresql.org/docs/17/app-psql.html

95% confidence

monitoring

1 question
A

PostgreSQL 18 pg_stat_all_tables New Columns

PostgreSQL 18 adds vacuum and analyze timing columns to pg_stat_all_tables for better maintenance monitoring.

New Columns

Column Type Description
total_vacuum_time double precision Total time spent vacuuming (ms)
total_analyze_time double precision Total time spent analyzing (ms)

Usage

-- Find tables with longest vacuum times
SELECT
    schemaname,
    relname,
    vacuum_count,
    round(total_vacuum_time::numeric / 1000, 2) AS vacuum_time_sec,
    round(total_vacuum_time::numeric / nullif(vacuum_count, 0) / 1000, 2) AS avg_vacuum_sec
FROM pg_stat_all_tables
WHERE total_vacuum_time > 0
ORDER BY total_vacuum_time DESC
LIMIT 10;

-- Tables needing vacuum optimization
SELECT
    relname,
    n_dead_tup,
    last_vacuum,
    vacuum_count,
    round((total_vacuum_time / 1000)::numeric, 1) AS total_vacuum_secs,
    round(((total_vacuum_time / nullif(vacuum_count, 0)) / 1000)::numeric, 1) AS avg_vacuum_secs
FROM pg_stat_user_tables
WHERE vacuum_count > 0
  AND (total_vacuum_time / nullif(vacuum_count, 0)) > 10000  -- avg > 10 sec
ORDER BY total_vacuum_time DESC;

Monitoring Query

-- Comprehensive maintenance dashboard
SELECT
    relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS size,
    n_live_tup AS live_rows,
    n_dead_tup AS dead_rows,
    vacuum_count,
    autovacuum_count,
    analyze_count,
    autoanalyze_count,
    round((total_vacuum_time / 1000)::numeric, 1) AS vacuum_time_sec,
    round((total_analyze_time / 1000)::numeric, 1) AS analyze_time_sec,
    last_vacuum,
    last_autovacuum,
    last_analyze
FROM pg_stat_user_tables
ORDER BY total_vacuum_time DESC
LIMIT 20;

Alerting Example

-- Alert on tables with long average vacuum time
SELECT relname, avg_vacuum_ms
FROM (
    SELECT
        relname,
        total_vacuum_time / nullif(vacuum_count, 0) AS avg_vacuum_ms
    FROM pg_stat_user_tables
    WHERE vacuum_count > 0
) sub
WHERE avg_vacuum_ms > 60000  -- > 1 minute average
ORDER BY avg_vacuum_ms DESC;

Related Enhancements (PG18)

EXPLAIN ANALYZE now also shows:

  • Buffer usage automatically
  • WAL writes (verbose mode)
  • CPU time
  • Average read times
EXPLAIN (ANALYZE, BUFFERS, WAL)
SELECT * FROM large_table WHERE id < 1000;

Source: PostgreSQL 18 Release Notes
https://www.postgresql.org/docs/18/release-18.html

95% confidence