postgresql_jsonb_asyncpg 18 Q&As

PostgreSQL Jsonb Asyncpg FAQ & Answers

18 expert PostgreSQL Jsonb Asyncpg answers researched from official documentation. Every answer cites authoritative sources you can verify.

Jump to section:

server_configuration (12) sql_json_features (2) sql_query_language (2) physical_replication (2)

server_configuration

12 questions

How does PostgreSQL's information_schema.tables handle schema-qualified table queries?

PostgreSQL's information_schema.tables view requires filtering by the table_schema column to query tables within a specific schema. The information schema is SQL standard-compliant and portable across databases, unlike PostgreSQL-specific system catalogs.

Query Pattern:

-- Query tables in a specific schema
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'my_schema'
  AND table_type = 'BASE TABLE';  -- Excludes views

-- Example: List all user tables (exclude system schemas)
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema', 'pg_catalog')
  AND table_type = 'BASE TABLE'
ORDER BY table_schema, table_name;

-- Check if specific table exists in schema
SELECT EXISTS (
    SELECT 1
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_name = 'users'
) AS table_exists;

Key Columns:

Column	Description	Values
`table_catalog`	Database name	Current database
`table_schema`	Schema name	`public`, `my_schema`, etc.
`table_name`	Table name	`users`, `orders`, etc.
`table_type`	Object type	`BASE TABLE`, `VIEW`, `FOREIGN`

Using asyncpg:

import asyncpg

async def check_table_exists(schema: str, table: str) -> bool:
    """
    Check if table exists in specified schema.
    """
    conn = await asyncpg.connect(
        host='localhost',
        port=5432,
        user='postgres',
        password='password',
        database='mydb'
    )
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM information_schema.tables
                WHERE table_schema = $1
                  AND table_name = $2
                  AND table_type = 'BASE TABLE'
            )
            """,
            schema,
            table
        )
        return exists
    finally:
        await conn.close()

# Usage
exists = await check_table_exists('public', 'users')
print(f"Table exists: {exists}")

Common Patterns:

-- 1. List all tables in database (excluding system)
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
  AND table_type = 'BASE TABLE';

-- 2. List tables with column count
SELECT 
    t.table_schema,
    t.table_name,
    COUNT(c.column_name) AS column_count
FROM information_schema.tables t
LEFT JOIN information_schema.columns c
    ON t.table_schema = c.table_schema
    AND t.table_name = c.table_name
WHERE t.table_schema = 'public'
  AND t.table_type = 'BASE TABLE'
GROUP BY t.table_schema, t.table_name;

-- 3. Search for tables by name pattern
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_name LIKE '%user%'
  AND table_type = 'BASE TABLE';

Best Practices:

Always filter by table_schema - Improves performance and accuracy
Use table_type = 'BASE TABLE' - Excludes views unless specifically needed
Avoid SELECT * - Select only needed columns for performance
Use EXISTS for existence checks - More efficient than COUNT(*)
Exclude system schemas - pg_catalog, information_schema for user tables

information_schema vs pg_catalog:

-- information_schema: Standard, portable across databases
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';

-- pg_catalog: PostgreSQL-specific, more detailed info
SELECT tablename
FROM pg_tables
WHERE schemaname = 'public';

-- pg_catalog also provides size info (not in information_schema)
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public';

Performance Considerations:

According to PostgreSQL documentation and best practices (2024), querying information_schema can impact performance on large databases. Apply filters and select only required columns to reduce computational complexity. For PostgreSQL-specific features like table sizes and row counts, use pg_catalog views instead.

Complete FastAPI Example:

from fastapi import FastAPI, HTTPException
import asyncpg
from typing import List, Dict

app = FastAPI()

DATABASE_URL = "postgresql://user:pass@localhost/mydb"

@app.get("/tables/{schema}")
async def list_tables(schema: str) -> List[Dict[str, str]]:
    """
    List all tables in specified schema.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        tables = await conn.fetch(
            """
            SELECT table_schema, table_name, table_type
            FROM information_schema.tables
            WHERE table_schema = $1
              AND table_type = 'BASE TABLE'
            ORDER BY table_name
            """,
            schema
        )
        
        return [
            {
                "schema": row['table_schema'],
                "name": row['table_name'],
                "type": row['table_type']
            }
            for row in tables
        ]
    finally:
        await conn.close()

@app.get("/tables/{schema}/{table}/exists")
async def table_exists(schema: str, table: str) -> Dict[str, bool]:
    """
    Check if table exists in schema.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM information_schema.tables
                WHERE table_schema = $1
                  AND table_name = $2
            )
            """,
            schema,
            table
        )
        
        return {"exists": exists}
    finally:
        await conn.close()

Security Note:

The information_schema is read-only and respects user permissions. Users can only see tables they have privileges to access.

Version Note: information_schema behavior consistent since PostgreSQL 9.1+, recommended for PostgreSQL 12+

Sources

postgresql.org beekeeperstudio.io magicstack.github.io

99% confidence

How do you verify if a PostgreSQL materialized view exists before querying it?

To verify if a PostgreSQL materialized view exists, query the pg_matviews system catalog view using the schemaname and matviewname columns. This is more reliable than querying information_schema.views which doesn't distinguish materialized views from regular views.

Recommended Pattern:

-- Check if materialized view exists
SELECT EXISTS (
    SELECT 1
    FROM pg_matviews
    WHERE schemaname = 'public'
      AND matviewname = 'my_matview'
) AS matview_exists;

-- Get materialized view details
SELECT 
    schemaname,
    matviewname,
    matviewowner,
    tablespace,
    hasindexes,
    ispopulated,  -- Important: false means not refreshed yet
    definition
FROM pg_matviews
WHERE schemaname = 'public'
  AND matviewname = 'my_matview';

Using asyncpg:

import asyncpg
from typing import Optional, Dict

async def matview_exists(schema: str, matview: str) -> bool:
    """
    Check if materialized view exists in specified schema.
    """
    conn = await asyncpg.connect(
        host='localhost',
        database='mydb',
        user='postgres',
        password='password'
    )
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM pg_matviews
                WHERE schemaname = $1
                  AND matviewname = $2
            )
            """,
            schema,
            matview
        )
        return exists
    finally:
        await conn.close()

# Usage
exists = await matview_exists('public', 'user_stats')
if not exists:
    raise ValueError("Materialized view 'user_stats' does not exist")

Check if Populated (Critical for Queries):

async def matview_is_populated(schema: str, matview: str) -> bool:
    """
    Check if materialized view exists AND is populated.
    Unpopulated matviews will error when queried.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        ispopulated = await conn.fetchval(
            """
            SELECT ispopulated
            FROM pg_matviews
            WHERE schemaname = $1
              AND matviewname = $2
            """,
            schema,
            matview
        )
        
        # ispopulated is None if matview doesn't exist
        # False if exists but not refreshed
        # True if exists and has data
        return ispopulated is True
        
    finally:
        await conn.close()

# Usage
if not await matview_is_populated('public', 'user_stats'):
    print("Materialized view not populated, refreshing...")
    await conn.execute('REFRESH MATERIALIZED VIEW public.user_stats')

Complete Safe Query Pattern:

from fastapi import FastAPI, HTTPException
import asyncpg
from typing import List, Dict, Any

app = FastAPI()

DATABASE_URL = "postgresql://user:pass@localhost/mydb"

async def safe_query_matview(
    schema: str,
    matview: str,
    query: str,
    *args
) -> List[Dict[str, Any]]:
    """
    Safely query materialized view with existence and population checks.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        # Step 1: Check if matview exists and is populated
        matview_info = await conn.fetchrow(
            """
            SELECT ispopulated, definition
            FROM pg_matviews
            WHERE schemaname = $1
              AND matviewname = $2
            """,
            schema,
            matview
        )
        
        if not matview_info:
            raise HTTPException(
                status_code=404,
                detail=f"Materialized view '{schema}.{matview}' does not exist"
            )
        
        if not matview_info['ispopulated']:
            # Option 1: Auto-refresh (may be slow)
            await conn.execute(
                f'REFRESH MATERIALIZED VIEW {schema}.{matview}'
            )
            
            # Option 2: Return error (let client decide)
            # raise HTTPException(
            #     status_code=503,
            #     detail=f"Materialized view '{schema}.{matview}' not populated"
            # )
        
        # Step 2: Query the matview
        rows = await conn.fetch(query, *args)
        
        return [dict(row) for row in rows]
        
    finally:
        await conn.close()

@app.get("/stats/users")
async def get_user_stats():
    """
    Query user_stats materialized view with safety checks.
    """
    results = await safe_query_matview(
        schema='public',
        matview='user_stats',
        query='SELECT * FROM public.user_stats ORDER BY user_count DESC LIMIT 100'
    )
    return {"stats": results}

Alternative: Using pg_class

-- pg_class can also detect materialized views
SELECT EXISTS (
    SELECT 1
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'public'
      AND c.relname = 'my_matview'
      AND c.relkind = 'm'  -- 'm' = materialized view
) AS matview_exists;

-- Get more details from pg_class
SELECT 
    n.nspname AS schema,
    c.relname AS matview,
    c.relkind,  -- 'm' for materialized view
    pg_size_pretty(pg_total_relation_size(c.oid)) AS size,
    c.reltuples::bigint AS estimated_rows
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname = 'my_matview'
  AND c.relkind = 'm';

Error Handling for Missing/Unpopulated Matview:

import asyncpg
import logging

logger = logging.getLogger(__name__)

async def query_matview_safe(matview_name: str):
    """
    Query matview with proper error handling.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Try to query directly
        results = await conn.fetch(f'SELECT * FROM {matview_name}')
        return [dict(row) for row in results]
        
    except asyncpg.UndefinedTableError:
        # Matview doesn't exist
        logger.error(f"Materialized view '{matview_name}' does not exist")
        raise HTTPException(
            status_code=404,
            detail=f"Materialized view not found: {matview_name}"
        )
        
    except asyncpg.ObjectNotInPrerequisiteStateError:
        # Matview exists but not populated
        logger.warning(
            f"Materialized view '{matview_name}' not populated, refreshing..."
        )
        
        try:
            await conn.execute(f'REFRESH MATERIALIZED VIEW {matview_name}')
            results = await conn.fetch(f'SELECT * FROM {matview_name}')
            return [dict(row) for row in results]
        except Exception as e:
            logger.exception(f"Failed to refresh matview: {e}")
            raise HTTPException(
                status_code=503,
                detail="Materialized view refresh failed"
            )
    
    finally:
        await conn.close()

Refresh Strategies:

# 1. Blocking refresh (locks for reads)
await conn.execute('REFRESH MATERIALIZED VIEW user_stats')

# 2. Concurrent refresh (allows reads, requires UNIQUE index)
await conn.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats')

# 3. Scheduled refresh (cron job pattern)
from datetime import datetime, timedelta

async def refresh_if_stale(matview: str, max_age_minutes: int = 60):
    """
    Refresh matview if last refresh was > max_age_minutes ago.
    """
    # Store last refresh time in separate table
    last_refresh = await conn.fetchval(
        'SELECT last_refresh FROM matview_refresh_log WHERE matview_name = $1',
        matview
    )
    
    if not last_refresh or datetime.now() - last_refresh > timedelta(minutes=max_age_minutes):
        await conn.execute(f'REFRESH MATERIALIZED VIEW {matview}')
        await conn.execute(
            'INSERT INTO matview_refresh_log (matview_name, last_refresh) VALUES ($1, NOW()) '
            'ON CONFLICT (matview_name) DO UPDATE SET last_refresh = NOW()',
            matview
        )

pg_matviews Columns:

Column	Type	Description
`schemaname`	name	Schema containing matview
`matviewname`	name	Materialized view name
`matviewowner`	name	Owner of matview
`tablespace`	name	Tablespace (NULL = default)
`hasindexes`	boolean	Has indexes defined
`ispopulated`	boolean	Critical: false = no data
`definition`	text	SQL query definition

Best Practices:

✅ Always check ispopulated before querying production matviews
✅ Use pg_matviews for materialized views (not information_schema.views)
✅ Handle ObjectNotInPrerequisiteStateError for unpopulated matviews
✅ Implement refresh strategies (scheduled or on-demand)
✅ Use REFRESH MATERIALIZED VIEW CONCURRENTLY when possible (requires UNIQUE index)
❌ Don't assume matviews are always populated after creation
❌ Don't query matviews without existence checks in production

Version Note: pg_matviews available since PostgreSQL 9.3+, ispopulated column reliable since 9.4+

Sources

postgresql.org postgresql.org magicstack.github.io

99% confidence

How does asyncpg handle Python dict to PostgreSQL JSONB column type conversion?

Asyncpg automatically converts Python dictionaries and lists to PostgreSQL JSONB (and JSON) types without requiring manual JSON serialization. This is built into asyncpg's type codec system as of version 0.18.0+.

Automatic Conversion:

import asyncpg

# Python dict automatically converts to JSONB
conn = await asyncpg.connect(
    host='localhost',
    database='mydb',
    user='postgres',
    password='password'
)

# INSERT: Python dict → PostgreSQL JSONB
user_data = {
    'name': 'John Doe',
    'age': 30,
    'email': '[email protected]',
    'preferences': {
        'theme': 'dark',
        'notifications': True
    }
}

await conn.execute(
    'INSERT INTO users (id, data) VALUES ($1, $2)',
    1,
    user_data  # ← Python dict, no json.dumps() needed!
)

# SELECT: PostgreSQL JSONB → Python dict
row = await conn.fetchrow('SELECT id, data FROM users WHERE id = $1', 1)

print(row['data'])  # ← Automatically a Python dict!
# Output: {'name': 'John Doe', 'age': 30, 'email': '[email protected]', ...}

print(type(row['data']))  # <class 'dict'>

Type Mappings:

Python Type	PostgreSQL Type	Automatic
`dict`	`jsonb` or `json`	✅ Yes
`list`	`jsonb[]` or `json[]`	✅ Yes
`None`	`NULL`	✅ Yes
`str`	`text` (not JSON)	✅ Yes

Complete Example:

import asyncpg
from typing import Dict, Any, List

# Table schema
"""
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    metadata JSONB,  -- Python dict
    tags JSONB,      -- Python list
    created_at TIMESTAMP DEFAULT NOW()
);
"""

async def create_product(name: str, metadata: Dict[str, Any], tags: List[str]):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # All JSON data auto-converted
        product_id = await conn.fetchval(
            """
            INSERT INTO products (name, metadata, tags)
            VALUES ($1, $2, $3)
            RETURNING id
            """,
            name,
            metadata,  # dict → jsonb (automatic)
            tags       # list → jsonb (automatic)
        )
        
        return product_id
    finally:
        await conn.close()

# Usage - no JSON serialization needed
product_id = await create_product(
    name='Laptop',
    metadata={
        'brand': 'Dell',
        'model': 'XPS 15',
        'specs': {
            'ram': '32GB',
            'cpu': 'Intel i9',
            'storage': '1TB SSD'
        },
        'price': 1999.99,
        'in_stock': True
    },
    tags=['electronics', 'computers', 'laptops']
)

print(f"Created product ID: {product_id}")

Querying JSONB Data:

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Query with JSONB operators
products = await conn.fetch(
    """
    SELECT id, name, metadata, tags
    FROM products
    WHERE metadata->>'brand' = $1  -- JSONB text extraction
      AND (metadata->>'price')::numeric < $2
      AND tags @> $3  -- JSONB containment
    """,
    'Dell',
    2000.00,
    ['laptops']  # ← Python list auto-converted for comparison
)

for product in products:
    print(f"Product: {product['name']}")
    print(f"Metadata: {product['metadata']}")  # Already a dict!
    print(f"Brand: {product['metadata']['brand']}")  # Direct dict access
    print(f"Tags: {product['tags']}")  # Already a list!

Updating JSONB Fields:

# Update entire JSONB column
new_metadata = {
    'brand': 'Dell',
    'model': 'XPS 17',  # Changed
    'specs': {
        'ram': '64GB',  # Changed
        'cpu': 'Intel i9',
        'storage': '2TB SSD'  # Changed
    }
}

await conn.execute(
    'UPDATE products SET metadata = $1 WHERE id = $2',
    new_metadata,  # ← Dict auto-converted
    product_id
)

# Update specific JSONB field using || operator
await conn.execute(
    """
    UPDATE products
    SET metadata = metadata || $1
    WHERE id = $2
    """,
    {'price': 1799.99},  # ← Merge this dict into existing JSONB
    product_id
)

# Update nested JSONB field using jsonb_set()
await conn.execute(
    """
    UPDATE products
    SET metadata = jsonb_set(
        metadata,
        '{specs,ram}',  -- Path to nested field
        $1              -- New value (as JSONB)
    )
    WHERE id = $2
    """,
    '"128GB"',  # ← Must be valid JSON string
    product_id
)

Edge Cases and Known Issues:

Based on asyncpg GitHub issues and documentation (2024), there are some edge cases:

# 1. None values in dicts (Issue #440)
data_with_none = {
    'name': 'Test',
    'value': None  # ← Works fine, becomes JSON null
}
await conn.execute(
    'INSERT INTO table (data) VALUES ($1)',
    data_with_none  # ✅ Works correctly
)

# 2. executemany with JSONB (requires nested lists)
data_list = [
    [1, {'name': 'Item 1'}],  # ← Each row is a list of column values
    [2, {'name': 'Item 2'}],
    [3, {'name': 'Item 3'}]
]

await conn.executemany(
    'INSERT INTO items (id, data) VALUES ($1, $2)',
    data_list  # ← List of lists
)

# 3. Raw JSON strings (if you have pre-serialized JSON)
import json

json_string = '{"key": "value"}'

# Option A: Parse to dict (recommended)
data = json.loads(json_string)
await conn.execute('INSERT INTO table (data) VALUES ($1)', data)

# Option B: Cast in SQL
await conn.execute(
    'INSERT INTO table (data) VALUES ($1::jsonb)',
    json_string
)

Type Hints for Clarity:

from typing import Dict, Any, List, Optional
import asyncpg

async def store_user_preferences(
    user_id: int,
    preferences: Dict[str, Any]
) -> None:
    """
    Store user preferences as JSONB.
    
    Args:
        user_id: User ID
        preferences: Preferences dict (auto-converted to JSONB)
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'UPDATE users SET preferences = $1 WHERE id = $2',
            preferences,  # Dict[str, Any] → JSONB
            user_id
        )
    finally:
        await conn.close()

async def get_user_preferences(user_id: int) -> Optional[Dict[str, Any]]:
    """
    Retrieve user preferences from JSONB column.
    
    Returns:
        Preferences dict (auto-converted from JSONB) or None
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        preferences = await conn.fetchval(
            'SELECT preferences FROM users WHERE id = $1',
            user_id
        )
        return preferences  # JSONB → Dict[str, Any] or None
    finally:
        await conn.close()

FastAPI Integration:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Dict, Any
import asyncpg

app = FastAPI()

class ProductMetadata(BaseModel):
    brand: str
    model: str
    specs: Dict[str, Any]
    price: float
    in_stock: bool

@app.post("/products")
async def create_product(
    name: str,
    metadata: ProductMetadata  # Pydantic validates structure
):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # metadata.dict() returns Python dict
        # asyncpg auto-converts to JSONB
        product_id = await conn.fetchval(
            'INSERT INTO products (name, metadata) VALUES ($1, $2) RETURNING id',
            name,
            metadata.dict()  # ← Dict auto-converted to JSONB
        )
        
        return {"product_id": product_id}
    finally:
        await conn.close()

@app.get("/products/{product_id}")
async def get_product(product_id: int):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        product = await conn.fetchrow(
            'SELECT name, metadata FROM products WHERE id = $1',
            product_id
        )
        
        if not product:
            raise HTTPException(status_code=404, detail="Product not found")
        
        # product['metadata'] is already a Python dict
        return {
            "name": product['name'],
            "metadata": product['metadata']  # ← Dict from JSONB
        }
    finally:
        await conn.close()

JSON vs JSONB:

# Both JSON and JSONB columns work the same with asyncpg

# JSONB (recommended): Binary format, faster queries, supports indexing
await conn.execute(
    'INSERT INTO table (data_jsonb) VALUES ($1)',
    {'key': 'value'}  # ← Auto-converted to JSONB
)

# JSON: Text format, preserves formatting/whitespace
await conn.execute(
    'INSERT INTO table (data_json) VALUES ($1)',
    {'key': 'value'}  # ← Auto-converted to JSON
)

# Asyncpg handles both identically from Python's perspective

Performance Note:

According to asyncpg documentation and benchmarks (2024), asyncpg's native type conversion is significantly faster than manual json.dumps()/json.loads() because it uses optimized C extensions.

Version Note: Automatic JSONB/JSON conversion available since asyncpg 0.18.0+, stable and reliable in 0.27.0+ (2023-2024)

Sources

magicstack.github.io github.com postgresql.org

99% confidence

What is the recommended hashing algorithm for storing API keys in PostgreSQL?

For PostgreSQL API key storage in 2024, hash in the application using Argon2id or bcrypt, not in the database. If hashing in PostgreSQL is required, use pgcrypto's crypt() with 'bf' (bcrypt), but application-level hashing is preferred for security and performance.

Recommended: Application-Level Hashing (Python)

import asyncpg
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
import secrets

# Initialize Argon2 hasher (most secure in 2024)
ph = PasswordHasher(
    time_cost=2,        # iterations
    memory_cost=65536,  # 64 MB
    parallelism=4,      # threads
    hash_len=32,        # output length
    salt_len=16         # salt length
)

async def create_api_key_argon2(user_id: int, name: str) -> str:
    """
    Create API key with Argon2id hashing (recommended 2024).
    """
    # Generate key
    key = f"sk_live_{secrets.token_urlsafe(32)}"
    
    # Hash with Argon2id
    key_hash = ph.hash(key)
    # Example hash: $argon2id$v=19$m=65536,t=2,p=4$...
    
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'INSERT INTO api_keys (user_id, name, key_hash) VALUES ($1, $2, $3)',
            user_id,
            name,
            key_hash  # Store Argon2 hash
        )
        return key  # Return plaintext (show once)
    finally:
        await conn.close()

async def validate_api_key_argon2(key: str) -> bool:
    """
    Validate API key against Argon2 hash.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Fetch hash from database
        key_hash = await conn.fetchval(
            'SELECT key_hash FROM api_keys WHERE is_active = true AND key_hash LIKE $1',
            '$argon2id$%'  # Filter Argon2 hashes for performance
        )
        
        if not key_hash:
            return False
        
        try:
            # Verify key against hash
            ph.verify(key_hash, key)
            
            # Optional: Check if hash needs rehashing (algorithm updated)
            if ph.check_needs_rehash(key_hash):
                new_hash = ph.hash(key)
                await conn.execute(
                    'UPDATE api_keys SET key_hash = $1 WHERE key_hash = $2',
                    new_hash,
                    key_hash
                )
            
            return True
            
        except VerifyMismatchError:
            return False
    finally:
        await conn.close()

Alternative: bcrypt (Still Secure)

import bcrypt
import asyncpg
import secrets

async def create_api_key_bcrypt(user_id: int, name: str) -> str:
    """
    Create API key with bcrypt hashing.
    """
    key = f"sk_live_{secrets.token_urlsafe(32)}"
    
    # Hash with bcrypt (cost=12 recommended for 2024)
    key_hash = bcrypt.hashpw(key.encode('utf-8'), bcrypt.gensalt(rounds=12))
    
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'INSERT INTO api_keys (user_id, name, key_hash) VALUES ($1, $2, $3)',
            user_id,
            name,
            key_hash.decode('utf-8')  # Store as string
        )
        return key
    finally:
        await conn.close()

async def validate_api_key_bcrypt(key: str) -> bool:
    """
    Validate API key against bcrypt hash.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        key_hash = await conn.fetchval(
            'SELECT key_hash FROM api_keys WHERE is_active = true LIMIT 1'
        )
        
        if not key_hash:
            return False
        
        return bcrypt.checkpw(key.encode('utf-8'), key_hash.encode('utf-8'))
    finally:
        await conn.close()

pgcrypto (Database-Level, Less Recommended)

-- Enable pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Insert with bcrypt hash
INSERT INTO api_keys (user_id, name, key_hash)
VALUES (1, 'Production', crypt('sk_live_key123', gen_salt('bf', 12)));
-- Note: 'bf' = blowfish (bcrypt), 12 = cost factor

-- Validate key
SELECT EXISTS (
    SELECT 1
    FROM api_keys
    WHERE key_hash = crypt('sk_live_key123', key_hash)
      AND is_active = true
) AS is_valid;

Why Application-Level is Better:

According to 2024 security best practices:

More Modern Algorithms: Argon2id not available in pgcrypto
Security: Plaintext keys don't appear in database query logs
Performance: Distributes hashing load across application servers
Flexibility: Easier to upgrade hashing algorithm
Monitoring: Better control over hashing performance metrics

Algorithm Comparison (2024):

Algorithm	Security	Speed	Memory Hard	Recommendation
Argon2id	⭐⭐⭐⭐⭐	Moderate	✅ Yes	✅ Best choice
bcrypt	⭐⭐⭐⭐	Slow	❌ No (4KB)	✅ Good alternative
scrypt	⭐⭐⭐⭐	Slow	✅ Yes	⚠️ Complex tuning
PBKDF2	⭐⭐⭐	Fast	❌ No	❌ Avoid (GPU vulnerable)
SHA-256	⭐	Very Fast	❌ No	❌ Never use alone

FastAPI Integration:

from fastapi import FastAPI, Header, HTTPException
from argon2 import PasswordHasher
import asyncpg

app = FastAPI()
ph = PasswordHasher()

async def verify_api_key(x_api_key: str = Header()):
    """
    Dependency to verify API key.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Fetch all active key hashes (cache this in production)
        hashes = await conn.fetch(
            'SELECT id, key_hash FROM api_keys WHERE is_active = true'
        )
        
        for row in hashes:
            try:
                ph.verify(row['key_hash'], x_api_key)
                return row['id']  # Valid key
            except:
                continue
        
        raise HTTPException(status_code=401, detail="Invalid API key")
    finally:
        await conn.close()

@app.get("/protected")
async def protected_endpoint(key_id: int = Depends(verify_api_key)):
    return {"message": "Authenticated", "key_id": key_id}

Best Practices:

✅ Use Argon2id for new systems (most secure 2024)
✅ Use bcrypt if Argon2 not available (still secure)
✅ Hash in application, not database
✅ Use high cost factors (bcrypt 12+, Argon2 defaults)
✅ Never log plaintext keys
❌ Don't use SHA-256/SHA-512 alone (too fast, GPU vulnerable)
❌ Don't use MD5 (completely broken)

Installation:

# Argon2 (recommended)
pip install argon2-cffi

# bcrypt
pip install bcrypt

Version Note: Argon2id winner of Password Hashing Competition (2015), current best practice (2024). pgcrypto bcrypt available PostgreSQL 8.3+

Sources

securityboulevard.com postgresql.org cheatsheetseries.owasp.org

99% confidence

How does PostgreSQL handle schema validation when inserting into JSONB columns?

PostgreSQL does not perform automatic schema validation for JSONB columns. JSONB is schema-less by design - any valid JSON can be inserted. For validation, use CHECK constraints, triggers, or application-level validation.

No Automatic Validation:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB  -- No schema enforcement!
);

-- All of these succeed (any valid JSON)
INSERT INTO products (metadata) VALUES ('{"price": 19.99}');
INSERT INTO products (metadata) VALUES ('{"name": "Product"}');
INSERT INTO products (metadata) VALUES ('{"random": "data"}');
INSERT INTO products (metadata) VALUES ('{}');
INSERT INTO products (metadata) VALUES ('[]');

-- Only syntax errors fail
INSERT INTO products (metadata) VALUES ('{invalid json}');
-- ERROR: invalid input syntax for type json

Validation Method 1: CHECK Constraints

-- Require specific keys exist
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB,
    CONSTRAINT metadata_has_price CHECK (metadata ? 'price'),
    CONSTRAINT metadata_has_name CHECK (metadata ? 'name')
);

-- Require specific value types
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB,
    CONSTRAINT price_is_number CHECK (
        jsonb_typeof(metadata->'price') = 'number'
    ),
    CONSTRAINT name_is_string CHECK (
        jsonb_typeof(metadata->'name') = 'string'
    )
);

-- Complex validation
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    profile JSONB,
    CONSTRAINT valid_profile CHECK (
        profile ? 'email' AND
        profile ? 'name' AND
        jsonb_typeof(profile->'age') = 'number' AND
        (profile->>'age')::int BETWEEN 0 AND 150
    )
);

Validation Method 2: Triggers (Complex Validation)

-- Create validation function
CREATE OR REPLACE FUNCTION validate_product_metadata()
RETURNS TRIGGER AS $$
BEGIN
    -- Check required fields
    IF NOT (NEW.metadata ? 'name' AND NEW.metadata ? 'price') THEN
        RAISE EXCEPTION 'metadata must contain name and price';
    END IF;
    
    -- Check types
    IF jsonb_typeof(NEW.metadata->'price') != 'number' THEN
        RAISE EXCEPTION 'price must be a number';
    END IF;
    
    -- Check value ranges
    IF (NEW.metadata->>'price')::numeric <= 0 THEN
        RAISE EXCEPTION 'price must be positive';
    END IF;
    
    -- Check string length
    IF length(NEW.metadata->>'name') < 3 THEN
        RAISE EXCEPTION 'name must be at least 3 characters';
    END IF;
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Attach trigger
CREATE TRIGGER validate_product_trigger
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION validate_product_metadata();

-- Test
INSERT INTO products (metadata)
VALUES ('{"name": "Widget", "price": 19.99}');  -- ✅ Success

INSERT INTO products (metadata)
VALUES ('{"name": "X"}');  -- ❌ Error: price required

Validation Method 3: Application-Level (Recommended)

from pydantic import BaseModel, Field, validator
from typing import Dict, Any
import asyncpg
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Define schema with Pydantic
class ProductMetadata(BaseModel):
    name: str = Field(..., min_length=3, max_length=100)
    price: float = Field(..., gt=0)
    stock: int = Field(0, ge=0)
    description: str | None = None
    tags: list[str] = []
    
    @validator('price')
    def validate_price(cls, v):
        if v > 1000000:
            raise ValueError('price cannot exceed 1,000,000')
        return round(v, 2)  # Ensure 2 decimals

@app.post("/products")
async def create_product(metadata: ProductMetadata):
    """
    Pydantic validates before database insertion.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # metadata.dict() is already validated
        product_id = await conn.fetchval(
            'INSERT INTO products (metadata) VALUES ($1) RETURNING id',
            metadata.dict()  # Validated dict → JSONB
        )
        
        return {"id": product_id, "metadata": metadata}
    finally:
        await conn.close()

# Invalid request automatically rejected by FastAPI/Pydantic
# POST {"name": "X", "price": -10}
# Response: 422 Unprocessable Entity
# {
#   "detail": [
#     {"loc": ["body", "name"], "msg": "ensure this value has at least 3 characters"},
#     {"loc": ["body", "price"], "msg": "ensure this value is greater than 0"}
#   ]
# }

Comparison of Validation Methods:

Method	Pros	Cons	Use When
CHECK Constraints	Database-enforced, simple	Limited validation logic	Basic field requirements
Triggers	Complex validation, database-enforced	Performance impact, harder to maintain	Complex business rules in DB
Application	Full validation control, clear errors	Requires app-level enforcement	✅ Recommended for most cases

PostgreSQL 16+ JSON Schema Validation (Experimental):

-- PostgreSQL 16+ supports limited JSON Schema via extensions
-- Still experimental, not production-ready in 2024

CREATE EXTENSION IF NOT EXISTS pg_jsonschema;

CREATE TABLE validated_data (
    id SERIAL PRIMARY KEY,
    data JSONB,
    CONSTRAINT valid_schema CHECK (
        jsonschema_is_valid(
            '{"type": "object", "required": ["name", "age"], "properties": {"name": {"type": "string"}, "age": {"type": "number"}}}',
            data
        )
    )
);

Best Practices:

✅ Validate in application - Use Pydantic, JSON Schema libraries
✅ Use CHECK constraints for critical fields - Ensure required keys exist
✅ Document expected schema - Even if not enforced, document structure
✅ Use triggers for complex business logic - When database-level enforcement needed
❌ Don't assume JSONB validates schema - It doesn't!
❌ Don't rely only on application validation - Add database constraints for critical data

Version Note: JSONB available PostgreSQL 9.4+, jsonb_typeof() and ? operator since 9.4+, CHECK constraints all versions

Sources

postgresql.org postgresql.org docs.pydantic.dev

99% confidence

What columns are typically included in a PostgreSQL API keys table schema?

A secure API keys table schema in PostgreSQL should include columns for key identification, hashing, user association, expiration, usage tracking, and access control. The actual API key should never be stored in plaintext.

Recommended Schema:

CREATE TABLE api_keys (
    -- Primary key
    id SERIAL PRIMARY KEY,
    
    -- Key identification (shows first/last chars to user)
    key_prefix VARCHAR(8) NOT NULL,  -- e.g., 'sk_live_'
    key_suffix VARCHAR(4) NOT NULL,  -- Last 4 chars for identification
    
    -- Hashed key (NEVER store plaintext!)
    key_hash VARCHAR(255) NOT NULL UNIQUE,
    
    -- User/owner association
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    
    -- Descriptive info
    name VARCHAR(100),  -- e.g., 'Production API', 'Dev Environment'
    description TEXT,
    
    -- Access control
    scopes JSONB DEFAULT '[]'::jsonb,  -- e.g., ["read:users", "write:orders"]
    is_active BOOLEAN DEFAULT true,
    
    -- Expiration
    expires_at TIMESTAMP,  -- NULL = never expires
    
    -- Usage tracking
    last_used_at TIMESTAMP,
    last_used_ip INET,
    use_count INTEGER DEFAULT 0,
    
    -- Rate limiting metadata (optional)
    rate_limit_per_minute INTEGER DEFAULT 60,
    
    -- IP whitelist (optional)
    allowed_ips JSONB,  -- e.g., ["192.168.1.0/24", "10.0.0.1"]
    
    -- Audit timestamps
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    revoked_at TIMESTAMP,
    
    -- Indexes for performance
    CONSTRAINT chk_expires CHECK (expires_at IS NULL OR expires_at > created_at)
);

-- Indexes
CREATE UNIQUE INDEX idx_api_keys_hash ON api_keys(key_hash);
CREATE INDEX idx_api_keys_user_id ON api_keys(user_id);
CREATE INDEX idx_api_keys_active ON api_keys(is_active) WHERE is_active = true;
CREATE INDEX idx_api_keys_prefix ON api_keys(key_prefix);

Using with asyncpg:

import asyncpg
import secrets
import hashlib
from datetime import datetime, timedelta
from typing import Optional, List

async def create_api_key(
    user_id: int,
    name: str,
    scopes: List[str],
    expires_days: Optional[int] = None
) -> tuple[str, int]:
    """
    Create new API key. Returns (plaintext_key, key_id).
    Plaintext key shown ONCE, then hashed.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Generate secure random key
        key = f"sk_live_{secrets.token_urlsafe(32)}"
        
        # Hash key for storage (use bcrypt in production)
        key_hash = hashlib.sha256(key.encode()).hexdigest()
        
        # Extract prefix and suffix for display
        key_prefix = key[:8]  # 'sk_live_'
        key_suffix = key[-4:]  # Last 4 chars
        
        # Calculate expiration
        expires_at = (
            datetime.now() + timedelta(days=expires_days)
            if expires_days
            else None
        )
        
        # Insert into database
        key_id = await conn.fetchval(
            """
            INSERT INTO api_keys (
                key_prefix, key_suffix, key_hash, user_id,
                name, scopes, expires_at
            )
            VALUES ($1, $2, $3, $4, $5, $6, $7)
            RETURNING id
            """,
            key_prefix,
            key_suffix,
            key_hash,
            user_id,
            name,
            scopes,  # asyncpg converts list to JSONB
            expires_at
        )
        
        # Return plaintext key (show to user ONCE)
        return (key, key_id)
        
    finally:
        await conn.close()

# Usage
key, key_id = await create_api_key(
    user_id=1,
    name="Production API",
    scopes=["read:users", "write:orders"],
    expires_days=365
)

print(f"API Key (save this, won't be shown again): {key}")
print(f"Key ID: {key_id}")

Key Validation:

async def validate_api_key(key: str) -> Optional[dict]:
    """
    Validate API key and return key details if valid.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Hash provided key
        key_hash = hashlib.sha256(key.encode()).hexdigest()
        
        # Look up key
        key_data = await conn.fetchrow(
            """
            SELECT id, user_id, scopes, is_active, expires_at,
                   rate_limit_per_minute, allowed_ips
            FROM api_keys
            WHERE key_hash = $1
              AND is_active = true
              AND (expires_at IS NULL OR expires_at > NOW())
            """,
            key_hash
        )
        
        if not key_data:
            return None
        
        # Update last used
        await conn.execute(
            """
            UPDATE api_keys
            SET last_used_at = NOW(),
                use_count = use_count + 1
            WHERE id = $1
            """,
            key_data['id']
        )
        
        return dict(key_data)
        
    finally:
        await conn.close()

Security Best Practices (2024):

Based on PostgreSQL security best practices and OWASP guidelines:

Never store plaintext keys - Hash with bcrypt, Argon2, or scrypt
Use pgcrypto extension for hashing if hashing in database
Implement rate limiting - Track usage and enforce limits
Support key rotation - Allow users to create new keys and revoke old ones
Log key usage - Track for security auditing
Enforce expiration - Keys should have expiration dates
Scope-based permissions - Limit what each key can do

Version Note: Schema patterns apply to PostgreSQL 12+, JSONB recommended for scopes/metadata

Sources

enterprisedb.com cheatsheetseries.owasp.org postgresql.org

99% confidence

How does PostgreSQL's information_schema.tables handle schema-qualified table queries?

Query Pattern:

-- Query tables in a specific schema
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'my_schema'
  AND table_type = 'BASE TABLE';  -- Excludes views

-- Example: List all user tables (exclude system schemas)
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema', 'pg_catalog')
  AND table_type = 'BASE TABLE'
ORDER BY table_schema, table_name;

-- Check if specific table exists in schema
SELECT EXISTS (
    SELECT 1
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_name = 'users'
) AS table_exists;

Key Columns:

Column	Description	Values
`table_catalog`	Database name	Current database
`table_schema`	Schema name	`public`, `my_schema`, etc.
`table_name`	Table name	`users`, `orders`, etc.
`table_type`	Object type	`BASE TABLE`, `VIEW`, `FOREIGN`

Using asyncpg:

import asyncpg

async def check_table_exists(schema: str, table: str) -> bool:
    """
    Check if table exists in specified schema.
    """
    conn = await asyncpg.connect(
        host='localhost',
        port=5432,
        user='postgres',
        password='password',
        database='mydb'
    )
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM information_schema.tables
                WHERE table_schema = $1
                  AND table_name = $2
                  AND table_type = 'BASE TABLE'
            )
            """,
            schema,
            table
        )
        return exists
    finally:
        await conn.close()

# Usage
exists = await check_table_exists('public', 'users')
print(f"Table exists: {exists}")

Common Patterns:

-- 1. List all tables in database (excluding system)
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
  AND table_type = 'BASE TABLE';

-- 2. List tables with column count
SELECT 
    t.table_schema,
    t.table_name,
    COUNT(c.column_name) AS column_count
FROM information_schema.tables t
LEFT JOIN information_schema.columns c
    ON t.table_schema = c.table_schema
    AND t.table_name = c.table_name
WHERE t.table_schema = 'public'
  AND t.table_type = 'BASE TABLE'
GROUP BY t.table_schema, t.table_name;

-- 3. Search for tables by name pattern
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_name LIKE '%user%'
  AND table_type = 'BASE TABLE';

Best Practices:

Always filter by table_schema - Improves performance and accuracy
Use table_type = 'BASE TABLE' - Excludes views unless specifically needed
Avoid SELECT * - Select only needed columns for performance
Use EXISTS for existence checks - More efficient than COUNT(*)
Exclude system schemas - pg_catalog, information_schema for user tables

information_schema vs pg_catalog:

-- information_schema: Standard, portable across databases
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';

-- pg_catalog: PostgreSQL-specific, more detailed info
SELECT tablename
FROM pg_tables
WHERE schemaname = 'public';

-- pg_catalog also provides size info (not in information_schema)
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public';

Performance Considerations:

Complete FastAPI Example:

from fastapi import FastAPI, HTTPException
import asyncpg
from typing import List, Dict

app = FastAPI()

DATABASE_URL = "postgresql://user:pass@localhost/mydb"

@app.get("/tables/{schema}")
async def list_tables(schema: str) -> List[Dict[str, str]]:
    """
    List all tables in specified schema.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        tables = await conn.fetch(
            """
            SELECT table_schema, table_name, table_type
            FROM information_schema.tables
            WHERE table_schema = $1
              AND table_type = 'BASE TABLE'
            ORDER BY table_name
            """,
            schema
        )
        
        return [
            {
                "schema": row['table_schema'],
                "name": row['table_name'],
                "type": row['table_type']
            }
            for row in tables
        ]
    finally:
        await conn.close()

@app.get("/tables/{schema}/{table}/exists")
async def table_exists(schema: str, table: str) -> Dict[str, bool]:
    """
    Check if table exists in schema.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM information_schema.tables
                WHERE table_schema = $1
                  AND table_name = $2
            )
            """,
            schema,
            table
        )
        
        return {"exists": exists}
    finally:
        await conn.close()

Security Note:

The information_schema is read-only and respects user permissions. Users can only see tables they have privileges to access.

Version Note: information_schema behavior consistent since PostgreSQL 9.1+, recommended for PostgreSQL 12+

Sources

postgresql.org beekeeperstudio.io magicstack.github.io

99% confidence

How do you verify if a PostgreSQL materialized view exists before querying it?

Recommended Pattern:

-- Check if materialized view exists
SELECT EXISTS (
    SELECT 1
    FROM pg_matviews
    WHERE schemaname = 'public'
      AND matviewname = 'my_matview'
) AS matview_exists;

-- Get materialized view details
SELECT 
    schemaname,
    matviewname,
    matviewowner,
    tablespace,
    hasindexes,
    ispopulated,  -- Important: false means not refreshed yet
    definition
FROM pg_matviews
WHERE schemaname = 'public'
  AND matviewname = 'my_matview';

Using asyncpg:

import asyncpg
from typing import Optional, Dict

async def matview_exists(schema: str, matview: str) -> bool:
    """
    Check if materialized view exists in specified schema.
    """
    conn = await asyncpg.connect(
        host='localhost',
        database='mydb',
        user='postgres',
        password='password'
    )
    
    try:
        exists = await conn.fetchval(
            """
            SELECT EXISTS (
                SELECT 1
                FROM pg_matviews
                WHERE schemaname = $1
                  AND matviewname = $2
            )
            """,
            schema,
            matview
        )
        return exists
    finally:
        await conn.close()

# Usage
exists = await matview_exists('public', 'user_stats')
if not exists:
    raise ValueError("Materialized view 'user_stats' does not exist")

Check if Populated (Critical for Queries):

async def matview_is_populated(schema: str, matview: str) -> bool:
    """
    Check if materialized view exists AND is populated.
    Unpopulated matviews will error when queried.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        ispopulated = await conn.fetchval(
            """
            SELECT ispopulated
            FROM pg_matviews
            WHERE schemaname = $1
              AND matviewname = $2
            """,
            schema,
            matview
        )
        
        # ispopulated is None if matview doesn't exist
        # False if exists but not refreshed
        # True if exists and has data
        return ispopulated is True
        
    finally:
        await conn.close()

# Usage
if not await matview_is_populated('public', 'user_stats'):
    print("Materialized view not populated, refreshing...")
    await conn.execute('REFRESH MATERIALIZED VIEW public.user_stats')

Complete Safe Query Pattern:

from fastapi import FastAPI, HTTPException
import asyncpg
from typing import List, Dict, Any

app = FastAPI()

DATABASE_URL = "postgresql://user:pass@localhost/mydb"

async def safe_query_matview(
    schema: str,
    matview: str,
    query: str,
    *args
) -> List[Dict[str, Any]]:
    """
    Safely query materialized view with existence and population checks.
    """
    conn = await asyncpg.connect(DATABASE_URL)
    
    try:
        # Step 1: Check if matview exists and is populated
        matview_info = await conn.fetchrow(
            """
            SELECT ispopulated, definition
            FROM pg_matviews
            WHERE schemaname = $1
              AND matviewname = $2
            """,
            schema,
            matview
        )
        
        if not matview_info:
            raise HTTPException(
                status_code=404,
                detail=f"Materialized view '{schema}.{matview}' does not exist"
            )
        
        if not matview_info['ispopulated']:
            # Option 1: Auto-refresh (may be slow)
            await conn.execute(
                f'REFRESH MATERIALIZED VIEW {schema}.{matview}'
            )
            
            # Option 2: Return error (let client decide)
            # raise HTTPException(
            #     status_code=503,
            #     detail=f"Materialized view '{schema}.{matview}' not populated"
            # )
        
        # Step 2: Query the matview
        rows = await conn.fetch(query, *args)
        
        return [dict(row) for row in rows]
        
    finally:
        await conn.close()

@app.get("/stats/users")
async def get_user_stats():
    """
    Query user_stats materialized view with safety checks.
    """
    results = await safe_query_matview(
        schema='public',
        matview='user_stats',
        query='SELECT * FROM public.user_stats ORDER BY user_count DESC LIMIT 100'
    )
    return {"stats": results}

Alternative: Using pg_class

-- pg_class can also detect materialized views
SELECT EXISTS (
    SELECT 1
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'public'
      AND c.relname = 'my_matview'
      AND c.relkind = 'm'  -- 'm' = materialized view
) AS matview_exists;

-- Get more details from pg_class
SELECT 
    n.nspname AS schema,
    c.relname AS matview,
    c.relkind,  -- 'm' for materialized view
    pg_size_pretty(pg_total_relation_size(c.oid)) AS size,
    c.reltuples::bigint AS estimated_rows
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname = 'my_matview'
  AND c.relkind = 'm';

Error Handling for Missing/Unpopulated Matview:

import asyncpg
import logging

logger = logging.getLogger(__name__)

async def query_matview_safe(matview_name: str):
    """
    Query matview with proper error handling.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Try to query directly
        results = await conn.fetch(f'SELECT * FROM {matview_name}')
        return [dict(row) for row in results]
        
    except asyncpg.UndefinedTableError:
        # Matview doesn't exist
        logger.error(f"Materialized view '{matview_name}' does not exist")
        raise HTTPException(
            status_code=404,
            detail=f"Materialized view not found: {matview_name}"
        )
        
    except asyncpg.ObjectNotInPrerequisiteStateError:
        # Matview exists but not populated
        logger.warning(
            f"Materialized view '{matview_name}' not populated, refreshing..."
        )
        
        try:
            await conn.execute(f'REFRESH MATERIALIZED VIEW {matview_name}')
            results = await conn.fetch(f'SELECT * FROM {matview_name}')
            return [dict(row) for row in results]
        except Exception as e:
            logger.exception(f"Failed to refresh matview: {e}")
            raise HTTPException(
                status_code=503,
                detail="Materialized view refresh failed"
            )
    
    finally:
        await conn.close()

Refresh Strategies:

# 1. Blocking refresh (locks for reads)
await conn.execute('REFRESH MATERIALIZED VIEW user_stats')

# 2. Concurrent refresh (allows reads, requires UNIQUE index)
await conn.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats')

# 3. Scheduled refresh (cron job pattern)
from datetime import datetime, timedelta

async def refresh_if_stale(matview: str, max_age_minutes: int = 60):
    """
    Refresh matview if last refresh was > max_age_minutes ago.
    """
    # Store last refresh time in separate table
    last_refresh = await conn.fetchval(
        'SELECT last_refresh FROM matview_refresh_log WHERE matview_name = $1',
        matview
    )
    
    if not last_refresh or datetime.now() - last_refresh > timedelta(minutes=max_age_minutes):
        await conn.execute(f'REFRESH MATERIALIZED VIEW {matview}')
        await conn.execute(
            'INSERT INTO matview_refresh_log (matview_name, last_refresh) VALUES ($1, NOW()) '
            'ON CONFLICT (matview_name) DO UPDATE SET last_refresh = NOW()',
            matview
        )

pg_matviews Columns:

Column	Type	Description
`schemaname`	name	Schema containing matview
`matviewname`	name	Materialized view name
`matviewowner`	name	Owner of matview
`tablespace`	name	Tablespace (NULL = default)
`hasindexes`	boolean	Has indexes defined
`ispopulated`	boolean	Critical: false = no data
`definition`	text	SQL query definition

Best Practices:

✅ Always check ispopulated before querying production matviews
✅ Use pg_matviews for materialized views (not information_schema.views)
✅ Handle ObjectNotInPrerequisiteStateError for unpopulated matviews
✅ Implement refresh strategies (scheduled or on-demand)
✅ Use REFRESH MATERIALIZED VIEW CONCURRENTLY when possible (requires UNIQUE index)
❌ Don't assume matviews are always populated after creation
❌ Don't query matviews without existence checks in production

Version Note: pg_matviews available since PostgreSQL 9.3+, ispopulated column reliable since 9.4+

Sources

postgresql.org postgresql.org magicstack.github.io

99% confidence

How does asyncpg handle Python dict to PostgreSQL JSONB column type conversion?

Automatic Conversion:

import asyncpg

# Python dict automatically converts to JSONB
conn = await asyncpg.connect(
    host='localhost',
    database='mydb',
    user='postgres',
    password='password'
)

# INSERT: Python dict → PostgreSQL JSONB
user_data = {
    'name': 'John Doe',
    'age': 30,
    'email': '[email protected]',
    'preferences': {
        'theme': 'dark',
        'notifications': True
    }
}

await conn.execute(
    'INSERT INTO users (id, data) VALUES ($1, $2)',
    1,
    user_data  # ← Python dict, no json.dumps() needed!
)

# SELECT: PostgreSQL JSONB → Python dict
row = await conn.fetchrow('SELECT id, data FROM users WHERE id = $1', 1)

print(row['data'])  # ← Automatically a Python dict!
# Output: {'name': 'John Doe', 'age': 30, 'email': '[email protected]', ...}

print(type(row['data']))  # <class 'dict'>

Type Mappings:

Python Type	PostgreSQL Type	Automatic
`dict`	`jsonb` or `json`	✅ Yes
`list`	`jsonb[]` or `json[]`	✅ Yes
`None`	`NULL`	✅ Yes
`str`	`text` (not JSON)	✅ Yes

Complete Example:

import asyncpg
from typing import Dict, Any, List

# Table schema
"""
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    metadata JSONB,  -- Python dict
    tags JSONB,      -- Python list
    created_at TIMESTAMP DEFAULT NOW()
);
"""

async def create_product(name: str, metadata: Dict[str, Any], tags: List[str]):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # All JSON data auto-converted
        product_id = await conn.fetchval(
            """
            INSERT INTO products (name, metadata, tags)
            VALUES ($1, $2, $3)
            RETURNING id
            """,
            name,
            metadata,  # dict → jsonb (automatic)
            tags       # list → jsonb (automatic)
        )
        
        return product_id
    finally:
        await conn.close()

# Usage - no JSON serialization needed
product_id = await create_product(
    name='Laptop',
    metadata={
        'brand': 'Dell',
        'model': 'XPS 15',
        'specs': {
            'ram': '32GB',
            'cpu': 'Intel i9',
            'storage': '1TB SSD'
        },
        'price': 1999.99,
        'in_stock': True
    },
    tags=['electronics', 'computers', 'laptops']
)

print(f"Created product ID: {product_id}")

Querying JSONB Data:

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Query with JSONB operators
products = await conn.fetch(
    """
    SELECT id, name, metadata, tags
    FROM products
    WHERE metadata->>'brand' = $1  -- JSONB text extraction
      AND (metadata->>'price')::numeric < $2
      AND tags @> $3  -- JSONB containment
    """,
    'Dell',
    2000.00,
    ['laptops']  # ← Python list auto-converted for comparison
)

for product in products:
    print(f"Product: {product['name']}")
    print(f"Metadata: {product['metadata']}")  # Already a dict!
    print(f"Brand: {product['metadata']['brand']}")  # Direct dict access
    print(f"Tags: {product['tags']}")  # Already a list!

Updating JSONB Fields:

# Update entire JSONB column
new_metadata = {
    'brand': 'Dell',
    'model': 'XPS 17',  # Changed
    'specs': {
        'ram': '64GB',  # Changed
        'cpu': 'Intel i9',
        'storage': '2TB SSD'  # Changed
    }
}

await conn.execute(
    'UPDATE products SET metadata = $1 WHERE id = $2',
    new_metadata,  # ← Dict auto-converted
    product_id
)

# Update specific JSONB field using || operator
await conn.execute(
    """
    UPDATE products
    SET metadata = metadata || $1
    WHERE id = $2
    """,
    {'price': 1799.99},  # ← Merge this dict into existing JSONB
    product_id
)

# Update nested JSONB field using jsonb_set()
await conn.execute(
    """
    UPDATE products
    SET metadata = jsonb_set(
        metadata,
        '{specs,ram}',  -- Path to nested field
        $1              -- New value (as JSONB)
    )
    WHERE id = $2
    """,
    '"128GB"',  # ← Must be valid JSON string
    product_id
)

Edge Cases and Known Issues:

Based on asyncpg GitHub issues and documentation (2024), there are some edge cases:

# 1. None values in dicts (Issue #440)
data_with_none = {
    'name': 'Test',
    'value': None  # ← Works fine, becomes JSON null
}
await conn.execute(
    'INSERT INTO table (data) VALUES ($1)',
    data_with_none  # ✅ Works correctly
)

# 2. executemany with JSONB (requires nested lists)
data_list = [
    [1, {'name': 'Item 1'}],  # ← Each row is a list of column values
    [2, {'name': 'Item 2'}],
    [3, {'name': 'Item 3'}]
]

await conn.executemany(
    'INSERT INTO items (id, data) VALUES ($1, $2)',
    data_list  # ← List of lists
)

# 3. Raw JSON strings (if you have pre-serialized JSON)
import json

json_string = '{"key": "value"}'

# Option A: Parse to dict (recommended)
data = json.loads(json_string)
await conn.execute('INSERT INTO table (data) VALUES ($1)', data)

# Option B: Cast in SQL
await conn.execute(
    'INSERT INTO table (data) VALUES ($1::jsonb)',
    json_string
)

Type Hints for Clarity:

from typing import Dict, Any, List, Optional
import asyncpg

async def store_user_preferences(
    user_id: int,
    preferences: Dict[str, Any]
) -> None:
    """
    Store user preferences as JSONB.
    
    Args:
        user_id: User ID
        preferences: Preferences dict (auto-converted to JSONB)
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'UPDATE users SET preferences = $1 WHERE id = $2',
            preferences,  # Dict[str, Any] → JSONB
            user_id
        )
    finally:
        await conn.close()

async def get_user_preferences(user_id: int) -> Optional[Dict[str, Any]]:
    """
    Retrieve user preferences from JSONB column.
    
    Returns:
        Preferences dict (auto-converted from JSONB) or None
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        preferences = await conn.fetchval(
            'SELECT preferences FROM users WHERE id = $1',
            user_id
        )
        return preferences  # JSONB → Dict[str, Any] or None
    finally:
        await conn.close()

FastAPI Integration:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Dict, Any
import asyncpg

app = FastAPI()

class ProductMetadata(BaseModel):
    brand: str
    model: str
    specs: Dict[str, Any]
    price: float
    in_stock: bool

@app.post("/products")
async def create_product(
    name: str,
    metadata: ProductMetadata  # Pydantic validates structure
):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # metadata.dict() returns Python dict
        # asyncpg auto-converts to JSONB
        product_id = await conn.fetchval(
            'INSERT INTO products (name, metadata) VALUES ($1, $2) RETURNING id',
            name,
            metadata.dict()  # ← Dict auto-converted to JSONB
        )
        
        return {"product_id": product_id}
    finally:
        await conn.close()

@app.get("/products/{product_id}")
async def get_product(product_id: int):
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        product = await conn.fetchrow(
            'SELECT name, metadata FROM products WHERE id = $1',
            product_id
        )
        
        if not product:
            raise HTTPException(status_code=404, detail="Product not found")
        
        # product['metadata'] is already a Python dict
        return {
            "name": product['name'],
            "metadata": product['metadata']  # ← Dict from JSONB
        }
    finally:
        await conn.close()

JSON vs JSONB:

# Both JSON and JSONB columns work the same with asyncpg

# JSONB (recommended): Binary format, faster queries, supports indexing
await conn.execute(
    'INSERT INTO table (data_jsonb) VALUES ($1)',
    {'key': 'value'}  # ← Auto-converted to JSONB
)

# JSON: Text format, preserves formatting/whitespace
await conn.execute(
    'INSERT INTO table (data_json) VALUES ($1)',
    {'key': 'value'}  # ← Auto-converted to JSON
)

# Asyncpg handles both identically from Python's perspective

Performance Note:

According to asyncpg documentation and benchmarks (2024), asyncpg's native type conversion is significantly faster than manual json.dumps()/json.loads() because it uses optimized C extensions.

Version Note: Automatic JSONB/JSON conversion available since asyncpg 0.18.0+, stable and reliable in 0.27.0+ (2023-2024)

Sources

magicstack.github.io github.com postgresql.org

99% confidence

What is the recommended hashing algorithm for storing API keys in PostgreSQL?

Recommended: Application-Level Hashing (Python)

import asyncpg
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
import secrets

# Initialize Argon2 hasher (most secure in 2024)
ph = PasswordHasher(
    time_cost=2,        # iterations
    memory_cost=65536,  # 64 MB
    parallelism=4,      # threads
    hash_len=32,        # output length
    salt_len=16         # salt length
)

async def create_api_key_argon2(user_id: int, name: str) -> str:
    """
    Create API key with Argon2id hashing (recommended 2024).
    """
    # Generate key
    key = f"sk_live_{secrets.token_urlsafe(32)}"
    
    # Hash with Argon2id
    key_hash = ph.hash(key)
    # Example hash: $argon2id$v=19$m=65536,t=2,p=4$...
    
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'INSERT INTO api_keys (user_id, name, key_hash) VALUES ($1, $2, $3)',
            user_id,
            name,
            key_hash  # Store Argon2 hash
        )
        return key  # Return plaintext (show once)
    finally:
        await conn.close()

async def validate_api_key_argon2(key: str) -> bool:
    """
    Validate API key against Argon2 hash.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Fetch hash from database
        key_hash = await conn.fetchval(
            'SELECT key_hash FROM api_keys WHERE is_active = true AND key_hash LIKE $1',
            '$argon2id$%'  # Filter Argon2 hashes for performance
        )
        
        if not key_hash:
            return False
        
        try:
            # Verify key against hash
            ph.verify(key_hash, key)
            
            # Optional: Check if hash needs rehashing (algorithm updated)
            if ph.check_needs_rehash(key_hash):
                new_hash = ph.hash(key)
                await conn.execute(
                    'UPDATE api_keys SET key_hash = $1 WHERE key_hash = $2',
                    new_hash,
                    key_hash
                )
            
            return True
            
        except VerifyMismatchError:
            return False
    finally:
        await conn.close()

Alternative: bcrypt (Still Secure)

import bcrypt
import asyncpg
import secrets

async def create_api_key_bcrypt(user_id: int, name: str) -> str:
    """
    Create API key with bcrypt hashing.
    """
    key = f"sk_live_{secrets.token_urlsafe(32)}"
    
    # Hash with bcrypt (cost=12 recommended for 2024)
    key_hash = bcrypt.hashpw(key.encode('utf-8'), bcrypt.gensalt(rounds=12))
    
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        await conn.execute(
            'INSERT INTO api_keys (user_id, name, key_hash) VALUES ($1, $2, $3)',
            user_id,
            name,
            key_hash.decode('utf-8')  # Store as string
        )
        return key
    finally:
        await conn.close()

async def validate_api_key_bcrypt(key: str) -> bool:
    """
    Validate API key against bcrypt hash.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        key_hash = await conn.fetchval(
            'SELECT key_hash FROM api_keys WHERE is_active = true LIMIT 1'
        )
        
        if not key_hash:
            return False
        
        return bcrypt.checkpw(key.encode('utf-8'), key_hash.encode('utf-8'))
    finally:
        await conn.close()

pgcrypto (Database-Level, Less Recommended)

-- Enable pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Insert with bcrypt hash
INSERT INTO api_keys (user_id, name, key_hash)
VALUES (1, 'Production', crypt('sk_live_key123', gen_salt('bf', 12)));
-- Note: 'bf' = blowfish (bcrypt), 12 = cost factor

-- Validate key
SELECT EXISTS (
    SELECT 1
    FROM api_keys
    WHERE key_hash = crypt('sk_live_key123', key_hash)
      AND is_active = true
) AS is_valid;

Why Application-Level is Better:

According to 2024 security best practices:

More Modern Algorithms: Argon2id not available in pgcrypto
Security: Plaintext keys don't appear in database query logs
Performance: Distributes hashing load across application servers
Flexibility: Easier to upgrade hashing algorithm
Monitoring: Better control over hashing performance metrics

Algorithm Comparison (2024):

Algorithm	Security	Speed	Memory Hard	Recommendation
Argon2id	⭐⭐⭐⭐⭐	Moderate	✅ Yes	✅ Best choice
bcrypt	⭐⭐⭐⭐	Slow	❌ No (4KB)	✅ Good alternative
scrypt	⭐⭐⭐⭐	Slow	✅ Yes	⚠️ Complex tuning
PBKDF2	⭐⭐⭐	Fast	❌ No	❌ Avoid (GPU vulnerable)
SHA-256	⭐	Very Fast	❌ No	❌ Never use alone

FastAPI Integration:

from fastapi import FastAPI, Header, HTTPException
from argon2 import PasswordHasher
import asyncpg

app = FastAPI()
ph = PasswordHasher()

async def verify_api_key(x_api_key: str = Header()):
    """
    Dependency to verify API key.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Fetch all active key hashes (cache this in production)
        hashes = await conn.fetch(
            'SELECT id, key_hash FROM api_keys WHERE is_active = true'
        )
        
        for row in hashes:
            try:
                ph.verify(row['key_hash'], x_api_key)
                return row['id']  # Valid key
            except:
                continue
        
        raise HTTPException(status_code=401, detail="Invalid API key")
    finally:
        await conn.close()

@app.get("/protected")
async def protected_endpoint(key_id: int = Depends(verify_api_key)):
    return {"message": "Authenticated", "key_id": key_id}

Best Practices:

✅ Use Argon2id for new systems (most secure 2024)
✅ Use bcrypt if Argon2 not available (still secure)
✅ Hash in application, not database
✅ Use high cost factors (bcrypt 12+, Argon2 defaults)
✅ Never log plaintext keys
❌ Don't use SHA-256/SHA-512 alone (too fast, GPU vulnerable)
❌ Don't use MD5 (completely broken)

Installation:

# Argon2 (recommended)
pip install argon2-cffi

# bcrypt
pip install bcrypt

Version Note: Argon2id winner of Password Hashing Competition (2015), current best practice (2024). pgcrypto bcrypt available PostgreSQL 8.3+

Sources

securityboulevard.com postgresql.org cheatsheetseries.owasp.org

99% confidence

How does PostgreSQL handle schema validation when inserting into JSONB columns?

No Automatic Validation:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB  -- No schema enforcement!
);

-- All of these succeed (any valid JSON)
INSERT INTO products (metadata) VALUES ('{"price": 19.99}');
INSERT INTO products (metadata) VALUES ('{"name": "Product"}');
INSERT INTO products (metadata) VALUES ('{"random": "data"}');
INSERT INTO products (metadata) VALUES ('{}');
INSERT INTO products (metadata) VALUES ('[]');

-- Only syntax errors fail
INSERT INTO products (metadata) VALUES ('{invalid json}');
-- ERROR: invalid input syntax for type json

Validation Method 1: CHECK Constraints

-- Require specific keys exist
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB,
    CONSTRAINT metadata_has_price CHECK (metadata ? 'price'),
    CONSTRAINT metadata_has_name CHECK (metadata ? 'name')
);

-- Require specific value types
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    metadata JSONB,
    CONSTRAINT price_is_number CHECK (
        jsonb_typeof(metadata->'price') = 'number'
    ),
    CONSTRAINT name_is_string CHECK (
        jsonb_typeof(metadata->'name') = 'string'
    )
);

-- Complex validation
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    profile JSONB,
    CONSTRAINT valid_profile CHECK (
        profile ? 'email' AND
        profile ? 'name' AND
        jsonb_typeof(profile->'age') = 'number' AND
        (profile->>'age')::int BETWEEN 0 AND 150
    )
);

Validation Method 2: Triggers (Complex Validation)

-- Create validation function
CREATE OR REPLACE FUNCTION validate_product_metadata()
RETURNS TRIGGER AS $$
BEGIN
    -- Check required fields
    IF NOT (NEW.metadata ? 'name' AND NEW.metadata ? 'price') THEN
        RAISE EXCEPTION 'metadata must contain name and price';
    END IF;
    
    -- Check types
    IF jsonb_typeof(NEW.metadata->'price') != 'number' THEN
        RAISE EXCEPTION 'price must be a number';
    END IF;
    
    -- Check value ranges
    IF (NEW.metadata->>'price')::numeric <= 0 THEN
        RAISE EXCEPTION 'price must be positive';
    END IF;
    
    -- Check string length
    IF length(NEW.metadata->>'name') < 3 THEN
        RAISE EXCEPTION 'name must be at least 3 characters';
    END IF;
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Attach trigger
CREATE TRIGGER validate_product_trigger
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION validate_product_metadata();

-- Test
INSERT INTO products (metadata)
VALUES ('{"name": "Widget", "price": 19.99}');  -- ✅ Success

INSERT INTO products (metadata)
VALUES ('{"name": "X"}');  -- ❌ Error: price required

Validation Method 3: Application-Level (Recommended)

from pydantic import BaseModel, Field, validator
from typing import Dict, Any
import asyncpg
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Define schema with Pydantic
class ProductMetadata(BaseModel):
    name: str = Field(..., min_length=3, max_length=100)
    price: float = Field(..., gt=0)
    stock: int = Field(0, ge=0)
    description: str | None = None
    tags: list[str] = []
    
    @validator('price')
    def validate_price(cls, v):
        if v > 1000000:
            raise ValueError('price cannot exceed 1,000,000')
        return round(v, 2)  # Ensure 2 decimals

@app.post("/products")
async def create_product(metadata: ProductMetadata):
    """
    Pydantic validates before database insertion.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # metadata.dict() is already validated
        product_id = await conn.fetchval(
            'INSERT INTO products (metadata) VALUES ($1) RETURNING id',
            metadata.dict()  # Validated dict → JSONB
        )
        
        return {"id": product_id, "metadata": metadata}
    finally:
        await conn.close()

# Invalid request automatically rejected by FastAPI/Pydantic
# POST {"name": "X", "price": -10}
# Response: 422 Unprocessable Entity
# {
#   "detail": [
#     {"loc": ["body", "name"], "msg": "ensure this value has at least 3 characters"},
#     {"loc": ["body", "price"], "msg": "ensure this value is greater than 0"}
#   ]
# }

Comparison of Validation Methods:

Method	Pros	Cons	Use When
CHECK Constraints	Database-enforced, simple	Limited validation logic	Basic field requirements
Triggers	Complex validation, database-enforced	Performance impact, harder to maintain	Complex business rules in DB
Application	Full validation control, clear errors	Requires app-level enforcement	✅ Recommended for most cases

PostgreSQL 16+ JSON Schema Validation (Experimental):

-- PostgreSQL 16+ supports limited JSON Schema via extensions
-- Still experimental, not production-ready in 2024

CREATE EXTENSION IF NOT EXISTS pg_jsonschema;

CREATE TABLE validated_data (
    id SERIAL PRIMARY KEY,
    data JSONB,
    CONSTRAINT valid_schema CHECK (
        jsonschema_is_valid(
            '{"type": "object", "required": ["name", "age"], "properties": {"name": {"type": "string"}, "age": {"type": "number"}}}',
            data
        )
    )
);

Best Practices:

✅ Validate in application - Use Pydantic, JSON Schema libraries
✅ Use CHECK constraints for critical fields - Ensure required keys exist
✅ Document expected schema - Even if not enforced, document structure
✅ Use triggers for complex business logic - When database-level enforcement needed
❌ Don't assume JSONB validates schema - It doesn't!
❌ Don't rely only on application validation - Add database constraints for critical data

Version Note: JSONB available PostgreSQL 9.4+, jsonb_typeof() and ? operator since 9.4+, CHECK constraints all versions

Sources

postgresql.org postgresql.org docs.pydantic.dev

99% confidence

What columns are typically included in a PostgreSQL API keys table schema?

Recommended Schema:

CREATE TABLE api_keys (
    -- Primary key
    id SERIAL PRIMARY KEY,
    
    -- Key identification (shows first/last chars to user)
    key_prefix VARCHAR(8) NOT NULL,  -- e.g., 'sk_live_'
    key_suffix VARCHAR(4) NOT NULL,  -- Last 4 chars for identification
    
    -- Hashed key (NEVER store plaintext!)
    key_hash VARCHAR(255) NOT NULL UNIQUE,
    
    -- User/owner association
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    
    -- Descriptive info
    name VARCHAR(100),  -- e.g., 'Production API', 'Dev Environment'
    description TEXT,
    
    -- Access control
    scopes JSONB DEFAULT '[]'::jsonb,  -- e.g., ["read:users", "write:orders"]
    is_active BOOLEAN DEFAULT true,
    
    -- Expiration
    expires_at TIMESTAMP,  -- NULL = never expires
    
    -- Usage tracking
    last_used_at TIMESTAMP,
    last_used_ip INET,
    use_count INTEGER DEFAULT 0,
    
    -- Rate limiting metadata (optional)
    rate_limit_per_minute INTEGER DEFAULT 60,
    
    -- IP whitelist (optional)
    allowed_ips JSONB,  -- e.g., ["192.168.1.0/24", "10.0.0.1"]
    
    -- Audit timestamps
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    revoked_at TIMESTAMP,
    
    -- Indexes for performance
    CONSTRAINT chk_expires CHECK (expires_at IS NULL OR expires_at > created_at)
);

-- Indexes
CREATE UNIQUE INDEX idx_api_keys_hash ON api_keys(key_hash);
CREATE INDEX idx_api_keys_user_id ON api_keys(user_id);
CREATE INDEX idx_api_keys_active ON api_keys(is_active) WHERE is_active = true;
CREATE INDEX idx_api_keys_prefix ON api_keys(key_prefix);

Using with asyncpg:

import asyncpg
import secrets
import hashlib
from datetime import datetime, timedelta
from typing import Optional, List

async def create_api_key(
    user_id: int,
    name: str,
    scopes: List[str],
    expires_days: Optional[int] = None
) -> tuple[str, int]:
    """
    Create new API key. Returns (plaintext_key, key_id).
    Plaintext key shown ONCE, then hashed.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Generate secure random key
        key = f"sk_live_{secrets.token_urlsafe(32)}"
        
        # Hash key for storage (use bcrypt in production)
        key_hash = hashlib.sha256(key.encode()).hexdigest()
        
        # Extract prefix and suffix for display
        key_prefix = key[:8]  # 'sk_live_'
        key_suffix = key[-4:]  # Last 4 chars
        
        # Calculate expiration
        expires_at = (
            datetime.now() + timedelta(days=expires_days)
            if expires_days
            else None
        )
        
        # Insert into database
        key_id = await conn.fetchval(
            """
            INSERT INTO api_keys (
                key_prefix, key_suffix, key_hash, user_id,
                name, scopes, expires_at
            )
            VALUES ($1, $2, $3, $4, $5, $6, $7)
            RETURNING id
            """,
            key_prefix,
            key_suffix,
            key_hash,
            user_id,
            name,
            scopes,  # asyncpg converts list to JSONB
            expires_at
        )
        
        # Return plaintext key (show to user ONCE)
        return (key, key_id)
        
    finally:
        await conn.close()

# Usage
key, key_id = await create_api_key(
    user_id=1,
    name="Production API",
    scopes=["read:users", "write:orders"],
    expires_days=365
)

print(f"API Key (save this, won't be shown again): {key}")
print(f"Key ID: {key_id}")

Key Validation:

async def validate_api_key(key: str) -> Optional[dict]:
    """
    Validate API key and return key details if valid.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        # Hash provided key
        key_hash = hashlib.sha256(key.encode()).hexdigest()
        
        # Look up key
        key_data = await conn.fetchrow(
            """
            SELECT id, user_id, scopes, is_active, expires_at,
                   rate_limit_per_minute, allowed_ips
            FROM api_keys
            WHERE key_hash = $1
              AND is_active = true
              AND (expires_at IS NULL OR expires_at > NOW())
            """,
            key_hash
        )
        
        if not key_data:
            return None
        
        # Update last used
        await conn.execute(
            """
            UPDATE api_keys
            SET last_used_at = NOW(),
                use_count = use_count + 1
            WHERE id = $1
            """,
            key_data['id']
        )
        
        return dict(key_data)
        
    finally:
        await conn.close()

Security Best Practices (2024):

Based on PostgreSQL security best practices and OWASP guidelines:

Never store plaintext keys - Hash with bcrypt, Argon2, or scrypt
Use pgcrypto extension for hashing if hashing in database
Implement rate limiting - Track usage and enforce limits
Support key rotation - Allow users to create new keys and revoke old ones
Log key usage - Track for security auditing
Enforce expiration - Keys should have expiration dates
Scope-based permissions - Limit what each key can do

Version Note: Schema patterns apply to PostgreSQL 12+, JSONB recommended for scopes/metadata

Sources

enterprisedb.com cheatsheetseries.owasp.org postgresql.org

99% confidence

sql_json_features

2 questions

What is the correct syntax for casting string values to JSONB in PostgreSQL queries?

PostgreSQL supports two syntaxes for casting strings to JSONB: the ::jsonb cast operator (PostgreSQL-specific) and the standard CAST() function. Both validate that the string is valid JSON during the cast.

Cast Operator (Recommended):

-- Cast JSON string to JSONB
SELECT '{"name": "John", "age": 30}'::jsonb;

-- Insert with cast
INSERT INTO users (data)
VALUES ('{"email": "[email protected]", "active": true}'::jsonb);

-- Update with cast
UPDATE products
SET metadata = '{"price": 19.99, "stock": 100}'::jsonb
WHERE id = 1;

-- Query with cast for comparison
SELECT * FROM users
WHERE data @> '{"active": true}'::jsonb;

CAST Function (SQL Standard):

-- Standard SQL CAST syntax
SELECT CAST('{"name": "John"}' AS jsonb);

-- Insert
INSERT INTO users (data)
VALUES (CAST('{"email": "[email protected]"}' AS jsonb));

Using with asyncpg:

import asyncpg
import json

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Method 1: Let asyncpg handle conversion (recommended)
data = {'name': 'John', 'age': 30}
await conn.execute(
    'INSERT INTO users (data) VALUES ($1)',
    data  # asyncpg auto-converts dict to JSONB
)

# Method 2: Pre-serialized JSON string with cast
json_str = json.dumps({'name': 'John', 'age': 30})
await conn.execute(
    'INSERT INTO users (data) VALUES ($1::jsonb)',
    json_str
)

# Method 3: Cast in query (when needed for operators)
await conn.execute(
    "UPDATE users SET data = data || $1::jsonb WHERE id = $2",
    '{"updated": true}',  # String literal
    user_id
)

Validation During Cast:

-- Valid JSON - succeeds
SELECT '{"valid": "json"}'::jsonb;
-- Result: {"valid": "json"}

-- Invalid JSON - raises error
SELECT '{invalid json}'::jsonb;
-- ERROR: invalid input syntax for type json

-- Empty object - valid
SELECT '{}'::jsonb;
-- Result: {}

-- Array - valid
SELECT '[1, 2, 3]'::jsonb;
-- Result: [1, 2, 3]

Common Use Cases:

-- 1. Dynamic JSONB construction
SELECT jsonb_build_object(
    'name', name,
    'email', email
) FROM users;

-- 2. Merging JSONB values
UPDATE users
SET data = data || '{"last_login": "2024-01-15"}'::jsonb
WHERE id = 1;

# 3. Creating JSONB from text column
SELECT config_text::jsonb AS config_jsonb
FROM settings;

Best Practices:

✅ Use ::jsonb for brevity in PostgreSQL-specific code
✅ Use CAST() for SQL standard compatibility
✅ Let asyncpg handle dict→JSONB conversion (no manual casting needed)
✅ Validate JSON strings before casting in application code
❌ Don't cast unless necessary (asyncpg handles automatically)

Version Note: ::jsonb syntax available since PostgreSQL 9.4+, CAST() since 9.2+

Sources

postgresql.org postgresql.org magicstack.github.io

99% confidence

What is the correct syntax for casting string values to JSONB in PostgreSQL queries?

Cast Operator (Recommended):

-- Cast JSON string to JSONB
SELECT '{"name": "John", "age": 30}'::jsonb;

-- Insert with cast
INSERT INTO users (data)
VALUES ('{"email": "[email protected]", "active": true}'::jsonb);

-- Update with cast
UPDATE products
SET metadata = '{"price": 19.99, "stock": 100}'::jsonb
WHERE id = 1;

-- Query with cast for comparison
SELECT * FROM users
WHERE data @> '{"active": true}'::jsonb;

CAST Function (SQL Standard):

-- Standard SQL CAST syntax
SELECT CAST('{"name": "John"}' AS jsonb);

-- Insert
INSERT INTO users (data)
VALUES (CAST('{"email": "[email protected]"}' AS jsonb));

Using with asyncpg:

import asyncpg
import json

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Method 1: Let asyncpg handle conversion (recommended)
data = {'name': 'John', 'age': 30}
await conn.execute(
    'INSERT INTO users (data) VALUES ($1)',
    data  # asyncpg auto-converts dict to JSONB
)

# Method 2: Pre-serialized JSON string with cast
json_str = json.dumps({'name': 'John', 'age': 30})
await conn.execute(
    'INSERT INTO users (data) VALUES ($1::jsonb)',
    json_str
)

# Method 3: Cast in query (when needed for operators)
await conn.execute(
    "UPDATE users SET data = data || $1::jsonb WHERE id = $2",
    '{"updated": true}',  # String literal
    user_id
)

Validation During Cast:

-- Valid JSON - succeeds
SELECT '{"valid": "json"}'::jsonb;
-- Result: {"valid": "json"}

-- Invalid JSON - raises error
SELECT '{invalid json}'::jsonb;
-- ERROR: invalid input syntax for type json

-- Empty object - valid
SELECT '{}'::jsonb;
-- Result: {}

-- Array - valid
SELECT '[1, 2, 3]'::jsonb;
-- Result: [1, 2, 3]

Common Use Cases:

-- 1. Dynamic JSONB construction
SELECT jsonb_build_object(
    'name', name,
    'email', email
) FROM users;

-- 2. Merging JSONB values
UPDATE users
SET data = data || '{"last_login": "2024-01-15"}'::jsonb
WHERE id = 1;

# 3. Creating JSONB from text column
SELECT config_text::jsonb AS config_jsonb
FROM settings;

Best Practices:

✅ Use ::jsonb for brevity in PostgreSQL-specific code
✅ Use CAST() for SQL standard compatibility
✅ Let asyncpg handle dict→JSONB conversion (no manual casting needed)
✅ Validate JSON strings before casting in application code
❌ Don't cast unless necessary (asyncpg handles automatically)

Version Note: ::jsonb syntax available since PostgreSQL 9.4+, CAST() since 9.2+

Sources

postgresql.org postgresql.org magicstack.github.io

99% confidence

sql_query_language

2 questions

Does asyncpg support executing multiple SQL statements in a single execute() call?

No, asyncpg's execute() method does NOT support executing multiple SQL statements in a single call. Each execute() call must contain exactly one SQL statement. For multiple statements, use transactions or multiple execute() calls.

Single Statement Only:

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# ✅ CORRECT: Single statement
await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'John', '[email protected]')

# ❌ WRONG: Multiple statements (will error)
try:
    await conn.execute("""
        INSERT INTO users (name, email) VALUES ('John', '[email protected]');
        INSERT INTO users (name, email) VALUES ('Jane', '[email protected]');
    """)
except asyncpg.exceptions.PostgresSyntaxError as e:
    print(f"Error: {e}")
    # Error: cannot insert multiple commands into a prepared statement

Why This Restriction Exists:

Asynchpg uses PostgreSQL's prepared statement protocol for performance and security. Prepared statements only support single SQL commands, not multiple statements separated by semicolons.

Alternative Approaches:

1. Use Transactions (Recommended for Atomicity):

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Execute multiple statements in a transaction
async with conn.transaction():
    # All statements succeed or all fail (atomic)
    await conn.execute(
        'INSERT INTO users (name, email) VALUES ($1, $2)',
        'John', '[email protected]'
    )
    
    await conn.execute(
        'INSERT INTO users (name, email) VALUES ($1, $2)',
        'Jane', '[email protected]'
    )
    
    await conn.execute(
        'UPDATE accounts SET balance = balance - $1 WHERE user_id = $2',
        100.00, 1
    )
    
    await conn.execute(
        'UPDATE accounts SET balance = balance + $1 WHERE user_id = $2',
        100.00, 2
    )
    
    # If any statement fails, entire transaction rolls back

2. Use executemany() for Batch Inserts:

# Insert multiple rows efficiently
users = [
    ('John', '[email protected]'),
    ('Jane', '[email protected]'),
    ('Bob', '[email protected]')
]

await conn.executemany(
    'INSERT INTO users (name, email) VALUES ($1, $2)',
    users  # List of tuples
)

# executemany with dicts and JSONB
products = [
    [1, 'Product 1', {'price': 19.99, 'stock': 100}],
    [2, 'Product 2', {'price': 29.99, 'stock': 50}],
    [3, 'Product 3', {'price': 39.99, 'stock': 25}]
]

await conn.executemany(
    'INSERT INTO products (id, name, metadata) VALUES ($1, $2, $3)',
    products
)

3. Multiple execute() Calls:

# Execute statements sequentially (no transaction)
await conn.execute('INSERT INTO users (name) VALUES ($1)', 'John')
await conn.execute('INSERT INTO users (name) VALUES ($1)', 'Jane')
await conn.execute('UPDATE stats SET user_count = user_count + 2')

# Note: If one fails, previous statements are NOT rolled back
# Use transaction if you need atomicity

4. Use fetch() or execute() with DO Blocks:

# PostgreSQL anonymous code block (single statement to PostgreSQL)
await conn.execute("""
    DO $$
    BEGIN
        INSERT INTO users (name, email) VALUES ('John', '[email protected]');
        INSERT INTO users (name, email) VALUES ('Jane', '[email protected]');
        UPDATE stats SET total_users = (SELECT COUNT(*) FROM users);
    END $$;
""")

# Note: Less efficient than transactions, harder to debug
# Use transactions instead for most cases

5. Stored Procedures/Functions:

# Create stored procedure
await conn.execute("""
    CREATE OR REPLACE FUNCTION create_user_and_account(
        p_name TEXT,
        p_email TEXT,
        p_initial_balance NUMERIC
    ) RETURNS INTEGER AS $$
    DECLARE
        v_user_id INTEGER;
    BEGIN
        INSERT INTO users (name, email)
        VALUES (p_name, p_email)
        RETURNING id INTO v_user_id;
        
        INSERT INTO accounts (user_id, balance)
        VALUES (v_user_id, p_initial_balance);
        
        RETURN v_user_id;
    END;
    $$ LANGUAGE plpgsql;
""")

# Call stored procedure (single execute)
user_id = await conn.fetchval(
    'SELECT create_user_and_account($1, $2, $3)',
    'John', '[email protected]', 100.00
)

Transaction Best Practices:

import asyncpg
from typing import List, Dict, Any

async def create_order_with_items(
    user_id: int,
    items: List[Dict[str, Any]]
) -> int:
    """
    Create order and order items in a transaction.
    Ensures atomicity: all inserts succeed or all fail.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        async with conn.transaction():
            # Step 1: Create order
            order_id = await conn.fetchval(
                'INSERT INTO orders (user_id, status) VALUES ($1, $2) RETURNING id',
                user_id,
                'pending'
            )
            
            # Step 2: Insert order items
            for item in items:
                await conn.execute(
                    'INSERT INTO order_items (order_id, product_id, quantity, price) '
                    'VALUES ($1, $2, $3, $4)',
                    order_id,
                    item['product_id'],
                    item['quantity'],
                    item['price']
                )
            
            # Step 3: Update inventory
            for item in items:
                await conn.execute(
                    'UPDATE products SET stock = stock - $1 WHERE id = $2',
                    item['quantity'],
                    item['product_id']
                )
            
            # Step 4: Calculate and store order total
            total = sum(item['quantity'] * item['price'] for item in items)
            await conn.execute(
                'UPDATE orders SET total = $1 WHERE id = $2',
                total,
                order_id
            )
            
            # All statements committed together
            return order_id
            
    except Exception as e:
        # Transaction automatically rolled back on exception
        print(f"Order creation failed: {e}")
        raise
        
    finally:
        await conn.close()

# Usage
order_id = await create_order_with_items(
    user_id=1,
    items=[
        {'product_id': 10, 'quantity': 2, 'price': 19.99},
        {'product_id': 11, 'quantity': 1, 'price': 29.99}
    ]
)

Nested Transactions (Savepoints):

async with conn.transaction():
    await conn.execute('INSERT INTO users (name) VALUES ($1)', 'John')
    
    # Nested transaction = savepoint
    async with conn.transaction():
        await conn.execute('INSERT INTO logs (message) VALUES ($1)', 'User created')
        
        # This can fail without rolling back outer transaction
        try:
            await conn.execute('INSERT INTO invalid_table (data) VALUES ($1)', 'test')
        except Exception:
            pass  # Inner transaction rolls back, outer continues
    
    # This still executes
    await conn.execute('UPDATE stats SET user_count = user_count + 1')

Error Handling:

import asyncpg

try:
    async with conn.transaction():
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'John', '[email protected]')
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'Jane', '[email protected]')
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'Bob', '[email protected]')  # Duplicate email
        
except asyncpg.UniqueViolationError as e:
    print(f"Unique constraint violated: {e}")
    # Transaction automatically rolled back
    # John and Jane were NOT inserted
    
except asyncpg.PostgresError as e:
    print(f"Database error: {e}")
    # Transaction automatically rolled back

Comparison Table:

Approach	Use Case	Atomic	Performance
Transaction	Related operations	✅ Yes	Fast
executemany()	Batch same-type inserts	❌ No*	Very fast
Multiple execute()	Unrelated operations	❌ No	Moderate
DO Block	Simple multi-statement	✅ Yes	Slow
Stored Procedure	Complex reusable logic	✅ Yes	Fast

*executemany() can be wrapped in transaction for atomicity

When to Use Each:

Transactions: Multiple related statements that must all succeed or fail together
executemany(): Inserting many rows with same structure (bulk inserts)
Multiple execute(): Independent operations that don't require atomicity
DO Block: Quick one-off multi-statement operations (development/migrations)
Stored Procedures: Complex business logic used repeatedly

Version Note: Single-statement restriction consistent across all asyncpg versions (0.10.0+)

Sources

magicstack.github.io magicstack.github.io postgresql.org

99% confidence

Does asyncpg support executing multiple SQL statements in a single execute() call?

Single Statement Only:

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# ✅ CORRECT: Single statement
await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'John', '[email protected]')

# ❌ WRONG: Multiple statements (will error)
try:
    await conn.execute("""
        INSERT INTO users (name, email) VALUES ('John', '[email protected]');
        INSERT INTO users (name, email) VALUES ('Jane', '[email protected]');
    """)
except asyncpg.exceptions.PostgresSyntaxError as e:
    print(f"Error: {e}")
    # Error: cannot insert multiple commands into a prepared statement

Why This Restriction Exists:

Asynchpg uses PostgreSQL's prepared statement protocol for performance and security. Prepared statements only support single SQL commands, not multiple statements separated by semicolons.

Alternative Approaches:

1. Use Transactions (Recommended for Atomicity):

import asyncpg

conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')

# Execute multiple statements in a transaction
async with conn.transaction():
    # All statements succeed or all fail (atomic)
    await conn.execute(
        'INSERT INTO users (name, email) VALUES ($1, $2)',
        'John', '[email protected]'
    )
    
    await conn.execute(
        'INSERT INTO users (name, email) VALUES ($1, $2)',
        'Jane', '[email protected]'
    )
    
    await conn.execute(
        'UPDATE accounts SET balance = balance - $1 WHERE user_id = $2',
        100.00, 1
    )
    
    await conn.execute(
        'UPDATE accounts SET balance = balance + $1 WHERE user_id = $2',
        100.00, 2
    )
    
    # If any statement fails, entire transaction rolls back

2. Use executemany() for Batch Inserts:

# Insert multiple rows efficiently
users = [
    ('John', '[email protected]'),
    ('Jane', '[email protected]'),
    ('Bob', '[email protected]')
]

await conn.executemany(
    'INSERT INTO users (name, email) VALUES ($1, $2)',
    users  # List of tuples
)

# executemany with dicts and JSONB
products = [
    [1, 'Product 1', {'price': 19.99, 'stock': 100}],
    [2, 'Product 2', {'price': 29.99, 'stock': 50}],
    [3, 'Product 3', {'price': 39.99, 'stock': 25}]
]

await conn.executemany(
    'INSERT INTO products (id, name, metadata) VALUES ($1, $2, $3)',
    products
)

3. Multiple execute() Calls:

# Execute statements sequentially (no transaction)
await conn.execute('INSERT INTO users (name) VALUES ($1)', 'John')
await conn.execute('INSERT INTO users (name) VALUES ($1)', 'Jane')
await conn.execute('UPDATE stats SET user_count = user_count + 2')

# Note: If one fails, previous statements are NOT rolled back
# Use transaction if you need atomicity

4. Use fetch() or execute() with DO Blocks:

# PostgreSQL anonymous code block (single statement to PostgreSQL)
await conn.execute("""
    DO $$
    BEGIN
        INSERT INTO users (name, email) VALUES ('John', '[email protected]');
        INSERT INTO users (name, email) VALUES ('Jane', '[email protected]');
        UPDATE stats SET total_users = (SELECT COUNT(*) FROM users);
    END $$;
""")

# Note: Less efficient than transactions, harder to debug
# Use transactions instead for most cases

5. Stored Procedures/Functions:

# Create stored procedure
await conn.execute("""
    CREATE OR REPLACE FUNCTION create_user_and_account(
        p_name TEXT,
        p_email TEXT,
        p_initial_balance NUMERIC
    ) RETURNS INTEGER AS $$
    DECLARE
        v_user_id INTEGER;
    BEGIN
        INSERT INTO users (name, email)
        VALUES (p_name, p_email)
        RETURNING id INTO v_user_id;
        
        INSERT INTO accounts (user_id, balance)
        VALUES (v_user_id, p_initial_balance);
        
        RETURN v_user_id;
    END;
    $$ LANGUAGE plpgsql;
""")

# Call stored procedure (single execute)
user_id = await conn.fetchval(
    'SELECT create_user_and_account($1, $2, $3)',
    'John', '[email protected]', 100.00
)

Transaction Best Practices:

import asyncpg
from typing import List, Dict, Any

async def create_order_with_items(
    user_id: int,
    items: List[Dict[str, Any]]
) -> int:
    """
    Create order and order items in a transaction.
    Ensures atomicity: all inserts succeed or all fail.
    """
    conn = await asyncpg.connect('postgresql://user:pass@localhost/mydb')
    
    try:
        async with conn.transaction():
            # Step 1: Create order
            order_id = await conn.fetchval(
                'INSERT INTO orders (user_id, status) VALUES ($1, $2) RETURNING id',
                user_id,
                'pending'
            )
            
            # Step 2: Insert order items
            for item in items:
                await conn.execute(
                    'INSERT INTO order_items (order_id, product_id, quantity, price) '
                    'VALUES ($1, $2, $3, $4)',
                    order_id,
                    item['product_id'],
                    item['quantity'],
                    item['price']
                )
            
            # Step 3: Update inventory
            for item in items:
                await conn.execute(
                    'UPDATE products SET stock = stock - $1 WHERE id = $2',
                    item['quantity'],
                    item['product_id']
                )
            
            # Step 4: Calculate and store order total
            total = sum(item['quantity'] * item['price'] for item in items)
            await conn.execute(
                'UPDATE orders SET total = $1 WHERE id = $2',
                total,
                order_id
            )
            
            # All statements committed together
            return order_id
            
    except Exception as e:
        # Transaction automatically rolled back on exception
        print(f"Order creation failed: {e}")
        raise
        
    finally:
        await conn.close()

# Usage
order_id = await create_order_with_items(
    user_id=1,
    items=[
        {'product_id': 10, 'quantity': 2, 'price': 19.99},
        {'product_id': 11, 'quantity': 1, 'price': 29.99}
    ]
)

Nested Transactions (Savepoints):

async with conn.transaction():
    await conn.execute('INSERT INTO users (name) VALUES ($1)', 'John')
    
    # Nested transaction = savepoint
    async with conn.transaction():
        await conn.execute('INSERT INTO logs (message) VALUES ($1)', 'User created')
        
        # This can fail without rolling back outer transaction
        try:
            await conn.execute('INSERT INTO invalid_table (data) VALUES ($1)', 'test')
        except Exception:
            pass  # Inner transaction rolls back, outer continues
    
    # This still executes
    await conn.execute('UPDATE stats SET user_count = user_count + 1')

Error Handling:

import asyncpg

try:
    async with conn.transaction():
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'John', '[email protected]')
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'Jane', '[email protected]')
        await conn.execute('INSERT INTO users (name, email) VALUES ($1, $2)', 'Bob', '[email protected]')  # Duplicate email
        
except asyncpg.UniqueViolationError as e:
    print(f"Unique constraint violated: {e}")
    # Transaction automatically rolled back
    # John and Jane were NOT inserted
    
except asyncpg.PostgresError as e:
    print(f"Database error: {e}")
    # Transaction automatically rolled back

Comparison Table:

Approach	Use Case	Atomic	Performance
Transaction	Related operations	✅ Yes	Fast
executemany()	Batch same-type inserts	❌ No*	Very fast
Multiple execute()	Unrelated operations	❌ No	Moderate
DO Block	Simple multi-statement	✅ Yes	Slow
Stored Procedure	Complex reusable logic	✅ Yes	Fast

*executemany() can be wrapped in transaction for atomicity

When to Use Each:

Transactions: Multiple related statements that must all succeed or fail together
executemany(): Inserting many rows with same structure (bulk inserts)
Multiple execute(): Independent operations that don't require atomicity
DO Block: Quick one-off multi-statement operations (development/migrations)
Stored Procedures: Complex business logic used repeatedly

Version Note: Single-statement restriction consistent across all asyncpg versions (0.10.0+)

Sources

magicstack.github.io magicstack.github.io postgresql.org

99% confidence

physical_replication

2 questions

What is the correct way to handle connection pooling with asyncpg in FastAPI lifespan events?

The correct way to handle asyncpg connection pooling in FastAPI is using the lifespan context manager (FastAPI 0.93+) or the deprecated startup/shutdown events. Lifespan creates the pool on startup, stores it in app.state, and closes it on shutdown.

Recommended: Lifespan Context Manager (FastAPI 0.93+)

from fastapi import FastAPI, Request
from contextlib import asynccontextmanager
import asyncpg
import os

# Define lifespan context manager
@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Manage database connection pool lifecycle.
    Runs once on startup and cleanup on shutdown.
    """
    # Startup: Create connection pool
    app.state.db_pool = await asyncpg.create_pool(
        host=os.getenv('DB_HOST', 'localhost'),
        port=int(os.getenv('DB_PORT', 5432)),
        user=os.getenv('DB_USER', 'postgres'),
        password=os.getenv('DB_PASSWORD'),
        database=os.getenv('DB_NAME', 'mydb'),
        
        # Pool configuration
        min_size=10,              # Minimum connections
        max_size=20,              # Maximum connections
        max_queries=50000,        # Queries per connection before recycling
        max_inactive_connection_lifetime=300,  # 5 minutes
        
        # Connection settings
        command_timeout=60,       # Command timeout (seconds)
        timeout=30,               # Connection timeout (seconds)
    )
    
    print(f"Database pool created: {app.state.db_pool.get_size()} connections")
    
    yield  # Application runs here
    
    # Shutdown: Close connection pool
    await app.state.db_pool.close()
    print("Database pool closed")

# Create FastAPI app with lifespan
app = FastAPI(lifespan=lifespan)

# Use pool in routes
@app.get("/users/{user_id}")
async def get_user(user_id: int, request: Request):
    """
    Use connection from pool via request.app.state.db_pool
    """
    async with request.app.state.db_pool.acquire() as conn:
        user = await conn.fetchrow(
            'SELECT id, name, email FROM users WHERE id = $1',
            user_id
        )
        
        if not user:
            return {"error": "User not found"}
        
        return dict(user)

@app.post("/users")
async def create_user(name: str, email: str, request: Request):
    async with request.app.state.db_pool.acquire() as conn:
        user_id = await conn.fetchval(
            'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
            name,
            email
        )
        return {"id": user_id, "name": name, "email": email}

Dependency Injection Pattern (Cleaner):

from fastapi import FastAPI, Depends, HTTPException
from contextlib import asynccontextmanager
import asyncpg
from typing import AsyncGenerator

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create pool
    app.state.db_pool = await asyncpg.create_pool(
        'postgresql://user:pass@localhost/mydb',
        min_size=10,
        max_size=20
    )
    yield
    # Close pool
    await app.state.db_pool.close()

app = FastAPI(lifespan=lifespan)

# Dependency to get database connection
async def get_db_conn(
    request: Request
) -> AsyncGenerator[asyncpg.Connection, None]:
    """
    Dependency that yields database connection from pool.
    Automatically released after route completes.
    """
    async with request.app.state.db_pool.acquire() as conn:
        yield conn

# Use in routes with dependency injection
@app.get("/users/{user_id}")
async def get_user(
    user_id: int,
    conn: asyncpg.Connection = Depends(get_db_conn)
):
    user = await conn.fetchrow(
        'SELECT * FROM users WHERE id = $1',
        user_id
    )
    
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    
    return dict(user)

@app.post("/users")
async def create_user(
    name: str,
    email: str,
    conn: asyncpg.Connection = Depends(get_db_conn)
):
    user_id = await conn.fetchval(
        'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
        name, email
    )
    return {"id": user_id, "name": name, "email": email}

Transaction Support:

from fastapi import Depends, HTTPException
import asyncpg

async def get_db_transaction(
    conn: asyncpg.Connection = Depends(get_db_conn)
) -> asyncpg.Connection:
    """
    Dependency that wraps request in transaction.
    """
    async with conn.transaction():
        yield conn

@app.post("/orders")
async def create_order(
    user_id: int,
    items: list[dict],
    conn: asyncpg.Connection = Depends(get_db_transaction)
):
    """
    All database operations in this route run in a transaction.
    Automatically commits on success, rolls back on error.
    """
    # Create order
    order_id = await conn.fetchval(
        'INSERT INTO orders (user_id) VALUES ($1) RETURNING id',
        user_id
    )
    
    # Insert items
    for item in items:
        await conn.execute(
            'INSERT INTO order_items (order_id, product_id, quantity) VALUES ($1, $2, $3)',
            order_id, item['product_id'], item['quantity']
        )
    
    # If any operation fails, entire transaction rolls back
    return {"order_id": order_id}

Legacy: Startup/Shutdown Events (Deprecated)

# ⚠️ Deprecated in FastAPI 0.93+, use lifespan instead
from fastapi import FastAPI
import asyncpg

app = FastAPI()

@app.on_event("startup")
async def startup():
    app.state.db_pool = await asyncpg.create_pool(
        'postgresql://user:pass@localhost/mydb'
    )

@app.on_event("shutdown")
async def shutdown():
    await app.state.db_pool.close()

Pool Configuration Best Practices (2024):

Based on asyncpg documentation and production patterns:

pool = await asyncpg.create_pool(
    'postgresql://user:pass@localhost/mydb',
    
    # Pool size: 2-4x CPU cores typical
    min_size=10,              # Keep warm connections
    max_size=20,              # Limit based on PostgreSQL max_connections
    
    # Connection lifetime
    max_inactive_connection_lifetime=300,  # 5 min (default: 300s)
    max_queries=50000,        # Recycle after 50k queries
    
    # Timeouts
    command_timeout=60,       # Query timeout (seconds)
    timeout=30,               # Pool acquisition timeout
    
    # Connection initialization (optional)
    setup=async_setup_connection
)

async def async_setup_connection(conn):
    """
    Called for each new connection.
    Use for SET commands, custom types, etc.
    """
    await conn.execute('SET timezone TO "UTC"')

Performance Benefits:

According to FastAPI and asyncpg benchmarks (2024):

Without pooling: 50-100 ms connection overhead per request
With pooling: <1 ms connection acquisition from pool
Result: 50-100x faster database operations

Monitoring Pool Health:

@app.get("/health/db")
async def database_health(request: Request):
    pool = request.app.state.db_pool
    return {
        "status": "healthy",
        "pool_size": pool.get_size(),
        "pool_free": pool.get_idle_size(),
        "pool_max": pool.get_max_size(),
        "pool_min": pool.get_min_size()
    }

Version Note: Lifespan context manager recommended FastAPI 0.93+ (2023), asyncpg connection pooling since 0.10.0+

Sources

fastapi.tiangolo.com magicstack.github.io usamabjw.medium.com

99% confidence

What is the correct way to handle connection pooling with asyncpg in FastAPI lifespan events?

Recommended: Lifespan Context Manager (FastAPI 0.93+)

from fastapi import FastAPI, Request
from contextlib import asynccontextmanager
import asyncpg
import os

# Define lifespan context manager
@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Manage database connection pool lifecycle.
    Runs once on startup and cleanup on shutdown.
    """
    # Startup: Create connection pool
    app.state.db_pool = await asyncpg.create_pool(
        host=os.getenv('DB_HOST', 'localhost'),
        port=int(os.getenv('DB_PORT', 5432)),
        user=os.getenv('DB_USER', 'postgres'),
        password=os.getenv('DB_PASSWORD'),
        database=os.getenv('DB_NAME', 'mydb'),
        
        # Pool configuration
        min_size=10,              # Minimum connections
        max_size=20,              # Maximum connections
        max_queries=50000,        # Queries per connection before recycling
        max_inactive_connection_lifetime=300,  # 5 minutes
        
        # Connection settings
        command_timeout=60,       # Command timeout (seconds)
        timeout=30,               # Connection timeout (seconds)
    )
    
    print(f"Database pool created: {app.state.db_pool.get_size()} connections")
    
    yield  # Application runs here
    
    # Shutdown: Close connection pool
    await app.state.db_pool.close()
    print("Database pool closed")

# Create FastAPI app with lifespan
app = FastAPI(lifespan=lifespan)

# Use pool in routes
@app.get("/users/{user_id}")
async def get_user(user_id: int, request: Request):
    """
    Use connection from pool via request.app.state.db_pool
    """
    async with request.app.state.db_pool.acquire() as conn:
        user = await conn.fetchrow(
            'SELECT id, name, email FROM users WHERE id = $1',
            user_id
        )
        
        if not user:
            return {"error": "User not found"}
        
        return dict(user)

@app.post("/users")
async def create_user(name: str, email: str, request: Request):
    async with request.app.state.db_pool.acquire() as conn:
        user_id = await conn.fetchval(
            'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
            name,
            email
        )
        return {"id": user_id, "name": name, "email": email}

Dependency Injection Pattern (Cleaner):

from fastapi import FastAPI, Depends, HTTPException
from contextlib import asynccontextmanager
import asyncpg
from typing import AsyncGenerator

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create pool
    app.state.db_pool = await asyncpg.create_pool(
        'postgresql://user:pass@localhost/mydb',
        min_size=10,
        max_size=20
    )
    yield
    # Close pool
    await app.state.db_pool.close()

app = FastAPI(lifespan=lifespan)

# Dependency to get database connection
async def get_db_conn(
    request: Request
) -> AsyncGenerator[asyncpg.Connection, None]:
    """
    Dependency that yields database connection from pool.
    Automatically released after route completes.
    """
    async with request.app.state.db_pool.acquire() as conn:
        yield conn

# Use in routes with dependency injection
@app.get("/users/{user_id}")
async def get_user(
    user_id: int,
    conn: asyncpg.Connection = Depends(get_db_conn)
):
    user = await conn.fetchrow(
        'SELECT * FROM users WHERE id = $1',
        user_id
    )
    
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    
    return dict(user)

@app.post("/users")
async def create_user(
    name: str,
    email: str,
    conn: asyncpg.Connection = Depends(get_db_conn)
):
    user_id = await conn.fetchval(
        'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING id',
        name, email
    )
    return {"id": user_id, "name": name, "email": email}

Transaction Support:

from fastapi import Depends, HTTPException
import asyncpg

async def get_db_transaction(
    conn: asyncpg.Connection = Depends(get_db_conn)
) -> asyncpg.Connection:
    """
    Dependency that wraps request in transaction.
    """
    async with conn.transaction():
        yield conn

@app.post("/orders")
async def create_order(
    user_id: int,
    items: list[dict],
    conn: asyncpg.Connection = Depends(get_db_transaction)
):
    """
    All database operations in this route run in a transaction.
    Automatically commits on success, rolls back on error.
    """
    # Create order
    order_id = await conn.fetchval(
        'INSERT INTO orders (user_id) VALUES ($1) RETURNING id',
        user_id
    )
    
    # Insert items
    for item in items:
        await conn.execute(
            'INSERT INTO order_items (order_id, product_id, quantity) VALUES ($1, $2, $3)',
            order_id, item['product_id'], item['quantity']
        )
    
    # If any operation fails, entire transaction rolls back
    return {"order_id": order_id}

Legacy: Startup/Shutdown Events (Deprecated)

# ⚠️ Deprecated in FastAPI 0.93+, use lifespan instead
from fastapi import FastAPI
import asyncpg

app = FastAPI()

@app.on_event("startup")
async def startup():
    app.state.db_pool = await asyncpg.create_pool(
        'postgresql://user:pass@localhost/mydb'
    )

@app.on_event("shutdown")
async def shutdown():
    await app.state.db_pool.close()

Pool Configuration Best Practices (2024):

Based on asyncpg documentation and production patterns:

pool = await asyncpg.create_pool(
    'postgresql://user:pass@localhost/mydb',
    
    # Pool size: 2-4x CPU cores typical
    min_size=10,              # Keep warm connections
    max_size=20,              # Limit based on PostgreSQL max_connections
    
    # Connection lifetime
    max_inactive_connection_lifetime=300,  # 5 min (default: 300s)
    max_queries=50000,        # Recycle after 50k queries
    
    # Timeouts
    command_timeout=60,       # Query timeout (seconds)
    timeout=30,               # Pool acquisition timeout
    
    # Connection initialization (optional)
    setup=async_setup_connection
)

async def async_setup_connection(conn):
    """
    Called for each new connection.
    Use for SET commands, custom types, etc.
    """
    await conn.execute('SET timezone TO "UTC"')

Performance Benefits:

According to FastAPI and asyncpg benchmarks (2024):

Without pooling: 50-100 ms connection overhead per request
With pooling: <1 ms connection acquisition from pool
Result: 50-100x faster database operations

Monitoring Pool Health:

@app.get("/health/db")
async def database_health(request: Request):
    pool = request.app.state.db_pool
    return {
        "status": "healthy",
        "pool_size": pool.get_size(),
        "pool_free": pool.get_idle_size(),
        "pool_max": pool.get_max_size(),
        "pool_min": pool.get_min_size()
    }

Version Note: Lifespan context manager recommended FastAPI 0.93+ (2023), asyncpg connection pooling since 0.10.0+

Sources

fastapi.tiangolo.com magicstack.github.io usamabjw.medium.com

99% confidence

Browse All Topics