Production Debugging Patterns FAQ & Answers

16 expert Production Debugging Patterns answers researched from official documentation. Every answer cites authoritative sources you can verify.

A

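kill -15 (SIGTERM) is the graceful shutdown signal. It can be caught, so a Python process that handles it can clean up (close connections, flush buffers, run finally blocks) and exit normally. Syntax: kill -15 <PID> or kill <PID> (SIGTERM is the default). Note that CPython installs no SIGTERM handler of its own: without one, the process is terminated on the spot and finally blocks and atexit handlers do not run. Servers such as uvicorn install their own handler and shut down gracefully.

kill -9 (SIGKILL) terminates the process immediately. It cannot be caught or ignored: no signal handlers run, buffered output is lost, connections are dropped without an application-level shutdown, and in-flight database transactions are never committed. The kernel still closes the process's file descriptors, but temporary files, lock files, and child processes can be left behind. Syntax: kill -9 <PID>. Use it only when SIGTERM has failed after 30+ seconds.

pkill sends signals by process name pattern instead of PID. Syntax: pkill -15 uvicorn signals every process whose name matches uvicorn. Flags: -f (match against the full command line), -u user (only that user's processes), -9 (force kill).

Best practice for Python servers: (1) Try systemctl stop service (sends SIGTERM). (2) Wait 30 seconds. (3) Run systemctl kill -s SIGKILL service if it is still running. systemd also escalates to SIGKILL on its own once TimeoutStopSec expires (90 seconds by default). Never use kill -9 as the first option: it risks half-written data, orphaned child processes, and stale lock or temp files.

For debugging: ps aux | grep uvicorn to find the PID, kill -15 <PID>, then verify with ps -p <PID> (prints only the header line and exits non-zero if the process is gone).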
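The cleanup behaviour described above depends on the process actually handling SIGTERM. The following is a minimal sketch, assuming a POSIX system and a single-threaded worker loop; the names serve, tick, and cleanup_log are hypothetical and exist only for illustration.

```python
import os
import signal

cleanup_log = []  # hypothetical record of cleanup steps, for illustration only


def _on_sigterm(signum, frame):
    # Turn SIGTERM into a normal interpreter exit so finally blocks run.
    raise SystemExit(0)


def serve(tick):
    """Call tick() in a loop until SIGTERM arrives, then clean up."""
    signal.signal(signal.SIGTERM, _on_sigterm)
    try:
        while True:
            tick()
    finally:
        # Reached on SIGTERM. It would never be reached on SIGKILL,
        # because SIGKILL cannot be caught.
        cleanup_log.append("connections closed")
```

Sending kill -15 to a process running serve() should trigger the finally block; sending kill -9 would skip it entirely.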

99% confidence
A

TIME_WAIT is a TCP state in which the endpoint that closed the connection first keeps the socket for 2×MSL (Maximum Segment Lifetime) after close. Linux uses a fixed 60 seconds; other systems vary. The wait ensures the final ACK can be retransmitted if it is lost and prevents delayed packets from an old connection being accepted by a new connection on the same address and port pair.

When you kill a server process (e.g., uvicorn on port 8000) that had open connections, the OS keeps those connections' sockets in TIME_WAIT. lsof -i :8000 shows no process, and ss -tuln | grep 8000 or netstat -tuln | grep 8000 shows nothing either, because -l lists only listening sockets. Use ss -tan | grep 8000 to see the TIME_WAIT entries alongside ESTABLISHED and LISTEN sockets. If the server does not set SO_REUSEADDR, restarting it with uvicorn main:app --port 8000 style commands fails with 'Address already in use' until the entries expire.

Solutions: (1) Wait for TIME_WAIT to expire (60 seconds on Linux; cleanest). (2) Set the SO_REUSEADDR socket option, which allows binding to a port that only has TIME_WAIT sockets. uvicorn sets it by default, so uvicorn main:app --port 8000 normally restarts without waiting; if it still reports 'Address already in use', another process is most likely still listening. (3) Use a different port temporarily: uvicorn main:app --port 8001. (4) Do not use sudo sysctl -w net.ipv4.tcp_tw_recycle=1: the setting was unsafe behind NAT and was removed in Linux 4.12.

Best practice: Always set SO_REUSEADDR on listening sockets in production servers. TIME_WAIT itself is not a fault; it prevents 'ghost packet' issues in high-traffic scenarios.
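For a server that opens its own listening socket, setting SO_REUSEADDR looks roughly like the sketch below. The helper name bind_listener and its defaults are assumptions made for illustration; port 0 asks the OS for any free port.

```python
import socket


def bind_listener(host="127.0.0.1", port=0, reuse_addr=True):
    """Create a listening TCP socket, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        # Must be set before bind(); lets bind() succeed even if old
        # connections on this port are still in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock
```

The option has to be set before bind(); setting it afterwards has no effect on the bind that already happened.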

99% confidence
A

No. Deleting .pyc files (compiled bytecode in __pycache__/) does NOT affect already-running Python processes. Once a module is imported, Python keeps the module object in sys.modules (the in-memory module cache). The .pyc file is used only at import time, to skip recompiling the .py source.

Workflow: (1) First import: Python compiles .py → .pyc and stores the module in sys.modules. (2) Subsequent imports in the same process: Python checks sys.modules first (cache hit) and returns the cached module without reading the .pyc. (3) Process restart: sys.modules starts empty; on import Python compares the source mtime and size recorded in the .pyc with the .py file and recompiles if they differ. Deleting a .pyc while the process runs has no effect because sys.modules is already populated.

To load changes into a running process: (1) Module-level reload (limited): import importlib; importlib.reload(module) re-executes only that module, not its dependencies, and does not update references other code already holds (for example names bound with from module import name). (2) Full restart: systemctl restart service, or kill -15 <PID> followed by starting the process again. This is the only reliable method.

Example: Edit api/main.py and delete __pycache__/main.cpython-311.pyc; uvicorn still serves the old code because sys.modules holds the already-loaded module. Proper workflow: edit .py files, then restart the server process. Cleaning up .pyc files is useful for deployment hygiene (stale bytecode left behind by deleted .py files) and for debugging import issues, but it does not hot-reload code. For development hot-reload, use uvicorn main:app --reload (watches .py files and restarts the process on changes).
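The caching behaviour above can be observed in a throwaway directory. This is a sketch under stated assumptions: the module name demo_mod_pyc and the function show_import_caching are hypothetical, and the source edit deliberately changes the file size so that any cached bytecode is treated as stale.

```python
import importlib
import pathlib
import sys
import tempfile


def show_import_caching():
    """Return VALUE as seen on first import, after an edit, and after reload."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "demo_mod_pyc.py"  # hypothetical module
        src.write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            mod = importlib.import_module("demo_mod_pyc")
            first = mod.VALUE

            # Edit the source and delete any bytecode cached on disk.
            src.write_text("VALUE = 2222\n")
            cache_dir = pathlib.Path(tmp) / "__pycache__"
            if cache_dir.is_dir():
                for pyc in cache_dir.glob("*.pyc"):
                    pyc.unlink()

            # A repeated import is answered from sys.modules: old code.
            after_edit = importlib.import_module("demo_mod_pyc").VALUE

            # reload() re-executes the source inside the same module object.
            importlib.reload(mod)
            after_reload = mod.VALUE
            return first, after_edit, after_reload
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_mod_pyc", None)
```

The middle value stays at the old number even though both the source and the bytecode changed on disk, which is the situation a long-running server is in after a deployment without a restart.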

99% confidence
A

Three methods to verify which module version a running Python process has loaded:

(1) Add a debug endpoint to the FastAPI app that exposes module versions:

    @app.get('/debug/versions')
    def versions():
        import mymodule, fastapi
        return {'mymodule': mymodule.__version__, 'fastapi': fastapi.__version__, 'file': mymodule.__file__}

Access it via curl http://localhost:8000/debug/versions. It reports the version and file path of the module actually loaded in memory. Restrict or disable such endpoints on public-facing deployments.

(2) Check process logs on startup. Frameworks rarely log the versions of your own modules, so add logging at module level: logger.info(f'Loaded mymodule {__version__} from {__file__}'). View logs: journalctl -u myservice -f (systemd) or pm2 logs (pm2).

(3) Introspect the running process with gdb/py-spy (advanced): py-spy dump --pid <PID> prints the current stack of every thread with source file names and line numbers, which shows the file paths the process is executing from. Useful for verifying that new code is live after a deployment.

Common issues: (1) Old code served after deployment → the process still holds the old module in sys.modules and needs a restart. (2) Mixing virtual environments → the module is loaded from the wrong path. (3) sys.path pollution → shadowed imports.

Best practice workflow: (1) Deploy new code via git pull. (2) Restart the service: systemctl restart myservice. (3) Verify via the debug endpoint or check the logs for the new version string. (4) Test functionality. For added safety, include a deployment timestamp or git commit hash in the version string, e.g. __version__ = '1.2.3+abc123f', and expose it in the /health endpoint.
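The lookup behind such a debug endpoint can be written without any web framework. The sketch below is one possible shape, not the endpoint above: loaded_module_info is a hypothetical helper, and unlike the endpoint it reads sys.modules directly rather than importing, so it reports only what the process has already loaded.

```python
import sys


def loaded_module_info(name):
    """Report what this process currently holds in memory for `name`."""
    mod = sys.modules.get(name)
    if mod is None:
        return None  # never imported in this process
    return {
        "module": name,
        # Not every module defines __version__; fall back to a marker.
        "version": getattr(mod, "__version__", "unknown"),
        # Built-in modules have no __file__.
        "file": getattr(mod, "__file__", None),
    }
```

Returning the result of loaded_module_info('mymodule') from a route handler or logging it at startup gives the same information as the endpoint shown earlier.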

99% confidence
A

Methods for running processes, and whether they survive SSH disconnection:

& (background job): Runs the process in the background but leaves it attached to the shell session. Syntax: python script.py &. The process typically dies when the SSH session ends, because the shell sends SIGHUP to its jobs. Use for: quick background tasks within an active session only.

nohup (no hang up): Starts the command with SIGHUP ignored and, when output would go to the terminal, redirects stdout/stderr to nohup.out. Syntax: nohup python script.py &. The process survives SSH disconnection. Limitations: you cannot reattach to the process, and output buffering may delay writes to nohup.out. Use for: fire-and-forget scripts.

disown: Removes a job from the shell's job table so the shell does not send it SIGHUP. Syntax: python script.py & disown, or disown %1 (job ID). The process survives, but there is no session control and no way to reattach. Use for: detaching a process you accidentally started in the foreground (suspend it with Ctrl+Z, resume it with bg, then disown it).

screen (terminal multiplexer): Creates a virtual terminal session that persists. Syntax: screen -S mysession, run the process, Ctrl+A D to detach. Reconnect: screen -r mysession. Supports multiple windows, scrollback, and session sharing. Survives SSH disconnection with full interactivity.

tmux (modern alternative to screen): Similar to screen with better defaults. Syntax: tmux new -s mysession, Ctrl+B D to detach, tmux attach -t mysession to reconnect. Better pane management, status bar, and configurability.

systemd service (production standard): Define the service in /etc/systemd/system/myapp.service and manage it with systemctl start/stop/restart myapp. Provides auto-restart on failure, dependency management, logging to journald, and proper process supervision. Example [Service] section for a Python app:

    [Service]
    Type=simple
    ExecStart=/usr/bin/python3 /path/to/app.py
    Restart=always

Best practices: Development: tmux/screen (interactive debugging). Production: systemd services (proper process management, logging, auto-restart). One-off scripts: nohup if no reconnection is needed. Never use & alone for persistent processes.
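What nohup arranges from the outside, a script can also do for itself. The following is a small sketch, assuming a POSIX system; ignore_hangup is a hypothetical helper name, and this covers only the SIGHUP part of nohup, not the output redirection.

```python
import signal


def ignore_hangup():
    """Ignore SIGHUP in this process and return the previous handler."""
    # Same disposition nohup sets before it executes the command:
    # a hangup from the closing terminal no longer terminates us.
    return signal.signal(signal.SIGHUP, signal.SIG_IGN)
```

Calling ignore_hangup() at the top of script.py makes python script.py & survive the shell's SIGHUP, although output written to a terminal that has gone away can still fail, so redirecting stdout and stderr to a file remains advisable.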

99% confidence
A

/etc/nginx/sites-available/ stores ALL nginx site configurations (enabled and disabled). Files here are not loaded by nginx automatically. /etc/nginx/sites-enabled/ contains symbolic links to the configurations nginx should actually load. The main nginx.conf contains include /etc/nginx/sites-enabled/*; inside its http block, so only enabled sites are loaded.

Workflow: (1) Create the config in sites-available: sudo nano /etc/nginx/sites-available/myapp.conf. (2) Enable the site by creating a symlink: sudo ln -s /etc/nginx/sites-available/myapp.conf /etc/nginx/sites-enabled/. (3) Test syntax: sudo nginx -t. The test only covers files nginx would load, so run it after the symlink exists. (4) Reload nginx: sudo nginx -s reload. (5) Disable the site: sudo rm /etc/nginx/sites-enabled/myapp.conf, then reload (the config stays in sites-available for re-enabling later). (6) Re-enable: re-create the symlink and reload.

Benefits: Easy enable/disable without deleting configs, version control friendly (track sites-available only), clear separation of active vs inactive configs. Default files: the default symlink in sites-enabled → sites-available/default.

Best practice: (1) Store configs in sites-available with descriptive names (api.example.com.conf). (2) Use absolute paths as symlink targets; a relative target is resolved from the link's directory and easily ends up broken. (3) Treat sites-available as the place to edit; never put regular files in sites-enabled. (4) Delete the symlink to disable a site, not the source file. (5) Always run nginx -t before reload.

Common error: Editing sites-enabled/default without realising it is a symlink, which modifies sites-available/default. Verify symlinks with ls -la /etc/nginx/sites-enabled/, which shows each entry as -> /etc/nginx/sites-available/file.

Note: This structure is a Debian/Ubuntu convention. RHEL/CentOS packages load /etc/nginx/conf.d/*.conf directly (no sites-available pattern).
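The enable and disable steps are plain symlink operations, which the sketch below mirrors with pathlib. The function names enable_site and disable_site are hypothetical, and the directories are passed in as parameters so the logic can be tried in a scratch directory rather than under /etc/nginx.

```python
import pathlib


def enable_site(available_dir, enabled_dir, name):
    """Symlink enabled_dir/name to available_dir/name using an absolute target."""
    source = pathlib.Path(available_dir, name).resolve()
    if not source.is_file():
        raise FileNotFoundError(source)
    link = pathlib.Path(enabled_dir, name)
    if not link.is_symlink():
        # Absolute target: the link stays valid wherever it is read from.
        link.symlink_to(source)
    return link


def disable_site(enabled_dir, name):
    """Remove only the symlink; the file in sites-available is untouched."""
    link = pathlib.Path(enabled_dir, name)
    if link.is_symlink():
        link.unlink()
```

Neither function reloads nginx; a syntax test and reload would still follow each change, as in the workflow above.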

99% confidence
A

postgresql:// is the canonical PostgreSQL connection URI scheme in the PostgreSQL documentation; the URI follows the generic syntax of RFC 3986. Format: postgresql://user:password@host:port/database?options. postgres:// is a shorter alias that libpq also accepts. psycopg2, Psycopg 3, and asyncpg accept both schemes, but SQLAlchemy does not (see below), so postgresql:// is the safe choice everywhere.

Connection string structure: postgresql://username:password@hostname:port/database_name?sslmode=require. Components: username (database user), password (percent-encoded if it contains special characters), hostname (localhost, IP, or domain), port (optional, default 5432), database_name (target database), query parameters (sslmode, connect_timeout, application_name).

Special character encoding: a password containing @ must be percent-encoded: postgresql://user:p%40ssword@host/db (@ → %40).

Examples: Local: postgresql://postgres:pass@localhost/mydb. Remote with SSL: postgresql://user:pass@<remote-host>:5432/prod?sslmode=require. Unix socket: postgresql:///dbname?host=/var/run/postgresql. Asyncpg: postgresql://user:pass@host/db (identical syntax).

Library support: SQLAlchemy: since 1.4 the postgres:// scheme is rejected (the postgres dialect name was removed), so only postgresql:// works. Psycopg2/Psycopg 3: both schemes work. Asyncpg: accepts both.

Best practice: Always use postgresql:// in new code. When legacy code or a hosting platform supplies postgres://, rewrite the prefix yourself before handing the URL to SQLAlchemy; SQLAlchemy does not translate it. Environment variable: DATABASE_URL=postgresql://user:pass@host/db. Connection pooling: pool_size and max_overflow are create_engine() arguments, not URL query parameters, e.g. create_engine(url, pool_size=10, max_overflow=20). Security: Never log connection strings (they expose credentials); use environment variables or a secrets manager.
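Percent-encoding and scheme normalisation are both easy to get wrong by hand. The sketch below uses only the standard library; build_dsn and normalize_scheme are hypothetical helper names, and the host is inserted as given, without validation.

```python
from urllib.parse import quote


def build_dsn(user, password, host, database, port=5432, **params):
    """Assemble a postgresql:// URL, percent-encoding user, password, and values."""
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"
    query = "&".join(f"{key}={quote(str(value), safe='')}" for key, value in params.items())
    return f"{url}?{query}" if query else url


def normalize_scheme(url):
    """Rewrite a leading postgres:// to postgresql://; leave other URLs alone."""
    prefix = "postgres://"
    if url.startswith(prefix):
        return "postgresql://" + url[len(prefix):]
    return url
```

For example, build_dsn('user', 'p@ssword', 'host', 'db') encodes the @ in the password as %40, and normalize_scheme can be applied to a DATABASE_URL value before it is passed to a library that insists on postgresql://.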

99% confidence
A

nginx -s reload performs a graceful reload: (1) The master process receives the reload signal (SIGHUP) and checks the new configuration itself, the same check nginx -t performs. (2) If it is valid, the master starts new worker processes with the new config. (3) Old workers stop accepting new connections and finish the requests they are serving. (4) Old workers exit once those requests complete. (5) New workers handle all new requests. There is no downtime and in-flight requests are not dropped. Syntax: nginx -s reload or systemctl reload nginx. Use for: config changes (server blocks, upstreams, SSL certs), adding/removing sites, updating proxy settings, changing the worker_processes count.

Restart: nginx has no -s restart signal; -s accepts only stop, quit, reload, and reopen. A restart is systemctl restart nginx, or nginx -s stop && nginx. It (1) stops all nginx processes (master and workers), (2) closes all connections, and (3) starts a fresh master and workers. Expect brief downtime and dropped connections. Use for: picking up a new nginx binary after an upgrade, a complete reset, or cases where reload does not apply a change.

Full stop: nginx -s quit (graceful, waits for workers to finish current requests) or nginx -s stop (fast shutdown, SIGTERM).

Workflow for config changes: (1) Edit config files. (2) Test syntax: nginx -t (critical step, prevents breaking production). (3) Reload: nginx -s reload or systemctl reload nginx. (4) Verify: curl http://localhost or systemctl status nginx. (5) Check logs if there are issues: tail -f /var/log/nginx/error.log.

Error handling: If nginx -t fails, do not reload. If you reload anyway, the master rejects the invalid configuration, rolls back, and keeps serving with the old one (safe), logging the error to error.log.

Best practice: Use reload for config changes (zero downtime). Use restart only for nginx binary upgrades or when reload fails. Monitor: nginx -v and nginx -V show the version and build options of the installed binary, which can differ from the running process until it is restarted; ps aux | grep nginx shows master and worker processes with start times, so fresh worker start times confirm a reload took effect.
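The test-then-reload workflow can be wrapped so that a failed syntax check never reaches the reload step. This is a sketch, assuming nginx is on PATH and the caller has sufficient privileges; reload_nginx is a hypothetical helper, and the run parameter exists only so the function can be exercised without a real nginx.

```python
import subprocess


def reload_nginx(run=subprocess.run):
    """Run `nginx -t`; send the reload signal only if the check passes.

    Returns (ok, message), where message is the stderr of the last command.
    """
    check = run(["nginx", "-t"], capture_output=True, text=True)
    if check.returncode != 0:
        # Invalid config: stop here, the running nginx is left untouched.
        return False, check.stderr
    reload = run(["nginx", "-s", "reload"], capture_output=True, text=True)
    return reload.returncode == 0, reload.stderr
```

A deployment script could call reload_nginx() and abort with the returned message when the first element is False.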

99% confidence