Software Engineering FAQ & Answers
842 expert Software Engineering answers researched from official documentation. Every answer cites authoritative sources you can verify.
Jump to section:
Common Bug Patterns > Resource cleanup and leaks
142 questions
Python has no built-in memory limit for processes. Memory usage is limited only by available system RAM and any ulimit or container constraints applied externally.
The default stack size for spawned threads in Rust is 2 MB on most platforms, configurable via the std::thread::Builder::stack_size method.
MySQL 8.0 has a default max_connections setting of 151 connections, with a theoretical maximum of 100,000 connections depending on system resources.
V8's default heap limit was historically approximately 700 MB on 32-bit systems and 1.4 GB on 64-bit systems; in recent Node.js versions, --max-old-space-size defaults to a value derived from available physical memory (roughly half, up to about 4 GB).
The default timeout for TCP connections in CLOSE_WAIT state is 60 seconds, configurable via nf_conntrack_tcp_timeout_close_wait.
The default initial value for eventfd is 0, unless a different value is specified as the initval argument to eventfd().
Python's requests library uses HTTPConnectionPool which defaults to a maximum of 10 connections (pool_connections) and a maximum of 10 connections per host (pool_maxsize).
Detached processes (detached: true) continue running independently after the parent process exits. They become orphaned and are adopted by init (PID 1) or the system's init system.
Files created with File.createTempFile() persist until explicitly deleted. If the deleteOnExit() method is called, the file is scheduled for deletion when the JVM terminates normally (but not on crash or kill -9).
Docker networks persist until explicitly removed with docker network rm. Networks are not automatically deleted when containers attached to them are removed.
The Linux kernel defines SEMMNI (maximum number of semaphore sets) defaulting to 128, SEMMSL (maximum semaphores per set) defaulting to 250, and SEMMNS (total semaphores system-wide) defaulting to 32000.
Calling std::thread destructor without join() or detach() invokes std::terminate(), crashing the program. Threads must be explicitly joined or detached before destruction.
The default somaxconn (net.core.somaxconn) is 128 on Linux (raised to 4096 in kernel 5.4+). This limits the maximum length of the pending connection queue for listening sockets.
The net.ipv4.tcp_max_tw_buckets kernel parameter defaults to 8192, limiting the maximum number of sockets in TIME_WAIT state to prevent resource exhaustion.
The default soft limit for open file descriptors per process on Linux is typically 1024, which can be viewed with ulimit -n and modified up to the hard limit (usually 65536 or higher).
Commits made in a detached HEAD state persist until a branch or tag references them or they are garbage collected (unreachable commits expire after 30+ days with default gc settings).
Java's ReentrantLock.lock() blocks indefinitely with no timeout. The tryLock() method returns false immediately if the lock is not available, and tryLock(long timeout, TimeUnit unit) waits up to the specified time.
The default preparedStatementCacheQueries is 256, meaning up to 256 prepared statements are cached per connection.
psycopg2's connection pool classes take explicit minconn and maxconn arguments rather than library defaults; the pool pre-creates minconn connections and grows as needed up to maxconn, bounded in practice by the PostgreSQL server's max_connections.
eventfd file descriptors are automatically closed when all references are closed or the process exits. The internal counter persists in the kernel until the last file descriptor referring to it is closed.
Detached pthreads (pthread_detach) automatically release their resources when they terminate, without requiring pthread_join. Joinable pthreads retain resources (including stack) until pthread_join is called.
Vulkan objects (buffers, images, fences, etc.) persist until explicitly destroyed with vkDestroy* functions (e.g., vkDestroyBuffer, vkDestroyImage). No automatic cleanup occurs.
The default maximum size of a single shared memory segment (/proc/sys/kernel/shmmax) is 4,294,967,295 bytes (4 GB on 32-bit) or larger on 64-bit systems, often set to half of physical memory.
Linux's /proc/sys/kernel/shmmni controls the maximum number of shared memory segments and defaults to 4096 segments system-wide.
The default maximum heap size (-Xmx) is 1/4 of physical memory on systems with 2 GB+ RAM, subject to platform-specific constraints.
Apache's default KeepAliveTimeout is 5 seconds. The default MaxKeepAliveRequests is 100 requests per connection.
Linux creates pipes as file descriptors, so the effective limit is the process's RLIMIT_NOFILE (open file descriptor limit), typically 1024 by default.
By default, NamedTemporaryFile deletes the file automatically when it is closed. If delete=False is specified, the file persists and must be manually removed.
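A minimal Python sketch of both behaviors (POSIX semantics assumed; the snippet never reopens the file by name, so it also works where that is restricted):

```python
import os
import tempfile

# Default: the file is deleted as soon as it is closed.
tmp = tempfile.NamedTemporaryFile()
auto_path = tmp.name
tmp.close()
auto_deleted = not os.path.exists(auto_path)

# delete=False: the file survives close() and must be removed manually.
tmp = tempfile.NamedTemporaryFile(delete=False)
manual_path = tmp.name
tmp.close()
survived = os.path.exists(manual_path)
os.unlink(manual_path)  # explicit cleanup

print(auto_deleted, survived)  # → True True
```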
When multiple defer statements are used in a Go function, they are executed in LIFO (Last In, First Out) order—the most recently called defer executes first.
Named semaphores created with sem_open persist until removed with sem_unlink; sem_close only closes the calling process's handle. Unnamed semaphores in shared memory persist as long as the shared memory itself.
Docker volumes persist until explicitly removed with docker volume rm. Volumes are not automatically deleted when containers using them are removed.
Redis uses the operating system's file descriptor limits. Redis reserves 32 descriptors for internal use, so serving the default 10,000 concurrent clients requires a limit of at least 10,032; if the limit is lower, Redis reduces maxclients and logs a warning at startup.
The default pipe buffer size on macOS is 16,384 bytes (16 KB), and the kernel can grow it to 64 KB under load. PIPE_BUF, the atomic-write limit, is a separate and much smaller constant (512 bytes on macOS).
Git reflog entries expire after 90 days by default (gc.reflogExpire), after which they are eligible for garbage collection.
Process objects in .NET persist until the process exits or is explicitly killed with .Kill(). The .Close() method releases OS resources but does not terminate the process.
Python's with statement and context managers (defined by enter and exit methods) ensure that resources are properly cleaned up regardless of whether an exception occurs, commonly used for file I/O, locks, and database connections.
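A minimal hand-rolled context manager illustrating that __exit__ runs even when the body raises (the Resource class is illustrative, not a real library type):

```python
class Resource:
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.closed = True  # cleanup happens here, exception or not
        return False        # do not suppress the exception

r = Resource()
try:
    with r:
        raise ValueError("boom")
except ValueError:
    pass

print(r.closed)  # → True
```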
HikariCP's leakDetectionThreshold defaults to 0 (disabled). When set to a value greater than 0, it logs a warning if a connection is held longer than the threshold milliseconds.
Go has no hard limit on the number of goroutines. The scheduler can run millions of goroutines limited only by available memory. Each goroutine starts with a stack of approximately 2 KB (since Go 1.4), which grows as needed.
Python threads default to non-daemon; a thread's daemon status is inherited from the creating thread unless set explicitly. The Python interpreter waits for all non-daemon threads to exit before terminating.
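A quick stdlib check of the default daemon flag for threads created from the main thread:

```python
import threading

# Created from the (non-daemon) main thread, so daemon defaults to False;
# passing daemon=True overrides the inherited value.
t1 = threading.Thread(target=lambda: None)
t2 = threading.Thread(target=lambda: None, daemon=True)

print(t1.daemon, t2.daemon)  # → False True
```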
Unix domain sockets created in the filesystem (socket path) must be explicitly removed with unlink() or rm after close() to clean up the socket file. Bound sockets in the abstract namespace (starting with null byte) are automatically cleaned up when closed.
File handles persist until explicitly closed with CloseHandle(). Handles are not automatically closed when the process crashes (but are closed on normal exit). Unclosed handles leak system resources.
The default maximum pool size for .NET SQL Server connections is 100, and the default minimum pool size is 0.
Timeouts created with setTimeout() execute once and are automatically cleaned up after firing (unless they reference the timer object elsewhere). Unfired timers persist until the process exits or they are cleared with clearTimeout().
HikariCP's default maximumPoolSize is 10. The default minimumIdle is the same as maximumPoolSize, making the pool fixed-size unless configured otherwise.
Linux's ephemeral port range (net.ipv4.ip_local_port_range) defaults to 32768-60999 (28,232 ports). The number of available ports equals the range size minus the number of ports in TIME_WAIT.
There is no explicit limit on the number of epoll instances a process can create, other than the general file descriptor limit (RLIMIT_NOFILE).
Virtual machine domains persist after libvirtd restarts. Domains must be explicitly destroyed with virDomainDestroy() or virsh destroy.
Memory mappings created with mmap are automatically unmapped when the process terminates. During execution, they persist until explicitly unmapped with munmap() or when the process exits.
Named semaphores (CreateSemaphore) persist until all handles are closed with CloseHandle(). The semaphore is automatically destroyed when the last handle is closed.
PersistentVolumes persist until explicitly deleted, regardless of Pod lifecycle. The reclaim policy determines behavior after claim deletion: manually created PVs default to Retain, while dynamically provisioned PVs inherit their StorageClass's reclaimPolicy (Delete by default).
When a process reaches its file descriptor limit, any attempt to open new files, sockets, or pipes will fail with the error "EMFILE: Too many open files" or "Too many open files in system."
ThreadPoolExecutor has no built-in defaults—core and maximum pool sizes are constructor arguments. Executors.newCachedThreadPool() uses a core pool size of 0 and a maximum pool size of Integer.MAX_VALUE (2,147,483,647); Executors.newFixedThreadPool(n) uses n for both.
Intervals created with setInterval() continue firing until explicitly cleared with clearInterval() or the process terminates. They persist even if the referenced callback function goes out of scope.
Windows file locks (LockFile/UnlockFile) are released by the operating system when the owning process terminates, even abnormally, but the documentation warns that how long this takes is unspecified—locks can appear to persist briefly after a crash. Unlocking explicitly before closing handles avoids the ambiguity.
Pid files created in /var/run (or /run on modern systems) must be explicitly removed by the daemon process on exit. Many daemons fail to clean up pid files when killed abruptly.
OpenCL buffers (cl_mem) persist until explicitly released with clReleaseMemObject(). Memory leaks if objects are not released before the program exits.
Cmd.Start() starts a process that runs independently. The process continues until it exits or is explicitly killed with Cmd.Process.Kill(). No automatic cleanup occurs when the parent exits.
The default inotify max_user_watches is 8,192 per user, though this was increased to 524,288 in many modern distributions. Each watched directory entry consumes one watch.
Goroutines cannot be forcefully stopped. They exit only when their function returns or when the program exits. Resources must be explicitly cleaned up via context cancellation, channel signaling, or sync.WaitGroup.
The default timeout is 0, meaning no timeout (sockets wait indefinitely).
Event listeners added with addEventListener remain attached until explicitly removed with removeEventListener or when the event emitter object is garbage collected.
The default net.core.rmem_default is 212,992 bytes (208 KB), and the maximum net.core.rmem_max is 4,194,304 bytes (4 MB) unless modified.
The default inotify max_user_instances is 128, limiting the number of inotify instances (watches) a user can create. Each instance can watch up to max_user_watches (default 8192).
The default keepAliveTime for ThreadPoolExecutor cached pools is 60 seconds, after which idle threads are terminated if there are more than corePoolSize threads.
The default maximum number of open connections in Go's database/sql is 0 (unlimited), and the default maximum idle connections is 2. The default maximum lifetime of a connection is 0 (no expiration).
Processes started with ProcessBuilder persist until they exit or are destroyed with destroy() (sends SIGTERM on Unix). destroyForcibly() sends SIGKILL. No automatic cleanup occurs when the parent exits.
Windows TCP keep-alive defaults to 2 hours (7,200,000 milliseconds) for keepalive time and 1 second (1,000 ms) for the probe interval; the probe count is fixed at 10 on Windows Vista and later.
POSIX record locks (fcntl) are released when the process closes any file descriptor referring to the locked file—not just the descriptor that acquired the lock. Locks are not inherited by the child process across fork().
The default nf_conntrack_max is calculated based on available RAM (typically 65536 on systems with 1GB+ RAM), limiting the maximum number of concurrent connections the kernel can track.
CUDA memory allocated with cudaMalloc persists until explicitly freed with cudaFree(). The driver reclaims a process's allocations when the process exits, but leaked allocations accumulate for the lifetime of the process and its CUDA context.
Forked child processes become independent processes. If the parent exits before the child, the child becomes orphaned and is adopted by init (PID 1).
The default terminationGracePeriodSeconds is 30 seconds, allowing the pod time to shut down gracefully before being forcefully killed.
Containers persist after the container process exits until explicitly removed with docker rm. Docker ps -a shows stopped containers awaiting removal.
The net.ipv4.tcp_keepalive_time defaults to 7200 seconds (2 hours), net.ipv4.tcp_keepalive_intvl defaults to 75 seconds, and net.ipv4.tcp_keepalive_probes defaults to 9 probes.
OpenGL contexts persist until explicitly destroyed with wglDeleteContext(), glXDestroyContext(), or equivalent platform-specific function. The context is not automatically destroyed when the window is closed.
Python's threading.Lock.acquire() blocks indefinitely (no timeout) unless a timeout argument is explicitly provided.
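A short sketch of the acquisition modes against a lock that is already held (threading.Lock is not reentrant, so the same thread's second attempt fails):

```python
import threading

lock = threading.Lock()
lock.acquire()  # now held

# Non-blocking attempt fails immediately; timed attempt gives up after ~0.1 s.
got_nonblocking = lock.acquire(blocking=False)
got_timed = lock.acquire(timeout=0.1)

lock.release()
print(got_nonblocking, got_timed)  # → False False
```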
Linux's net.ipv4.tcp_fin_timeout defaults to 60 seconds, controlling how long the kernel waits for the remote side to close a connection before forcibly closing it.
timerfd file descriptors are automatically closed on process exit. The timer itself is disarmed and all resources freed when the last file descriptor is closed.
The default stack size for Java threads is 1 MB on 64-bit JVMs (or 512 KB on some 32-bit systems), configurable via the -Xss JVM flag.
The default soft limit for open files on macOS is 256 for standard processes, though this can be increased with ulimit or launchd configuration. The hard limit is typically 10,240 or higher depending on system version.
If no backlog argument is provided to socket.listen(), Python (3.5.1+) chooses a default of min(SOMAXCONN, 128); SOMAXCONN itself is 128 on Linux and macOS, and as low as 5 on some older Unix systems.
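A sketch binding to an OS-assigned ephemeral port; the argument-less listen() uses the default backlog, and an explicit value like listen(5) would override it:

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
srv.listen()                # backlog defaults to min(SOMAXCONN, 128)
port = srv.getsockname()[1]

cli = socket.create_connection(("127.0.0.1", port), timeout=2)
conn, _ = srv.accept()

cli.close()
conn.close()
srv.close()
print(port > 0)  # → True
```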
c3p0's default checkoutTimeout is 0, meaning if a connection is unavailable, the calling thread will wait indefinitely.
The maximum number of kernel object handles per process is approximately 16.7 million (2^24), though practical limits are set by available kernel memory. Each handle consumes kernel memory.
The default timeout for established TCP connections in nf_conntrack is 432,000 seconds (5 days), configurable via nf_conntrack_tcp_timeout_established.
The kernel closes all open file descriptors including sockets when a process terminates (via exit or signal), sending FIN packets for established connections and automatically freeing all resources. Orphaned sockets remain in TIME_WAIT briefly.
setImmediate callbacks execute once in the next check phase of the event loop and are automatically cleaned up after execution.
Shared memory segments persist until explicitly removed with shmctl(IPC_RMID) or until the system reboots, even after all processes have detached.
.NET has no default memory limit for 64-bit processes. 32-bit processes are limited to 2 GB (or 3 GB with /3GB boot flag). Container limits can be applied via configuration.
The default pipe buffer size is 65536 bytes (64 KB) on Linux, configurable via fcntl() with F_SETPIPE_SZ up to the maximum /proc/sys/fs/pipe-max-size (typically 1,048,576 bytes or 1 MB).
Older Boost.Thread versions call detach() in the destructor if the thread is joinable but not joined, silently leaking thread resources instead of crashing. Newer versions match C++11 std::thread and call std::terminate() instead.
The default initial heap size (-Xms) is 1/64 of physical memory, with a minimum of 8 MB.
urllib3's HTTPConnectionPool defaults to maxsize=1 connection per host (requests raises this to 10 via its adapter), and PoolManager's default num_pools is 10.
Go has no explicit per-goroutine memory limit. Goroutines share the process's memory space, limited only by available system memory and the GOMEMLIMIT environment variable (introduced in Go 1.19) which defaults to unlimited.
Rust threads (std::thread) are detached when their JoinHandle is dropped—dropping does not join or block. Call JoinHandle::join() explicitly to wait for the thread to finish.
The default size of /dev/shm is half of physical RAM (50%), configurable via the size mount option in /etc/fstab or via remount.
Windows ephemeral port range defaults to 49152-65535 (16,384 ports) on modern Windows versions, with dynamic port range configurable via netsh.
Python inherits the system's ulimit -n value, which is typically 1024 or 256 on many Unix-like systems. This can be checked via resource.getrlimit(resource.RLIMIT_NOFILE).
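A minimal sketch of that check (the resource module is Unix-only, and the printed values vary per system):

```python
import resource

# Soft limit is what the process is held to; hard limit is the ceiling
# an unprivileged process may raise the soft limit to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)  # e.g. 1024 1048576 — system-dependent
```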
File locks are released automatically when the process terminates. fcntl record locks are additionally released when the process closes any descriptor referring to the file; flock locks persist until every duplicate of the locking descriptor is closed.
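A Unix-only sketch of flock behavior through Python's fcntl wrapper: two independent open() calls create separate open file descriptions, so the second exclusive lock attempt fails until the first descriptor is closed.

```python
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f1 = open(path, "w")
f2 = open(path, "w")

fcntl.flock(f1, fcntl.LOCK_EX)  # exclusive lock via the first descriptor

blocked = False
try:
    fcntl.flock(f2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # denied while f1 holds it
except BlockingIOError:
    blocked = True

f1.close()  # closing the locking descriptor releases the flock lock
fcntl.flock(f2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # now succeeds
f2.close()
os.unlink(path)
print(blocked)  # → True
```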
Files created with Path.GetTempFileName() are not automatically deleted. The caller is responsible for deleting them when no longer needed.
epoll file descriptors are automatically closed on process exit. The epoll instance and all associated resources are freed when the last file descriptor referring to it is closed.
psycopg2 does not buffer statements by default—it sends each execute() call immediately to the server unless using a cursor with a named server-side cursor.
MongoDB recommends a file descriptor limit of at least 64,000 on Unix-like systems and issues a startup warning if the system limit is below that threshold.
pthread_mutex_t in shared memory persists after process exit unless explicitly destroyed with pthread_mutex_destroy(). The mutex must be in an unlocked state before destruction.
process.nextTick callbacks are queued and executed before the event loop continues. They cannot be cancelled once queued and persist until execution or process termination.
Files created with mkstemp must be explicitly removed with unlink(). The function creates a uniquely-named file but does not automatically delete it. The caller is responsible for cleanup.
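Python's tempfile.mkstemp follows the same contract as the C function, so the cleanup obligation can be sketched in the stdlib:

```python
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".log")
try:
    os.write(fd, b"hello")
finally:
    os.close(fd)  # closing the descriptor does not remove the file

exists_before = os.path.exists(path)
os.unlink(path)  # the caller must delete it explicitly
exists_after = os.path.exists(path)
print(exists_before, exists_after)  # → True False
```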
Ruby imposes no fixed thread limit; the practical maximum (often a few thousand) is determined by per-thread stack allocation, controlled by RUBY_THREAD_VM_STACK_SIZE, together with OS limits. Ruby 3.0+ Ractors have separate constraints.
The default pthread_mutex_t created with PTHREAD_MUTEX_INITIALIZER is a "normal" mutex (PTHREAD_MUTEX_DEFAULT) which does not provide error checking or recursive locking behavior.
RAII (Resource Acquisition Is Initialization) is a C++ programming idiom where resource acquisition is tied to object initialization, and resource release is automatically performed when the object goes out of scope through its destructor. This ensures exception-safe resource cleanup without explicit release code.
If ConnectTimeout is not set in .NET's SqlConnection, the default timeout is 15 seconds.
Node.js has no built-in limit on concurrent async operations, but each async I/O operation consumes a file descriptor or handle, making the effective limit equal to the system's file descriptor limit.
Destroying a locked pthread_mutex with pthread_mutex_destroy() results in undefined behavior. The POSIX standard requires the mutex to be unlocked before destruction.
Child processes persist until they exit naturally or are explicitly killed with child.kill(). If the parent exits without dropping the Child object, the process becomes orphaned and adopted by init.
Named mutexes (CreateMutex) persist until all handles are closed with CloseHandle(). The mutex is automatically destroyed when the last handle is closed.
signalfd file descriptors are automatically closed on process exit. Signal masks must still be managed via sigprocmask even when using signalfd.
The default stack size for std::thread is implementation-defined, typically matching the platform thread default (8 MB on Linux via pthreads, 1 MB on Windows).
Node.js cluster inherits the system's file descriptor limits. Each worker in a cluster shares the same listening socket but has its own set of file descriptors for client connections.
Directories created with fs.mkdtemp must be explicitly removed. The function creates a unique temporary directory but does not automatically delete it.
The default loginTimeout for standard JDBC connections is 0, meaning it waits indefinitely for a connection to be established.
The default timeout for TCP connections in TIME_WAIT state is 120 seconds, configurable via nf_conntrack_tcp_timeout_time_wait.
Windows allows up to approximately 16.7 million (2^24) kernel object handles per process, though the practical ceiling depends on kernel memory and the specific Windows edition; USER and GDI handles have much lower per-process quotas (10,000 by default).
A resource leak occurs when a program acquires a resource (such as memory, file handles, database connections, network sockets, or file descriptors) but fails to release it back to the system when it's no longer needed. This eventually leads to resource exhaustion, causing the program or system to crash or become unresponsive.
Subprocesses persist until they exit naturally or are explicitly terminated with .kill() or .terminate(). If the parent exits without waiting, subprocesses may become orphaned (adopted by init).
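A sketch of explicit termination and reaping with a throwaway child (the 60-second sleep is arbitrary; the child is killed long before it elapses):

```python
import subprocess
import sys

# Spawn a child that would run for 60 s, then terminate and reap it so it
# neither outlives the parent nor lingers as a zombie.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
child.terminate()            # SIGTERM on Unix, TerminateProcess on Windows
rc = child.wait(timeout=10)  # wait() reaps the child, preventing a zombie
print(rc != 0)  # → True (killed, not a normal exit)
```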
Unreferenced Git objects are eligible for garbage collection after 2 weeks (default gc.pruneExpire), unless explicitly pruned with git gc --prune=now.
Python asyncio tasks run until completion unless explicitly cancelled. The event loop keeps only weak references to tasks, so code must hold a strong reference to each task created with create_task() (or await/gather it) to prevent the task from being garbage collected mid-execution.
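A minimal sketch: collecting the tasks in a list keeps strong references, and gather() awaits them all (worker is an illustrative coroutine, not a library API):

```python
import asyncio

async def worker(n):
    await asyncio.sleep(0)
    return n * 2

async def main():
    # The list holds strong references until gather() has awaited every task.
    tasks = [asyncio.create_task(worker(i)) for i in range(3)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # → [0, 2, 4]
```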
Pods persist after termination in a Completed or Failed state until garbage collection removes them. The default termination grace period is 30 seconds.
The default number of worker threads in Tokio's multi-thread runtime equals the number of logical CPU cores, configurable via tokio::runtime::Builder::worker_threads.
NGINX has a default worker_connections value of 1024 per worker process. The total maximum connections equals worker_connections multiplied by the number of worker processes.
The default idle timeout is 10,000 milliseconds (10 seconds) after which idle connections are terminated. The default connection timeout is 0 (no timeout).
The try-with-resources statement automatically closes all resources that implement the AutoCloseable interface when the try block exits, either normally or due to an exception. Resources are closed in the reverse order of their creation.
PostgreSQL has a default max_connections setting of 100 connections, which can be modified in postgresql.conf or with the -c command-line option when starting the server.
Event objects persist until all handles are closed with CloseHandle(). The event is automatically destroyed when the last handle is closed.
Ruby threads are scheduled under the global VM lock (GVL) and are cleaned up automatically when they complete. Unlike Python, Ruby kills all remaining threads when the main thread exits; call Thread#join to wait for them.
Tokio tasks are automatically dropped when they complete or are cancelled. If a task contains resources requiring cleanup, they must be dropped within the task's future before completion.
The default maxSockets for Node.js's http.globalAgent is Infinity (unlimited) since Node.js 0.12.0; the default maxFreeSockets, the number of idle sockets kept open for keep-alive reuse, is 256.
The default net.core.wmem_default is 212,992 bytes (208 KB), and the maximum net.core.wmem_max is 4,194,304 bytes (4 MB) unless modified.
V8's garbage collector runs automatically based on heuristics and memory allocation patterns, not on a fixed timeout. It uses generational garbage collection with incremental marking.
Child processes persist until explicitly killed with child.kill() (optionally with a signal name, e.g. child.kill('SIGTERM')) or until they exit naturally. The parent process must handle exit events to avoid zombie processes.
macOS ephemeral port range defaults to 49152-65535 (16,384 ports), controlled by the net.inet.ip.portrange sysctl values (first and last).
There is no explicit limit on the number of timers, but each timer consumes memory and the system's file descriptor limit may indirectly constrain resources.
The default stack size for a new thread created with pthread_create is 8 MB (8,388,608 bytes) on most Linux systems (RLIMIT_STACK).
POSIX shared memory objects created with shm_open persist until explicitly unlinked with shm_unlink() or system reboot, similar to files in /dev/shm.
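Python's multiprocessing.shared_memory is built on POSIX shared memory, so the same lifetime rule can be sketched in the stdlib (the 16-byte size is arbitrary):

```python
from multiprocessing import shared_memory

# Create a named shared memory block; it persists until unlink().
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# A second handle can attach by name while the block exists.
other = shared_memory.SharedMemory(name=shm.name)
data = bytes(other.buf[:5])

other.close()
shm.close()
shm.unlink()  # without this, the segment would outlive the process
print(data)  # → b'hello'
```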
Node.js server.listen() defaults to 511 as the backlog parameter, which can be overridden by passing an explicit backlog value.
Fix Verification > Pre-commit verification checklist
79 questions
--no-ensure-ascii preserves unicode characters instead of converting to escape sequences.
Checks that test files are named according to the conventions of the specified test framework.
Use -p / --pattern with a regex pattern (e.g., --pattern, release/.*). Can be specified multiple times.
Use --ignore=type1,type2,… to ignore requirements for specific builtin types.
Run pre-commit install which sets up the hooks in .git/hooks/pre-commit.
auto (default, replaces with most frequent), crlf (forces CRLF), lf (forces LF), and no (checks without modifying).
Run pre-commit sample-config to generate a very basic configuration.
False, meaning pre-commit will continue running hooks after failures unless set to true.
A framework for managing and maintaining multi-language pre-commit hooks that specifies a list of hooks to use and manages the installation and execution of any hook written in any language before every commit.
Runs during git push after remote refs have been updated but before any objects are transferred, receiving the remote name and location plus a list of to-be-updated refs through stdin.
Prevents addition of new git submodules, intended as a helper to migrate away from submodules.
The path to the file holding the commit message, the type of commit, and the commit SHA-1 if this is an amended commit.
Use args: [--branch, staging, --branch, main] or -b / --branch (can be specified multiple times) to set protected branches.
The pre-commit hook is run first, before typing in a commit message, used to inspect the snapshot that's about to be committed. Exiting non-zero aborts the commit.
Instead of loading files, it simply parses them for syntax, enabling extensions and unsafe constructs which would otherwise be forbidden. This implies --allow-multiple-documents.
Run by commands that replace commits such as git commit --amend and git rebase, receiving which command triggered the rewrite as an argument and a list of rewrites on stdin.
--allow-missing-credentials allows the hook to pass when no credentials are detected.
.pre-commit-config.yaml which should be placed at the root of your project.
Runs before rebasing anything and can halt the process by exiting non-zero, commonly used to disallow rebasing commits that have already been pushed.
Runs before the commit message editor fires up but after the default message is created, allowing editing of the default message before the commit author sees it.
In the hooks subdirectory of the Git directory, typically .git/hooks.
Set always_run: false to allow the hook to be skipped according to file filters, but note that empty commits (git commit --allow-empty) would always be allowed.
auto mode which automatically replaces with the most frequent line ending. This is the default argument.
--assume-in-merge allows running the hook even when no merge is currently in progress.
Both main and master branches are protected by default if no branch argument is set.
Runs after a successful git merge command, used to restore data in the working tree that Git can't track such as permissions data.
Files that contain merge conflict strings such as <<<<<<<, =======, >>>>>>>.
Run pre-commit autoupdate which brings hooks to the latest tag on the default branch.
Debugger imports and Python 3.7+ breakpoint() calls in Python source code.
Runs after a successful git checkout, used to set up the working directory properly for the project environment.
Takes the path to a temporary file containing the commit message as a parameter; exiting non-zero aborts the commit process, used to validate commit messages.
Yes, no-commit-to-branch is configured by default to always_run: true, ignoring any settings of files, exclude, types or exclude_types.
2 spaces, which can be controlled with --indent option (specify a number for spaces or a string of whitespace).
--allow-multiple-documents which allows YAML files using the multi-document syntax.
Use --additional-github-domain DOMAIN which can be repeated multiple times (e.g., github.example.com for GitHub Enterprise).
--pytest (default: .*_test\.py), --pytest-test-first (test_.*\.py), --django or --unittest (test.*\.py).
Symlinks which are changed to regular files with content of a path that the symlink was pointing to, commonly happening on Windows when users clone repositories without symlink creation permissions.
--no-sort-keys retains the original key ordering instead of sorting when autofixing.
Sorts the lines in specified files alphabetically, removes blank lines, does not respect comments, and converts all newlines to line feeds (\n).
Run pip install pre-commit and add pre-commit to your requirements.txt or requirements-dev.txt file.
Files with names that would conflict on a case-insensitive filesystem like macOS HFS+ or Windows FAT.
Runs after the entire commit process is completed, takes no parameters, and can get the last commit by running git log -1 HEAD.
Run pre-commit run --all-files to execute hooks against all files in the repository.
All stages, meaning hooks run at all git hook stages by default unless overridden.
Use --credentials-file CREDENTIALS_FILE to specify additional AWS CLI style configuration files in non-standard locations. Can be repeated multiple times.
[pre-commit] meaning hooks are installed as pre-commit hooks by default.
Requires literal syntax when initializing empty or zero Python builtin types (e.g., {} instead of dict(), [] instead of list()).
Test Failure Analysis > Tracing test inputs to assertions
57 questions
Pytest stores exception information in sys.last_value, sys.last_type, and sys.last_traceback. In interactive use, this allows dropping into postmortem debugging. You can access line numbers with sys.last_traceback.tb_lineno.
Use --test-timeout=0 to prevent tests from timing out when stopping at breakpoints during debugging.
Use --no-file-parallelism to prevent test files from running in parallel, which is necessary for proper debugging.
Use the toHaveBeenLastCalledWith matcher. For example: expect(drink).toHaveBeenLastCalledWith('mango') tests the last call's arguments.
Use the toThrow matcher. For example: expect(() => compileAndroidCode()).toThrow() or expect(() => compileAndroidCode()).toThrow('you are using the wrong JDK') to match error message text.
Use the toHaveBeenCalledWith matcher. For example: expect(drink).toHaveBeenCalledWith(beverage) checks that the function was called with specific arguments.
Use the context manager: with caplog.at_level(logging.INFO): to temporarily change the log level inside a with block.
Use pytest.raises() as a context manager: with pytest.raises(ZeroDivisionError): 1 / 0.
Use the toHaveBeenNthCalledWith(nthCall, arg1, arg2, ...) matcher to check arguments for a specific call number.
Access all log records via caplog.records (list of logging.LogRecord instances) or the final log text via caplog.text.
The faulthandler module is automatically enabled for pytest runs. The faulthandler_timeout=X configuration option can be used to dump the traceback of all threads if a test takes longer than X seconds to finish.
Use --log-file-mode=a to open the log file in append mode instead of overwrite mode.
Call caplog.clear() to reset the captured log records in a test.
Use the -p no:unraisableexception command-line option to disable warnings about unraisable exceptions.
When breakpoint() is called with PYTHONBREAKPOINT at its default value, pytest substitutes its own custom PDB trace UI for the system default Pdb; once the tests complete, the system default Pdb is restored.
Each captured log message shows the module name, line number, log level, and message.
Use caplog.get_records(when) method where when can be "setup", "call", or "teardown" to access logs from specific stages.
Pass --inspect or --inspect-brk to the Vitest CLI to enable the Node.js inspector. The --inspect-brk option breaks before starting execution.
Use --log-file=/path/to/log/file to record the whole test suite logging calls to a file. By default, this opens in write mode (overwrites each run).
Use pytest --pdb --maxfail=3 to drop to PDB for the first three failures only.
Use -p no:faulthandler on the command-line to disable the faulthandler module.
Use toBeCloseTo instead of toEqual for floating-point comparisons because of tiny rounding errors. For example, 0.1 + 0.2 is not strictly equal to 0.3 in JavaScript.
pytest.RaisesGroup accepts: a match parameter (checked against the group message), a check parameter (an arbitrary callable that receives the group), and the boolean flatten_subgroups and allow_unwrapped parameters.
By default, Vitest uses port 9229 for debugging. You can override this by passing a value to --inspect-brk, for example: --inspect-brk=127.0.0.1:3000.
Use the .resolves modifier to unwrap a fulfilled promise. For example: await expect(Promise.resolve('lemon')).resolves.toBe('lemon').
The caplog fixture allows changing the log level for captured log messages using caplog.set_level(logging.INFO).
Use toEqual which recursively checks every field of an object or array. For stricter comparison that considers undefined properties and array sparseness, use toStrictEqual.
Use pytest -x or pytest --exitfirst to stop the testing process after the first failure. This is useful for focusing on debugging one issue at a time.
Use pytest --log-format="%(asctime)s %(levelname)s %(message)s" and optionally --log-date-format="%Y-%m-%d %H:%M:%S" to customize log output format.
Assertion introspection is pytest's ability to automatically display the values of subexpressions when an assertion fails, showing function call returns, attribute accesses, comparisons, and operators without requiring boilerplate code. For example, when assert f() == 4 fails where f() returns 3, pytest displays assert 3 == 4 with + where 3 = f().
Use the toHaveBeenCalledTimes(number) matcher. For example: expect(drink).toHaveBeenCalledTimes(2) verifies the function was called exactly 2 times.
Use the -p no:threadexception command-line option to disable warnings about unhandled thread exceptions.
Use assert (0.1 + 0.2) == pytest.approx(0.3) instead of exact equality checks. pytest.approx works with scalars, lists, dictionaries, and NumPy arrays, and supports comparisons involving NaNs.
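Outside pytest, the standard library's math.isclose offers a similar tolerance-based comparison (a stdlib analogue shown for illustration, not pytest's own API):

```python
import math

# 0.1 + 0.2 accumulates binary floating-point error, so exact equality fails.
assert (0.1 + 0.2) != 0.3

# math.isclose compares within a relative tolerance (default rel_tol=1e-09),
# similar in spirit to pytest.approx's default relative tolerance of 1e-6.
assert math.isclose(0.1 + 0.2, 0.3)
```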
The toBe matcher uses Object.is to test exact equality, which is even better for testing than the === strict equality operator.
Use caplog.record_tuples which contains tuples of (logger_name, level, message). For example: assert caplog.record_tuples == [("root", logging.INFO, "boo arg")]
Use the as clause to capture exception info: with pytest.raises(RuntimeError) as excinfo:. The ExceptionInfo instance provides .type, .value, and .traceback attributes.
pytest.RaisesExc specifies details about contained exceptions within an exception group, including match patterns for exception messages. It provides a matches() method for checking outside of the context manager usage.
The excinfo.group_contains() helper makes it easy to check for the presence of specific exceptions, but it is a poor way to verify that the group does not contain any other exceptions: a test can pass even if unexpected exceptions are present in the group.
Pytest shows the failed assertion with the actual values displayed, including a "where" clause that traces the source of intermediate values. For the assertion assert f() == 4 failing, pytest shows:
E       assert 3 == 4
E        +  where 3 = f()
In strict assertion mode (available via require('node:assert/strict')), non-strict methods behave like their corresponding strict methods, and error messages display diffs. In legacy mode (default require('node:assert')), methods like deepEqual use the == operator which can have surprising results.
You can use Chrome DevTools by opening chrome://inspect in your browser, or use IDE debuggers like VS Code or IntelliJ IDEA.
Use pytest --maxfail=2 (or any number) to stop after a specified number of failures. For example, --maxfail=2 stops after two failures.
Use vitest --inspect-brk --no-file-parallelism to run tests in a single worker suitable for debugging with Chrome DevTools or other debuggers.
Pytest detects and issues warnings visible in the test run summary for unraisable exceptions (exceptions raised in del implementations) and unhandled thread exceptions. The warning categories are pytest.PytestUnraisableExceptionWarning and pytest.PytestUnhandledThreadExceptionWarning.
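These warnings build on the interpreter's unraisable-exception hook. A minimal stdlib sketch of the underlying mechanism (the Leaky class and record_unraisable hook are illustrative, not pytest internals):

```python
import sys

seen = []

def record_unraisable(unraisable):
    # unraisable.exc_type / exc_value / object describe the swallowed exception
    seen.append(unraisable.exc_type.__name__)

sys.unraisablehook = record_unraisable  # pytest installs a hook in a similar spot

class Leaky:
    def __del__(self):
        raise RuntimeError("boom in __del__")  # cannot propagate; goes to the hook

obj = Leaky()
del obj  # __del__ runs; on CPython the RuntimeError is routed to the hook

sys.unraisablehook = sys.__unraisablehook__  # restore the default hook
print(seen)  # expect ['RuntimeError'] on CPython
```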
Use pytest --trace to invoke the Python debugger at the start of every test.
Pytest captures log messages of level WARNING or above automatically and displays them in their own section for each failed test.
Use the .rejects modifier to unwrap a rejected promise. For example: await expect(Promise.reject(new Error('octopus'))).rejects.toThrow('octopus').
Use --log-cli-level to specify the logging level for console output. This accepts logging level names or numeric values.
Pass the match parameter with a regex pattern: with pytest.raises(ValueError, match=r".*123.*"):. The match parameter uses re.search() and also matches against PEP-678 notes.
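Since match uses re.search(), the pattern only needs to match somewhere in the message, not from the start. A quick stdlib illustration of the difference (the message string is invented):

```python
import re

message = "invalid literal: 123 is out of range"

# re.search scans the whole string, so no anchors are needed for a substring hit
assert re.search(r"123", message) is not None

# re.match would only succeed if the pattern matched at the start of the string
assert re.match(r"123", message) is None
assert re.match(r"invalid", message) is not None
```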
Use pytest.RaisesGroup to expect BaseExceptionGroup or ExceptionGroup. For example: with pytest.RaisesGroup(ValueError): raise ExceptionGroup("group msg", [ValueError("value msg")]).
Use pytest --pdb to invoke the Python debugger on every failure or KeyboardInterrupt. Use pytest -x --pdb to drop to PDB on first failure then end the test session.
Use pytest --show-capture=no to disable reporting of captured content on failed tests.
The diff parameter accepts 'simple' (default) or 'full' to control the verbosity of diffs in assertion error messages.
Use --log-disable={logger_name} to disable specific loggers. This argument can be passed multiple times, for example: pytest --log-disable=main --log-disable=testing.
Use caplog.set_level(logging.CRITICAL, logger="root.baz") to set the log level for a specific logger instead of the root logger.
Set the log_cli configuration option to true to output logging records as they are emitted directly to the console.
AssertionError instances contain: actual (the actual value), expected (the expected value), operator (the operator used), generatedMessage (boolean indicating if message was auto-generated), code (always 'ERR_ASSERTION'), and the standard Error properties (message, name).
Debugging Methodology > Reading and interpreting error messages
55 questions
The "cause" property was added to JavaScript errors in ES2022 and indicates the reason why the current error was thrown—usually another caught error. When creating a new Error, developers can pass { cause: originalError } as the second argument to the constructor. This enables error chaining and ensures original error information is preserved for debugging.
HTTP 401 Unauthorized means the client must authenticate itself to get the requested response (semantically means "unauthenticated"). HTTP 403 Forbidden means the client does not have access rights to the content; the server is refusing to give the requested resource even though the client's identity is known to the server.
The print and log debugging strategy involves adding print statements or "logs" to the code to display values of variables, call stacks, the flow of execution, and other relevant information. This approach is especially useful for debugging concurrent or distributed systems where order of execution can impact the program's behavior.
A Python RuntimeError is raised when an error is detected that doesn't fall into any of the standard exception categories. It's used for generic runtime errors that don't have a more specific exception type.
HTTP 404 Not Found indicates the server cannot find the requested resource, which may be temporary or permanent. HTTP 410 Gone indicates the resource is permanently removed and clients should delete any links to it. 410 is intended for permanent removal situations, whereas 404 is used when the status of the resource is unknown.
Cause elimination is a hypothesis-driven debugging technique where teams speculate about the causes of the error and test each possibility independently. This approach works best when the team is familiar with the code and the circumstances surrounding the bug.
The request entity has a media type which the server or resource does not support. For example, the client may have sent JSON when the server expects XML.
A Python IndexError is raised when a sequence subscript (index) is out of range. This occurs when trying to access a list, tuple, or string element at an index that doesn't exist (negative indices beyond -1 or positive indices greater than or equal to the sequence length).
A Python ModuleNotFoundError is a subclass of ImportError raised by import when a module could not be located, or when None is found in sys.modules. This is more specific than ImportError and indicates the module itself cannot be found.
The request entity is larger than the server is willing or able to process. The server may close the connection or return a Retry-After header.
HTTP 500 Internal Server Error indicates a generic server-side error. HTTP 502 Bad Gateway indicates that a server acting as a gateway or proxy received an invalid response from an upstream server. HTTP 503 Service Unavailable indicates the server is currently unable to handle the request due to temporary overload or maintenance.
A JavaScript URIError is thrown when URI handling functions encounter malformed URI strings, such as when decodeURI(), decodeURIComponent(), encodeURI(), or encodeURIComponent() are called with invalid input containing malformed percent-encoding sequences, incomplete encoding sequences (like "%2"), or invalid UTF-8 sequences.
Error messages should safeguard against likely mistakes by detecting and warning about common errors before they cause problems, preserve the user's input (let users correct errors by editing their original action instead of starting over), reduce error-correction effort (guess the correct action and let users pick it from a small list of fixes), and concisely educate on how the system works to help users avoid the problem in the future.
The standard properties of a JavaScript Error object include: message (a human-readable description of the error), name (the error type name such as "Error", "TypeError", etc.), stack (a non-standard but widely-supported stack trace showing the call path), and cause (added in ES2022, indicating the original error that caused this error for error chaining).
In Python, all exceptions must be instances of a class that derives from BaseException. The base class for all built-in, non-system-exiting exceptions is Exception. Other base classes include ArithmeticError (for arithmetic errors like OverflowError, ZeroDivisionError), LookupError (for lookup errors like IndexError, KeyError), and OSError (for operating system-related errors). User-defined exceptions should derive from Exception.
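The hierarchy described above can be checked directly with issubclass():

```python
# Built-in exception relationships, verifiable at runtime.
assert issubclass(Exception, BaseException)
assert issubclass(ZeroDivisionError, ArithmeticError)
assert issubclass(OverflowError, ArithmeticError)
assert issubclass(KeyError, LookupError)
assert issubclass(IndexError, LookupError)
assert issubclass(FileNotFoundError, OSError)

# User-defined exceptions should derive from Exception, not BaseException.
class ConfigError(Exception):
    pass

assert issubclass(ConfigError, Exception)
```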
This is used for caching purposes. It tells the client that the response has not been modified, so the client can continue to use the same cached version of the response.
Both indicate permanent redirection, but 301 Moved Permanently allows the user agent to change the HTTP method (e.g., from POST to GET) in the redirected request, while 308 Permanent Redirect requires the user agent to preserve the same HTTP method used in the original request.
A Python NameError is raised when a local or global name is not found. This occurs when trying to access a variable that hasn't been defined in the current scope.
Python exceptions have three context attributes: __context__ (automatically set when a new exception is raised while handling another), __cause__ (explicitly set using "raise new_exc from original_exc"), and __suppress_context__ (automatically set to True when __cause__ is set, determining whether __context__ is displayed).
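A short sketch showing how each attribute gets set (the exception messages are invented for illustration):

```python
# Implicit chaining: raising inside an except block sets __context__.
try:
    try:
        {}["missing"]
    except KeyError:
        raise RuntimeError("lookup failed")
except RuntimeError as err:
    assert type(err.__context__) is KeyError
    assert err.__cause__ is None
    assert err.__suppress_context__ is False

# Explicit chaining: "raise ... from ..." sets __cause__ and suppresses context.
try:
    try:
        int("abc")
    except ValueError as original:
        raise RuntimeError("conversion failed") from original
except RuntimeError as err:
    assert type(err.__cause__) is ValueError
    assert err.__suppress_context__ is True
```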
A Python AssertionError is raised when an assert statement fails. Assert statements are used for debugging purposes to check if a condition is true, and if the condition is false, an AssertionError is raised with an optional error message.
A Python KeyError is raised when a mapping (dictionary) key is not found in the set of existing keys. This occurs when trying to access or delete a dictionary key that doesn't exist using bracket notation (dict['missing_key']).
The user has sent too many requests in a given amount of time ("rate limiting"). The response should include details about the limiting conditions and may indicate when to retry the request.
A Python TypeError is raised when an operation or function is applied to an object of inappropriate type. For example, trying to add a string and an integer, calling a non-callable object, or iterating over a non-iterable object will raise a TypeError.
The server cannot find the requested resource. In a browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client.
The server understands the content type of the request entity and the syntax is correct, but was unable to process the contained instructions. This is commonly used in WebDAV scenarios and REST APIs when semantic errors prevent processing.
HTTP response status codes are grouped into five classes based on the first digit: Informational responses (100-199), Successful responses (200-299), Redirection messages (300-399), Client error responses (400-499), and Server error responses (500-599).
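A sketch of that first-digit grouping (the classify helper is hypothetical, written only for illustration):

```python
def classify(status: int) -> str:
    """Map an HTTP status code to its class by first digit (hypothetical helper)."""
    classes = {
        1: "Informational",
        2: "Successful",
        3: "Redirection",
        4: "Client error",
        5: "Server error",
    }
    return classes.get(status // 100, "Unknown")

assert classify(101) == "Informational"
assert classify(204) == "Successful"
assert classify(308) == "Redirection"
assert classify(404) == "Client error"
assert classify(503) == "Server error"
```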
The debugging process typically involves: (1) Reproduce the conditions to observe the error firsthand, (2) Find the bug by pinpointing its source, (3) Determine the root cause by examining code logic and flow, (4) Fix the bug by revising the code, (5) Test to validate the fix with unit, integration, system, and regression tests, and (6) Document the process including what caused the bug and how it was fixed.
A Python PermissionError is raised when attempting an operation without adequate permissions (file system permissions). This occurs when trying to read, write, or execute files or directories without the proper access rights.
500 Internal Server Error is a generic error message indicating the server encountered an unexpected condition that prevented it from fulfilling the request. It is a catch-all error when no more specific message is suitable.
Backtracking is a debugging approach where developers work backward from the point the error was detected to find the origin of the bug. Developers retrace the steps the program took with the problematic source code to see where things went wrong. This can be effective when used alongside a debugger tool.
A JavaScript RangeError is thrown when a numeric value is outside the valid range for its intended use, such as creating an array with a negative length, calling toFixed() with a precision value outside the allowed range (0-100), or providing invalid values to numeric methods that expect specific ranges.
A Python ValueError is raised when an operation or function receives an argument that has the right type but an inappropriate value. For example, trying to convert a non-numeric string to an integer using int("abc") or finding the square root of a negative number will raise a ValueError.
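The distinction from TypeError (wrong type rather than wrong value) can be seen side by side:

```python
# ValueError: the argument type (str) is fine, but the value is inappropriate.
try:
    int("abc")
    value_error_raised = False
except ValueError:
    value_error_raised = True

# TypeError: the operand types themselves are incompatible.
try:
    "a" + 1
    type_error_raised = False
except TypeError:
    type_error_raised = True

assert value_error_raised and type_error_raised
```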
A stack trace is a report of the active stack frames at a certain point in time during program execution. It shows the sequence of function calls that led to an error, including function names, file names, line numbers, and column numbers. The top line typically shows the error type and message, followed by the call hierarchy from the error location back to the entry point.
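In Python, the same anatomy can be inspected with the traceback module. Note that Python prints frames from the entry point down and puts the error type and message on the last line, the reverse of the JavaScript convention:

```python
import traceback

def inner():
    return 1 / 0  # the failing call

def outer():
    return inner()

try:
    outer()
except ZeroDivisionError:
    trace = traceback.format_exc()

# Each frame line names the file, line number, and function; the final
# line carries the exception type and message.
assert "outer" in trace and "inner" in trace
assert "ZeroDivisionError" in trace
```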
Error chaining in Python allows developers to explicitly chain exceptions using the "from" keyword: "raise new_exc from original_exc". This sets the __cause__ attribute on the new exception and preserves the original exception for debugging. The default traceback display shows both chained exceptions, with __cause__ always shown and __context__ shown only when __cause__ is None.
Error messages should use human-readable language (avoid technical jargon), concisely and precisely describe the issue (avoiding generic messages like "An error occurred"), offer constructive advice or remedies, take a positive tone and not blame the user (avoid words like "invalid," "illegal," or "incorrect"), and avoid humor since it can become stale if users encounter the error frequently.
The server is currently unable to handle the request due to a temporary overload or scheduled maintenance. This response should be used for temporary conditions and the Retry-After header should indicate how long the client should wait before retrying.
There is no content to send for this request, but the headers are useful. The user agent may update its cached headers for this resource with the new ones. This is commonly used for DELETE operations or successful PUT/POST requests that don't return data.
The add_note() method was added in Python 3.11 and allows adding string notes to exceptions that appear in the standard traceback after the exception string. It takes a single string argument and raises TypeError if the note is not a string. Notes are stored in the __notes__ attribute as a list.
A Python RecursionError is raised when the maximum recursion depth is exceeded. This typically occurs with infinite recursion or deeply nested recursive calls beyond Python's recursion limit.
When debugging large codebases, teams divide lines of code into segments—functions, modules, class methods, or other testable logical divisions—and test each one separately to locate the error. When the problem segment is identified, it can be divided further and tested until the source of the bug is identified.
Error messages must present themselves noticeably and recognizably to users by: displaying the error message close to the error's source, using noticeable, redundant, and accessible indicators (bold, high-contrast, red text, icons), designing errors based on their impact (differentiating between warnings and barriers), and avoiding prematurely displaying errors before users complete their input.
A Python AttributeError is raised when an attribute reference or assignment fails, such as when trying to access an attribute that doesn't exist on an object. This occurs when an object does not support attribute references or when attempting to access a non-existent attribute.
A JavaScript ReferenceError is thrown when attempting to access an undeclared variable, accessing a property of null or undefined, using a variable in strict mode that hasn't been declared, or accessing global properties that don't exist. This occurs during runtime when the JavaScript engine cannot resolve a reference to a variable or property.
A JavaScript TypeError is thrown when a value is not of the expected type, such as calling a non-callable value as a function, accessing properties on null or undefined, passing wrong argument types to functions, or performing operations on incompatible data types. For example, attempting to call a number as a function (5()) will throw a TypeError.
HTTP 301 Moved Permanently indicates the URL of the requested resource has been changed permanently and future requests should use the new URL. HTTP 302 Found indicates the URI has been changed temporarily and future requests should continue to use the original URI. With both, the user agent is permitted to change the HTTP method (e.g., from POST to GET) when redirecting; use 308 or 307 instead when the method must be preserved.
A Python FileNotFoundError is raised when a file or directory is requested but doesn't exist. This occurs when trying to open a file that doesn't exist or when performing file operations on non-existent paths.
The request succeeded and a new resource was created as a result. This is typically the response sent after POST requests, or some PUT requests. The response should include a Location header with the URI of the newly created resource.
The server, while acting as a gateway or proxy, did not receive a timely response from an upstream server it needed to access to complete the request. This differs from 503 Service Unavailable in that 504 indicates a timeout with an upstream server, not server overload.
The server cannot or will not process the request due to something that is perceived to be a client error, such as malformed request syntax, invalid request message framing, or deceptive request routing.
Rubber duck debugging is an approach where developers "explain or talk out" the code, line by line, to any inanimate object. The idea is that by trying to explain the code out loud, developers can better understand its logic (or lack thereof) and spot bugs more easily.
The four main types of coding errors are: Syntax errors (missing elements like parentheses, commas, or other typographical errors), Semantic errors (statements are syntactically valid but violate the language's meaning, so they won't produce meaningful output), Logical errors (syntax is correct but instructions cause undesired output), and Runtime errors (errors that happen when an application is running or starting up).
A JavaScript SyntaxError is thrown when the JavaScript engine encounters code that violates the language's grammar rules during parsing, before code execution begins. Common causes include missing parentheses, brackets, or braces; invalid variable declarations; incorrect use of operators; malformed string literals; invalid escape sequences; and duplicate parameter names in functions.
A Python MemoryError is raised when an operation runs out of memory but the situation may still be rescued by deleting some objects. The associated value is a string indicating what kind of internal operation ran out of memory.
A Python ImportError is raised when the import statement has trouble trying to load a module, or when the "from list" in a "from ... import" statement has a name that cannot be found. This can occur due to missing modules, circular imports, or incorrect module paths.
Both indicate temporary redirection, but 302 Found allows the user agent to change the HTTP method (e.g., from POST to GET) in the redirected request, while 307 Temporary Redirect requires the user agent to preserve the same HTTP method used in the original request.
Code Search Techniques > Finding all usages of a function or variable
53 questions
Wildcard characters cannot be used: . , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ] @. The search will simply ignore these symbols.
Use the "-t" flag followed by the file type. For example, "rg -tpy foo" limits the search to Python files.
Use the "-n" or "--line-number" flag to prefix the line number to matching lines.
The "language:" qualifier searches for code based on what language it's written in. For example, "language:xml element size:100" matches code with the word "element" marked as XML with exactly 100 bytes.
Use the "-C NUM" or "--context NUM" flag to show NUM lines of context before and after each matching line.
Up to 4,000 private repositories are searchable, which will be the most recently updated of the first 10,000 private repositories that you have access to.
The "extension:" qualifier matches code files with a certain file extension. For example, "extension:pm form path:cgi-bin" matches code with the word "form" under cgi-bin with the .pm file extension.
Use the "--recurse-submodules" flag to recursively search in each active and checked-out submodule.
Use the "-P/--pcre2" flag to use PCRE2 always, or "--auto-hybrid-regex" to use PCRE2 only if needed. An alternative syntax is "--engine (default|pcre2|auto)".
Use the "-T" flag followed by the file type to exclude. For example, "rg -Tjs foo" excludes JavaScript files from the search.
Use the "user:USERNAME" qualifier. For example, "user:defunkt extension:rb" matches code from @defunkt that ends in .rb.
By default, ripgrep respects gitignore rules and automatically skips hidden files/directories and binary files.
The "filename:" qualifier matches code files with a certain filename. For example, "filename:.vimrc commands" matches .vimrc files with the word "commands."
Use the "--column" flag to prefix the 1-indexed byte-offset of the first match from the start of the matching line.
Ctrl+Alt+Shift+F7 opens the Find Usages Settings dialog in IntelliJ IDEA, allowing you to configure search scope and other options.
Use the "-w" or "--word-regexp" flag to match the pattern only at word boundaries (beginning of line, preceded by non-word character, end of line, or followed by non-word character).
Use the "repo:USERNAME/REPOSITORY" qualifier. For example, "repo:mozilla/shumway extension:as" matches code from @mozilla's shumway project that ends in .as.
Use the "-i" or "--ignore-case" flag to ignore case differences between the patterns and the files.
Use the "-W" or "--function-context" flag to show the surrounding text from the previous line containing a function name up to the line before the next function name, effectively showing the whole function containing the match.
You can access Find All References by pressing Shift+Alt+F12 or by right-clicking on a symbol and selecting "Find All References" from the context menu. This feature shows all references to the symbol across your project.
The scope is determined by the currently selected element and can be configured in the Find Usages Settings dialog. You can select a pre-defined scope or define a custom scope.
The "in:file" qualifier restricts search to the contents of the source code file only. For example, "in:file octocat" matches code where "octocat" appears in the file contents.
Use "path:/" to search for files located at the root level of a repository. For example, "path:/ octocat filename:readme" matches readme files with the word "octocat" located at the root level.
Alt+F7 (Windows/Linux) or Option+F7 (macOS) opens the Find Usages feature in IntelliJ IDEA.
Use the "-l" or "--files-with-matches" flag to show only the names of files that contain matches instead of every matched line.
When selected, each search's results are shown on a separate tab of the Find tool window. If not selected, search results are shown on the current tab.
Use the "-L" or "--files-without-match" flag to show only the names of files that do NOT contain matches.
Use the "-c" or "--count" flag to show the number of lines that match instead of showing every matched line.
The "in:path" qualifier restricts search to the file path only. For example, "in:path octocat" matches code where "octocat" appears in the file path.
Only repositories with fewer than 500,000 files are searchable on GitHub.
Use the "--max-depth NUM" flag to limit the depth of directory traversal to NUM levels. For example, "--max-depth 1" searches only the given directory without descending into its subdirectories.
Use "rg -uuu" to disable all automatic filtering (gitignore, hidden files, binary files).
Shift+Alt+F12 is the keyboard shortcut to find all references of a symbol in VS Code.
Git grep looks for specified patterns in the tracked files in the work tree, blobs registered in the index file, or blobs in given tree objects.
Use the "-F" or "--fixed-strings" flag to treat the pattern as a fixed string rather than interpreting it as a regular expression.
Only files smaller than 384 KB are searchable in GitHub code search.
Yes, ripgrep supports searching files compressed in brotli, bzip2, gzip, lz4, lzma, xz, or zstandard formats with the "-z/--search-zip" flag.
The "size:" qualifier searches for source code based on file size using greater than, less than, and range qualifiers. For example, "size:>10000 function" matches code with "function" in files larger than 10 KB.
The "--cached" flag searches blobs registered in the index file instead of searching tracked files in the working tree.
Find Usages in IntelliJ IDEA locates all occurrences of a symbol including fields, variables, parameters, classes, tags, attributes, and references in HTML, XML, and CSS files. The results are displayed in the Find tool window.
Use the "path:" qualifier followed by the directory path. For example, "path:app/public console" searches for "console" in JavaScript files within the app/public directory or any of its subdirectories.
Use the "org:ORGNAME" qualifier. For example, "org:github extension:js" matches code from GitHub that ends in .js.
Yes, unlike GNU grep, ripgrep stays fast while supporting Unicode, which is always on by default.
Provide the tree object as an argument after your pattern. For example: "git grep foo v1.0" searches for "foo" in the tree of the v1.0 tag instead of the working tree.
Use the "--no-index" flag to search files in the current directory that are not managed by Git, or to ignore that the current directory is managed by Git. This is similar to running "grep -r" but with additional benefits like using pathspec patterns.
Use the "--untracked" flag to search also in untracked files in addition to tracked files in the working tree.
PCRE2 support enables look-around and backreferences, which are not supported in ripgrep's default regex engine.
Ripgrep supports searching files in UTF-8, UTF-16, latin-1, GBK, EUC-JP, Shift_JIS, and more. UTF-16 has some automatic detection support, while other encodings must be specified with the "-E/--encoding" flag.
When selected, this checkbox navigates directly to the found usage without displaying the Find tool window when only one usage is found.
No, filename searches are the exception to the rule. You must include at least one search term for all other code searches, but filename searches can be used alone.
Test Failure Analysis > Understanding test output and stack traces
46 questions
You can control the number of stack frames by setting the Error.stackTraceLimit variable. Setting it to 0 disables stack trace collection, any finite integer sets the maximum number of frames to collect, and setting it to Infinity means all frames get collected. This variable only affects the current context and must be set explicitly for each context that needs a different value.
Test functions must have names that begin with Test and take a pointer to the testing.T type as their only parameter.
If the exception type is SyntaxError and the value has the appropriate format, it prints the line where the syntax error occurred with a caret (^) indicating the approximate position of the error.
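The same caret formatting can be reproduced with the traceback module (a small sketch using a deliberately broken snippet):

```python
import traceback

try:
    compile("1 +", "<example>", "exec")  # incomplete expression: SyntaxError
except SyntaxError as err:
    lines = traceback.format_exception_only(type(err), err)

formatted = "".join(lines)
# The formatted output includes the offending source line and a caret (^)
# marking the approximate position of the error, plus the SyntaxError message.
assert "^" in formatted
assert "SyntaxError" in formatted
```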
The internal threshold is 240 characters. Strings longer than this are truncated with "..." in the output when displaying assertion failures.
Use pytest --showlocals or the shorthand pytest -l to show local variables in tracebacks. Use --no-showlocals to hide them if addopts enables them.
Python 3.14 added colorized output by default, which can be controlled using environment variables.
You use the t.Errorf() method to print a failure message to the console when a test condition is not met.
The -v flag enables verbose output that lists all of the tests and their results, showing each test with === RUN and --- PASS or --- FAIL lines.
The properties include: actual, expected, operator, generatedMessage, code (always 'ERR_ASSERTION'), message, and name (always 'AssertionError'). These are set on instances of AssertionError when assertions fail.
RSpec is composed of rspec-core (the spec runner), rspec-expectations (provides readable API for expected outcomes), and rspec-mocks (test double framework).
A negative limit value corresponds to a positive value of sys.tracebacklimit, whereas the behavior of a positive limit value cannot be achieved with sys.tracebacklimit. This was added in Python 3.5.
Use the -f or --failfast option to stop the test run on the first error or failure.
The --full-trace option causes very long traces to be printed on error (longer than --tb=long). It also ensures that a stack trace is printed on KeyboardInterrupt (Ctrl+C), which is useful for finding where tests are hanging when interrupted.
The accepted values are 'simple' (default) and 'full'. When set to 'full', it shows the full diff in assertion errors.
If the file parameter is omitted or None, the output goes to sys.stderr.
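A minimal sketch: passing an explicit file object to traceback.print_exc captures the output that would otherwise go to sys.stderr.

```python
import io
import traceback

buf = io.StringIO()
try:
    1 / 0
except ZeroDivisionError:
    # With file omitted (or None) this would write to sys.stderr;
    # an explicit file-like object captures the output instead.
    traceback.print_exc(file=buf)

output = buf.getvalue()
assert output.startswith("Traceback (most recent call last):")
assert "ZeroDivisionError" in output
```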
The available options are: --tb=auto (default, 'long' for first and last entry, 'short' for others), --tb=long (exhaustive, informative), --tb=short (shorter format), --tb=line (only one line per failure), --tb=native (Python standard library format), and --tb=no (no traceback at all).
Use the --stack-trace-limit <value> flag. To pass this flag to V8 when running Google Chrome, use --js-flags='--stack-trace-limit <value>'.
The default discovery parameters are: start directory is . (current directory), pattern is test*.py (matches files starting with "test" and ending with ".py"), and top-level directory defaults to the start directory.
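A small sketch of those defaults in action (the temporary file name test_sample.py is illustrative): a file matching the default test*.py pattern is picked up by discovery without any extra configuration.

```python
import os
import tempfile
import textwrap
import unittest

# Create a throwaway directory with one file matching the default
# "test*.py" pattern, then discover it with all-default parameters.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "test_sample.py"), "w") as f:
        f.write(textwrap.dedent("""
            import unittest

            class T(unittest.TestCase):
                def test_ok(self):
                    self.assertTrue(True)
        """))
    suite = unittest.defaultTestLoader.discover(start_dir=d)
    result = unittest.TextTestRunner(verbosity=0).run(suite)

assert result.testsRun == 1
assert result.wasSuccessful()
```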
The -c or --catch option causes Control-C during the test run to wait for the current test to end and then reports all the results so far. A second Control-C raises the normal KeyboardInterrupt exception. This was added in Python 3.2.
Python 3.13 added colorized output by default, which can be controlled using environment variables.
Strict assertion mode was exposed as require('node:assert/strict') in Node.js v15.0.0.
By default, V8 captures the topmost 10 stack frames when an error is created. This is considered enough to be useful without having a noticeable negative performance impact.
The available flags are: --quiet or -q (less verbose mode), -v (increase verbosity, display individual test names), -vv (more verbose, display more details from test output), and -vvv (not a standard flag but may be used for even more detail in certain setups).
The methods available are: getClassName() (fully qualified class name), getFileName() (source file name), getLineNumber() (line number), getMethodName() (method name), isNativeMethod() (returns true if native method), toString() (string representation), equals(), and hashCode().
The constructors are: Throwable() (null detail message), Throwable(String message) (specified detail message), Throwable(String message, Throwable cause) (specified detail message and cause), and Throwable(Throwable cause) (specified cause with detail message from cause.toString()).
The --durations N option shows the N slowest test cases (N=0 shows all). This was added in Python 3.12.
The getStackTrace() method returns an array of StackTraceElement objects providing programmatic access to the stack trace information printed by printStackTrace().
For instance initializers, it returns <init>, and for class initializers, it returns <clinit>, as per Section 3.9 of the Java Virtual Machine Specification.
The default value for bail is 0, which means Jest runs all tests and reports all errors to the console upon completion. Setting bail to true is the same as setting bail to 1.
Test files must end with _test.go to be recognized by the go test command.
The -k option only runs test methods and classes that match the pattern or substring. Patterns containing wildcards (*) are matched using fnmatch.fnmatchcase(); otherwise simple case-sensitive substring matching is used. This option may be used multiple times. This was added in Python 3.7.
The -b or --buffer option buffers the standard output and standard error streams during the test run. Output during a passing test is discarded. Output is echoed normally on test fail or error and is added to the failure messages. This was added in Python 3.2.
The etype parameter was renamed to exc and became positional-only in Python 3.10.
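A sketch of both calling conventions, guarded by a version check: on Python 3.10+ a single exception object suffices, and it produces the same output as the older (type, value, traceback) triple.

```python
import sys
import traceback

try:
    int("not a number")
except ValueError as err:
    if sys.version_info >= (3, 10):
        # New-style call: a single exception object, positional-only.
        new_style = traceback.format_exception(err)
    else:
        new_style = traceback.format_exception(type(err), err, err.__traceback__)
    # Old-style triple still works on all versions.
    old_style = traceback.format_exception(type(err), err, err.__traceback__)

assert new_style == old_style
assert "ValueError" in "".join(new_style)
```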
A negative line number indicates that the line number information is unavailable. A specific value of -2 indicates that the method containing the execution point is a native method.
The format is typically: "ClassName.methodName(FileName.java:lineNumber)" - where ClassName is the fully-qualified name, methodName is the method name, FileName.java is the source file, and lineNumber is the line number. Variations exist when line number or file name are unavailable, such as "ClassName.methodName(Unknown Source)" or "ClassName.methodName(Native Method)".
The strict option defaults to true, which means non-strict methods behave like their corresponding strict methods (e.g., assert.deepEqual() behaves like assert.deepStrictEqual()).
It prints the header "Traceback (most recent call last):" before printing the exception type and value after the stack trace.
When set to true, it skips prototype and constructor comparison in deep equality checks. The default value is false.
The NO_COLOR or NODE_DISABLE_COLORS environment variables can be used to deactivate colors in error diffs and other assertion output.
Use the --locals option to show local variables in tracebacks. This was added in Python 3.5.
The returned list from format_exception_only() started including exception notes in Python 3.11.
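A version-guarded sketch (the note text is illustrative): on 3.11+, a note attached with add_note() shows up in the list returned by format_exception_only().

```python
import sys
import traceback

err = ValueError("bad input")
if sys.version_info >= (3, 11):
    # add_note() exists only on Python 3.11+.
    err.add_note("hint: check the config file")

lines = traceback.format_exception_only(type(err), err)

if sys.version_info >= (3, 11):
    # The note is rendered on its own line after the exception message.
    assert any("hint: check the config file" in line for line in lines)
assert any("ValueError: bad input" in line for line in lines)
```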
The main assertion methods include: assertEqual(), assertTrue(), assertFalse(), assertRaises(), along with many others for different comparison needs.
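A minimal sketch exercising those four methods in a self-running TestCase (the class and test names are illustrative):

```python
import unittest

class TestBasics(unittest.TestCase):
    def test_assertions(self):
        self.assertEqual(2 + 2, 4)
        self.assertTrue("abc".startswith("a"))
        self.assertFalse("abc".endswith("a"))
        # assertRaises is usable as a context manager.
        with self.assertRaises(ZeroDivisionError):
            1 / 0

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBasics)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```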
Fix Verification > Regression testing basics
45 questions
"Suite hygiene" sessions involve scheduling regular maintenance to eliminate obsolete tests, update existing ones, and add new tests to align with newly added functions. Version control integrations are essential to maintaining traceability and clean test suites.
The four main regression testing techniques are: 1) Retest all, 2) Regression test selection, 3) Test case prioritization, and 4) Hybrid (a combination of selection and prioritization).
Test cases should be selected using a combination of techniques: retest all (run every test, time-consuming but highest assurance), regression test selection (run only relevant test cases to save time while maintaining targeted coverage), test case prioritization (prioritize by risk level, business value, and historical failure patterns), and automation (automate stable, repeatable tests).
The primary purpose of regression testing is to catch bugs that may have been accidentally introduced into a new build or release candidate and to ensure that previously eradicated bugs continue to stay dead. It verifies that code modifications haven't broken existing functionality or introduced new bugs.
Best practices include prioritizing test cases by risk and business value, automating repetitive and stable test cases, regularly reviewing and maintaining the regression suite, integrating regression tests with CI/CD pipelines, and using a hybrid approach of automated and manual testing.
Tests that should not be automated include one-off or volatile tests. Only stable, repeatable test cases should be automated as they are needed for frequent builds and CI runs.
The "retest all" technique checks all test cases on the current program to verify its integrity. Though expensive as it needs to re-run all cases, it ensures there are no errors because of the modified code. This approach is exhaustive and offers maximum test coverage but is the most time and resource-intensive technique.
The hybrid technique combines regression test selection and test case prioritization to optimize the testing process.
Progressive regression testing is a mixed approach that evaluates both new features and existing features to detect bugs introduced through new functionality. It's typically used when releasing a new update to an existing software product.
Regression testing is an integral part of the extreme programming software development method. In this method, design documents are replaced by extensive, repeatable, and automated testing of the entire software package throughout each stage of the software development process.
The two types of test case prioritization are: 1) General prioritization - prioritizing test cases that will be beneficial on subsequent versions, and 2) Version-specific prioritization - prioritizing test cases with respect to a particular version of the software.
Changes that require regression testing include bug fixes, software enhancements, configuration changes, and even substitution of electronic components (hardware).
Performing regression testing can be tricky with black box components from a third party because any change in the third-party component may interfere with the rest of the system, yet performing regression testing on a third-party component is difficult because it is an unknown entity.
Corrective regression testing ensures data consistency by re-running test cases to verify whether similar test results occur. It is often conducted when no changes have been made to the codebase, such as when code is refactored to ensure refactoring isn't introducing code errors.
Regression test selection involves running only a part of the test suite if the cost of selecting the subset of tests is less than the retest all technique. It runs tests covering areas most likely to be affected by code changes and requires testers to prune obsolete test cases and narrow down relevant ones to be reused.
Partial regression testing is used when the goal is to find out whether recent changes have impacted only a subset of the updated system. It detects that subset and performs suitable diagnostics, such as when integrating a new payment gateway and evaluating only that portion of functionality.
Flaky tests should be handled by implementing tools for flakiness detection and quarantining, and by prioritizing automation of high-impact, stable test cases. Environments should also be standardized via cloud labs or containers.
Test case prioritization involves scheduling test cases so that higher priority tests are executed before lower priority ones. Tests are prioritized by business value, significance, and historical failure rate. Critical functions, features that often fail, and high-risk modules are tested first and most frequently.
Developer testing compels a developer to focus on unit testing and to include both positive and negative test cases. Unlike traditional tests that verify only intended outcomes, developer testing helps catch regressions earlier by having developers write comprehensive test cases as part of the development cycle.
The "minefield problem" refers to when automated regression testing becomes too static and rote. Developers may learn how to pass a fixed library of tests, causing standard regression tests to stop testing effectively. This results in clearing a single safe path while ignoring potential bugs in other areas of the application.
Non-functional regression tests assess aspects such as performance, security, or reliability. They present additional challenges because detecting changes in performance is often statistically complex, while security issues typically arise from vulnerabilities in the software ecosystem rather than from individual code changes.
Studies suggest regression tests can account for up to 80% of the total testing cost in software development.
Theoretically, after each fix, one must run the entire batch of test cases previously run against the system to ensure that it has not been damaged in an obscure way. However, this ideal is very costly in practice.
Functional tests, unit tests, integration tests, and build verification tests can all be incorporated into a regression testing suite. Any test that has successfully verified that various components work as intended can be included.
Regression testing can be used not only for testing the correctness of a program but also for tracking the quality of its output. For instance, in compiler design, regression testing could track code size and compilation/execution time of test suite cases.
Regression tests can be done at any level, from unit through to system integration. Functional tests exercise the complete program with various inputs, while non-functional tests assess aspects such as performance, security, or reliability.
Time and resource constraints can be addressed by breaking down regression suites into smaller suites, running critical tests first in CI pipelines, and utilizing cloud-parallel execution to accelerate test cycles.
Regression testing is done after functional testing has concluded, to verify that the other functionalities are working. Traditionally, it has been performed by a software quality assurance team after the development team has completed work.
Selective regression testing introduces a predictive element where test cases from the test suite are selected based on the testers' belief that those areas are going to receive impacts from code changes. For example, developers updating a mobile application's user interface might use selective regression testing to ensure ongoing stability.
Regression testing uses a more precise scope to focus on changes recently made, while QA (Quality Assurance) evaluates the entire system and its workings. Both share similar missions to optimize user experience and deliver high-quality software, but they look at different scopes of the system.
Change impact analysis is performed to determine an appropriate subset of tests (non-regression analysis). It involves understanding the scope of changes by reviewing commit logs, feature tickets, bug fixes, and pull requests, and tracing dependencies to find interconnected components likely to be impacted by changes.
The three execution modes are: 1) Manual testing (ideal for exploratory tests or UI-focused tests needing human judgment), 2) Automated testing (ideal for sanity checks, API validations, and high-frequency test runs within CI/CD workflows), and 3) Hybrid (combining automated high-volume regression runs with manual testing of edge cases).
Retest-all regression testing is considered post-final testing and involves running tests on all regression test cases that have already been cleared to ensure everything works together harmoniously. It's often used for checking changes accompanying major architectural shifts.
Defects found during traditional regression testing are most costly to fix because they are discovered later in the development cycle, after the development team has completed work. This problem is being addressed by the rise of unit testing and developer testing.
In agile software development—where software development life cycles are very short, resources are scarce, and changes are very frequent—regression testing might introduce a lot of unnecessary overhead compared to traditional environments.
Test automation is frequently involved because regression test suites tend to grow with each found defect, making manual repetition too time-consuming and complicated. Many projects have automated Continuous Integration systems to re-run all regression tests at specified intervals and report any failures.
The eight steps are: 1) Introduce a code change, 2) Consider possible impacts, 3) Choose test cases, 4) Prioritize test cases, 5) Run test cases, 6) Report and analyze test results, 7) Make fixes and retest cases, and 8) Repeat the entire process as needed.
Change impact analysis mapping involves understanding the scope of changes by reviewing commit logs, feature tickets, bug fixes, and pull requests to verify which functions and modules have changed, then studying architecture diagrams, code ownership, and dynamic analysis to trace dependencies.
Regression testing is re-running functional and non-functional tests to ensure that previously developed and tested software still performs as expected after a change. This includes checking whether bug fixes, software enhancements, configuration changes, or even hardware substitutions have introduced new faults or caused previously fixed bugs to re-emerge.
Complete regression testing involves retesting the whole system or application and is used when more comprehensive testing is needed, such as following major code changes. It ensures ongoing functionality after significant updates like adding a product gallery to a website.
Unit regression testing concentrates on the components or modules (units) that make up a system and checks whether errors have been introduced into individual units. For example, when adding a "Forgot password" feature to a website, unit regression testing would verify that the original login mechanism continues to work as intended.
Defect reporting should include steps to reproduce each bug, severity levels, screenshots/logs, and links to impacted test cases. Issues should then be escalated and shared with relevant developers and product managers to initiate triage.
Regression testing should be performed after bug fixes, before all major releases, after adding new features, after changes to performance metrics or test environments (such as OS and database upgrades), and after each commit on a CI/CD pipeline.
Common challenges include maintaining the test suite (accumulation of outdated or redundant tests), time and resource constraints (complete regression suites are slow and effort-intensive), selecting the right tests (balancing thoroughness with practicality), complexity in large applications (cross-module issues), and integrating with automation effectively (brittle, flaky test scripts).
Common strategies include running the system after every successful compile (for small projects), every night, or once a week. These strategies can be automated by an external tool.
Test Failure Analysis > Analyzing why tests fail after a fix
44 questions
Tests must be defined synchronously for Jest to collect them. You cannot use setTimeout or asynchronous operations to define tests with it() because Jest will not find them during collection.
All test files must be modules or packages importable from the top-level directory of the project, meaning their filenames must be valid Python identifiers.
Run node --inspect-brk node_modules/.bin/jest --runInBand on Unix or node --inspect-brk ./node_modules/jest/bin/jest.js --runInBand on Windows, then connect via Chrome DevTools at chrome://inspect or use VS Code's debugger.
The -k flag only runs test methods and classes that match the pattern or substring, supporting wildcards with * for fnmatch-style matching or simple case-sensitive substring matching.
The faulthandler standard module is automatically enabled for pytest runs (unless disabled with -p no:faulthandler), and the faulthandler_timeout=X configuration option dumps tracebacks of all threads if a test takes longer than X seconds.
Call jest.setTimeout(10000) to set a 10 second timeout (or any value in milliseconds).
Use pytest -x --pdb which drops to PDB on first failure and then ends the test session, or use pytest --pdb --maxfail=3 to drop to PDB for the first three failures.
Python 3.14 added colorized output by default, which can be controlled using environment variables.
The -c or --catch option causes Control-C during the test run to wait for the current test to end and then report all results, with a second Control-C raising the normal KeyboardInterrupt exception.
The default timeout is controlled by jasmine.DEFAULT_TIMEOUT_INTERVAL, and if a promise doesn't resolve within this timeout, Jest throws an error: "Timeout - Async callback was not invoked within timeout specified by jasmine.DEFAULT_TIMEOUT_INTERVAL."
The default compression level is 6 on a scale of 0-9, where 0 is fastest but no compression and 9 is maximum compression but slowest.
Exception information is stored in sys.last_value, sys.last_type, and sys.last_traceback for postmortem debugging access.
Use --runInBand to run tests sequentially in the same process, or set --maxWorkers=4 to cap the worker pool at 4 workers, which can improve speed by up to 50%.
This is most commonly caused by conflicting Promise implementations, and can be fixed by replacing the global promise implementation with globalThis.Promise = jest.requireActual('promise'); and/or consolidating to a single Promise library.
The -b or --buffer flag buffers the standard output and standard error streams during the test run, discarding output during passing tests and echoing it on test failures or errors.
Use the --no-cache flag to bypass Jest's cache of transformed module files.
The minimum retention is 1 day, the maximum is 90 days unless changed from repository settings. A value of 0 means using the default repository settings.
The action will fail with an error unless you set overwrite: true to delete the existing artifact before uploading a new one.
The -f or --failfast flag stops the test run on the first error or failure.
Use the -x flag to stop pytest after the first failure, or use --maxfail=N to stop after N failures.
Pytest detects these conditions and issues warnings visible in the test run summary, using categories pytest.PytestUnraisableExceptionWarning and pytest.PytestUnhandledThreadExceptionWarning. These can be disabled with -p no:unraisableexception and -p no:threadexception respectively.
Record-playback tools are almost always a bad idea because they resist changeability and obstruct useful abstractions, making tests brittle when the application changes.
Strict assertion mode (using import { strict as assert } from 'node:assert' or import assert from 'node:assert/strict') shows diffs with + for actual and - for expected values in error messages.
Allure Report provides test stability analysis through history tracking, retry information, and visual analytics that help identify flaky tests and patterns in test failures.
The code property is always set to 'ERR_ASSERTION' for AssertionError instances.
The accepted values are 'simple' (default) and 'full', which controls the verbosity of diffs in assertion error messages.
Python 3.11 deprecated the behavior of returning a value from a test method (other than the default None value).
Allure Report supports sorting and filtering by test statuses including failed, broken, passed, and skipped, with categories for different types of defects.
Use the command pytest --trace to invoke the Python debugger at the start of every test.
Allure Report is an open-source framework-agnostic test result visualization tool that transforms test execution data into clear, interactive HTML reports, working with 30+ testing frameworks across JavaScript, Python, Java, C#, PHP, and Ruby.
Before fixing a bug exposed by a high-level test, you should replicate the bug with a unit test to ensure the bug stays dead.
Buildbot, Jenkins, GitHub Actions, or AppVeyor are recommended for production environments.
JUnit requires Java 17 (or higher) at runtime, though it can still test code compiled with previous JDK versions.
The --durations N flag shows the N slowest test cases (use N=0 to show all).
Run Jest with --no-watchman or set the watchman configuration option to false.
Mike Cohn described it in his 2009 book "Succeeding with Agile" as the "Test Automation Pyramid," though he originally drew it in conversation with Lisa Crispin in 2003-2004.
Set the NO_COLOR or NODE_DISABLE_COLORS environment variables to deactivate colors in assertion error messages.
This happens if you are using the babel-plugin-istanbul plugin, which instruments every file processed by Babel with coverage collection code, bypassing Jest's ignore patterns.
The --runInBand flag makes Jest run tests in the same process rather than spawning processes for individual tests, which is necessary for debugging.
The Test Pyramid states that you should have many more low-level unit tests than high-level broad-stack tests running through a GUI, as UI tests are brittle, expensive to write, and time-consuming to run.
Common Bug Patterns > Null and undefined handling
37 questions
The || operator removes both null and undefined from the type of the left operand in the resulting union type. For example, with x: Entity | null, the expression x || { name: "test" } results in type Entity.
In strict null checking mode, the compiler requires every reference to a local variable of a type that doesn't include undefined to be preceded by an assignment to that variable in every possible preceding code path.
Expression operators permit operand types to include null and/or undefined but always produce values of non-null and non-undefined types. For example, function sum(a: number | null, b: number | null) { return a + b; } produces a value of type number.
The no-unsafe-optional-chaining rule disallows use of optional chaining in contexts where the undefined value is not allowed. It detects cases where optional chaining expressions are used in positions where short-circuiting to undefined causes a TypeError, such as (obj?.foo)() or (obj?.foo).bar.
The typeof operator returns "object" for null. This is a known bug in JavaScript since the first implementation, where null was represented as the NULL pointer (0x00) with type tag 0, which was the same type tag as objects.
The nullish coalescing operator has the fifth-lowest operator precedence, directly lower than || and directly higher than the conditional (ternary) operator. It is not possible to combine && or || directly with ?? without parentheses, or a SyntaxError will be thrown.
The optional chaining operator accesses an object's property or calls a function. If the object accessed or function called using this operator is undefined or null, the expression short circuits and evaluates to undefined instead of throwing an error.
In strict null checking mode, null and undefined types are not widened to any. For example, let z = null; infers type as null, whereas in regular type checking mode the inferred type would be any.
The void operator evaluates the given expression and then returns undefined. It is commonly used to obtain the undefined primitive value, usually using void 0 or void(0).
When using optional chaining with function calls (func?.(args)), the expression automatically returns undefined instead of throwing an exception if the method isn't found. However, if the property exists but is not a function, a TypeError will still be raised.
A variable that has not been assigned a value is of type undefined.
The logical OR (||) returns the right-hand side if the left operand is any falsy value (including 0, '', NaN, false), while the nullish coalescing operator (??) only returns the right-hand side when the left operand is specifically null or undefined. For example, 0 || 42 returns 42, but 0 ?? 42 returns 0.
Type guards support non-null and non-undefined checks using ==, !=, ===, or !== operators to compare to null or undefined. The effects on subject variable types accurately reflect JavaScript semantics (double-equals checks for both values, triple-equals only checks for the specified value).
All current browsers expose document.all with typeof returning "undefined", even though document.all is not actually undefined. This is classified in web standards as a "willful violation" of the ECMAScript standard for web compatibility.
The strictNullChecks compiler option switches TypeScript to a strict null checking mode where null and undefined values are not in the domain of every type and are only assignable to themselves and any (with undefined also assignable to void). This enables detection of erroneous use of null/undefined values.
The non-null assertion operator is a new ! postfix expression operator that asserts its operand is non-null and non-undefined. The operation x! produces a value of the type of x with null and undefined excluded. It is removed in the emitted JavaScript code.
The optional chaining operator supports three syntax forms: obj?.prop for property access, obj?.[expr] for expression-based property access, and func?.(args) for optional function calls.
Optional chaining cannot be used as a template literal tag. Both String?.raw`Hello` and String.raw?.`Hello` result in a SyntaxError: "Invalid tagged template on optional chain".
The get() method returns the value if present in the Optional, otherwise throws NoSuchElementException. This method is discouraged in favor of orElse(), orElseGet(), or orElseThrow() for safer handling of absent values.
Type guards support checking "dotted names" consisting of a variable or parameter name followed by one or more property accesses (e.g., options.location.x). A type guard for a dotted name has no effect following an assignment to any part of the dotted name.
Optional parameters and properties automatically have undefined added to their types, even when their type annotations don't specifically include undefined. For example, type T1 = (x?: number) => string has x with type number | undefined.
Optional.of(T value) returns an Optional with the specified present non-null value and throws NullPointerException if value is null. Optional.ofNullable(T value) returns an Optional with a present value if the specified value is non-null, otherwise returns an empty Optional.
When using loose equality (==), x == undefined also checks whether x is null, because null is loosely equal to undefined. Strict equality (===) does not have this behavior and only matches undefined.
None is an object frequently used to represent the absence of a value, as when default arguments are not passed to a function. None is the sole instance of the NoneType type, and assignments to None are illegal and raise a SyntaxError.
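A small sketch of the idiomatic pattern (the function greet is illustrative): a None default distinguishes "argument omitted" from merely falsy values, checked with an identity comparison.

```python
# None is the sole instance of NoneType; identity comparison (is None)
# is the idiomatic way to test for it.
def greet(name=None):
    # `name is None` distinguishes "argument omitted" from falsy values
    # such as "" or 0, which an `if not name:` check would conflate.
    return "hello" if name is None else f"hello, {name}"

assert type(None).__name__ == "NoneType"
assert greet() == "hello"
assert greet("") == "hello, "
assert greet("ada") == "hello, ada"
```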
The nullish coalescing operator is a logical operator that returns its right-hand side operand when its left-hand side operand is null or undefined, and otherwise returns its left-hand side operand.
No, short-circuiting only happens along one continuous "chain" of property accesses. If you group one part of the chain with parentheses, subsequent property accesses will still be evaluated. For example, (potentiallyNullObj?.a).b will throw a TypeError if potentiallyNullObj is null.
When using optional chaining, if the left operand is null or undefined, the expression will not be evaluated. For example, const potentiallyNullObj = null; const prop = potentiallyNullObj?.[x++]; will not increment x because the expression short-circuits.
Java's Optional provides orElse(T other) to return a default value, orElseGet(Supplier<? extends T> other) to lazily invoke a supplier, and orElseThrow(Supplier<? extends X> exceptionSupplier) to throw a custom exception if no value is present.
Expressions that can be converted to false (falsy values) include: false, null, NaN, 0, empty string (""), and undefined.
The disallowArithmeticOperators option (default: false) disallows arithmetic operations on optional chaining expressions, which may result in NaN. When enabled, it warns about unary operators (-, +), arithmetic operators (+, -, /, *, %, **), and assignment operators (+=, -=, etc.) on optional chaining results.
The rule flags optional chaining expressions when used as: function calls (obj?.foo()), property access on the result ((obj?.foo).bar), spread operators [...obj?.foo], in operator (1 in obj?.foo), instanceof operator (bar instanceof obj?.foo), for...of loops, destructuring, with statements, and class extends clauses.
Yes, undefined can be used as an identifier (variable name) in any scope other than the global scope because undefined is not a reserved word. However, this is strongly discouraged as it makes code difficult to maintain and debug.
typeof works with undeclared identifiers, returning "undefined" instead of throwing a ReferenceError. However, using typeof on lexical declarations (let, const, class) before their declaration line will throw a ReferenceError due to the temporal dead zone (TDZ).
No, the constructor of new expressions cannot be an optional chain. Using new Intl?.DateTimeFormat() or new Map?.() results in a SyntaxError: "new keyword cannot be used with an optional chain".
The Optional.ofNullable(T value) method returns an Optional describing the specified value if non-null, otherwise returns an empty Optional. The Optional.of(T value) method requires a non-null value and throws NullPointerException if the value is null.
The && operator adds null and/or undefined to the type of the right operand depending on which are present in the type of the left operand. For example, with x: Entity | null, the expression x && x.name results in type string | null.
No, optional chaining cannot be used on a non-declared root object. Using undeclaredVar?.prop will throw a ReferenceError: "undeclaredVar is not defined". However, it can be used with a root object with value undefined.
Debugging Methodology > Binary search debugging
37 questions
git bisect start --term-old <term-old> --term-new <term-new>
Instead of asking "What is the problem?", binary search debugging focuses on asking "Where is the problem?" This transforms debugging from a thought-intensive mystery-solving exercise into a straightforward search problem.
Git bisect is a Git command that uses binary search to find which commit in a project's history introduced a bug or changed any property of the project. It allows you to perform binary search across commit history rather than through code.
No, you cannot mix "good" and "bad" with "old" and "new" in a single session. You must use one consistent set of terminology throughout the bisection.
With git bisect, you should expect to test approximately 1 + log₂(N) commits to find the bug, compared to roughly N/2 commits with a linear search.
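As a quick sanity check, the two growth rates can be computed directly (helper names are illustrative, not part of git):

```python
import math

def expected_bisect_tests(n_commits: int) -> int:
    """Approximate number of commits tested by a binary search:
    1 + log2(N), rounded up."""
    return 1 + math.ceil(math.log2(n_commits))

def expected_linear_tests(n_commits: int) -> float:
    """Average number of commits tested by a linear scan: N/2."""
    return n_commits / 2

# For 1,024 commits: about 11 bisect tests versus 512 linear tests.
```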
The Wolf Fence Algorithm is another name for binary search debugging, named after a hypothetical scenario where you need to find a lone wolf in Alaska by fencing the area in half, waiting for the wolf to howl to determine which half it's in, and repeating the process until you find the wolf.
It takes a maximum of 13 steps to find the first offending commit among 10,000 commits using binary search.
git bisect next
This is used to explicitly request the next bisection step, typically after interrupting the process by checking out a different revision.
The --first-parent option tells git bisect to only follow the first parent when traversing the commit history, ignoring merge commits' second parents.
The "Wolf Fence" algorithm for debugging was first published by E. J. Gauss in 1982 in the Communications of the ACM, volume 25, issue 11, page 780.
Binary search debugging is a methodical debugging process that systematically narrows down the location of a bug by repeatedly dividing the code in half and testing each half. At each step, you eliminate approximately half of the remaining code from consideration until you isolate the specific line or commit responsible for the bug.
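The process can be sketched as a plain binary search over an ordered history (a simulation of the idea, not git itself; it assumes the history flips from good to bad exactly once):

```python
def find_first_bad(commits, is_bad):
    """Binary search for the first 'bad' commit, mirroring git bisect.

    `commits` is ordered oldest-to-newest and `is_bad(commit)` reproduces
    the bug check. Assumes the newest commit is bad and the history flips
    from good to bad exactly once.
    """
    lo, hi = 0, len(commits) - 1   # invariant: commits[hi] is bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid               # first bad commit is at mid or earlier
        else:
            lo = mid + 1           # bug was introduced after mid
    return commits[lo], tests

# 1,000 commits where the bug appeared at commit 700:
history = list(range(1000))
culprit, tests = find_first_bad(history, lambda c: c >= 700)
# culprit is 700, found in about 10 tests instead of ~500.
```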
git bisect start -- <pathspec>...
Paths given after the -- separator restrict the bisection to commits that touch those paths.
git bisect reset HEAD
This leaves you on the current bisection commit and avoids switching commits at all.
git bisect replay <logfile>
This replays a bisection from a log file previously saved with git bisect log, restoring the session.
git bisect visualize or git bisect view
This opens gitk (or git log if no graphical environment is detected) to show the currently remaining suspects.
The script should exit with code 0 if the current source code is good/old.
The principle is called "Divide and conquer" or "Proceed by binary search." It involves throwing away half the input to see if the output is still wrong, and if not, going back to discard the other half. The same process can be used on the program text itself by eliminating parts of the program that should have no relationship to the bug.
Using binary search to find a birthday takes at most 9 questions, compared to an average of 183 guesses with linear search (366 days divided by 2).
Binary search debugging is most effective for bugs that are easily reproducible with clear input/output. It may not be effective for intermittent bugs, bugs related to external factors, or bugs that occur in code executed multiple times (like loops). It also requires familiarity with the codebase to identify which sections to isolate.
git bisect reset [<commit>]
git bisect reset bisect/bad
This will check out the first bad revision instead of returning to the original HEAD.
git bisect terms [--term-(good|old) | --term-(bad|new)]
To get just the old term: git bisect terms --term-old or git bisect terms --term-good
To get just the new term: git bisect terms --term-new or git bisect terms --term-bad
git bisect start [--term-(bad|new)=<term-new> --term-(good|old)=<term-old>] [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<pathspec>...]
After the Mars Pathfinder landed in July 1997, the spacecraft's computers tended to reset once a day. The engineers had seen this problem during pre-launch tests but had ignored it while working on unrelated problems. They were forced to deal with the problem when the machine was tens of millions of miles away, demonstrating the importance of debugging issues immediately rather than later.
git bisect skip [(<rev>|<range>)...]
git bisect run <cmd> [<arg>...]
The script should exit with a code between 1 and 127 (inclusive), except 125, if the current source code is bad/new.
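A minimal sketch of a test script usable with git bisect run, following the exit-code contract above (the build and test commands in the comment are placeholders, not part of any real project):

```python
#!/usr/bin/env python3
"""Candidate script for `git bisect run ./check.py` (hypothetical name).

Contract: exit 0 = good/old, 125 = this revision cannot be tested (skip),
any other code in 1-127 = bad/new.
"""

def bisect_exit_code(build_ok: bool, tests_ok: bool) -> int:
    """Map build/test outcomes onto git bisect run's exit-code contract."""
    if not build_ok:
        return 125               # untestable revision: ask bisect to skip it
    return 0 if tests_ok else 1  # 0 = good, 1 = bad

# In a real script you would run your own build and test commands, e.g.:
#   build_ok = subprocess.run(["make"]).returncode == 0
#   tests_ok = subprocess.run(["make", "check"]).returncode == 0
#   sys.exit(bisect_exit_code(build_ok, tests_ok))
```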
Git checks DISPLAY (set in X Window System environments on Unix), SESSIONNAME (set under Cygwin in interactive desktop sessions), MSYSTEM (set under Msys2 and Git for Windows), and SECURITYSESSIONID (may be set on macOS in interactive desktop sessions).
1. Identify the problem and the exact behavior not working as intended
2. Choose a breakpoint at the middle of the suspected code section
3. Run the code with the breakpoint
4. Determine if the code behaves as expected at this breakpoint
5. Choose a new breakpoint midway between the previous breakpoint and the start or end, depending on the result
6. Repeat steps 3-5 until you have narrowed the bug down to a single line
git bisect log
This shows what has been done so far in the current bisection session.
git bisect good [<rev>...]
If you skip a commit adjacent to the one you're looking for, Git will be unable to tell exactly which of those commits was the first bad one, and you'll get a message listing possible candidates.
git bisect bad [<rev>]
Git bisect leaves the reference refs/bisect/bad pointing at the first bad commit when the bisection is complete.
The --no-checkout option tells git bisect to not checkout commits during the bisection process, leaving the working tree unchanged.
The benefits are: (1) it allows you to progress systematically toward a solution rather than taking shots in the dark, (2) it almost always works with a success rate close to 100%, and (3) it breaks the debugging process into smaller, easier-to-manage chunks.
Test Failure Analysis > Debugging flaky tests
33 questions
The five primary causes are: 1) Lack of isolation (tests that create data and leave it behind), 2) Asynchronous behavior (tests that check before async operations complete), 3) Remote services (tests depending on external services that may be slow or down), 4) Time (tests that depend on specific time values or measure time intervals), and 5) Resource leaks (tests not properly releasing file handles, database connections, or memory).
Testcontainers is a Java library that provides lightweight, throwaway instances of common databases, Selenium web browsers, or anything that can run in a Docker container. This eliminates dependency on shared database instances, giving each test a fresh container and helping avoid isolation issues and shared state pollution that cause flakiness.
The tool requires four parameters: --run-tests (the command to execute tests), --test-output-file (the file for test runner output, currently only JUnit format supported), --test-output-format (either "junit" or "cucumberJson"), and --repeat (the number of times to execute the tests). Example: flaky-test-detector --run-tests "npm run test" --test-output-file=./test-results.xml --test-output-format=junit --repeat=5
The test pyramid approach emphasizes writing fewer UI system tests, which should be rare. Instead, write most tests at lower levels (unit tests, integration tests) where there are more possibilities to test and the proportion of flaky tests drops significantly. One commentator stated that if developers write all the tests and follow the pyramid, the flaky test proportion would drop to 1:1000.
JUnit 5 has built-in timeout support documented in the "Timeouts" section of the user guide, allowing tests to fail if they take longer than a specified time. This helps identify timing-related issues and prevents tests from hanging indefinitely.
1) Numeric limit: Only allow a fixed number (e.g., 8) of tests in quarantine; once the limit is reached, developers must clear all tests out.
2) Time limit: No test should remain in quarantine longer than a week.
3) Aggressive approach: Put the quarantine suite into the deployment pipeline one stage after healthy tests so flaky tests still run but don't block the critical path.
Mocha has a documented "Retrying tests" feature that allows failed tests to be automatically re-run a specified number of times. Mocha is designed for asynchronous testing with tests running serially, allowing for flexible and accurate reporting while mapping uncaught exceptions to the correct test cases.
The command is avocado diff 7025aaba 384b949c (comparing two job IDs). This command allows you to easily compare several aspects of two given jobs, including system information, test results, and environmental differences that might explain why a test failed in one run but not another.
The command is avocado replay 825b86 (using either a full or partial job ID). This command reproduces a job using exactly the same data as the original run, which is useful for reproducing intermittent failures that occurred in previous test runs.
Martin Fowler stated that "non-deterministic tests are useless" and "they are a virulent infection that can completely ruin your entire test suite." They must be dealt with as soon as possible before the entire deployment pipeline is compromised. Once developers lose the discipline of taking all test failures seriously, they'll start ignoring failures in healthy tests too, at which point "you've lost the whole game and might as well get rid of all the tests."
Approximately 16% of Google's tests have some level of flakiness associated with them. Google describes this as "a staggering number; it means that more than 1 in 7 of the tests written by our world-class engineers occasionally fail."
The main test thread continues executing assertions while the asynchronous operation is still processing. When the test finishes and returns its result before the async operation completes, it leads to non-deterministic results depending on timing.
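A minimal Python sketch of the race (all names are illustrative): asserting immediately races the background work, while polling a condition until a deadline stays deterministic as long as the operation finishes within the timeout:

```python
import threading
import time

result = {}

def slow_operation():
    """Simulated async work with variable latency."""
    time.sleep(0.05)
    result["status"] = "done"

def wait_until(predicate, timeout=2.0, interval=0.01):
    """Poll until the predicate holds or the deadline passes, instead of
    asserting immediately or sleeping a fixed, hoped-for amount."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

threading.Thread(target=slow_operation).start()

# Flaky: `assert result["status"] == "done"` here would race the thread.
# Robust: wait for the condition with a generous timeout.
assert wait_until(lambda: result.get("status") == "done")
```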
UI tests are "definitely flaky" due to how test harnesses interact with the UI, timing issues, handshaking between the test framework and UI, and the extraction of state from the UI. These complex interactions create multiple opportunities for non-deterministic behavior.
JUnit 5's "Test Instance Lifecycle" controls how test instances are created, which can help with shared state issues that cause flakiness. This allows tests to run with fresh instances or shared instances depending on the testing needs.
Properly isolated tests can be run in any sequence without affecting each other. If executing tests in a different order causes failures, it indicates a lack of isolation due to shared state or data pollution between tests.
About 84% of the transitions from pass to fail that Google observes involve a flaky test. This high percentage means that most test failures are false positives, making it difficult to identify legitimate failures.
The core argument is that all temporary solutions tend to become permanent, and brute-forcing flaky tests through retries masks real problems. The race condition or timeout causing flakiness might be in production code, not just the test, potentially affecting customers. The long-term solution is to either fix or replace the flaky tests, or delete and rewrite them from scratch if they cannot be fixed.
Google maintains a continual rate of about 1.5% of all test runs reporting a flaky result. For an average project containing 1000 individual tests, this means approximately 15 tests will likely fail per release, blocking submission and introducing costly delays.
Google runs tests both before submission (pre-submit) and after submission (post-submit). Pre-submit testing gates the submission, preventing code from being committed if tests fail. Post-submit testing decides whether the project is ready for release. Flaky tests cause extra repetitive work in both phases to determine whether a failure is flaky or legitimate.
The primary methods are mock() and @Mock (to create mock objects), when() and given() (to specify mock behavior), spy() and @Spy (for partial mocking), and @InjectMocks (to automatically inject mocks/spies into the class under test). These help isolate tests from external dependencies that could cause flakiness.
A flaky test is defined as "a test that exhibits both a passing and a failing result with the same code." This means that running the same test multiple times without any code changes can produce different results - sometimes passing and sometimes failing - making it non-deterministic.
There are two approaches: always rebuild your starting state from scratch, or ensure that each test cleans up properly after itself. Rebuilding from scratch is preferred because it's often easier and easier to find the source of a problem if one occurs.
Semaphore CI explicitly chooses not to support rerunning failed tests because "this approach is harmful much more often than it is useful." They describe it as "poisonous" because it legitimizes and encourages entropy, rots the test suite in the long run, and defeats the purpose of testing.
Pytest provides modular fixtures for managing test resources (helping with shared state issues), has more than 1,300 external plugins in its ecosystem including flaky test handling plugins, and supports re-running failed tests while maintaining state between test runs. The fixture system is particularly important for proper resource management and test isolation.
JUnit 5 supports test execution order configuration, which can help isolate order-dependent flaky tests. If tests pass when run in one order but fail in another, it indicates lack of isolation and shared state pollution between tests.
It is human nature to ignore alarms when there is a history of false signals coming from a system. Developers become conditioned to treat test failures as false positives from flaky tests, leading them to dismiss legitimate failures as flaky, only to later realize that it was a real problem. This is analogous to airline pilots ignoring alarms due to false signals.
Google uses a tool that monitors the flakiness of tests and automatically quarantines tests with flakiness that is too high. The tool removes the test from the critical path and files a bug. Another tool detects changes in flakiness levels and works to identify the code change that caused the test to change its flakiness level.
Google provides the ability to re-run only failing tests, with options to re-run tests automatically when they fail. Additionally, tests can be marked as flaky, causing them to report a failure only if they fail 3 times in a row.
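A sketch of that "fail 3 times in a row" policy (not Google's actual tooling; names are illustrative):

```python
def passes_with_flaky_policy(test_fn, max_runs=3):
    """Re-run a test marked flaky; report failure only if it fails
    max_runs times in a row. A single pass counts as a pass."""
    for _ in range(max_runs):
        try:
            test_fn()
            return True            # passed at least once: report pass
        except AssertionError:
            continue               # failed this attempt: try again
    return False                   # failed every attempt: report failure

attempts = iter([False, False, True])  # flaky: fails twice, then passes

def flaky_test():
    assert next(attempts)

assert passes_with_flaky_policy(flaky_test) is True
```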
The three key Mockito best practices are: 1) Do not mock types you don't own, 2) Do not mock value objects, and 3) Do not mock everything. Following these practices helps avoid external service dependencies that can lead to flaky tests while maintaining test integrity.
If a long-running integration test is marked as flaky and broken by a code submission, the breakage will not be discovered until 3 executions of the test complete. For a 15-minute integration test, this means a 45-minute delay before the failure is detected.
Avocado's sysinfo collector automatically gathers system information per job or even between tests, including cpuinfo, meminfo, mounts, network configuration, installed packages, and other system state. This information is stored in $HOME/avocado/job-results/latest/sysinfo/ and helps identify environment-specific flakiness, resource exhaustion, or configuration differences between runs.
The quarantine strategy involves placing non-deterministic tests in a separate test suite away from healthy tests. This prevents them from blocking deployments while maintaining awareness that they need fixing. Critical warning: Tests in quarantine must be fixed quickly or they will be forgotten, eroding the bug detection system.
Debugging Methodology > Root cause analysis techniques
32 questions
Whiteboards & Sticky Notes: limited visibility once the session ends; difficult to archive or share remotely; no easy way to track follow-up actions.
Excel & Spreadsheets: aren't built for visual methods like Fishbone or Fault Trees; can quickly become cluttered and hard to navigate; lack collaboration features.
Visio, PowerPoint, and Diagramming Tools: useful for creating visuals, but they're static rather than dynamic; updating requires manual rework; they don't connect findings to corrective actions or task tracking.
Brainstorming sessions should be performed to identify root causes. It is a technique by which various efforts are made to define a specific problem or defect. There might be more than one root cause of a defect, so one needs to identify as many causes as possible. Brainstorming helps generate multiple potential causes that can then be systematically analyzed.
Two primary techniques are used to perform a five whys analysis: the fishbone (or Ishikawa) diagram and a tabular format. These tools allow for analysis to be branched in order to provide multiple root causes.
Fault Tree Analysis (FTA) is a type of failure analysis in which an undesired state of a system is examined. This analysis method is mainly used in safety engineering and reliability engineering to understand how systems can fail, to identify the best ways to reduce risk and to determine event rates of a safety accident or a particular system level failure. FTA is also used in software engineering for debugging purposes and is closely related to cause-elimination technique used to detect bugs.
Defining the defect means to identify or determine if a defect is present in a system. It includes understanding what exactly is happening, what are particular symptoms, what issues you observe, its severity, etc. This is the critical first step that ensures the analysis is focused on the actual problem rather than symptoms.
Fault Tree Analysis was originally developed in 1962 at Bell Laboratories by H.A. Watson, under a U.S. Air Force Ballistics Systems Division contract to evaluate the Minuteman I Intercontinental Ballistic Missile (ICBM) Launch Control System. Following the first published use in the 1962 Minuteman I Launch Control Safety Study, Boeing and AVCO expanded use of FTA to the entire Minuteman II system in 1963-1964.
In Fault Tree Analysis, the undesired outcome is taken as the root ('top event') of a tree of logic. The analysis works backward from this top event to determine how it could occur, mapping the relationship between faults, subsystems, and redundant safety design elements by creating a logic diagram of the overall system.
The Five Whys technique has been criticized for the following reasons:
- Tendency for investigators to stop at symptoms rather than going on to lower-level root causes
- Inability to go beyond the investigator's current knowledge
- Lack of support to help the investigator provide the right answer to "why" questions
- Results are not repeatable: different people using five whys come up with different causes for the same problem
- Tendency to isolate a single root cause, whereas each question could elicit many different root causes
- The arbitrary depth of the fifth why is unlikely to correlate with the root cause
FTA can be used to:
- Understand the logic leading to the top event/undesired state
- Show compliance with system safety/reliability requirements
- Prioritize the contributors leading to the top event
- Monitor and control the safety performance of complex systems
- Minimize and optimize resources
- Assist in designing a system by helping create requirements
- Function as a diagnostic tool to identify and correct causes of the top event
Root Cause Analysis (RCA) is a structured method used to identify the underlying reason a defect or failure occurs in a system. Unlike simple debugging or patching, which addresses immediate symptoms, RCA goes deeper to uncover systemic issues that allowed the defect to happen in the first place, whether they originate in design, requirements, process, tools, or human factors.
The Ishikawa Fishbone Diagram is a visual root cause analysis tool that organizes potential causes of a problem into categories. Shaped like a fishbone, it helps teams brainstorm systematically by grouping causes under headings like Methods, Materials, Machines, People, and Environment. It is best for complex problems with multiple potential causes.
Based on the 80/20 Principle, the Pareto Chart helps prioritize the most significant causes of a problem. In most cases, 80% of failures come from just 20% of causes. By charting causes by frequency or impact, teams can focus on the biggest drivers of failure first. It is best for identifying which issues deliver the highest ROI when solved.
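The prioritization behind a Pareto chart can be sketched with hypothetical defect counts (all category names and numbers are invented for illustration):

```python
from collections import Counter

# Hypothetical defect counts by cause category.
defects = Counter({"null handling": 120, "race conditions": 45,
                   "config errors": 20, "UI layout": 10, "docs": 5})

def pareto(counts):
    """Sort causes by frequency and attach cumulative percentages,
    as a Pareto chart would plot them."""
    total = sum(counts.values())
    cumulative, rows = 0, []
    for cause, n in counts.most_common():
        cumulative += n
        rows.append((cause, n, round(100 * cumulative / total, 1)))
    return rows

rows = pareto(defects)
# Here the top two of five causes account for over 80% of all defects,
# so fixing those first gives the highest return.
```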
The Five Whys (or 5 Whys) is an iterative interrogative technique used to explore the cause-and-effect relationships underlying a particular problem. The primary goal is to determine the root cause of a defect or problem by repeating the question "why?" five times, each time directing the current "why" to the answer of the previous "why." The number of whys may be higher or lower depending on the complexity of the analysis and problem.
The modern Five Whys technique was originally developed by Sakichi Toyoda and was used within the Toyota Motor Corporation during the evolution of its manufacturing methodologies. Taiichi Ohno, the architect of the Toyota Production System, described the five whys method as "the basis of Toyota's scientific approach by repeating why five times the nature of the problem as well as its solution becomes clear."
FTA methodology is described in several industry and government standards, including:
- NRC NUREG–0492 for the nuclear power industry
- An aerospace-oriented revision to NUREG–0492 for use by NASA
- SAE ARP4761 for civil aerospace
- MIL–HDBK–338 for military systems
- IEC standard IEC 61025 for cross-industry use (adopted as European Norm EN 61025)
Following process industry disasters such as the 1984 Bhopal disaster and 1988 Piper Alpha explosion, in 1992 the United States Department of Labor Occupational Safety and Health Administration (OSHA) published in the Federal Register at 57 FR 6356 (1992-02-24) its Process Safety Management (PSM) standard in 29 CFR 1910.119. OSHA PSM recognizes FTA as an acceptable method for process hazard analysis (PHA).
When a specific event is found to have more than one effect event, meaning it has impact on several subsystems, it is called a common cause or common mode. Graphically, this means this event will appear at several locations in the fault tree.
Root Cause Corrective Action (RCCA) involves taking measures and actions to resolve or eliminate the current defect, with main focus on eliminating the root cause so it does not occur in future. Root Cause Prevention Action (RCPA) involves creating plans regarding defect reoccurrence, including improving skills, performing tasks properly, and following proper documentation of preventive actions to ensure the defect does not reoccur.
Early in the Apollo program, NASA initially decided to rely on the use of failure modes and effects analysis (FMEA) and other qualitative methods for system safety assessments. After the Challenger accident in 1986, the importance of probabilistic risk assessment (PRA) and FTA in systems risk and reliability analysis was realized and its use at NASA began to grow. Now FTA is considered as one of the most important system reliability and safety analysis techniques at NASA.
FTA is used in aerospace, nuclear power, chemical and process, pharmaceutical, petrochemical and other high-hazard industries. It is also used in fields as diverse as risk factor identification relating to social service system failure. In 1976, the U.S. Army Materiel Command incorporated FTA into an Engineering Design Handbook on Design for Reliability.
The Affinity Diagram organizes large amounts of qualitative data into clusters based on natural relationships. It's especially useful for turning brainstorming notes, survey feedback, or stakeholder input into structured insights. It is best for finding themes and connections in complex or ambiguous problems.
In 1970, the U.S. Federal Aviation Administration (FAA) published a change to 14 CFR 25.1309 airworthiness regulations for transport category aircraft in the Federal Register at 35 FR 5665 (1970-04-08). This change adopted failure probability criteria for aircraft systems and equipment and led to widespread use of FTA in civil aviation.
FMEA is a proactive root cause analysis tool that anticipates where and how a process might fail. Teams rank risks by severity, occurrence, and detection, allowing them to prioritize corrective actions before problems escalate. It is best for high-risk industries where prevention is critical. The three ranking criteria are severity, occurrence, and detection.
According to research by the Consortium for IT Software Quality (CISQ), software failures cost U.S. businesses alone over $2.4 trillion annually, from operational outages, lost productivity, customer churn, and reputational damage. Implementing RCA can provide up to 100× savings by addressing root causes rather than repeatedly fixing symptoms.
When collecting data regarding a defect, you should gather:
- Impact of the defect
- Proof that the defect exists
- How long the defect has existed
- Whether it is a reoccurring defect
- Communication with customers or employees who experienced or observed the issue
Before identifying the root cause, one needs to analyze the defect or problem completely and gather all required information or evidence.
The six steps to perform RCA are:
1. Define Problem or Defect: identify what exactly is happening, the particular symptoms, severity, etc.
2. Collect Data regarding defect: gather all information including impact, proof of existence, duration, and whether it's reoccurring
3. Identify Root Cause of defect: identify the main cause causing the defect to arise, potentially using tools and brainstorming sessions
4. Implement Root Cause Corrective Action (RCCA): take measures to resolve or eliminate the defect, focusing on eliminating the root cause
5. Implement Root Cause Prevention Action (RCPA): create plans to prevent defect reoccurrence through improved skills, proper task execution, and documentation
6. Monitor and Validate: verify the fix is effective and prevents reoccurrence
Fault Tree Analysis is a top-down, deductive RCA tool that maps events in a tree structure to show how multiple smaller issues combine into major system failures. It is best for high-risk, high-consequence problems that require exhaustive prevention. In contrast, the Five Whys is a simple, fast iterative technique that digs deeper by repeatedly asking "why?" and is best for quick investigations of recurring, surface-level issues.
Within the nuclear power industry, the U.S. Nuclear Regulatory Commission began using PRA methods including FTA in 1975, and significantly expanded PRA research following the 1979 incident at Three Mile Island. This eventually led to the 1981 publication of the NRC Fault Tree Handbook NUREG–0492.
Capers Jones' widely cited studies show that over 50% of all software defects stem from flawed requirements or design decisions—not coding errors—and are completely preventable when caught early.
A 2024 PwC survey found that one in three consumers will stop using a brand they love after just one bad experience. This underscores the importance of RCA in preventing recurring defects that damage user trust.
The key characteristics of RCA include:
- Systematic: it follows a logical, repeatable process rather than relying on guesswork
- Fact-based: it's grounded in data and evidence collected from the defect and its context
- Action-oriented: the goal is not just to identify what went wrong but to fix the root cause and verify the fix
- Collaborative: it often involves cross-functional teams to investigate issues
The PROACT RCA Method is a structured, evidence-driven approach developed by Reliability Center Inc. It is designed to tackle chronic, recurring failures that traditional methods often miss. The steps are:
- Preserve Evidence & Acquire Data (using the 5 Ps: Parts, Position, People, Paper, Paradigms)
- Order Your Team & Assign Resources
- Analyze the Event using logic trees
- Communicate Findings & Recommendations
- Track & Measure Bottom-Line Results
It is best for organizations seeking measurable ROI from RCA programs.
Debugging Methodology > Rubber duck debugging
32 questions
The approach has been taught in computer science and software engineering courses. It is mentioned as being taught in introductory programming lessons to help students when they can't understand why their code won't work.
The article "The Contribution of the Cardboard Cutout Dog to Software Reliability and Maintainability" by SJ Baker was archived from the original on October 5, 2013.
Pair programming is a type of teamwork where two software developers sit at the same computer and work on a programming problem together, with one person typing while the other reviews. This process is similar to rubber duck debugging: as the "driver" writes code, they explain what the program needs to do and how new additions will achieve that.
While AI can potentially take on the duck's role, there's an important distinction: the AI tool gives feedback, which can produce volumes of unrelated information that might distract the user and obscure their original thought process. LLMs may inhibit metacognition by offering an attractive escape from effortful practice. One of the most important benefits of rubber duck debugging is that all answers ultimately come from the programmer through methodical inspection.
1. Improved debugging efficiency: allows comprehensive examination of the codebase, scrutinizing each line, decision, and assumption.
2. Improved communication and collaboration: helps programmers verbalize code clearly and identify actual problem areas.
3. Enhanced problem-solving: forces articulation of thoughts and explanation of code step by step, helping identify mistakes and uncover logical errors.
4. Better memory retention: hearing the sound of your own voice enhances how effectively you learn concepts.
5. Integrating new knowledge with existing knowledge: helps learners update and refine existing mental models.
Yes. Variations of the practice use other objects or even pets; teddy bears are especially common. The actual presence of a rubber duck is not crucial—any inanimate object or even a person can serve as a substitute. The key is to engage in dialogue and carefully explain the code to someone or something that will not interrupt or respond.
University professor and author Michelene T.H. Chi has explored the benefits of self-explanation in learning and problem solving. Additionally, US scholars Logan Fiorella and Richard Meyer have examined how learning can be enhanced through teaching others, finding that when students learn content as though they are going to teach it to others, they "develop a deeper and more persistent understanding of the material."
The self-explanation effect is a cognitive phenomenon where explaining concepts or problems in one's own words enhances understanding and retention of material. It encourages deeper cognitive processing and helps identify gaps in comprehension. Self-explanation tends to produce better results than merely thinking aloud without an audience.
Before rubber ducks, a similar practice existed with a "cardboard cutout dog." In one retelling, a supervisor named Bob required an employee to ask questions aloud to a stuffed duck (named "Bob Junior") before asking him. The practice of using inanimate objects for problem-solving predates the specific rubber duck terminology.
By explaining code line by line to an inanimate object, the programmer is forced to engage with every line of code and take no line for granted. This forces them to slow down and explain in detail the logic of a program, which exposes details, assumptions, or errors that they had previously overlooked.
The term originated from a story in the 1999 book "The Pragmatic Programmer: From Journeyman to Master" by Andrew Hunt and David Thomas, published by Addison-Wesley. The story appears on page 95 in a footnote.
David Malan incorporated "Rubber Duck Debugging in CS50 IDE" as part of the Harvard CS50 computer science course, demonstrating how the technique has been integrated into educational tools.
A 2023 study posted on ResearchGate, titled "Robot Duck Debugging: Can Attentive Listening Improve Problem Solving", explored whether an interactive rubber duck that nods or offers brief, neutral replies when a user presses a button might make people more comfortable talking to it, potentially enhancing the debugging process.
Rubber ducks were dispersed throughout GitHub Universe in 2022, appearing in the "What is GitHub?" video and physically at the event.
Metacognition—the act of thinking about one's own thought process. Educators use metacognition to improve understanding and help learners spot errors in their reasoning, understand their process, and become more effective.
1. Solo activity—doesn't require involving another person, avoiding wasting their time or letting feedback distract from understanding your own thought process. 2. Doesn't expose your mistakes—allows fixing code without exposing simple issues to co-workers. 3. Finds solutions—helps solve problems in code. 4. Gains other code insights besides the solution—helps programmers glean a better understanding of their overall thought process and avoid similar pitfalls in the future.
1. Assume your duck knows nothing—begin by assuming the rubber duck has no knowledge of the code or problem, which ensures you start from basics. 2. Provide context for each line and decision made—go through code line by line, explaining the purpose and functionality of each segment. 3. Judge your rationale until you find a solution—evaluate your thought process, question assumptions, check for logical consistency, and challenge your reasoning.
"The Art of Readable Code: Simple and Practical Techniques for Writing Better Code" by Dustin Boswell and Trevor Foucher (2011), published by O'Reilly Media, page 137, ISBN 978-0596802295.
Rubber duck debugging (or rubberducking) is a debugging technique in software engineering wherein a programmer explains their code, step by step, in natural language—either aloud or in writing—to reveal mistakes and misunderstandings. The name is a reference to a story in the book The Pragmatic Programmer.
1. Go find a rubber duck and put it on your desk. 2. Explain the big picture of what your code is supposed to do. 3. Explain line by line what each part does. As you explain, you'll notice the line of code that's not doing what you want it to and fix it.
"Tell Us What You Really Think: A Think Aloud Protocol Analysis of the Verbal Cognitive Reflection Test" by Byrd, Nick; Joseph, Brianna; Gongora, Gabriela; Sirota, Miroslav (2023), published in Journal of Intelligence, 11(4): 76, DOI: 10.3390/jintelligence11040076.
Microsoft Visual Studio includes a rubber duck debugging feature, among other IDEs that have incorporated this technique.
A common mistake is choosing not to share what you see as small or unimportant aspects of your code. It's important not to leave any information out, as verbalizing every last piece of code presents more opportunities to identify the problem.
Jeff Atwood, co-founder of Stack Overflow, wrote that Stack Exchange insists people put effort into their questions partly to teach them "Rubber Duck problem solving." He noted that he received tons of feedback over the years from people who, in the process of writing up their thorough, detailed question for Stack Overflow, figured out the answer to their own problem.
The critical part is to totally commit to asking a thorough, detailed question of the imaginary person or inanimate object. The effort of walking an imaginary someone through the problem, step by step and in some detail, is what will often lead to the answer.
1. Can become a substitute for seeking real feedback—you might turn to the duck when criticism or feedback from others would be more beneficial. 2. Doesn't work if your intention isn't clear—requires knowing what you want the problem code to do. 3. Not good for "big issues"—best when you already know the answer and just need to think it over, not for problems you simply don't know how to solve. 4. Working on other people's code—you might be just as in the dark as the duck regarding their intention or rationale.
"The Psychology Underlying the Power of Rubber Duck Debugging" by David Hayes, published June 25, 2014, on Press Up. It was archived from the original on July 9, 2014.
On April 1, 2018, Stack Overflow launched an April Fools' Day joke called Quack Overflow. A rubber duck avatar appeared in the bottom right corner of the screen, listened to user problems, and pretended to type solutions, only to respond with a simple "quack" sound.
Fix Verification > Finding all instances of a bug pattern
32 questions

Search rules detect matches based on patterns described by a rule and perform semantic analyses like constant propagation and type inference. Taint rules make use of Semgrep's taint analysis in addition to default search functionalities, and can specify sources, sinks, and propagators of data as well as sanitizers.
A true positive occurs when a rule detected a piece of code it was intended to find.
An error matrix is a 2x2 table that visualizes the findings of a Semgrep rule in relation to the vulnerable lines of code it does or doesn't detect. It has two axes: Positive/Negative and True/False, yielding four combinations: true positive, false positive, true negative, and false negative.
Propagators must be explicitly listed because during intraprocedural taint analysis, there is no way for Semgrep to infer which function calls propagate taint. Explicitly listing propagators is the only way for Semgrep to know if tainted data could be passed within a function.
An example of a sanitizer is the DOMPurify.sanitize(dirty) function from the DOMPurify package in JavaScript.
Search rules perform several semantic analyses including: interpreting syntactically different code as semantically equivalent, constant propagation, matching a fully qualified name to its reference in the code even when not fully qualified, and type inference (particularly when using typed metavariables).
A rule is a specification of the patterns that Semgrep must match to the code to generate a finding. Rules are written in YAML. Without a rule, the engine has no instructions on how to match code. Rules can be run on either Semgrep or its OSS Engine. Only proprietary Semgrep can perform interfile analysis.
A sanitizer is any piece of code, such as a function or a cast, that can clean untrusted or tainted data. Data from untrusted sources may be tainted with unsafe characters, and sanitizers ensure that unsafe characters are removed or stripped from the input.
A propagator is any code that alters a piece of data as the data moves across the program. This includes functions, reassignments, and so on. When writing rules that perform taint analysis, propagators are pieces of code specified through the pattern-propagator key as code that always passes tainted data.
A true negative occurs when a rule correctly skipped over a piece of code it wasn't meant to find.
An l-value (left-value, or location-value) is an expression that denotes an object in memory; a memory location that can be used in the left-hand side (LHS) of an assignment. For example, x and array[2] are l-values, but 2+2 is not.
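The distinction can be shown in a few lines of C; this is a minimal illustrative sketch, not taken from any particular codebase:

```c
#include <assert.h>

/* Demonstrates l-values: expressions that denote a memory location
   and can appear on the left-hand side of an assignment. */
static int demo_lvalues(void) {
    int x = 0;
    int array[3] = {0, 0, 0};

    x = 5;         /* x is an l-value: it names a location */
    array[2] = 7;  /* array[2] is an l-value too */
    /* 2 + 2 = x;     would not compile: 2+2 is not an l-value */

    return x + array[2];
}
```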
Taint analysis tracks and traces the flow of untrusted or unsafe data. Data coming from sources such as user inputs could be unsafe and used as an attack vector if these inputs are not sanitized. Taint analysis provides a means of tracing that data as it moves through the program from untrusted sources to vulnerable functions.
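As a hedged sketch of the source-to-sink flow described above (the function name is invented for illustration), untrusted input can be routed through a sanitizer before it reaches a sensitive operation:

```c
#include <ctype.h>
#include <string.h>
#include <assert.h>

/* Illustrative sanitizer: keep only alphanumeric characters, so that
   shell metacharacters such as ';', '|', or '/' in tainted input are
   stripped before the data reaches a sink (e.g. a command string). */
static void sanitize(const char *in, char *out, size_t out_size) {
    size_t j = 0;
    for (size_t i = 0; in[i] != '\0' && j + 1 < out_size; i++) {
        if (isalnum((unsigned char)in[i]))
            out[j++] = in[i];
    }
    out[j] = '\0';  /* always null-terminate the cleaned output */
}
```

Taint analysis would flag a path where `in` (the source) reaches a sink without passing through such a sanitizer.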
Cross-file analysis takes into account how information flows between files and can track data through arbitrarily many files. Per-file analysis (also known as intrafile analysis) can only trace or track information within a single file and cannot trace data if it flows to another file.
A false positive occurs when a rule detected a piece of code it was not intended to find.
Per-file analysis (also known as intrafile analysis) means information can only be traced or tracked within a single file. It cannot be traced if it flows to another file. Per-file analysis can include cross-function analysis, aka tracing the flow of information between functions.
Cross-file analysis includes: cross-file taint analysis (tracking unsanitized variables flowing from a source to a sink through multiple files), constant propagation across files, and type inference.
Per-function analysis (also known as intraprocedural analysis) means information can only be traced or tracked within a single function.
Constant propagation is a type of analysis where values known to be constant are substituted in later uses, allowing the value to be used to detect matches. Semgrep can perform constant propagation across files, unless running Semgrep Community Edition (CE), which can only propagate within a file.
Search rules are the default type. Search rules detect matches based on the patterns described by a rule.
Semgrep Community Edition (CE) can only perform constant propagation within a single file, while the full version can perform constant propagation across files.
A false negative occurs when a rule failed to detect a piece of code it should have found.
Cross-file analysis (also known as interfile analysis) takes into account how information flows between files. It includes cross-file taint analysis, which tracks unsanitized variables flowing from a source to a sink through arbitrarily many files. Other analyses performed across files include constant propagation and type inference.
Only proprietary Semgrep can perform interfile analysis. The OSS Engine cannot perform interfile analysis.
Cross-function analysis means that interactions between functions are taken into account. It improves taint analysis by tracking unsanitized variables flowing from a source to a sink through arbitrarily many functions. Within Semgrep documentation, cross-function analysis implies intrafile or per-file analysis, where each file is analyzed as a standalone block but takes into account information flows between functions within that file.
Metavariable names can only contain uppercase characters, digits, and underscores. All metavariables must begin with a $.
A metavariable is an abstraction that lets you match something even when you don't know exactly what it is you want to match. It is similar to capture groups in regular expressions. All metavariables begin with a $ and can only contain uppercase characters, digits, and underscores.
In taint analysis, a source is any piece of code that assigns or sets tainted data, typically user input.
In taint analysis, a sink is any vulnerable function that is called with potentially tainted or unsafe data.
A finding is the core result of Semgrep's analysis. Findings are generated when a Semgrep rule matches a piece of code. Findings can be security issues, bugs, or code that doesn't follow coding conventions.
A fully qualified name refers to a name which uniquely identifies a class, method, type, or module. Languages such as C# and Ruby use :: to distinguish between fully qualified names and regular names.
Debugging Methodology > Systematic debugging process
31 questions

Log analysis is used when working on large-scale applications where you might not always be able to recreate every issue locally. Logs record everything the application is doing, including performance issues like resource leaks or incorrect values. Tools like the ELK Stack (Elasticsearch, Logstash, Kibana), Blackfire, and Graylog help dig through logs to find performance issues or bugs.
The most powerful question is "Why?" - asking why something isn't conforming to expectations is the essential beginning of the scientific mindset and any investigation.
The drawbacks include: 1) It's limited to specific types of bugs that are easily reproducible and have clear input/output, 2) It may not be effective for intermittent bugs related to external factors, 3) It may not work well for bugs in code executed multiple times (like loops), and 4) It requires familiarity with the codebase to identify which sections to isolate.
The types of tests include: 1) Unit tests (test individual code segments changed), 2) Integration tests (test the whole module containing the fix), 3) System tests (test the entire system), and 4) Regression tests (ensure the fixed code doesn't impact application performance).
Print and log debugging involves adding print statements or "logs" to the code to display values of variables, call stacks, the flow of execution, and other relevant information. This approach is especially useful for debugging concurrent or distributed systems where the order of execution can impact the program's behavior.
Clustering bugs is a technique where bugs are grouped by their symptoms to make debugging easier. Bugs often share a common root cause, and fixing one can resolve several related ones. When developers cluster bugs based on similar behavior or areas they affect, they can focus on these clusters to pinpoint the root cause faster.
The seven steps are: 1) Figure out the symptoms, 2) Reproduce the bug, 3) Understand the system(s), 4) Form a hypothesis about where the bug is, 5) Test this hypothesis and repeat if needed, 6) Fix the bug, and 7) Check the fix and repeat if needed.
The Wolf Fence algorithm is another name for binary search debugging, named after a hypothetical scenario where you need to find a lone howling wolf in Alaska. You build a fence dividing Alaska in half and listen for the howl to learn which side the wolf is on, then split that section in half again, repeating until you find the wolf.
You know you've reached the root when you have a chain of evidence that starts with a plausible hypothesis connecting all the way through to an expected outcome. Sometimes it's necessary to settle for weaker outcomes when all avenues have been exhausted or further effort is unjustified, depending on the severity of the failure and cost you're willing to invest.
The benefits include: 1) It helps find bugs faster, 2) You're less likely to miss something since you're being methodical, 3) It breaks the debugging process into smaller, easier-to-manage chunks, and 4) It's easy to use without needing fancy tools or software.
The standard steps are: 1) Reproduce, 2) Progressively Narrow Scope, 3) Avoid Debuggers, 4) Change Only One Thing At a Time, and 5) Write a Regression Test to Prevent Reoccurrence.
TRAFFIC is an acronym standing for: Track the problem, Reproduce, Automate, Find Origins, Focus, Isolate, and Correct. This principle was outlined by Andreas Zeller, author of "Why Programs Fail."
You should change only one variable at a time. This is obvious but critical—when you start with a reasonable hypothesis about what will happen when you make a change and vary one thing at a time, you can be reasonably confident that you don't misinterpret a positive result.
The main steps are: 1) Observe problems and ask WHY, 2) Gather data for problem solving, 3) Formulate a hypothesis, 4) Create a plan for testing the hypothesis, 5) Test your hypothesis, and 6) Analyze the results (and repeat if necessary).
The six steps are: 1) Reproduce the conditions, 2) Find the bug, 3) Determine the root cause, 4) Fix the bug, 5) Test to validate the fix, and 6) Document the process.
Reproducing a bug is necessary because if you cannot reproduce the bug, you cannot confirm whether it's fixed or not. Each struggle to reproduce the bug tells you more about the bug itself, helping you identify pieces that are essential for reproducing it versus those that are incidental.
The key questions are: 1) When did the bug start happening? 2) How many people have experienced it? 3) How many people have reported it? 4) Who noticed it first? and 5) What environments does it occur in?
Git bisect uses the binary search algorithm to determine which commit introduced a particular bug. It automates the process of testing commits between the current broken version and a known stable version, helping isolate the root cause efficiently.
Time-travel debugging (available through tools like rr or UDB) allows developers to step forward or backward in the code to see exactly where things broke. This is particularly useful in bigger systems like Java apps or multi-threaded environments where many things happen at once.
Establishing a baseline and building controls into experiments is important because experimental evidence in debugging is subject to the same pitfalls as traditional scientific experiments. For example, if your evidence consists of debug statements, you should do a control run before changing any variables and save the control output to compare with subsequent experimental runs.
The recommended approach is to form a hypothesis about where the bug is (not what it is) to narrow the search space. Early on, you should bisect the system by making a hypothesis that allows you to eliminate as many locations as possible, ideally close to 50% of the system, enabling a binary search approach.
Rubber duck debugging is a technique where developers explain or talk out their code line by line to any inanimate object. The idea is that by trying to explain the code out loud, developers can better understand its logic (or lack thereof) and spot bugs more easily.
The four categories are: 1) Semantic errors (syntactically valid code that violates the language's meaning rules), 2) Syntax errors (missing elements like parentheses or commas), 3) Logical errors (technically correct syntax with incorrect directions), and 4) Runtime errors (errors occurring when an application is running or starting up).
Brute force debugging involves going through the entire codebase line by line to identify the source of the problem. This time-consuming approach is typically deployed when other methods have failed, but can also be useful for debugging small programs when the engineer isn't familiar with the codebase.
Binary search debugging is a methodical process that narrows down the cause of a bug by systematically testing different parts of code. At each step, you divide the suspected code section in the middle (using a breakpoint or print statement), evaluate its behavior, and choose which half to investigate next, repeating until you pinpoint the single line of code responsible.
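The same halving idea can be expressed as a small algorithm. This is an illustrative sketch (the function and predicate names are invented): assuming every version from the first bad one onward fails, binary search finds the culprit in O(log n) probes, which is also the idea git bisect automates over commits:

```c
#include <assert.h>

/* Find the first "bad" version in 0..n-1, given a predicate that
   reports whether a version exhibits the bug. Assumes the bug, once
   introduced, is present in every later version. */
static int first_bad_version(int n, int (*is_bad)(int)) {
    int lo = 0, hi = n - 1, first = n;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (is_bad(mid)) {   /* bug present here: look earlier */
            first = mid;
            hi = mid - 1;
        } else {             /* still good: look later */
            lo = mid + 1;
        }
    }
    return first;  /* returns n if no version is bad */
}

/* Illustrative predicate: versions 7 and later are broken. */
static int bad_from_seven(int v) { return v >= 7; }
```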
Automated debugging relies on analytics, artificial intelligence (AI) and machine learning algorithms to automate one or more steps of the debugging process. AI-powered debugging tools can search through large sets of code more quickly to identify errors or narrow down sections of code for more thorough examination.
The divide and conquer technique involves dividing lines of code into segments—functions, modules, class methods, or other testable logical divisions—and testing each one separately to locate the error. When the problem segment is identified, it can be divided further and tested until the source of the bug is identified.
The four types are: 1) Backtracking (working backward from the error detection point), 2) Cause elimination (hypothesis-driven testing of possible causes), 3) Divide and conquer (testing code segments separately), and 4) Print and log debugging (adding statements to display values).
Backtracking is an approach where developers work backward from the point the error was detected to find the origin of the bug. They retrace the steps the program took with the problematic source code to see where things went wrong.
Cause elimination is a hypothesis-driven debugging technique where the team speculates about the causes of the error and tests each possibility independently. This approach works best when the team is familiar with the code and the circumstances surrounding the bug.
The nine rules are: 1) Understand the system, 2) Make it fail, 3) Quit thinking and look, 4) Divide and conquer, 5) Change one thing at a time, 6) Keep an audit trail, 7) Check the plug, 8) Get a fresh view, and 9) If you didn't fix it, it ain't fixed.
Common Bug Patterns > Case sensitivity bugs
31 questions

Zero width non-joiner (ZWNJ U+200C) and zero width joiner (ZWJ U+200D) characters are not allowed in Rust identifiers.
$this is a special variable in PHP that cannot be assigned. Prior to PHP 7.1.0, indirect assignment using variable variables was possible, but this is no longer allowed.
Java is case-sensitive. Identifiers must begin with a letter (A-Z, a-z), underscore (_), or dollar sign ($); subsequent characters may also include digits. Unicode letters may also be used in identifiers, including those written as Unicode escape sequences.
In PostgreSQL, unquoted identifiers (table names, column names, etc.) are case-insensitive and are automatically folded to lower case. For example, UPDATE MY_TABLE SET A = 5 is equivalent to UPDATE my_table SET a = 5.
In Ruby, identifiers that begin with uppercase letters ([A-Z]) are constants. The constants are case-sensitive and must be assigned once. Changing the constant value or accessing uninitialized constants raises a NameError exception.
Identifiers in Rust are normalized using Normalization Form C (NFC) as defined in Unicode Standard Annex #15. Two identifiers are equal if their NFC forms are equal. Procedural and declarative macros receive normalized identifiers in their input.
A convention often used is to write key words in upper case and names in lower case, such as: UPDATE my_table SET a = 5.
Rust is case-sensitive. Identifiers follow the Unicode Standard Annex #31 specification. The language uses snake case as the conventional style for function and variable names, where all letters are lowercase and underscores separate words.
In SQLite, keywords in double quotes (e.g., "keyword") are treated as identifiers, while keywords in single quotes (e.g., 'keyword') are string literals. To use a keyword as a name, you need to quote it. There are four ways of quoting: double quotes ("keyword"), square brackets ([keyword]), grave accents (`keyword`), or single quotes ('keyword') for string literals.
No, you should not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not. However, it is acceptable to specify a period as the first character of a name, such as ".temp".
In PostgreSQL, quoted identifiers (delimited identifiers enclosed in double quotes) are case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same, but "Foo" and "FOO" are different from these three and each other.
Ruby is case-sensitive. The case of characters in source files is significant. Identifiers such as foobar and ruby_is_simple are case-sensitive.
PHP doesn't support Unicode variable names. However, some character encodings (such as UTF-8) encode characters in such a way that all bytes of a multi-byte character fall within the allowed range, thus making it a valid variable name.
The SQL standard specifies that unquoted names should be folded to upper case. However, PostgreSQL folds unquoted names to lower case (incompatible with the standard). Thus, according to the standard, foo should be equivalent to "FOO" not "foo".
JavaScript is case-sensitive and uses the Unicode character set. For example, the variable Früh is not the same as früh because JavaScript is case sensitive.
The following reserved characters cannot be used: < (less than), > (greater than), : (colon), " (double quote), / (forward slash), \ (backslash), | (vertical bar or pipe), ? (question mark), * (asterisk). Integer value zero (ASCII NUL) is also restricted.
In PostgreSQL, the system uses no more than NAMEDATALEN-1 bytes of an identifier by default. NAMEDATALEN is 64, so the maximum identifier length is 63 bytes. Longer names can be written in commands but will be truncated.
The Language Server Protocol specification does not explicitly state URI case sensitivity, but URIs in general are case-sensitive for the scheme and authority components, though the path component may vary depending on the file system.
Go is case-sensitive. Each code point is distinct; for instance, uppercase and lowercase letters are different characters. Identifiers name program entities such as variables and types, and must begin with a letter.
The reserved names are: CON, PRN, AUX, NUL, COM1-9, LPT1-9, and their superscript variants COM¹-³ and LPT¹-³. These names should not be used for files, even with extensions (e.g., NUL.txt and NUL.tar.gz are both equivalent to NUL).
Windows developers should not assume case sensitivity. For example, the names OSCAR, Oscar, and oscar should be considered the same, even though some file systems (such as POSIX-compliant file systems) may consider them different. NTFS supports POSIX semantics for case sensitivity but this is not the default behavior.
Key words and unquoted identifiers are case-insensitive in PostgreSQL. Therefore, UPDATE MY_TABLE SET A = 5 can equivalently be written as uPDaTE my_TabLE SeT a = 5.
PHP variable names are case-sensitive. A valid variable name starts with a letter (A-Z, a-z, or bytes from 128 through 255) or underscore, followed by any number of letters, numbers, or underscores. The regular expression is: ^[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*$
Quoted identifiers can contain any character except the character with code zero. To include a double quote in a delimited identifier, write two double quotes. This allows constructing table or column names containing spaces or ampersands.
No, volume designators are case-insensitive. For example, "D:" and "d:" refer to the same volume.
Identifiers are restricted to the ASCII subset of XID_Start and XID_Continue in: extern crate declarations (except the AsClause identifier), external crate names referenced in a path, module names loaded from the filesystem without a path attribute, no_mangle attributed items, and item names in external blocks.
Ruby distinguishes variable types by their first character: global variables begin with $ (e.g., $foobar), instance variables begin with @ (e.g., @foobar), and local variables begin with a lowercase letter or underscore. All are case-sensitive.
The older MS-DOS FAT file system supports a maximum of 8 characters for the base file name and 3 characters for the extension, for a total of 12 characters including the dot separator. This is commonly known as an 8.3 file name. Windows FAT and NTFS file systems support long file names but still maintain 8.3 aliases.
JavaScript identifiers usually start with a letter, underscore (_), or dollar sign ($). Because JavaScript is case sensitive, letters include both uppercase (A through Z) and lowercase (a through z). Most Unicode letters such as å and ü can be used in identifiers.
Python is case-sensitive. The case of characters in source files is significant. Identifiers (names) are unlimited in length and are case-sensitive.
Variable names in PHP are case-sensitive. For example, $var and $Var are different variables.
Common Bug Patterns > Off-by-one errors
28 questions

The "off-by-five" error is a variant of off-by-one error that was reported in sudo (CVE-2002-0184) in 2002. It's described as more of a "length calculation" error than a true off-by-one error. The term illustrates that off-by-one errors can manifest as calculation mistakes of various magnitudes, not just being off by one.
The safe pattern for avoiding off-by-one errors when copying strings is to allocate buffer size of strlen(source) + 1 to account for the null terminator, or when using functions like strncpy(), ensure the destination buffer size is at least count + 1 bytes and manually null-terminate if necessary. Always check if strlen(source) >= buffer_size before copying to detect truncation.
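A minimal sketch of this check-then-copy pattern (the function name is invented for illustration):

```c
#include <string.h>
#include <assert.h>

/* Copy src into dst only if it fits, including the null terminator.
   Returns 0 on success, -1 if the copy would be truncated. */
static int safe_copy(char *dst, size_t dst_size, const char *src) {
    if (strlen(src) >= dst_size)  /* no room left for the '\0' */
        return -1;
    strcpy(dst, src);             /* safe: length was checked above */
    return 0;
}
```

Rejecting the copy up front makes truncation an explicit, testable condition instead of a silent bug.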
In Go, arrays are zero-indexed and an array of length n has valid indices from 0 to n-1. For example, var buffer [256]byte has 256 elements accessible via indices 0 through 255. Attempting to access buffer[256] will cause a runtime panic. The built-in len() function returns the array length, which is a fixed value determined at compile time.
The maximum length of a JavaScript array is 2^32 - 1 (4,294,967,295), which is the maximum value for an unsigned 32-bit integer. Attempting to set the length to 2^32 or higher results in a RangeError.
The standard pattern to avoid off-by-one errors when iterating through an array is to start at index 0 and continue while the index is strictly less than the array length: for (int i = 0; i < array.length; i++). This ensures all valid indices (0 through length-1) are accessed without going out of bounds.
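In C, the same pattern looks like this (a minimal sketch; the helper name is illustrative):

```c
#include <assert.h>

/* The canonical loop shape: i starts at 0 and runs while i < n, so
   exactly the valid indices 0..n-1 are visited. */
static int sum_array(const int *a, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)  /* not i <= n: a[n] is out of bounds */
        sum += a[i];
    return sum;
}
```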
According to MITRE CWE-193, C functions particularly susceptible to off-by-one errors include strcpy(), strncpy(), strcat(), strncat(), printf(), sprintf(), scanf(), and sscanf(). These functions require careful accounting for null terminators and buffer sizes to avoid off-by-one errors.
Off-by-one errors are a common cause of buffer overflow vulnerabilities. When a program writes one byte past the end of a buffer due to an off-by-one calculation error, it can corrupt adjacent memory, overwrite return addresses, or create conditions that attackers can exploit to execute arbitrary code. This is classified as CWE-787 (Out-of-bounds Write) and CWE-121 (Stack-based Buffer Overflow).
The correct loop condition for iterating through an array of length n is i < n or equivalently i <= n-1. The condition i < n is preferred because it directly expresses "iterate while index is within bounds" and is easier to read. Using i <= n would cause an off-by-one error by attempting to access index n.
Python's range() function is designed to prevent off-by-one errors by excluding the end point from the generated sequence. For example, range(5) generates the values 0, 1, 2, 3, 4 (not 5), and range(5, 10) generates 5, 6, 7, 8, 9 (not 10). The given end point is never part of the generated sequence.
The fencepost error (also known as the fencepost problem) is a classic off-by-one error that occurs when counting intervals or divisions. A straight fence with 10 sections has 9 posts between the sections but 11 posts in total, one more than the number of sections; the off-by-one error leads people to conclude that only 10 posts are needed. The analogy illustrates the confusion between counting items and counting the boundaries between them.
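The arithmetic behind the analogy is simple enough to state directly (illustrative helper names):

```c
#include <assert.h>

/* Fencepost arithmetic for a straight fence:
   n sections need n + 1 posts in total,
   and have n - 1 posts strictly between the sections. */
static int total_posts(int sections)    { return sections + 1; }
static int interior_posts(int sections) { return sections - 1; }
```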
A common off-by-one error occurs when allocating exactly enough space for n pointers but then writing a NULL sentinel at position n (one past the end). For example: Widget **list = malloc(n * sizeof(Widget*)); list[n] = NULL; writes one element past the allocated buffer. You must allocate space for n+1 pointers if you need to store n items plus a NULL terminator.
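A corrected version of the pattern might look like this sketch (names are illustrative; error handling kept minimal):

```c
#include <stdlib.h>
#include <assert.h>

/* Build a NULL-terminated list of pointers to n items.
   Allocates n + 1 slots so list[n] is in bounds for the sentinel. */
static int **make_terminated_list(int *items, size_t n) {
    int **list = malloc((n + 1) * sizeof *list);  /* n + 1, not n */
    if (!list) return NULL;
    for (size_t i = 0; i < n; i++)
        list[i] = &items[i];
    list[n] = NULL;  /* sentinel occupies the extra slot */
    return list;
}

/* Walk the list until the sentinel, as consumers of such lists do. */
static size_t list_length(int **list) {
    size_t n = 0;
    while (list[n] != NULL) n++;
    return n;
}
```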
When strncpy() is called with a count equal to or greater than the source string length, it copies the entire source string including the null terminator, then pads the destination with additional null characters up to count bytes. However, if count is exactly the destination buffer size and the source string is that long or longer, the destination will not be null-terminated, creating a potential off-by-one vulnerability.
Using <= when you should use < is a classic off-by-one error. When iterating an array of length n, the condition should be i < n (not i <= n) because valid indices are 0 through n-1. Using i <= n would attempt to access index n, which is out of bounds. This is one of the most common sources of off-by-one errors.
JavaScript's length property is always numerically greater than the highest index in the array. If the highest index is 4, then length will be 5. This means if you create an array and set element at index 100, the length becomes 101, even if indices 0-99 are empty. The length value is an unsigned 32-bit integer that must be less than 2^32.
Accessing a JavaScript array at an index greater than or equal to its length returns undefined. However, if you assign a value to an index beyond the current length, the array automatically extends and the length property increases to reflect the new highest index plus one.
In Go, attempting to access an array or slice with an out-of-bounds index causes a runtime panic. For example, if you have an array with 256 bytes indexed from 0 through 255, attempting to access index 256 or higher will crash the program. Go does not allow out-of-bounds access to succeed silently.
A common off-by-one error with strncpy() occurs when the count parameter equals the destination buffer size. In this case, if the source string length is greater than or equal to count, strncpy() will not null-terminate the destination string. To ensure null termination, either allocate count+1 bytes and set dest[count] = '\0' after the copy, or pass size-1 as the count and write a null terminator into the last byte yourself.
To iterate from start to end inclusive and avoid off-by-one errors, the number of iterations should be (end - start + 1), or the loop condition should be i <= end when starting at i=start. For example, to iterate from 1 to 10 inclusive, you need 10 iterations (10 - 1 + 1 = 10), not 9. This is a common source of confusion when counting versus indexing.
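A short Python check of the inclusive-range arithmetic (values are illustrative): `range`'s stop bound is exclusive, so iterating start..end inclusive requires `end + 1`.

```python
start, end = 1, 10

# range's stop is exclusive, so add 1 to include `end`.
values = list(range(start, end + 1))   # 1, 2, ..., 10
count = end - start + 1                # number of iterations

assert len(values) == count == 10
assert values[0] == 1 and values[-1] == 10
```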
The strncat() function in C automatically appends a null terminator after copying the specified number of characters. Therefore, when calculating the buffer size needed, you must account for this additional null character. The destination buffer must have enough space for both the source string (up to count characters) and the terminating null character.
An off-by-one error is a type of bug where a program calculates or uses an incorrect maximum or minimum value that is exactly one more or one less than the correct value. This commonly occurs in array indexing, loop boundaries, and buffer operations.
Off-by-one errors can lead to several serious consequences: crashes and program termination (DoS), memory corruption, infinite loops caused by a loop index that never reaches its exit condition, buffer overflows that may allow arbitrary code execution, undefined behavior, and data corruption. In security contexts, these errors can enable attackers to bypass protection mechanisms or execute unauthorized commands.
For a std::array with N elements, valid indices are from 0 to N-1. Accessing an element at index N or higher using the operator[] results in undefined behavior. The bounds-checking at() method should be used instead for safe access, as it throws std::out_of_range for invalid indices.
Ruby's Array#fetch method raises an IndexError when accessing an out-of-bounds index, rather than returning nil like the regular [] operator. You can also provide a default value as a second argument to fetch, which will be returned instead of raising an error if the index is out of bounds. This makes fetch useful for catching potential off-by-one errors during development.
In Ruby, negative indices count backwards from the end of the array. Index -1 refers to the last element, -2 refers to the second-to-last element, and so on. A negative index is valid if its absolute value is not larger than the array size. For a 3-element array, valid negative indices are -1 through -3, while -4 is out of range.
When a JavaScript array's length property is set to a value smaller than its current length, the array is truncated. Any elements beyond the new length are deleted. For example, if an array has 5 elements and you set length to 3, the elements at indices 3 and 4 are permanently removed.
The fencepost analogy illustrates off-by-one errors: if you want to build a straight fence 100 feet long with fenceposts every 10 feet, you need 11 fenceposts (at positions 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100), not 10. The confusion arises from whether you're counting the intervals between posts or the posts themselves. This is why it's called a "fencepost error."
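The fence arithmetic above can be checked in a few lines of Python (a sketch using the same 100-foot, 10-foot-spacing numbers):

```python
length_ft, spacing_ft = 100, 10

# Posts sit at both ends of every section, so posts = sections + 1.
positions = list(range(0, length_ft + 1, spacing_ft))
sections = length_ft // spacing_ft

assert sections == 10
assert len(positions) == sections + 1   # 11 posts, not 10
assert positions[0] == 0 and positions[-1] == 100
```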
In Java, if an array has length n, the valid index range is from 0 to n-1, inclusive. For example, an array with 10 elements can be accessed using indices 0 through 9. Attempting to access index n or higher results in an ArrayIndexOutOfBoundsException.
To safely access array elements when the index might be out of bounds, you should:
1) Check that the index is greater than or equal to 0 and less than the array length before accessing
2) Use bounds-checking access methods like C++ std::array::at(), Java's built-in bounds checking, or Ruby's fetch method
3) Use exception handling to catch out-of-bounds accesses
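Two of these options sketched in Python (the helper names `safe_get` and `safe_get_eafp` are hypothetical, not a standard API):

```python
def safe_get(seq, index, default=None):
    # Option 1: explicit bounds check before access (non-negative indices only).
    if 0 <= index < len(seq):
        return seq[index]
    return default

def safe_get_eafp(seq, index, default=None):
    # Option 3: exception handling ("easier to ask forgiveness than permission").
    try:
        return seq[index]
    except IndexError:
        return default
```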
Common Bug Patterns > String literal vs regex matching
27 questions
Python's in operator performs substring checking and returns a boolean. It's semantically clear and highly performant. re.search() or re.match() with a string literal pattern is significantly slower and can behave unexpectedly with special characters. The in operator is the idiomatic Python approach for literal substring checks.
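A minimal sketch of the two approaches (sample text is illustrative):

```python
import re

text = "error: connection refused"

# Idiomatic literal substring check:
assert "connection" in text

# Regex equivalent: slower, and the pattern must be escaped to stay literal.
assert re.search(re.escape("connection"), text) is not None
```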
Java distinguishes clearly between literal and regex methods:
- Literal: String.contains(), String.startsWith(), String.endsWith(), String.equals(), String.indexOf()
- Regex: String.matches(), Pattern.matches(), Matcher.find()
Using String.matches("literal") is inefficient compared to String.contains("literal") because matches() compiles the argument as a regex pattern and requires the entire string to match (implicitly adding ^ and $ anchors).
R's base package uses the same functions for both modes, controlled by a parameter:
- Literal: grepl(), grep() with fixed = TRUE
- Regex: grepl(), grep() with fixed = FALSE (the default)
Always use fixed = TRUE for literal matching to avoid regex interpretation. For example, grepl("pattern", text, fixed = TRUE) is faster and safer than grepl("pattern", text).
Go's standard library clearly separates literal and regex operations:
- Literal: strings.Contains(), strings.HasPrefix(), strings.HasSuffix(), strings.EqualFold()
- Regex: regexp.MustCompile(), regexp.MatchString(), regexp.Match()
The strings package functions are O(n) and have no allocation overhead, while regexp functions require compiling the pattern (which can be cached with MustCompile) and have significantly higher constant factors.
JavaScript's String.prototype.startsWith() checks if a string begins with a specified literal substring. The regex alternative would be /^pattern/.test(str). The string method is:
- Faster (no regex compilation or state machine)
- Clearer in intent
- Free of escaping concerns for special characters
- Supported everywhere ES6 is (all modern browsers)
For prefix checking, always prefer startsWith() over regex.
str.replace(old, new) replaces literal substring occurrences. re.sub(pattern, repl, string) replaces regex pattern matches with a replacement string. Key differences: - str.replace() is ~10-100x faster for literal replacements - str.replace() doesn't interpret special characters - re.sub() supports capture groups, backreferences, and callbacks in replacements - re.sub() can behave unexpectedly if the pattern contains regex metacharacters
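The metacharacter pitfall in Python, sketched with an illustrative string:

```python
import re

s = "price: 1.50, 1x50"

# Literal replacement: '.' is just a dot, so only "1.50" changes.
assert s.replace("1.50", "2.00") == "price: 2.00, 1x50"

# Naive regex replacement: '.' matches any character, so "1x50" changes too.
assert re.sub("1.50", "2.00", s) == "price: 2.00, 2.00"

# re.sub earns its overhead when you need groups and backreferences:
assert re.sub(r"(\d+)\.(\d+)", r"\1,\2", "1.50") == "1,50"
```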
C# and .NET provide separate methods:
- Literal: String.Contains(), String.StartsWith(), String.EndsWith(), String.IndexOf()
- Regex: Regex.IsMatch(), Regex.Match(), Regex.Matches()
Use String.Contains() (available in .NET Core 2.1+ and .NET Standard 2.1+) or String.IndexOf() != -1 for literal checks. Using Regex.IsMatch(input, "literal") is inefficient and can cause issues if the literal contains special regex characters.
C++11 introduced the <regex> library with std::regex_match(), std::regex_search(), and std::regex_replace(). For literal checks:
- Use std::string::find() != std::string::npos for substring checking
- Use std::string::compare() for equality/prefix/suffix checks
- Use std::string::starts_with() / ends_with() in C++20
Regex functions compile patterns (std::regex construction), which is expensive and should be cached if reused.
Use re.escape() to properly escape special characters: re.match(re.escape(user_pattern), text). This function automatically escapes all special regex metacharacters (. becomes \., * becomes \*, etc.) so the string is treated as a literal. However, if you're doing literal matching, it's still better to use str.startswith(), str.endswith(), or the in operator instead.
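A short demonstration of re.escape() neutralizing metacharacters (the file name is illustrative):

```python
import re

user_pattern = "report(2024).txt"

# re.escape backslash-escapes every regex metacharacter: ( ) . here.
escaped = re.escape(user_pattern)
assert escaped == r"report\(2024\)\.txt"

# The escaped pattern now matches only the literal text:
assert re.match(escaped, "report(2024).txt is ready") is not None

# But a plain literal method is still simpler and faster:
assert "report(2024).txt is ready".startswith(user_pattern)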
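A short demonstration of re.escape() neutralizing metacharacters (the file name is illustrative):

```python
import re

user_pattern = "report(2024).txt"

# re.escape backslash-escapes every regex metacharacter: ( ) . here.
escaped = re.escape(user_pattern)
assert escaped == r"report\(2024\)\.txt"

# The escaped pattern now matches only the literal text:
assert re.match(escaped, "report(2024).txt is ready") is not None

# But a plain literal method is still simpler and faster:
assert "report(2024).txt is ready".startswith(user_pattern)
```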
String.prototype.includes() (ES6) is significantly faster and more appropriate for literal substring checking than RegExp.test(). For example, text.includes("pattern") is 5-50x faster than /pattern/.test(text) for literal patterns. The includes() method also avoids special character interpretation issues that occur with regex.
When using regex functions with user input or literal strings, special regex metacharacters can cause unexpected behavior: . ^ $ * + ? { } [ ] \ | ( ). For example, re.match("file.txt", filename) would match "fileXtxt" because . matches any character. Similarly, re.match("a+b", "aab") would match because + is a quantifier, not a literal plus sign.
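Both examples from this answer, verified in Python:

```python
import re

# '.' matches any character, so the regex is broader than the literal:
assert re.match("file.txt", "fileXtxt") is not None

# '+' is a quantifier, so "a+b" means "one or more 'a', then 'b'":
assert re.match("a+b", "aab") is not None
assert "a+b" not in "aab"            # the literal string is absent

# Escaping restores the literal meaning:
assert re.match(re.escape("a+b"), "aab") is None
assert re.match(re.escape("a+b"), "a+b") is not None
```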
Most languages provide dedicated methods:
- Python: str.startswith(), str.endswith()
- JavaScript: str.startsWith(), str.endsWith()
- Java: String.startsWith(), String.endsWith()
- C#: String.StartsWith(), String.EndsWith()
- Go: strings.HasPrefix(), strings.HasSuffix()
- Rust: str.starts_with(), str.ends_with()
These methods are optimized, handle edge cases correctly, and clearly express intent compared to regex alternatives like ^pattern or pattern$.
Rust's standard library and ecosystem clearly separate concerns:
- Literal: str.contains(), str.starts_with(), str.ends_with() (in std)
- Regex: regex::Regex::is_match(), regex::Regex::find() (in the regex crate)
The str.contains() method is significantly faster and requires no external dependencies. The regex crate must be added to Cargo.toml, and patterns must be compiled with Regex::new(), which can fail at runtime.
Swift provides:
- Literal: String.contains(), String.hasPrefix(), String.hasSuffix(), String.firstIndex(of:)
- Regex: String.range(of:options:range:) with the .regularExpression option, or the new Regex type (Swift 5.7+)
Using range(of: "literal", options: .regularExpression) is inefficient; use contains() for simple substring checks. Swift 5.7 introduced a first-class Regex type with compile-time safety, but literal methods are still preferred for simple cases.
re.match() only checks for a match at the beginning of the string, while re.fullmatch() requires the entire string to match the pattern. Both are regex operations. For literal checks:
- Use text.startswith("literal") instead of re.match("literal", text)
- Use text == "literal" instead of re.fullmatch("literal", text)
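The anchoring difference, sketched with an illustrative string:

```python
import re

# re.match anchors only at the start; re.fullmatch must consume everything.
assert re.match("lit", "literal") is not None        # prefix match succeeds
assert re.fullmatch("lit", "literal") is None        # whole string must match
assert re.fullmatch("literal", "literal") is not None

# Literal equivalents, preferred for plain strings:
assert "literal".startswith("lit")
assert "literal" == "literal"
```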
To match a literal dot character in regex, use \. (backslash-dot). However, in string literals the backslash itself may need escaping:
- Python: r"\." or "\\."
- Java: "\\."
- JavaScript: /\./ as a regex literal, or new RegExp("\\.")
- C#: "\\." or the verbatim string @"\."
An unescaped dot "." matches any character except newline, which is a common bug when developers intend to match a literal period (e.g., in file extensions).
In Ruby, "text" =~ /pattern/ is the idiomatic regex matching operator. However, using /literal/.match(text) or "text" =~ /literal/ for simple literal checks is inefficient. Ruby provides String#include?, String#start_with?, String#end_with?, and == for literal comparisons. The =~ operator also sets special global variables ($~, $1, $2, etc.), which adds overhead.
PHP provides distinct function families:
- Literal: str_contains() (PHP 8+), str_starts_with() (PHP 8+), str_ends_with() (PHP 8+), strpos(), strstr()
- Regex: preg_match(), preg_match_all(), preg_replace()
Using preg_match('/literal/', $string) is inefficient compared to str_contains($string, 'literal'). The PCRE engine used by the preg_* functions has significant overhead, and special characters in the pattern need proper escaping.
String.matches() in Java treats the entire string as a regex pattern and requires the whole input to match (it wraps the pattern in ^ and $). String.contains() checks if a literal substring exists anywhere. For example:
- "hello world".matches("world") returns false (the entire string doesn't match)
- "hello world".contains("world") returns true
- "hello world".matches(".*world.*") returns true (regex needed for partial match)
When using regex functions with user-provided strings as patterns:
- The user can inject malicious regex patterns causing ReDoS (Regular Expression Denial of Service)
- Special characters can cause unexpected matches
- Unintended pattern interpretation may bypass validation
Example: if code does re.match(user_input, text) and the user provides (a+)+b, it can cause catastrophic backtracking. Always use re.escape(), or prefer literal string methods when dealing with user input.
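A minimal sketch of the mitigation: escape user input so it is treated as data, not as a pattern (the helper name `matches_user_literal` is hypothetical):

```python
import re

def matches_user_literal(user_input: str, text: str) -> bool:
    # Escaping before compiling prevents ReDoS and surprise metacharacters.
    return re.search(re.escape(user_input), text) is not None

# "(a+)+b" is a pathological *pattern*; escaped, it is just seven literal characters.
assert matches_user_literal("(a+)+b", "prefix (a+)+b suffix")
assert not matches_user_literal("(a+)+b", "aaaaab")
```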
This bug occurs when developers use regular expressions with string literal patterns (e.g., re.match("literal", text)) instead of using direct string comparison methods (e.g., text == "literal" or str.contains("literal")). This pattern is inefficient because regex engines incur significant overhead even for simple literal matching, and can lead to unexpected behavior with special regex characters.
Databases provide different operators:
- SQL: LIKE or = for literal, REGEXP/SIMILAR TO for regex (PostgreSQL, MySQL)
- MongoDB: $eq for literal, $regex for pattern matching
- Elasticsearch: term query for literal, regexp query for patterns
Literal matching is always significantly faster and can use indexes. Regex matching typically cannot use indexes (except prefix matching) and requires full document scans.
String.prototype.match() accepts a regex and returns an array with match details or null, while String.prototype.includes() accepts a string and returns a boolean. Using match() for literal strings (e.g., "hello".match("ell")) is inefficient because it internally creates a RegExp object. The includes() method (ES6) or indexOf() (pre-ES6) should be used for literal substring checks.
Most languages provide both literal and regex-based splitting:
- Python: str.split() (literal) vs re.split() (regex)
- JavaScript: String.split() (literal or regex depending on the argument)
- Java: String.split() (regex) vs StringTokenizer (literal)
Using a regex split with literal patterns is inefficient. For example, in Java, text.split(",") compiles a regex pattern. Python's str.split() is faster and doesn't interpret special characters.
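The Python pair in action, with an illustrative delimiter that happens to be a metacharacter:

```python
import re

line = "a.b.c"

# str.split treats the separator literally:
assert line.split(".") == ["a", "b", "c"]

# re.split interprets it as a pattern; '.' matches every character here,
# leaving only the empty strings between matches:
assert re.split(".", line) == ["", "", "", "", "", ""]

# Escaping (or just str.split) gives the intended behavior:
assert re.split(re.escape("."), line) == ["a", "b", "c"]
```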
Recommended approaches by language:
- Python: value in ["option1", "option2", "option3"], or a set for larger collections
- JavaScript: ["option1", "option2", "option3"].includes(value)
- Java: Set.of("option1", "option2", "option3").contains(value) or a switch statement
- Go: map[string]bool{"option1": true, "option2": true}[value]
All of these are O(n) or O(1) and avoid regex overhead. A regex like /^(option1|option2|option3)$/ is less readable and slower.
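The Python set variant sketched out (the option names and helper are illustrative):

```python
# Set membership is O(1) on average and reads as intent, not as a pattern.
ALLOWED = {"option1", "option2", "option3"}

def is_allowed(value: str) -> bool:
    return value in ALLOWED

assert is_allowed("option2")
assert not is_allowed("option2 ")   # no accidental partial or regex matching
```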
String.equals() performs exact character-by-character comparison (O(n) where n is the string length). String.matches() interprets its argument as a regex and requires a full-string match. For equality checks:
- str.equals("literal") - correct and fast
- str.matches("literal") - inefficient, and incorrect if "literal" contains special regex characters
Troubleshooting
1 question
Based on the log files examined, the "DNS resolution error" classification is a false positive: the curl failure is real, but it is an argument-parsing problem, not a DNS problem.
The Issue
The command curl --version is executing, but the error pattern matcher flags its output as a DNS resolution failure because of the text "Could not resolve host: --version". There is no DNS problem here: curl never recognized --version as an option and instead tried to resolve it as a hostname.
Looking at the actual agent log:
curl --version; tmux wait -S done
curl: (6) Could not resolve host: --version
This shows that curl is treating --version as a URL to fetch rather than as an option, so it tries to resolve it as a hostname. curl --version should print version information directly, so something in the environment is preventing the argument from being parsed as a flag.
Root Cause
This happens when:
- The argument is not literally --version - for example, a copy-pasted Unicode dash (–version) is not recognized as an option, so curl treats the token as a URL
- Shell aliasing issues - something has wrapped or aliased curl and mangles its arguments
- PATH issues - a different curl binary (or a shim script) is being executed
Real Solutions
If --version is being treated as a hostname:
Use absolute path to curl:
/usr/bin/curl --version
Check for curl aliases:
type curl
alias curl
which curl
Bypass shell functions and aliases with the command builtin:
command curl --version
dpkg -l curl # Debian/Ubuntu
rpm -qa curl # RHEL/CentOS
Note: If this is from TerminalBench automation, the issue is likely in how the agent sends commands to the container - it may need to quote or escape flags properly, and ensure the dashes are plain ASCII hyphens rather than typographic dashes.