Lambda SnapStart FAQ & Answers

22 expert Lambda SnapStart answers researched from official documentation. Every answer cites authoritative sources you can verify.

containers

4 questions
A

Lambda uses a three-tier caching system for container images: (1) an L1 cache on each Lambda worker (in-memory, about 67% of chunks are served from here), (2) a distributed cache, and (3) Amazon S3 as durable storage. Images are loaded lazily: on average only about 6.4% of image bytes are read at startup (Harter et al. research). Later cold starts on the same worker reuse locally cached chunks (500ms+ faster). AWS reports that these optimizations improved container cold starts by up to 15x.

99% confidence
A

Optimization strategies: (1) Use AWS base images (optimized for Lambda), (2) Multi-stage Docker builds - build in large image, copy only artifacts to slim final image (60-80% size reduction), (3) Order Dockerfile from stable to frequently changing (better layer caching), (4) Store multiple functions in one ECR repo (share large layers), (5) Minimize image size - 1GB -> 200MB = 1-2s faster cold start, (6) Use distroless/slim base images for smallest footprint.

99% confidence
A

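Workarounds for container cold starts: (1) Provisioned Concurrency - pre-warmed instances (~$0.015/GB-hour) that avoid cold starts for requests within the provisioned capacity, (2) Smaller images via multi-stage builds (1GB -> 200MB = 1-2s faster), (3) Keep functions warm with scheduled invocations (an Amazon EventBridge, formerly CloudWatch Events, rule every 5 minutes); this reduces cold starts but does not guarantee a warm instance, (4) Use ARM64 architecture (Graviton2) - often faster cold starts, (5) Increase memory allocation - more CPU = faster init, (6) Keep heavy, rarely changing dependencies in shared base image layers so they are more likely to be cached already. Lambda layers cannot be attached to container image functions.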

99% confidence
A

Lambda supports container images up to 10 GB in size. The image must be stored in Amazon ECR (same account or cross-account with permissions). While Lambda supports large images, cold start time correlates with image size due to layer pulling. Best practice: keep images as small as possible through multi-stage builds and minimal dependencies. The 250 MB limit applies only to .zip deployments, not containers.

99% confidence

comparison

3 questions
A

Use SnapStart when: (1) Unpredictable/bursty traffic patterns, (2) Cost-sensitive but need reduced cold starts, (3) Sub-second cold start is acceptable, (4) Don't need EFS or >512MB ephemeral storage. Use Provisioned Concurrency when: (1) Need guaranteed double-digit millisecond response, (2) Traffic is predictable, (3) Need EFS access, (4) Using container images or Node.js, (5) Willing to pay ~$0.015/GB-hour for always-warm instances.

99% confidence
A

Use container images when: (1) Complex dependencies require custom OS libraries (TensorFlow, PyTorch, OpenCV with native binaries), (2) You need a runtime version that Lambda does not yet offer as a managed runtime (for example, Python 3.13 or Node.js 22 before their official support), (3) Dependencies exceed the 250MB unzipped .zip limit, (4) You have a Docker-first development workflow (consistent dev/prod), (5) You have multi-step build processes. Accept the cold start tradeoff and use Provisioned Concurrency if latency is critical.

99% confidence
A

Lambda layers: 50 MB compressed per layer, 250 MB total uncompressed, 5 layers max per function, version-controlled, shareable across functions, work with SnapStart. Container images: 10 GB total, custom OS packages, native binaries, any runtime, no layer limit, but NO SnapStart support. Use layers when: dependencies < 250 MB and want SnapStart. Use containers when: need native libraries, custom runtime, or dependencies > 250 MB.

99% confidence

limitations

2 questions
A

AWS documents that SnapStart does not support container images, without publishing a detailed technical reason. SnapStart captures microVM state after Lambda's init phase (dependencies loaded, code initialized). Container images go through a separate loading path: image chunks are fetched on demand (1-5s for large images), the image file system is assembled, and the image's entrypoint is started. The likely explanation is that the snapshot mechanism, built around managed runtimes and .zip packages, does not cover this container-specific startup path.

99% confidence
A

SnapStart limitations: (1) No provisioned concurrency - the two cannot be enabled on the same function version, (2) No Amazon EFS support, (3) No ephemeral storage > 512 MB, (4) Must use published versions (or aliases that point to them), not $LATEST, (5) Container images not supported, (6) Only Java, Python, and .NET managed runtimes, (7) Python and .NET support is not available in every Region. Also: snapshot must be < 250 MB compressed, and functions with very short initialization may see minimal improvement.
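
The limitations above lend themselves to a pre-deployment check. The following is a minimal sketch, not an AWS API: the `FunctionConfig` class and `snapstart_blockers` function are hypothetical names, and the minimum runtime versions are the ones quoted in this FAQ, so they may lag behind current AWS support.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FunctionConfig:
    """Hypothetical summary of the settings that matter for SnapStart."""
    runtime: Optional[str]            # e.g. "java21", "python3.12"; None for images
    package_type: str = "Zip"         # "Zip" or "Image"
    ephemeral_storage_mb: int = 512
    uses_efs: bool = False
    provisioned_concurrency: bool = False


def _runtime_supported(runtime: Optional[str]) -> bool:
    """Check a runtime identifier against the minimums quoted in this FAQ."""
    if runtime is None:
        return False
    if runtime.startswith("java"):
        digits = runtime[len("java"):]
        return digits.isdigit() and int(digits) >= 11
    if runtime.startswith("python"):
        parts = runtime[len("python"):].split(".")
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            return False
        return (int(parts[0]), int(parts[1])) >= (3, 12)
    if runtime.startswith("dotnet"):
        digits = runtime[len("dotnet"):]
        return digits.isdigit() and int(digits) >= 8
    return False


def snapstart_blockers(cfg: FunctionConfig) -> List[str]:
    """Return the reasons SnapStart cannot be enabled; empty list means none found."""
    blockers = []
    if cfg.package_type == "Image":
        blockers.append("container images are not supported")
    elif not _runtime_supported(cfg.runtime):
        blockers.append(f"runtime {cfg.runtime!r} is not supported")
    if cfg.uses_efs:
        blockers.append("Amazon EFS is not supported")
    if cfg.ephemeral_storage_mb > 512:
        blockers.append("ephemeral storage above 512 MB is not supported")
    if cfg.provisioned_concurrency:
        blockers.append("cannot be combined with provisioned concurrency")
    return blockers
```

For example, `snapstart_blockers(FunctionConfig(runtime="java21"))` returns an empty list, while a Node.js function with EFS attached reports two blockers.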

99% confidence

uniqueness

2 questions
A

SnapStart reuses the same snapshot across execution environments, so random values generated during init are duplicated. Solutions: (1) Generate random numbers in handler, not init, (2) Use cryptographically secure generators - java.security.SecureRandom (snap-resilient), not java.util.Random, (3) Use OS sources (/dev/random, /dev/urandom) which are snap-resilient, (4) AWS updated OpenSSL's RAND_bytes for SnapStart compatibility. Avoid: PRNGs seeded during init, UUIDs generated at init.
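
A minimal Python sketch of the first solution, generating unique values inside the handler instead of at init. The handler name and return shape are illustrative assumptions. `uuid.uuid4()` and the `secrets` module both draw from the operating system's random source rather than from a PRNG seeded at import time.

```python
import secrets
import uuid

# Anti-pattern (left commented out): module-level code runs during init, so
# this value would be frozen into the snapshot and shared by every restored
# execution environment.
# INIT_TIME_ID = str(uuid.uuid4())


def handler(event, context):
    # Generated per invocation, after restore, so each call gets fresh values.
    request_id = str(uuid.uuid4())
    token = secrets.token_hex(16)  # 16 random bytes as 32 hex characters
    return {"request_id": request_id, "token": token}
```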

99% confidence
A

Network connections established during init may be stale or closed when Lambda resumes from snapshot. Solutions: (1) Re-establish connections in handler as needed, (2) Use AWS SDKs (already updated to handle this), (3) Implement connection validation before use, (4) Use afterRestore runtime hook to re-establish connections. The connection state isn't guaranteed - always code defensively for reconnection scenarios.
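
Connection validation before use (solution 3) can be sketched as a small wrapper. This is an illustration under stated assumptions: `ValidatedConnection` is a hypothetical helper, and the `connect` and `is_alive` callables stand in for whatever your database or HTTP client provides.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class ValidatedConnection(Generic[T]):
    """Hands out a connection, reconnecting when the cached one looks stale."""

    def __init__(self, connect: Callable[[], T], is_alive: Callable[[T], bool]):
        self._connect = connect
        self._is_alive = is_alive
        self._conn: Optional[T] = None

    def get(self) -> T:
        # A connection opened during init may have been closed by the remote
        # side while the snapshot sat in storage, so check it before each use.
        if self._conn is None or not self._is_alive(self._conn):
            self._conn = self._connect()
        return self._conn

    def invalidate(self) -> None:
        # Drop the cached connection, e.g. from a restore hook.
        self._conn = None
```

The handler calls `get()` on every invocation, so a connection that went stale between snapshot and restore is replaced transparently.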

99% confidence

hooks

2 questions
A

Runtime hooks let you run code before snapshot (beforeCheckpoint) and after restore (afterRestore). Use cases: (1) beforeCheckpoint - release resources, close connections, delete sensitive data from memory, (2) afterRestore - re-establish connections, regenerate unique IDs, update configuration. Java uses CRaC (Coordinated Restore at Checkpoint) API. Python uses the open-source Snapshot Restore for Python library (included in managed runtime).

99% confidence
A

CRaC is an open-source project that provides APIs for Java applications to coordinate checkpoint/restore operations. Lambda uses CRaC for SnapStart runtime hooks. Implement org.crac.Resource interface with beforeCheckpoint() and afterRestore() methods. Register resources with Core.getGlobalContext().register(this). CRaC ensures your code can clean up before snapshot and reinitialize after restore.

99% confidence

architecture

2 questions
A

Firecracker is an open-source virtualization technology developed by AWS using KVM to run microVMs. Each Lambda execution environment is an isolated Firecracker microVM. MicroVMs are lightweight (~125ms startup), provide strong security isolation, and enable SnapStart's snapshot/restore capability. Firecracker manages memory and disk state independently, allowing Lambda to capture and restore complete execution environment state.

99% confidence
A

Lambda divides snapshots into 512 KB chunks for optimized retrieval. Three cache layers: (1) L1 - on Lambda worker nodes (fastest, ~1ms per chunk, 67% hit rate), (2) L2 - distributed cache (single-digit ms per chunk), (3) Amazon S3 - durable storage (hundreds of ms per chunk). Lambda routes each chunk request to the fastest cache layer that holds the data. Frequently used snapshots stay in L1, reducing restore time to milliseconds.
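
The chunking and tier fallback can be illustrated with a toy model. This is a conceptual sketch only, not Lambda's implementation: the dictionaries stand in for the three tiers, and copying a chunk into the faster tiers after a miss is an assumption about how such a cache would typically behave.

```python
CHUNK_SIZE = 512 * 1024  # 512 KB, the chunk size quoted above


def chunk_count(snapshot_bytes: int) -> int:
    """Number of chunks needed to hold a snapshot (ceiling division)."""
    return -(-snapshot_bytes // CHUNK_SIZE)


def fetch_chunk(chunk_id, l1: dict, l2: dict, s3: dict):
    """Return (data, tier), checking the fastest tier first."""
    if chunk_id in l1:
        return l1[chunk_id], "L1"
    if chunk_id in l2:
        l1[chunk_id] = l2[chunk_id]      # assumption: promote to L1 on an L2 hit
        return l1[chunk_id], "L2"
    data = s3[chunk_id]                  # durable origin always has the chunk
    l2[chunk_id] = data                  # assumption: fill both caches on a miss
    l1[chunk_id] = data
    return data, "S3"
```

Under this model a 200 MB snapshot splits into 400 chunks, and the second request for the same chunk on the same worker is served from L1.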

99% confidence

overview

1 question
A

Lambda SnapStart reduces cold starts by creating a Firecracker microVM snapshot after function initialization. When you publish a function version, Lambda initializes it, takes an encrypted snapshot of memory and disk state, and caches it across three layers: Amazon S3 (durable), distributed cache (single-digit ms per 512KB chunk), and local worker cache (1ms per 512KB chunk). When invoked, Lambda resumes from snapshot instead of initializing from scratch, reducing cold starts from seconds to sub-second.

99% confidence

requirements

1 question
A

SnapStart supports: Java 11 and later (since November 2022), Python 3.12 and later (since November 2024), and .NET 8 and later (since November 2024). NOT supported: Node.js, Ruby, OS-only (custom) runtimes, and container images. Regional availability for Python/.NET at launch: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Singapore, Sydney, Tokyo), Europe (Frankfurt, Ireland, Stockholm). Check the current AWS documentation for Regions added since then.

99% confidence

tools

1 question
A

The SnapStart scanning tool is an open-source SpotBugs plugin that runs static analysis to detect code that may break uniqueness assumptions. It identifies: (1) Use of java.util.Random instead of SecureRandom, (2) UUID generation during init, (3) Secrets/tokens cached at init, (4) Network connections established at init. Run during build to catch issues before deployment. Install via Maven/Gradle SpotBugs plugin.

99% confidence

pricing

1 question
A

For Java managed runtimes: SnapStart is FREE - no additional charges. For Python 3.12+ and .NET 8+ (US East N. Virginia): $0.0000015046 per GB-second for caching snapshots and $0.0001397998 per GB restored, both based on the memory allocated to the function. Caching is charged for as long as the function version is active (minimum 3 hours). Compare to Provisioned Concurrency at ~$0.015/GB-hour, which is significantly more expensive for sporadic traffic patterns.
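
To make the comparison concrete, here is a rough cost sketch using only the rates quoted above. The function names are hypothetical, the rates apply to US East (N. Virginia) and may change, and the figures exclude normal request and duration charges.

```python
CACHE_RATE_PER_GB_SECOND = 0.0000015046   # Python/.NET snapshot caching
RESTORE_RATE_PER_GB = 0.0001397998        # Python/.NET, per GB restored
MIN_CACHE_SECONDS = 3 * 60 * 60           # 3-hour minimum for caching
PROVISIONED_RATE_PER_GB_HOUR = 0.015      # approximate, as quoted above


def snapstart_cost(memory_gb: float, active_seconds: float, restores: int) -> float:
    """Estimated SnapStart charge for one Python/.NET function version."""
    billed_seconds = max(active_seconds, MIN_CACHE_SECONDS)
    cache = memory_gb * billed_seconds * CACHE_RATE_PER_GB_SECOND
    restore = memory_gb * restores * RESTORE_RATE_PER_GB
    return cache + restore


def provisioned_concurrency_cost(memory_gb: float, hours: float, instances: int = 1) -> float:
    """Estimated charge for keeping `instances` environments provisioned."""
    return memory_gb * hours * instances * PROVISIONED_RATE_PER_GB_HOUR


if __name__ == "__main__":
    month_seconds = 30 * 24 * 3600
    print(f"SnapStart:   ${snapstart_cost(1.0, month_seconds, 1000):.2f}")
    print(f"Provisioned: ${provisioned_concurrency_cost(1.0, 30 * 24):.2f}")
```

With these rates, a 1 GB function kept active for 30 days with 1,000 restores costs roughly $4.04 for SnapStart, against roughly $10.80 for a single provisioned instance over the same period.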

99% confidence

performance

1 question
A

SnapStart typically achieves 70-90% cold start reduction. Java Spring Boot example: 4.5s -> 400ms (91% reduction). General results: 6.2s -> 1.9s (70% reduction) for complex apps. Best results with heavy init (framework loading, dependency injection). Minimal improvement for functions with short init phases. Performance varies by function complexity, memory allocation, and cache state.
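
The reduction percentages above follow from a one-line formula, shown here as a small helper (the function name is illustrative):

```python
def cold_start_reduction(before_ms: float, after_ms: float) -> float:
    """Percentage reduction in cold start time."""
    return (before_ms - after_ms) / before_ms * 100


print(round(cold_start_reduction(4500, 400), 1))   # Spring Boot example: 91.1
print(round(cold_start_reduction(6200, 1900), 1))  # complex app example: 69.4
```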

99% confidence

migration

1 question
A

Migration path: (1) Extract dependencies to Lambda layers (/opt/java/lib), (2) Package application code as .zip, (3) Update deployment to use .zip + layers instead of container, (4) Enable SnapStart on function configuration, (5) Publish version (required for SnapStart), (6) Test cold start improvement (expect 75-90% reduction). May require refactoring if you depend on custom OS packages not available in Lambda runtime.

99% confidence

setup

1 question
A

To enable SnapStart: (1) Console: Function configuration -> Edit -> SnapStart -> select PublishedVersions, (2) CloudFormation: Set the function's SnapStart.ApplyOn property to 'PublishedVersions' (CDK exposes the same setting on the function construct), (3) AWS CLI: aws lambda update-function-configuration --function-name MyFunc --snap-start ApplyOn=PublishedVersions. After enabling, you MUST publish a new version - SnapStart only works on published versions, not $LATEST. Verify with aws lambda get-function-configuration --function-name MyFunc --qualifier <version> and check SnapStart.OptimizationStatus in the response.
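
The same steps can be scripted with boto3. This is a sketch under stated assumptions: `enable_snapstart_and_publish` is a hypothetical helper, the Lambda client is passed in so the code can be exercised with a stub, and error handling is omitted.

```python
def enable_snapstart_and_publish(client, function_name: str):
    """Enable SnapStart, publish a version, and return (version, status).

    `client` is expected to be a boto3 Lambda client, e.g. boto3.client("lambda").
    """
    client.update_function_configuration(
        FunctionName=function_name,
        SnapStart={"ApplyOn": "PublishedVersions"},
    )
    # Wait for the configuration update to finish before publishing.
    client.get_waiter("function_updated").wait(FunctionName=function_name)

    # SnapStart applies to published versions only, never to $LATEST.
    version = client.publish_version(FunctionName=function_name)["Version"]

    config = client.get_function_configuration(
        FunctionName=function_name, Qualifier=version
    )
    return version, config["SnapStart"]["OptimizationStatus"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(enable_snapstart_and_publish(boto3.client("lambda"), "MyFunc"))
```

Publishing can take longer than usual with SnapStart enabled because Lambda initializes the function and takes the snapshot at that point.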

99% confidence