AWS Lambda
What is Lambda?
- Serverless compute — run code without provisioning or managing servers
- Event-driven — executes in response to triggers (S3, API Gateway, Kinesis, DynamoDB Streams, SNS, SQS, etc.)
- FaaS (Function as a Service) — you only write the function logic
- AWS manages: provisioning, scaling, patching, availability
- Pay per invocation + duration — free tier: 1M requests/month + 400,000 GB-seconds compute
Key limits:
| Parameter | Limit |
|---|---|
| Max execution timeout | 15 minutes |
| Memory | 128 MB – 10 GB |
| Ephemeral storage (/tmp) | 512 MB – 10 GB |
| Deployment package (zip) | 50 MB (250 MB unzipped) |
| Container image size | 10 GB |
| Concurrent executions (default) | 1,000 per region (soft limit) |
Execution Model
```mermaid
graph LR
    Trigger[Event Trigger<br/>S3 / API GW / Kinesis] --> Lambda[Lambda Function<br/>Init + Execute]
    Lambda --> Result[Response /<br/>Downstream Action]
    subgraph CS[Cold Start]
        CS1[Download code] --> CS2[Start runtime] --> CS3[Run init code]
    end
    subgraph WS[Warm Start]
        WS1[Reuse existing<br/>execution environment]
    end
    style CS fill:#fee2e2,stroke:#dc2626
    style WS fill:#dcfce7,stroke:#16a34a
```
Cold Start vs Warm Start
| | Cold Start | Warm Start |
|---|---|---|
| When | First invocation / after idle | Reused execution environment |
| Latency | Higher (100 ms–1 s+) | Near-zero |
| Mitigation | Provisioned Concurrency (keeps envs warm), SnapStart (Java) | Not needed (already warm) |
Exam tip: Provisioned Concurrency = pre-initialized execution environments. Costs money even when idle. Use for latency-sensitive workloads (e.g., SageMaker endpoint wrapper).
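The practical consequence of this model: put expensive setup (SDK clients, model loads) at module scope so only cold starts pay for it. A minimal sketch of how init-time state survives across warm invocations:

```python
import time

# Anything at module scope runs once per execution environment (cold start)
# and is then reused by every warm invocation of that environment.
_INIT_TIME = time.time()   # set during init, i.e. on cold start only
_INVOCATIONS = 0

def lambda_handler(event, context):
    global _INVOCATIONS
    _INVOCATIONS += 1
    return {
        "cold_start": _INVOCATIONS == 1,   # True only for the first call
        "env_age_s": round(time.time() - _INIT_TIME, 3),
    }
```

Locally, calling the handler twice shows the second call reusing the "warm" module state.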
Triggers (Event Sources)
```mermaid
graph TD
    Triggers --> Sync[Synchronous<br/>Waits for response]
    Triggers --> Async[Asynchronous<br/>Fire and forget]
    Triggers --> Stream[Stream / Queue<br/>Polling-based]
    Sync --> APIGW[API Gateway]
    Sync --> ALB[ALB]
    Sync --> Cognito[Cognito]
    Async --> S3[S3 Events]
    Async --> SNS[SNS]
    Async --> EventBridge[EventBridge]
    Stream --> Kinesis[Kinesis Data Streams]
    Stream --> DDB[DynamoDB Streams]
    Stream --> SQS[SQS]
    style Sync fill:#dbeafe,stroke:#3b82f6
    style Async fill:#dcfce7,stroke:#16a34a
    style Stream fill:#f3e8ff,stroke:#9333ea
```
- Synchronous: caller waits for Lambda to return (e.g., API Gateway → Lambda → response)
- Asynchronous: Lambda queues the event, retries on failure (2 retries by default), can send to DLQ
- Stream/Queue: Lambda polls the source (Kinesis, SQS), processes in batches
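For the stream/queue model, a batch handler typically reports per-record failures so only the failed messages are retried. A sketch of an SQS batch handler (assumes `ReportBatchItemFailures` is enabled on the event source mapping; `process` is a hypothetical business-logic function):

```python
import json

def lambda_handler(event, context):
    # Lambda polls SQS and delivers a batch of records. Returning the
    # failed messageIds in batchItemFailures makes only those messages
    # reappear on the queue instead of the whole batch.
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process(message):
    # Hypothetical processing step; raises to simulate a bad record.
    if message.get("fail"):
        raise ValueError("bad record")
```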
Lambda for ML Pipelines (MLA-C01 focus)
This is where Lambda matters for the exam:
Pattern 1: S3 → Lambda → SageMaker (Trigger-based inference)
S3 (new file uploaded) → Lambda → invoke SageMaker endpoint → store result in S3/DynamoDB
Use case: Real-time batch processing — new data file lands, Lambda triggers inference job
Pattern 2: Kinesis → Lambda → Feature Store
Kinesis Data Stream → Lambda (transform/enrich) → SageMaker Feature Store / S3
Use case: Online feature engineering from streaming data before model serving
Pattern 3: API Gateway → Lambda → Bedrock
API Gateway → Lambda → Bedrock InvokeModel API → return LLM response
Use case: Serverless GenAI app — Lambda wraps Bedrock API, handles auth + prompt formatting
Pattern 4: EventBridge → Lambda → SageMaker Pipeline
EventBridge (cron) → Lambda → start SageMaker Pipeline execution
Use case: Scheduled retraining pipeline — trigger nightly model refresh without always-on compute
Exam tip: Lambda is the glue layer in ML pipelines. It connects triggers to ML services without needing an always-on server. Know these 4 patterns cold.
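Pattern 4 as a sketch: a handler EventBridge fires on a cron schedule to kick off retraining. The pipeline name is a placeholder, and the client is injectable so the function can be exercised locally without AWS:

```python
def lambda_handler(event, context, client=None):
    # EventBridge (cron) -> Lambda -> start a SageMaker Pipeline execution.
    # "nightly-retrain" is a hypothetical pipeline name.
    if client is None:
        import boto3  # bundled with the Lambda Python runtime
        client = boto3.client("sagemaker")
    response = client.start_pipeline_execution(
        PipelineName="nightly-retrain",
        PipelineExecutionDisplayName="scheduled-refresh",
    )
    return {"executionArn": response["PipelineExecutionArn"]}
```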
Lambda Layers
- Layers = additional code/libraries bundled separately and shared across functions
- Use case: Package large ML libraries (numpy, pandas, scikit-learn) once, reuse across functions
- Up to 5 layers per function
- Total unzipped size (function + layers) ≤ 250 MB
```
Lambda Function
├── function.py (your code)
└── Layers
    ├── numpy-layer (shared)
    ├── pandas-layer (shared)
    └── custom-utils-layer
```
MLA-C01 tip: When building Lambda-backed inference logic, package shared inference dependencies (e.g., preprocessing libraries) as Lambda Layers so each function's deployment zip stays small and the libraries are maintained in one place.
Lambda + VPC
- By default, Lambda runs outside your VPC (can call public AWS services via internet)
- To access RDS, ElastiCache, or private SageMaker endpoints: deploy Lambda inside a VPC
- Requires: VPC ID, subnet IDs, security group IDs
- Cold start penalty increases when Lambda is VPC-attached (ENI provisioning) — mitigated since 2019 with hyperplane ENIs
```mermaid
graph LR
    A[Lambda outside VPC] -->|Public endpoint| B[S3 / DynamoDB / Bedrock]
    C[Lambda inside VPC] -->|Private| D[RDS / ElastiCache / SageMaker Private Endpoint]
    C -->|Via NAT GW| B
```
IAM for Lambda
- Execution Role — IAM role assumed by Lambda at runtime to call other AWS services
- Example: allow `s3:GetObject`, `sagemaker:InvokeEndpoint`, `bedrock:InvokeModel`
- Resource Policy — who can invoke the Lambda function (e.g., allow API Gateway, S3, EventBridge)
Execution Role = what Lambda CAN DO
Resource Policy = who CAN CALL Lambda
Exam tip: If Lambda can't call SageMaker/Bedrock, check the execution role policy first.
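A minimal execution-role policy sketch for the ML patterns above (bucket name and resource scoping are placeholders — scope `Resource` to specific ARNs in practice):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-input-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["sagemaker:InvokeEndpoint", "bedrock:InvokeModel"],
      "Resource": "*"
    }
  ]
}
```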
Lambda Concurrency
| Type | Description |
|---|---|
| Unreserved concurrency | Default pool shared across all functions in the account |
| Reserved concurrency | Caps a function's max concurrency — also guarantees it won't be throttled by others |
| Provisioned concurrency | Pre-warms execution environments — eliminates cold starts |
- Account default: 1,000 concurrent executions per region
- Throttling → `TooManyRequestsException` (HTTP 429)
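Callers that hit the concurrency limit typically retry with exponential backoff. A sketch of a generic retry wrapper (the `invoke` callable stands in for a real `boto3` Lambda invoke; names are illustrative):

```python
import time

def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.1):
    # Retry a throttled invoke (TooManyRequestsException / HTTP 429)
    # with exponential backoff; re-raise any other error immediately.
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Exception as exc:
            if "TooManyRequests" not in str(exc) or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

The AWS SDKs do a version of this automatically; the sketch just makes the behavior explicit.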
Pricing
- Requests: $0.20 per 1M invocations (first 1M/month free)
- Duration: $0.0000166667 per GB-second (400,000 GB-seconds free/month)
- Provisioned Concurrency: charged per GB-second of provisioned capacity (even when idle)
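Putting the request and duration prices together, a back-of-envelope monthly cost estimate (using the list prices quoted above):

```python
# Estimate monthly Lambda cost from the prices above.
REQ_PRICE = 0.20 / 1_000_000     # $ per request after the free tier
GB_S_PRICE = 0.0000166667        # $ per GB-second after the free tier
FREE_REQS = 1_000_000            # free requests / month
FREE_GB_S = 400_000              # free GB-seconds / month

def monthly_cost(invocations, avg_ms, memory_mb):
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    req_cost = max(invocations - FREE_REQS, 0) * REQ_PRICE
    dur_cost = max(gb_seconds - FREE_GB_S, 0) * GB_S_PRICE
    return round(req_cost + dur_cost, 2)

# 5M invocations/month at 200 ms average on 1024 MB:
# 1,000,000 GB-s of duration -> roughly $10.80/month after the free tier.
```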
Exam Cheat Sheet
| Topic | Key Fact |
|---|---|
| Max timeout | 15 minutes |
| Cold start fix | Provisioned Concurrency |
| Shared libraries | Lambda Layers (up to 5) |
| Private resource access | Deploy Lambda inside VPC |
| ML pipeline glue | Lambda → SageMaker / Bedrock / Kinesis |
| Who calls Lambda | Resource Policy |
| What Lambda can call | Execution Role |
| Async retry on failure | 2 retries → DLQ |
| Stream polling | Kinesis, DynamoDB Streams, SQS |
| Concurrency cap | Reserved Concurrency |
Labs
Lab 1 — Hello Lambda (10 min)
Goal: Deploy your first function and test it manually.
- Console → Lambda → Create function → Author from scratch
- Runtime: Python 3.12 | Execution role: Create new with basic Lambda permissions
- Write a simple handler that returns `{"statusCode": 200, "body": "Hello from Lambda"}`
- Test with a sample event → observe logs in CloudWatch
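The Lab 1 handler can be as small as:

```python
def lambda_handler(event, context):
    # Minimal handler returning an API-Gateway-style response shape.
    return {"statusCode": 200, "body": "Hello from Lambda"}
```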
Lab 2 — S3 Trigger (20 min)
Goal: Auto-trigger Lambda when a file lands in S3.
- Create an S3 bucket
- Create a Lambda function that logs the S3 object key from the event
- Add S3 trigger: bucket → event type `PUT`
- Upload a test file → verify Lambda fired in CloudWatch Logs
- Extend: print the file size from `event['Records'][0]['s3']['object']['size']`
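A sketch of the Lab 2 handler, pulling the key and size out of the standard S3 event shape:

```python
def lambda_handler(event, context):
    # S3 PUT events arrive as a Records list; each record carries the
    # bucket and object metadata for one uploaded file.
    s3 = event["Records"][0]["s3"]
    key = s3["object"]["key"]
    size = s3["object"]["size"]
    print(f"New object: {key} ({size} bytes)")   # lands in CloudWatch Logs
    return {"key": key, "size": size}
```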
Lab 3 — API Gateway → Lambda (20 min)
Goal: Build a serverless HTTP endpoint.
- Create a Lambda function that parses `event['queryStringParameters']` and returns a JSON response
- Attach an API Gateway trigger (REST API or HTTP API)
- Deploy the API → test via `curl` or a browser
- Extend: add a POST handler that reads `event['body']`
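A sketch of the Lab 3 handler (note `queryStringParameters` is `None` when no query string is sent, so guard for it):

```python
import json

def lambda_handler(event, context):
    # Echo a query-string parameter back as a JSON API response.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```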
Lab 4 — Lambda → Bedrock (30 min) ⭐ MLA-C01 relevant
Goal: Call Bedrock from Lambda (the serverless GenAI pattern).
- Create Lambda with an execution role that includes `bedrock:InvokeModel`
- Use `boto3` to call `bedrock-runtime.invoke_model()` with Claude or Titan
- Parse the response and return the LLM output
- Test with a sample prompt
```python
import boto3
import json

def lambda_handler(event, context):
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    # Claude's legacy text-completion API requires the Human/Assistant
    # prompt format; without it, Bedrock rejects the request.
    prompt = event.get("prompt", "What is machine learning?")
    body = json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": 200,
    })
    response = client.invoke_model(
        modelId="anthropic.claude-instant-v1",
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())
```
Lab 5 — Lambda + SageMaker Endpoint Invoke (30 min) ⭐ MLA-C01 relevant
Goal: Trigger real-time inference from Lambda.
- Deploy any SageMaker endpoint (even a built-in XGBoost demo)
- Create Lambda with `sagemaker:InvokeEndpoint` in its execution role
- Use the `boto3` `sagemaker-runtime` client to call the endpoint with a sample payload
- Return the prediction result
- Add API Gateway trigger → you now have a serverless ML API
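A sketch of the Lab 5 handler. The endpoint name is a placeholder, and the client is injectable so the logic can be tested locally without a deployed endpoint:

```python
def lambda_handler(event, context, client=None):
    # Forward a CSV feature row to a SageMaker endpoint and return the
    # prediction. "my-endpoint" is a hypothetical endpoint name.
    if client is None:
        import boto3  # bundled with the Lambda Python runtime
        client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName="my-endpoint",
        ContentType="text/csv",
        Body=event["csv_row"],
    )
    return {"prediction": response["Body"].read().decode()}
```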
Lab 6 — Lambda Layers (20 min)
Goal: Package numpy/pandas as a Lambda Layer.
- Create a zip with `python/numpy` and `python/pandas` installed via pip
- Upload it as a Lambda Layer
- Attach it to a function and confirm `import numpy` works without bundling it in the function zip
Summary
| Feature | Lambda |
|---|---|
| Type | Serverless FaaS |
| Max runtime | 15 min |
| Trigger model | Event-driven (sync / async / stream) |
| ML role | Pipeline glue — connect S3/Kinesis/API GW to SageMaker/Bedrock |
| Scaling | Automatic (up to concurrency limit) |
| Cold start | Yes — mitigated by Provisioned Concurrency |
| VPC support | Yes (with ENI overhead) |
| Key for MLA-C01 | Trigger patterns + Bedrock/SageMaker invoke patterns |