awslabs · krokoko · Jun 30, 2026 · Jun 19, 2026 · Jun 23, 2026 · Jun 24, 2026
@@ -86,7 +86,7 @@
         "microvms",
         "firecracker"
       ],
-      "version": "1.2.0"
+      "version": "1.3.0"
     },
     {
       "category": "migration",

@@ -22,5 +22,5 @@
   "license": "Apache-2.0",
   "name": "aws-serverless",
   "repository": "https://github.com/awslabs/agent-plugins",
-  "version": "1.2.0"
+  "version": "1.3.0"
 }
@@ -1,6 +1,6 @@
 {
   "name": "aws-serverless",
-  "version": "1.2.0",
+  "version": "1.3.0",
   "description": "Design, build, deploy, test, and debug serverless applications with AWS Serverless services.",
   "author": {
     "name": "Amazon Web Services",

@@ -4,13 +4,14 @@ description: >
   Evaluate, configure, and migrate workloads to AWS Lambda Managed Instances (LMI).
   Triggers on: Lambda Managed Instances, LMI, capacity provider, multi-concurrency Lambda,
   dedicated instance Lambda, EC2-backed Lambda, cold start elimination, Graviton Lambda,
-  instance type for Lambda, Lambda cost optimization with Reserved Instances or Savings Plans.
-  Also trigger when users describe high-volume predictable workloads seeking cost savings,
+  instance type for Lambda, scheduled scaling for LMI, Lambda cost optimization with
+  Reserved Instances or Savings Plans. Also trigger when users describe high-volume
+  predictable workloads seeking cost savings, want to scale LMI capacity on a schedule,
   or compare Lambda vs EC2 for steady-state traffic. For standard Lambda without LMI,
   use the aws-lambda skill instead.
 argument-hint: "[describe your workload or what you need help with]"
 metadata:
-  tags: lambda, lmi, managed-instances, ec2, capacity-provider, multi-concurrency, cost-optimization
+  tags: lambda, lmi, managed-instances, ec2, capacity-provider, multi-concurrency, cost-optimization, scheduled-scaling
 ---
 
 # AWS Lambda Managed Instances (LMI)
@@ -22,11 +23,11 @@ For standard Lambda development, see [aws-lambda skill](../aws-lambda/). For SAM
 ## When to Load Reference Files
 
 - **Cost comparison**, **pricing analysis**, **Lambda vs LMI cost**, **Savings Plans**, or **Reserved Instances** -> see [references/cost-comparison.md](references/cost-comparison.md)
-- **Instance types**, **memory sizing**, **vCPU ratios**, **scaling tuning**, or **capacity provider config** -> see [references/configuration-guide.md](references/configuration-guide.md)
+- **Instance types**, **memory sizing**, **vCPU ratios**, **scaling tuning**, **scheduled scaling**, or **capacity provider config** -> see [references/configuration-guide.md](references/configuration-guide.md)
 - **Thread safety**, **concurrency model**, **code review checklist**, **Powertools compatibility**, or **multi-concurrency readiness** -> see [references/thread-safety.md](references/thread-safety.md)
 - **Before/after code examples**, **runtime-specific migration** (Node.js, Python, Java, .NET), or **connection pooling** -> see [references/migration-patterns.md](references/migration-patterns.md)
-- **IAM roles**, **VPC setup**, **CLI commands**, **SAM template**, or **CDK example** -> see [references/infrastructure-setup.md](references/infrastructure-setup.md) and [scripts/setup-lmi.sh](scripts/setup-lmi.sh)
-- **Errors**, **throttling**, **debugging**, or **stuck deployments** -> see [references/troubleshooting.md](references/troubleshooting.md)
+- **IAM roles**, **VPC setup**, **CLI commands**, **SAM template**, **CDK example**, or **scheduled scaling setup (EventBridge Scheduler)** -> see [references/infrastructure-setup.md](references/infrastructure-setup.md) and [scripts/setup-lmi.sh](scripts/setup-lmi.sh)
+- **Errors**, **throttling**, **debugging**, **stuck deployments**, **tuning configuration**, or **adjusting after deployment** -> see [references/troubleshooting.md](references/troubleshooting.md)
 
 ## Quick Decision: Is LMI Right for This Workload?
 
@@ -54,6 +55,38 @@ Gather these signals before recommending:
 6. **Concurrency readiness**: Thread safety (Node.js/Java/.NET)? Shared `/tmp` paths? Per-invocation DB connections?
 7. **VPC**: Already in a VPC? Private resource access needed?
 
+#### Deriving LMI Configuration from Metrics
+
+If Lambda Insights is enabled on the function, use these metrics to calculate your starting configuration. If Lambda Insights is not enabled, suggest adding it to gather accurate workload data — but only proceed with the user's explicit confirmation, as adding the Insights layer may affect function performance or cold start times.
+
+To check if Lambda Insights is enabled, look for a LambdaInsightsExtension layer on the function. To add it, find the latest layer ARN for your region from the [Lambda Insights documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Lambda-Insights-extension-versions.html) and attach the `CloudWatchLambdaInsightsExecutionRolePolicy` managed policy to the function's execution role.
+
+**Target max concurrency** (from `cpu_total_time` and `Duration`):
+
+```
+PerExecutionEnvironmentMaxConcurrency = floor((0.5 × Duration) / cpu_total_time)
+```
+
+This targets 50% CPU utilization at full concurrency, leaving headroom for scaling.
+
+**Memory allocation** (from `memory_utilization` and current memory):
+
+```
+MemorySize = min(32768, max(2048, MaxConcurrency × (memory_utilization / 100) × current_allocated_memory))
+```
+
+This overestimates (assumes no shared base memory) but provides a safe starting point. The outer `min` caps the result at the 32 GB (32768 MB) LMI maximum.
+
+**Minimum execution environments** (from baseline `ConcurrentExecutions`):
+
+```
+MinExecutionEnvironments = max(3, ceil(baseline_concurrent_executions × 2 / MaxConcurrency))
+```
+
+This targets 50% concurrency utilization, leaving headroom for traffic bursts.
+
+**Without Lambda Insights:** Start with the runtime's default max concurrency, 2 GB memory, and MinExecutionEnvironments = 3. Adjust during testing.
+
 ### Step 2: Build the Cost Comparison
 
 REQUIRED: Present a cost comparison before recommending LMI. Compare at minimum:
@@ -71,12 +104,14 @@ For discount analysis (Savings Plans, Reserved Instances), refer users to the [A
 
 **Instance families** (~450 types): C-series (compute, .xlarge+), M-series (general, .large+), R-series (memory, .large+). ARM (Graviton) for best price-performance.
 
-**Memory-to-vCPU ratios**: 2:1 (compute), 4:1 (general, default), 8:1 (memory). Min 2 GB, max 32 GB.
+**Memory-to-vCPU ratios**: 2:1 (default, CPU-bound work), 4:1 (general/mixed workloads), 8:1 (memory-heavy or Python apps). Min 2 GB, max 32 GB.
 
 **Multi-concurrency defaults/vCPU**: Node.js 64, Java 32, .NET 32, Python 16.
 
 **Scaling**: MinExecutionEnvironments (default 3), MaxVCpuCount (default 400), TargetResourceUtilization.
 
+**Scheduled scaling**: For predictable traffic (business hours, marketing events), use EventBridge Scheduler to adjust Min/Max execution environments on a one-time or recurring schedule — scale up before peak, scale down or to zero when idle.
+
 See [references/configuration-guide.md](references/configuration-guide.md) for decision trees and detailed tuning.
 
 ### Step 4: Migrate the Code
@@ -105,16 +140,16 @@ See [references/infrastructure-setup.md](references/infrastructure-setup.md) for
 ### Step 6: Validate and Cut Over
 
 1. Deploy to a non-production environment first
-2. Monitor CloudWatch: CPU utilization, memory, concurrency, throttle rate
-3. Gradual traffic shift with weighted aliases (10% → 50% → 100%)
+2. Monitor CloudWatch: CPU utilization, memory, concurrency, throttle rate. If you observe low CPU utilization or ongoing throttles, see [references/troubleshooting.md](references/troubleshooting.md) for metric-specific adjustment guidance.
+3. Shift traffic to the LMI function (note: weighted alias shifting between LMI and non-LMI functions is not currently supported)
 4. Compare costs after 1-2 weeks of production data
 5. Decommission standard Lambda once stable
 
 ## Best Practices
 
 ### Configuration
 
-- Do: Start with 4:1 ratio and runtime default concurrency
+- Do: Start with 2:1 ratio and runtime default concurrency
 - Do: Use ARM (Graviton) unless x86 dependencies exist
 - Do: Let Lambda choose instance types unless specific hardware needed
 - Do: Set MaxVCpuCount to control cost ceiling
@@ -125,7 +160,7 @@ See [references/infrastructure-setup.md](references/infrastructure-setup.md) for
 
 - Do: Start with I/O-heavy functions (benefit most from multi-concurrency; CPU-bound functions compete for same CPU)
 - Do: Review code for concurrency safety before attaching to capacity provider (thread safety for Node.js/Java/.NET; `/tmp` and memory for Python)
-- Do: Use weighted aliases for gradual traffic shift
+- Do: Plan traffic shifting strategy based on your invocation source (weighted alias shifting between LMI and non-LMI functions is not currently supported)
 - Do: Include request IDs in all log statements
 - Do: Initialize DB pools and SDK clients outside the handler
 - Do: Estimate total `/tmp` usage under max concurrency
@@ -135,8 +170,10 @@ See [references/infrastructure-setup.md](references/infrastructure-setup.md) for
 ### Operations
 
 - Do: Set CloudWatch alarms on throttle rate > 1% and CPU > 80%
+- Do: Use scheduled scaling (EventBridge Scheduler) for predictable traffic — raise Min/Max before peak periods and lower them (or scale to zero) when idle
 - Don't: Manually terminate LMI EC2 instances (delete the capacity provider instead)
 - Don't: Forget to publish a version — unpublished functions cannot run on LMI
+- Don't: Rely on a deactivated (Min=Max=0) function to self-recover — schedule an explicit scale-up to reactivate it
 
 ## Limits Quick Reference
 
@@ -172,7 +209,7 @@ REQUIRED: AWS credentials configured on the host machine.
 
 ### Regional Availability
 
-Currently available: us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1. Expanding to all commercial regions soon.
+Available in all commercial AWS Regions except Israel (Tel Aviv), Middle East (Bahrain), Middle East (UAE), and Asia Pacific (Auckland).
 
 Check the [Lambda Managed Instances documentation](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances.html) for the latest regional availability.
 
@@ -204,12 +241,14 @@ Override: "use SAM" → SAM YAML, "use CloudFormation" → CloudFormation YAML.
 
 ### Unsupported Region
 
-- State: "Lambda Managed Instances is not yet available in [region]"
-- List available regions
+- State: "Lambda Managed Instances is not available in [region]"
+- Name the excluded regions: Israel (Tel Aviv), Middle East (Bahrain), Middle East (UAE), Asia Pacific (Auckland)
+- Suggest the nearest supported region
 
 ## Resources
 
 - [Lambda Managed Instances Docs](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances.html)
+- [Scaling LMI & Scheduled Scaling Docs](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances-scaling.html)
 - [Introducing LMI (AWS Blog)](https://aws.amazon.com/blogs/aws/introducing-aws-lambda-managed-instances-serverless-simplicity-with-ec2-flexibility/)
 - [Build High-Performance Apps with LMI](https://aws.amazon.com/blogs/compute/build-high-performance-apps-with-aws-lambda-managed-instances/)
 - [Migrating Functions to LMI (AWS Blog)](https://aws.amazon.com/blogs/compute/migrating-your-functions-to-aws-lambda-managed-instances/)

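A minimal Python sketch of the three sizing formulas this change adds under "Deriving LMI Configuration from Metrics", for readers who want to check the arithmetic. The function name `derive_lmi_config` and its parameter names are hypothetical. The floor of 1 on concurrency and the rounding of memory up to a whole MB are assumptions that are not in the skill text, and the result is not aligned to memory-to-vCPU ratio multiples.

```python
import math

# Bounds quoted in the skill text.
LMI_MIN_MEMORY_MB = 2048
LMI_MAX_MEMORY_MB = 32768
LMI_MIN_ENVIRONMENTS = 3


def derive_lmi_config(duration, cpu_total_time, memory_utilization_pct,
                      current_memory_mb, baseline_concurrent_executions):
    """Apply the three starting-point formulas from the skill text.

    duration and cpu_total_time must be in the same unit (for example ms).
    """
    if cpu_total_time <= 0:
        raise ValueError("cpu_total_time must be positive")

    # floor((0.5 x Duration) / cpu_total_time): targets 50% CPU utilization.
    # Assumption (not in the source): never go below a concurrency of 1.
    max_concurrency = max(1, math.floor(0.5 * duration / cpu_total_time))

    # min(32768, max(2048, MaxConcurrency x utilization x allocated memory)).
    # Assumption (not in the source): round up to a whole MB.
    estimated_mb = math.ceil(
        max_concurrency * (memory_utilization_pct / 100) * current_memory_mb
    )
    memory_mb = min(LMI_MAX_MEMORY_MB, max(LMI_MIN_MEMORY_MB, estimated_mb))

    # max(3, ceil(baseline x 2 / MaxConcurrency)): targets 50% concurrency use.
    min_environments = max(
        LMI_MIN_ENVIRONMENTS,
        math.ceil(baseline_concurrent_executions * 2 / max_concurrency),
    )

    return {
        "PerExecutionEnvironmentMaxConcurrency": max_concurrency,
        "MemorySize": memory_mb,
        "MinExecutionEnvironments": min_environments,
    }
```

For example, a function with a 200 ms duration, 10 ms of CPU time, 40% utilization of 512 MB, and a baseline of 50 concurrent executions comes out at a max concurrency of 10, 2048 MB, and 10 minimum execution environments.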
@@ -5,17 +5,17 @@
 - **CPU-intensive** (encoding, ML, compression) → C-series, 2:1 ratio, concurrency=1/vCPU
 - **Memory-intensive** (caching, large datasets) → R-series, 8:1 ratio
 - **Network-intensive** (streaming, data transfer) → Use AllowedInstanceTypes for n-suffix types, 4:1 ratio
-- **General/balanced** (web APIs, microservices) → M-series, 4:1 ratio, default concurrency
+- **General/balanced** (web APIs, microservices) → M-series, 2:1 ratio (default), default concurrency
 
 Architecture: ARM (Graviton, g-suffix) for price-performance. x86 (i=Intel, a=AMD) when dependencies require it.
 
 ## Memory-to-vCPU Ratios
 
-| Ratio | Profile | When to use                | Memory examples       |
-| ----- | ------- | -------------------------- | --------------------- |
-| 2:1   | Compute | CPU-bound work             | 2GB/1vCPU, 4GB/2vCPU  |
-| 4:1   | General | Most workloads (default)   | 4GB/1vCPU, 8GB/2vCPU  |
-| 8:1   | Memory  | Caching, data, Python apps | 8GB/1vCPU, 16GB/2vCPU |
+| Ratio | Profile | When to use                      | Memory examples       |
+| ----- | ------- | -------------------------------- | --------------------- |
+| 2:1   | Compute | CPU-bound work (default)         | 2GB/1vCPU, 4GB/2vCPU  |
+| 4:1   | General | Mixed CPU/memory-heavy workloads | 4GB/1vCPU, 8GB/2vCPU  |
+| 8:1   | Memory  | Memory-heavy or Python apps      | 8GB/1vCPU, 16GB/2vCPU |
 
 Min: 2 GB / 1 vCPU. Max: 32 GB. Memory must align with ratio multiples.
 
@@ -51,6 +51,26 @@ Total capacity = MinExecutionEnvironments × PerExecutionEnvironmentMaxConcurren
 | AllowedInstanceTypes      | All           | Restrict only for specific hardware needs             |
 | ExcludedInstanceTypes     | None          | Exclude expensive types in dev/test                   |
 
+## Scheduled Scaling (Predictable Traffic)
+
+For workloads with known traffic patterns (business hours, marketing events, batch windows), use [Amazon EventBridge Scheduler](https://docs.aws.amazon.com/scheduler/latest/UserGuide/managing-targets-universal.html) to adjust a function's `MinExecutionEnvironments` and `MaxExecutionEnvironments` on a one-time or recurring schedule. A schedule (cron or rate expression) targets the Lambda `PutFunctionScalingConfig` API as an EventBridge Scheduler universal target, passing new Min/Max values in the input payload.
+
+**Behavior:**
+
+- Scheduled scaling sets the provisioned floor and ceiling. Actual scaling between Min and Max still responds to CPU utilization and concurrency saturation.
+- If traffic more than doubles within 5 minutes of a scheduled scale-up, you may still see throttles while capacity provisions.
+- Setting both `MinExecutionEnvironments` and `MaxExecutionEnvironments` to 0 deactivates the function version (instances terminate). A deactivated function does NOT auto-recover — schedule a separate action with non-zero values to reactivate it.
+
+**Common patterns:**
+
+| Pattern                | Scale-up schedule                   | Scale-down schedule              |
+| ---------------------- | ----------------------------------- | -------------------------------- |
+| Business hours         | Raise Min/Max before work starts    | Lower Min/Max after hours        |
+| Marketing/launch event | Raise Min ahead of the campaign     | Restore baseline after the event |
+| Idle scale-to-zero     | Reactivate (non-zero) before demand | Set Min=Max=0 when idle          |
+
+See [infrastructure-setup.md](infrastructure-setup.md) for the EventBridge Scheduler IAM role and `create-schedule` CLI examples.
+
 ## Monitoring Thresholds
 
 - **CPU > 80%**: reduce concurrency or add vCPUs

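The scheduled-scaling section above describes passing new Min/Max values in the schedule's input payload. Because that payload is a JSON string nested inside the JSON target, hand-escaping it is easy to get wrong. The sketch below builds the target with `json.dumps` instead. The helper name `build_scaling_target` is hypothetical, the field names mirror the `create-schedule` examples in this change's infrastructure-setup.md, and the `min <= max` check is an assumption (the text only states the 0 to 15000 range and that both values must be set together).

```python
import json

# Universal-target ARN used by the create-schedule examples in this change.
SCALING_TARGET_ARN = "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig"


def build_scaling_target(function_name, role_arn, min_envs, max_envs,
                         qualifier="$LATEST.PUBLISHED"):
    """Return the dict that serializes to the `--target` argument."""
    # Range taken from the docs text; the min <= max ordering is an assumption.
    if not (0 <= min_envs <= max_envs <= 15000):
        raise ValueError("expected 0 <= min_envs <= max_envs <= 15000")

    payload = {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "FunctionScalingConfig": {
            "MinExecutionEnvironments": min_envs,
            "MaxExecutionEnvironments": max_envs,
        },
    }
    return {
        "Arn": SCALING_TARGET_ARN,
        "RoleArn": role_arn,
        # Input must be a JSON string, so the payload is serialized here.
        "Input": json.dumps(payload),
    }


if __name__ == "__main__":
    target = build_scaling_target(
        "my-lmi-function",
        "arn:aws:iam::<account-id>:role/eventbridge-scheduler-role",
        min_envs=100,
        max_envs=1000,
    )
    # Paste the printed JSON as the value of `--target`.
    print(json.dumps(target))
```

Passing `min_envs=0, max_envs=0` produces the deactivation payload; a separate schedule with non-zero values is still needed to reactivate the function.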
@@ -224,6 +224,83 @@ Resources:
           CapacityProviderArn: !GetAtt MyCP.Arn
 ```
 
+## Scheduled Scaling (EventBridge Scheduler)
+
+For predictable traffic, adjust `MinExecutionEnvironments`/`MaxExecutionEnvironments` on a schedule using [Amazon EventBridge Scheduler](https://docs.aws.amazon.com/scheduler/latest/UserGuide/managing-targets-universal.html). The schedule calls the Lambda `PutFunctionScalingConfig` API directly as a universal target — no Lambda code or extra glue required.
+
+### 1. Scheduler execution role
+
+Trust policy (allow EventBridge Scheduler to assume the role):
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [{
+    "Effect": "Allow",
+    "Principal": { "Service": "scheduler.amazonaws.com" },
+    "Action": "sts:AssumeRole"
+  }]
+}
+```
+
+Permissions (call `PutFunctionScalingConfig` on the target function):
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [{
+    "Effect": "Allow",
+    "Action": "lambda:PutFunctionScalingConfig",
+    "Resource": "arn:aws:lambda:*:*:function:my-lmi-function"
+  }]
+}
+```
+
+### 2. Create schedules
+
+Scale up before peak (08:00 UTC daily):
+
+```bash
+aws scheduler create-schedule \
+  --name ScaleUpLmi \
+  --schedule-expression "cron(0 8 * * ? *)" \
+  --flexible-time-window '{"Mode": "OFF"}' \
+  --target '{
+    "Arn": "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig",
+    "RoleArn": "arn:aws:iam::<account-id>:role/eventbridge-scheduler-role",
+    "Input": "{\"FunctionName\": \"my-lmi-function\", \"Qualifier\": \"$LATEST.PUBLISHED\", \"FunctionScalingConfig\": {\"MinExecutionEnvironments\": 100, \"MaxExecutionEnvironments\": 1000}}"
+  }'
+```
+
+Scale down after peak (18:00 UTC daily):
+
+```bash
+aws scheduler create-schedule \
+  --name ScaleDownLmi \
+  --schedule-expression "cron(0 18 * * ? *)" \
+  --flexible-time-window '{"Mode": "OFF"}' \
+  --target '{
+    "Arn": "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig",
+    "RoleArn": "arn:aws:iam::<account-id>:role/eventbridge-scheduler-role",
+    "Input": "{\"FunctionName\": \"my-lmi-function\", \"Qualifier\": \"$LATEST.PUBLISHED\", \"FunctionScalingConfig\": {\"MinExecutionEnvironments\": 5, \"MaxExecutionEnvironments\": 20}}"
+  }'
+```
+
+Set both values to `0` to deactivate during idle periods; schedule a separate non-zero action to reactivate (a deactivated function does not auto-recover).
+
+### Manual override
+
+Update scaling limits directly at any time:
+
+```bash
+aws lambda put-function-scaling-config \
+  --function-name my-lmi-function \
+  --qualifier '$LATEST.PUBLISHED' \
+  --function-scaling-config MinExecutionEnvironments=5,MaxExecutionEnvironments=20
+```
+
+`MinExecutionEnvironments` and `MaxExecutionEnvironments` accept values from 0 to 15000 and must be set together. Setting them on `$LATEST.PUBLISHED` propagates to future published versions.
+
 ## Cleanup
 
 ```bash

@@ -1,5 +1,41 @@
 # LMI Troubleshooting
 
+## Testing Phase: Monitor and Adjust
+
+After deploying your LMI function with a test workload, check these metrics and adjust:
+
+**Duration increased vs. existing function:**
+
+- This indicates the concurrency estimates used during setup may be off. Investigate by:
+  - Checking ExecutionEnvironmentCPUUtilization and ExecutionEnvironmentMemoryUtilization for saturation
+  - Reducing PerExecutionEnvironmentMaxConcurrency to see if duration improves
+  - Reviewing instance types — switching to larger or more powerful instances may help if resources are constrained
+- If reducing concurrency doesn't help, check throttle metrics below
+
+**Low ExecutionEnvironmentCPUUtilization (below 10%):**
+
+- Increase PerExecutionEnvironmentMaxConcurrency to improve utilization
+- Or lower MemorySize to reduce vCPUs per execution environment
+- If memory utilization is also high, increase ExecutionEnvironmentMemoryGiBPerVCpu ratio instead
+
+**Ongoing CPUThrottles:**
+
+- Switch capacity provider to Manual scaling mode with a lower CPU utilization target (e.g., 25%)
+
+**Ongoing MemoryThrottles:**
+
+- Increase MemorySize
+- To maintain the same vCPU count, adjust ratio proportionally (e.g., 4GB/2:1 → 8GB/4:1 keeps 2 vCPUs)
+
+**Ongoing DiskThrottles:**
+
+- Reduce per-invocation /tmp usage or reduce PerExecutionEnvironmentMaxConcurrency
+
+**Ongoing ConcurrencyThrottles:**
+
+- Increase PerExecutionEnvironmentMaxConcurrency (if CPU and memory have headroom)
+- Check if MaxExecutionEnvironments or MaxVCpuCount is capping scale-out
+
 ## Common Issues
 
 | Issue                          | Cause                                            | Resolution                                                                          |