Plan Execution System
The plan execution system executes generated plans with intelligent retry logic, error categorization, state persistence, and multiple execution modes. It ensures reliable execution even in the face of transient errors.
Overview
The plan executor provides robust execution of plans with:
- Intelligent Retry Logic: Automatic retries with exponential backoff for recoverable errors
- Error Categorization: Distinguishes between recoverable and fatal errors
- Execution Modes: Bounded (limited iterations) and Continuous (run until complete)
- State Persistence: Saves progress after each task for checkpoint recovery
- Dependency Validation: Ensures task dependencies are met before execution
- Context File Support: Injects context files into agent prompts
Key Features
- Automatic Retries: Recoverable errors are retried automatically
- Exponential Backoff: Delays increase exponentially between retries
- Checkpoint Recovery: Resume execution from last checkpoint
- Progress Tracking: Real-time progress reporting
- Multiple Execution Modes: Bounded or continuous execution
Execution Lifecycle
The executor follows a structured lifecycle:
- Load Manifest: Load the plan manifest from disk (or create a new one)
- Resume Checkpoint: If resuming, skip completed tasks
- Iteration Loop: Execute iterations based on RunMode
- Task Execution: For each task:
  - Check dependencies
  - Execute with retry logic
  - Save state after completion
- Progress Tracking: Update and display progress
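The per-task part of the lifecycle above can be sketched roughly as follows. This is a minimal illustration, not the radium_core implementation: the `Task` struct, the `run_tasks` function, and the in-memory `completed` set are all hypothetical stand-ins, and a real executor would write the manifest to disk where the comment indicates.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a plan task; not the radium_core type.
struct Task {
    id: String,
    depends_on: Vec<String>,
}

/// Runs tasks in order, skipping completed ones and refusing to run a task
/// whose dependencies are not complete. Returns the IDs executed in this run.
fn run_tasks(
    tasks: &[Task],
    completed: &mut HashSet<String>,
    mut execute: impl FnMut(&Task) -> Result<(), String>,
) -> Result<Vec<String>, String> {
    let mut executed = Vec::new();
    for task in tasks {
        // Resume: skip anything already recorded as complete.
        if completed.contains(&task.id) {
            continue;
        }
        // Dependency check happens before execution.
        if let Some(missing) = task.depends_on.iter().find(|d| !completed.contains(*d)) {
            return Err(format!("Dependency task not completed: {missing}"));
        }
        execute(task)?;
        // A real executor would persist its checkpoint here.
        completed.insert(task.id.clone());
        executed.push(task.id.clone());
    }
    Ok(executed)
}
```

Passing the `execute` step in as a closure keeps the sketch independent of any model or agent API.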
Retry Logic
The executor retries recoverable errors with exponential backoff.
Retry Behavior
- Max Retries: Configurable (default: 3 retries after the initial attempt)
- Backoff Formula: delay = base_delay_ms * 2^attempt, where attempt starts at 0
- Error Categorization: Only retries recoverable errors
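As a worked instance of the backoff formula, the delay can be computed as below. The function name `backoff_delay` is hypothetical, and the saturating arithmetic is an added safeguard against overflow, not something this document states the executor does.

```rust
use std::time::Duration;

/// delay = base_delay_ms * 2^attempt, with `attempt` counted from 0.
/// Saturates at u64::MAX milliseconds instead of overflowing.
fn backoff_delay(base_delay_ms: u64, attempt: u32) -> Duration {
    let factor = 2u64.checked_pow(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(base_delay_ms.saturating_mul(factor))
}
```

With a 1000 ms base, attempts 0, 1, and 2 give delays of 1, 2, and 4 seconds.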
Retry Flow
Task Execution
    ↓
Error Occurs
    ↓
Categorize Error
    ↓
Recoverable? → Yes → Retry with backoff
    ↓                    ↓
    No              Max retries?
    ↓                    ↓
Fail Immediately    Yes → Fail
                         ↓
                    No → Retry
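The flow above can be sketched as a small synchronous loop. Everything here is illustrative: `Category`, `TaskError`, and `retry_with_backoff` are hypothetical names, the sleep function is injected so the sketch can be tested without waiting, and `max_retries` is assumed to count retries after the first attempt. The real executor is async and its internals are not shown in this document.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Category {
    Recoverable,
    Fatal,
}

/// Hypothetical error type carrying its category.
#[derive(Debug, PartialEq)]
struct TaskError {
    category: Category,
    message: String,
}

/// Runs `op`, retrying recoverable failures up to `max_retries` times.
fn retry_with_backoff<T>(
    max_retries: u32,
    base_delay_ms: u64,
    mut op: impl FnMut() -> Result<T, TaskError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, TaskError> {
    let mut attempt: u32 = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Fatal errors fail immediately, without any retry.
            Err(e) if e.category == Category::Fatal => return Err(e),
            // Recoverable, but the retry budget is spent.
            Err(e) if attempt >= max_retries => return Err(e),
            Err(_) => {
                // delay = base_delay_ms * 2^attempt
                let factor = 2u64.checked_pow(attempt).unwrap_or(u64::MAX);
                sleep(Duration::from_millis(base_delay_ms.saturating_mul(factor)));
                attempt += 1;
            }
        }
    }
}
```

In real use the `sleep` argument could be `std::thread::sleep`, or the loop could be rewritten around an async timer.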
Example: Retry Sequence
Attempt 1: Error (rate limit)
  → Wait 1 second (1000ms * 2^0)
Attempt 2: Error (rate limit)
  → Wait 2 seconds (1000ms * 2^1)
Attempt 3: Error (rate limit)
  → Wait 4 seconds (1000ms * 2^2)
Attempt 4: Success!
Error Categorization
Errors are automatically categorized to determine retry behavior.
Recoverable Errors
These errors are retried with exponential backoff:
- HTTP 429: Rate limit exceeded
- Network Timeouts: Connection timeouts, read timeouts
- Connection Errors: Network unreachable, connection refused
- HTTP 5xx: Server errors (500, 502, 503, 504)
- File Lock Errors: Temporary file locking issues
- Temporary I/O Errors: Transient I/O failures
- Model Execution Errors: May be transient (rate limits, timeouts)
Fatal Errors
These errors fail immediately without retry:
- HTTP 401/403: Authentication/authorization failures
- Missing Configuration: Required config not found
- Invalid Data: Malformed data, invalid format
- Dependency Not Met: Task dependencies not completed
- Agent Not Found: Referenced agent doesn't exist
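For the HTTP entries in the two lists above, the mapping can be sketched as a single match. The enum below is a local mirror and `categorize_http_status` is a hypothetical helper, not part of radium_core. Treating every unlisted status as fatal is an assumption made for this sketch.

```rust
/// Local mirror of the two categories described above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ErrorCategory {
    Recoverable,
    Fatal,
}

/// Hypothetical mapping from an HTTP status code to a retry category.
fn categorize_http_status(status: u16) -> ErrorCategory {
    match status {
        // Rate limit exceeded: retry with backoff.
        429 => ErrorCategory::Recoverable,
        // Server errors (500, 502, 503, 504, ...): retry with backoff.
        500..=599 => ErrorCategory::Recoverable,
        // Authentication/authorization failures: never retried.
        401 | 403 => ErrorCategory::Fatal,
        // Assumption: anything not listed is not retried.
        _ => ErrorCategory::Fatal,
    }
}
```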
Error Categorization Logic
use radium_core::planning::executor::{ExecutionError, ErrorCategory};
let error = ExecutionError::ModelExecution("Rate limit exceeded".to_string());
match error.category() {
    ErrorCategory::Recoverable => {
        // Will retry with exponential backoff
    }
    ErrorCategory::Fatal => {
        // Will fail immediately
    }
}
Execution Modes
The executor supports two execution modes:
Bounded Mode
Execute up to N iterations, then stop:
use radium_core::planning::executor::RunMode;
let mode = RunMode::Bounded(5); // Execute up to 5 iterations
Use Cases:
- Incremental execution
- Testing specific iterations
- Limited resource scenarios
Continuous Mode
Execute all iterations until plan is complete:
let mode = RunMode::Continuous; // Execute until complete (YOLO mode)
Use Cases:
- Full plan execution
- Automated workflows
- Complete feature implementation
Safety: Includes a sanity limit to prevent infinite loops.
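One way to picture the two modes and the sanity limit is the loop below. It is a sketch under stated assumptions: the `RunMode` enum is a local mirror with `usize` chosen arbitrarily as the counter type, `run_iterations` is a hypothetical function, and `SANITY_LIMIT = 1000` is a placeholder because this document does not give the actual limit.

```rust
/// Local mirror of the two modes described above; not the radium_core enum.
enum RunMode {
    Bounded(usize),
    Continuous,
}

/// Placeholder cap; the real sanity limit is not documented here.
const SANITY_LIMIT: usize = 1000;

/// Calls `run_iteration` until it reports that the plan is complete or the
/// iteration budget for the mode runs out. Returns the iterations executed.
fn run_iterations(mode: &RunMode, mut run_iteration: impl FnMut(usize) -> bool) -> usize {
    let limit = match mode {
        RunMode::Bounded(n) => *n,
        RunMode::Continuous => SANITY_LIMIT,
    };
    let mut count = 0;
    while count < limit {
        count += 1;
        // `true` means the plan finished during this iteration.
        if run_iteration(count) {
            break;
        }
    }
    count
}
```

For a plan that needs 8 iterations, `Bounded(5)` stops after 5 and `Continuous` stops after 8.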
State Persistence
The executor saves state after each task completion for checkpoint recovery.
Checkpoint Structure
State is saved to plan/plan_manifest.json with:
- Task completion status
- Iteration status
- Plan progress
- Timestamps
Resuming from Checkpoint
use radium_core::planning::executor::{ExecutionConfig, PlanExecutor};
use std::path::PathBuf;
let config = ExecutionConfig {
    resume: true,         // Enable resume mode
    skip_completed: true, // Skip completed tasks
    // ... other config
};
let executor = PlanExecutor::with_config(config);
// Executor will automatically skip completed tasks
API Usage
Basic Execution
use radium_core::planning::executor::{PlanExecutor, ExecutionConfig, RunMode};
use radium_core::models::PlanManifest;
use std::path::PathBuf;
async fn execute_plan() -> Result<(), Box<dyn std::error::Error>> {
    let config = ExecutionConfig {
        resume: false,
        skip_completed: true,
        check_dependencies: true,
        state_path: PathBuf::from("plan/plan_manifest.json"),
        context_files: None,
        run_mode: RunMode::Bounded(5),
    };
    // Copy the path first: `with_config` takes ownership of `config`.
    let state_path = config.state_path.clone();
    let executor = PlanExecutor::with_config(config);
    let manifest = executor.load_manifest(&state_path)?;
    // Execute plan...
    Ok(())
}
Task Execution with Retry
use radium_core::planning::executor::{ExecutionError, PlanExecutor};
use radium_core::models::PlanTask;
use radium_abstraction::Model;
use std::sync::Arc;
// `TaskResult` must also be in scope; its module path is not shown in this document.
async fn execute_with_retry(
    executor: &PlanExecutor,
    task: &PlanTask,
    model: Arc<dyn Model>,
) -> Result<TaskResult, ExecutionError> {
    // Execute with 3 retries, 1 second base delay
    executor.execute_task_with_retry(task, model, 3, 1000).await
}
Dependency Validation
use radium_core::planning::executor::{ExecutionError, PlanExecutor};
use radium_core::models::{PlanManifest, PlanTask};
fn validate_dependencies(
    executor: &PlanExecutor,
    manifest: &PlanManifest,
    task: &PlanTask,
) -> Result<(), ExecutionError> {
    executor.check_dependencies(manifest, task)
}
Progress Tracking
use radium_core::planning::executor::PlanExecutor;
use radium_core::models::PlanManifest;
use std::time::Duration;
fn track_progress(executor: &PlanExecutor, manifest: &PlanManifest) {
    let progress = executor.calculate_progress(manifest);
    println!("Progress: {}%", progress);
    executor.print_progress(
        manifest,
        1,                        // Current iteration
        Duration::from_secs(120), // Elapsed time
        Some("Task 1.1"),         // Current task
    );
}
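How `calculate_progress` derives its percentage is not shown in this document. One plausible reading, offered only as an assumption, is the share of completed tasks rounded down. `progress_percent` is a hypothetical helper:

```rust
/// Percentage of completed tasks, rounded down.
/// An empty plan is reported as 0% in this sketch.
fn progress_percent(completed: usize, total: usize) -> u32 {
    if total == 0 {
        return 0;
    }
    // Clamp so a stale count can never report more than 100%.
    ((completed.min(total) * 100) / total) as u32
}
```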
Configuration Options
ExecutionConfig
pub struct ExecutionConfig {
    /// Resume from last checkpoint
    pub resume: bool,
    /// Skip already completed tasks
    pub skip_completed: bool,
    /// Validate task dependencies before execution
    pub check_dependencies: bool,
    /// Path to save state checkpoints
    pub state_path: PathBuf,
    /// Optional context files content to inject into prompts
    pub context_files: Option<String>,
    /// Execution mode (bounded or continuous)
    pub run_mode: RunMode,
}
Default Configuration
ExecutionConfig {
    resume: false,
    skip_completed: true,
    check_dependencies: true,
    state_path: PathBuf::from("plan/plan_manifest.json"),
    context_files: None,
    run_mode: RunMode::Bounded(5),
}
Common Use Cases
1. Basic Plan Execution
Execute a plan with default settings:
let executor = PlanExecutor::new();
let manifest = executor.load_manifest(&PathBuf::from("plan/plan_manifest.json"))?;
// Execute plan...
2. Resume from Checkpoint
Resume execution from the last checkpoint:
let config = ExecutionConfig {
    resume: true,
    skip_completed: true,
    // ... other config
};
let executor = PlanExecutor::with_config(config);
// Completed tasks are automatically skipped
3. Bounded Execution
Execute a limited number of iterations:
let config = ExecutionConfig {
    run_mode: RunMode::Bounded(3), // Only 3 iterations
    // ... other config
};
4. Continuous Execution (YOLO Mode)
Execute until complete:
let config = ExecutionConfig {
    run_mode: RunMode::Continuous, // Run until done
    // ... other config
};
5. Context File Injection
Inject context files into prompts:
let config = ExecutionConfig {
    context_files: Some(std::fs::read_to_string("context.md")?),
    // ... other config
};
Error Handling
Handling Execution Errors
use radium_core::planning::executor::{ExecutionError, ErrorCategory};
match executor.execute_task(&task, model).await {
    Ok(result) => {
        if result.success {
            // Task completed successfully
        } else {
            // Task failed, check if retryable
            let error = result.error.unwrap();
            // Handle error...
        }
    }
    Err(e) => {
        match e.category() {
            ErrorCategory::Recoverable => {
                // Will be retried automatically
            }
            ErrorCategory::Fatal => {
                // Must be fixed manually
            }
        }
    }
}
Common Error Scenarios
Rate Limit (Recoverable):
Error: Rate limit exceeded (429)
→ Retried with exponential backoff
→ Eventually succeeds
Authentication Failure (Fatal):
Error: Unauthorized (401)
→ Fails immediately
→ Must fix API key
Dependency Not Met (Fatal):
Error: Dependency task not completed: I1.T1
→ Fails immediately
→ Must complete dependency first
Best Practices
- Use Checkpoints: Enable resume mode for long-running plans
- Monitor Progress: Track progress for visibility
- Handle Errors: Check error categories for appropriate handling
- Validate Dependencies: Ensure dependencies are met before execution
- Use Bounded Mode: For testing or incremental execution
- Use Continuous Mode: For full automated execution
Integration Points
- Plan Generator: Generates plans for execution
- DAG System: Provides dependency ordering
- Agent System: Discovers and executes agents
- Model System: Executes tasks with AI models
- Workflow System: Can be used within workflows
Related Features
- Autonomous Planning - Plan generation
- DAG Dependencies - Dependency management
- CLI Commands - Command-line usage
- Checkpointing - Checkpoint system
See Also
- API Reference - Complete API documentation
- Examples - Usage examples