# CLI Performance Guide

This document describes performance benchmarks, optimization strategies, and performance guidelines for the Radium CLI.
## Benchmark Suite

The CLI includes a benchmark suite using Criterion, located in `apps/cli/benches/`.
### Running Benchmarks

```bash
# Run all benchmarks
cargo bench -p radium-cli

# Run specific benchmark
cargo bench -p radium-cli --bench workspace_bench
```
Benchmark Categoriesβ
-
Workspace Operations
- Workspace discovery performance
- Workspace initialization time
- Directory structure creation
-
Command Execution
- Command parsing overhead
- JSON serialization performance
- Output formatting
## Performance Targets

### Command Startup

- Cold start: <100ms (excluding dependency loading)
- Warm start: <50ms
- Command parsing: <10ms
### Common Operations
- Workspace discovery: <50ms (even in deep directory trees)
- Agent list: <200ms (for 100+ agents)
- Status command: <300ms (including all checks)
- Plan generation: <2s (excluding AI model calls)
- JSON output: <10ms overhead per command
### File Operations

- Workspace init: <500ms
- File reading: <100ms per file
- Directory traversal: <200ms for a typical workspace
## Optimization Strategies

### Workspace Discovery

**Current Implementation:**

- Searches upward from the current directory
- Stops at the first `.radium` directory found
- Caches the result where possible

**Optimization Opportunities:**

- Cache the workspace path in an environment variable
- Use faster path operations (avoid canonicalization where not needed)
- Limit search depth (e.g., max 10 levels up)
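The upward search with a depth cap can be sketched in plain Rust. This is an illustrative sketch, not the CLI's actual code; the function name and marker handling are assumptions based on the description above.

```rust
use std::path::{Path, PathBuf};

/// Search upward from `start` for a directory containing a `.radium`
/// marker, giving up after `max_depth` parent levels.
/// Hypothetical helper; names are illustrative.
fn find_workspace(start: &Path, max_depth: usize) -> Option<PathBuf> {
    let mut dir = start;
    for _ in 0..=max_depth {
        // Stop at the first `.radium` directory found.
        if dir.join(".radium").is_dir() {
            return Some(dir.to_path_buf());
        }
        // Walk one level up; `parent()` returns None at the filesystem root.
        dir = dir.parent()?;
    }
    None // depth limit reached without finding a marker
}

fn main() {
    // Search from the current directory, at most 10 levels up.
    let ws = std::env::current_dir()
        .ok()
        .and_then(|d| find_workspace(&d, 10));
    println!("workspace: {ws:?}");
}
```

Bounding the loop keeps worst-case discovery cost predictable even when the command runs far outside any workspace.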
### Command Parsing

**Current Implementation:**

- Uses Clap derive macros
- Parses all arguments upfront

**Optimization Opportunities:**

- Lazy argument parsing for rarely-used flags
- Cache the parsed command structure
- Minimize string allocations
### JSON Output

**Current Implementation:**

- Uses `serde_json::to_string_pretty()`
- Creates the full JSON structure before output

**Optimization Opportunities:**

- Stream JSON output for large datasets
- Use `to_string()` instead of `to_string_pretty()` for production
- Reuse JSON serializers where possible
### Async Operations

**Current Implementation:**

- All commands use async/await
- Tokio runtime for I/O operations

**Optimization Opportunities:**

- Use `tokio::fs` for async file operations
- Batch file system operations
- Parallelize independent operations
## Performance Monitoring

### Baseline Measurements

Run benchmarks regularly to establish baselines:

```bash
# Generate baseline report
cargo bench -p radium-cli -- --save-baseline baseline

# Compare against baseline
cargo bench -p radium-cli -- --baseline baseline
```
### Regression Testing

Add performance regression tests to CI:

```rust
#[test]
fn test_workspace_discovery_performance() {
    let start = std::time::Instant::now();
    // ... perform operation
    let duration = start.elapsed();
    assert!(duration.as_millis() < 50, "Workspace discovery too slow");
}
```
## Profiling

### Using Criterion

Criterion automatically generates HTML reports with detailed profiling information:

```bash
cargo bench -p radium-cli
# Reports available in target/criterion/
```
### Using Flamegraph

For detailed profiling:

```bash
# Install flamegraph
cargo install flamegraph

# Profile command execution
cargo flamegraph --bin radium-cli -- status
```
### Using Perf (Linux)

```bash
perf record --call-graph=dwarf cargo run --release -p radium-cli -- status
perf report
```
## Common Bottlenecks

### File System Operations

**Problem:** Slow file I/O operations

**Solutions:**

- Use async file operations (`tokio::fs`)
- Batch file reads/writes
- Cache file metadata
- Avoid unnecessary `canonicalize()` calls
### String Allocations

**Problem:** Excessive string allocations

**Solutions:**

- Use string slices where possible
- Reuse string buffers
- Use `Cow<str>` for conditional ownership
- Minimize `format!()` calls in hot paths
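The `Cow<str>` pattern can be sketched with a made-up helper: the common case borrows the input and allocates nothing, while the rare case pays for one allocation.

```rust
use std::borrow::Cow;

/// Returns the input unchanged (borrowed) unless it needs rewriting,
/// in which case a new String is allocated. Illustrative example only.
fn sanitize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        // Slow path: one allocation for the rewritten string.
        Cow::Owned(input.replace(' ', "_"))
    } else {
        // Hot path: no allocation at all.
        Cow::Borrowed(input)
    }
}

fn main() {
    println!("{}", sanitize("agent_list")); // borrowed, zero allocations
    println!("{}", sanitize("agent list")); // owned, one allocation
}
```

Callers that only read the result never need to care which variant they got, since `Cow<str>` derefs to `&str`.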
### JSON Serialization

**Problem:** Slow JSON output generation

**Solutions:**

- Use streaming serialization for large datasets
- Avoid pretty printing in production
- Cache serialized structures where possible
- Use `serde_json::to_writer()` for direct output
### Workspace Discovery

**Problem:** Slow workspace discovery in deep trees

**Solutions:**

- Cache the workspace path
- Limit search depth
- Use faster path operations
- Store a workspace marker in parent directories
## Performance Guidelines for New Commands

When implementing new commands:

1. **Measure First**: Establish baseline performance
2. **Profile Hot Paths**: Identify bottlenecks
3. **Optimize Incrementally**: Make small, measurable improvements
4. **Test Regressions**: Ensure optimizations don't break functionality
5. **Document Targets**: Set and document performance targets
### Example Performance Test

```rust
#[test]
fn test_command_performance() {
    let start = std::time::Instant::now();
    // ... execute command
    let duration = start.elapsed();
    assert!(
        duration.as_millis() < 1000,
        "Command took {}ms, target is <1000ms",
        duration.as_millis()
    );
}
```
## Memory Usage

### Targets

- Command startup: <10MB
- Typical command execution: <50MB
- Large operations (plan generation): <200MB
### Monitoring

Use tools like `valgrind` or `heaptrack` to monitor memory usage:

```bash
# Using valgrind
valgrind --tool=massif cargo run --release -p radium-cli -- status

# Using heaptrack (Linux)
heaptrack cargo run --release -p radium-cli -- status
```
## Best Practices

- **Lazy Loading**: Load resources only when needed
- **Caching**: Cache expensive computations and file reads
- **Batching**: Batch file operations where possible
- **Async I/O**: Use async file operations for better concurrency
- **Minimize Allocations**: Reuse buffers and avoid unnecessary allocations
- **Profile Regularly**: Run benchmarks and profiles regularly
- **Set Targets**: Define and track performance targets
- **Test Regressions**: Add performance tests to prevent regressions
## Troubleshooting Slow Commands

1. **Profile the command**: Use `cargo flamegraph` or `perf`
2. **Check file I/O**: Look for excessive file operations
3. **Check network calls**: Verify no unexpected network requests
4. **Check dependencies**: Ensure no slow dependencies
5. **Review algorithms**: Look for inefficient algorithms
6. **Check memory**: Verify no memory leaks or excessive allocations
## Performance Checklist

When reviewing code for performance:

- [ ] File operations use async I/O where appropriate
- [ ] Workspace discovery is cached
- [ ] JSON output is optimized (no unnecessary pretty printing)
- [ ] String allocations are minimized
- [ ] Expensive operations are lazy-loaded
- [ ] Performance tests exist for critical paths
- [ ] Benchmarks are up to date
- [ ] No obvious bottlenecks in hot paths