Persona System User Guide
The Persona System provides intelligent model selection, cost optimization, and automatic fallback chains for Radium agents. This guide explains how to use persona metadata to enhance your agents.
Table of Contents
- Overview
- Quick Start
- Persona Configuration
- Performance Profiles
- Model Recommendations
- Cost Estimation
- Budget Management
- CLI Commands
- Troubleshooting
Overview
What is the Persona System?
The Persona System extends agent configuration with metadata that enables:
- Intelligent Model Selection: Automatically choose the best model based on task requirements
- Cost Optimization: Track and estimate costs for agent executions
- Fallback Chains: Gracefully handle model unavailability with automatic fallbacks
- Performance Profiles: Match model capabilities to task complexity
Benefits
- Automatic Model Selection: No need to manually specify models for each execution
- Cost Transparency: Understand and control AI model costs
- Reliability: Automatic fallback when primary models are unavailable
- Optimization: Match model performance to task requirements
Quick Start
Adding Persona to an Agent
The easiest way to add persona metadata is when creating a new agent:
rad agents create my-agent --with-persona
This generates an agent configuration with a persona template that you can customize.
Adding Persona to Existing Agents
Edit your agent's TOML configuration file and add an [agent.persona] section:
[agent]
id = "my-agent"
name = "My Agent"
description = "Does something useful"
prompt_path = "prompts/agents/my-category/my-agent.md"
[agent.persona]
[agent.persona.models]
primary = "gemini-2.0-flash-exp"
fallback = "gemini-2.0-flash-thinking"
premium = "gemini-1.5-pro"
[agent.persona.performance]
profile = "balanced"
estimated_tokens = 1500
Persona Configuration
TOML Format
Persona configuration is added to your agent's TOML file under the [agent.persona] section:
[agent.persona]
[agent.persona.models]
primary = "gemini-2.0-flash-exp"
fallback = "gemini-2.0-flash-thinking" # Optional
premium = "gemini-1.5-pro" # Optional
[agent.persona.performance]
profile = "balanced" # speed, balanced, thinking, or expert
estimated_tokens = 1500 # Optional
Model Format
Models can be specified in two formats:
- Simple format (uses the agent's engine):
  primary = "gemini-2.0-flash-exp"
- Full format (explicit engine):
  primary = "gemini:gemini-2.0-flash-exp"
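The two formats can be told apart by the presence of a colon. The sketch below is only an illustration of that rule, not Radium's actual parser: the ModelRef type and the parse_model_spec function are hypothetical names, and splitting on the first colon is an assumption based on the examples above.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical container for a resolved engine/model pair."""
    engine: str
    model: str


def parse_model_spec(spec: str, default_engine: str) -> ModelRef:
    """Resolve a model string in either the simple or the full format.

    Assumption: the full format is "engine:model", split on the first colon.
    The simple format has no colon and inherits the agent's engine.
    """
    engine, sep, model = spec.partition(":")
    if not sep:
        # Simple format: the whole string is the model name.
        engine, model = default_engine, spec
    if not engine or not model:
        raise ValueError(f"invalid model spec: {spec!r}")
    return ModelRef(engine, model)


simple = parse_model_spec("gemini-2.0-flash-exp", default_engine="gemini")
full = parse_model_spec("gemini:gemini-2.0-flash-exp", default_engine="other-engine")
```

Under these assumptions both calls resolve to the same engine and model, because the explicit engine in the full format takes precedence over the agent's default.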
Required Fields
- primary: The primary recommended model (required)
Optional Fields
- fallback: Model to use if primary is unavailable
- premium: Premium model for critical tasks
- profile: Performance profile (defaults to "balanced")
- estimated_tokens: Estimated token usage per execution
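The required and optional rules above can be summarized in a few lines of code. This is a minimal sketch, assuming the [agent.persona] table has already been parsed into a plain dictionary; resolve_persona and VALID_PROFILES are hypothetical names, and Radium's real validation may differ.

```python
VALID_PROFILES = ("speed", "balanced", "thinking", "expert")


def resolve_persona(persona: dict) -> dict:
    """Apply the field rules described above to a parsed [agent.persona] table.

    Hypothetical helper: 'primary' is required, everything else is optional,
    and 'profile' defaults to "balanced" when omitted.
    """
    models = persona.get("models", {})
    if "primary" not in models:
        raise ValueError("agent.persona.models.primary is required")

    performance = persona.get("performance", {})
    profile = performance.get("profile", "balanced")
    if profile not in VALID_PROFILES:
        raise ValueError(f"invalid performance profile: {profile!r}")

    return {
        "primary": models["primary"],
        "fallback": models.get("fallback"),    # None when not configured
        "premium": models.get("premium"),      # None when not configured
        "profile": profile,
        "estimated_tokens": performance.get("estimated_tokens"),
    }


# A persona that sets only the required field.
minimal = resolve_persona({"models": {"primary": "gemini-2.0-flash-exp"}})
```

Here minimal["profile"] resolves to "balanced" and the optional model fields are left as None.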
Performance Profiles
Performance profiles help match model capabilities to task requirements:
Speed
Optimized for fast responses and lower costs. Best for:
- Simple tasks
- High-volume operations
- Cost-sensitive applications
Example:
[agent.persona.performance]
profile = "speed"
Balanced
Balances speed and quality. Best for:
- General-purpose tasks
- Code generation
- Documentation
Example:
[agent.persona.performance]
profile = "balanced"
Thinking
Optimized for deep reasoning. Best for:
- Complex problem-solving
- Architecture design
- Planning and analysis
Example:
[agent.persona.performance]
profile = "thinking"
Expert
Expert-level reasoning, highest cost. Best for:
- Critical decisions
- Complex analysis
- Premium features
Example:
[agent.persona.performance]
profile = "expert"
Model Recommendations
Choosing Models
When selecting models for your persona configuration:
- Primary Model: Choose based on performance profile
  - Speed: Fast models (e.g., gemini-2.0-flash-exp)
  - Balanced: General models (e.g., gemini-2.0-flash-exp)
  - Thinking: Reasoning models (e.g., gemini-2.0-flash-thinking)
  - Expert: Premium models (e.g., gemini-1.5-pro)
- Fallback Model: Choose a reliable alternative
  - Should be available when the primary might not be
  - Can be a different performance tier
- Premium Model: Choose for critical tasks
  - Highest quality option
  - Used when explicitly requested or when primary and fallback are unavailable
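The way the three models work together can be pictured as an ordered chain that is walked until an available model is found. The sketch below assumes the order primary, fallback, premium, which follows from the description above; build_chain, select_model, and the is_available callback are hypothetical names, not Radium's API.

```python
from typing import Callable, List, Optional


def build_chain(primary: str,
                fallback: Optional[str] = None,
                premium: Optional[str] = None) -> List[str]:
    """Return the configured models in the order they would be tried.

    Assumed order: primary, then fallback, then premium. Unset entries are skipped.
    """
    return [m for m in (primary, fallback, premium) if m]


def select_model(chain: List[str],
                 is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first model in the chain that reports as available, else None."""
    for model in chain:
        if is_available(model):
            return model
    return None


chain = build_chain(
    primary="gemini-2.0-flash-thinking",
    fallback="gemini-2.0-flash-exp",
    premium="gemini-1.5-pro",
)

# Simulate an outage of the primary model (illustrative only).
unavailable = {"gemini-2.0-flash-thinking"}
chosen = select_model(chain, lambda m: m not in unavailable)
```

With the primary marked unavailable, chosen is the fallback model; if every model in the chain is unavailable, select_model returns None and the caller has to decide how to report the failure.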
Example Configurations
Speed-Optimized Agent:
[agent.persona]
[agent.persona.models]
primary = "gemini-2.0-flash-exp"
fallback = "gemini-2.0-flash-thinking"
premium = "gemini-1.5-pro"
[agent.persona.performance]
profile = "speed"
estimated_tokens = 1000
Thinking Agent:
[agent.persona]
[agent.persona.models]
primary = "gemini-2.0-flash-thinking"
fallback = "gemini-2.0-flash-exp"
premium = "gemini-1.5-pro"
[agent.persona.performance]
profile = "thinking"
estimated_tokens = 2000
Cost Estimation
Understanding Costs
Costs are calculated based on:
- Input Tokens: Tokens in the prompt
- Output Tokens: Tokens in the response
- Model Pricing: Per-token pricing from the model provider
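As a rough illustration of how these three inputs combine, the sketch below multiplies each token count by its price and sums the results. The Pricing type, the estimate_cost function, the per-million-token unit, and the sample prices are all assumptions made for this example; they are not Radium's implementation or any provider's real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical price table entry, in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimate the USD cost of one execution from token counts and prices."""
    input_cost = input_tokens * pricing.input_per_million / 1_000_000
    output_cost = output_tokens * pricing.output_per_million / 1_000_000
    return input_cost + output_cost


# Placeholder prices chosen only to make the arithmetic easy to follow.
example_pricing = Pricing(input_per_million=0.10, output_per_million=0.40)
cost = estimate_cost(input_tokens=2000, output_tokens=1000, pricing=example_pricing)
print(f"estimated cost: ${cost:.6f}")
```

Input and output tokens are priced separately because providers commonly charge different rates for each, which is why the cost command accepts both counts.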
Viewing Cost Estimates
Use the rad agents cost command to see cost estimates:
rad agents cost my-agent
You can also specify expected token counts:
rad agents cost my-agent --input-tokens 2000 --output-tokens 1000
Cost Breakdown
The cost command shows:
- Estimated costs for primary, fallback, and premium models
- Token estimates (input, output, total)
- Cost per model in the fallback chain
Budget Management
Setting a Budget
Set a budget limit to track spending:
rad budget set 100.00
This sets a budget of $100.00 USD.
Viewing Budget Status
Check your current budget usage:
rad budget status
This shows:
- Budget limit
- Amount spent
- Remaining budget
- Usage percentage
- Status (active, warning, exceeded)
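All of the fields listed above can be derived from the limit and the amount spent. The sketch below shows one way to do that; budget_status is a hypothetical function, and both the 80% warning threshold and the rule that "exceeded" means spending strictly above the limit are assumptions, since this guide does not state the thresholds Radium actually uses.

```python
def budget_status(limit: float, spent: float, warning_ratio: float = 0.8) -> dict:
    """Derive the reported budget fields from a limit and an amount spent.

    Assumptions: 'warning' starts at warning_ratio of the limit (80% by default)
    and 'exceeded' means spending is strictly greater than the limit.
    """
    remaining = limit - spent
    usage_percent = (spent / limit * 100) if limit > 0 else 0.0

    if spent > limit:
        status = "exceeded"
    elif spent >= limit * warning_ratio:
        status = "warning"
    else:
        status = "active"

    return {
        "limit": limit,
        "spent": spent,
        "remaining": remaining,
        "usage_percent": usage_percent,
        "status": status,
    }


report = budget_status(limit=100.00, spent=25.00)
```

For a $100.00 limit with $25.00 spent, the report shows $75.00 remaining, 25% usage, and an "active" status under these assumptions.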
Resetting Budget
Reset budget tracking (keeps the limit):
rad budget reset
CLI Commands
View Agent Persona
Show persona configuration for an agent:
rad agents persona <agent-id>
View Agent Info (with Persona)
Show full agent information including persona:
rad agents info <agent-id>
List Agents by Profile
Filter agents by performance profile:
rad agents list --profile thinking
Valid profiles: speed, balanced, thinking, expert
Estimate Costs
Show cost estimates for an agent:
rad agents cost <agent-id>
rad agents cost <agent-id> --input-tokens 2000 --output-tokens 1000
Budget Commands
rad budget set <amount> # Set budget limit
rad budget status # Show budget status
rad budget reset # Reset budget tracking
Troubleshooting
"No persona configuration found"
Problem: Agent doesn't have persona metadata.
Solution: Add persona configuration to the agent's TOML file or use rad agents create --with-persona when creating new agents.
"Model not available"
Problem: Selected model is unavailable.
Solution: The system automatically tries the fallback models. Make sure fallback (and, optionally, premium) is set under [agent.persona.models] and names a model that is available.
"Invalid performance profile"
Problem: Profile value is not recognized.
Solution: Use one of: speed, balanced, thinking, or expert.
Cost Estimates Seem Incorrect
Problem: Cost estimates don't match actual costs.
Solution:
- Check that estimated_tokens is set correctly
- Verify model pricing is up to date
- Use the --input-tokens and --output-tokens flags for more accurate estimates
Budget Not Tracking
Problem: Budget status shows no spending.
Solution: Budget tracking is currently in-memory, so recorded spending is not persisted. Future versions will add persistent tracking.
Best Practices
- Set Appropriate Profiles: Match performance profiles to task complexity
- Configure Fallbacks: Always set a fallback model for reliability
- Estimate Tokens: Set estimated_tokens for accurate cost estimates
- Use Budgets: Set budget limits to control costs
- Test Fallback Chains: Verify fallback models work correctly
Examples
See the core agents for examples:
- agents/core/arch-agent.toml - Thinking profile
- agents/core/plan-agent.toml - Thinking profile
- agents/core/code-agent.toml - Balanced profile
- agents/core/review-agent.toml - Balanced profile
- agents/core/doc-agent.toml - Speed profile
Further Reading
- Agent Creation Guide - Complete guide to creating agents
- Persona System Architecture - Technical architecture details