Model Trait and Provider Abstraction API Reference
Overview
This document is a complete reference for Radium's model abstraction layer, covering everything needed to implement a custom model provider or extend an existing one. The abstraction consists of the Model trait, ModelFactory, and related types defined in the radium-abstraction and radium-models crates.
Model Trait
The Model trait is the core interface for all AI model implementations in Radium. It provides a unified API for text generation and chat completions.
Trait Definition
#[async_trait]
pub trait Model: Send + Sync {
    async fn generate_text(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError>;

    async fn generate_chat_completion(
        &self,
        messages: &[ChatMessage],
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError>;

    fn model_id(&self) -> &str;
}
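Because the trait uses async_trait and is bounded by Send + Sync, it can be used as a trait object, which lets application code stay provider-agnostic. The sketch below illustrates that pattern; the summarize helper is hypothetical and not part of the Radium API.
use radium_abstraction::{ChatMessage, Model, ModelError};

// Hypothetical helper: accepts any provider behind `&dyn Model`.
async fn summarize(model: &dyn Model, text: &str) -> Result<String, ModelError> {
    let messages = vec![
        ChatMessage {
            role: "system".to_string(),
            content: "Summarize the user's text in one sentence.".to_string(),
        },
        ChatMessage {
            role: "user".to_string(),
            content: text.to_string(),
        },
    ];
    let response = model.generate_chat_completion(&messages, None).await?;
    Ok(response.content)
}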
Method: generate_text
Generates a text completion based on a single prompt string.
Parameters:
- prompt: &str - The input prompt for text generation
- parameters: Option<ModelParameters> - Optional parameters to control generation (temperature, max_tokens, etc.)
Returns:
- Result<ModelResponse, ModelError> - The generated response or an error
Example:
use radium_abstraction::{Model, ModelParameters};
let response = model.generate_text(
    "Write a haiku about programming",
    Some(ModelParameters {
        temperature: Some(0.7),
        max_tokens: Some(100),
        ..Default::default()
    })
).await?;
println!("{}", response.content);
Method: generate_chat_completion
Generates a chat completion based on a conversation history.
Parameters:
- messages: &[ChatMessage] - The conversation history as a slice of chat messages
- parameters: Option<ModelParameters> - Optional parameters to control generation
Returns:
- Result<ModelResponse, ModelError> - The generated response or an error
Example:
use radium_abstraction::{Model, ChatMessage, ModelParameters};
let messages = vec![
    ChatMessage {
        role: "system".to_string(),
        content: "You are a helpful assistant.".to_string(),
    },
    ChatMessage {
        role: "user".to_string(),
        content: "What is Rust?".to_string(),
    },
];
let response = model.generate_chat_completion(&messages, None).await?;
println!("{}", response.content);
Method: model_id
Returns the identifier of the model instance.
Returns:
- &str - The model ID (e.g., "llama-3-70b", "gpt-4")
Example:
let id = model.model_id();
println!("Using model: {}", id);
StreamingModel Trait
The StreamingModel trait enables real-time token-by-token streaming of model responses.
Trait Definition
#[async_trait]
pub trait StreamingModel: Send + Sync {
    async fn generate_stream(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError>;
}
Method: generate_stream
Generates a streaming text completion, yielding tokens as they're generated.
Parameters:
- prompt: &str - The input prompt
- parameters: Option<ModelParameters> - Optional generation parameters
Returns:
- Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError> - A stream of tokens or an error
Example:
use radium_abstraction::StreamingModel;
use futures::StreamExt;
let mut stream = model.generate_stream("Tell me a story", None).await?;
while let Some(result) = stream.next().await {
    match result {
        Ok(token) => print!("{}", token),
        Err(e) => eprintln!("Error: {}", e),
    }
}
Data Types
ChatMessage
Represents a message in a conversation.
pub struct ChatMessage {
    pub role: String,    // "user", "assistant", or "system"
    pub content: String, // The message content
}
Roles:
"system"- System instructions or context"user"- User messages"assistant"- Assistant responses
ModelParameters
Parameters for controlling model generation.
pub struct ModelParameters {
    pub temperature: Option<f32>,            // 0.0-2.0, higher = more creative
    pub top_p: Option<f32>,                  // 0.0-1.0, nucleus sampling
    pub max_tokens: Option<u32>,             // Maximum tokens to generate
    pub stop_sequences: Option<Vec<String>>, // Stop generation on these sequences
}
Default Values:
- temperature: 0.7
- top_p: 1.0
- max_tokens: 512
- stop_sequences: None
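Because ModelParameters implements Default (as the ..Default::default() usage in the earlier examples shows), a common pattern is to override only the fields you care about and leave the rest at the defaults listed above. A minimal sketch:
use radium_abstraction::ModelParameters;

// Override temperature and stop sequences; the other fields keep their defaults.
let params = ModelParameters {
    temperature: Some(0.2),
    stop_sequences: Some(vec!["\n\n".to_string()]),
    ..Default::default()
};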
ModelResponse
The response from a model generation.
pub struct ModelResponse {
    pub content: String,           // The generated text
    pub model_id: Option<String>,  // The model ID used
    pub usage: Option<ModelUsage>, // Token usage statistics
}
ModelUsage
Token usage statistics for a request.
pub struct ModelUsage {
    pub prompt_tokens: u32,     // Tokens in the prompt
    pub completion_tokens: u32, // Tokens in the completion
    pub total_tokens: u32,      // Total tokens used
}
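Usage statistics come back attached to a ModelResponse. A small sketch of reading them after a call, where response is the value returned by generate_text or generate_chat_completion (the log format is illustrative):
if let Some(usage) = &response.usage {
    println!(
        "prompt: {} tokens, completion: {} tokens, total: {} tokens",
        usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
    );
}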
ModelError
Errors that can occur when interacting with models.
pub enum ModelError {
    RequestError(String),             // Network or request errors
    ModelResponseError(String),       // Model returned an error
    SerializationError(String),       // JSON serialization errors
    UnsupportedModelProvider(String), // Provider not supported
    QuotaExceeded {                   // Rate limit or quota exceeded
        provider: String,
        message: Option<String>,
    },
    Other(String),                    // Other unexpected errors
}
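Callers typically branch on the variant to decide whether to retry, surface a friendlier message, or fail fast. The describe helper below is an illustrative sketch, not part of the Radium API; if the enum is marked #[non_exhaustive], add a catch-all arm.
use radium_abstraction::ModelError;

// Illustrative: turn each error variant into a human-readable message.
fn describe(err: &ModelError) -> String {
    match err {
        ModelError::RequestError(msg) => format!("network/request problem: {}", msg),
        ModelError::ModelResponseError(msg) => format!("model returned an error: {}", msg),
        ModelError::SerializationError(msg) => format!("serialization failed: {}", msg),
        ModelError::UnsupportedModelProvider(name) => format!("unsupported provider: {}", name),
        ModelError::QuotaExceeded { provider, message } => format!(
            "quota exceeded for {}: {}",
            provider,
            message.as_deref().unwrap_or("no details")
        ),
        ModelError::Other(msg) => format!("unexpected error: {}", msg),
    }
}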
ModelFactory
The ModelFactory provides a unified way to create model instances from configuration.
Creating Models
create
Creates a model instance from a ModelConfig.
use radium_models::{ModelConfig, ModelFactory, ModelType};
let config = ModelConfig::new(
    ModelType::Universal,
    "llama-3-70b".to_string(),
)
.with_base_url("http://localhost:8000/v1".to_string());
let model = ModelFactory::create(config)?;
create_from_str
Creates a model from a string representation of the model type.
let model = ModelFactory::create_from_str(
    "universal",
    "llama-3-70b".to_string(),
)?;
Supported Model Type Strings:
"mock"- Mock model for testing"claude"or"anthropic"- Anthropic Claude"gemini"- Google Gemini"openai"- OpenAI GPT models"universal","openai-compatible", or"local"- Universal provider"ollama"- Ollama (not yet implemented in factory, use Universal)
create_with_api_key
Creates a model with an explicit API key.
let model = ModelFactory::create_with_api_key(
    "openai",
    "gpt-4".to_string(),
    "sk-...".to_string(),
)?;
ModelConfig
Configuration for creating model instances.
pub struct ModelConfig {
    pub model_type: ModelType,
    pub model_id: String,
    pub api_key: Option<String>,
    pub base_url: Option<String>, // Required for Universal models
}
Builder Methods:
- new(model_type, model_id) - Create a new config
- with_api_key(api_key) - Set the API key
- with_base_url(base_url) - Set the base URL (required for Universal)
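Putting the builder methods together, a configuration for an authenticated OpenAI-compatible endpoint might look like the sketch below. The URL and key are placeholders, and it assumes with_api_key chains the same way as with_base_url does in the earlier example:
use radium_models::{ModelConfig, ModelFactory, ModelType};

let config = ModelConfig::new(ModelType::Universal, "llama-3-70b".to_string())
    .with_base_url("http://localhost:8000/v1".to_string())
    .with_api_key("optional-api-key".to_string());
let model = ModelFactory::create(config)?;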
ModelType
Enumeration of supported model types.
pub enum ModelType {
    Mock,      // Testing model
    Claude,    // Anthropic Claude
    Gemini,    // Google Gemini
    OpenAI,    // OpenAI GPT
    Universal, // OpenAI-compatible (vLLM, LocalAI, Ollama, etc.)
    Ollama,    // Ollama (factory integration pending)
}
UniversalModel
The UniversalModel is the primary way to use self-hosted models. It implements the OpenAI Chat Completions API specification.
Constructors
new
Creates a UniversalModel, loading API key from environment variables.
use radium_models::UniversalModel;
// Loads API key from UNIVERSAL_API_KEY or OPENAI_COMPATIBLE_API_KEY
let model = UniversalModel::new(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
)?;
Environment Variables:
- UNIVERSAL_API_KEY (primary)
- OPENAI_COMPATIBLE_API_KEY (fallback)
with_api_key
Creates a UniversalModel with an explicit API key.
let model = UniversalModel::with_api_key(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
    "optional-api-key".to_string(),
);
without_auth
Creates a UniversalModel without authentication (most common for local servers).
let model = UniversalModel::without_auth(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
);
Supported Servers
UniversalModel works with any server implementing the OpenAI Chat Completions API:
- vLLM: http://localhost:8000/v1
- LocalAI: http://localhost:8080/v1
- Ollama: http://localhost:11434/v1 (OpenAI-compatible endpoint)
- LM Studio: http://localhost:1234/v1
- Any OpenAI-compatible server
Ollama Implementation Status
Current State
A native OllamaModel implementation exists in crates/radium-models/src/ollama.rs, but it is not yet integrated into the ModelFactory. The factory returns an error when attempting to create an Ollama model:
ModelType::Ollama => {
    Err(ModelError::UnsupportedModelProvider(
        "Ollama model type is not yet implemented. Use UniversalModel with base_url 'http://localhost:11434/v1' instead.".to_string(),
    ))
}
Recommended Approach
Use UniversalModel with Ollama's OpenAI-compatible endpoint:
use radium_models::UniversalModel;
let model = UniversalModel::without_auth(
    "llama3.2".to_string(),
    "http://localhost:11434/v1".to_string(),
);
This works because Ollama provides an OpenAI-compatible API endpoint at /v1/chat/completions.
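Once created this way, the model is used through the same Model trait as any other provider. A hedged end-to-end sketch (assumes an Ollama server is running locally and the llama3.2 model has been pulled):
use radium_abstraction::{ChatMessage, Model};
use radium_models::UniversalModel;

let model = UniversalModel::without_auth(
    "llama3.2".to_string(),
    "http://localhost:11434/v1".to_string(),
);
let messages = vec![ChatMessage {
    role: "user".to_string(),
    content: "In one sentence, what is Ollama?".to_string(),
}];
let response = model.generate_chat_completion(&messages, None).await?;
println!("{}", response.content);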
Implementing a Custom Provider
To implement a custom model provider, you need to:
- Implement the Model trait:
use async_trait::async_trait;
use radium_abstraction::{Model, ModelError, ModelParameters, ModelResponse, ChatMessage};
#[derive(Debug)]
pub struct CustomModel {
    model_id: String,
    // ... your fields
}

#[async_trait]
impl Model for CustomModel {
    async fn generate_text(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError> {
        // Your implementation
    }

    async fn generate_chat_completion(
        &self,
        messages: &[ChatMessage],
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError> {
        // Your implementation
    }

    fn model_id(&self) -> &str {
        &self.model_id
    }
}
- Optionally implement StreamingModel for streaming support (see the sketch below)
- Handle errors appropriately using ModelError variants
- Return a proper ModelResponse with content, model_id, and usage statistics
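For the optional streaming step, the sketch below implements StreamingModel for the CustomModel above. The word-by-word splitting is illustrative only; a real provider would forward chunks from its streaming HTTP response.
use std::pin::Pin;

use async_trait::async_trait;
use futures::{stream, Stream};
use radium_abstraction::{ModelError, ModelParameters, StreamingModel};

#[async_trait]
impl StreamingModel for CustomModel {
    async fn generate_stream(
        &self,
        prompt: &str,
        _parameters: Option<ModelParameters>,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError> {
        // Illustrative only: echo the prompt back one word at a time.
        let words: Vec<Result<String, ModelError>> = prompt
            .split_whitespace()
            .map(|w| Ok(format!("{} ", w)))
            .collect();
        let boxed: Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>> =
            Box::pin(stream::iter(words));
        Ok(boxed)
    }
}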
Reference Implementations
- MockModel: crates/radium-models/src/lib.rs - Simple testing implementation
- OpenAIModel: crates/radium-models/src/openai.rs - HTTP client pattern
- UniversalModel: crates/radium-models/src/universal.rs - OpenAI-compatible API pattern
- OllamaModel: crates/radium-models/src/ollama.rs - Custom API pattern
Error Handling
Common Error Patterns
Connection Errors:
if e.is_connect() {
    ModelError::RequestError(format!(
        "Server not reachable at {}. Is it running?",
        base_url
    ))
}
API Errors:
if !status.is_success() {
    ModelError::ModelResponseError(format!(
        "API returned error: {}",
        status
    ))
}
Quota/Rate Limits:
if status == 429 {
    ModelError::QuotaExceeded {
        provider: "custom".to_string(),
        message: Some("Rate limit exceeded".to_string()),
    }
}
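These checks usually end up in a single mapping function on the provider side. A hedged sketch tying the fragments above together; status_to_error is illustrative, not part of the Radium API, and should only be called for non-success responses:
use radium_abstraction::ModelError;

// Illustrative: map a non-success HTTP status code to a ModelError.
fn status_to_error(status: u16, provider: &str) -> ModelError {
    if status == 429 {
        ModelError::QuotaExceeded {
            provider: provider.to_string(),
            message: Some("Rate limit exceeded".to_string()),
        }
    } else {
        ModelError::ModelResponseError(format!("API returned error: {}", status))
    }
}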
Best Practices
- Use UniversalModel for self-hosted models - it is the simplest and most compatible approach
- Handle errors gracefully - provide user-friendly error messages
- Include usage statistics - return ModelUsage when available
- Support streaming - implement StreamingModel for better UX
- Test thoroughly - use MockModel for testing your integration (see the test sketch below)
- Document your provider - add setup guides and examples
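For the testing point, here is a hedged sketch of a unit test built around MockModel. The MockModel::new call is an assumption about its constructor; check crates/radium-models/src/lib.rs for the actual signature before copying this.
#[cfg(test)]
mod tests {
    use radium_abstraction::Model;
    use radium_models::MockModel;

    #[tokio::test]
    async fn mock_model_generates_text() {
        // Assumed constructor; adjust to the real MockModel API.
        let model = MockModel::new("mock-model".to_string());

        let response = model
            .generate_text("Hello, world", None)
            .await
            .expect("mock generation should succeed");

        assert!(!response.content.is_empty());
        assert_eq!(model.model_id(), "mock-model");
    }
}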
Related Documentation
- Universal Provider Guide - Detailed UniversalModel usage
- Setup Guides - Provider-specific setup instructions
- Configuration Guide - Agent configuration examples
- Source Code - Model trait definition
- Factory Source Code - ModelFactory implementation