Model Trait and Provider Abstraction API Reference

Overview​

This document is a complete reference for Radium's model abstraction layer, covering how to implement custom model providers or extend existing ones. The abstraction consists of the Model trait, ModelFactory, and related types defined in the radium-abstraction and radium-models crates.

Model Trait​

The Model trait is the core interface for all AI model implementations in Radium. It provides a unified API for text generation and chat completions.

Trait Definition​

#[async_trait]
pub trait Model: Send + Sync {
    async fn generate_text(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError>;

    async fn generate_chat_completion(
        &self,
        messages: &[ChatMessage],
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError>;

    fn model_id(&self) -> &str;
}
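
Because the trait is declared with #[async_trait] and bounded by Send + Sync, it can also be used as a trait object. A minimal sketch of a helper that accepts any model (the summarize function and its prompt wording are illustrative, not part of the API):

use radium_abstraction::{Model, ModelError, ModelResponse};

// Works with any model behind the trait; the prompt wording is only an example.
async fn summarize(model: &dyn Model, text: &str) -> Result<ModelResponse, ModelError> {
    model
        .generate_text(&format!("Summarize the following text:\n{}", text), None)
        .await
}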

Method: generate_text​

Generates a text completion based on a single prompt string.

Parameters:

  • prompt: &str - The input prompt for text generation
  • parameters: Option<ModelParameters> - Optional parameters to control generation (temperature, max_tokens, etc.)

Returns:

  • Result<ModelResponse, ModelError> - The generated response or an error

Example:

use radium_abstraction::{Model, ModelParameters};

let response = model.generate_text(
    "Write a haiku about programming",
    Some(ModelParameters {
        temperature: Some(0.7),
        max_tokens: Some(100),
        ..Default::default()
    })
).await?;

println!("{}", response.content);

Method: generate_chat_completion​

Generates a chat completion based on a conversation history.

Parameters:

  • messages: &[ChatMessage] - The conversation history as a slice of chat messages
  • parameters: Option<ModelParameters> - Optional parameters to control generation

Returns:

  • Result<ModelResponse, ModelError> - The generated response or an error

Example:

use radium_abstraction::{Model, ChatMessage, ModelParameters};

let messages = vec![
    ChatMessage {
        role: "system".to_string(),
        content: "You are a helpful assistant.".to_string(),
    },
    ChatMessage {
        role: "user".to_string(),
        content: "What is Rust?".to_string(),
    },
];

let response = model.generate_chat_completion(&messages, None).await?;
println!("{}", response.content);

Method: model_id​

Returns the identifier of the model instance.

Returns:

  • &str - The model ID (e.g., "llama-3-70b", "gpt-4")

Example:

let id = model.model_id();
println!("Using model: {}", id);

StreamingModel Trait​

The StreamingModel trait enables real-time token-by-token streaming of model responses.

Trait Definition​

#[async_trait]
pub trait StreamingModel: Send + Sync {
    async fn generate_stream(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError>;
}

Method: generate_stream​

Generates a streaming text completion, yielding tokens as they're generated.

Parameters:

  • prompt: &str - The input prompt
  • parameters: Option<ModelParameters> - Optional generation parameters

Returns:

  • Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError> - A stream of tokens or an error

Example:

use radium_abstraction::StreamingModel;
use futures::StreamExt;

let mut stream = model.generate_stream("Tell me a story", None).await?;
while let Some(result) = stream.next().await {
    match result {
        Ok(token) => print!("{}", token),
        Err(e) => eprintln!("Error: {}", e),
    }
}

Data Types​

ChatMessage​

Represents a message in a conversation.

pub struct ChatMessage {
    pub role: String,    // "user", "assistant", or "system"
    pub content: String, // The message content
}

Roles:

  • "system" - System instructions or context
  • "user" - User messages
  • "assistant" - Assistant responses

ModelParameters​

Parameters for controlling model generation.

pub struct ModelParameters {
    pub temperature: Option<f32>,            // 0.0-2.0, higher = more creative
    pub top_p: Option<f32>,                  // 0.0-1.0, nucleus sampling
    pub max_tokens: Option<u32>,             // Maximum tokens to generate
    pub stop_sequences: Option<Vec<String>>, // Stop generation on these sequences
}

Default Values:

  • temperature: 0.7
  • top_p: 1.0
  • max_tokens: 512
  • stop_sequences: None
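
Since the struct can be built with ..Default::default() (as in the generate_text example above), a common pattern is to override only the fields you need:

use radium_abstraction::ModelParameters;

// Override only what matters; the remaining fields keep the defaults listed above.
let params = ModelParameters {
    temperature: Some(0.2),                         // favor deterministic output
    stop_sequences: Some(vec!["\n\n".to_string()]), // stop at the first blank line
    ..Default::default()
};

let response = model.generate_text("Explain borrowing in one paragraph", Some(params)).await?;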

ModelResponse​

The response from a model generation.

pub struct ModelResponse {
    pub content: String,           // The generated text
    pub model_id: Option<String>,  // The model ID used
    pub usage: Option<ModelUsage>, // Token usage statistics
}

ModelUsage​

Token usage statistics for a request.

pub struct ModelUsage {
    pub prompt_tokens: u32,     // Tokens in the prompt
    pub completion_tokens: u32, // Tokens in the completion
    pub total_tokens: u32,      // Total tokens used
}
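
For example, a caller can log usage when the provider returns it (a short sketch built from the fields above):

let response = model.generate_text("Summarize Rust ownership", None).await?;

if let Some(usage) = &response.usage {
    println!(
        "tokens: {} prompt + {} completion = {} total",
        usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
    );
}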

ModelError​

Errors that can occur when interacting with models.

pub enum ModelError {
    RequestError(String),             // Network or request errors
    ModelResponseError(String),       // Model returned an error
    SerializationError(String),       // JSON serialization errors
    UnsupportedModelProvider(String), // Provider not supported
    QuotaExceeded {                   // Rate limit or quota exceeded
        provider: String,
        message: Option<String>,
    },
    Other(String),                    // Other unexpected errors
}
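
Callers can match on the variants to treat quota errors differently from transport failures. A minimal sketch (it assumes ModelError derives Debug for the catch-all arm):

use radium_abstraction::ModelError;

match model.generate_text("Hello", None).await {
    Ok(response) => println!("{}", response.content),
    Err(ModelError::QuotaExceeded { provider, message }) => {
        eprintln!("Quota exceeded on {}: {}", provider, message.unwrap_or_default());
    }
    Err(ModelError::RequestError(msg)) => eprintln!("Request failed: {}", msg),
    Err(other) => eprintln!("Generation failed: {:?}", other),
}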

ModelFactory​

The ModelFactory provides a unified way to create model instances from configuration.

Creating Models​

create​

Creates a model instance from a ModelConfig.

use radium_models::{ModelConfig, ModelFactory, ModelType};

let config = ModelConfig::new(
    ModelType::Universal,
    "llama-3-70b".to_string(),
)
.with_base_url("http://localhost:8000/v1".to_string());

let model = ModelFactory::create(config)?;
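
The returned model is then used through the Model trait like any other implementation (a short sketch, assuming the factory hands back a type that implements Model):

use radium_abstraction::Model;

let response = model.generate_text("Hello from Radium", None).await?;
println!("{}", response.content);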

create_from_str​

Creates a model from a string representation of the model type.

let model = ModelFactory::create_from_str(
    "universal",
    "llama-3-70b".to_string(),
)?;

Supported Model Type Strings:

  • "mock" - Mock model for testing
  • "claude" or "anthropic" - Anthropic Claude
  • "gemini" - Google Gemini
  • "openai" - OpenAI GPT models
  • "universal", "openai-compatible", or "local" - Universal provider
  • "ollama" - Ollama (not yet implemented in factory, use Universal)

create_with_api_key​

Creates a model with an explicit API key.

let model = ModelFactory::create_with_api_key(
    "openai",
    "gpt-4".to_string(),
    "sk-...".to_string(),
)?;

ModelConfig​

Configuration for creating model instances.

pub struct ModelConfig {
    pub model_type: ModelType,
    pub model_id: String,
    pub api_key: Option<String>,
    pub base_url: Option<String>, // Required for Universal models
}

Builder Methods:

  • new(model_type, model_id) - Create a new config
  • with_api_key(api_key) - Set the API key
  • with_base_url(base_url) - Set the base URL (required for Universal)
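
A minimal sketch combining the builder methods (the OPENAI_API_KEY environment variable name is illustrative, not something the config requires):

use radium_models::{ModelConfig, ModelFactory, ModelType};

// Read the key from the environment for illustration; any String works.
let config = ModelConfig::new(ModelType::OpenAI, "gpt-4".to_string())
    .with_api_key(std::env::var("OPENAI_API_KEY").unwrap_or_default());

let model = ModelFactory::create(config)?;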

ModelType​

Enumeration of supported model types.

pub enum ModelType {
    Mock,      // Testing model
    Claude,    // Anthropic Claude
    Gemini,    // Google Gemini
    OpenAI,    // OpenAI GPT
    Universal, // OpenAI-compatible (vLLM, LocalAI, Ollama, etc.)
    Ollama,    // Ollama (factory integration pending)
}

UniversalModel​

The UniversalModel is the primary way to use self-hosted models. It implements the OpenAI Chat Completions API specification.

Constructors​

new​

Creates a UniversalModel, loading API key from environment variables.

use radium_models::UniversalModel;

// Loads API key from UNIVERSAL_API_KEY or OPENAI_COMPATIBLE_API_KEY
let model = UniversalModel::new(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
)?;

Environment Variables:

  • UNIVERSAL_API_KEY (primary)
  • OPENAI_COMPATIBLE_API_KEY (fallback)

with_api_key​

Creates a UniversalModel with an explicit API key.

let model = UniversalModel::with_api_key(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
    "optional-api-key".to_string(),
);

without_auth​

Creates a UniversalModel without authentication (most common for local servers).

let model = UniversalModel::without_auth(
    "llama-3-70b".to_string(),
    "http://localhost:8000/v1".to_string(),
);

Supported Servers​

UniversalModel works with any server implementing the OpenAI Chat Completions API:

  • vLLM: http://localhost:8000/v1
  • LocalAI: http://localhost:8080/v1
  • Ollama: http://localhost:11434/v1 (OpenAI-compatible endpoint)
  • LM Studio: http://localhost:1234/v1
  • Any OpenAI-compatible server

Ollama Implementation Status​

Current State​

A native OllamaModel implementation exists in crates/radium-models/src/ollama.rs, but it is not yet integrated into the ModelFactory. The factory returns an error when attempting to create an Ollama model:

ModelType::Ollama => {
    Err(ModelError::UnsupportedModelProvider(
        "Ollama model type is not yet implemented. Use UniversalModel with base_url 'http://localhost:11434/v1' instead.".to_string(),
    ))
}

Use UniversalModel with Ollama's OpenAI-compatible endpoint:

use radium_models::UniversalModel;

let model = UniversalModel::without_auth(
    "llama3.2".to_string(),
    "http://localhost:11434/v1".to_string(),
);

This works because Ollama provides an OpenAI-compatible API endpoint at /v1/chat/completions.
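
A short end-to-end sketch against a local Ollama server, reusing the ChatMessage type from above:

use radium_abstraction::{ChatMessage, Model};
use radium_models::UniversalModel;

// No authentication needed for a default local Ollama install.
let model = UniversalModel::without_auth(
    "llama3.2".to_string(),
    "http://localhost:11434/v1".to_string(),
);

let messages = vec![ChatMessage {
    role: "user".to_string(),
    content: "Give me one fact about Rust.".to_string(),
}];

let response = model.generate_chat_completion(&messages, None).await?;
println!("{}", response.content);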

Implementing a Custom Provider​

To implement a custom model provider, you need to:

  1. Implement the Model trait:
use async_trait::async_trait;
use radium_abstraction::{Model, ModelError, ModelParameters, ModelResponse, ChatMessage};

#[derive(Debug)]
pub struct CustomModel {
    model_id: String,
    // ... your fields
}

#[async_trait]
impl Model for CustomModel {
    async fn generate_text(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError> {
        // Your implementation
        todo!()
    }

    async fn generate_chat_completion(
        &self,
        messages: &[ChatMessage],
        parameters: Option<ModelParameters>,
    ) -> Result<ModelResponse, ModelError> {
        // Your implementation
        todo!()
    }

    fn model_id(&self) -> &str {
        &self.model_id
    }
}
  2. Optionally implement StreamingModel for streaming support (see the sketch after this list)

  3. Handle errors appropriately using ModelError variants

  4. Return proper ModelResponse with content, model_id, and usage statistics
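
A minimal StreamingModel sketch for the CustomModel above. The single-item stream built with futures::stream::once is only a placeholder until real token streaming is wired up:

use std::pin::Pin;

use async_trait::async_trait;
use futures::Stream;
use radium_abstraction::{Model, ModelError, ModelParameters, StreamingModel};

#[async_trait]
impl StreamingModel for CustomModel {
    async fn generate_stream(
        &self,
        prompt: &str,
        parameters: Option<ModelParameters>,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, ModelError>> + Send>>, ModelError> {
        // Placeholder: reuse the non-streaming path and yield the whole
        // response as one item. Replace with true token-by-token streaming.
        let response = self.generate_text(prompt, parameters).await?;
        let stream = futures::stream::once(async move {
            Ok::<String, ModelError>(response.content)
        });
        Ok(Box::pin(stream))
    }
}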

Reference Implementations​

  • MockModel: crates/radium-models/src/lib.rs - Simple testing implementation
  • OpenAIModel: crates/radium-models/src/openai.rs - HTTP client pattern
  • UniversalModel: crates/radium-models/src/universal.rs - OpenAI-compatible API pattern
  • OllamaModel: crates/radium-models/src/ollama.rs - Custom API pattern

Error Handling​

Common Error Patterns​

Connection Errors:

if e.is_connect() {
    ModelError::RequestError(format!(
        "Server not reachable at {}. Is it running?",
        base_url
    ))
}

API Errors:

if !status.is_success() {
    ModelError::ModelResponseError(format!(
        "API returned error: {}",
        status
    ))
}

Quota/Rate Limits:

if status == 429 {
    ModelError::QuotaExceeded {
        provider: "custom".to_string(),
        message: Some("Rate limit exceeded".to_string()),
    }
}
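
A sketch of a full request flow combining the patterns above, assuming a reqwest client with the json feature (the is_connect() check above implies reqwest); client, base_url, and request_body stand in for the provider's own HTTP client, endpoint, and payload:

use radium_abstraction::ModelError;

// Inside a provider method that returns Result<_, ModelError>.
let response = client
    .post(format!("{}/chat/completions", base_url))
    .json(&request_body)
    .send()
    .await
    .map_err(|e| {
        if e.is_connect() {
            ModelError::RequestError(format!(
                "Server not reachable at {}. Is it running?",
                base_url
            ))
        } else {
            ModelError::RequestError(e.to_string())
        }
    })?;

let status = response.status();
if status == 429 {
    return Err(ModelError::QuotaExceeded {
        provider: "custom".to_string(),
        message: Some("Rate limit exceeded".to_string()),
    });
}
if !status.is_success() {
    return Err(ModelError::ModelResponseError(format!(
        "API returned error: {}",
        status
    )));
}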

Best Practices​

  1. Use UniversalModel for self-hosted models - It's the simplest and most compatible approach
  2. Handle errors gracefully - Provide user-friendly error messages
  3. Include usage statistics - Return ModelUsage when available
  4. Support streaming - Implement StreamingModel for better UX
  5. Test thoroughly - Use MockModel for testing your integration (see the test sketch after this list)
  6. Document your provider - Add setup guides and examples
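
A testing sketch using MockModel; its constructor signature here is an assumption, so check crates/radium-models/src/lib.rs for the real one:

use radium_abstraction::Model;
use radium_models::MockModel;

#[tokio::test]
async fn generates_text_with_the_mock_model() {
    // Hypothetical constructor; adjust to MockModel's actual signature.
    let model = MockModel::new("mock-model".to_string());

    let response = model
        .generate_text("ping", None)
        .await
        .expect("mock generation should succeed");

    assert!(!response.content.is_empty());
    assert_eq!(model.model_id(), "mock-model");
}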