Titan Model¶

The TitanModel class provides an implementation for Amazon's Titan language models. It handles request formatting, response processing, and error handling specific to the Titan model family.

Class Definition¶

`bedrock_swarm.models.titan.TitanModel(model_id: str)` ¶

Bases: BedrockModel

Implementation for Amazon Titan models.

Source code in src/bedrock_swarm/models/base.py

def __init__(self, model_id: str):
    """Initialize the model.

    Args:
        model_id: The Bedrock model ID to use
    """
    self._model_id = model_id
    self._config: Dict[str, Any] = {
        "max_tokens": 4096,  # Default maximum tokens
        "default_tokens": 2048,  # Default response length
    }

Functions¶

`format_request(message: str, system: Optional[str] = None, temperature: float = 0.7, max_tokens: Optional[int] = None) -> Dict[str, Any]` ¶

Format a request for Titan.

PARAMETER	DESCRIPTION
`message`	The message to send to the model TYPE: `str`
`system`	Optional system prompt TYPE: `Optional[str]` DEFAULT: `None`
`temperature`	Temperature for response generation (0.0 to 1.0) TYPE: `float` DEFAULT: `0.7`
`max_tokens`	Maximum number of tokens to generate TYPE: `Optional[int]` DEFAULT: `None`

RETURNS	DESCRIPTION
`Dict[str, Any]`	Formatted request dictionary

RAISES	DESCRIPTION
`ValueError`	If max_tokens exceeds the model's limit or temperature is invalid

Source code in src/bedrock_swarm/models/titan.py

def format_request(
    self,
    message: str,
    system: Optional[str] = None,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Format a request for Titan.

    Args:
        message: The message to send to the model
        system: Optional system prompt
        temperature: Temperature for response generation (0.0 to 1.0)
        max_tokens: Maximum number of tokens to generate

    Returns:
        Formatted request dictionary

    Raises:
        ValueError: If max_tokens exceeds the model's limit or temperature is invalid
    """
    # Validate temperature
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0.0 and 1.0")

    # Combine system prompt and message if provided
    prompt = f"{system}\n\n{message}" if system and system.strip() else message

    # Validate token count
    token_count = self.validate_token_count(max_tokens)

    request = {
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": 1,
            "maxTokenCount": token_count,
            "stopSequences": [],
        },
    }
    return request
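
The prompt combination and temperature check above can be tried without the library installed. The following is a minimal standalone sketch, not the library's code: `build_titan_request` and its `max_token_count` parameter are hypothetical names, and token validation is left out because `validate_token_count` is not shown in this section.

```python
from typing import Any, Dict, Optional


def build_titan_request(
    message: str,
    system: Optional[str] = None,
    temperature: float = 0.7,
    max_token_count: int = 2048,  # hypothetical parameter; no limit check here
) -> Dict[str, Any]:
    """Standalone sketch that mirrors the request shape built by format_request."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0.0 and 1.0")

    # A missing, empty, or whitespace-only system prompt is ignored.
    prompt = f"{system}\n\n{message}" if system and system.strip() else message

    return {
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": 1,
            "maxTokenCount": max_token_count,
            "stopSequences": [],
        },
    }


with_system = build_titan_request("What is ML?", system="Be brief.")
without_system = build_titan_request("What is ML?", system="   ")
```

Here `with_system["inputText"]` holds the system prompt and the message separated by a blank line, while `without_system["inputText"]` is the message alone because the system prompt contains only whitespace.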

`_extract_content(response: Dict[str, Any]) -> str` ¶

Extract content from Titan response.

PARAMETER	DESCRIPTION
`response`	Raw response from Titan TYPE: `Dict[str, Any]`

RETURNS	DESCRIPTION
`str`	Extracted content as string

RAISES	DESCRIPTION
`ResponseParsingError`	If content cannot be extracted

Source code in src/bedrock_swarm/models/titan.py

def _extract_content(self, response: Dict[str, Any]) -> str:
    """Extract content from Titan response.

    Args:
        response: Raw response from Titan

    Returns:
        Extracted content as string

    Raises:
        ResponseParsingError: If content cannot be extracted
    """
    content = []
    logger.debug("Processing response: %s", response)

    for event in response["body"]:
        try:
            chunk = json.loads(event.get("chunk", {}).get("bytes", b"{}").decode())
            logger.debug("Processing chunk: %s", chunk)
            if "outputText" in chunk:
                content.append(chunk["outputText"])
        except json.JSONDecodeError as e:
            raise ResponseParsingError(f"Error parsing chunk: {str(e)}")
        except (KeyError, AttributeError) as e:
            raise ResponseParsingError(f"Invalid chunk format: {str(e)}")

    # Join and clean up the content
    return " ".join(part.strip() for part in content).strip()

`invoke(message: str, **kwargs: Any) -> Dict[str, Any]` ¶

Invoke the model with a message.

PARAMETER	DESCRIPTION
`message`	The message to send to the model TYPE: `str`
`**kwargs`	Additional arguments to pass to format_request TYPE: `Any` DEFAULT: `{}`

RETURNS	DESCRIPTION
`Dict[str, Any]`	Model response

RAISES	DESCRIPTION
`ModelInvokeError`	If there is an error invoking the model

Source code in src/bedrock_swarm/models/titan.py

def invoke(self, message: str, **kwargs: Any) -> Dict[str, Any]:
    """Invoke the model with a message.

    Args:
        message: The message to send to the model
        **kwargs: Additional arguments to pass to format_request

    Returns:
        Model response

    Raises:
        ModelInvokeError: If there is an error invoking the model
    """
    try:
        request = self.format_request(message, **kwargs)
        response = self.client.invoke_model_with_response_stream(
            modelId=self.get_model_id(),
            body=json.dumps(request).encode(),
        )
        return self.process_response(response)
    except Exception as e:
        raise ModelInvokeError(f"Error invoking model: {str(e)}")

Model Variants¶

The Titan model family includes several variants with different capabilities and token limits:

- Titan Text Express
    - Model ID: `amazon.titan-text-express-v1`
    - Max Tokens: 8,000
    - Default Tokens: 2,048
    - Best for: General text generation and completion
- Titan Text Lite
    - Model ID: `amazon.titan-text-lite-v1`
    - Max Tokens: 4,000
    - Default Tokens: 2,048
    - Best for: Lightweight text generation tasks
- Titan Text Premier
    - Model ID: `amazon.titan-text-premier-v1:0`
    - Max Tokens: 3,072
    - Default Tokens: 2,048
    - Best for: High-quality text generation
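
As an illustration of how per-variant limits like these might be enforced, here is a minimal sketch. It assumes that a missing `max_tokens` falls back to the variant's default and that non-positive values are rejected; `TITAN_LIMITS` and `resolve_token_count` are hypothetical names and do not reproduce the library's `validate_token_count`.

```python
from typing import Dict, Optional

# Limits copied from the variant list above.
TITAN_LIMITS: Dict[str, Dict[str, int]] = {
    "amazon.titan-text-express-v1": {"max_tokens": 8000, "default_tokens": 2048},
    "amazon.titan-text-lite-v1": {"max_tokens": 4000, "default_tokens": 2048},
    "amazon.titan-text-premier-v1:0": {"max_tokens": 3072, "default_tokens": 2048},
}


def resolve_token_count(model_id: str, max_tokens: Optional[int] = None) -> int:
    """Return a token count that respects the variant's limit (hypothetical helper)."""
    limits = TITAN_LIMITS[model_id]
    if max_tokens is None:
        # Assumption: fall back to the variant's default response length.
        return limits["default_tokens"]
    if max_tokens <= 0:
        # Assumption: non-positive values are treated as invalid.
        raise ValueError("max_tokens must be positive")
    if max_tokens > limits["max_tokens"]:
        raise ValueError(
            f"max_tokens={max_tokens} exceeds the limit of "
            f"{limits['max_tokens']} for {model_id}"
        )
    return max_tokens


default_count = resolve_token_count("amazon.titan-text-express-v1")
explicit_count = resolve_token_count("amazon.titan-text-lite-v1", 1000)
```

Keeping the limits in one table makes the difference between variants explicit: a request for 4,000 tokens passes for Express and Lite but fails for Premier.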

Request Format¶

The Titan model accepts requests in the following format:

{
    "inputText": "Your message here",
    "textGenerationConfig": {
        "temperature": 0.7,  # 0.0 to 1.0
        "topP": 1,
        "maxTokenCount": 2048,
        "stopSequences": [],
    },
}

Parameters¶

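- inputText (str): The prompt sent to the model. The request has no separate system field; when a system prompt is passed to format_request, it is prepended to the message, separated by a blank line.
- temperature (float): Controls randomness in generation (0.0 to 1.0). Set inside textGenerationConfig.
- topP (int): Always set to 1 by format_request.
- maxTokenCount (int): Maximum number of tokens to generate. Derived from the max_tokens argument after validation.
- stopSequences (list): Always set to an empty list by format_request.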

Response Format¶

The model returns responses in a streaming format:

{
    "body": [
        {
            "chunk": {
                "bytes": b'{"outputText": "Model response part 1"}'
            }
        },
        {
            "chunk": {
                "bytes": b'{"outputText": "Model response part 2"}'
            }
        }
    ]
}
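
To make the chunk structure concrete, the sketch below decodes a response shaped like the one above and joins the parts the same way `_extract_content` does. It is a standalone illustration without error handling; `collect_output_text` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List


def collect_output_text(response: Dict[str, Any]) -> List[str]:
    """Decode each chunk and collect its outputText, skipping chunks without one."""
    parts: List[str] = []
    for event in response["body"]:
        raw = event.get("chunk", {}).get("bytes", b"{}")
        chunk = json.loads(raw.decode())
        if "outputText" in chunk:
            parts.append(chunk["outputText"])
    return parts


sample_response = {
    "body": [
        {"chunk": {"bytes": b'{"outputText": "Model response part 1"}'}},
        {"chunk": {"bytes": b'{"outputText": "Model response part 2"}'}},
        {"chunk": {"bytes": b'{"completionReason": "FINISH"}'}},  # no outputText
    ]
}

parts = collect_output_text(sample_response)
# Same joining rule as _extract_content: strip each part, join with one space.
joined = " ".join(part.strip() for part in parts).strip()
```

Because each part is stripped and the parts are joined with a single space, a word split across two chunks would come back with a space inside it. The `completionReason` chunk in the sample is only there to show that chunks without `outputText` are skipped.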

Error Handling¶

The Titan model implementation includes comprehensive error handling:

Token Validation

# Raises ValueError if max_tokens exceeds model's limit
request = model.format_request(message="Test", max_tokens=10000)

Temperature Validation

# Raises ValueError if temperature is outside 0.0-1.0 range
request = model.format_request(message="Test", temperature=1.5)

Response Parsing

# Raises ResponseParsingError for invalid response format
try:
    content = model._extract_content(response)
except ResponseParsingError as e:
    print(f"Failed to parse response: {e}")
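
The two parsing failure modes can be reproduced in isolation. This sketch uses a local stand-in exception class rather than the library's `ResponseParsingError`, and `parse_chunk` is a hypothetical helper that applies the same decoding expression as `_extract_content` to a single event.

```python
import json
from typing import Any, Dict, List


class ResponseParsingError(Exception):
    """Local stand-in for the library exception of the same name."""


def parse_chunk(event: Dict[str, Any]) -> Dict[str, Any]:
    """Decode one stream event, mapping low-level errors to ResponseParsingError."""
    try:
        return json.loads(event.get("chunk", {}).get("bytes", b"{}").decode())
    except json.JSONDecodeError as e:
        raise ResponseParsingError(f"Error parsing chunk: {e}") from e
    except (KeyError, AttributeError) as e:
        raise ResponseParsingError(f"Invalid chunk format: {e}") from e


bad_json = {"chunk": {"bytes": b"{not json"}}  # bytes decode, but JSON is invalid
bad_shape = {"chunk": {"bytes": None}}  # None has no .decode() method

errors: List[str] = []
for event in (bad_json, bad_shape):
    try:
        parse_chunk(event)
    except ResponseParsingError as exc:
        errors.append(str(exc))
```

After the loop, `errors` holds one message per bad event: the first starts with "Error parsing chunk" and the second with "Invalid chunk format". The sketch chains the original exception with `from e` so the underlying cause stays visible in tracebacks.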

Usage Example¶

# Import path for ModelInvokeError is assumed; adjust it if the exceptions live elsewhere
from bedrock_swarm.exceptions import ModelInvokeError
from bedrock_swarm.models.factory import ModelFactory

# Create Titan model instance
model = ModelFactory.create_model("amazon.titan-text-express-v1")

# Format request with system prompt
request = model.format_request(
    message="What is machine learning?",
    system="You are a helpful AI assistant.",
    temperature=0.7,
    max_tokens=1000
)

# Invoke the model. Extra keyword arguments are forwarded to format_request,
# so only its parameters (system, temperature, max_tokens) are accepted here.
try:
    response = model.invoke(
        message="What is machine learning?"
    )
    print(response["content"])
except ModelInvokeError as e:
    print(f"Model invocation failed: {e}")

Implementation Details¶

Request Formatting¶

The format_request method handles:

- System prompt integration
- Temperature validation
- Token count validation
- Request structure formatting

Response Processing¶

The _extract_content method:

- Decodes response chunks
- Extracts output text
- Handles JSON parsing errors
- Validates response format

Retry Logic¶

The model uses exponential backoff for retries:

- Initial delay: 1 second
- Maximum retries: 5
- Handles rate limiting
- Retries on throttling errors
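
The retry code itself is not shown in this section, so the following is only a sketch of exponential backoff using the parameters listed above (1 second initial delay, 5 retries). The doubling factor, the `ThrottledError` class, and the `call_with_backoff` helper are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling or rate-limit error."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ThrottledError with a delay that doubles each time."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delay)
            delay *= 2  # assumption: the delay doubles after every failed attempt
    raise AssertionError("unreachable")


# Demo with a fake sleep so the example finishes instantly.
recorded_delays: List[float] = []
attempts: Dict[str, int] = {"count": 0}


def flaky() -> str:
    """Fail with a throttling error twice, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("rate limited")
    return "ok"


result = call_with_backoff(flaky, sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the helper testable: the demo records the delays instead of waiting for them. With the defaults, a call that never succeeds would wait 1, 2, 4, 8, and 16 seconds before raising.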

Testing¶

The Titan model implementation includes comprehensive tests:

- Request Formatting
    - Temperature validation
    - Token limit validation
    - System prompt integration
- Response Processing
    - Chunk processing
    - Error handling
    - Content extraction
- Error Scenarios
    - Invalid JSON
    - Missing fields
    - API errors
    - Rate limiting
