
Titan Model

The TitanModel class provides an implementation for Amazon's Titan language models. It handles request formatting, response processing, and error handling specific to the Titan model family.

Class Definition

bedrock_swarm.models.titan.TitanModel(model_id: str)

Bases: BedrockModel

Implementation for Amazon Titan models.

Source code in src/bedrock_swarm/models/base.py
def __init__(self, model_id: str):
    """Initialize the model.

    Args:
        model_id: The Bedrock model ID to use
    """
    self._model_id = model_id
    self._config: Dict[str, Any] = {
        "max_tokens": 4096,  # Default maximum tokens
        "default_tokens": 2048,  # Default response length
    }

Functions

format_request(message: str, system: Optional[str] = None, temperature: float = 0.7, max_tokens: Optional[int] = None) -> Dict[str, Any]

Format a request for Titan.

PARAMETER DESCRIPTION
message

The message to send to the model

TYPE: str

system

Optional system prompt

TYPE: Optional[str] DEFAULT: None

temperature

Temperature for response generation (0.0 to 1.0)

TYPE: float DEFAULT: 0.7

max_tokens

Maximum number of tokens to generate

TYPE: Optional[int] DEFAULT: None

RETURNS DESCRIPTION
Dict[str, Any]

Formatted request dictionary

RAISES DESCRIPTION
ValueError

If max_tokens exceeds the model's limit or temperature is invalid

Source code in src/bedrock_swarm/models/titan.py
def format_request(
    self,
    message: str,
    system: Optional[str] = None,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Format a request for Titan.

    Args:
        message: The message to send to the model
        system: Optional system prompt
        temperature: Temperature for response generation (0.0 to 1.0)
        max_tokens: Maximum number of tokens to generate

    Returns:
        Formatted request dictionary

    Raises:
        ValueError: If max_tokens exceeds the model's limit or temperature is invalid
    """
    # Validate temperature
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0.0 and 1.0")

    # Combine system prompt and message if provided
    prompt = f"{system}\n\n{message}" if system and system.strip() else message

    # Validate token count
    token_count = self.validate_token_count(max_tokens)

    request = {
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": 1,
            "maxTokenCount": token_count,
            "stopSequences": [],
        },
    }
    return request

_extract_content(response: Dict[str, Any]) -> str

Extract content from Titan response.

PARAMETER DESCRIPTION
response

Raw response from Titan

TYPE: Dict[str, Any]

RETURNS DESCRIPTION
str

Extracted content as string

RAISES DESCRIPTION
ResponseParsingError

If content cannot be extracted

Source code in src/bedrock_swarm/models/titan.py
def _extract_content(self, response: Dict[str, Any]) -> str:
    """Extract content from Titan response.

    Args:
        response: Raw response from Titan

    Returns:
        Extracted content as string

    Raises:
        ResponseParsingError: If content cannot be extracted
    """
    content = []
    logger.debug("Processing response: %s", response)

    for event in response["body"]:
        try:
            chunk = json.loads(event.get("chunk", {}).get("bytes", b"{}").decode())
            logger.debug("Processing chunk: %s", chunk)
            if "outputText" in chunk:
                content.append(chunk["outputText"])
        except json.JSONDecodeError as e:
            raise ResponseParsingError(f"Error parsing chunk: {str(e)}")
        except (KeyError, AttributeError) as e:
            raise ResponseParsingError(f"Invalid chunk format: {str(e)}")

    # Join and clean up the content
    return " ".join(part.strip() for part in content).strip()

invoke(message: str, **kwargs: Any) -> Dict[str, Any]

Invoke the model with a message.

PARAMETER DESCRIPTION
message

The message to send to the model

TYPE: str

**kwargs

Additional arguments to pass to format_request

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Dict[str, Any]

Model response

RAISES DESCRIPTION
ModelInvokeError

If there is an error invoking the model

Source code in src/bedrock_swarm/models/titan.py
def invoke(self, message: str, **kwargs: Any) -> Dict[str, Any]:
    """Invoke the model with a message.

    Args:
        message: The message to send to the model
        **kwargs: Additional arguments to pass to format_request

    Returns:
        Model response

    Raises:
        ModelInvokeError: If there is an error invoking the model
    """
    try:
        request = self.format_request(message, **kwargs)
        response = self.client.invoke_model_with_response_stream(
            modelId=self.get_model_id(),
            body=json.dumps(request).encode(),
        )
        return self.process_response(response)
    except Exception as e:
        raise ModelInvokeError(f"Error invoking model: {str(e)}")

Model Variants

The Titan model family includes several variants with different capabilities and token limits:

  1. Titan Text Express
     • Model ID: amazon.titan-text-express-v1
     • Max Tokens: 8,000
     • Default Tokens: 2,048
     • Best for: General text generation and completion

  2. Titan Text Lite
     • Model ID: amazon.titan-text-lite-v1
     • Max Tokens: 4,000
     • Default Tokens: 2,048
     • Best for: Lightweight text generation tasks

  3. Titan Text Premier
     • Model ID: amazon.titan-text-premier-v1:0
     • Max Tokens: 3,072
     • Default Tokens: 2,048
     • Best for: High-quality text generation
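The per-variant limits can be pictured as a lookup table that token validation consults. The sketch below is illustrative only: `TITAN_LIMITS` and `resolve_token_count` are hypothetical names, and the behavior (fall back to the default when `max_tokens` is omitted, raise `ValueError` above the limit) is an assumption based on the documented `format_request` contract, not the library's actual `validate_token_count` source.

```python
from typing import Dict, Optional

# Limits copied from the variant list above; the table name is hypothetical.
TITAN_LIMITS: Dict[str, Dict[str, int]] = {
    "amazon.titan-text-express-v1": {"max_tokens": 8000, "default_tokens": 2048},
    "amazon.titan-text-lite-v1": {"max_tokens": 4000, "default_tokens": 2048},
    "amazon.titan-text-premier-v1:0": {"max_tokens": 3072, "default_tokens": 2048},
}


def resolve_token_count(model_id: str, max_tokens: Optional[int] = None) -> int:
    """Return the token count to request (assumed behavior, see lead-in)."""
    limits = TITAN_LIMITS[model_id]
    if max_tokens is None:
        # No explicit value: use the variant's default response length.
        return limits["default_tokens"]
    if max_tokens > limits["max_tokens"]:
        raise ValueError(
            f"max_tokens={max_tokens} exceeds the limit of "
            f"{limits['max_tokens']} for {model_id}"
        )
    return max_tokens


print(resolve_token_count("amazon.titan-text-lite-v1"))           # 2048
print(resolve_token_count("amazon.titan-text-express-v1", 1000))  # 1000
```

Under these assumptions, a request for 4,000 tokens would be accepted by Express and Lite but rejected by Premier.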

Request Format

The Titan model accepts requests in the following format:

{
    "inputText": "Your message here",
    "textGenerationConfig": {
        "temperature": 0.7,  # 0.0 to 1.0
        "topP": 1,
        "maxTokenCount": 2048,
        "stopSequences": [],
    },
}

Parameters

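  • inputText (str): The prompt sent to the model. If a system prompt is passed to format_request, it is prepended to the message, separated by a blank line
  • temperature (float): Controls randomness in generation (0.0 to 1.0)
  • topP (number): Nucleus sampling threshold; format_request always sets it to 1
  • maxTokenCount (int): Maximum number of tokens to generate
  • stopSequences (list): Sequences that stop generation; format_request always sends an empty list

Note that system (Optional[str]) is an argument of format_request, not a field of the request body.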
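As a rough illustration of how these fields are assembled, the following standalone sketch mirrors the logic of `format_request` shown earlier. `build_titan_request` is a hypothetical helper, and token validation is left out so the example has no dependency on the library.

```python
from typing import Any, Dict, Optional


def build_titan_request(
    message: str,
    system: Optional[str] = None,
    temperature: float = 0.7,
    max_token_count: int = 2048,
) -> Dict[str, Any]:
    """Hypothetical helper that mirrors the structure built by format_request."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be between 0.0 and 1.0")
    # The request body has no system field, so a non-blank system prompt
    # is prepended to the message with a blank line in between.
    prompt = f"{system}\n\n{message}" if system and system.strip() else message
    return {
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,
            "topP": 1,
            "maxTokenCount": max_token_count,
            "stopSequences": [],
        },
    }


request = build_titan_request(
    "What is machine learning?",
    system="You are a helpful AI assistant.",
)
print(request["inputText"])
```

A system prompt that is empty or only whitespace is ignored, so `inputText` is then just the message.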

Response Format

Because invoke calls invoke_model_with_response_stream, the response body is a stream of events. Each event carries a JSON-encoded chunk as bytes:

{
    "body": [
        {
            "chunk": {
                "bytes": b'{"outputText": "Model response part 1"}'
            }
        },
        {
            "chunk": {
                "bytes": b'{"outputText": "Model response part 2"}'
            }
        }
    ]
}
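To make the decoding step concrete, here is a minimal sketch that applies the same chunk-joining logic as `_extract_content` to the sample response above. `join_titan_chunks` is a hypothetical standalone function; the library's error wrapping and logging are omitted.

```python
import json
from typing import Any, Dict, List


def join_titan_chunks(response: Dict[str, Any]) -> str:
    """Hypothetical standalone version of the chunk-joining logic."""
    parts: List[str] = []
    for event in response["body"]:
        # Events without a chunk fall back to an empty JSON object.
        raw = event.get("chunk", {}).get("bytes", b"{}")
        chunk = json.loads(raw.decode())
        if "outputText" in chunk:
            parts.append(chunk["outputText"])
    # Each part is stripped, then parts are joined with a single space.
    return " ".join(part.strip() for part in parts).strip()


sample = {
    "body": [
        {"chunk": {"bytes": b'{"outputText": "Model response part 1"}'}},
        {"chunk": {"bytes": b'{"outputText": "Model response part 2"}'}},
    ]
}
print(join_titan_chunks(sample))  # Model response part 1 Model response part 2
```

Since every part is stripped and then joined with one space, whitespace at chunk boundaries is normalized; a chunk that ends in the middle of a word would gain a space at that point.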

Error Handling

The Titan model implementation validates inputs before a request is built and raises typed exceptions when a response cannot be parsed:

  1. Token Validation

    # Raises ValueError if max_tokens exceeds model's limit
    request = model.format_request(message="Test", max_tokens=10000)
    

  2. Temperature Validation

    # Raises ValueError if temperature is outside 0.0-1.0 range
    request = model.format_request(message="Test", temperature=1.5)
    

  3. Response Parsing

    # Raises ResponseParsingError for invalid response format
    try:
        content = model._extract_content(response)
    except ResponseParsingError as e:
        print(f"Failed to parse response: {e}")
    

Usage Example

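from bedrock_swarm.models.factory import ModelFactory
# Assumed import path for the exception; adjust it to your installed version
from bedrock_swarm.exceptions import ModelInvokeError

# Create Titan model instance
model = ModelFactory.create_model("amazon.titan-text-express-v1")

# Format a request with a system prompt (no API call is made here)
request = model.format_request(
    message="What is machine learning?",
    system="You are a helpful AI assistant.",
    temperature=0.7,
    max_tokens=1000
)

# Invoke the model. invoke uses the model's own client (self.client),
# and extra keyword arguments are forwarded to format_request.
try:
    response = model.invoke(
        message="What is machine learning?",
        system="You are a helpful AI assistant.",
        temperature=0.7,
        max_tokens=1000
    )
    print(response["content"])
except ModelInvokeError as e:
    print(f"Model invocation failed: {e}")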

Implementation Details

Request Formatting

The format_request method handles:

  • System prompt integration
  • Temperature validation
  • Token count validation
  • Request structure formatting

Response Processing

The _extract_content method:

  • Decodes response chunks
  • Extracts output text
  • Handles JSON parsing errors
  • Validates response format

Retry Logic

The model uses exponential backoff for retries:

  • Initial delay: 1 second
  • Maximum retries: 5
  • Handles rate limiting
  • Retries on throttling errors
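The retry behavior described above could look roughly like the sketch below. It is not the library's implementation: `call_with_backoff` and `ThrottlingError` are hypothetical names, and doubling the delay after each attempt is an assumption, since only the initial delay and the retry count are documented.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling or rate-limit error."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ThrottlingError with a doubling delay."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ThrottlingError:
            if attempt == max_retries:
                raise  # Retries exhausted: surface the last error.
            sleep(delay)
            delay *= 2  # Assumed growth factor.
    raise RuntimeError("max_retries must be non-negative")


# Demo with a recording sleep function, so nothing actually waits.
recorded: List[float] = []
attempts = {"count": 0}


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottlingError("rate limited")
    return "ok"


print(call_with_backoff(flaky_call, sleep=recorded.append))  # ok
print(recorded)  # [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the helper easy to test; production code would normally leave the default `time.sleep` in place and may add random jitter to the delay.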

Testing

The test suite for the Titan model covers the following areas:

  1. Request Formatting
     • Temperature validation
     • Token limit validation
     • System prompt integration

  2. Response Processing
     • Chunk processing
     • Error handling
     • Content extraction

  3. Error Scenarios
     • Invalid JSON
     • Missing fields
     • API errors
     • Rate limiting
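Scenarios like these can be exercised without AWS access by swapping in a stub client. The sketch below is a self-contained illustration, not the project's test suite: `StubClient` is a hypothetical test double, and `invoke` and `ModelInvokeError` are simplified stand-ins for the library's method and exception of the same names.

```python
import json
from typing import Any, Dict, List, Optional


class ModelInvokeError(Exception):
    """Stand-in for the library's exception of the same name."""


class StubClient:
    """Hypothetical test double for the one client call that invoke makes."""

    def __init__(
        self,
        chunks: Optional[List[bytes]] = None,
        error: Optional[Exception] = None,
    ) -> None:
        self._chunks = chunks or []
        self._error = error

    def invoke_model_with_response_stream(
        self, modelId: str, body: bytes
    ) -> Dict[str, Any]:
        if self._error is not None:
            raise self._error
        return {"body": [{"chunk": {"bytes": raw}} for raw in self._chunks]}


def invoke(client: StubClient, model_id: str, message: str) -> Dict[str, Any]:
    """Simplified stand-in for TitanModel.invoke."""
    try:
        body = json.dumps({"inputText": message}).encode()
        response = client.invoke_model_with_response_stream(
            modelId=model_id, body=body
        )
        parts = [
            json.loads(event["chunk"]["bytes"].decode()).get("outputText", "")
            for event in response["body"]
        ]
        return {"content": " ".join(p.strip() for p in parts).strip()}
    except Exception as exc:
        raise ModelInvokeError(f"Error invoking model: {exc}") from exc


MODEL_ID = "amazon.titan-text-express-v1"


def test_chunks_are_joined() -> None:
    client = StubClient(
        chunks=[b'{"outputText": "Hello"}', b'{"outputText": "world"}']
    )
    assert invoke(client, MODEL_ID, "Hi")["content"] == "Hello world"


def test_invalid_json_is_wrapped() -> None:
    client = StubClient(chunks=[b"not json"])
    try:
        invoke(client, MODEL_ID, "Hi")
    except ModelInvokeError:
        pass
    else:
        raise AssertionError("expected ModelInvokeError")


def test_api_error_is_wrapped() -> None:
    client = StubClient(error=RuntimeError("ThrottlingException"))
    try:
        invoke(client, MODEL_ID, "Hi")
    except ModelInvokeError as exc:
        assert "ThrottlingException" in str(exc)
    else:
        raise AssertionError("expected ModelInvokeError")
```

The functions follow the `test_*` naming convention, so a runner such as pytest would collect them; they can also be called directly.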

See Also