models.openai
- class agentopera.models.openai.OpenAIChatCompletionClient(**kwargs: Unpack)[source]
Bases:
BaseOpenAIChatCompletionClient
Chat completion client for OpenAI hosted models.
To use this client, you must install the openai extra:
pip install "agentopera[openai]"
You can also use this client with OpenAI-compatible ChatCompletion endpoints, but using it with non-OpenAI models is neither tested nor guaranteed to work.
- Parameters:
model (str) – Which OpenAI model to use.
api_key (optional, str) – The API key to use. Required if ‘OPENAI_API_KEY’ is not found in the environment variables.
organization (optional, str) – The organization ID to use.
base_url (optional, str) – The base URL to use. Required if the model is not hosted on OpenAI.
timeout (optional, float) – The timeout for the request in seconds.
max_retries (optional, int) – The maximum number of retries to attempt.
model_info (optional, ModelInfo) – The capabilities of the model. Required if the model name is not a valid OpenAI model.
frequency_penalty (optional, float) –
logit_bias (optional, dict[str, int]) –
max_tokens (optional, int) –
n (optional, int) –
presence_penalty (optional, float) –
response_format (optional, literal["json_object", "text"] | pydantic.BaseModel) –
seed (optional, int) –
stop (optional, str | List[str]) –
temperature (optional, float) –
top_p (optional, float) –
user (optional, str) –
default_headers (optional, dict[str, str]) – Custom headers; useful for authentication or other custom requirements.
add_name_prefixes (optional, bool) – Whether to prepend the source value to each
UserMessage
content. E.g., "this is content" becomes "Reviewer said: this is content." This can be useful for models that do not support the name field in messages. Defaults to False.
stream_options (optional, dict) – Additional options for streaming. Currently only include_usage is supported.
Examples
The following code snippet shows how to use the client with an OpenAI model:
from agentopera.models.openai import OpenAIChatCompletionClient
from agentopera.core.types.models import UserMessage

openai_client = OpenAIChatCompletionClient(
    model="gpt-4o-2024-08-06",
    # api_key="sk-...", # Optional if you have an OPENAI_API_KEY environment variable set.
)

result = await openai_client.create([UserMessage(content="What is the capital of France?", source="user")])  # type: ignore
print(result)
To use the client with a non-OpenAI model, you need to provide the base URL of the model and the model info. For example, to use Ollama, you can use the following code snippet:
from agentopera.models.openai import OpenAIChatCompletionClient
from agentopera.core.types.models import ModelFamily

custom_model_client = OpenAIChatCompletionClient(
    model="deepseek-r1:1.5b",
    base_url="http://localhost:11434/v1",
    api_key="placeholder",
    model_info={
        "vision": False,
        "function_calling": False,
        "json_output": False,
        "family": ModelFamily.R1,
    },
)
To use structured output as well as function calling, you can use the following code snippet:
import asyncio
from typing import Literal

from agentopera.core.types.models import (
    AssistantMessage,
    FunctionExecutionResult,
    FunctionExecutionResultMessage,
    SystemMessage,
    UserMessage,
)
from agentopera.core.tools import FunctionTool
from agentopera.models.openai import OpenAIChatCompletionClient
from pydantic import BaseModel


# Define the structured output format.
class AgentResponse(BaseModel):
    thoughts: str
    response: Literal["happy", "sad", "neutral"]


# Define the function to be called as a tool.
def sentiment_analysis(text: str) -> str:
    """Given a text, return the sentiment."""
    return "happy" if "happy" in text else "sad" if "sad" in text else "neutral"


# Create a FunctionTool instance with `strict=True`,
# which is required for structured output mode.
tool = FunctionTool(sentiment_analysis, description="Sentiment Analysis", strict=True)

# Create an OpenAIChatCompletionClient instance.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-mini",
    response_format=AgentResponse,  # type: ignore
)


async def main() -> None:
    # Generate a response using the tool.
    response1 = await model_client.create(
        messages=[
            SystemMessage(content="Analyze input text sentiment using the tool provided."),
            UserMessage(content="I am happy.", source="user"),
        ],
        tools=[tool],
    )
    print(response1.content)
    # Should be a list of tool calls.
    # [FunctionCall(name="sentiment_analysis", arguments={"text": "I am happy."}, ...)]

    assert isinstance(response1.content, list)
    response2 = await model_client.create(
        messages=[
            SystemMessage(content="Analyze input text sentiment using the tool provided."),
            UserMessage(content="I am happy.", source="user"),
            AssistantMessage(content=response1.content, source="assistant"),
            FunctionExecutionResultMessage(
                content=[
                    FunctionExecutionResult(
                        content="happy",
                        call_id=response1.content[0].id,
                        is_error=False,
                        name="sentiment_analysis",
                    )
                ]
            ),
        ],
    )
    print(response2.content)
    # Should be a structured output.
    # {"thoughts": "The user is happy.", "response": "happy"}


asyncio.run(main())
To load the client from a configuration, you can use the load_component method:
from agentopera.core.types.models import ChatCompletionClient

config = {
    "provider": "OpenAIChatCompletionClient",
    "config": {"model": "gpt-4o", "api_key": "REPLACE_WITH_YOUR_API_KEY"},
}

client = ChatCompletionClient.load_component(config)
To view the full list of available configuration options, see the
OpenAIClientConfigurationConfigModel
class.
- class agentopera.models.openai.AzureOpenAIChatCompletionClient(**kwargs: Unpack)[source]
Bases:
BaseOpenAIChatCompletionClient
Chat completion client for Azure OpenAI hosted models.
- Parameters:
model (str) – Which OpenAI model to use.
azure_endpoint (str) – The endpoint for the Azure model. Required for Azure models.
azure_deployment (str) – Deployment name for the Azure model. Required for Azure models.
api_version (str) – The API version to use. Required for Azure models.
azure_ad_token (str) – The Azure AD token to use. Provide this or azure_ad_token_provider for token-based authentication.
azure_ad_token_provider (optional, Callable[[], Awaitable[str]] | AzureTokenProvider) – The Azure AD token provider to use. Provide this or azure_ad_token for token-based authentication.
api_key (optional, str) – The API key to use when using key-based authentication. Optional if you use Azure AD token-based authentication or set the AZURE_OPENAI_API_KEY environment variable.
timeout (optional, float) – The timeout for the request in seconds.
max_retries (optional, int) – The maximum number of retries to attempt.
model_info (optional, ModelInfo) – The capabilities of the model. Required if the model name is not a valid OpenAI model.
frequency_penalty (optional, float) –
logit_bias (optional, dict[str, int]) –
max_tokens (optional, int) –
n (optional, int) –
presence_penalty (optional, float) –
response_format (optional, literal["json_object", "text"]) –
seed (optional, int) –
stop (optional, str | List[str]) –
temperature (optional, float) –
top_p (optional, float) –
user (optional, str) –
default_headers (optional, dict[str, str]) – Custom headers; useful for authentication or other custom requirements.
To use this client, you must install the openai and azure extras:
pip install "agentopera[openai,azure]"
To use the client, you need to provide your deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities. For authentication, you can provide either an API key or an Azure Active Directory (AAD) token credential.
The following code snippet shows how to use AAD authentication.
from agentopera.models.openai import AzureOpenAIChatCompletionClient
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Create the token provider
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

az_model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="{your-azure-deployment}",
    model="{deployed-model, such as 'gpt-4o'}",
    api_version="2024-06-01",
    azure_endpoint="https://{your-custom-endpoint}.openai.azure.com/",
    azure_ad_token_provider=token_provider,  # Optional if you choose key-based authentication.
    # api_key="sk-...", # For key-based authentication. `AZURE_OPENAI_API_KEY` environment variable can also be used instead.
)
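If you prefer key-based authentication instead, a minimal sketch (the endpoint and deployment placeholders follow the example above; the api_key value is illustrative):

from agentopera.models.openai import AzureOpenAIChatCompletionClient

az_model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="{your-azure-deployment}",
    model="gpt-4o",
    api_version="2024-06-01",
    azure_endpoint="https://{your-custom-endpoint}.openai.azure.com/",
    api_key="REPLACE_WITH_YOUR_API_KEY",  # Or set the AZURE_OPENAI_API_KEY environment variable instead.
)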
To load a client that uses identity-based authentication from a configuration, you can use the load_component method:
from agentopera.core.types.models import ChatCompletionClient

config = {
    "provider": "AzureOpenAIChatCompletionClient",
    "config": {
        "model": "gpt-4o-2024-05-13",
        "azure_endpoint": "https://{your-custom-endpoint}.openai.azure.com/",
        "azure_deployment": "{your-azure-deployment}",
        "api_version": "2024-06-01",
        "azure_ad_token_provider": {
            "provider": "agentopera.agents.auth.azure.AzureTokenProvider",
            "config": {
                "provider_kind": "DefaultAzureCredential",
                "scopes": ["https://cognitiveservices.azure.com/.default"],
            },
        },
    },
}

client = ChatCompletionClient.load_component(config)
To view the full list of available configuration options, see the
AzureOpenAIClientConfigurationConfigModel
class.
Note
Right now only DefaultAzureCredential is supported with no additional args passed to it.
- class agentopera.models.openai.BaseOpenAIChatCompletionClient(client: AsyncOpenAI | AsyncAzureOpenAI, *, create_args: Dict[str, Any], model_capabilities: ModelCapabilities | None = None, model_info: ModelInfo | None = None, add_name_prefixes: bool = False)[source]
Bases:
ChatCompletionClient
- classmethod create_from_config(config: Dict[str, Any]) → ChatCompletionClient [source]
- async create(messages: Sequence[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage | VercelMessage], *, tools: Sequence[Tool | ToolSchema] = [], json_output: bool | None = None, extra_create_args: Mapping[str, Any] = {}, cancellation_token: CancellationToken | None = None) → CreateResult [source]
- async create_stream(messages: Sequence[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage | VercelMessage], *, tools: Sequence[Tool | ToolSchema] = [], json_output: bool | None = None, extra_create_args: Mapping[str, Any] = {}, cancellation_token: CancellationToken | None = None, max_consecutive_empty_chunk_tolerance: int = 0) → AsyncGenerator[str | CreateResult, None] [source]
Creates an AsyncGenerator that will yield a stream of chat completions based on the provided messages and tools.
- Parameters:
messages (Sequence[LLMMessage]) – A sequence of messages to be processed.
tools (Sequence[Tool | ToolSchema], optional) – A sequence of tools to be used in the completion. Defaults to [].
json_output (Optional[bool], optional) – If True, the output will be in JSON format. Defaults to None.
extra_create_args (Mapping[str, Any], optional) – Additional arguments for the creation process. Defaults to {}.
cancellation_token (Optional[CancellationToken], optional) – A token to cancel the operation. Defaults to None.
max_consecutive_empty_chunk_tolerance (int) – [Deprecated] The maximum number of consecutive empty chunks to tolerate before raising a ValueError. This appears to be needed only when using AzureOpenAIChatCompletionClient. Defaults to 0. This parameter is deprecated; empty chunks are now skipped.
- Yields:
AsyncGenerator[Union[str, CreateResult], None] – A generator yielding the completion results as they are produced.
In streaming, the default behaviour is not to return token usage counts. See the [OpenAI API reference for possible args](https://platform.openai.com/docs/api-reference/chat/create). However, passing extra_create_args={"stream_options": {"include_usage": True}} will (if supported by the accessed API) return a final chunk with usage set to a RequestUsage object containing prompt and completion token counts; all preceding chunks will have usage set to None. See [stream_options](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options). A usage sketch follows the argument list below.
- Other examples of OpenAI-supported arguments that can be included in extra_create_args:
temperature (float): Controls the randomness of the output. Higher values (e.g., 0.8) make the output more random, while lower values (e.g., 0.2) make it more focused and deterministic.
max_tokens (int): The maximum number of tokens to generate in the completion.
top_p (float): An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
frequency_penalty (float): A value between -2.0 and 2.0 that penalizes new tokens based on their existing frequency in the text so far, decreasing the likelihood of repeated phrases.
presence_penalty (float): A value between -2.0 and 2.0 that penalizes new tokens based on whether they appear in the text so far, encouraging the model to talk about new topics.
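The following minimal sketch streams a completion and reads the final usage chunk. The model name is illustrative, and it assumes an OPENAI_API_KEY environment variable is set:

import asyncio

from agentopera.core.types.models import UserMessage
from agentopera.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")  # Illustrative model; assumes OPENAI_API_KEY is set.


async def main() -> None:
    stream = model_client.create_stream(
        messages=[UserMessage(content="Write a haiku about Paris.", source="user")],
        extra_create_args={"stream_options": {"include_usage": True}},
    )
    async for item in stream:
        if isinstance(item, str):
            # Intermediate chunks are plain text deltas.
            print(item, end="", flush=True)
        else:
            # The final item is a CreateResult; with include_usage enabled,
            # its usage field carries prompt and completion token counts.
            print()
            print(item.usage)


asyncio.run(main())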
- actual_usage() → RequestUsage [source]
- total_usage() → RequestUsage [source]
- count_tokens(messages: Sequence[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage | VercelMessage], *, tools: Sequence[Tool | ToolSchema] = []) → int [source]
- remaining_tokens(messages: Sequence[SystemMessage | UserMessage | AssistantMessage | FunctionExecutionResultMessage | VercelMessage], *, tools: Sequence[Tool | ToolSchema] = []) → int [source]
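A minimal sketch of the token accounting helpers above (the client construction and model name are illustrative; an OPENAI_API_KEY environment variable is assumed):

from agentopera.core.types.models import UserMessage
from agentopera.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(model="gpt-4o-mini")  # Illustrative; assumes OPENAI_API_KEY is set.

messages = [UserMessage(content="What is the capital of France?", source="user")]
prompt_tokens = client.count_tokens(messages)  # Tokens consumed by the prompt.
budget = client.remaining_tokens(messages)  # Tokens left within the model's context window.
print(prompt_tokens, budget)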
- property capabilities: ModelCapabilities
- class agentopera.models.openai.AzureOpenAIClientConfigurationConfigModel(*, frequency_penalty: float | None = None, logit_bias: Dict[str, int] | None = None, max_tokens: int | None = None, n: int | None = None, presence_penalty: float | None = None, response_format: ResponseFormat | None = None, seed: int | None = None, stop: str | List[str] | None = None, temperature: float | None = None, top_p: float | None = None, user: str | None = None, stream_options: StreamOptions | None = None, model: str, api_key: str | None = None, timeout: float | None = None, max_retries: int | None = None, model_capabilities: ModelCapabilities | None = None, model_info: ModelInfo | None = None, add_name_prefixes: bool | None = None, default_headers: Dict[str, str] | None = None, azure_endpoint: str, azure_deployment: str | None = None, api_version: str, azure_ad_token: str | None = None)[source]
Bases:
BaseOpenAIClientConfigurationConfigModel
- azure_endpoint: str
- azure_deployment: str | None
- api_version: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- azure_ad_token: str | None
- class agentopera.models.openai.OpenAIClientConfigurationConfigModel(*, frequency_penalty: float | None = None, logit_bias: Dict[str, int] | None = None, max_tokens: int | None = None, n: int | None = None, presence_penalty: float | None = None, response_format: ResponseFormat | None = None, seed: int | None = None, stop: str | List[str] | None = None, temperature: float | None = None, top_p: float | None = None, user: str | None = None, stream_options: StreamOptions | None = None, model: str, api_key: str | None = None, timeout: float | None = None, max_retries: int | None = None, model_capabilities: ModelCapabilities | None = None, model_info: ModelInfo | None = None, add_name_prefixes: bool | None = None, default_headers: Dict[str, str] | None = None, organization: str | None = None, base_url: str | None = None)[source]
Bases:
BaseOpenAIClientConfigurationConfigModel
- organization: str | None
- base_url: str | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
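Since this configuration class is a standard Pydantic model, a raw dictionary can be validated against it before handing it to load_component. A minimal sketch (field values are illustrative):

from agentopera.models.openai import OpenAIClientConfigurationConfigModel

# Validate an untrusted configuration dict; raises pydantic.ValidationError on bad fields.
cfg = OpenAIClientConfigurationConfigModel.model_validate(
    {"model": "gpt-4o", "base_url": "http://localhost:11434/v1", "temperature": 0.2}
)
print(cfg.model_dump(exclude_none=True))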
- class agentopera.models.openai.BaseOpenAIClientConfigurationConfigModel(*, frequency_penalty: float | None = None, logit_bias: Dict[str, int] | None = None, max_tokens: int | None = None, n: int | None = None, presence_penalty: float | None = None, response_format: ResponseFormat | None = None, seed: int | None = None, stop: str | List[str] | None = None, temperature: float | None = None, top_p: float | None = None, user: str | None = None, stream_options: StreamOptions | None = None, model: str, api_key: str | None = None, timeout: float | None = None, max_retries: int | None = None, model_capabilities: ModelCapabilities | None = None, model_info: ModelInfo | None = None, add_name_prefixes: bool | None = None, default_headers: Dict[str, str] | None = None)[source]
Bases:
CreateArgumentsConfigModel
- model: str
- api_key: str | None
- timeout: float | None
- max_retries: int | None
- model_capabilities: ModelCapabilities | None
- add_name_prefixes: bool | None
- default_headers: Dict[str, str] | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class agentopera.models.openai.CreateArgumentsConfigModel(*, frequency_penalty: float | None = None, logit_bias: Dict[str, int] | None = None, max_tokens: int | None = None, n: int | None = None, presence_penalty: float | None = None, response_format: ResponseFormat | None = None, seed: int | None = None, stop: str | List[str] | None = None, temperature: float | None = None, top_p: float | None = None, user: str | None = None, stream_options: StreamOptions | None = None)[source]
Bases:
BaseModel
- frequency_penalty: float | None
- logit_bias: Dict[str, int] | None
- max_tokens: int | None
- n: int | None
- presence_penalty: float | None
- response_format: ResponseFormat | None
- seed: int | None
- stop: str | List[str] | None
- temperature: float | None
- top_p: float | None
- user: str | None
- stream_options: StreamOptions | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
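One possible use of this model, purely as an illustrative sketch rather than a documented workflow, is to collect sampling arguments and dump them as extra_create_args for create or create_stream:

from agentopera.models.openai import CreateArgumentsConfigModel

args = CreateArgumentsConfigModel(temperature=0.2, max_tokens=256, seed=42)
# Drop unset fields so only explicit overrides reach the API.
extra_create_args = args.model_dump(exclude_none=True)
print(extra_create_args)  # {'temperature': 0.2, 'max_tokens': 256, 'seed': 42}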