Venice AI Helper Utilities

Convenience helpers for common patterns when working with the Venice AI SDK.

:func:tool_from_model -- create a :class:Tool from a Pydantic model
:func:tool_from_function -- create a :class:Tool from a typed Python function
:class:Conversation -- thin wrapper for building multi-turn message lists
:func:cosine_similarity -- score two embedding vectors in [-1, 1]
:func:detect_image_format -- sniff (extension, mime_type) from raw image bytes
:func:fit_image_bytes -- resize an image to fit within a max-dimension box
:func:extract_thinking_blocks -- parse <thinking> / <think> tags
:func:normalize_duration_seconds -- parse 5 / "5" / "5s" / "5 seconds" to int

extract_thinking_blocks

def extract_thinking_blocks(content: str | list[Any]) -> tuple[list[str], str]

Extract reasoning/thinking blocks from a response's content.

Some Venice models surface chain-of-thought reasoning inline in the assistant's content wrapped in <thinking>...</thinking> or <think>...</think> tags. This helper splits those out so callers can display them separately from the final answer.

For models that put reasoning in the dedicated reasoning_content field instead, prefer :attr:ChatCompletionResponse.thinking_blocks, which checks both server shapes.

Arguments:

content: message.content value — either a string or a list of multimodal parts (the list form is stringified before parsing).

Returns:

(blocks, cleaned_content) — a list of the extracted block contents, and the original content with the matched tags removed and trailing whitespace stripped.

detect_image_format

def detect_image_format(data: bytes) -> tuple[str, str]

Sniff image format from magic bytes.

Returns (extension, mime_type) for supported formats, e.g. ("png", "image/png"). For unrecognized bytes returns ("bin", "application/octet-stream").

Recognizes JPEG, PNG, WebP, and GIF — the formats Venice models return. Use this when you need to save an image to disk with the correct extension or assemble a data: URI from raw bytes.

Arguments:

data: Raw image bytes (typically the result of base64.b64decode(response.images[0])).

fit_image_bytes

def fit_image_bytes(data: bytes,
                    *,
                    max_dim: int = 1024,
                    quality: int = 85) -> bytes

Resize an image so neither side exceeds max_dim pixels.

Returns data unchanged when the image already fits. Otherwise scales down preserving aspect ratio and re-encodes as JPEG.

Why this matters: some Venice vision models accept smaller maximum input dimensions than others. venice-uncensored-1-2 and venice-uncensored-role-play return an opaque HTTP 500 ("Inference processing failed") when handed multi-megapixel inputs that other vision models accept fine. Pre-resizing client-side sidesteps that. Defaults of max_dim=1024 / quality=85 produce ~150 KB photographs that every Venice vision model accepts.

Pillow is a hard SDK dependency, so no extras gating.

Arguments:

data: Raw image bytes (PNG, JPEG, WebP, or GIF).
max_dim: Maximum width or height in pixels.
quality: JPEG quality (1-100).

Returns:

Resized JPEG bytes, or data unchanged if already within max_dim.

cosine_similarity

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float

Cosine similarity between two embedding vectors.

Returns a value in [-1.0, 1.0]: 1.0 for identical direction, 0.0 for orthogonal, -1.0 for opposite. Pure Python — no numpy dependency — so it works on the raw embedding lists returned by

Raises:

ValueError: If a and b have different lengths or are empty, or if either vector has zero magnitude (cosine is undefined).

normalize_duration_seconds

def normalize_duration_seconds(value: int | str) -> int

Parse 5 / "5" / "5s" / "5 seconds" to an integer.

Liberal in what we accept, strict in what we return — image / music / video resources all coerce duration values through this helper before validating against per-model enums and before sending the request.

Examples:

>>> normalize_duration_seconds(5)
5
>>> normalize_duration_seconds("5")
5
>>> normalize_duration_seconds("5s")
5
>>> normalize_duration_seconds("5 SECONDS")
5

Raises:

ValueError: If value cannot be parsed as a positive integer number of seconds.

tool_from_model

def tool_from_model(model: type[BaseModel],
                    *,
                    name: str | None = None,
                    description: str | None = None) -> Tool

Create a :class:Tool definition from a Pydantic BaseModel subclass.

Uses Pydantic's :meth:~pydantic.BaseModel.model_json_schema to generate the JSON Schema for the function parameters.

Arguments:

model: A Pydantic BaseModel subclass.
name: Override the function name (defaults to the model class name).
description: Override the description (defaults to the model docstring).

Returns:

A :class:Tool ready to pass to tools=[...].

tool_from_function

def tool_from_function(fn: Callable[..., Any],
                       *,
                       name: str | None = None,
                       description: str | None = None) -> Tool

Create a :class:Tool definition from a Python function's type hints.

Inspects the function signature and type annotations to build a JSON Schema for the parameters field. Only a subset of types is supported (see :func:_python_type_to_json_schema); for richer schemas, use

Arguments:

fn: The function to introspect.
name: Override the tool name (defaults to fn.__name__).
description: Override the description (defaults to fn's docstring).

Returns:

A :class:Tool ready to pass to tools=[...].

Conversation Objects

class Conversation()

Convenience helper for building multi-turn message lists.

Maintains an ordered list of chat messages and provides chainable methods for appending user turns, assistant responses, and tool results.

For production use cases requiring token management, persistence, or conversation branching, manage messages directly.

Example:

conv = Conversation(system="You are a helpful assistant.")
conv.add_user("What's the weather?")
response = await client.chat.completions.create(
model=model, messages=conv.messages,
)
conv.add_response(response)
conv.add_user("And tomorrow?")

Conversation.messages

@property
def messages(
) -> list[UserMessage | AssistantMessage | SystemMessage | ToolMessage]

Return a shallow copy of the message list.

Conversation.add_user

def add_user(content: str | list[MessageContentPartParam]) -> Conversation

Append a user message and return self for chaining.

Accepts either a plain string or a list of multimodal content parts. Each part can be a typed object (:class:TextContent, :class:ImageContent, :class:AudioContent, :class:VideoContent) or a plain dict matching one of the corresponding TypedDict shapes (e.g. {"type": "text", "text": "hi"}).

Conversation.add_response

def add_response(response: ChatCompletionResponse,
                 choice_index: int = 0) -> Conversation

Append an assistant message extracted from a completion response.

Conversation.add_assistant_message

def add_assistant_message(
        content: str | None = None,
        *,
        tool_calls: list[ToolCall] | None = None) -> Conversation

Append an assistant message directly.

Useful in agent loops where you want to inject an assistant turn — often a tool-call turn — without first wrapping it in a :class:ChatCompletionResponse.

Conversation.add_tool_result

def add_tool_result(tool_call_id: str, content: str) -> Conversation

Append a tool result message.

Conversation.run_with_tools

async def run_with_tools(client: VeniceClient,
                         *,
                         model: str,
                         tools: Sequence[Callable[..., Any] | Tool],
                         on_tool_call: Callable[[ToolCall, Any], None]
                         | None = None,
                         on_tool_error: Callable[[ToolCall, Exception], str]
                         | None = None,
                         parallel: bool = False,
                         max_iterations: int = 10,
                         **create_kwargs: Any) -> ToolLoopResult

Run :meth:ChatCompletions.run_with_tools against this conversation.

Thin wrapper that takes the conversation's current messages as input, runs the tool-orchestration loop, and appends every new message the loop produced (assistant tool-call turns, tool results, and the final assistant turn) to this conversation. The conversation is left ready for the next user turn.

Arguments:

client: The Venice client to drive completions through.
model: Model id to use for every iteration.
tools: See :meth:ChatCompletions.run_with_tools.
on_tool_call: See :meth:ChatCompletions.run_with_tools.
on_tool_error: See :meth:ChatCompletions.run_with_tools.
parallel: See :meth:ChatCompletions.run_with_tools.
max_iterations: See :meth:ChatCompletions.run_with_tools.
create_kwargs: Forwarded to chat.completions.create on every iteration.

Returns:

The :class:ToolLoopResult from the underlying call. Note that result.messages is a separate copy — the conversation's own messages are mutated in place to reflect the same final history.

extract_thinking_blocks​

detect_image_format​

fit_image_bytes​

cosine_similarity​

normalize_duration_seconds​

tool_from_model​

tool_from_function​

Conversation Objects​

Conversation.messages​

Conversation.add_user​

Conversation.add_response​

Conversation.add_assistant_message​

Conversation.add_tool_result​

Conversation.run_with_tools​

extract_thinking_blocks

detect_image_format

fit_image_bytes

cosine_similarity

normalize_duration_seconds

tool_from_model

tool_from_function

Conversation Objects

Conversation.messages

Conversation.add_user

Conversation.add_response

Conversation.add_assistant_message

Conversation.add_tool_result

Conversation.run_with_tools