Venice AI Helper Utilities
Convenience helpers for common patterns when working with the Venice AI SDK.
- :func:
tool_from_model-- create a :class:Toolfrom a Pydantic model - :func:
tool_from_function-- create a :class:Toolfrom a typed Python function - :class:
Conversation-- thin wrapper for building multi-turn message lists - :func:
cosine_similarity-- score two embedding vectors in [-1, 1] - :func:
detect_image_format-- sniff (extension, mime_type) from raw image bytes - :func:
fit_image_bytes-- resize an image to fit within a max-dimension box - :func:
extract_thinking_blocks-- parse<thinking>/<think>tags - :func:
normalize_duration_seconds-- parse5/"5"/"5s"/"5 seconds"to int
extract_thinking_blocks
def extract_thinking_blocks(content: str | list[Any]) -> tuple[list[str], str]
Extract reasoning/thinking blocks from a response's content.
Some Venice models surface chain-of-thought reasoning inline in the
assistant's content wrapped in <thinking>...</thinking> or
<think>...</think> tags. This helper splits those out so callers can
display them separately from the final answer.
For models that put reasoning in the dedicated reasoning_content
field instead, prefer :attr:ChatCompletionResponse.thinking_blocks,
which checks both server shapes.
Arguments:
content:message.contentvalue — either a string or a list of multimodal parts (the list form is stringified before parsing).
Returns:
(blocks, cleaned_content) — a list of the extracted block
contents, and the original content with the matched tags
removed and trailing whitespace stripped.
detect_image_format
def detect_image_format(data: bytes) -> tuple[str, str]
Sniff image format from magic bytes.
Returns (extension, mime_type) for supported formats, e.g.
("png", "image/png"). For unrecognized bytes returns
("bin", "application/octet-stream").
Recognizes JPEG, PNG, WebP, and GIF — the formats Venice models
return. Use this when you need to save an image to disk with the
correct extension or assemble a data: URI from raw bytes.
Arguments:
data: Raw image bytes (typically the result ofbase64.b64decode(response.images[0])).
fit_image_bytes
def fit_image_bytes(data: bytes,
*,
max_dim: int = 1024,
quality: int = 85) -> bytes
Resize an image so neither side exceeds max_dim pixels.
Returns data unchanged when the image already fits. Otherwise scales down preserving aspect ratio and re-encodes as JPEG.
Why this matters: some Venice vision models accept smaller maximum input
dimensions than others. venice-uncensored-1-2 and
venice-uncensored-role-play return an opaque HTTP 500
("Inference processing failed") when handed multi-megapixel inputs
that other vision models accept fine. Pre-resizing client-side sidesteps
that. Defaults of max_dim=1024 / quality=85 produce ~150 KB
photographs that every Venice vision model accepts.
Pillow is a hard SDK dependency, so no extras gating.
Arguments:
data: Raw image bytes (PNG, JPEG, WebP, or GIF).max_dim: Maximum width or height in pixels.quality: JPEG quality (1-100).
Returns:
Resized JPEG bytes, or data unchanged if already within
max_dim.
cosine_similarity
def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float
Cosine similarity between two embedding vectors.
Returns a value in [-1.0, 1.0]: 1.0 for identical direction,
0.0 for orthogonal, -1.0 for opposite. Pure Python — no numpy
dependency — so it works on the raw embedding lists returned by
Raises:
ValueError: If a and b have different lengths or are empty, or if either vector has zero magnitude (cosine is undefined).
normalize_duration_seconds
def normalize_duration_seconds(value: int | str) -> int
Parse 5 / "5" / "5s" / "5 seconds" to an integer.
Liberal in what we accept, strict in what we return — image / music / video resources all coerce duration values through this helper before validating against per-model enums and before sending the request.
Examples:
>>> normalize_duration_seconds(5)
5
>>> normalize_duration_seconds("5")
5
>>> normalize_duration_seconds("5s")
5
>>> normalize_duration_seconds("5 SECONDS")
5
Raises:
ValueError: If value cannot be parsed as a positive integer number of seconds.
tool_from_model
def tool_from_model(model: type[BaseModel],
*,
name: str | None = None,
description: str | None = None) -> Tool
Create a :class:Tool definition from a Pydantic BaseModel subclass.
Uses Pydantic's :meth:~pydantic.BaseModel.model_json_schema to generate
the JSON Schema for the function parameters.
Arguments:
model: A PydanticBaseModelsubclass.name: Override the function name (defaults to the model class name).description: Override the description (defaults to the model docstring).
Returns:
A :class:Tool ready to pass to tools=[...].
tool_from_function
def tool_from_function(fn: Callable[..., Any],
*,
name: str | None = None,
description: str | None = None) -> Tool
Create a :class:Tool definition from a Python function's type hints.
Inspects the function signature and type annotations to build a JSON
Schema for the parameters field. Only a subset of types is supported
(see :func:_python_type_to_json_schema); for richer schemas, use
Arguments:
fn: The function to introspect.name: Override the tool name (defaults tofn.__name__).description: Override the description (defaults tofn's docstring).
Returns:
A :class:Tool ready to pass to tools=[...].
Conversation Objects
class Conversation()
Convenience helper for building multi-turn message lists.
Maintains an ordered list of chat messages and provides chainable methods for appending user turns, assistant responses, and tool results.
For production use cases requiring token management, persistence, or conversation branching, manage messages directly.
Example:
conv = Conversation(system="You are a helpful assistant.")
conv.add_user("What's the weather?")
response = await client.chat.completions.create(
model=model, messages=conv.messages,
)
conv.add_response(response)
conv.add_user("And tomorrow?")
Conversation.messages
@property
def messages(
) -> list[UserMessage | AssistantMessage | SystemMessage | ToolMessage]
Return a shallow copy of the message list.
Conversation.add_user
def add_user(content: str | list[MessageContentPartParam]) -> Conversation
Append a user message and return self for chaining.
Accepts either a plain string or a list of multimodal content parts.
Each part can be a typed object (:class:TextContent,
:class:ImageContent, :class:AudioContent, :class:VideoContent)
or a plain dict matching one of the corresponding TypedDict
shapes (e.g. {"type": "text", "text": "hi"}).
Conversation.add_response
def add_response(response: ChatCompletionResponse,
choice_index: int = 0) -> Conversation
Append an assistant message extracted from a completion response.
Conversation.add_assistant_message
def add_assistant_message(
content: str | None = None,
*,
tool_calls: list[ToolCall] | None = None) -> Conversation
Append an assistant message directly.
Useful in agent loops where you want to inject an assistant turn —
often a tool-call turn — without first wrapping it in a
:class:ChatCompletionResponse.
Conversation.add_tool_result
def add_tool_result(tool_call_id: str, content: str) -> Conversation
Append a tool result message.
Conversation.run_with_tools
async def run_with_tools(client: VeniceClient,
*,
model: str,
tools: Sequence[Callable[..., Any] | Tool],
on_tool_call: Callable[[ToolCall, Any], None]
| None = None,
on_tool_error: Callable[[ToolCall, Exception], str]
| None = None,
parallel: bool = False,
max_iterations: int = 10,
**create_kwargs: Any) -> ToolLoopResult
Run :meth:ChatCompletions.run_with_tools against this conversation.
Thin wrapper that takes the conversation's current messages as input, runs the tool-orchestration loop, and appends every new message the loop produced (assistant tool-call turns, tool results, and the final assistant turn) to this conversation. The conversation is left ready for the next user turn.
Arguments:
client: The Venice client to drive completions through.model: Model id to use for every iteration.tools: See :meth:ChatCompletions.run_with_tools.on_tool_call: See :meth:ChatCompletions.run_with_tools.on_tool_error: See :meth:ChatCompletions.run_with_tools.parallel: See :meth:ChatCompletions.run_with_tools.max_iterations: See :meth:ChatCompletions.run_with_tools.create_kwargs: Forwarded tochat.completions.createon every iteration.
Returns:
The :class:ToolLoopResult from the underlying call.
Note that result.messages is a separate copy — the
conversation's own messages are mutated in place to reflect
the same final history.