venice_ai.resources.responses

Venice AI Responses API resource (Alpha).

Wraps POST /responses — the OpenAI-compatible Responses API endpoint described at swagger.yaml.md:7244. The endpoint is currently tagged Alpha by Venice; request and response shapes may change without notice.

Unlike /chat/completions, this endpoint returns a typed output array containing reasoning, message, function_call, and web_search_call blocks. It is stateless — each request is independent and no conversation state is persisted between calls. E2EE-capable models are not supported; use /chat/completions with E2EE headers instead.
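As a rough illustration of working with the typed output array, the sketch below groups blocks by type and extracts message text. Plain dicts stand in for a parsed response here; the field names type, content, and text are assumptions modeled on the OpenAI Responses API, not confirmed Venice shapes, and sample_output is invented for the example.

```python
from collections import defaultdict
from typing import Any


def group_blocks(output: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group output blocks by their ``type`` field, preserving first-seen order."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for block in output:
        grouped[block.get("type", "unknown")].append(block)
    return dict(grouped)


def message_text(output: list[dict[str, Any]]) -> str:
    """Concatenate the text parts of every ``message`` block (assumed shape)."""
    parts: list[str] = []
    for block in output:
        if block.get("type") != "message":
            continue
        for part in block.get("content", []):
            # Assumption: text parts carry their string under "text".
            if isinstance(part, dict) and "text" in part:
                parts.append(part["text"])
    return "".join(parts)


# Hypothetical sample output, for illustration only.
sample_output = [
    {"type": "reasoning", "summary": []},
    {"type": "message", "content": [{"type": "output_text", "text": "Hello"}]},
    {"type": "function_call", "name": "lookup", "arguments": "{}"},
]

print(list(group_blocks(sample_output)))
print(message_text(sample_output))
```

With the real SDK the blocks are typed objects rather than dicts, so attribute access (block.type) would replace the .get() calls.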

Streaming is supported via Server-Sent Events when stream=True; the returned :class:~venice_ai.streaming.Stream yields :class:~venice_ai.types.api.responses.ResponsesStreamEvent chunks.
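A minimal sketch of consuming such a stream follows. It uses a local async generator (fake_stream) in place of the SDK's Stream object so that it runs offline; the event type names and the delta field are assumptions borrowed from the OpenAI Responses API and may not match ResponsesStreamEvent exactly.

```python
import asyncio
from typing import Any, AsyncIterator


async def fake_stream() -> AsyncIterator[dict[str, Any]]:
    """Stand-in for the stream returned when stream=True (hypothetical events)."""
    for event in (
        {"type": "response.created"},
        {"type": "response.output_text.delta", "delta": "Hel"},
        {"type": "response.output_text.delta", "delta": "lo"},
        {"type": "response.completed"},
    ):
        yield event


async def collect_text(stream: AsyncIterator[dict[str, Any]]) -> str:
    """Accumulate text deltas from an event stream until it is exhausted."""
    chunks: list[str] = []
    async for event in stream:
        # Assumption: incremental text arrives in "delta" on delta events.
        if event.get("type") == "response.output_text.delta":
            chunks.append(event.get("delta", ""))
    return "".join(chunks)


streamed_text = asyncio.run(collect_text(fake_stream()))
print(streamed_text)
```

The same async for loop should apply to the real stream, since the documented return value for stream=True is an async iterable of events.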

Responses Objects

class Responses(APIResource["VeniceClient"])

Access the Venice Responses API (Alpha).

Access via :attr:VeniceClient.responses.

Responses.create

async def create(
    *,
    model: str,
    input: str | list[dict[str, Any]],
    include: list[str] | None = None,
    max_output_tokens: int | None = None,
    temperature: float | None = None,
    top_p: float | None = None,
    fallbacks: list[dict[str, str]] | None = None,
    reasoning: Any | None = None,
    tools: list[Tool | dict[str, Any]] | None = None,
    tool_choice: str | dict[str, Any] | None = None,
    web_search: bool | None = None,
    venice_parameters: Any | None = None,
    stream: bool = False
) -> ResponsesResponse | AsyncIterable[ResponsesStreamEvent]

Create a response using the Responses API (Alpha).

Wraps POST /api/v1/responses. Each call is stateless - no conversation history is persisted between requests.
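Because nothing is persisted server-side, a multi-turn exchange has to be carried by the caller and resent as the input list on each call. The sketch below shows one way to do that; append_turn is a hypothetical helper, and the role/content message shape is an assumption based on the OpenAI Responses API input format.

```python
from typing import Any


def append_turn(
    history: list[dict[str, Any]], role: str, content: str
) -> list[dict[str, Any]]:
    """Return a new history list with one more role/content message appended."""
    return [*history, {"role": role, "content": content}]


history: list[dict[str, Any]] = []
history = append_turn(history, "user", "Who signed the Treaty of Versailles?")
# After each call, the caller appends the assistant's reply text itself
# (placeholder reply shown here).
history = append_turn(history, "assistant", "Germany and the Allied Powers, in 1919.")
history = append_turn(history, "user", "Where was it signed?")

# `history` is what a follow-up call would pass as `input`.
print(len(history))
```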

Arguments:

model - Model ID. E2EE-capable models are not supported; use /chat/completions with E2EE headers instead.
input - The prompt: either a plain string or a list of structured input items (messages, reasoning blocks, function calls, etc.), as documented in the OpenAI Responses API.
include - Additional response fields to include.
max_output_tokens - Maximum tokens to generate.
temperature - Sampling temperature (0-2).
top_p - Nucleus sampling (0-1).
fallbacks - Anthropic beta parameter for Claude Fable 5 server-side refusal fallback. Array of {"model": ...} objects (max 10). Forwarded only for direct Anthropic routes; ignored otherwise.
reasoning - Nested reasoning config ({"effort": "...", "summary": "..."} or ReasoningConfig).
tools - Tool definitions. Function tools plus the Alpha tool types (web_search, x_search, code_interpreter, file_search, computer_use_preview) are supported.
tool_choice - "auto" | "none" | "required" or a {"type": "function", "function": {"name": ...}} dict.
web_search - Enable web search for this request.
venice_parameters - Venice-specific request parameters.
stream - When True, returns an async iterator of :class:ResponsesStreamEvent chunks parsed from Server-Sent Events. Default False returns a single :class:ResponsesResponse.
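The documented ranges above can be checked client-side before a request is sent. The following is a sketch only: build_request is a hypothetical helper, not part of the SDK, and the model ID is a placeholder. It enforces the stated limits (temperature 0-2, top_p 0-1, at most 10 fallbacks) and drops options left as None.

```python
from typing import Any, Union


def build_request(
    model: str,
    input: Union[str, list[dict[str, Any]]],
    **options: Any,
) -> dict[str, Any]:
    """Validate the documented ranges and drop unset (None) options."""
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    top_p = options.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    fallbacks = options.get("fallbacks")
    if fallbacks is not None and len(fallbacks) > 10:
        raise ValueError("fallbacks accepts at most 10 entries")
    payload: dict[str, Any] = {"model": model, "input": input}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_request(
    "example-model",  # hypothetical model ID
    "Summarize the Treaty of Versailles in two sentences.",
    temperature=0.7,
    top_p=None,  # unset options are omitted from the payload
    tool_choice="auto",
)
print(payload)
```

The server remains the authority on validation; a check like this only surfaces obvious mistakes earlier.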

Returns:

:class:ResponsesResponse with typed output blocks (reasoning, message, function_call, web_search_call), or an AsyncIterable[ResponsesStreamEvent] when stream=True.

Raises:

InvalidRequestError - If parameters fail server-side validation (e.g. malformed input, unsupported tool type, or an E2EE-capable model is supplied).
AuthenticationError - If the API key is missing or invalid.
PermissionDeniedError - If the account lacks access to the Responses API alpha or the requested model.
NotFoundError - If the model id is unknown.
RateLimitError - If account-level rate limits are exceeded.
APIError - For other HTTP-level failures.
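Of the errors listed, RateLimitError is the one most commonly worth retrying. The sketch below shows exponential backoff around an arbitrary async call. To keep it runnable offline it defines a local stand-in RateLimitError class and a fake flaky call; with_retries is a hypothetical helper, and whether the SDK already retries internally is not stated here.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Local stand-in for the SDK's RateLimitError (hypothetical)."""


async def with_retries(
    call: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 0.01,
) -> T:
    """Retry `call` on RateLimitError, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return await call()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("attempts must be at least 1")


calls = {"count": 0}


async def flaky() -> str:
    """Fail twice with RateLimitError, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("slow down")
    return "ok"


result = asyncio.run(with_retries(flaky))
print(result, calls["count"])
```

In real use, the lambda passed as call would wrap client.responses.create(...), and base_delay would be on the order of seconds rather than milliseconds.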

Example:

.. code-block:: python

    import asyncio

    from venice_ai import VeniceClient


    async def main() -> None:
        async with VeniceClient() as client:
            model = await client.models.resolve_chat()
            response = await client.responses.create(
                model=model,
                input="Summarize the Treaty of Versailles in two sentences.",
                max_output_tokens=200,
            )
            # Print only the message blocks from the typed output array.
            for block in response.output:
                if block.type == "message":
                    print(block.content)


    asyncio.run(main())
