venice_ai.resources.responses
Venice AI Responses API resource (Alpha).
Wraps POST /responses — the OpenAI-compatible Responses API endpoint
described at swagger.yaml.md:7244. The endpoint is currently tagged
Alpha by Venice; request and response shapes may change without notice.
Unlike /chat/completions, this endpoint returns a typed output array
containing reasoning, message, function_call, and
web_search_call blocks. It is stateless — each request is independent
and no conversation state is persisted between calls. E2EE-capable models
are not supported; use /chat/completions with E2EE headers instead.
Streaming is supported via Server-Sent Events when stream=True; the
returned :class:~venice_ai.streaming.Stream yields
:class:~venice_ai.types.api.responses.ResponsesStreamEvent chunks.
Responses Objects
class Responses(APIResource["VeniceClient"])
Access the Venice Responses API (Alpha).
Access via :attr:VeniceClient.responses.
Responses.create
async def create(
*,
model: str,
input: str | list[dict[str, Any]],
include: list[str] | None = None,
max_output_tokens: int | None = None,
temperature: float | None = None,
top_p: float | None = None,
fallbacks: list[dict[str, str]] | None = None,
reasoning: Any | None = None,
tools: list[Tool | dict[str, Any]] | None = None,
tool_choice: str | dict[str, Any] | None = None,
web_search: bool | None = None,
venice_parameters: Any | None = None,
stream: bool = False
) -> ResponsesResponse | AsyncIterable[ResponsesStreamEvent]
Create a response using the Responses API (Alpha).
Wraps POST /api/v1/responses. Each call is stateless - no
conversation history is persisted between requests.
Arguments:
model- Model ID. E2EE-capable models are not supported; use/chat/completionswith E2EE headers instead.input- Prompt - either a plain string or a list of structured input items (messages, reasoning blocks, function calls, etc.) as documented in the OpenAI Responses API.include- Additional response fields to include.max_output_tokens- Maximum tokens to generate.temperature- Sampling temperature (0-2).top_p- Nucleus sampling (0-1).fallbacks- Anthropic beta parameter for Claude Fable 5 server-side refusal fallback. Array of{"model": ...}objects (max 10). Forwarded only for direct Anthropic routes; ignored otherwise.reasoning- Nested reasoning config (``{"effort": "...","summary"- "..."}orReasoningConfig``).tools- Tool definitions. Function tools plus the Alpha tool types (web_search,x_search,code_interpreter,file_search,computer_use_preview) are supported.tool_choice-"auto" | "none" | "required"or a- ```{"type"` - "function", "function": {"name": ...}}`` dict.
web_search- Enable web search for this request.venice_parameters- Venice-specific request parameters.stream- WhenTrue, returns an async iterator of :class:ResponsesStreamEventchunks parsed from Server-Sent Events. DefaultFalsereturns a single :class:ResponsesResponse.
Returns:
:class:ResponsesResponse with typed output blocks
(reasoning, message, function_call, web_search_call), or an
AsyncIterable[ResponsesStreamEvent] when stream=True.
Raises:
InvalidRequestError- If parameters fail server-side validation (e.g. malformedinput, unsupported tool type, or an E2EE-capable model is supplied).AuthenticationError- If the API key is missing or invalid.PermissionDeniedError- If the account lacks access to the Responses API alpha or the requested model.NotFoundError- If the model id is unknown.RateLimitError- If account-level rate limits are exceeded.APIError- For other HTTP-level failures.
Example:
.. code-block:: python
from venice_ai import VeniceClient
async with VeniceClient() as client: model = await client.models.resolve_chat() response = await client.responses.create( model=model, input="Summarize the Treaty of Versailles in two sentences.", max_output_tokens=200, ) for block in response.output: if block.type == "message": print(block.content)