Migration Guide: v1 → v2

venice-ai v2.0.0 is a ground-up rewrite. The client is now async-first, every response is a typed Pydantic model, and a large set of new resources (video, music, crypto, augment, x402, a CLI, rate limiting, and more) ships alongside the v1 surface.

This guide covers everything a v1.3.x user must change to move to v2.0.0, in rough order of impact. For the complete list of additions, see the CHANGELOG.


1. The client is now async by default

This is the largest change. In v1, VeniceClient was synchronous and a separate AsyncVeniceClient provided the async API. In v2 those roles changed:

  • VeniceClient is now asynchronous. Its methods are coroutines and must be awaited.
  • AsyncVeniceClient has been removed. from venice_ai import AsyncVeniceClient raises ImportError.
  • SyncVeniceClient is the new synchronous client for code that genuinely can't go async.

Before (v1) — synchronous VeniceClient:

from venice_ai import VeniceClient

client = VeniceClient(api_key="...")
response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

After (v2) — async by default:

import asyncio
from venice_ai import VeniceClient
from venice_ai.types.api import UserMessage

async def main():
    async with VeniceClient() as client:  # reads VENICE_API_KEY from env
        response = await client.chat.completions.create(
            model=await client.models.resolve_chat(),
            messages=[UserMessage(content="Hello")],
        )
        print(response.text)

asyncio.run(main())

If you must stay synchronous, swap in SyncVeniceClient — same surface, no await:

from venice_ai import SyncVeniceClient
from venice_ai.types.api import UserMessage

with SyncVeniceClient() as client:
    response = client.chat.completions.create(
        model=client.models.resolve_chat(),
        messages=[UserMessage(content="Hello")],
    )
    print(response.text)

What to do:

  • If you used AsyncVeniceClient, rename it to VeniceClient (it is now the async client).
  • If you used the synchronous VeniceClient, either move your calls into an async context and await them, or switch to SyncVeniceClient.
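
If only part of a codebase can go async, one common pattern is to keep a synchronous entry point that hands off to a coroutine with asyncio.run(). The sketch below illustrates that pattern with a hypothetical FakeAsyncClient stand-in (it is not part of venice_ai, and its complete() method is invented for illustration), so it runs without network access or an API key.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async client; not part of venice_ai."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    async def complete(self, prompt: str) -> str:
        # Yield to the event loop once, as a real network call would.
        await asyncio.sleep(0)
        return f"echo: {prompt}"


async def ask_many(prompts: list[str]) -> list[str]:
    # One client, several requests in flight at once.
    async with FakeAsyncClient() as client:
        return await asyncio.gather(*(client.complete(p) for p in prompts))


def ask_many_sync(prompts: list[str]) -> list[str]:
    # Synchronous entry point: asyncio.run() creates an event loop, runs the
    # coroutine to completion, and closes the loop afterwards.
    return asyncio.run(ask_many(prompts))
```

asyncio.run() raises RuntimeError when called from a thread that already has a running event loop (for example inside a notebook or an async web handler); in that situation, await the coroutine directly instead.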

2. Python ≥3.13 required

v2.0.0 requires Python ≥3.13. v1.3.x supported Python 3.11–3.12; support for both has been dropped.

What to do: Upgrade your runtime to 3.13 or later before upgrading the SDK.
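
To fail early with a clear message rather than at install or import time, an application can check the interpreter version at startup. This is a minimal sketch using only the standard library; the helper names meets_minimum and require_minimum are illustrative and not part of the SDK.

```python
import sys

MIN_PYTHON = (3, 13)


def meets_minimum(version_info=None, minimum=MIN_PYTHON) -> bool:
    """Return True when the interpreter version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuple comparison is lexicographic.
    return tuple(version_info[:2]) >= tuple(minimum)


def require_minimum() -> None:
    """Exit with a readable message when the runtime is too old."""
    if not meets_minimum():
        found = ".".join(map(str, sys.version_info[:3]))
        raise SystemExit(f"Python >= 3.13 is required, found {found}")
```

Calling require_minimum() at the top of an entry-point script turns a confusing import failure into a one-line explanation.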


3. max_tokens → max_completion_tokens

The max_tokens parameter (deprecated in v1) has been removed. Passing it raises a TypeError.

Before (v1):

client.chat.completions.create(model=model, messages=[...], max_tokens=512)

After (v2):

await client.chat.completions.create(
    model=await client.models.resolve_chat(),
    messages=[...],
    max_completion_tokens=512,
)
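
If call sites build their keyword arguments in one place (a config dict, for instance), the rename can be applied centrally during the transition. The helper below is a hypothetical sketch, not an SDK function; it only rewrites a plain dict before it is splatted into the call.

```python
def migrate_chat_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with max_tokens renamed to max_completion_tokens."""
    migrated = dict(kwargs)  # never mutate the caller's dict
    if "max_tokens" in migrated:
        if "max_completion_tokens" in migrated:
            # Both names present is ambiguous; refuse rather than guess.
            raise ValueError("pass only one of max_tokens / max_completion_tokens")
        migrated["max_completion_tokens"] = migrated.pop("max_tokens")
    return migrated
```

A v1-era settings dict such as {"max_tokens": 512} can then be passed as **migrate_chat_kwargs(settings) until every call site has been updated.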

4. client.image.generate() → client.image.create()

The primary image-generation method was renamed. There is no deprecation alias — the old name raises AttributeError.

client.image.generate(...) → client.image.create(...)
client.image.get_available_styles() → client.image.list_styles()

client.image.simple_generate(...) is unchanged — it remains a thin OpenAI-compat shim around POST /images/generations. For full Venice features (LoRA, CFG, multi-variant, editing) use client.image.create(...).


5. Responses are now typed Pydantic models

In v1 several endpoints returned plain TypedDicts (read with subscripts, resp["data"]) or raw bytes. In v2 every response is a typed Pydantic model, read by attribute (resp.data). Subscripting a v2 response raises TypeError.

Dict → attribute access

This affects embeddings, models, billing, and audio.get_voices:

Before (v1):

result = client.embeddings.create(model=model, input=["hello"])
vector = result["data"][0]["embedding"]
total = result["usage"]["total_tokens"]

After (v2):

result = await client.embeddings.create(
    model=await client.models.resolve_embedding(), input=["hello"]
)
vector = result.data[0].embedding
total = result.usage.total_tokens

The same ["key"] → .key change applies to client.models.list(), client.billing.get_usage(), and client.audio.get_voices(). Field names are preserved; only the access style changed. (chat and characters responses were already Pydantic models in v1, so attribute access there is unchanged.)
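
Code that has to run against both versions during a staged rollout could read fields through a small accessor that accepts either shape. This is a sketch under that assumption: read_field is a hypothetical helper, and SimpleNamespace merely stands in for an attribute-style response object.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def read_field(obj, name):
    """Read `name` from a v1-style mapping or a v2-style attribute object."""
    if isinstance(obj, Mapping):
        return obj[name]       # v1: subscript access
    return getattr(obj, name)  # v2: attribute access


# Stand-ins for the two response shapes (illustrative data only).
v1_style = {"usage": {"total_tokens": 3}}
v2_style = SimpleNamespace(usage=SimpleNamespace(total_tokens=3))

total_v1 = read_field(read_field(v1_style, "usage"), "total_tokens")
total_v2 = read_field(read_field(v2_style, "usage"), "total_tokens")
```

Once the migration is complete, the accessor can be deleted and replaced with plain attribute access.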

audio.create_speech() returns AudioResponse, not bytes

Before (v1):

audio_bytes = client.audio.create_speech(model=tts_model, input="Hello", voice=voice)
open("out.mp3", "wb").write(audio_bytes)

After (v2): the result is an AudioResponse — use .save(), or read the raw bytes from .content:

audio = await client.audio.create_speech(
    model=await client.models.resolve_tts(), input="Hello", voice=voice
)
audio.save("out.mp3") # convenience helper
# or: open("out.mp3", "wb").write(audio.content)

api_keys return types

client.api_keys.retrieve() and client.api_keys.delete() previously returned raw dicts; they now return typed models (ApiKey and DeleteApiKeyResponse). Read their fields by attribute:

Before (v1):

details = client.api_keys.retrieve(api_key_id=key_id)
print(details["description"])

After (v2):

api_key = await client.api_keys.retrieve(api_key_id=key_id)
print(api_key.description)

create(), get_rate_limits(), and the web3 helpers also return typed models in v2.


6. Type import paths moved

The per-resource type modules moved under a new venice_ai.types.api package:

from venice_ai.types.image import ... → from venice_ai.types.api.images import ...
from venice_ai.types.models import ... → from venice_ai.types.api.models import ...
# likewise for api_keys, billing, characters, embeddings

Several response classes were also renamed (e.g. ChatCompletion → ChatCompletionResponse, ImageResponse → ImageGenerationResponse). If you only read responses by attribute you won't notice; if you imported the types directly, update the import.
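
A library that must support both SDK versions for a while can try the new module path first and fall back to the old one. The sketch below uses only importlib from the standard library; import_first is a hypothetical helper, and the venice_ai paths appear only in a comment, taken from the mapping above.

```python
import importlib


def import_first(*module_paths: str):
    """Import and return the first module in `module_paths` that exists."""
    last_error = None
    for path in module_paths:
        try:
            return importlib.import_module(path)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            last_error = exc
    raise ImportError(f"none of {module_paths} could be imported") from last_error


# Intended use (not executed here, since it needs the SDK installed):
# image_types = import_first("venice_ai.types.api.images", "venice_ai.types.image")
```

Because the helper catches every ImportError, it also hides import failures raised inside the new module itself. Treat it as a transitional shim and drop it once v1 support is no longer needed.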

For building chat messages, prefer the typed helpers from venice_ai.types.api (UserMessage, SystemMessage, UserMessage.builder()). Plain {"role": "user", "content": "..."} dicts are still accepted.


New in v2 (additive — no migration required)

None of the following change existing v1 call signatures; they're listed so v1 users know what's now available.

New top-level resources

  • client.video — text/image-to-video generation as async jobs (submit() / run()).
  • client.music — music generation as async jobs (submit() / run()).
  • client.crypto — JSON-RPC proxy (rpc(), batch_rpc()) with billing/idempotency headers surfaced.
  • client.augment — search(), scrape(), parse_text() over Venice's /augment/* endpoints.
  • client.x402 — wallet billing (balance(), transactions(), top_up()) via SIWE / EIP-4361 auth. See ADVANCED.md § x402 Wallet Authentication.
  • client.responses — alpha OpenAI-compatible Responses API.
  • client.tee — Trusted Execution Environment attestation and confidential compute.

audio also gained transcribe() (speech-to-text) and create_voice().

Model resolution

client.models gained resolve_chat(), resolve_image(), resolve_embedding(), resolve_tts(), resolve_video(), resolve_music(), and friends — always prefer these over hardcoding a model id, which goes stale on deprecation.

CLI, rate limiting, and observability

v2 ships a venice command-line tool (pip install exposes the venice entry point), pluggable rate limiting (SIMPLE / ADAPTIVE modes; the [adaptive] extra), configurable backends (in-memory by default, Redis via the [redis] extra), cost tracking, and structured logging.

New optional extras

pip install 'venice-ai[redis]' # Redis-backed rate limiting / caching
pip install 'venice-ai[adaptive]' # ADAPTIVE rate limiter
pip install 'venice-ai[x402]' # SIWE wallet auth (eth-account + siwe)
pip install 'venice-ai[x402-solana]' # Solana wallet settlement
pip install 'venice-ai[e2ee]' # TEE client-side encryption (cryptography)

enable_e2ee is now real client-side encryption

In v1, enable_e2ee had no effect. In v2 it engages real client-side E2EE: setting enable_e2ee=True (or e2ee=True on client.chat.completions.create) makes the SDK verify the model's attestation, encrypt each user/system message to the attested enclave key, stream the response, and decrypt it locally.

What to do:

  • Install the extra: pip install 'venice-ai[e2ee]' (pulls cryptography). Baseline attestation works without it; only encryption needs it.

  • Use an e2ee-* confidential-compute model. Discover one dynamically — do not hardcode a model id:

    models = await client.models.list(type="text")
    # next() without a default raises StopIteration if no model advertises E2EE support.
    e2ee_model = next(
        entry.id
        for entry in models.data
        if getattr(entry.model_spec.capabilities, "supportsE2EE", False)
    )

    resp = await client.chat.completions.create(
        model=e2ee_model,
        messages=[{"role": "user", "content": "Confidential prompt."}],
        e2ee=True,  # or venice_parameters={"enable_e2ee": True}
    )
  • Note the constraints: tool calling, web search/scraping, and multimodal (image/file) content are rejected with InvalidRequestError under E2EE.

  • Security limitation: attestation verification is baseline — it trusts Venice's server-side verified claim and the nonce / report-data binding, but does not perform full client-side Intel TDX + NVIDIA quote verification. A one-time UserWarning is emitted on engagement. Supply a FullQuoteVerifier via e2ee=TeeOptions(verifier=...) if your threat model requires it. See CHANGELOG § TEE client-side end-to-end encryption.

New kwargs on existing methods

| Method | New kwarg(s) | Notes |
| --- | --- | --- |
| client.image.create | enable_web_search | Optional; supported models pull recent web context. |
| client.chat.completions.create | store, text, include, metadata, prompt_cache_retention | OpenAI-compat passthroughs + Venice's cache-retention tier. |

Further Reading

  • CHANGELOG — complete list of all changes in v2.0.0
  • README — updated usage examples for v2