Migration Guide: v1 → v2
venice-ai v2.0.0 is a ground-up rewrite. The client is now async-first, every response is a typed Pydantic model, and a large set of new resources (video, music, crypto, augment, x402, a CLI, rate limiting, and more) ships alongside the v1 surface.
This guide covers everything a v1.3.x user must change to move to v2.0.0, in rough order of impact. For the complete list of additions, see the CHANGELOG.
1. The client is now async by default
This is the largest change. In v1, VeniceClient was synchronous and a separate
AsyncVeniceClient provided the async API. In v2 those roles changed:
- VeniceClient is now asynchronous. Its methods are coroutines and must be awaited.
- AsyncVeniceClient has been removed. Importing it (from venice_ai import AsyncVeniceClient) raises ImportError.
- SyncVeniceClient is the new synchronous client for code that genuinely can't go async.
Before (v1) — synchronous VeniceClient:
from venice_ai import VeniceClient
client = VeniceClient(api_key="...")
response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
After (v2) — async by default:
import asyncio
from venice_ai import VeniceClient
from venice_ai.types.api import UserMessage

async def main():
    async with VeniceClient() as client:  # reads VENICE_API_KEY from env
        response = await client.chat.completions.create(
            model=await client.models.resolve_chat(),
            messages=[UserMessage(content="Hello")],
        )
        print(response.text)

asyncio.run(main())
If you must stay synchronous, swap in SyncVeniceClient — same surface, no await:
from venice_ai import SyncVeniceClient
from venice_ai.types.api import UserMessage

with SyncVeniceClient() as client:
    response = client.chat.completions.create(
        model=client.models.resolve_chat(),
        messages=[UserMessage(content="Hello")],
    )
    print(response.text)
What to do:
- If you used AsyncVeniceClient, rename it to VeniceClient (it is now the async client).
- If you used the synchronous VeniceClient, either move your calls into an async context and await them, or switch to SyncVeniceClient.
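One practical payoff of moving to the async client is that independent requests can run concurrently. The sketch below shows that pattern using only the standard library. fake_completion is a hypothetical stand-in for an awaited SDK call (it is not part of venice-ai), so the snippet runs offline.

```python
import asyncio


async def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for an awaited SDK call; not part of venice-ai."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"echo: {prompt}"


async def ask_all(prompts: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_completion(p) for p in prompts))


results = asyncio.run(ask_all(["Hello", "Bonjour"]))
print(results)
```

In real code the stand-in would be replaced by awaited client calls made inside a single async with block.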
2. Python ≥3.13 required
v2.0.0 requires Python ≥3.13. v1.3.x supported Python 3.11–3.12; support for both has been dropped.
What to do: Upgrade your runtime to 3.13 or later before upgrading the SDK.
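A deployment script can check the interpreter version before attempting the upgrade. This is a minimal sketch using only the standard library; MIN_PYTHON and meets_minimum are illustrative names, not part of the SDK.

```python
import sys

MIN_PYTHON = (3, 13)  # minimum stated in this guide


def meets_minimum(version_info=sys.version_info) -> bool:
    """Return True when the (major, minor) pair is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not meets_minimum():
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"venice-ai v2 needs Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+; found {found}")
```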
3. max_tokens → max_completion_tokens
The max_tokens parameter (deprecated in v1) has been removed. Passing it raises
a TypeError.
Before (v1):
client.chat.completions.create(model=model, messages=[...], max_tokens=512)
After (v2):
await client.chat.completions.create(
    model=await client.models.resolve_chat(),
    messages=[...],
    max_completion_tokens=512,
)
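If request arguments are assembled as dictionaries in several places, a small adapter can rename the key in one spot while call sites are migrated. The helper below is a hypothetical convenience (migrate_chat_kwargs is not an SDK function) and only manipulates a plain dict.

```python
def migrate_chat_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with max_tokens renamed to max_completion_tokens."""
    migrated = dict(kwargs)  # leave the caller's dict untouched
    if "max_tokens" in migrated:
        value = migrated.pop("max_tokens")
        # An explicitly supplied max_completion_tokens takes precedence.
        migrated.setdefault("max_completion_tokens", value)
    return migrated


print(migrate_chat_kwargs({"max_tokens": 512, "temperature": 0.2}))
```

The result can then be unpacked into the call with ** as usual.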
4. client.image.generate() → client.image.create()
The primary image-generation method was renamed. There is no deprecation alias —
the old name raises AttributeError.
client.image.generate(...) → client.image.create(...)
client.image.get_available_styles() → client.image.list_styles()
client.image.simple_generate(...) is unchanged — it remains a thin
OpenAI-compat shim around POST /images/generations. For full Venice features
(LoRA, CFG, multi-variant, editing) use client.image.create(...).
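Because there is no deprecation alias, it can help to rewrite the two renamed calls mechanically. The following is a rough sketch, assuming the calls appear textually as .image.generate( and .image.get_available_styles( in your source. rewrite_image_calls is an illustrative helper, and its output should be reviewed before committing.

```python
import re

# Old call prefix (as a regex) mapped to its v2 replacement.
RENAMES = {
    r"\.image\.generate\(": ".image.create(",
    r"\.image\.get_available_styles\(": ".image.list_styles(",
}


def rewrite_image_calls(source: str) -> str:
    """Apply the two renames from this section; simple_generate is not matched."""
    for pattern, replacement in RENAMES.items():
        source = re.sub(pattern, replacement, source)
    return source


old = 'client.image.generate(prompt="x")\nclient.image.simple_generate(prompt="x")'
print(rewrite_image_calls(old))
```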
5. Responses are now typed Pydantic models
In v1 several endpoints returned plain TypedDicts (read with subscripts,
resp["data"]) or raw bytes. In v2 every response is a typed Pydantic model, read
by attribute (resp.data). Subscripting a v2 response raises TypeError.
Dict → attribute access
This affects embeddings, models, billing, and audio.get_voices:
Before (v1):
result = client.embeddings.create(model=model, input=["hello"])
vector = result["data"][0]["embedding"]
total = result["usage"]["total_tokens"]
After (v2):
result = await client.embeddings.create(
    model=await client.models.resolve_embedding(), input=["hello"]
)
vector = result.data[0].embedding
total = result.usage.total_tokens
The same ["key"] → .key change applies to client.models.list(),
client.billing.get_usage(), and client.audio.get_voices(). Field names are
preserved — only the access style changed. (chat and characters responses were
already Pydantic models in v1, so attribute access there is unchanged.)
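The difference in access style can be reproduced without the SDK. The sketch below uses standard-library dataclasses as a stand-in for the Pydantic response models. That substitution is an assumption made so the snippet runs without venice-ai installed, and the class names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Embedding:  # illustrative stand-in, not the SDK class
    embedding: list[float]


@dataclass
class EmbeddingResult:  # illustrative stand-in, not the SDK class
    data: list[Embedding]


result = EmbeddingResult(data=[Embedding(embedding=[0.1, 0.2])])
vector = result.data[0].embedding  # attribute access works

try:
    result["data"]  # v1-style subscript on an object without __getitem__
    subscript_failed = False
except TypeError:
    subscript_failed = True

print(vector, subscript_failed)
```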
audio.create_speech() returns AudioResponse, not bytes
Before (v1):
audio_bytes = client.audio.create_speech(model=tts_model, input="Hello", voice=voice)
open("out.mp3", "wb").write(audio_bytes)
After (v2): the result is an AudioResponse — use .save(), or read the raw
bytes from .content:
audio = await client.audio.create_speech(
    model=await client.models.resolve_tts(), input="Hello", voice=voice
)
audio.save("out.mp3") # convenience helper
# or: open("out.mp3", "wb").write(audio.content)
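Code that must handle both return shapes during a staged migration could normalise them first. The helper below is a hypothetical sketch. It relies only on the .content attribute described above, and it uses a SimpleNamespace object in place of a real AudioResponse.

```python
from types import SimpleNamespace


def audio_bytes_of(result) -> bytes:
    """Return raw audio bytes from a v1 result (bytes) or a v2-style object with .content."""
    if isinstance(result, (bytes, bytearray)):
        return bytes(result)
    return result.content


v1_style = b"\x00\x01"
v2_style = SimpleNamespace(content=b"\x00\x01")  # stand-in for AudioResponse
print(audio_bytes_of(v1_style) == audio_bytes_of(v2_style))
```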
api_keys return types
client.api_keys.retrieve() and client.api_keys.delete() previously returned raw
dicts; they now return typed models (ApiKey and DeleteApiKeyResponse). Read
their fields by attribute:
Before (v1):
details = client.api_keys.retrieve(api_key_id=key_id)
print(details["description"])
After (v2):
api_key = await client.api_keys.retrieve(api_key_id=key_id)
print(api_key.description)
The other api_keys methods (create(), get_rate_limits(), and the web3 helpers) also return typed models in v2.
6. Type import paths moved
The per-resource type modules moved under a new venice_ai.types.api package:
from venice_ai.types.image import ... → from venice_ai.types.api.images import ...
from venice_ai.types.models import ... → from venice_ai.types.api.models import ...
# likewise for api_keys, billing, characters, embeddings
Several response classes were also renamed (e.g. ChatCompletion →
ChatCompletionResponse, ImageResponse → ImageGenerationResponse). If you only
read responses by attribute you won't notice; if you imported the types directly,
update the import.
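A codebase that temporarily needs to import from either layout could try the new module path first and fall back to the old one. This is a generic standard-library sketch; import_first is an illustrative helper, not an SDK function. It only resolves modules, so renamed classes still need to be handled by the caller.

```python
import importlib


def import_first(*module_names: str):
    """Import and return the first module in module_names that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# Demonstrated with the standard library so the snippet runs anywhere.
# For this guide the call would look like:
#   import_first("venice_ai.types.api.images", "venice_ai.types.image")
module = import_first("definitely_missing_module_for_demo", "json")
print(module.__name__)
```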
For building chat messages, prefer the typed helpers from venice_ai.types.api
(UserMessage, SystemMessage, UserMessage.builder()). Plain
{"role": "user", "content": "..."} dicts are still accepted.
New in v2 (additive — no migration required)
None of the following change existing v1 call signatures; they're listed so v1 users know what's now available.
New top-level resources
- client.video: text/image-to-video generation as async jobs (submit()/run()).
- client.music: music generation as async jobs (submit()/run()).
- client.crypto: JSON-RPC proxy (rpc(), batch_rpc()) with billing/idempotency headers surfaced.
- client.augment: search(), scrape(), parse_text() over Venice's /augment/* endpoints.
- client.x402: wallet billing (balance(), transactions(), top_up()) via SIWE / EIP-4361 auth. See ADVANCED.md § x402 Wallet Authentication.
- client.responses: alpha OpenAI-compatible Responses API.
- client.tee: Trusted Execution Environment attestation and confidential compute.
audio also gained transcribe() (speech-to-text) and create_voice().
Model resolution
client.models gained resolve_chat(), resolve_image(), resolve_embedding(),
resolve_tts(), resolve_video(), resolve_music(), and friends — always prefer
these over hardcoding a model id, which goes stale on deprecation.
CLI, rate limiting, and observability
v2 ships a venice command-line tool (pip install exposes the venice entry
point), pluggable rate limiting (SIMPLE / ADAPTIVE modes; the [adaptive] extra),
configurable backends (in-memory by default, Redis via the [redis] extra), cost
tracking, and structured logging.
New optional extras
pip install 'venice-ai[redis]' # Redis-backed rate limiting / caching
pip install 'venice-ai[adaptive]' # ADAPTIVE rate limiter
pip install 'venice-ai[x402]' # SIWE wallet auth (eth-account + siwe)
pip install 'venice-ai[x402-solana]' # Solana wallet settlement
pip install 'venice-ai[e2ee]' # TEE client-side encryption (cryptography)
enable_e2ee is now real client-side encryption
In v1, enable_e2ee had no effect. In v2 it engages real client-side E2EE: setting
enable_e2ee=True (or e2ee=True on client.chat.completions.create) makes the SDK
verify the model's attestation, encrypt each user/system message to the attested
enclave key, stream the response, and decrypt it locally.
What to do:
- Install the extra: pip install 'venice-ai[e2ee]' (pulls cryptography). Baseline attestation works without it; only encryption needs it.
- Use an e2ee-* confidential-compute model. Discover one dynamically; do not hardcode a model id:

      models = await client.models.list(type="text")
      e2ee_model = next(
          entry.id
          for entry in models.data
          if getattr(entry.model_spec.capabilities, "supportsE2EE", False)
      )
      resp = await client.chat.completions.create(
          model=e2ee_model,
          messages=[{"role": "user", "content": "Confidential prompt."}],
          e2ee=True,  # or venice_parameters={"enable_e2ee": True}
      )

- Note the constraints: tool calling, web search/scraping, and multimodal (image/file) content are rejected with InvalidRequestError under E2EE.
- Security limitation: attestation verification is baseline. It trusts Venice's server-side verified claim and the nonce / report-data binding, but does not perform full client-side Intel TDX + NVIDIA quote verification. A one-time UserWarning is emitted on engagement. Supply a FullQuoteVerifier via e2ee=TeeOptions(verifier=...) if your threat model requires it. See CHANGELOG § TEE client-side end-to-end encryption.
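The model-discovery step uses next() without a default, which raises StopIteration when no listed model advertises E2EE. A more defensive variant returns None so the caller can decide what to do. The sketch below runs against SimpleNamespace stand-ins whose model ids are made up for illustration; the attribute path (model_spec.capabilities.supportsE2EE) is taken from the snippet in this section.

```python
from types import SimpleNamespace


def pick_e2ee_model(entries):
    """Return the id of the first entry whose capabilities flag supportsE2EE, else None."""
    for entry in entries:
        if getattr(entry.model_spec.capabilities, "supportsE2EE", False):
            return entry.id
    return None


def fake_entry(model_id: str, **capabilities):
    # Stand-in for an item of models.data; not an SDK type.
    caps = SimpleNamespace(**capabilities)
    return SimpleNamespace(id=model_id, model_spec=SimpleNamespace(capabilities=caps))


entries = [
    fake_entry("plain-example"),                      # flag missing entirely
    fake_entry("e2ee-example", supportsE2EE=True),    # hypothetical id
]
print(pick_e2ee_model(entries))
```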
New kwargs on existing methods
| Method | New kwarg(s) | Notes |
|---|---|---|
| client.image.create | enable_web_search | Optional; supported models pull recent web context. |
| client.chat.completions.create | store, text, include, metadata, prompt_cache_retention | OpenAI-compat passthroughs + Venice's cache-retention tier. |