venice_ai.resources.video

Venice AI Video API resources.

This module provides classes for interacting with the Venice AI Video API, supporting video generation operations including text-to-video and image-to-video.

The video API allows for:

  • Queuing video generation requests (text-to-video and image-to-video)
  • Getting price quotes before generation
  • Retrieving video generation results (polling for completion)
  • Marking videos as complete / deleting from storage
  • High-level VideoJob abstraction for lifecycle management

VideoJob Objects

class VideoJob()

Manages the lifecycle of an async video generation request.

Use as an async context manager to guarantee server-side cleanup:

async with await client.video.run(model=model, prompt="...", duration_seconds=5) as job:
    status = await job.wait()
    await job.download("output.mp4", status)

VideoJob.__aexit__

async def __aexit__(exc_type: type[BaseException] | None,
_exc_val: BaseException | None, _exc_tb: object) -> None

Guarantee server-side cleanup on exit.

Propagates any in-flight exception from the user's block. If both the user code and cleanup raise, the user's exception wins; the cleanup failure is logged.
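The exit semantics described above can be illustrated with a minimal sketch. `SketchJob` is a hypothetical stand-in, not the SDK's `VideoJob`, and its `cancel()` only simulates the server-side cleanup call. What happens when cleanup fails and the user's block did not raise is not stated here; the sketch assumes the cleanup error is surfaced in that case.

```python
import asyncio
import logging

logger = logging.getLogger("video_job_sketch")


class SketchJob:
    """Hypothetical stand-in for VideoJob; not the SDK class."""

    def __init__(self, cleanup_fails: bool = False) -> None:
        self.cleanup_fails = cleanup_fails
        self.cleaned_up = False

    async def cancel(self) -> None:
        # Simulates the server-side cleanup request.
        if self.cleanup_fails:
            raise RuntimeError("cleanup failed")
        self.cleaned_up = True

    async def __aenter__(self) -> "SketchJob":
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
        try:
            await self.cancel()
        except Exception:
            if exc_type is None:
                # Assumption: with no user exception, surface the cleanup error.
                raise
            # The user's exception wins; the cleanup failure is only logged.
            logger.exception("cleanup failed; propagating the user's exception")
        # Returning None (falsy) lets any in-flight exception propagate.


async def demo() -> str:
    try:
        async with SketchJob(cleanup_fails=True):
            raise ValueError("user error")
    except ValueError as exc:
        return f"caught: {exc}"
    return "no exception"


async def demo_clean_exit() -> bool:
    async with SketchJob() as job:
        pass
    return job.cleaned_up
```

In `demo()`, both the block and the cleanup raise, and the caller sees only the `ValueError` from the block.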

VideoJob.status

@property
def status() -> VideoRetrieveResponse | None

Last known status from polling.

VideoJob.progress

@property
def progress() -> float | None

Progress as a fraction 0.0–1.0, or None if not processing.

VideoJob.poll

async def poll() -> VideoRetrieveResponse

Single poll — check current status.

VideoJob.wait

async def wait(
*,
poll_interval: float = 5.0,
max_polls: int = 120,
on_progress: Callable[[VideoProcessingStatus], None] | None = None
) -> VideoCompletedStatus

Poll until complete or failed. Returns completed status or raises.

Arguments:

  • poll_interval: Seconds between polls.
  • max_polls: Maximum number of polls before raising TimeoutError.
  • on_progress: Optional callback invoked on each processing status update.

Raises:

  • VideoGenerationError: If the server reports generation failure.
  • TimeoutError: If max_polls is exhausted.
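With the defaults, wait gives up after 120 polls spaced 5 seconds apart, roughly 10 minutes plus request latency. The loop below is a sketch of that behavior under stated assumptions, not the SDK's implementation: statuses are plain dicts with a hypothetical `state` key, `GenerationFailed` stands in for VideoGenerationError, and `make_fake_poll` is a test helper that replays canned statuses.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class GenerationFailed(Exception):
    """Hypothetical stand-in for VideoGenerationError."""


async def wait_sketch(
    poll: Callable[[], Awaitable[dict]],
    *,
    poll_interval: float = 5.0,
    max_polls: int = 120,
    on_progress: Optional[Callable[[dict], None]] = None,
) -> dict:
    """Poll until a terminal state, or raise after max_polls attempts."""
    for _ in range(max_polls):
        status = await poll()
        state = status["state"]  # hypothetical field name
        if state == "completed":
            return status
        if state == "failed":
            raise GenerationFailed(status.get("error", "unknown error"))
        # Still processing: report progress, then sleep before the next poll.
        if on_progress is not None:
            on_progress(status)
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"not complete after {max_polls} polls")


def make_fake_poll(statuses: list) -> Callable[[], Awaitable[dict]]:
    """Return a poll() coroutine function that replays canned statuses."""
    iterator = iter(statuses)

    async def poll() -> dict:
        return next(iterator)

    return poll
```

The callback fires only for processing statuses, so a job that completes on the first poll never invokes it.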

VideoJob.download

async def download(path: str | Path, status: VideoCompletedStatus) -> Path

Download a completed video to path. This does NOT call cancel(); use the async context manager to get server-side cleanup.

File I/O is offloaded to a worker thread so the event loop never blocks. URL downloads reuse the SDK client's managed HTTP session, so proxy, SSL, timeout, and retry configuration are honored.
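One common way to keep file writes off the event loop is `asyncio.to_thread`. The sketch below assumes that approach for illustration; the SDK may offload differently. `save_video_sketch` and `_write_bytes` are hypothetical names, and the bytes are passed in directly rather than fetched over HTTP.

```python
import asyncio
import tempfile
from pathlib import Path
from typing import Union


def _write_bytes(path: Path, data: bytes) -> Path:
    # Blocking file I/O; runs in a worker thread, not on the event loop.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path.resolve()


async def save_video_sketch(path: Union[str, Path], data: bytes) -> Path:
    """Write data to path in a worker thread and return the resolved path."""
    return await asyncio.to_thread(_write_bytes, Path(path), data)


async def demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        saved = await save_video_sketch(Path(tmp) / "out" / "clip.mp4", b"fake-mp4-bytes")
        return saved.read_bytes()
```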

Arguments:

  • path: Destination file path.
  • status: A completed status (from :meth:wait or :meth:poll).

Returns:

The resolved :class:Path of the saved file.

VideoJob.cancel

async def cancel() -> VideoCompleteResponse

Release server-side storage / cancel an in-progress job.

Wraps the /video/complete endpoint, which deletes the queue entry server-side regardless of whether the job has finished. Named cancel (rather than the wire-format complete) to distinguish it from the :attr:is_complete state check — terminal states are polled via :meth:wait / :attr:status.

Video Objects

class Video(APIResource["VeniceClient"])

Asynchronous interface for Venice AI's Video generation API.

The Video class provides access to Venice's video generation endpoints including text-to-video, image-to-video, quoting, retrieval, and completion.

Core Capabilities:

  • Text-to-Video: Generate video from text prompts
  • Image-to-Video: Animate a reference image into video
  • Price Quoting: Get cost estimates before generation
  • Result Retrieval: Poll for video generation status and download URL
  • Cleanup: Mark videos as complete and delete from storage

Usage Patterns: The Video class is accessed through the Venice AI client's :attr:~venice_ai._client.VeniceClient.video property rather than instantiated directly.

Typical Workflow:

  1. Call :meth:quote to get a price estimate
  2. Call :meth:submit to start generation (returns a queue_id)
  3. Poll :meth:retrieve with the queue_id until status is COMPLETED
  4. Download the video from the returned URL
  5. Call :meth:cancel to clean up server-side storage

Arguments:

  • client: The Venice AI client instance providing authentication and connection management.

Example:

Basic text-to-video generation workflow:

async with VeniceClient() as client:
    # Use one model ID for the quote and the generation it prices.
    model = "wan-2.6-text-to-video"

    # Get a price quote (pricing is driven by model + duration +
    # resolution; /video/quote does not accept a prompt).
    quote = await client.video.quote(
        model=model,
        duration_seconds=5,
    )
    print(f"Estimated cost: ${quote.quote}")

    # Queue the generation
    result = await client.video.submit(
        model=model,
        prompt="A sunset over the ocean with gentle waves",
        duration_seconds=5,
        aspect_ratio="16:9",
    )
    print(f"Queue ID: {result.queue_id}")

    # Check status (repeat this call until the status is completed)
    status = await client.video.retrieve(
        model=model,
        queue_id=result.queue_id,
    )

    # Clean up after download
    await client.video.cancel(
        model=model,
        queue_id=result.queue_id,
    )

Video.submit

async def submit(
*,
model: str,
prompt: str,
duration_seconds: int | str,
negative_prompt: str | None = None,
resolution: str | None = None,
audio: bool | None = None,
aspect_ratio: str | None = None,
image_url: str | None = None,
upscale_factor: Literal[1, 2, 4] | None = None,
end_image_url: str | None = None,
audio_url: str | None = None,
video_url: str | None = None,
reference_image_urls: list[str] | None = None,
reference_audio_urls: list[str] | None = None,
reference_video_urls: list[str] | None = None,
elements: list[VideoElement | dict] | None = None,
scene_image_urls: list[str] | None = None,
consents: VideoConsents | dict | None = None) -> VideoQueueResponse

Queue a new video generation request.

Automatically selects the appropriate request type based on whether image_url is provided:

  • With image_url: Uses image-to-video (I2V) request
  • Without image_url: Uses text-to-video (T2V) request
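The selection rule above can be sketched as a small function. `select_request_type` is a hypothetical helper, not part of the SDK, and the returned strings are labels chosen for illustration. The prefix check mirrors the documented constraint that image_url must start with http://, https://, or data:.

```python
from typing import Optional

ALLOWED_PREFIXES = ("http://", "https://", "data:")


def select_request_type(image_url: Optional[str] = None) -> str:
    """Return which request type a submit() call would use (illustrative labels)."""
    if image_url is None:
        return "text-to-video"
    if not image_url.startswith(ALLOWED_PREFIXES):
        raise ValueError("image_url must start with http://, https://, or data:")
    return "image-to-video"
```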

Call :meth:quote first to get a price estimate, then poll :meth:retrieve with the returned queue_id until the video is ready.

Arguments:

  • model (str): Video model ID (e.g., "wan-2.6-text-to-video").
  • prompt (str): Text prompt for video generation (max length varies by model; default 2500 chars, up to 10000 for some models).
  • duration_seconds (int | str): Duration of generated video as an integer number of seconds (e.g. 5, 10). Liberal string parsing also accepts "5" / "5s" / "5 seconds". The wire format "5s" is generated internally. Valid values vary by model.
  • negative_prompt (Optional[str]): Negative prompt to avoid unwanted content.
  • resolution (Optional[str]): Output resolution (e.g., "720p", "1080p"). Valid values vary by model.
  • audio (Optional[bool]): Generate audio if model supports it.
  • aspect_ratio (Optional[str]): Aspect ratio (e.g., "16:9", "9:16"). Typically required for T2V, ignored for I2V.
  • image_url (Optional[str]): Reference image URL for image-to-video generation. Must start with http://, https://, or data:. When provided, switches to I2V request type.
  • upscale_factor (Optional[Literal[1, 2, 4]]): Upscale models only. 1 = quality enhancement, 2 = double resolution (default for topaz-video-upscale), 4 = quadruple.
  • end_image_url (Optional[str]): End-frame image for models that support transitions.
  • audio_url (Optional[str]): Background audio input (WAV/MP3, max 30s/15MB) for models that support it.
  • video_url (Optional[str]): Source video for video-to-video / upscale models (MP4/MOV/WebM).
  • reference_image_urls (Optional[list[str]]): Up to 9 reference images for character or style consistency.
  • reference_audio_urls (Optional[list[str]]): Up to 3 reference audio donors for R2V models (e.g. Seedance 2.0 R2V).
  • reference_video_urls (Optional[list[str]]): Up to 3 reference video donors for R2V models (e.g. Seedance 2.0 R2V) used to inherit subject motion, camera movement, and overall style.
  • elements (Optional[list[VideoElement | dict]]): Up to 4 structured character/object elements for advanced element-aware models (Kling O3 R2V). Each dict should include frontal_image_url and optional reference_image_urls. Reference in the prompt as @Element1, @Element2, etc.
  • scene_image_urls (Optional[list[str]]): Up to 4 scene reference images for element-aware models. Reference as @Image1, @Image2, etc.
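The liberal parsing described for duration_seconds can be sketched as follows. `to_wire_duration` is a hypothetical helper that accepts only the forms listed above (5, "5", "5s", "5 seconds") and emits the "5s" wire format; the SDK's actual parser may accept more or fewer forms. Rejecting non-positive values is an assumption.

```python
import re
from typing import Union

# Digits, optional whitespace, then an optional "s", "second", or "seconds".
_DURATION_RE = re.compile(r"^\s*(\d+)\s*(?:seconds?|s)?\s*$", re.IGNORECASE)


def to_wire_duration(value: Union[int, str]) -> str:
    """Normalize an int or a liberal duration string to the wire form, e.g. '5s'."""
    if isinstance(value, bool):
        # bool is a subclass of int; reject it explicitly.
        raise TypeError("duration must be an int or str, not bool")
    if isinstance(value, int):
        seconds = value
    else:
        match = _DURATION_RE.match(value)
        if match is None:
            raise ValueError(f"unrecognized duration: {value!r}")
        seconds = int(match.group(1))
    if seconds <= 0:
        # Assumption: durations must be positive.
        raise ValueError("duration must be positive")
    return f"{seconds}s"
```

Whether a given duration is valid for a particular model is a separate check that this sketch does not attempt.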

Raises:

  • venice_ai.exceptions.APIError: If the API request fails.
  • pydantic.ValidationError: If request parameters are invalid.

Example:

Text-to-video:

result = await client.video.submit(
    model="wan-2.6-text-to-video",
    prompt="A cat playing piano",
    duration_seconds=5,
    aspect_ratio="16:9",
    resolution="1080p",
)

Image-to-video:

result = await client.video.submit(
    model="wan-2.6-image-to-video",
    prompt="Make this photo come to life",
    duration_seconds=5,
    image_url="https://example.com/photo.jpg",
)

Returns:

VideoQueueResponse: Queue response containing model and queue_id.

Video.quote

async def quote(
*,
model: str,
duration_seconds: int | str,
aspect_ratio: str | None = None,
resolution: str | None = None,
upscale_factor: Literal[1, 2, 4] | None = None,
audio: bool | None = None,
video_url: str | None = None,
reference_video_total_duration: float | None = None
) -> VideoQuoteResponse

Get a price estimate for a video generation request.

Returns the estimated cost in USD. The /video/quote endpoint prices based on model + duration + resolution + upscale; prompt text and reference images do not affect the quote and are not sent (see the Venice API spec).

Arguments:

  • model: Video model ID (e.g., "wan-2-7-text-to-video").
  • duration_seconds: Duration as an integer number of seconds (e.g. 5, 10). Liberal string parsing also accepts "5" / "5s" / "5 seconds". The wire form "5s" is generated internally.
  • aspect_ratio: Aspect ratio (e.g., "16:9", "9:16").
  • resolution: Output resolution (e.g., "720p", "1080p").
  • upscale_factor: For upscale models: 1 = quality enhancement, 2 = double resolution, 4 = quadruple.
  • audio: Generate audio if the model supports it.
  • video_url: Source video for video-to-video / upscale quotes (MP4/MOV/WebM — HTTP URL or data: URI).
  • reference_video_total_duration: For R2V models (e.g. Seedance 2.0 R2V), the aggregate duration in seconds of all reference videos to include in the quote. When provided, the quote reflects the 'input with video' rate tier; when omitted, the no-reference baseline is returned.

Raises:

  • venice_ai.exceptions.APIError: If the API request fails.
  • pydantic.ValidationError: If request parameters are invalid.

Example:

quote = await client.video.quote(
    model="wan-2-7-text-to-video",
    duration_seconds=5,
    aspect_ratio="16:9",
    resolution="720p",
)
print(f"Estimated cost: ${quote.quote}")

Returns:

VideoQuoteResponse: Quote response containing estimated cost.

Video.retrieve

async def retrieve(
*,
model: str,
queue_id: str,
delete_media_on_completion: bool = False) -> VideoRetrieveResponse

Retrieve the result of a video generation request.

Poll this endpoint with the queue_id from :meth:submit until the video is ready. Returns one of three status types:

  • :class:~venice_ai.types.api.video.VideoProcessingStatus — still processing, poll again
  • :class:~venice_ai.types.api.video.VideoFailedStatus — generation failed
  • :class:~venice_ai.types.api.video.VideoCompletedStatus — complete, download from url

Arguments:

  • model (str): Model ID used for generation.
  • queue_id (str): Queue ID from the queue response.
  • delete_media_on_completion (bool): Auto-delete media after successful retrieval. If True, you don't need to call :meth:cancel.

Raises:

  • venice_ai.exceptions.APIError: If the API request fails.
  • pydantic.ValidationError: If request parameters are invalid.

Example:

import asyncio

# Poll until complete
while True:
    status = await client.video.retrieve(
        model="wan-2.6-text-to-video",
        queue_id="queue_abc123",
    )
    if hasattr(status, 'url'):
        print(f"Video ready: {status.url}")
        break
    elif hasattr(status, 'error'):
        print(f"Failed: {status.error}")
        break
    else:
        print(f"Processing: {status.progress_percent:.0f}%")
    await asyncio.sleep(5)

Returns:

VideoRetrieveResponse: Video retrieval response (union of status types).

Video.cancel

async def cancel(*, model: str, queue_id: str) -> VideoCompleteResponse

Release server-side storage for a video job (cancel / cleanup).

Wraps the /video/complete endpoint, which deletes the queue entry server-side regardless of whether generation has finished. Call this after successfully downloading the video, or to abort an in-progress job. Not needed if delete_media_on_completion was set to True in the :meth:retrieve request.

Arguments:

  • model (str): Model ID used for generation.
  • queue_id (str): Queue ID to release.

Raises:

  • venice_ai.exceptions.APIError: If the API request fails.
  • pydantic.ValidationError: If request parameters are invalid.

Example:

result = await client.video.cancel(
    model="wan-2.6-text-to-video",
    queue_id="queue_abc123",
)
print(f"Cleanup successful: {result.success}")

Returns:

VideoCompleteResponse: Complete response indicating success.

Video.run

async def run(*,
model: str,
prompt: str,
duration_seconds: int | str,
negative_prompt: str | None = None,
resolution: str | None = None,
audio: bool | None = None,
aspect_ratio: str | None = None,
image_url: str | None = None,
upscale_factor: Literal[1, 2, 4] | None = None,
end_image_url: str | None = None,
audio_url: str | None = None,
video_url: str | None = None,
reference_image_urls: list[str] | None = None,
reference_audio_urls: list[str] | None = None,
reference_video_urls: list[str] | None = None,
elements: list[VideoElement | dict] | None = None,
scene_image_urls: list[str] | None = None,
consents: VideoConsents | dict | None = None) -> VideoJob

Queue a video generation and return a :class:VideoJob for lifecycle management.

Accepts the same parameters as :meth:submit. The returned job should be used as an async context manager to guarantee server-side cleanup:

async with await client.video.run(
model=model, prompt="...", duration_seconds=5
) as job:
status = await job.wait()
await job.download("output.mp4", status)

Returns:

A :class:VideoJob handle.

Video.transcribe

async def transcribe(
url: str,
*,
response_format: Literal["json", "text"] = "json"
) -> VideoTranscriptionResponse | str

Transcribe a public video URL to text.

Wraps POST /api/v1/video/transcriptions. Priced at a flat $0.02 per request.

Arguments:

  • url (str): Publicly accessible video URL (e.g. a YouTube watch URL).
  • response_format (Literal["json", "text"]): "json" (default) returns a :class:VideoTranscriptionResponse with transcript and lang; "text" returns the raw transcript as str.
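Because the return type depends on response_format, calling code that handles both modes may want to normalize the result. The sketch below assumes a stand-in dataclass, `TranscriptionSketch`, with the transcript and lang fields named above; it is not the SDK's VideoTranscriptionResponse, and `transcript_text` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TranscriptionSketch:
    """Hypothetical stand-in for VideoTranscriptionResponse."""

    transcript: str
    lang: str


def transcript_text(result: Union[TranscriptionSketch, str]) -> str:
    """Return the transcript whether the call used 'json' or 'text' format."""
    if isinstance(result, str):
        return result  # response_format="text" yields the raw transcript
    return result.transcript  # response_format="json" yields a model
```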

Raises:

  • venice_ai.exceptions.APIError: If the API request fails.
  • pydantic.ValidationError: If url is not an http(s) URL.

Example:

result = await client.video.transcribe(
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
)
print(result.lang, result.transcript)

Returns:

VideoTranscriptionResponse | str: Either the parsed response model or the plain transcript string, depending on response_format.