Venice AI Embeddings API Resources.
This module provides asynchronous client interfaces for Venice AI's Embeddings API, which generates vector embeddings from text inputs. Embeddings are dense vector representations that capture the semantic meaning of text, so that texts can be compared and analyzed numerically.
Key Features:
- Text Embedding Generation: Convert text into dense vector representations
- Batch Processing: Generate embeddings for multiple texts in a single API call
- Multiple Input Formats: Support for a single string, a list of strings, a token array, or a list of token arrays
- Configurable Dimensions: Adjust embedding dimensions for performance optimization
- Format Control: Choose between float arrays and base64-encoded representations
- Asynchronous Operations: Full async/await support for scalable applications
Common Use Cases:
- Semantic Search: Find documents similar in meaning to a query
- Text Classification: Assign texts to categories using their embeddings as features
- Clustering Analysis: Discover hidden patterns in text collections
- Recommendation Systems: Suggest content based on semantic relationships
- Similarity Scoring: Measure semantic distance between texts
- Content Deduplication: Identify duplicate or near-duplicate content
Vector embeddings enable sophisticated text analysis by transforming human language into mathematical representations that preserve semantic relationships. Texts with similar meanings will have similar embedding vectors, allowing for computational comparison and analysis of semantic content.
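To make the idea of "similar vectors" concrete, the following is a minimal sketch of cosine similarity and a similarity ranking, the core of a semantic search. The vectors are small hand-made stand-ins rather than real model output, and cosine_similarity and rank_by_similarity are hypothetical helper names, not part of venice_ai.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (assumption).
query = [0.9, 0.1, 0.0]
documents = [
    [0.0, 0.2, 0.9],  # points in a mostly unrelated direction
    [0.8, 0.2, 0.1],  # close to the query
    [0.5, 0.5, 0.5],  # somewhere in between
]
ranking = rank_by_similarity(query, documents)
print(ranking)  # [1, 2, 0]
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common default when it is not known whether a model returns normalized embeddings.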
Example:
.. code-block:: python

    import asyncio

    import numpy as np
    from venice_ai import VeniceClient

    async def analyze_text_similarity():
        async with VeniceClient() as client:
            # Resolve an embedding model ID at runtime (do not hardcode)
            model = await client.models.resolve_embedding()

            # Generate embeddings for comparison
            response = await client.embeddings.create(
                model=model,
                input=[
                    "The weather is beautiful today",
                    "It's a lovely sunny day",
                    "I need to buy groceries",
                ],
            )

            # Extract embeddings for similarity analysis
            embeddings = [np.asarray(item.embedding) for item in response.data]

            # The first two texts should be more similar to each other
            # than either is to the third text
            a, b = embeddings[0], embeddings[1]
            similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
            print(f"Cosine similarity between weather texts: {similarity:.3f}")

    asyncio.run(analyze_text_similarity())
Notes:
All operations in this module are asynchronous and require proper async/await
handling. The Embeddings class is accessed through the :attr:`VeniceClient.embeddings`
property and provides optimized batch processing for multiple text inputs.
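Because every call is a coroutine, large inputs can be split into batches and sent with bounded concurrency. The sketch below shows one way to do that. It is written against a generic embed_batch coroutine rather than the real client, so the names chunked, embed_in_batches, and fake_embed_batch are all hypothetical, and the batch_size and max_concurrency defaults are illustrative values, not documented API limits.

```python
import asyncio


def chunked(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


async def embed_in_batches(embed_batch, texts, batch_size=64, max_concurrency=4):
    """Embed texts batch by batch, running a bounded number of calls at once.

    embed_batch is any coroutine function mapping a list of strings to a
    list of vectors. Results come back in the same order as texts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch):
        async with semaphore:
            return await embed_batch(batch)

    # gather preserves the order of its arguments, so batches stay aligned.
    results = await asyncio.gather(*(run(b) for b in chunked(texts, batch_size)))
    return [vector for batch in results for vector in batch]


# Stand-in for a real API call (assumption): "embeds" a text as [its length].
async def fake_embed_batch(batch):
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]


texts = ["a", "bb", "ccc", "dddd", "eeeee"]
vectors = asyncio.run(embed_in_batches(fake_embed_batch, texts, batch_size=2))
print(vectors)  # [[1.0], [2.0], [3.0], [4.0], [5.0]]
```

With the real client, embed_batch would presumably wrap client.embeddings.create and return [item.embedding for item in response.data].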
Embeddings Objects
class Embeddings(APIResource["VeniceClient"])
Provides access to text embedding generation operations (asynchronous).
This class manages asynchronous embedding operations through the Venice AI API. Embeddings are vector representations of text that capture semantic meaning and can be used for various natural language processing tasks such as semantic search, clustering, classification, and similarity analysis.
Arguments:
- client (venice_ai._client.VeniceClient): The Venice AI client instance used to make API requests.
Embeddings.create
async def create(*,
model: str,
input: str | list[str] | list[int] | list[list[int]],
dimensions: int | None = None,
encoding_format: Literal["float", "base64"] | None = None,
user: str | None = None) -> EmbeddingsResponse
Generates embeddings for input text(s) asynchronously.
This method sends an asynchronous request to the Venice AI API to generate vector embeddings for the provided text or token inputs using the specified model. The embeddings can be used for semantic search, clustering, classification, and other NLP tasks.
Arguments:
- model (str): The ID of the embedding model to use. Resolve a valid ID at runtime via client.models.resolve_embedding() rather than hardcoding one, since available models change over time.
- input (Union[str, List[str], List[int], List[List[int]]]): The input to generate embeddings for. Can be a single string, a list of strings, a list of token integers, or a list of token lists. A list of strings or of token lists is processed as one batch in a single API call.
- dimensions (Optional[int]): The number of dimensions for the output embeddings. If not specified, the model's default dimensionality is used. Some models support reducing dimensions for efficiency.
- encoding_format (Optional[Literal["float", "base64"]]): The format of the returned embeddings. Defaults to 'float', which returns arrays of numbers. Use 'base64' for a base64-encoded string representation.
- user (Optional[str]): A unique identifier representing your end-user. This parameter is supported for compatibility with OpenAI clients but is discarded by the Venice API and does not affect the response.
Raises:
- venice_ai.exceptions.InvalidRequestError: If parameter values are invalid (e.g., empty model or input, unsupported encoding format).
- venice_ai.exceptions.AuthenticationError: If the API key is invalid or missing.
- venice_ai.exceptions.PermissionDeniedError: If access to the specified model is denied.
- venice_ai.exceptions.NotFoundError: If the specified model is not found.
- venice_ai.exceptions.RateLimitError: If rate limits are exceeded.
- venice_ai.exceptions.APIError: For other API-related errors.
Examples:
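Of the exceptions listed under Raises, a rate-limit error is usually worth retrying, while the others point to a problem a retry will not fix. The following is a rough sketch of retrying with exponential backoff. It is deliberately generic: with_retries, FakeRateLimitError, and flaky_request are hypothetical names used so the example runs without network access, and the attempt count and delays are illustrative assumptions.

```python
import asyncio
import random


async def with_retries(call, retry_on, max_attempts=4, base_delay=0.5):
    """Await call() and retry with exponential backoff on selected exceptions.

    call: zero-argument coroutine function that performs the request.
    retry_on: exception class (or tuple of classes) that triggers a retry.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the delay each time, plus jitter to avoid lockstep retries.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))


# Stand-in exception and request (assumptions) so the sketch is runnable.
class FakeRateLimitError(Exception):
    pass


attempts = {"count": 0}


async def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("slow down")
    return "ok"


result = asyncio.run(
    with_retries(flaky_request, FakeRateLimitError, base_delay=0.01)
)
print(result, attempts["count"])  # ok 3
```

In real use, retry_on would be venice_ai.exceptions.RateLimitError and call would wrap client.embeddings.create, for example as a lambda or a small nested coroutine.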
Generate an embedding for a single string:
.. code-block:: python

    import asyncio
    from venice_ai import VeniceClient

    async def create_embedding():
        async with VeniceClient(api_key="your-api-key") as client:
            model = await client.models.resolve_embedding()
            response = await client.embeddings.create(
                model=model,
                input="The quick brown fox jumps over the lazy dog."
            )
            embedding = response.data[0].embedding
            print(f"Embedding dimensions: {len(embedding)}")
            print(f"First 5 dimensions: {embedding[:5]}")

    asyncio.run(create_embedding())
Generate embeddings for multiple strings (batch processing):
.. code-block:: python

    import asyncio
    from venice_ai import VeniceClient

    async def create_batch_embeddings():
        inputs = [
            "First sentence for embedding.",
            "Second sentence for embedding.",
            "Third sentence for embedding."
        ]
        async with VeniceClient(api_key="your-api-key") as client:
            model = await client.models.resolve_embedding()
            batch_response = await client.embeddings.create(
                model=model,
                input=inputs
            )
            # One embedding is returned per input
            for i, data_item in enumerate(batch_response.data):
                print(f"Embedding for '{inputs[i]}' (first 3 dims): {data_item.embedding[:3]}")
            print(f"Total tokens used: {batch_response.usage.total_tokens}")

    asyncio.run(create_batch_embeddings())
Using optional parameters:
.. code-block:: python

    import asyncio
    from venice_ai import VeniceClient

    async def create_custom_embedding():
        async with VeniceClient(api_key="your-api-key") as client:
            model = await client.models.resolve_embedding()
            response = await client.embeddings.create(
                model=model,
                input="Sample text for embedding",
                dimensions=512,            # Reduce dimensions if the model supports it
                encoding_format="base64",  # Get base64-encoded embeddings
                user="user-123"            # Accepted for OpenAI compatibility; discarded by Venice
            )

    asyncio.run(create_custom_embedding())
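When encoding_format is 'base64', each embedding is a string that has to be decoded before it can be used numerically. This document does not specify the byte layout, so the sketch below rests on an assumption: that the payload follows the OpenAI-compatible convention of packed little-endian 32-bit floats. The helper name decode_base64_embedding is hypothetical, and the result should be checked against a 'float' response for the same input before it is relied on.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats.

    ASSUMPTION: the payload is packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip on locally packed data; no API call is involved.
original = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *original)).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)  # [0.5, -1.25, 2.0]
```

The test values are exactly representable in 32 bits, so the round trip is lossless here. Real embeddings decoded this way carry float32 precision.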
Returns:
EmbeddingsResponse: A response object containing the generated embeddings and usage data. Its data attribute is a list of embedding objects, each exposing its vector as embedding, and usage.total_tokens reports the number of tokens consumed by the request.