Creates a chat completion on a specified text generation model.
model | string | required | The model that will be inferred for chat completion.
messages | Array of objects (Chat Completion Message) | required | The message context to use for the chat completion request, separated by system, user, and assistant roles.
max_tokens | integer | Default: 512 | The maximum number of tokens to generate for the chat completion.
seed | integer | Default: -1 | Changing the seed produces a different response to the same messages.
temperature | number <float> [ 0 .. 2 ] | Default: 0.8 | Controls the randomness of the model's output. Values closer to 1, such as 0.8, produce more unpredictable and creative results; values nearing 0, such as 0.2, produce more predictable and conservative results. Setting temperature to 0 makes sampling deterministic, which is useful for testing.
top_k | number <float> [ 0 .. 100 ] | Default: 40 | The number of candidate tokens considered at each step. A higher value produces more diverse output; a lower value produces more repetitive output.
top_p | number <float> [ 0 .. 1 ] | Default: 0.9 | Nucleus sampling: the cumulative probability mass of candidate tokens considered. A higher value produces more diverse output; a lower value produces more repetitive output.
stream | boolean | | Indicates whether the response should be streamed.
{
  "model": "string",
  "messages": [
    {
      "role": "string",
      "content": "string"
    }
  ],
  "max_tokens": 512,
  "seed": -1,
  "temperature": 0.8,
  "top_k": 40,
  "top_p": 0.9,
  "stream": true
}
{
  "id": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "string",
        "content": "string"
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
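As a rough sketch, a request body matching the parameters above can be assembled like this. The model name and message content are placeholders, and this only builds the JSON payload; it does not make the HTTP call.

```python
import json

def build_chat_request(model, messages, max_tokens=512, seed=-1,
                       temperature=0.8, top_k=40, top_p=0.9, stream=False):
    """Assemble a chat-completion request body using the documented defaults."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "seed": seed,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
        "stream": stream,
    }

# "example-model" is a placeholder; use a model name from the models endpoint.
body = build_chat_request("example-model", [{"role": "user", "content": "Hello"}])
payload = json.dumps(body)
```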
Creates a chat completion on a specified text generation model with retrieval-augmented generation, using relevant search results from items or files in a vector store collection as context.
collection | string | required | The vector store collection to search for relevant context.
model | string | required | The model that will be inferred for chat completion.
messages | Array of objects (Chat Completion Message) | required | The message context to use for the chat completion request, separated by system, user, and assistant roles.
max_tokens | integer | Default: 512 | The maximum number of tokens to generate for the chat completion. Does not include token usage for the vector search embeddings operation.
seed | integer | Default: -1 | Changing the seed produces a different response to the same messages.
temperature | number <float> [ 0 .. 2 ] | Default: 0.8 | Controls the randomness of the model's output. Values closer to 1, such as 0.8, produce more unpredictable and creative results; values nearing 0, such as 0.2, produce more predictable and conservative results. Setting temperature to 0 makes sampling deterministic, which is useful for testing.
top_k | number <float> [ 0 .. 100 ] | Default: 40 | The number of candidate tokens considered at each step. A higher value produces more diverse output; a lower value produces more repetitive output.
top_p | number <float> [ 0 .. 1 ] | Default: 0.9 | Nucleus sampling: the cumulative probability mass of candidate tokens considered. A higher value produces more diverse output; a lower value produces more repetitive output.
stream | boolean | | Indicates whether the response should be streamed.
{
  "collection": "string",
  "model": "string",
  "messages": [
    {
      "role": "string",
      "content": "string"
    }
  ],
  "max_tokens": 512,
  "seed": -1,
  "temperature": 0.8,
  "top_k": 40,
  "top_p": 0.9,
  "stream": true
}
{
  "id": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "string",
        "content": "string"
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
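The RAG request is the plain chat request plus a `collection` field. A minimal sketch of the payload (collection and model names are placeholders; optional keys pass through unchanged):

```python
def build_rag_chat_request(collection, model, messages, **options):
    """Chat-completion body that adds a vector store collection for RAG context.
    Optional keys (max_tokens, temperature, stream, ...) are passed through."""
    body = {"collection": collection, "model": model, "messages": messages}
    body.update(options)
    return body

body = build_rag_chat_request(
    "example-collection", "example-model",
    [{"role": "user", "content": "Summarize the uploaded docs."}],
    max_tokens=256, stream=False,
)
```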
Retrieves a chat completion job by its ID.
id | string | required | The ID of the chat completion job.
{
  "id": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "string",
        "content": "string"
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
Rates a chat completion job by its ID.
id | string | required | The ID of the chat completion job.
vote | string | required | Your positive or negative rating for this chat completion.
{
  "vote": "string"
}
Generates speech audio from input text.
model | string | required | The model that will be used to generate text-to-speech audio.
input | string | required | The text to generate audio for, up to a maximum of 2,000 characters.
voice | string | required | The voice that will be used in the generated audio.
{
  "model": "string",
  "input": "string",
  "voice": "string"
}
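Since the input is capped at 2,000 characters, it is worth rejecting oversized text client-side before sending. A sketch (model and voice names are placeholders, not real values from this API):

```python
MAX_TTS_CHARS = 2000  # documented input limit

def build_speech_request(model, text, voice):
    """Build a text-to-speech request body, rejecting input over the
    documented 2,000-character limit before it reaches the API."""
    if len(text) > MAX_TTS_CHARS:
        raise ValueError(f"input is {len(text)} characters; the limit is {MAX_TTS_CHARS}")
    return {"model": model, "input": text, "voice": voice}

req = build_speech_request("example-tts-model", "Hello, world.", "example-voice")
```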
Creates an embedding vector representing the given input text.
input | string | required | The input text to embed.
model | string | required | The model that will be used to generate embeddings.
encoding_format | string | Default: "float", Enum: "float" "base64" | The format to return embeddings in.
{
  "input": "string",
  "model": "string",
  "encoding_format": "float"
}
{
  "object": "string",
  "data": [
    {
      "object": "string",
      "embedding": [
        0.1
      ],
      "index": 0
    }
  ],
  "model": "string",
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
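When `encoding_format` is "base64", the embedding arrives as a single encoded string rather than a float array. This decoder assumes the string packs little-endian float32 values, which is the usual convention in OpenAI-compatible APIs; verify against a real response before relying on it.

```python
import base64
import struct

def decode_embedding(b64: str) -> list:
    """Decode a base64 embedding string into floats, assuming packed
    little-endian float32 values (an assumption; confirm with the API)."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip a small vector to show the byte layout this decoder expects.
vec = [0.1, -0.5, 2.0]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode("ascii")
decoded = decode_embedding(encoded)
```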
Creates a vector store collection for searchable embeddings.
name | string | required | The name of the vector store collection. This is also used to auto-generate a unique ID for the record.
{
  "name": "string"
}
{
  "collection": {
    "id": "string",
    "name": "string"
  }
}
Updates a vector store collection record.
id | string | required | The ID of the vector store collection.
name | string | required | The name of the vector store collection. Note: the previously generated unique ID will remain the same.
{
  "name": "string"
}
{
  "collection": {
    "id": "string",
    "name": "string"
  }
}
Searches items in a vector store collection for the closest embedding matches.
input | string | required | The text query to search against the embedded items in the vector store collection.
{
  "input": "string"
}
{
  "results": [
    {
      "id": "string",
      "created": "string",
      "content": "string"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
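A typical way to consume this response shape is to stitch the matches into one context string, for example before building a follow-up chat prompt. A sketch over a hand-written sample response:

```python
def build_context(search_response: dict) -> str:
    """Join the content of each search match into a single context string."""
    return "\n\n".join(r["content"] for r in search_response["results"])

# Hand-written sample in the documented response shape.
sample = {
    "results": [
        {"id": "a", "created": "2024-01-01", "content": "First match."},
        {"id": "b", "created": "2024-01-02", "content": "Second match."},
    ],
    "usage": {"prompt_tokens": 4, "total_tokens": 4},
}
context = build_context(sample)
```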
Retrieves a list of items within a vector store collection.
id | string | required | The ID of the vector store collection.
{
  "items": [
    {
      "id": "string",
      "created": "string",
      "description": "string"
    }
  ]
}
Adds an item to a vector store collection.
id | string | required | The ID of the vector store collection.
content | string | required | The text to be converted into embeddings and stored in the vector store collection.
description | string | | A description of the contents in this collection item record. If omitted, this value defaults to a shortened version of the text stored in the collection.
{
  "content": "string",
  "description": "string"
}
{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  },
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
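The API shortens the stored text itself when `description` is omitted, and its exact truncation rule is not documented here. If you prefer a predictable description on the client side, a purely illustrative helper might look like this:

```python
def default_description(content: str, limit: int = 80) -> str:
    """Illustrative only: produce a shortened description from item content.
    The API's own shortening rule (length, ellipsis) is not documented here."""
    if len(content) <= limit:
        return content
    return content[: limit - 3].rstrip() + "..."

item_body = {"content": "A very long passage of source text. " * 5}
item_body.setdefault("description", default_description(item_body["content"]))
```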
Retrieves a vector store collection item by its ID.
id | string | required | The ID of the vector store collection.
itemid | string | required | The ID of the vector store collection item.
{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  }
}
Updates a vector store collection item record.
id | string | required | The ID of the vector store collection.
itemid | string | required | The ID of the vector store collection item.
description | string | | A description of the contents in this collection item record.
{
  "description": "string"
}
{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  }
}
Retrieves a list of files within a vector store collection.
id | string | required | The ID of the vector store collection.
{
  "files": [
    {
      "id": "string",
      "filename": "string",
      "status": "enqueued",
      "items": 0,
      "tokens": 0
    }
  ]
}
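Uploaded files are processed asynchronously, judging by the `status` field, so a client typically lists files and waits for processing to finish. Only "enqueued" appears in the samples; the name of any finished status (e.g. "complete" below) is an assumption.

```python
def pending_files(files: list) -> list:
    """Return IDs of files still queued for processing. 'enqueued' is the
    only status shown in the docs; other status names are assumptions."""
    return [f["id"] for f in files if f["status"] == "enqueued"]

# Hand-written sample in the documented response shape.
listing = {
    "files": [
        {"id": "f1", "filename": "a.pdf", "status": "enqueued", "items": 0, "tokens": 0},
        {"id": "f2", "filename": "b.pdf", "status": "complete", "items": 12, "tokens": 3400},
    ]
}
still_processing = pending_files(listing["files"])
```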
Adds a file to a vector store collection.
id | string | required | The ID of the vector store collection.
file | string <binary> | | The file object to be uploaded to the vector store collection.
{
  "file": {
    "id": "string",
    "filename": "string",
    "status": "enqueued",
    "items": 0,
    "tokens": 0
  }
}
Retrieves a vector store collection file by its ID.
id | string | required | The ID of the vector store collection.
fileid | string | required | The ID of the vector store collection file.
{
  "file": {
    "id": "string",
    "filename": "string",
    "status": "enqueued",
    "items": 0,
    "tokens": 0
  }
}
Retrieves a list of models for chat completion inference.
{
  "models": [
    {
      "type": "string",
      "model": "string",
      "display_name": "string",
      "preferred": true,
      "streaming": true,
      "vram": 0,
      "ctx_length": 0
    }
  ],
  "private_models": [
    {
      "type": "string",
      "model": "string",
      "display_name": "string"
    }
  ]
}
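A client can use the `preferred` and `streaming` flags to choose a model automatically. A sketch over a hand-written catalog (model names are placeholders):

```python
def pick_model(catalog: dict, need_streaming: bool = True):
    """Return the first preferred model, optionally requiring streaming support."""
    for m in catalog["models"]:
        if m.get("preferred") and (not need_streaming or m.get("streaming")):
            return m["model"]
    return None

catalog = {
    "models": [
        {"type": "chat", "model": "m-small", "display_name": "Small",
         "preferred": False, "streaming": True, "vram": 8, "ctx_length": 8192},
        {"type": "chat", "model": "m-large", "display_name": "Large",
         "preferred": True, "streaming": True, "vram": 48, "ctx_length": 32768},
    ],
    "private_models": [],
}
choice = pick_model(catalog)
```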
Looks up requests sent to the Vultr Inference API along with their response details.
period | integer | required, Enum: 15 30 45 60 | The number of minutes to search back from the chosen timestamp, up to the previous hour.
timestamp | string | | The UTC timestamp to search request logs from, in ISO 8601 format (e.g. 2024-06-01T12:00:00Z).
endpoint | string | | The name of the endpoint to narrow your request search.
{
  "period": 15,
  "timestamp": "string",
  "endpoint": "string"
}
{
  "requests": [
    {
      "timestamp": "string",
      "method": "string",
      "endpoint": "string",
      "request_headers": "string",
      "request_body": "string",
      "response_body": "string",
      "response_code": 0
    }
  ]
}
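Since `period` is restricted to the enum 15, 30, 45, or 60 minutes, a query builder can validate it before the request is sent. A sketch (the timestamp value is just an illustrative ISO 8601 string):

```python
VALID_PERIODS = (15, 30, 45, 60)  # documented enum, in minutes

def build_log_query(period, timestamp=None, endpoint=None):
    """Build a request-log query body, enforcing the documented period enum."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}")
    body = {"period": period}
    if timestamp:
        body["timestamp"] = timestamp
    if endpoint:
        body["endpoint"] = endpoint
    return body

q = build_log_query(30, timestamp="2024-06-01T12:00:00Z")
```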