Download OpenAPI specification
Create Chat Completion on a specified text generation model.

model (string, required): The model that will be inferred for chat completion.
messages (Array of objects (Chat Completion Request Message), required): The message context to use for the chat completion request, separated by system, user, and assistant roles.
stream (boolean): Indicates whether the response should be streamed.
max_tokens (integer, default 512): The maximum number of tokens to generate for the chat completion.
n (integer, default 1): The number of chat completion choices to generate for each input message.
seed (integer): Changing the seed for the same message produces a different response. A null value generates a random seed.
temperature (float, 0 to 2, default 1): Controls the randomness of the model's output. Values closer to 1, such as 0.8, produce more unpredictable and creative results, while values nearing 0, like 0.2, produce more predictable and less creative results. Setting temperature to 0 makes the output effectively deterministic, which is useful for testing.
top_p (float, 0 to 1, default 1): Controls nucleus sampling: the model considers only the most likely tokens whose cumulative probability reaches top_p. A higher value produces more diverse outputs, while a lower value produces more repetitive outputs.
frequency_penalty (float, -2 to 2, default 0): Controls how strongly the model penalizes tokens for how often they have already appeared, discouraging repetition.
presence_penalty (float, -2 to 2, default 0): Controls how strongly the model penalizes tokens that have already appeared at all, encouraging the model to introduce new words and topics.
stop (Array of strings): The model stops generating text if it encounters any of these strings.
logprobs (boolean): Whether to return log probabilities of the output tokens. If true, the log probability of each output token is returned in the content of message.
top_logprobs (integer, 0 to 20): The number of most likely tokens to return at each token position, each with an associated log probability.
Request sample:

{
  "model": "string",
  "messages": [
    {
      "role": "system",
      "content": "string"
    }
  ],
  "stream": true,
  "max_tokens": 512,
  "n": 1,
  "seed": 0,
  "temperature": 1,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "stop": [
    "string"
  ],
  "logprobs": true,
  "top_logprobs": 20
}
Response sample:

{
  "id": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "system",
        "content": "string",
        "tool_calls": [
          {
            "id": "string",
            "type": "string",
            "function": {
              "name": "string",
              "arguments": "string"
            }
          }
        ]
      },
      "logprobs": {
        "content": [
          {
            "token": "string",
            "logprob": 0.1,
            "bytes": [0],
            "top_logprobs": [
              {
                "token": "string",
                "logprob": 0.1,
                "bytes": [0]
              }
            ]
          }
        ]
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
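As a rough illustration, the request body above can be assembled and sent with Python's standard library. The base URL and endpoint path below are assumptions, not taken from this reference, and the model name is hypothetical; check the OpenAPI specification for the exact values.

```python
import json
import urllib.request

API_BASE = "https://api.vultrinference.com/v1"  # assumed base URL

def build_chat_request(model, messages, **options):
    """Assemble a chat completion request body from the documented fields."""
    payload = {"model": model, "messages": messages}
    # Send only the optional fields the caller set, so the server-side
    # defaults (max_tokens=512, n=1, temperature=1, ...) still apply.
    allowed = {"stream", "max_tokens", "n", "seed", "temperature", "top_p",
               "frequency_penalty", "presence_penalty", "stop",
               "logprobs", "top_logprobs"}
    payload.update({k: v for k, v in options.items() if k in allowed})
    return payload

def create_chat_completion(api_key, payload):
    """POST the payload to the (assumed) chat completions endpoint."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "example-chat-model",  # hypothetical model name
    [{"role": "user", "content": "Hello"}],
    max_tokens=256,
    temperature=0.2,
)
```

Omitting an optional field keeps the server default, which is usually what you want for n and seed.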
Create Chat Completion on a specified text generation model with retrieval-augmented generation, utilizing the context of relevant search results from items or files in a vector store collection.

collection (string, required): The vector store collection to search for relevant context.
model (string, required): The model that will be inferred for chat completion.
messages (Array of objects (Chat Completion Request Message), required): The message context to use for the chat completion request, separated by system, user, and assistant roles.
max_tokens (integer, default 512): The maximum number of tokens to generate for the chat completion.
n (integer, default 1): The number of chat completion choices to generate for each input message.
seed (integer): Changing the seed for the same message produces a different response. A null value generates a random seed.
temperature (float, 0 to 2, default 1): Controls the randomness of the model's output. Values closer to 1, such as 0.8, produce more unpredictable and creative results, while values nearing 0, like 0.2, produce more predictable and less creative results. Setting temperature to 0 makes the output effectively deterministic, which is useful for testing.
top_p (float, 0 to 1, default 1): Controls nucleus sampling: the model considers only the most likely tokens whose cumulative probability reaches top_p. A higher value produces more diverse outputs, while a lower value produces more repetitive outputs.
stop (Array of strings): The model stops generating text if it encounters any of these strings.
frequency_penalty (float, -2 to 2, default 0): Controls how strongly the model penalizes tokens for how often they have already appeared, discouraging repetition.
presence_penalty (float, -2 to 2, default 0): Controls how strongly the model penalizes tokens that have already appeared at all, encouraging the model to introduce new words and topics.
stream (boolean): Indicates whether the response should be streamed.
logprobs (boolean): Whether to return log probabilities of the output tokens. If true, the log probability of each output token is returned in the content of message.
top_logprobs (integer, 0 to 20): The number of most likely tokens to return at each token position, each with an associated log probability.
Request sample:

{
  "collection": "string",
  "model": "string",
  "messages": [
    {
      "role": "system",
      "content": "string"
    }
  ],
  "max_tokens": 512,
  "n": 1,
  "seed": 0,
  "temperature": 1,
  "top_p": 1,
  "stop": [
    "string"
  ],
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "stream": true,
  "logprobs": true,
  "top_logprobs": 20
}
Response sample:

{
  "id": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "system",
        "content": "string",
        "tool_calls": [
          {
            "id": "string",
            "type": "string",
            "function": {
              "name": "string",
              "arguments": "string"
            }
          }
        ]
      },
      "logprobs": {
        "content": [
          {
            "token": "string",
            "logprob": 0.1,
            "bytes": [0],
            "top_logprobs": [
              {
                "token": "string",
                "logprob": 0.1,
                "bytes": [0]
              }
            ]
          }
        ]
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
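The RAG variant takes the same body as a plain chat completion plus the collection field. A minimal sketch; the collection and model names here are hypothetical placeholders:

```python
rag_payload = {
    "collection": "product-docs",      # hypothetical collection name
    "model": "example-chat-model",     # hypothetical model name
    "messages": [
        {"role": "user", "content": "What does the refund policy say?"}
    ],
    "max_tokens": 512,
}
# POST this with the same headers as a plain chat completion; the server
# searches the named collection and folds the closest matches into the
# model's context before generating a reply.
```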
Generates speech audio from input text.
model (string, required): The model that will be used to generate text-to-speech audio.
input (string, required): The text to generate audio for, up to a maximum of 2,000 characters.
voice (string, required): The voice that will be used in the generated audio.
Request sample:

{
  "model": "string",
  "input": "string",
  "voice": "string"
}
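A sketch of building this request body, enforcing the documented 2,000-character limit client-side; the model and voice names are placeholders, since this reference does not enumerate the available values:

```python
def build_tts_request(model, text, voice):
    """Assemble a text-to-speech request, enforcing the documented limit."""
    if len(text) > 2000:
        raise ValueError("input is limited to 2,000 characters")
    return {"model": model, "input": text, "voice": voice}

tts_payload = build_tts_request(
    "example-tts-model",               # hypothetical model name
    "Hello from the inference API.",
    "example-voice",                   # hypothetical voice name
)
```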
Creates a vector store collection for searchable embeddings.
name (string, required): The name of the vector store collection. This is also used to auto-generate a unique ID for the record.
Request sample:

{
  "name": "string"
}

Response sample:

{
  "collection": {
    "id": "string",
    "name": "string"
  }
}
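The auto-generated id in the response is what later item and file calls key on, so client code typically captures it right away. A small sketch (the sample response values are illustrative, not from a real call):

```python
def collection_id(response):
    """Extract the auto-generated collection ID from a create response."""
    return response["collection"]["id"]

create_payload = {"name": "product-docs"}  # hypothetical collection name
sample_response = {
    "collection": {
        "id": "product-docs-x7k2",  # illustrative auto-generated ID
        "name": "product-docs",
    }
}
```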
Updates a vector store collection record.
id (string, required): The ID of the vector store collection.
name (string, required): The name of the vector store collection. Note: the previously generated unique ID will remain the same.
Request sample:

{
  "name": "string"
}

Response sample:

{
  "collection": {
    "id": "string",
    "name": "string"
  }
}
Searches items in a vector store collection for the closest embeddings matches.
input (string, required): The text query to search against the embeddings items in the vector store collection.
Request sample:

{
  "input": "string"
}

Response sample:

{
  "results": [
    {
      "id": "string",
      "created": "string",
      "content": "string"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
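Since the response is a ranked list of matches, client code usually just peels off the closest few. A sketch against an illustrative response (the sample values are not from a real call):

```python
def top_matches(response, limit=3):
    """Return the contents of the closest-matching items, best first."""
    return [item["content"] for item in response.get("results", [])[:limit]]

search_payload = {"input": "refund policy"}  # free-text query
sample_response = {
    "results": [
        {"id": "a1", "created": "2024-01-01T00:00:00Z",
         "content": "Refunds are available within 30 days."},
        {"id": "b2", "created": "2024-01-02T00:00:00Z",
         "content": "Shipping takes five business days."},
    ],
    # usage reports the tokens spent embedding the query text
    "usage": {"prompt_tokens": 7, "total_tokens": 7},
}
```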
Retrieve a list of items within a vector store collection.

id (string, required): The ID of the vector store collection.
Response sample:

{
  "items": [
    {
      "id": "string",
      "created": "string",
      "description": "string"
    }
  ]
}
Adds an item to a vector store collection.
id (string, required): The ID of the vector store collection.
content (string, required): The text to be converted into embeddings and stored in the vector store collection.
description (string): A description of the contents in this collection item record. If omitted, this value will default to a shortened version of the text stored in the collection.
Request sample:

{
  "content": "string",
  "description": "string"
}

Response sample:

{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  },
  "usage": {
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
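Because description defaults server-side to a shortened copy of the content, a client only needs to send it when overriding that behavior. A minimal builder sketch:

```python
def build_item(content, description=None):
    """Assemble an add-item body; omit description to let the API derive one."""
    payload = {"content": content}
    if description is not None:
        payload["description"] = description
    return payload

item_payload = build_item("Refunds are available within 30 days of purchase.")
```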
Retrieve a vector store collection item by ID.

id (string, required): The ID of the vector store collection.
itemid (string, required): The ID of the vector store collection item.
Response sample:

{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  }
}
Updates a vector store collection item record.
id (string, required): The ID of the vector store collection.
itemid (string, required): The ID of the vector store collection item.
description (string): A description of the contents in this collection item record.
Request sample:

{
  "description": "string"
}

Response sample:

{
  "item": {
    "id": "string",
    "created": "string",
    "description": "string",
    "content": "string"
  }
}
Retrieve a list of files within a vector store collection.

id (string, required): The ID of the vector store collection.
Response sample:

{
  "files": [
    {
      "id": "string",
      "filename": "string",
      "status": "enqueued",
      "items": 0,
      "tokens": 0
    }
  ]
}
Adds a file to a vector store collection.
id (string, required): The ID of the vector store collection.
file (string <binary>): The file object to be uploaded to the vector store collection.
Response sample:

{
  "file": {
    "id": "string",
    "filename": "string",
    "status": "enqueued",
    "items": 0,
    "tokens": 0
  }
}
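Because the file parameter is binary, this endpoint presumably expects a multipart/form-data upload rather than a JSON body. A hand-rolled multipart sketch using only the standard library; the field name "file" follows the parameter table above, and the boundary string is arbitrary:

```python
def build_multipart(field, filename, data, boundary="x-upload-boundary-7f2a"):
    """Assemble a minimal multipart/form-data body for a single file."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    content_type = f"multipart/form-data; boundary={boundary}"
    return head + data + tail, content_type

body, content_type = build_multipart("file", "notes.txt", b"some document text")
# Send `body` as the POST payload with the Content-Type header set to
# `content_type`; the response reports the file's ingestion status
# ("enqueued") plus item and token counts once processed.
```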
Retrieve a vector store collection file by ID.

id (string, required): The ID of the vector store collection.
fileid (string, required): The ID of the vector store collection file.
Response sample:

{
  "file": {
    "id": "string",
    "filename": "string",
    "status": "enqueued",
    "items": 0,
    "tokens": 0
  }
}
Look up requests sent to the Vultr Inference API along with their response details.
period (integer, required, one of 15, 30, 45, 60): The number of minutes to search back from the chosen timestamp, up to the previous hour.
timestamp (string): The UTC timestamp to search request logs from, in ISO 8601 format (e.g. 2024-05-01T12:00:00Z).
endpoint (string): The name of the endpoint to narrow your request search.
Request sample:

{
  "period": 15,
  "timestamp": "string",
  "endpoint": "string"
}

Response sample:

{
  "requests": [
    {
      "timestamp": "string",
      "method": "string",
      "endpoint": "string",
      "request_headers": "string",
      "request_body": "string",
      "response_body": "string",
      "response_code": 0
    }
  ]
}
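Since period counts back from timestamp, a client can derive both from the current time. A sketch using the standard library; the enum check mirrors the parameter table above:

```python
from datetime import datetime, timezone

def build_log_query(minutes_back=15, endpoint=None):
    """Assemble a request-log query covering the last `minutes_back` minutes."""
    if minutes_back not in (15, 30, 45, 60):
        raise ValueError("period must be one of 15, 30, 45, 60")
    query = {
        "period": minutes_back,
        # ISO 8601 UTC timestamp, as the timestamp field expects
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if endpoint:
        query["endpoint"] = endpoint  # optional endpoint-name filter
    return query

log_query = build_log_query(30, endpoint="chat/completions")
```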