Endpoints
Chat Completions
Create chat completions using any supported model.
POST
Chat Completions
Create Chat Completion
Generate a response from an AI model given a conversation history.Request
Headers
| Header | Required | Description |
|---|---|---|
Authorization | Yes | Bearer your_api_key |
Content-Type | Yes | application/json |
Body Parameters
The model ID to use (e.g.,
claude-sonnet-4-5, gpt-4o, gemini-2.5-pro). See Supported Models for the full list.An array of message objects representing the conversation history.Each message object has:
role(string): One ofsystem,user, orassistantcontent(string): The message content
If
true, responses are streamed back as Server-Sent Events (SSE).Sampling temperature between 0 and 2. Lower values make output more focused and deterministic.
Maximum number of tokens to generate in the response.
Nucleus sampling parameter. An alternative to temperature.
Up to 4 sequences where the API will stop generating further tokens.
Example Request
Response
Response Fields
| Field | Type | Description |
|---|---|---|
id | string | Unique identifier for the completion |
object | string | Always chat.completion |
created | integer | Unix timestamp of creation |
model | string | The model used |
choices | array | Array of completion choices |
choices[].message | object | The generated message |
choices[].finish_reason | string | stop, length, or content_filter |
usage | object | Token usage statistics |
Streaming
Set"stream": true to receive Server-Sent Events (SSE) as the model generates tokens.
Streaming Request
Streaming Response
Each event is a JSON object prefixed withdata: :
Streaming with Python
Streaming with Node.js
Chat Completions