
Create Chat Completion

Generate a response from an AI model given a conversation history.

Request

POST https://api.modelstack.cc/v1/chat/completions

Headers

Header          Required  Description
Authorization   Yes       Bearer your_api_key
Content-Type    Yes       application/json

Body Parameters

model
string
required
The model ID to use (e.g., claude-sonnet-4-5, gpt-4o, gemini-2.5-pro). See Supported Models for the full list.
messages
array
required
An array of message objects representing the conversation history. Each message object has:
  • role (string): One of system, user, or assistant
  • content (string): The message content
stream
boolean
default: false
If true, responses are streamed back as Server-Sent Events (SSE).
temperature
number
default: 1.0
Sampling temperature between 0 and 2. Lower values make output more focused and deterministic.
max_tokens
integer
Maximum number of tokens to generate in the response.
top_p
number
default: 1.0
Nucleus sampling: the model considers only the tokens comprising the top top_p probability mass. An alternative to temperature; adjust one or the other, not both.
stop
string | array
Up to 4 sequences where the API will stop generating further tokens.

Example Request

curl https://api.modelstack.cc/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'
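For reference, the same request can be assembled in Python. This sketch only builds the headers and payload to match the curl example above; actually sending it requires an HTTP client (e.g. requests.post(API_URL, headers=headers, json=payload)), and your_api_key is a placeholder:

```python
import json

API_URL = "https://api.modelstack.cc/v1/chat/completions"

def build_request(api_key: str) -> tuple[dict, dict]:
    """Build the headers and JSON payload matching the curl example above."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "claude-sonnet-4-5",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing in simple terms."},
        ],
        "temperature": 0.7,
        "max_tokens": 500,
    }
    return headers, payload

headers, payload = build_request("your_api_key")
# To send: requests.post(API_URL, headers=headers, json=payload).json()
print(json.dumps(payload, indent=2))
```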

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "claude-sonnet-4-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits) instead of classical bits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}

Response Fields

Field                    Type     Description
id                       string   Unique identifier for the completion
object                   string   Always chat.completion
created                  integer  Unix timestamp of creation
model                    string   The model used
choices                  array    Array of completion choices
choices[].message        object   The generated message
choices[].finish_reason  string   stop, length, or content_filter
usage                    object   Token usage statistics
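As a quick illustration, these fields can be read from the sample response above with plain dict access; no SDK is required:

```python
import json

# The sample response body from this page, parsed as JSON
sample = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "claude-sonnet-4-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits) instead of classical bits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
""")

# Navigate the fields described in the table above
content = sample["choices"][0]["message"]["content"]
finish_reason = sample["choices"][0]["finish_reason"]
total_tokens = sample["usage"]["total_tokens"]
print(finish_reason, total_tokens)
```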

Streaming

Set "stream": true to receive Server-Sent Events (SSE) as the model generates tokens.

Streaming Request

curl https://api.modelstack.cc/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about coding."}],
    "stream": true
  }'

Streaming Response

Each event is a JSON object prefixed with data: ; the stream ends with a literal data: [DONE] line.
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Lines"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" of"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" code"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
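If you are not using an SDK, the streamed text can be reassembled from raw SSE lines with a few lines of stdlib Python. A minimal sketch, fed with the exact events shown above:

```python
import json

def collect_content(sse_lines):
    """Concatenate delta.content from 'data:' SSE lines until [DONE]."""
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separator/keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):  # role-only and empty deltas carry no text
            text.append(delta["content"])
    return "".join(text)

# The event stream shown above
events = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Lines"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" of"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" code"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
print(collect_content(events))  # -> Lines of code
```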

Streaming with Python

# Client setup (assumed: an OpenAI-compatible Python SDK such as the openai package)
from openai import OpenAI

client = OpenAI(base_url="https://api.modelstack.cc/v1", api_key="your_api_key")

stream = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Write a haiku about coding."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")

Streaming with Node.js

// Client setup (assumed: an OpenAI-compatible Node.js SDK such as the openai package)
import OpenAI from "openai";

const client = new OpenAI({ baseURL: "https://api.modelstack.cc/v1", apiKey: "your_api_key" });

const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [{ role: "user", content: "Write a haiku about coding." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}