LLM Services API
OpenAI-compatible language model endpoints.
Base URL: https://api.bluenexus.ai/api/v1
POST /chat/completions
Create a chat completion. Fully OpenAI-compatible.
Auth: Bearer token with llm-all scope
Rate limit: 30 requests/minute
Credit guard: Requires positive balance
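The 30 requests/minute limit averages out to one request every 2 seconds, so a client can stay under it by spacing its calls. The following is a minimal client-side sketch, not part of the API: the MinIntervalThrottle class is a hypothetical helper, how the server counts its rate window is not documented here, and the clock and sleep arguments are injectable only so the logic can be exercised without actually waiting.

```python
import time


class MinIntervalThrottle:
    """Keep consecutive calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=30, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute  # 2.0 s for the documented limit
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Call wait() immediately before each request. Even spacing is the simplest choice; it gives up burst capacity in exchange for never needing to track a sliding window.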
Request:
{
  "model": "bluenexus/glm-4.7-flash-tee",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 500,
  "stream": false,
  "top_p": 1,
  "stop": null,
  "tools": [],
  "tool_choice": "auto",
  "response_format": {"type": "text"},
  "seed": null
}
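As a rough illustration of how the request above maps onto an HTTP call, the sketch below builds (but does not send) the POST using only the Python standard library. The build_chat_request helper is a hypothetical name. It also assumes, in line with OpenAI's API, that only model and messages are required and that the other fields may be omitted.

```python
import json
import urllib.request

BASE_URL = "https://api.bluenexus.ai/api/v1"


def build_chat_request(token, messages, model="bluenexus/glm-4.7-flash-tee", **options):
    """Return an unsent urllib Request for POST /chat/completions."""
    body = {"model": model, "messages": messages}
    # Include only the optional fields the caller actually set
    # (assumes unset fields can be left out, as with OpenAI's API).
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # token needs the llm-all scope
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "YOUR_BLUENEXUS_TOKEN",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=500,
    stop=None,  # dropped from the body because it is None
)
# To send it: urllib.request.urlopen(req), which needs a valid token and network access.
```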
Response (200):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713000000,
  "model": "bluenexus/glm-4.7-flash-tee",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 8,
    "total_tokens": 23
  }
}
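For reference, this is one way to pull the assistant text and token usage out of a response shaped like the sample above. The first_reply helper is a hypothetical name, and the embedded JSON is the sample response trimmed to the fields the sketch reads.

```python
import json

# The sample response from above, trimmed to the fields used here.
sample = json.loads("""
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 15, "completion_tokens": 8, "total_tokens": 23}
}
""")


def first_reply(completion):
    """Return (text, finish_reason) for the first choice of a chat completion."""
    choice = completion["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


text, reason = first_reply(sample)
usage = sample["usage"]
print(text, reason, usage["total_tokens"])
```

In the sample, total_tokens is the sum of prompt_tokens and completion_tokens (15 + 8 = 23).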
Response Headers:
X-Credits-Consumed - Credits used
X-Credits-Remaining - Remaining balance
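A client that wants to track spending could read the two credit headers from each response. The sketch below assumes the header values are plain numbers, which this page does not specify; read_credit_headers is a hypothetical helper that returns None for any header that is missing or not numeric.

```python
def read_credit_headers(headers):
    """Return (consumed, remaining) as floats, with None for unusable values."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {str(k).lower(): v for k, v in headers.items()}

    def as_number(name):
        try:
            return float(lowered[name])
        except (KeyError, TypeError, ValueError):
            return None

    return as_number("x-credits-consumed"), as_number("x-credits-remaining")


# Illustrative values only; real responses may format these differently.
consumed, remaining = read_credit_headers(
    {"X-Credits-Consumed": "0.5", "x-credits-remaining": "99.5"}
)
```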
Streaming
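Set "stream": true to receive the response as server-sent events (SSE). Each event carries a chunk with a content delta, and the stream ends with data: [DONE]: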
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}
data: [DONE]
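To reassemble the full message, a client concatenates the delta content of each chunk until it sees [DONE]. The sketch below does that over an iterable of already-decoded lines; collect_stream_text is a hypothetical helper, and it assumes one choice per chunk, as in the sample events above.

```python
import json


def collect_stream_text(lines):
    """Join the delta content from SSE data lines until the [DONE] sentinel."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        piece = choices[0].get("delta", {}).get("content")
        if piece:  # some chunks may carry no text
            parts.append(piece)
    return "".join(parts)


events = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}',
    "",
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"}}]}',
    "",
    "data: [DONE]",
]
full_text = collect_stream_text(events)
```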
GET /models
List available LLM models.
Auth: Bearer token with llm-all scope
Response (200): Array of model objects with IDs and capabilities.
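The exact shape of the model objects is not shown here, so the following is only a sketch for extracting model IDs. It assumes each object has an "id" field, and it accepts either a bare array (as described above) or an OpenAI-style {"data": [...]} wrapper in case the endpoint uses that form. The model_ids helper and the second sample payload are hypothetical.

```python
def model_ids(payload):
    """Return the IDs from a /models response body that is already parsed from JSON."""
    # Accept a bare list or an OpenAI-style {"object": "list", "data": [...]} wrapper.
    items = payload.get("data", []) if isinstance(payload, dict) else payload
    return [m["id"] for m in items if isinstance(m, dict) and "id" in m]


ids_from_array = model_ids([{"id": "bluenexus/glm-4.7-flash-tee"}])
ids_from_wrapper = model_ids({"object": "list", "data": [{"id": "example-model"}]})
```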
Compatibility
Works with any OpenAI-compatible client:
Python:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_BLUENEXUS_TOKEN",
    base_url="https://api.bluenexus.ai/api/v1",
)

JavaScript:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_BLUENEXUS_TOKEN",
  baseURL: "https://api.bluenexus.ai/api/v1",
});