API Documentation

UniLLM provides an OpenAI-compatible REST API for accessing AI models. This page covers authentication, endpoints, SDK usage, rate limits, and error handling.

Base URL

https://unillm.ccwu.cc/api/v1

Authentication

All API requests require a Bearer token in the Authorization header. You can generate API keys from your API Keys dashboard.

Authorization: Bearer YOUR_API_KEY
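
As a minimal sketch of how a client might build this header, the helper below reads the key from an environment variable instead of hard-coding it. The variable name UNILLM_API_KEY and the function name auth_headers are illustrative assumptions, not something the UniLLM documentation defines.

```python
import os


def auth_headers(api_key=None):
    """Build the headers for an authenticated JSON request."""
    # UNILLM_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("UNILLM_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set UNILLM_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


print(auth_headers("YOUR_API_KEY")["Authorization"])  # prints Bearer YOUR_API_KEY
```

Keeping the key out of source code makes it less likely to end up in version control.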

Chat Completions

POST /chat/completions (relative to the Base URL)

Creates a model response for the given conversation.

Request Body

  • model string (required) — The model to use, e.g. "deepseek-chat", "qwen-max", "glm-4-plus"
  • messages array (required) — An array of message objects with role ("system", "user", "assistant") and content (string)
  • temperature number (optional, 0–2, default 1) — Controls randomness. Lower values make output more deterministic.
  • max_tokens number (optional) — Maximum number of tokens to generate in the response.
  • stream boolean (optional, default false) — Whether to stream partial responses using Server-Sent Events.
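
When stream is true, the response arrives as Server-Sent Events rather than a single JSON object. This page does not show the event payload, so the sketch below assumes the usual OpenAI streaming convention: each event is a "data:" line holding a JSON chunk whose text sits at choices[].delta.content, and the stream ends with "data: [DONE]". The function name iter_stream_text and the sample lines are illustrative.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel (OpenAI convention)
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events in the assumed format, not captured from the API.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints Hello!
```

In a real client, the lines would come from the HTTP response body read incrementally. The parsing stays the same.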

Example Request

curl https://unillm.ccwu.cc/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7
  }'

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713000000,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}
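
A short sketch of reading the fields a client usually needs from this response: the generated text, the finish reason, and the token usage. It parses the example response shown above, and the helper name summarize_completion is illustrative.

```python
import json


def summarize_completion(response):
    """Pull the reply text, finish reason, and token counts from a response dict."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# The example response from this page, as a JSON string.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713000000,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 20, "completion_tokens": 9, "total_tokens": 29}
}
"""

summary = summarize_completion(json.loads(raw))
print(summary["text"])          # prints Hello! How can I help you today?
print(summary["total_tokens"])  # prints 29
```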

SDK Examples

Because UniLLM is fully compatible with the OpenAI API format, the official OpenAI SDKs work once you point them at the UniLLM base URL.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://unillm.ccwu.cc/api/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://unillm.ccwu.cc/api/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

Rate Limits

Rate limits are determined by your plan tier:

  • Starter: 60 requests per minute
  • Popular: 120 requests per minute
  • Enterprise: 300 requests per minute

If you exceed your rate limit, the API will return a 429 Too Many Requests response. Implement exponential backoff in your client for best results.
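
One way to implement the backoff is sketched below. The exception class RateLimited, the function name, and the delay values are assumptions chosen for illustration. The caller is expected to raise RateLimited whenever a response comes back with status 429.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception: raise it when a response has status 429."""


def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run call(), retrying on RateLimited with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Base wait doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter is injectable so that the retry logic can be exercised in tests without actually waiting.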

Error Codes

  • 401 Unauthorized — Invalid or missing API key.
  • 402 Payment Required — Insufficient credits. Purchase more credits from the Pricing page.
  • 429 Too Many Requests — Rate limit exceeded. Wait and retry with exponential backoff.
  • 500 Internal Server Error — An unexpected error occurred on our end. Please retry or contact support.
  • 503 Service Unavailable — The requested model is temporarily unavailable. This may occur during maintenance or upstream provider outages.
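
The table above can be turned into a small lookup that tells a client whether a retry is worthwhile. Treating 429, 500, and 503 as retryable and 401 and 402 as not retryable is an interpretation of the descriptions above, not a documented guarantee. The names RETRYABLE, ERROR_HINTS, and classify_error are illustrative.

```python
# Status codes listed on this page, with a short hint for each.
ERROR_HINTS = {
    401: "Invalid or missing API key.",
    402: "Insufficient credits.",
    429: "Rate limit exceeded.",
    500: "Unexpected server error.",
    503: "Model temporarily unavailable.",
}

# Assumption: only these are worth retrying; 401 and 402 need user action first.
RETRYABLE = {429, 500, 503}


def classify_error(status):
    """Return a hint and a retry decision for an HTTP status code."""
    return {
        "hint": ERROR_HINTS.get(status, "Undocumented status code."),
        "retryable": status in RETRYABLE,
    }


print(classify_error(429)["retryable"])  # prints True
print(classify_error(401)["retryable"])  # prints False
```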

Contact

Questions about the API? Email us at support@unillm.ccwu.cc.

© 2026 UniLLM. All rights reserved.

Payments are securely processed by our Merchant of Record.