Custom LLM Requirements
Technical requirements and guidelines for integrating custom Large Language Models with Reqase Lite.
OpenAI API Compatibility
Critical Requirement
Your custom LLM must support OpenAI API-compatible endpoints. Reqase Lite uses the OpenAI client library to communicate with AI providers, so your endpoint must accept and respond in the same format.
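For reference, the sketch below shows roughly how an OpenAI-compatible endpoint is consumed from the official openai Python SDK (v1+). The endpoint URL, API key, and model name are placeholders, not values used internally by Reqase Lite.
from openai import OpenAI

# Point the standard OpenAI client at your custom endpoint.
client = OpenAI(
    base_url="https://your-endpoint.com/v1",  # your OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",                   # sent as "Authorization: Bearer ..."
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Generate test cases for login functionality"},
    ],
    temperature=0.7,
    max_tokens=2000,
)

print(response.choices[0].message.content)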
What is OpenAI API Compatibility?
OpenAI API compatibility means your custom LLM endpoint must:
Accept OpenAI Request Format
Handle POST requests to /v1/chat/completions with OpenAI's JSON structure
Return OpenAI Response Format
Respond with the same JSON structure that OpenAI uses, including the choices array, message objects, and usage metadata
Support Authentication
Accept API keys via Authorization: Bearer header
Handle Standard Parameters
Process parameters like temperature, max_tokens, and messages
Compatible Frameworks
The following frameworks and tools provide OpenAI-compatible API endpoints out of the box:
LiteLLM
Unified interface for 100+ LLMs with OpenAI-compatible endpoints. Supports Azure, Anthropic, Cohere, Hugging Face, and more.
LocalAI
Self-hosted OpenAI-compatible API for running LLMs locally. Supports llama.cpp, Whisper, Stable Diffusion, and more.
vLLM
High-throughput and memory-efficient inference engine with OpenAI-compatible server. Optimized for production deployments.
FastChat
Platform for training, serving, and evaluating LLMs. Includes OpenAI-compatible RESTful API server.
Request Format
Your custom LLM endpoint must accept POST requests with the following structure:
POST /v1/chat/completions
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
{
"model": "your-model-name",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Generate test cases for login functionality"
}
],
"temperature": 0.7,
"max_tokens": 2000
}
Key Parameters
- model: The model identifier (e.g., "gpt-4", "llama-2-70b")
- messages: Array of conversation messages with role and content
- temperature: Controls randomness (0.0 to 1.0)
- max_tokens: Maximum number of tokens in the generated response
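If you are implementing the endpoint yourself, the request body can be modeled with a handful of fields. The sketch below uses FastAPI and Pydantic purely as an illustration (neither is required by Reqase Lite); the field names match the format above.
from typing import List, Optional
from fastapi import FastAPI
from pydantic import BaseModel

class Message(BaseModel):
    role: str      # "system", "user", or "assistant"
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Message]
    temperature: Optional[float] = 1.0
    max_tokens: Optional[int] = None

app = FastAPI()

@app.post("/v1/chat/completions")
def chat_completions(request: ChatCompletionRequest):
    # Run your model here, then wrap its output in the response
    # format described in the next section.
    ...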
Response Format
Your endpoint must return responses in this format:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "your-model-name",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Test Case 1: Valid Login\n..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 56,
"completion_tokens": 31,
"total_tokens": 87
}
}
Required Fields
- choices: Array containing the generated response(s)
- choices[0].message.content: The actual generated text
- choices[0].message.role: Must be "assistant"
- model: The model that generated the response
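If you are wrapping your own model, a small helper can assemble these fields. The function below is an illustrative sketch (the helper name and the placeholder token counts are not part of the specification); report real usage numbers if your model provides them.
import time
import uuid

def to_openai_response(generated_text: str, model: str,
                       prompt_tokens: int = 0, completion_tokens: int = 0) -> dict:
    # Illustrative helper: wrap raw model output in the OpenAI response shape.
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": generated_text},
                "finish_reason": "stop",
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }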
Testing Your Endpoint
Before configuring your custom LLM in Reqase Lite, test it manually:
Test with cURL
curl -X POST https://your-endpoint.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "your-model",
"messages": [{"role": "user", "content": "Hello"}]
}'
Verify the response matches the OpenAI format shown above
Check that the choices[0].message.content field contains the generated text
Once verified, configure the endpoint in Reqase Lite's Custom AI Integration settings
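The same check can be scripted. The snippet below uses the requests library (the URL, API key, and model name are placeholders) and fails if the fields Reqase Lite reads are missing.
import requests

resp = requests.post(
    "https://your-endpoint.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "your-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

# Fields listed under "Required Fields" above.
assert data["choices"][0]["message"]["role"] == "assistant"
assert isinstance(data["choices"][0]["message"]["content"], str)
print("OK:", data["choices"][0]["message"]["content"][:80])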
Common Issues
❌ Incorrect Response Structure
Problem: Your endpoint returns a different JSON structure
Solution: Use a compatibility layer (like LiteLLM) to transform responses to OpenAI format
❌ Missing Required Fields
Problem: Response is missing choices or message.content
Solution: Ensure your endpoint includes all required fields in the response
❌ Authentication Errors
Problem: Endpoint doesn't accept Bearer token authentication
Solution: Configure your endpoint to accept Authorization: Bearer YOUR_API_KEY header
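If your server does not parse Bearer tokens yet, the check is a small amount of header handling. The helper below is an illustrative sketch, not Reqase Lite code; adapt it to whatever framework serves your endpoint.
def extract_bearer_key(authorization_header: str | None) -> str | None:
    # Pull the API key out of an "Authorization: Bearer <key>" header value.
    if not authorization_header:
        return None
    scheme, _, key = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not key.strip():
        return None
    return key.strip()

VALID_KEYS = {"YOUR_API_KEY"}  # placeholder; load real keys from configuration

def is_authorized(authorization_header: str | None) -> bool:
    return extract_bearer_key(authorization_header) in VALID_KEYS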
Next Steps
Once your custom LLM meets these requirements, you can configure it in Reqase Lite.