Custom LLM Requirements
Technical requirements and guidelines for integrating custom Large Language Models with Reqase Lite.
OpenAI API Compatibility
Critical Requirement
Your custom LLM must support OpenAI API-compatible endpoints. Reqase Lite uses the OpenAI client library to communicate with AI providers, so your endpoint must accept and respond in the same format.
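For reference, the sketch below shows roughly how an OpenAI-compatible endpoint is consumed from the official openai Python SDK (v1+). The endpoint URL, API key, and model name are placeholders, not values used internally by Reqase Lite.
from openai import OpenAI

# Point the standard OpenAI client at your custom endpoint.
client = OpenAI(
    base_url="https://your-endpoint.com/v1",  # your OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",                   # sent as "Authorization: Bearer ..."
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Generate test cases for login functionality"},
    ],
    temperature=0.7,
    max_tokens=2000,
)

print(response.choices[0].message.content)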
What is OpenAI API Compatibility?
OpenAI API compatibility means your custom LLM endpoint must:
Accept OpenAI Request Format
Handle POST requests to /v1/chat/completions with OpenAI's JSON structure
Return OpenAI Response Format
Respond with the same JSON structure that OpenAI uses, including the choices array, message objects, and usage metadata
Support Authentication
Accept API keys via Authorization: Bearer header
Handle Standard Parameters
Process parameters like temperature, max_tokens, and messages
Compatible Frameworks
The following frameworks and tools provide OpenAI-compatible API endpoints out of the box:
LiteLLM
Unified interface for 100+ LLMs with OpenAI-compatible endpoints. Supports Azure, Anthropic, Cohere, Hugging Face, and more.
LocalAI
Self-hosted OpenAI-compatible API for running LLMs locally. Supports llama.cpp, Whisper, Stable Diffusion, and more.
vLLM
High-throughput and memory-efficient inference engine with OpenAI-compatible server. Optimized for production deployments.
FastChat
Platform for training, serving, and evaluating LLMs. Includes OpenAI-compatible RESTful API server.
Request Format
Your custom LLM endpoint must accept POST requests with the following structure:
POST /v1/chat/completions
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
{
"model": "your-model-name",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Generate test cases for login functionality"
}
],
"temperature": 0.7,
"max_tokens": 2000
}
Key Parameters
- model: The model identifier (e.g., "gpt-4", "llama-2-70b")
- messages: Array of conversation messages with role and content
- temperature: Controls randomness (0.0 to 1.0)
- max_tokens: Maximum number of tokens in the generated response
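If you are implementing the endpoint yourself, the request body can be modeled with a handful of fields. The sketch below uses FastAPI and Pydantic purely as an illustration (neither is required by Reqase Lite); the field names match the format above.
from typing import List, Optional
from fastapi import FastAPI
from pydantic import BaseModel

class Message(BaseModel):
    role: str      # "system", "user", or "assistant"
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Message]
    temperature: Optional[float] = 1.0
    max_tokens: Optional[int] = None

app = FastAPI()

@app.post("/v1/chat/completions")
def chat_completions(request: ChatCompletionRequest):
    # Run your model here, then wrap its output in the response
    # format described in the next section.
    ...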
Response Format
Your endpoint must return responses in this format:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "your-model-name",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Test Case 1: Valid Login\n..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 56,
"completion_tokens": 31,
"total_tokens": 87
}
}
Required Fields
- choices: Array containing the generated response(s)
- choices[0].message.content: The actual generated text
- choices[0].message.role: Must be "assistant"
- model: The model that generated the response
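If you are wrapping your own model, a small helper can assemble these fields. The function below is an illustrative sketch (the helper name and the placeholder token counts are not part of the specification); report real usage numbers if your model provides them.
import time
import uuid

def to_openai_response(generated_text: str, model: str,
                       prompt_tokens: int = 0, completion_tokens: int = 0) -> dict:
    # Illustrative helper: wrap raw model output in the OpenAI response shape.
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": generated_text},
                "finish_reason": "stop",
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }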
Testing Your Endpoint
Before configuring your custom LLM in Reqase Lite, test it manually:
Test with cURL
curl -X POST https://your-endpoint.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "your-model",
"messages": [{"role": "user", "content": "Hello"}]
}'
Verify the response matches the OpenAI format shown above
Check that the choices[0].message.content field contains the generated text
Once verified, configure the endpoint in Reqase Lite's Custom AI Integration settings
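The same check can be scripted. The snippet below uses the requests library (the URL, API key, and model name are placeholders) and fails if the fields Reqase Lite reads are missing.
import requests

resp = requests.post(
    "https://your-endpoint.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "your-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

# Fields listed under "Required Fields" above.
assert data["choices"][0]["message"]["role"] == "assistant"
assert isinstance(data["choices"][0]["message"]["content"], str)
print("OK:", data["choices"][0]["message"]["content"][:80])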
Common Issues
❌ Incorrect Response Structure
Problem: Your endpoint returns a different JSON structure
Solution: Use a compatibility layer (like LiteLLM) to transform responses to OpenAI format
❌ Missing Required Fields
Problem: Response is missing choices or message.content
Solution: Ensure your endpoint includes all required fields in the response
❌ Authentication Errors
Problem: Endpoint doesn't accept Bearer token authentication
Solution: Configure your endpoint to accept Authorization: Bearer YOUR_API_KEY header
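If your server does not parse Bearer tokens yet, the check is a small amount of header handling. The helper below is an illustrative sketch, not Reqase Lite code; adapt it to whatever framework serves your endpoint.
def extract_bearer_key(authorization_header: str | None) -> str | None:
    # Pull the API key out of an "Authorization: Bearer <key>" header value.
    if not authorization_header:
        return None
    scheme, _, key = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not key.strip():
        return None
    return key.strip()

VALID_KEYS = {"YOUR_API_KEY"}  # placeholder; load real keys from configuration

def is_authorized(authorization_header: str | None) -> bool:
    return extract_bearer_key(authorization_header) in VALID_KEYS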
Next Steps
Once your custom LLM meets these requirements, you can configure it in Reqase Lite.