POST /api/arena/chat
Send a prompt to a single AI model and receive a response. Supports automatic model selection, custom system prompts, and thread persistence.
Single Chat Stable
Send a prompt to a single AI model. Supports automatic or manual model selection with a 3-tier fallback chain.
Authentication Required
JWT Claims Extraction (Controller Lines 154-166):
Request Body
User message text
Validation (Line 89):
Constraints:
- MUST NOT be null
- MUST NOT be whitespace-only
- No max length enforced (provider-dependent)
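
The constraints above can be mirrored on the client before a request is sent. This is a minimal sketch, assuming the server check behaves like a null-or-whitespace test (which also rejects an empty string); `is_valid_prompt` is a hypothetical helper and not part of the API.

```python
from typing import Optional


def is_valid_prompt(prompt: Optional[str]) -> bool:
    """Return True if the prompt is neither None nor whitespace-only."""
    # str.strip() removes leading and trailing whitespace, so an empty
    # result means the prompt was empty or contained only whitespace.
    # No maximum length is checked, matching the documented constraints.
    return prompt is not None and prompt.strip() != ""
```

Rejecting such prompts locally avoids a round trip that would otherwise end in a 400 response.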
Model name, or "auto" for random selection
Behavior (Selection Logic, Lines 102-108):
- null, empty, or "auto" → Random active model
- Specific name → Direct model usage
System prompt (maps internally to request.System)
Default: Implementation-defined and not guaranteed (provider-specific)
Maximum response tokens
Limits: Provider-dependent (checked at provider level)
Sampling temperature
Range: 0.0 (deterministic) to 2.0 (maximum creativity)
Default: Not specified by API contract (provider-defined)
Session identifier
Auto-generation (Line 87):
Thread UUID for message persistence
Validation (Lines 168-174):
Behavior: If invalid GUID or omitted, message not persisted to thread
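
As an illustration of how a client might assemble a request, the sketch below builds (but does not send) a POST to `/api/arena/chat`. Several details are assumptions rather than documented facts: the JSON keys `prompt` and `model`, the `Authorization: Bearer` header scheme, and the helper names `is_guid` and `build_chat_request`. Only `threadId` appears as an identifier in this reference. Python's `uuid.UUID` parser may also accept a slightly different set of formats than the server's GUID check.

```python
import json
import urllib.request
import uuid
from typing import Optional


def is_guid(value: Optional[str]) -> bool:
    """Return True if value is a non-empty string that parses as a UUID."""
    if not value:
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def build_chat_request(base_url: str, token: str, prompt: str,
                       model: Optional[str] = None,
                       thread_id: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request for /api/arena/chat without sending it."""
    body = {"prompt": prompt}          # key name is an assumption
    if model:
        body["model"] = model          # key name is an assumption
    # An invalid threadId is ignored by the server anyway, so this sketch
    # only includes it when it parses as a UUID.
    if is_guid(thread_id):
        body["threadId"] = thread_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/arena/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # header scheme is an assumption
        },
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(req, timeout=150)`. A timeout above the roughly 135 seconds listed under Edge Cases gives the fallback chain time to finish before the client gives up.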
Response
Always "ai.response" (Line 121)
Always true on success (Line 130)
AI response text (Line 131, mirrors output.content[0].text)
Echo of user prompt (Line 138)
"automatic" or "manual" (Line 139)
Total duration in milliseconds (Lines 115, 140)
Includes: Fallback retry time if primary provider failed
ISO8601 UTC timestamp (Line 147)
Side Effects
Database Mutations (Lines 150-174):
- message_logs table (Line 150)
- users table UPSERT (Lines 152-166):
  - Executes on every authenticated request
  - Idempotent UPSERT operation
  - Creates user if not exists, updates if exists
- thread_messages table (conditional, Lines 168-174):
  - Only if threadId provided and valid GUID
  - Links message to existing thread
Behavior
Provider Execution with Fallback (Lines 111, 528-593):
- null or "auto": Query ai_models table for random active model
- Specific model name: Direct lookup in model registry
User sync:
- Happens after AI inference (Lines 152-166)
- Non-blocking (awaited)
- Failure behavior not enforced by server contract
Thread persistence:
- Happens after AI inference and user sync
- Only if threadId provided
- Only if threadId valid GUID
- Failure would bubble to 500 error
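
The flow in this section can be sketched in a few lines. This is an illustrative model, not the controller code: `select_model` and `side_effects_after_inference` are hypothetical names, the mapping of the two selection paths onto the "automatic" and "manual" labels is inferred from the response description, and the ordering of table writes is taken from the controller line ranges quoted in this reference. The fallback chain and the model-not-found case are not modeled.

```python
import random
import uuid
from typing import List, Optional, Sequence, Tuple


def select_model(requested: Optional[str],
                 active_models: Sequence[str]) -> Tuple[str, str]:
    """Return (model_name, selection_mode) for a requested model value."""
    # null, empty, or "auto": pick a random active model.
    # random.choice raises IndexError if active_models is empty.
    if requested is None or requested == "" or requested == "auto":
        return random.choice(active_models), "automatic"
    # Any other value is used directly as the model name.
    return requested, "manual"


def side_effects_after_inference(thread_id: Optional[str]) -> List[str]:
    """Return the tables written after inference, in order."""
    # The message log and the user UPSERT happen on every request.
    steps = ["message_logs", "users"]
    # thread_messages is written only for a present, well-formed threadId.
    if thread_id:
        try:
            uuid.UUID(thread_id)
        except ValueError:
            return steps
        steps.append("thread_messages")
    return steps
```

For example, `side_effects_after_inference("not-a-guid")` returns only the two unconditional writes, which matches the silent skip described for an invalid threadId.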
Error Conditions
| Code | HTTP | Cause | Controller Line |
|---|---|---|---|
| INVALID_REQUEST | 400 | Prompt null or whitespace | 91-97 |
| UNAUTHORIZED | 401 | Missing/invalid JWT | Middleware |
| API_ERROR | 500 | Provider failure | 180-186 |
| API_ERROR | 500 | Uncaught exception | 178-187 |
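
A client might turn these statuses into readable messages with a small lookup. The sketch below copies the codes and causes from the table; `DOCUMENTED_ERRORS` and `explain_status` are hypothetical names, and the shape of the actual error response body is not assumed here.

```python
from typing import Dict, Tuple

# (error code, documented causes) keyed by HTTP status, copied from the table.
DOCUMENTED_ERRORS: Dict[int, Tuple[str, Tuple[str, ...]]] = {
    400: ("INVALID_REQUEST", ("Prompt null or whitespace",)),
    401: ("UNAUTHORIZED", ("Missing/invalid JWT",)),
    500: ("API_ERROR", ("Provider failure", "Uncaught exception")),
}


def explain_status(status: int) -> str:
    """Describe an HTTP status using the documented error table."""
    entry = DOCUMENTED_ERRORS.get(status)
    if entry is None:
        return f"HTTP {status}: not listed in the error table"
    code, causes = entry
    # A 500 has two documented causes that share the same error code.
    return f"HTTP {status} {code}: " + " or ".join(causes)
```

Because both 500 rows share the API_ERROR code, a client cannot tell a provider failure from an uncaught exception by status and code alone.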
Edge Cases
- Invalid threadId GUID: Silently skipped, no error (Line 170 guard)
- User sync failure: Logged as warning, request continues (implicit in UserSyncService)
- Model not found: Fallback chain triggered
- All providers fail: 500 error after ~135s
- Empty model name: Treated as "auto" (Line 102 check)
Rate Limits
No explicit rate limiting in controller. Provider-level limits apply:
- Groq free tier: 30 req/min, 14,400 tokens/min
- Groq paid tier: Higher limits (check API dashboard)
- Bytez: Provider-dependent
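
Since the controller does not throttle, a client that wants to stay under a provider's request limit has to pace itself. The following is a minimal sliding-window sketch sized for the 30 req/min figure above. `SlidingWindowLimiter` is a hypothetical helper; it does not track the tokens-per-minute limit, and how the provider actually measures its window is not known from this reference.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests: int = 30, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._sent: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each request and wait or queue the request when it returns False.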