POST /api/arena/dualchat
curl -X POST 'http://localhost:5079/api/arena/dualchat' \
  -H 'Authorization: Bearer YOUR_JWT_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Write a haiku about programming",
    "selectionMode": "random",
    "temperature": 0.9
  }'
{
  "success": true,
  "agent1": {
    "object": "ai.response",
    "message": "Code flows like water\nThrough circuits of logic pure\nBugs hide in shadows",
    "model": {
      "name": "llama-3.3-70b-versatile",
      "displayName": "Llama 3.3 70B",
      "provider": "groq"
    },
    "usage": {
      "promptTokens": 12,
      "completionTokens": 25,
      "totalTokens": 37
    },
    "responseTimeMs": 823,
    "timestamp": "2024-01-15T10:30:00.000Z"
  },
  "agent2": {
    "object": "ai.response",
    "message": "Silent keystrokes fall\nAlgorithms come alive\nCreation awaits",
    "model": {
      "name": "mixtral-8x7b-32768",
      "displayName": "Mixtral 8x7B",
      "provider": "groq"
    },
    "usage": {
      "promptTokens": 12,
      "completionTokens": 22,
      "totalTokens": 34
    },
    "responseTimeMs": 1102,
    "timestamp": "2024-01-15T10:30:00.000Z"
  },
  "comparisonId": "c7d3a4b2-9e1f-4c5d-8b3a-7f6e9d2c1a0b",
  "arena": {
    "comparison": {
      "winnerByLength": "agent1",
      "winnerByTokens": "agent1",
      "verdict": "Agent 1 produced the longer, more token-heavy answer.",
      "userWinner": null,
      "agent1MessageLength": 68,
      "agent2MessageLength": 55,
      "agent1Tokens": 37,
      "agent2Tokens": 34
    },
    "models": {
      "agent1": "llama-3.3-70b-versatile",
      "agent2": "mixtral-8x7b-32768"
    }
  },
  "timestamp": "2024-01-15T10:30:00.000Z",
  "totalResponseTimeMs": 1102
}

Dual Chat (Arena Mode) Stable

Send a prompt to two AI models simultaneously. Models respond in parallel, and responses are returned anonymized for blind comparison.

Authentication Required

JWT Claims Extraction (Lines 338-350):
  • sub | ClaimTypes.NameIdentifier → User UUID (required)
  • email | ClaimTypes.Email → User email
  • full_name | name | ClaimTypes.Name → Display name

Request Body

prompt
string
required
User message sent to both models
Validation (Line 200):
if (request == null || string.IsNullOrWhiteSpace(request.Prompt))
    return BadRequest("INVALID_REQUEST");
selectionMode
string
Model selection strategy
Values:
  • "random": Two different random active models (Lines 246-249)
  • "topper": Top-performing model + random model (Lines 237-242)
  • Default if manual models provided: "manual" (Line 233)
Selection Logic:
if (request.SelectionMode == "topper") {
    var pair = await _leaderboardModelSelector.GetTopperAndRandomModelAsync();
} else {
    var pair = await _modelSelector.GetTwoRandomModelsAsync();
}
model1
string
First model name (manual selection)
Validation (Lines 218-229):
if (manual && (string.IsNullOrWhiteSpace(request.Model1) || 
                string.IsNullOrWhiteSpace(request.Model2))) {
    return BadRequest("Both model1 and model2 are required");
}
Required: Only if model2 also provided
model2
string
Second model name (manual selection)
Required: Only if model1 also provided
system
string
System prompt applied to both models
Default: Implementation-defined and not guaranteed (provider-specific)
maxTokens
integer
Maximum tokens per model response
Applies: To both models independently
temperature
number
Sampling temperature (0.0-2.0) for both models
Default: Not specified by API contract (provider-defined)
sessionId
string
Session tracking identifier
Auto-generation (Line 197):
var sessionId = Guid.NewGuid();
threadId
string
Thread UUID to associate comparison with conversation
Validation (Lines 354-360):
if (!string.IsNullOrEmpty(request.ThreadId)) {
    if (Guid.TryParse(request.ThreadId, out Guid threadIdGuid)) {
        await _threadMessagesService.LogDualAsync(...);
    }
}
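The threadId guard above can be mirrored client-side to avoid sending IDs the server will silently ignore. A minimal TypeScript sketch (not the controller code; note the regex covers only the canonical hyphenated GUID form, while the server's Guid.TryParse also accepts a few other formats):

```typescript
// Canonical 8-4-4-4-12 hyphenated GUID, case-insensitive.
const GUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Mirrors the server's two-step guard: skip when empty, skip when unparseable.
function shouldLogToThread(threadId: string | null | undefined): boolean {
  if (!threadId) return false; // !string.IsNullOrEmpty(request.ThreadId)
  return GUID_RE.test(threadId); // Guid.TryParse(request.ThreadId, ...)
}
```

For example, `shouldLogToThread("c7d3a4b2-9e1f-4c5d-8b3a-7f6e9d2c1a0b")` passes, while `shouldLogToThread("not-a-guid")` is skipped without an error, matching the edge-case behavior described below.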

Response

success
boolean
Always true on success (Line 392)
agent1
object
First model response (same structure as single chat)
agent2
object
Second model response (same structure as agent1)
comparisonId
string
UUID identifying this comparison (Lines 198, 395)
Used for: Voting via /api/arena/model-vote
arena
object
Comparison metadata: length/token winners, verdict, and the two model names (see example above)
timestamp
string
ISO8601 UTC timestamp (Line 415)
totalResponseTimeMs
integer
Total request duration (Lines 265, 416)
Note: Due to parallel execution, approximately equal to the slower model's response time

Side Effects

Database Mutations (Lines 333-360):
  1. message_logs table (Lines 333-334):
    await _messageLogger.LogMessageAsync(sessionId, finalModel1, "agent1", request, response1);
    await _messageLogger.LogMessageAsync(sessionId, finalModel2, "agent2", request, response2);
    
    • Two separate log entries (one per model)
  2. users table UPSERT (Lines 336-350):
    await _userSyncService.EnsureUserExistsAsync(userId, email, name);
    
    • Executes on every authenticated request
    • Idempotent operation
  3. comparisons table (Line 352):
    await _comparisonLogger.LogComparisonAsync(comparisonId, request, response1, response2, userId);
    
    • Stores comparison data for voting/leaderboard
    • Links to both models
  4. thread_messages table (conditional, Lines 354-360):
    if (!string.IsNullOrEmpty(request.ThreadId)) {
        await _threadMessagesService.LogDualAsync(threadIdGuid, request.Prompt, 
            finalModel1, finalModel2, response1, response2, comparisonId);
    }
    
    • Only if threadId provided and valid GUID
    • Links message to comparison via comparisonId

Behavior

Parallel Execution (Lines 254-257):
var task1 = ExecuteWithFallbackAsync(model1, ...);
var task2 = ExecuteWithFallbackAsync(model2, ...);

await Task.WhenAll(task1, task2);
Independence:
  • Each model has independent 45s timeout
  • Each model has independent fallback chain
  • One model failure doesn’t block the other
Response Time (Line 265):
var responseTime = (long)(DateTime.UtcNow - startTime).TotalMilliseconds;
  • Measures total elapsed time
  • Due to parallel execution: max(model1_time, model2_time) + overhead
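A minimal sketch (not the server code) of why totalResponseTimeMs tracks the slower model rather than the sum: both simulated calls are started before either is awaited, mirroring Task.WhenAll in the controller.

```typescript
// Simulated model call that resolves after a fixed delay.
function fakeModel(name: string, delayMs: number): Promise<string> {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`${name} done`), delayMs)
  );
}

async function dualChat(): Promise<{ results: string[]; elapsedMs: number }> {
  const start = Date.now();
  // Both promises (and their timers) are created before Promise.all waits,
  // so the calls overlap instead of running back to back.
  const results = await Promise.all([
    fakeModel("agent1", 80),
    fakeModel("agent2", 120),
  ]);
  return { results, elapsedMs: Date.now() - start };
}
```

With delays of 80 ms and 120 ms, elapsedMs lands near 120 ms rather than 200 ms, which is the max-plus-overhead behavior described above.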
Selection Modes (Lines 213-251):
Mode | Logic | Line Range
Manual | model1 and model2 both provided (non-whitespace) | 218-233
Topper | request.SelectionMode == "topper" | 237-242
Random | Default | 244-250
Topper Mode Implementation (Lines 239-241):
var pair = await _leaderboardModelSelector.GetTopperAndRandomModelAsync();
  • Queries model_votes table for highest win rate
  • Pairs top model with random model
  • Ensures diverse comparison
Arena Comparison Logic (Lines 362-388):
var msg1Len = (response1.Message ?? string.Empty).Length;
var msg2Len = (response2.Message ?? string.Empty).Length;
var tokens1 = response1.Usage?.TotalTokens ?? 0;
var tokens2 = response2.Usage?.TotalTokens ?? 0;

// Determine winners by length and tokens
// Generate verdict based on combination
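Continuing from the excerpt above, the winner determination can be sketched in TypeScript. The field names follow the response shape shown earlier; ties going to agent1 is an assumption, since the excerpt stops before the winner assignment.

```typescript
// Shape of each agent response, reduced to the fields the comparison uses.
interface AgentResponse {
  message?: string | null;
  usage?: { totalTokens?: number } | null;
}

function compareAgents(r1: AgentResponse, r2: AgentResponse) {
  // Null-coalesce exactly as the controller does (lines 362-367).
  const len1 = (r1.message ?? "").length;
  const len2 = (r2.message ?? "").length;
  const tok1 = r1.usage?.totalTokens ?? 0;
  const tok2 = r2.usage?.totalTokens ?? 0;
  return {
    winnerByLength: len1 >= len2 ? "agent1" : "agent2", // tie → agent1 (assumed)
    winnerByTokens: tok1 >= tok2 ? "agent1" : "agent2",
    agent1MessageLength: len1,
    agent2MessageLength: len2,
    agent1Tokens: tok1,
    agent2Tokens: tok2,
  };
}
```

Note the two winners can disagree, as when one model writes longer prose while the other consumes more tokens; the verdict string summarizes the combination.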

Error Conditions

Code | HTTP | Cause | Controller Line
INVALID_REQUEST | 400 | Prompt null/whitespace | 202-208
INVALID_REQUEST | 400 | Manual mode missing model1 or model2 | 222-228
API_ERROR | 500 | Inner exception (provider failure) | 424-430
API_ERROR | 500 | Outer exception (unexpected error) | 436-442
Nested Exception Handling (Lines 421-443):
try {
    try {
        // Dual chat logic
    } catch (Exception ex) {
        _logger.LogError(ex, "DualChat inner error");
        return StatusCode(500, "API_ERROR");
    }
} catch (Exception outerEx) {
    _logger.LogError(outerEx, "DualChat outer error");
    return StatusCode(500, "API_ERROR");
}
Partial Execution: If one model succeeds and one fails, both tasks still run to completion. A full dual response is returned only when both succeed; if either fails, the exception bubbles up to the error handler.

Edge Cases

  1. Same model selected twice: Not prevented by code, allowed in random selection
  2. Topper mode with insufficient vote data: Behavior not enforced by server contract (assumed fallback to random selection)
  3. Invalid threadId GUID: Silently skipped, no error (Line 356 guard)
  4. Model fallback changes model names: finalModel1 and finalModel2 may differ from requested models
  5. Null usage stats: Handled with null-coalescing (Lines 366-367)

Comparison ID Usage

Generated at request start (Line 198):
var comparisonId = Guid.NewGuid();
Used for:
  1. Logging comparison to comparisons table (Line 352)
  2. Linking to thread message in thread_messages table (Line 358)
  3. Returned in response for voting (Line 395)
  4. Voting endpoint requires this ID: POST /api/arena/model-vote

Rate Limits

No explicit rate limiting in controller. Provider-level limits apply:
  • Groq free tier: 30 req/min, 14,400 tokens/min
  • Dual chat consumes 2× tokens (both models)
  • Effective limit: ~15 dual-chat requests/min on free tier
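The effective-limit arithmetic above can be made explicit. A back-of-envelope sketch (the 30 req/min figure is the free-tier default quoted above; check the provider dashboard for current values):

```typescript
// Dual chat fires one provider call per model, so the effective request
// budget is roughly the provider's request limit divided by calls per request.
function effectiveDualChatLimit(
  providerReqPerMin: number,
  callsPerRequest: number = 2
): number {
  return Math.floor(providerReqPerMin / callsPerRequest);
}
```

So `effectiveDualChatLimit(30)` gives the ~15 dual-chat requests/min quoted above; the same halving applies to the token budget, since both models bill tokens for every prompt.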