POST /api/arena/dualchat
curl -X POST 'http://localhost:5079/api/arena/dualchat' \
  -H 'Authorization: Bearer YOUR_JWT_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "Write a haiku about programming",
    "selectionMode": "random",
    "temperature": 0.9
  }'
{
  "success": true,
  "agent1": {
    "object": "ai.response",
    "message": "Code flows like water\nThrough circuits of logic pure\nBugs hide in shadows",
    "model": {
      "name": "llama-3.3-70b-versatile",
      "displayName": "Llama 3.3 70B",
      "provider": "groq"
    },
    "usage": {
      "promptTokens": 12,
      "completionTokens": 25,
      "totalTokens": 37
    },
    "responseTimeMs": 823,
    "timestamp": "2024-01-15T10:30:00.000Z"
  },
  "agent2": {
    "object": "ai.response",
    "message": "Silent keystrokes fall\nAlgorithms come alive\nCreation awaits",
    "model": {
      "name": "mixtral-8x7b-32768",
      "displayName": "Mixtral 8x7B",
      "provider": "groq"
    },
    "usage": {
      "promptTokens": 12,
      "completionTokens": 22,
      "totalTokens": 34
    },
    "responseTimeMs": 1102,
    "timestamp": "2024-01-15T10:30:00.000Z"
  },
  "comparisonId": "c7d3a4b2-9e1f-4c5d-8b3a-7f6e9d2c1a0b",
  "arena": {
    "comparison": {
      "winnerByLength": "agent1",
      "winnerByTokens": "agent1",
      "verdict": "Agent 1 produced the longer, more token-heavy answer.",
      "userWinner": null,
      "agent1MessageLength": 68,
      "agent2MessageLength": 55,
      "agent1Tokens": 37,
      "agent2Tokens": 34
    },
    "models": {
      "agent1": "llama-3.3-70b-versatile",
      "agent2": "mixtral-8x7b-32768"
    }
  },
  "timestamp": "2024-01-15T10:30:00.000Z",
  "totalResponseTimeMs": 1102
}

Dual Chat (Arena Mode) Stable

Send a prompt to two AI models simultaneously. Models respond in parallel, and responses are returned anonymized for blind comparison.

Authentication Required

JWT Claims Extraction (Lines 338-350):
  • sub | ClaimTypes.NameIdentifier → User UUID (required)
  • email | ClaimTypes.Email → User email
  • full_name | name | ClaimTypes.Name → Display name

Request Body

prompt
string
required
User message sent to both models
Validation (Line 200):
if (request == null || string.IsNullOrWhiteSpace(request.Prompt))
    return BadRequest("INVALID_REQUEST");
selectionMode
string
Model selection strategy
Values:
  • "random": Two different random active models (Lines 246-249)
  • "topper": Top-performing model + random model (Lines 237-242)
  • Default if manual models provided: "manual" (Line 233)
Selection Logic:
if (request.SelectionMode == "topper") {
    var pair = await _leaderboardModelSelector.GetTopperAndRandomModelAsync();
} else {
    var pair = await _modelSelector.GetTwoRandomModelsAsync();
}
model1
string
First model name (manual selection)
Validation (Lines 218-229):
if (manual && (string.IsNullOrWhiteSpace(request.Model1) || 
                string.IsNullOrWhiteSpace(request.Model2))) {
    return BadRequest("Both model1 and model2 are required");
}
Required: Only if model2 also provided
model2
string
Second model name (manual selection)
Required: Only if model1 also provided
system
string
System prompt applied to both models
Default: Implementation-defined and not guaranteed (provider-specific)
maxTokens
integer
Maximum tokens per model response
Applies: To both models independently
temperature
number
Sampling temperature (0.0-2.0) for both models
Default: Not specified by API contract (provider-defined)
sessionId
string
Session tracking identifier
Auto-generation (Line 197):
var sessionId = Guid.NewGuid();
threadId
string
Thread UUID to associate comparison with conversation
Validation (Lines 354-360):
if (!string.IsNullOrEmpty(request.ThreadId)) {
    if (Guid.TryParse(request.ThreadId, out Guid threadIdGuid)) {
        await _threadMessagesService.LogDualAsync(...);
    }
}
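The threadId guard above can be mirrored client-side to avoid sending IDs the server will silently ignore. A minimal TypeScript sketch (not the controller code; note the regex covers only the canonical hyphenated GUID form, while the server's Guid.TryParse also accepts a few other formats):

```typescript
// Canonical 8-4-4-4-12 hyphenated GUID, case-insensitive.
const GUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Mirrors the server's two-step guard: skip when empty, skip when unparseable.
function shouldLogToThread(threadId: string | null | undefined): boolean {
  if (!threadId) return false; // !string.IsNullOrEmpty(request.ThreadId)
  return GUID_RE.test(threadId); // Guid.TryParse(request.ThreadId, ...)
}
```

For example, `shouldLogToThread("c7d3a4b2-9e1f-4c5d-8b3a-7f6e9d2c1a0b")` passes, while `shouldLogToThread("not-a-guid")` is skipped without an error, matching the edge-case behavior described below.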

Response

success
boolean
Always true on success (Line 392)
agent1
object
First model response (same structure as single chat)
agent2
object
Second model response (same structure as agent1)
comparisonId
string
UUID identifying this comparison (Lines 198, 395)
Used for: Voting via /api/arena/model-vote
arena
object
Comparison metadata: length/token winners, verdict, and the two model names (see example above)
timestamp
string
ISO8601 UTC timestamp (Line 415)
totalResponseTimeMs
integer
Total request duration (Lines 265, 416)
Note: Due to parallel execution, approximately equal to the slower model's response time

Side Effects

Database Mutations (Lines 333-360):
  1. message_logs table (Lines 333-334):
    await _messageLogger.LogMessageAsync(sessionId, finalModel1, "agent1", request, response1);
    await _messageLogger.LogMessageAsync(sessionId, finalModel2, "agent2", request, response2);
    
    • Two separate log entries (one per model)
  2. users table UPSERT (Lines 336-350):
    await _userSyncService.EnsureUserExistsAsync(userId, email, name);
    
    • Executes on every authenticated request
    • Idempotent operation
  3. comparisons table (Line 352):
    await _comparisonLogger.LogComparisonAsync(comparisonId, request, response1, response2, userId);
    
    • Stores comparison data for voting/leaderboard
    • Links to both models
  4. thread_messages table (conditional, Lines 354-360):
    if (!string.IsNullOrEmpty(request.ThreadId)) {
        await _threadMessagesService.LogDualAsync(threadIdGuid, request.Prompt, 
            finalModel1, finalModel2, response1, response2, comparisonId);
    }
    
    • Only if threadId provided and valid GUID
    • Links message to comparison via comparisonId

Behavior

Parallel Execution (Lines 254-257):
var task1 = ExecuteWithFallbackAsync(model1, ...);
var task2 = ExecuteWithFallbackAsync(model2, ...);

await Task.WhenAll(task1, task2);
Independence:
  • Each model has independent 45s timeout
  • Each model has independent fallback chain
  • One model failure doesn’t block the other
Response Time (Line 265):
var responseTime = (long)(DateTime.UtcNow - startTime).TotalMilliseconds;
  • Measures total elapsed time
  • Due to parallel execution: max(model1_time, model2_time) + overhead
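A minimal sketch (not the server code) of why totalResponseTimeMs tracks the slower model rather than the sum: both simulated calls are started before either is awaited, mirroring Task.WhenAll in the controller.

```typescript
// Simulated model call that resolves after a fixed delay.
function fakeModel(name: string, delayMs: number): Promise<string> {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`${name} done`), delayMs)
  );
}

async function dualChat(): Promise<{ results: string[]; elapsedMs: number }> {
  const start = Date.now();
  // Both promises (and their timers) are created before Promise.all waits,
  // so the calls overlap instead of running back to back.
  const results = await Promise.all([
    fakeModel("agent1", 80),
    fakeModel("agent2", 120),
  ]);
  return { results, elapsedMs: Date.now() - start };
}
```

With delays of 80 ms and 120 ms, elapsedMs lands near 120 ms rather than 200 ms, which is the max-plus-overhead behavior described above.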
Selection Modes (Lines 213-251):
Mode | Logic | Line Range
Manual | model1 and model2 both provided (non-whitespace) | 218-233
Topper | request.SelectionMode == "topper" | 237-242
Random | Default | 244-250
Topper Mode Implementation (Lines 239-241):
var pair = await _leaderboardModelSelector.GetTopperAndRandomModelAsync();
  • Queries model_votes table for highest win rate
  • Pairs top model with random model
  • Ensures diverse comparison
Arena Comparison Logic (Lines 362-388):
var msg1Len = (response1.Message ?? string.Empty).Length;
var msg2Len = (response2.Message ?? string.Empty).Length;
var tokens1 = response1.Usage?.TotalTokens ?? 0;
var tokens2 = response2.Usage?.TotalTokens ?? 0;

// Determine winners by length and tokens
// Generate verdict based on combination
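Continuing from the excerpt above, the winner determination can be sketched in TypeScript. The field names follow the response shape shown earlier; ties going to agent1 is an assumption, since the excerpt stops before the winner assignment.

```typescript
// Shape of each agent response, reduced to the fields the comparison uses.
interface AgentResponse {
  message?: string | null;
  usage?: { totalTokens?: number } | null;
}

function compareAgents(r1: AgentResponse, r2: AgentResponse) {
  // Null-coalesce exactly as the controller does (lines 362-367).
  const len1 = (r1.message ?? "").length;
  const len2 = (r2.message ?? "").length;
  const tok1 = r1.usage?.totalTokens ?? 0;
  const tok2 = r2.usage?.totalTokens ?? 0;
  return {
    winnerByLength: len1 >= len2 ? "agent1" : "agent2", // tie → agent1 (assumed)
    winnerByTokens: tok1 >= tok2 ? "agent1" : "agent2",
    agent1MessageLength: len1,
    agent2MessageLength: len2,
    agent1Tokens: tok1,
    agent2Tokens: tok2,
  };
}
```

Note the two winners can disagree, as when one model writes longer prose while the other consumes more tokens; the verdict string summarizes the combination.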

Error Conditions

Code | HTTP | Cause | Controller Line
INVALID_REQUEST | 400 | Prompt null/whitespace | 202-208
INVALID_REQUEST | 400 | Manual mode missing model1 or model2 | 222-228
API_ERROR | 500 | Inner exception (provider failure) | 424-430
API_ERROR | 500 | Outer exception (unexpected error) | 436-442
Nested Exception Handling (Lines 421-443):
try {
    try {
        // Dual chat logic
    } catch (Exception ex) {
        _logger.LogError(ex, "DualChat inner error");
        return StatusCode(500, "API_ERROR");
    }
} catch (Exception outerEx) {
    _logger.LogError(outerEx, "DualChat outer error");
    return StatusCode(500, "API_ERROR");
}
Partial Execution: If one model succeeds and one fails, both tasks still run to completion. A full dual response is returned only when both succeed; if either fails, the exception bubbles up to the error handler.

Edge Cases

  1. Same model selected twice: Not prevented by code, allowed in random selection
  2. Topper mode with insufficient vote data: Behavior not enforced by server contract (assumed fallback to random selection)
  3. Invalid threadId GUID: Silently skipped, no error (Line 356 guard)
  4. Model fallback changes model names: finalModel1 and finalModel2 may differ from requested models
  5. Null usage stats: Handled with null-coalescing (Lines 366-367)

Comparison ID Usage

Generated at request start (Line 198):
var comparisonId = Guid.NewGuid();
Used for:
  1. Logging comparison to comparisons table (Line 352)
  2. Linking to thread message in thread_messages table (Line 358)
  3. Returned in response for voting (Line 395)
  4. Voting endpoint requires this ID: POST /api/arena/model-vote

Rate Limits

No explicit rate limiting in controller. Provider-level limits apply:
  • Groq free tier: 30 req/min, 14,400 tokens/min
  • Dual chat consumes 2× tokens (both models)
  • Effective limit: ~15 dual-chat requests/min on free tier
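The effective-limit arithmetic above can be made explicit. A back-of-envelope sketch (the 30 req/min figure is the free-tier default quoted above; check the provider dashboard for current values):

```typescript
// Dual chat fires one provider call per model, so the effective request
// budget is roughly the provider's request limit divided by calls per request.
function effectiveDualChatLimit(
  providerReqPerMin: number,
  callsPerRequest: number = 2
): number {
  return Math.floor(providerReqPerMin / callsPerRequest);
}
```

So `effectiveDualChatLimit(30)` gives the ~15 dual-chat requests/min quoted above; the same halving applies to the token budget, since both models bill tokens for every prompt.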