
Why two modes?

DualMind offers two distinct chat modes serving different purposes:
| Mode | Models | Use Case | Voting |
|------|--------|----------|--------|
| Single Chat | 1 model | Quick AI responses | No |
| Dual Chat | 2 models | Side-by-side model comparison | Yes |

Single Chat

Purpose

Single chat provides a straightforward AI inference endpoint: send a prompt, receive a response.

Use Cases:
  • Building chatbots with consistent model choice
  • Testing specific model behavior
  • Production applications requiring predictable responses
  • Conversations where model identity matters

How It Works

1. **Model Selection**: the client specifies a model name or requests random selection.
2. **Provider Routing**: the backend routes to Groq (primary) or Bytez (fallback).
3. **Inference**: the AI model processes the prompt and generates a response.
4. **Response**: the client receives the message, model info, usage stats, and timing.
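The request side of this flow can be sketched in TypeScript. The field names (`prompt`, `model`, `threadId`) are assumptions for illustration, not the actual DualMind API contract; omitting `model` stands in for requesting random selection.

```typescript
// Hypothetical single-chat request shape; field names are illustrative.
interface ChatRequest {
  prompt: string;
  model?: string;    // explicit model name; omitted => random selection
  threadId?: string; // optional: enables message logging for the thread
}

// Build a request body, leaving `model` out when random selection is wanted.
function buildChatRequest(prompt: string, model?: string): ChatRequest {
  return model ? { prompt, model } : { prompt };
}
```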

Characteristics

| Aspect | Behavior |
|--------|----------|
| Response time | ~1-3 seconds typical |
| Model choice | Explicit or random |
| Database writes | Message logged if threadId provided |
| Streaming support | Yes (via SSE endpoint) |
| Parallel execution | No (single model) |

Single chat is the foundation. Dual chat builds on this by running two parallel single-chat executions.

Dual Chat (Arena Mode)

Purpose

Dual chat compares two models with the same prompt under identical conditions. Users vote on which response is better, creating comparative quality data.

Use Cases:
  • Benchmarking model quality
  • Collecting user preferences
  • Building model leaderboards
  • Running blind comparisons (users don’t see model names until after voting)

How It Works

1. **Model Pairing**: the backend selects two models based on the selection mode (random, topper, or manual).
2. **Parallel Execution**: both models receive the identical prompt simultaneously via Task.WhenAll().
3. **Independent Fallback**: each model has its own 45s timeout and independent fallback chain.
4. **Arena Verdict**: the backend calculates a winner by response length and token count.
5. **Comparison Logging**: a comparison record is created with both responses and timing metrics.

Selection Modes

Mode: random

Selects two different models randomly from the active model pool.

Why use this?
  • Unbiased comparisons
  • Discovering unexpected model differences
  • Equal exposure for all models
  • Building diverse comparison dataset

Parallel Execution

Both models execute simultaneously, not sequentially.

Performance benefit:
Sequential: Model1 (2s) + Model2 (2s) = 4 seconds total
Parallel:   max(Model1 (2s), Model2 (2s)) = ~2 seconds total
If one model fails, the other completes independently. Partial failures result in single-model response rather than complete failure.
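A minimal TypeScript sketch of this pattern (the backend uses C#'s Task.WhenAll; `Promise.allSettled` is the closest analogue here). `callModel` is a hypothetical provider call, and the 45-second default mirrors the per-model timeout described above.

```typescript
type ModelResult = { model: string; text: string };

// Race a provider call against a timeout so a hung model can't block the arena.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error("timeout")), ms)),
  ]);
}

async function runArena(
  callModel: (model: string, prompt: string) => Promise<string>,
  models: [string, string],
  prompt: string,
  timeoutMs = 45_000,
): Promise<(ModelResult | null)[]> {
  // Both calls start immediately; each has its own timeout, so one failure
  // or timeout never blocks the other (the partial-failure behavior above).
  const settled = await Promise.allSettled(
    models.map((m) => withTimeout(callModel(m, prompt), timeoutMs)),
  );
  return settled.map((s, i) =>
    s.status === "fulfilled" ? { model: models[i], text: s.value } : null,
  );
}
```

A failed model yields `null` while the healthy one still returns, matching the single-model-response fallback described above.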

Arena Verdict

The system automatically computes a comparison verdict:
| Metric | Calculation |
|--------|-------------|
| Winner by length | Model with longer response text |
| Winner by tokens | Model with higher completion token count |
| Verdict text | Human-readable summary |

Example Verdict:
“Agent 2 (Mixtral) provided a slightly longer response with more detailed explanations”
Automatic verdict is informational only. User votes determine actual winner for statistics.
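The length and token comparison can be sketched as a pure function; the response shape and field names here are assumptions for illustration:

```typescript
// Hypothetical per-agent response shape for verdict computation.
interface AgentResponse {
  name: string;             // e.g. "Agent 1"
  text: string;             // full response text
  completionTokens: number; // completion token count from usage stats
}

// Compute the informational verdict: longer text and higher token count win.
function arenaVerdict(a: AgentResponse, b: AgentResponse) {
  const winnerByLength = a.text.length >= b.text.length ? a.name : b.name;
  const winnerByTokens = a.completionTokens >= b.completionTokens ? a.name : b.name;
  return {
    winnerByLength,
    winnerByTokens,
    verdict: `${winnerByLength} provided the longer response`,
  };
}
```

As noted above, this verdict is informational only; user votes determine the statistical winner.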

Key Differences

| Feature | Single Chat | Dual Chat |
|---------|-------------|-----------|
| Models executed | 1 | 2 |
| Execution | Sequential (with fallback) | Parallel |
| Comparison ID | None | Generated UUID |
| Voting support | No | Yes |
| Response structure | Single message | Two agent responses |
| Performance | Faster (1 model) | Slower (2 models) but parallelized |
| Database writes | 1 message | 2 messages + 1 comparison |

When to Use Each Mode

Choose Single Chat When:

  • Building a chatbot with consistent model behavior
  • Model identity is known and important
  • Minimizing latency (faster than dual-chat)
  • User doesn’t need to compare models
  • Streaming responses (SSE) are a priority

Choose Dual Chat When:

  • Quality comparison is the goal
  • Collecting user votes on model preference
  • Building leaderboards or benchmarks
  • Running blind tests (hide model names initially)
  • Research requires comparative data

Streaming Considerations

Single Chat: Full streaming support via the SSE endpoint.
Dual Chat: No streaming support currently.

Why no dual-chat streaming?
  • Complexity: Two parallel SSE streams harder to manage client-side
  • Use case: Arena comparisons typically need full responses for fair comparison
  • Future: Could support if use case emerges
For long prompts in dual-chat, consider using single-chat SSE endpoint twice sequentially if streaming is needed.
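Client-side, a single-chat SSE stream can be handled by splitting the standard `data:`-framed events. The framing below follows the SSE specification; the payload contents of the DualMind endpoint are not specified here, so they are treated as opaque strings.

```typescript
// Extract `data:` payloads from a raw SSE chunk (standard SSE framing).
// Payload contents depend on the endpoint; here they are treated as opaque.
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice("data: ".length));
}
```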

Database Persistence

Single Chat Writes

If threadId provided:
  • 1 row in thread_messages table
  • Model response, prompt, timing stored
  • Links to thread for conversation history

Dual Chat Writes

Always writes:
  • 1 row in comparisons table (comparison ID, both models, responses, timing)
  • 1 row in thread_messages (if threadId provided)
  • Links message to comparison via comparison_id foreign key
Future votes reference the comparison ID.

Model Selection Transparency

Single Chat

The model name is returned in the response, so the user always knows which model generated it.

Dual Chat

Models are identified as “agent1” and “agent2” in the response. Model names are included, enabling:
  • Revealed Arena: Show model names immediately
  • Blind Arena: Hide names until after vote (client-side logic)
For unbiased voting, hide model names until the user submits a vote. This prevents brand bias from affecting quality assessment.
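The blind-arena behavior is purely client-side logic; a minimal sketch, with assumed field names:

```typescript
// Hypothetical arena view state: real model names plus a vote flag.
interface ArenaView {
  agent1Model: string;
  agent2Model: string;
  hasVoted: boolean;
}

// Before the vote, show neutral labels; reveal real names only afterwards.
function displayNames(view: ArenaView): [string, string] {
  return view.hasVoted
    ? [view.agent1Model, view.agent2Model]
    : ["Agent 1", "Agent 2"];
}
```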

Next Steps

Model Selection

How models are chosen for inference

Voting System

How user votes affect statistics

Streaming Protocol

SSE implementation for single chat

Thread Management

Persisting conversations