HERMES AGENT MIXTURE OF AGENTS
COMBINES MULTIPLE MODELS INTO ONE ANSWER.
8% HIGHER THAN OPUS 4.8.
11% HIGHER THAN GPT-5.5.
NO GATED ACCESS REQUIRED.
Mixture of Agents (MoA) runs multiple models
on the same query in parallel.
an aggregator model synthesizes all responses
into one answer that outperforms
any single model alone.
Nous Research benchmarks:
8% higher than Opus 4.8.
11% higher than GPT-5.5.
(benchmark not yet published. numbers are from the official announcement.)
HOW IT WORKS:
you select a MoA preset as your model.
Hermes fans out your query to 2-3 reference models.
each responds independently.
the aggregator reads all responses
and synthesizes one final answer.
you see one response. behind it: multiple perspectives.
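the flow above, as a minimal Python sketch. this is not Hermes code:
the Model type, the stub models, and the aggregator prompt wording
are all assumptions made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in: a "model" is any function from prompt text to reply text.
Model = Callable[[str], str]


def mixture_of_agents(query: str, references: List[Model], aggregator: Model) -> str:
    """Fan the query out to every reference model in parallel, then aggregate."""
    with ThreadPoolExecutor(max_workers=max(1, len(references))) as pool:
        # Each reference model sees only the original query, never its peers.
        drafts = list(pool.map(lambda model: model(query), references))

    # The aggregator reads every draft and writes the single final answer.
    numbered = "\n".join(f"[{i}] {draft}" for i, draft in enumerate(drafts, start=1))
    prompt = (
        f"Question: {query}\n\n"
        f"Candidate answers:\n{numbered}\n\n"
        "Write one final answer."
    )
    return aggregator(prompt)


# Stub models so the sketch runs offline.
def model_a(prompt: str) -> str:
    return "use a queue"


def model_b(prompt: str) -> str:
    return "use a cron job"


def stub_aggregator(prompt: str) -> str:
    # A real aggregator would be an LLM call; this stub only counts the drafts it was shown.
    return f"synthesized from {prompt.count('[')} drafts"


if __name__ == "__main__":
    print(mixture_of_agents("how should I schedule retries?", [model_a, model_b], stub_aggregator))
```

the caller gets back one string. the drafts never leave the function.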
DEFAULT PRESET:
moa:
  default_preset: default
  presets:
    default:
      reference_models:
        - provider: openai-codex
          model: gpt-5.5
        - provider: openrouter
          model: deepseek/deepseek-v4-pro
      aggregator:
        provider: openrouter
        model: anthropic/claude-opus-4.8
      reference_temperature: 0.6
      aggregator_temperature: 0.4
      max_tokens: 4096
      enabled: true
two reference models generate diverse responses.
Opus aggregates them into one answer.
the goal: an answer better than any of the three would give alone.
SETUP:
Desktop app / Dashboard → Models → MoA presets
CLI: hermes moa configure
create named presets for different tasks:
hermes moa configure # default preset
hermes moa configure review # create "review" preset
hermes moa configure research # create "research" preset
hermes moa list # see all presets
hermes moa delete review # remove a preset
BUILD YOUR OWN PRESETS:
CHEAP RESEARCH (2 models, budget):
presets:
  research_lite:
    reference_models:
      - provider: openrouter
        model: deepseek/deepseek-v4
      - provider: openrouter
        model: google/gemini-2.5-flash
    aggregator:
      provider: openrouter
      model: anthropic/claude-sonnet-4.6
diverse perspectives at budget prices.
Sonnet aggregates. good enough for daily research.
MAXIMUM QUALITY (3 models, premium):
presets:
  full_power:
    reference_models:
      - provider: openai-codex
        model: gpt-5.5
      - provider: openrouter
        model: deepseek/deepseek-v4-pro
      - provider: openrouter
        model: google/gemini-2.5-pro
    aggregator:
      provider: openrouter
      model: anthropic/claude-opus-4.8
three frontier models + Opus aggregation.
this is the preset that beats benchmarks.
MID-SESSION TOGGLE:
/moa # toggle MoA on/off
/moa research # switch to named preset
/moa off # disable, use plain model
working on routine code? MoA off.
hit a hard architectural problem? /moa on.
one command. no config edit. no restart.
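the toggle semantics, sketched as a tiny state machine in Python.
MoaState and handle_moa_command are hypothetical names, not Hermes
internals. the assumption: bare /moa flips the switch, off and on set
it, and any other word selects a named preset and turns MoA on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoaState:
    """Hypothetical session state: whether MoA is on and which preset is active."""
    enabled: bool = False
    preset: str = "default"


def handle_moa_command(state: MoaState, line: str) -> MoaState:
    """Return the new session state after a /moa command (assumed semantics)."""
    # Drop any trailing "# comment" so annotated examples parse too.
    parts = line.split("#", 1)[0].split()
    if not parts or parts[0] != "/moa":
        return state  # not a /moa command: nothing changes
    if len(parts) == 1:
        return MoaState(enabled=not state.enabled, preset=state.preset)  # toggle
    arg = parts[1]
    if arg == "off":
        return MoaState(enabled=False, preset=state.preset)
    if arg == "on":
        return MoaState(enabled=True, preset=state.preset)
    # Any other argument is treated as a preset name.
    return MoaState(enabled=True, preset=arg)


if __name__ == "__main__":
    state = MoaState()
    for command in ["/moa", "/moa research", "/moa off"]:
        state = handle_moa_command(state, command)
        print(command, "->", state)
```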
SAFETY RAILS:
→ aggregator cannot be another MoA preset
(recursive MoA trees blocked)
→ enabled: false disables fan-out
(aggregator acts as a plain model)
→ each reference model runs in parallel
(wall clock ≈ slowest model, not sum)
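the three rails, as a hedged Python sketch over plain dicts shaped
like the presets in this post. how Hermes actually detects a recursive
aggregator is not stated here, so the check below (aggregator provider
is "moa", or its model names another preset) is an assumption, and all
function names are hypothetical.

```python
from typing import Dict, List


def validate_presets(presets: Dict[str, dict]) -> None:
    """Reject any preset whose aggregator points at another MoA preset."""
    for name, preset in presets.items():
        aggregator = preset["aggregator"]
        # Assumed detection rule, for illustration only.
        if aggregator.get("provider") == "moa" or aggregator.get("model") in presets:
            raise ValueError(f"preset {name!r}: aggregator cannot be a MoA preset")


def plan_calls(preset: dict) -> List[str]:
    """List the model calls one query triggers under this preset."""
    aggregator_model = preset["aggregator"]["model"]
    if not preset.get("enabled", True):
        # enabled: false -> no fan-out, the aggregator answers as a plain model.
        return [aggregator_model]
    return [ref["model"] for ref in preset["reference_models"]] + [aggregator_model]


def reference_stage_latency(latencies_s: List[float]) -> float:
    """Parallel fan-out finishes when the slowest reference model does."""
    return max(latencies_s)


if __name__ == "__main__":
    default = {
        "reference_models": [
            {"provider": "openai-codex", "model": "gpt-5.5"},
            {"provider": "openrouter", "model": "deepseek/deepseek-v4-pro"},
        ],
        "aggregator": {"provider": "openrouter", "model": "anthropic/claude-opus-4.8"},
        "enabled": True,
    }
    validate_presets({"default": default})
    print(plan_calls(default))
    print(plan_calls({**default, "enabled": False}))
    # Made-up latencies in seconds: parallel waits 5.0, sequential would wait 10.0.
    print(reference_stage_latency([2.0, 5.0, 3.0]), sum([2.0, 5.0, 3.0]))
```

note the latency figure covers the reference stage only. the
aggregator call still runs after it.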
WHERE MoA MAKES SENSE:
→ complex architecture decisions
→ research synthesis across diverse sources
→ code review where one model misses edge cases
→ critical content that needs multi-perspective verification
→ any task where "second opinion" matters
WHERE MoA IS OVERKILL:
→ simple file edits
→ routine web searches
→ cron jobs and monitoring
→ tasks where speed matters more than depth
MoA multiplies your token cost by roughly the number
of reference models, plus one aggregator call that
reads every response. use it for the 10% of tasks
where quality matters most.
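a back-of-envelope token count in Python. assumptions, all for
illustration: every call costs prompt + reply tokens, every reply is
the same length, and the aggregator's input is the original prompt
plus every reference reply. real pricing differs per model, so treat
the ratio as a rough guide, not a quote.

```python
def single_model_tokens(prompt_tokens: int, reply_tokens: int) -> int:
    """Tokens for one plain model call."""
    return prompt_tokens + reply_tokens


def moa_tokens(prompt_tokens: int, reply_tokens: int, n_references: int) -> int:
    """Tokens for one MoA query under the assumptions stated above."""
    # Every reference model reads the prompt and writes its own reply.
    reference_total = n_references * (prompt_tokens + reply_tokens)
    # The aggregator reads the prompt plus all replies, then writes the final answer.
    aggregator_total = prompt_tokens + n_references * reply_tokens + reply_tokens
    return reference_total + aggregator_total


if __name__ == "__main__":
    prompt, reply = 1000, 500  # made-up sizes
    base = single_model_tokens(prompt, reply)
    for n in (2, 3):
        total = moa_tokens(prompt, reply, n)
        print(f"{n} reference models: {total} tokens, {total / base:.2f}x a single call")
```

with these made-up sizes, 2 reference models come to 5500 tokens
against 1500 for a single call, and 3 come to 7500.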