Model Issues

You may encounter various issues when configuring and using LLM (Large Language Model) providers. This page covers troubleshooting methods for authentication, connection, rate limiting, and other common failures.

Symptom: 401 Unauthorized or Authentication failed error.

Cause: API Key invalid, expired, or not configured correctly.

Solutions:

Terminal window
# 1. Re-run interactive configuration wizard
hermes model
# 2. Manually check the API key (this prints it, so be careful not to leak it)
grep api_key ~/.hermes/config.yaml
# 3. Verify API Key format
# OpenRouter: sk-or-v1-xxxxx
# Anthropic: sk-ant-xxxxx
# OpenAI: sk-xxxxx
# DeepSeek: sk-xxxxx
# Kimi: sk-xxxxx
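The prefix rules above can be turned into a quick local check. The following is a minimal sketch, not part of Hermes: `guess_provider` and `mask_key` are hypothetical helper names, and the prefixes are only the ones listed above, so a key that matches can still be expired or revoked.

```bash
# Guess which provider a key belongs to from its prefix.
# The more specific prefixes must come first, because every key starts with "sk-".
guess_provider() {
  case "$1" in
    sk-or-v1-*) echo "openrouter" ;;
    sk-ant-*)   echo "anthropic" ;;
    sk-*)       echo "openai, deepseek, or kimi" ;;
    *)          echo "unknown" ;;
  esac
}

# Print only the first 8 characters of a key, so it can be shared in a bug report.
mask_key() {
  printf '%.8s...\n' "$1"
}
```

For example, `guess_provider "$OPENROUTER_API_KEY"` should print `openrouter`; any other answer suggests the wrong key was pasted into the config.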

If using OpenRouter, confirm you’re using an OpenRouter Key, not the original Anthropic or OpenAI Key:

~/.hermes/config.yaml
llm:
  provider: openrouter
  model: anthropic/claude-sonnet-4-20250514
  api_key: sk-or-v1-... # Must be OpenRouter Key

Get OpenRouter API Key: openrouter.ai/keys

Symptom: Connection timeout, Network unreachable, or an error after a long unresponsive period.

Cause: Network unreachable, proxy not set, or provider service outage.

Solutions:

Terminal window
# 1. Test network connectivity
curl -I https://openrouter.ai/api/v1/models
curl -I https://api.anthropic.com
curl -I https://api.openai.com
# 2. If proxy needed
export HTTPS_PROXY=http://127.0.0.1:7890
export HTTP_PROXY=http://127.0.0.1:7890
# 3. Windows PowerShell proxy settings
# $env:HTTPS_PROXY="http://127.0.0.1:7890"
# 4. Switch to a provider that is reachable from your network
hermes model # Choose DeepSeek / Kimi / SiliconFlow / Qwen
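When a `curl` test like the ones above fails, its exit code usually says which of the listed causes applies. The sketch below maps the common curl exit codes to a short diagnosis; `explain_curl_exit` and `check_endpoint` are hypothetical helper names, and the timeout values are arbitrary choices.

```bash
# Translate a curl exit code into a short diagnosis.
explain_curl_exit() {
  case "$1" in
    0)     echo "ok" ;;
    5)     echo "proxy host could not be resolved" ;;
    6)     echo "DNS lookup failed" ;;
    7)     echo "connection refused or host unreachable" ;;
    28)    echo "timed out" ;;
    35|60) echo "TLS handshake or certificate problem" ;;
    *)     echo "curl exit code $1" ;;
  esac
}

# Send a HEAD request with explicit timeouts and explain the outcome.
# curl picks up HTTPS_PROXY / HTTP_PROXY from the environment on its own.
check_endpoint() {
  local rc
  curl -s -o /dev/null -I --connect-timeout 5 --max-time 15 "$1"
  rc=$?
  explain_curl_exit "$rc"
}
```

Running `check_endpoint https://openrouter.ai/api/v1/models` before and after exporting the proxy variables shows whether the proxy is what fixes the connection.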

Symptom: 429 Too Many Requests or Rate limit exceeded error.

Cause: Too many requests sent in a short time, or the free tier quota is exhausted.

Solutions:

  1. Wait and retry: requests usually succeed again after about 60 seconds
  2. Upgrade plan: move to a plan with higher limits in the provider console
  3. Configure auto-retry and fallback models:
~/.hermes/config.yaml
llm:
  provider: openrouter
  model: anthropic/claude-sonnet-4-20250514
  retry:
    max_retries: 3
    retry_delay: 5
  fallback_models:
    - google/gemini-2.5-pro
    - deepseek/deepseek-chat
  4. Use OpenRouter: it aggregates 100+ models and automatically routes to alternatives when one hits its limits
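One way to read the retry and fallback settings above: retry the primary model a few times with a pause between attempts, then move down the fallback list. The sketch below illustrates that control flow in shell. It is not Hermes's implementation, the helper names `with_retry` and `try_models` are hypothetical, and it assumes the retry delay is given in seconds.

```bash
# Run a command; on failure, retry up to MAX_RETRIES more times,
# sleeping DELAY seconds between attempts.
# usage: with_retry MAX_RETRIES DELAY command [args...]
with_retry() {
  local max_retries="$1" delay="$2" attempt=0
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_retries" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Try each model in order and print the first one for which CALL_FN succeeds.
# CALL_FN is any command that takes a model ID and exits non-zero on failure.
# usage: try_models CALL_FN MAX_RETRIES DELAY model1 [model2 ...]
try_models() {
  local call_fn="$1" max="$2" delay="$3" model
  shift 3
  for model in "$@"; do
    if with_retry "$max" "$delay" "$call_fn" "$model"; then
      echo "$model"
      return 0
    fi
  done
  return 1
}
```

With a request wrapper named, say, `call_model`, the config above would correspond to `try_models call_model 3 5 anthropic/claude-sonnet-4-20250514 google/gemini-2.5-pro deepseek/deepseek-chat`.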

Symptom: Model not found or selected model is offline.

Cause: The model identifier is misspelled, the provider discontinued the model, or the model name does not exist.

Solutions:

Terminal window
# 1. View available model list
hermes model # Interactive selection, only shows available models
# 2. Query using OpenRouter API (requires API Key)
curl https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"
# 3. Note model ID format
# Models on OpenRouter need provider prefix:
# ✅ anthropic/claude-sonnet-4-20250514
# ✅ google/gemini-2.5-pro
# ❌ claude-sonnet-4-20250514 (missing prefix)

Common model identifiers:

| Provider | Model ID Example |
| --- | --- |
| Anthropic | anthropic/claude-sonnet-4-20250514 |
| OpenAI | openai/gpt-4o |
| Google | google/gemini-2.5-pro |
| DeepSeek | deepseek/deepseek-chat |
| Meta | meta-llama/llama-3.3-70b-instruct |
| Mistral | mistral/mistral-large-latest |
| Kimi | moonshot/moonshot-v1 |
| Qwen | qwen/qwen-3-235b-a22b |
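The provider-prefix rule for OpenRouter model IDs is easy to check before editing the config. A minimal sketch, with `has_provider_prefix` as a hypothetical helper name; it only checks the `provider/model` shape, not whether the model actually exists.

```bash
# Succeeds if the ID looks like "provider/model", with both parts non-empty.
has_provider_prefix() {
  case "$1" in
    */*) [ -n "${1%%/*}" ] && [ -n "${1#*/}" ] ;;
    *)   return 1 ;;
  esac
}
```

For example, `has_provider_prefix claude-sonnet-4-20250514 || echo "missing provider prefix"` prints the warning, while the prefixed form passes silently.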

Symptom: Agent answers inaccurately, makes up information, or has confused logic.

Cause: The selected model is too weak, the context is long enough that information gets lost, or the prompts are unclear.

Solutions:

  1. Upgrade model: use a stronger model (such as Claude Sonnet 4 or GPT-4o)
  2. Optimize prompts: refer to the Prompt Guide
  3. Control context length: regularly use /clear to clear the conversation history
  4. Use system prompt guidance: add a role description in the configuration:
~/.hermes/config.yaml
agent:
  system_prompt: |
    You are a precise, rigorous assistant.
    Confirm facts before answering, clearly state when uncertain.
    Respond in English.

Symptom: Connection to local model fails.

Cause: Local service not started, incorrect port, or model not downloaded.

Solutions:

Terminal window
# Ollama
ollama serve # Start Ollama service
ollama pull llama3.3 # Download model
curl http://localhost:11434/api/tags # Verify service running
# vLLM
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3.3-70B-Instruct \
  --port 8000
# Configure Hermes to connect to local model
hermes model # Select Ollama or vLLM
# Or configure manually in ~/.hermes/config.yaml
llm:
  provider: ollama
  model: llama3.3
  base_url: http://localhost:11434
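The three causes listed for local models (service not started, wrong port, model not downloaded) can be told apart with one request to the Ollama tags endpoint used above. The sketch below is an illustration only: `model_in_tags` and `check_ollama` are hypothetical helpers, and it assumes the response lists each model under a `"name"` field such as `llama3.3:latest`.

```bash
# Succeeds if the JSON on stdin lists a model named MODEL or MODEL:<tag>.
# grep matching is a shortcut; a JSON parser such as jq would be more robust.
# Dots in MODEL act as regex wildcards, which is acceptable for a quick check.
model_in_tags() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1(:[^\"]*)?\""
}

# Report whether the service answers and whether MODEL has been pulled.
# usage: check_ollama MODEL [BASE_URL]
check_ollama() {
  local model="$1" base_url="${2:-http://localhost:11434}" tags
  if ! tags=$(curl -s --max-time 5 "$base_url/api/tags"); then
    echo "service not reachable at $base_url (try: ollama serve)"
    return 1
  fi
  if ! printf '%s' "$tags" | model_in_tags "$model"; then
    echo "model $model not found (try: ollama pull $model)"
    return 1
  fi
  echo "ok"
}
```

Calling `check_ollama llama3.3` with the same model name and base URL as in the config narrows the failure down to the service or the model.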

Symptom: context_length_exceeded or maximum context length error.

Cause: Conversation history too long, exceeding model’s context window.

Solutions:

Terminal window
# 1. Clear the current conversation (slash command, typed inside the chat)
/clear
# 2. Start a new session
hermes chat --new
# 3. Enable automatic summary compression in the config
~/.hermes/config.yaml
agent:
  auto_summarize: true
  summarize_threshold: 80000 # Token threshold
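To get a feel for whether a saved conversation is near a threshold like the 80000 tokens above, a rough size check is often enough. The sketch below assumes about 4 bytes per token, a common rule of thumb for English text rather than a real tokenizer. `estimate_tokens` and `over_threshold` are hypothetical helpers, not Hermes commands.

```bash
# Roughly estimate the token count of a text file as bytes / 4, rounded up.
estimate_tokens() {
  local bytes
  bytes=$(wc -c < "$1")
  echo $(( (bytes + 3) / 4 ))
}

# Succeeds if the estimated token count of FILE is at least THRESHOLD.
# usage: over_threshold FILE THRESHOLD
over_threshold() {
  [ "$(estimate_tokens "$1")" -ge "$2" ]
}
```

For example, `over_threshold transcript.txt 80000 && echo "consider /clear"`, where `transcript.txt` stands for any exported conversation. Text in other languages or with a lot of code can tokenize quite differently, so treat the result as an order-of-magnitude hint.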