Platform

Products

Private AI infrastructure — from compute to agents.

Target groups

Use Cases

For enterprise, SMBs, and individual developers.

Knowledge & Support

Resources

Everything you need to succeed with Mycelis.

Tutorial

"Demo: Multi-model coding agent with automatic fallback"

May 15, 2025 · 12 min min read

In this tutorial you will build a production-ready coding agent from three models: a self-hosted open-source model, a commercial top-tier model, and a cost-efficient fallback. Routing is fully automatic — no custom code required.

What we are building

  • Gemma 4 as a self-hosted deployment (zero per-token cost)
  • Claude Opus 4.6 via BYOK for complex tasks
  • DeepSeek-V3 via BYOK as a cheap mid-tier fallback
  • A virtual model that bundles all three
  • An agent with rule-based routing and Smart Dispatcher
  • OpenCode using the Mycelis proxy as its backend

Step 1: Deploy Gemma 4

Go to Compute → Deployments → New Deployment.

Select gemma-4 (or the available Gemma 4 variant in your cluster) as the model. Give the deployment a clear name like gemma4-coding-local. Start the deployment — it runs on your own GPU and produces no variable token costs.

Tip: Gemma 4 excels at autocomplete, short explanations, and lightweight refactoring. Roughly 60–65% of typical coding requests fall into this category.


Step 2: Deploy Claude Opus 4.6 (BYOK)

Go to Compute → Deployments → New Deployment → BYOK.

Select Anthropic as the provider and claude-opus-4-6 as the model. Enter your Anthropic API key and name the deployment claude-opus-coding. Save.

Reserve Claude Opus 4.6 for architecture decisions, complex debugging with stack traces, and deep reasoning tasks.


Step 3: Deploy DeepSeek-V3 (BYOK)

Go to Compute → Deployments → New Deployment → BYOK again.

Select DeepSeek as the provider and deepseek-chat (DeepSeek-V3) as the model. Enter your DeepSeek API key, name: deepseek-v3-coding. Save.

DeepSeek-V3 costs a fraction of Claude and handles standard coding work — bug fixes, unit tests, mid-complexity refactoring — reliably.


Step 4: Create a virtual model

Go to Models → New Virtual Model.

  • Name: coding-agent
  • Slug: coding-agent (used later in the OpenCode config)
  • Add deployments: all three — gemma4-coding-local, claude-opus-coding, deepseek-v3-coding

A virtual model bundles multiple deployments behind a stable slug. Clients always target the same endpoint; routing happens transparently underneath.


Step 5: Create an agent and choose a strategy

Go to Agents → New Agent.

  • Name: Multi-Model Coding Agent
  • Virtual model: coding-agent
  • Strategy: Rule-based

The rule-based strategy evaluates a priority-ordered list of conditions for every request and routes to the matching deployment. If no rule matches, the Smart Dispatcher steps in as a fallback.


Step 6: Configure routing rules

Under Routing Rules in the agent, add the following three rules in order — priority matters:

Rule 1 – Simple tasks to Gemma 4

Field Value
Condition Keywords contain: autocomplete, explain, comment, rename, snippet
Target deployment gemma4-coding-local
Priority 1 (highest)

Rule 2 – Complex tasks to Claude Opus 4.6

Field Value
Condition Keywords contain: architecture, design, stacktrace, debug, migration, performance, security OR estimated tokens > 4000
Target deployment claude-opus-coding
Priority 2

Rule 3 – Standard coding to DeepSeek (default)

Field Value
Condition Always true (default fallback rule)
Target deployment deepseek-v3-coding
Priority 3 (lowest)

Smart Dispatcher as a safety net: When no rule matches — for example because all deployments are temporarily unavailable or the rule logic leaves a gap — the Smart Dispatcher analyzes the request and selects the most cost-efficient available deployment automatically.


Step 7: Create an API key in Mycelis

Go to Settings → API Keys → New API Key.

  • Name: opencode-local
  • Permissions: Inference (minimum)
  • Click Create and copy the generated key — it is shown only once.

This key authorizes OpenCode to send requests through your Mycelis workspace.


Step 8: Configure OpenCode with the Mycelis proxy

Open your OpenCode configuration file (~/.config/opencode/config.json or opencode.json in your project root).

Add a new provider entry:

{
  "providers": {
    "mycelis": {
      "name": "Mycelis",
      "apiKey": "mc_your_api_key_here",
      "baseURL": "https://mycelis.ai/api/proxy/v1"
    }
  },
  "model": "mycelis/coding-agent"
}

Replace mc_your_api_key_here with the key from step 7 and coding-agent with your virtual model's slug.

Restart OpenCode. All requests now flow through Mycelis, and routing decides in the background which of the three models responds.


Result

You now have a coding agent that:

  1. Answers simple requests for free on your own GPU (Gemma 4)
  2. Forwards complex architecture questions to Claude Opus 4.6
  3. Sends everything else to DeepSeek-V3 at a fraction of Claude's cost
  4. Falls back to the Smart Dispatcher when no rule fires, automatically picking the cheapest suitable model
  5. Logs every routing decision in the Dashboard under Smart Routing Insights

For a typical coding workload this setup saves 60–70% of API costs compared to a single-model setup, with no compromise on output quality.

Back to overview