
Compute

GPU Instances — Dedicated AI compute capacity.

Deploy open-source models on dedicated GPU hardware. Billed hourly, OpenAI-compatible endpoint, no shared resources.

Available GPUs

GPU              | VRAM       | Suitable for                        | Price / hour
NVIDIA RTX 4090  | 24 GB      | Llama 3.1 8B, Mistral 7B, Qwen 7B   | €0.39
NVIDIA RTX A6000 | 48 GB      | Llama 3.1 70B (Q4), Mixtral 8x7B    | €0.79
NVIDIA A100 80GB | 80 GB      | Llama 3.1 70B (FP16), 405B (Q4)     | €1.99
NVIDIA H100 SXM  | 80 GB HBM3 | Llama 3.1 405B, training            | On request

All prices net, excluding VAT. Hourly billing, cancellable at any time.

Supported open-source models

Llama 3.1 8B
Llama 3.1 70B
Llama 3.1 405B
Mistral 7B
Mixtral 8x7B
Qwen 2.5 7B
Qwen 2.5 72B
DeepSeek Coder
Phi-3 Mini

Setup in 60 seconds

1. Choose GPU & model

Select GPU type and model in the dashboard. Mycelis starts the instance automatically.

2. Generate API key

Create a personal access token (PAT) — takes less than 10 seconds.

3. Start immediately

Change base_url and api_key in your existing code. Done.

OpenAI-compatible endpoint

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mycelis.io/proxy/v1",
    api_key="pat_..."  # your personal access token
)

response = client.chat.completions.create(
    model="llama-3.1-70b",  # your deployment name
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
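If the proxy also implements the standard streaming flag of the Chat Completions API (not confirmed on this page, so treat streaming support as an assumption), the same client can stream tokens as they are generated:

Python
# Streaming variant; assumes the OpenAI-compatible proxy accepts stream=True.
stream = client.chat.completions.create(
    model="llama-3.1-70b",  # your deployment name
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()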

GDPR-compliant by design

All GPU instances run on dedicated hardware — no shared resources, no data forwarding to third parties. Prompts and responses are not stored permanently. Data centers are in the EU. Full data ownership stays with the user.

Frequently asked questions

Which GPUs are available?

RTX 4090 (€0.39/h), RTX A6000 (€0.79/h), A100 80GB (€1.99/h). H100 SXM is available on request for training workloads.

Can I host multiple models on one GPU instance?

Each instance runs exactly one deployment. For multiple models, start multiple instances — or use VirtualModels with smart routing to switch between instances.
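A minimal client-side sketch of the multi-instance path, using the same proxy endpoint shown above (the deployment names below are hypothetical examples, and this does not cover the VirtualModels routing feature itself):

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mycelis.io/proxy/v1",
    api_key="pat_..."  # your personal access token
)

def ask(deployment: str, prompt: str) -> str:
    # Select a specific instance by passing its deployment name as the model.
    response = client.chat.completions.create(
        model=deployment,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Two separate instances, each running exactly one deployment (names are illustrative).
print(ask("llama-3.1-8b", "Summarize this support ticket."))
print(ask("qwen-2.5-72b", "Draft a detailed architecture review."))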

How are instances billed?

Hourly billing, calculated to the minute. An instance running 2.5 hours is billed as 2.5 × hourly rate. No minimum, no setup fee.
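As a quick worked example of the per-minute calculation (the rate is taken from the table above):

Python
def instance_cost(minutes: float, hourly_rate_eur: float) -> float:
    # Runtime is prorated to the minute: minutes × (hourly rate / 60).
    return minutes / 60 * hourly_rate_eur

# 2.5 hours (150 minutes) on an A100 80GB at €1.99/h is roughly €4.98.
print(instance_cost(150, 1.99))  # prints 4.975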

Can I switch the model after deployment?

No — a deployment is tied to a specific model. For a different model, simply start a new instance and stop the old one.

Ready to deploy your first model?

No credit card required. Free starter credits included.

Start for free