
Compute

GPU Instances — Dedicated AI compute capacity.

Deploy open-source models on dedicated GPU hardware. Billed hourly, OpenAI-compatible endpoint, no shared resources.

Available GPUs

GPU              | VRAM       | Suitable for                        | Price / hour
NVIDIA RTX 4090  | 24 GB      | Llama 3.1 8B, Mistral 7B, Qwen 7B   | €0.39
NVIDIA RTX A6000 | 48 GB      | Llama 3.1 70B (Q4), Mixtral 8x7B    | €0.79
NVIDIA A100 80GB | 80 GB      | Llama 3.1 70B (FP16), 405B (Q4)     | €1.99
NVIDIA H100 SXM  | 80 GB HBM3 | Llama 3.1 405B, training            | On request

All prices net, excluding VAT. Hourly billing, cancellable at any time.

Supported open-source models

Llama 3.1 8B
Llama 3.1 70B
Llama 3.1 405B
Mistral 7B
Mixtral 8x7B
Qwen 2.5 7B
Qwen 2.5 72B
DeepSeek Coder
Phi-3 Mini

Setup in 60 seconds

1. Choose GPU & model

Select GPU type and model in the dashboard. Mycelis starts the instance automatically.

2. Generate API key

Create a personal access token (PAT) — takes less than 10 seconds.

3. Start immediately

Change base_url and api_key in your existing code. Done.

OpenAI-compatible endpoint

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mycelis.io/proxy/v1",
    api_key="pat_..."  # your personal access token
)

response = client.chat.completions.create(
    model="llama-3.1-70b",  # your deployment name
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
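If the proxy also implements the standard streaming flag of the Chat Completions API (not confirmed on this page, so treat streaming support as an assumption), the same client can stream tokens as they are generated:

Python
# Streaming variant; assumes the OpenAI-compatible proxy accepts stream=True.
stream = client.chat.completions.create(
    model="llama-3.1-70b",  # your deployment name
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()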

GDPR-compliant by design

All GPU instances run on dedicated hardware — no shared resources, no data forwarding to third parties. Prompts and responses are not stored permanently. Data centers are in the EU. Full data ownership stays with the user.

Frequently asked questions

Which GPUs are available?

RTX 4090 (€0.39/h), RTX A6000 (€0.79/h), A100 80GB (€1.99/h). H100 SXM is available on request for training workloads.

Can I host multiple models on one GPU instance?

Each instance runs exactly one deployment. For multiple models, start multiple instances — or use VirtualModels with smart routing to switch between instances.
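A minimal client-side sketch of the multi-instance path, using the same proxy endpoint shown above (the deployment names below are hypothetical examples, and this does not cover the VirtualModels routing feature itself):

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mycelis.io/proxy/v1",
    api_key="pat_..."  # your personal access token
)

def ask(deployment: str, prompt: str) -> str:
    # Select a specific instance by passing its deployment name as the model.
    response = client.chat.completions.create(
        model=deployment,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Two separate instances, each running exactly one deployment (names are illustrative).
print(ask("llama-3.1-8b", "Summarize this support ticket."))
print(ask("qwen-2.5-72b", "Draft a detailed architecture review."))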

How are instances billed?

Hourly billing, calculated to the minute. An instance running 2.5 hours is billed as 2.5 × hourly rate. No minimum, no setup fee.
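As a quick worked example of the per-minute calculation (the rate is taken from the table above):

Python
def instance_cost(minutes: float, hourly_rate_eur: float) -> float:
    # Runtime is prorated to the minute: minutes × (hourly rate / 60).
    return minutes / 60 * hourly_rate_eur

# 2.5 hours (150 minutes) on an A100 80GB at €1.99/h is roughly €4.98.
print(instance_cost(150, 1.99))  # prints 4.975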

Can I switch the model after deployment?

No — a deployment is tied to a specific model. For a different model, simply start a new instance and stop the old one.

Ready to deploy your first model?

No credit card required. Free starter credits included.

Start for free