Pricing

Predictable On-Prem AI Appliance Pricing

Every quote is sized to your actual usage, hardware footprint and SLA. No per-token billing. No hidden hardware margin. A setup fee and service retainer give you a clear 36-month TCO against cloud AI spend.

01 — Pricing floor

Three guideposts.

Setup, retainer, and zero hardware commission — every customer starts from these three principles.

Pricing floor
From €5,000
One-time setup & configuration. Discovery, install, identity federation, connector wiring, validation, training.
From €3,000 / month
Recurring service retainer — scales with your hardware footprint, user count, and SLA. 20% discount for annual prepay.
0% on hardware
Zero commission on the Supermicro pass-through. Your infrastructure investment stays with you, not us.
02 — Tiers

Three tiers for every enterprise.

Custom-quoted to your discovery and usage audit — never a published price.

Gateway
No GPU · gateway-only
Teams already using external LLM APIs who need a private gateway, usage tracking, and PII redaction — but not local inference yet.
  • LiteLLM gateway + router
  • LibreChat chat surface
  • Multi-provider model routing
  • Usage tracking, audit logs, cost attribution
  • Microsoft Presidio PII redaction
  • No GPU — no local inference
SME Appliance
Single GPU · fully local AI
Mid-size companies wanting fully local AI for engineering teams of 10–100 users.
  • Everything in Gateway tier
  • SGLang + vLLM inference engines
  • OpenClaw / NemoClaw agentic layer
  • Local RAG + knowledge base
  • Open Notebook research agent
  • Governed workflow runtime + scheduled jobs
  • Single-server Supermicro appliance
  • Runs open-weight models on-prem
Enterprise
Multi-node · HA · custom SLA
Large enterprises and regulated industries needing high-availability, horizontal scaling, and dedicated SLAs.
  • Everything in SME tier
  • High-availability multi-node setup
  • Horizontal scaling support
  • Dedicated SLA & on-call
  • Custom integrations on request
  • Air-gapped / regulated-industry options
03 — Deployment modes

Two ways to deploy.
Same sovereignty story.

Both modes keep you the owner of the hardware. Both keep your data plane fully out of our visibility.

Deployment mode · 01

On-premise (default).

Appliance installed at your site or data centre. Best for existing infrastructure, regulated workloads with strict physical-control requirements, and full operational independence.

Deployment mode · 02

Hosted in our Croatian DC.

Same appliance, same sovereignty story — racked in our Croatian data centre. You still own the hardware. Best for fully-remote teams without on-site infrastructure.

04 — Build vs. partner

The build-it-yourself maths.

What sovereign on-prem AI actually costs to ship internally — versus what we charge to ship it for you. Year 1, like-for-like, SME profile.

Option A · DIY

Build it in-house.

Hire the team. Stitch the OSS. Survive the audit.

  • 2× ML / Infra engineers (loaded)€220K
  • 1× DevOps / SRE€100K
  • 1× Security & compliance lead€80K
  • Legal review — EU AI Act, GDPR, NIS2, EU Data Act€60K
  • Hardware procurement, racking, integration€100K
  • NVIDIA AI Enterprise — runtime tax (per 8-GPU server)€36K / yr
  • 6 months stitching, debugging, on-call burnoutopportunity
  • Delivery + key-person riskunbounded
Year 1 typical €600K – €1M+
Option B · LLM Machines

Partner with us.

Same outcome. None of the headcount.

  • Setup & configuration (one-time)From €5K
  • Service retainerFrom €3K / mo
  • Hardware (Supermicro, pass-through)0% commission
  • NVIDIA AI Enterprise tax€0 — pure OSS
  • EU compliance — built inincluded
  • Time to live4–6 weeks
  • Delivery & key-person riskon us, not you
  • Extend in T3 sandbox without breaking SLAyes
Year 1 typical ~€41K + HW
Build in-house
€600K – €1M+
Partner with us
~€41K + HW
SME deployment, Year 1, retainer at €3K floor. Hardware passed through at zero commission — your infrastructure investment stays with you. ~6.5% of the in-house total without counting NVIDIA AI Enterprise runtime fees.
05 — Cloud vs. on-prem

Flat ownership cost beats runaway usage bills.

Public-cloud AI looks flexible until sensitive data, audit requirements, rate limits and token usage all become board-level concerns.

Cloud AI APIs LLM Machines
Data controlPrompts leave your perimeterData stays inside your environment
Cost modelUsage-based token spendSetup + service retainer + owned hardware
Vendor lock-inProvider-specific APIs and policiesOpen-source stack and OpenAI-compatible gateway
Deployment timeFast to start, slower to govern4–6 week controlled deployment
AuditabilityLimited to provider logs and exportsLogs, routing and controls inside your perimeter
06 — FAQ

Pricing questions.

How quotes, hardware, retainers and scaling work before a commercial conversation.

What does the setup fee cover?

Setup covers discovery, appliance configuration, installation, identity federation, connector wiring, validation, training and handoff.

What does the monthly retainer cover?

The retainer covers ongoing service, support, updates and operational partnership, scaled to your hardware footprint, users and SLA.

Do you add margin to hardware?

No. Hardware is passed through at zero commission, so your infrastructure investment stays with you.

Is there an annual prepay discount?

Yes. Annual prepay receives a 20% discount on the recurring service retainer.

How does pricing scale?

Pricing scales with capacity, user count, deployment mode, high-availability requirements, connector scope and support expectations.

What affects quote size most?

The largest variables are local model capacity, number of users, availability requirements, regulated deployment constraints and integration complexity.

What's next

Get a quote sized to your environment.

Discovery call, usage audit, sized appliance spec, 36-month TCO comparison against your current cloud spend.