Technology

The full stack, layered and tier-bounded.

One appliance. Twelve layers of certified core. A signed manifest enforces what runs where — and Pure Mode keeps anything custom from blocking your SLA.

01 — Architecture

Reference architecture.

External entities at the top. The appliance below. The support boundary line cleanly separates certified core from anything you build in T3.

External
End Users

Developers · analysts · support · legal · operations.
Browser · IDE · Slack/Teams · Email · CLI.

Authenticate via client SSO
External
Client Identity Provider

Your existing IdP — we federate, never replace.
e.g., Okta · Azure AD · Google · Ping.

OIDC · SAML 2.0 · SCIM
External
Client's Existing Tools

Whatever you already use — chat, source control, ticketing, docs, CRM, mail, storage.

OAuth · REST · GraphQL · Webhooks
LLM Machine · On-Prem Appliance Certified Core

Edge / Gateway

TLS termination · reverse proxy · routing · rate-limiting

Traefik · Kong · NGINX

T1

Identity & SSO

Federated to your IdP via OIDC / SAML — never replaces it · SCIM user provisioning · role mapping (Admin / User / Auditor / Read-Only)

Keycloak · Authentik · Zitadel

T1

App Surfaces · user-facing

T1
Chat Interface
RAG · multi-user · MCP
Open WebUI
IDE Backend
VS Code · JetBrains plugin
Continue
Code Completion
FIM · self-hosted
opencode / Tabby
Workflow Editor
templates · webhooks
n8n
Knowledge Workspace
RAG · workspace docs
AnythingLLM

Inference Gateway

OpenAI & Anthropic-compatible API · model routing · per-team budgets · audit logging

LiteLLM

T1

Inference Servers

High-throughput model serving · chat · code · embeddings · client fine-tunes · loaded from on-box signed registry

SGLang · vLLM

T1

Tool / Integration Layer

Vetted MCP catalog (T1) + verified partner connectors (T2). All credentials in on-box vault — they never leave the appliance.

MCP servers · chat · source control · ticketing · CRM · docs · …

T1 T2

Agentic Layer

Agent runtimes for multi-step tasks · default catalog of agents we configure · client-extensible in T3

openclaw / nemoclaw

T1

Workflow & Orchestration

Citizen-developer automation + scheduled background workflows

n8n

T1

Data

Vector + RAG store inside knowledge workspace · object storage · cache · optional dedicated DB by agreement

AnythingLLM-managed vectors · MinIO · Redis · (Postgres + pgvector)

T1

Observability & Audit

LLM tracing · metrics · logs — fully on-prem. No telemetry leaves the box.

Langfuse · Grafana · Loki · Prometheus

T1

Platform

Container orchestration · VM management · OS · out-of-band management · signed-update + license daemons

Kubernetes · Portainer · Proxmox · Linux · BMC

T1

Hardware · enterprise / industry-grade

Compute · memory · storage · network · power · physical security

Supermicro GPU(s) · CPU · NVMe · 25 / 100 GbE NIC · redundant PSU · TPM · tamper sensors

T1
SUPPORT BOUNDARY · PURE MODE SHUTS DOWN EVERYTHING BELOW

Client BYO Sandbox

Custom apps · custom connectors · custom workflows · client-trained models

No host privileges · egress allowlist · isolated secrets · outage here never blocks T1

Defined by you, on your clock — outside our SLA

T3
Tier model

T1 / T2 / T3 with manifest enforcement.

Every component is signed and labelled. T1 runs with host privileges. T2 in restricted containers. T3 sandboxed with no host access. The admin UI shows tier badges next to every installed component — never ambiguous, never argued.

Pure Mode

Kill everything custom. Keep certified core running.

One-click admin action that disables every T2/T3 component. Use it for security incidents, support diagnosis ("if it reproduces in Pure Mode it's our ticket"), or to keep an audit clean.

MCP catalog

Vetted connectors out of the box.

Chat, source control, ticketing, docs, CRM, mail, storage — all wired through curated MCP servers. Every credential lives in your on-box vault. Nothing leaves the appliance.

[ 01 ]

LiteLLM — Gateway & Router

Unified endpoint for all LLM providers and local models. Usage tracking, rate limiting, cost control.

[ 02 ]

Open WebUI — User Interface

A polished, ChatGPT-like interface for all end users. No training required.

[ 03 ]

AnythingLLM — RAG Engine

Document ingestion, vector search, and retrieval-augmented generation for enterprise knowledge bases.

[ 04 ]

Open Notebook — Research Agent

AI-powered research and knowledge synthesis. Deep-dive reports generated automatically.

[ 05 ]

NemoClaw / OpenClaw — Agentic Framework

Autonomous agents for complex, multi-step enterprise workflows.

[ 06 ]

Microsoft Presidio — PII Anonymisation

Automatic detection and redaction of sensitive data before it ever reaches a model.

[ 07 ]

SGLang — Inference Engine

A high-performance engine for running open-weight models locally — pure OSS, no NVIDIA AI Enterprise tax.

[ 08 ]

LLM Machines — Integration Layer

The connective tissue that turns these projects into a single, deployable, production-ready appliance. The signed manifest, the tier model, the support boundary, the runbook.

What's next

Ready to dig deeper?

See how the technology lands inside your environment — onboarding, pricing, or just talk to us.