Your data stays in your building. Period.
We deploy large language models on your own hardware, behind your own firewall. No cloud dependency, no data transfer, no compromise. revDSG and GDPR compliant by design.
Data sovereignty
On-premise deployment
Liechtenstein & Rheintal
The challenge
You want AI — chatbots, document search, workflow automation — but your data is regulated. Banking compliance, patient records, legal privilege, trade secrets. Cloud AI means your data leaves your control. Every API call sends your proprietary information to servers you do not own, in jurisdictions you may not trust.
We deploy LLMs on your own hardware, behind your own firewall. Your data never leaves your building.
What we deliver
On-Premise LLM Deployment
Install, configure, and optimise open-source LLMs (Llama, Mistral, Qwen, DeepSeek) on your hardware. Full control, no vendor lock-in.
RAG Pipeline
Retrieval-augmented generation over your internal documents, emails, and knowledge base. Answers grounded in your data, not hallucinations.
Custom Fine-Tuning
Train models on your company's domain data for higher accuracy. Your terminology, your processes, your standards — baked into the model.
AI Chatbot / Assistant
Internal-facing Q&A bot with access to your systems. Employees ask questions, the AI answers from your approved sources.
Workflow Automation
AI-powered document classification, summarisation, extraction, and routing. Typically cuts manual processing by 60-90% in document-heavy workflows.
Ongoing Maintenance
Model updates, performance monitoring, security patches, and quarterly reviews. Your AI stays current without your team learning MLOps.
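To make the RAG pattern above concrete — retrieve first, then generate an answer grounded only in what was retrieved — here is a deliberately minimal sketch. The keyword-overlap retriever and the tiny in-memory "knowledge base" are illustrative stand-ins, not our production stack, which uses a vector store and a locally hosted LLM.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Retriever: naive keyword overlap (a stand-in for vector search).
# The prompt constrains the model to the retrieved context only.

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the k document snippets sharing the most words with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def build_prompt(query: str, context: list[str]) -> str:
    """Instruct the model to answer only from the retrieved context."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say so.\n"
        f"Context:\n{joined}\n\nQuestion: {query}"
    )

# Hypothetical internal knowledge base entries.
docs = {
    "vacation": "Employees receive 25 vacation days per year.",
    "expenses": "Expense reports are due by the 5th of each month.",
    "remote": "Remote work requires manager approval.",
}

question = "How many vacation days do employees get?"
context = retrieve(question, docs)
prompt = build_prompt(question, context)
```

Because the prompt carries the retrieved source text, the model's answer can be checked against your documents — which is what keeps responses grounded rather than hallucinated.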
Pricing
Starter
CHF 15'000 setup
+ CHF 1'500 / month
Single use case, up to 20 users. Ideal for SMEs exploring AI.
- 1 LLM (7-13B parameters)
- 1 RAG pipeline
- Basic web interface
- Hardware advisory
- Email support
Professional
CHF 45'000 setup
+ CHF 4'000 / month
Multi-department deployment with fine-tuning and SSO integration.
- Multiple models (up to 70B)
- Enterprise RAG + document ingestion
- Custom fine-tuning on your data
- SSO / Active Directory integration
- Monitoring dashboard
- Up to 200 users
- Priority support
Enterprise
CHF 100'000+ setup
+ CHF 8'000+ / month
Organisation-wide AI platform with full compliance documentation and SLA.
- Multi-node GPU cluster
- FINMA / GxP compliance docs
- 99.9% SLA
- On-call support
- Audit trails & logging
- Unlimited users
- Full cluster design + build
How we work
Step 1
Assessment
We evaluate your data landscape, compliance requirements, existing infrastructure, and use cases.
Step 2
Architecture
We design the deployment: hardware spec, model selection, integration points, security boundaries.
Step 3
Deployment
On-site installation, configuration, testing with your actual data. Your team watches and learns.
Step 4
Support
Monthly maintenance, model updates, performance reviews. Your AI stays current, your team stays focused.
Why ANULUM
We build the tools ourselves
Creator of Director-AI (hallucination detection, PyPI) and Remanentia (persistent AI memory, 1,600+ tests). Not a reseller — a practitioner.
Data sovereignty by design
No cloud dependency, no API calls to external servers. Your data stays on your hardware, in your building, under your control.
GPU infrastructure since 2017
8+ years of hands-on GPU cluster experience. Mining era to AI inference. We know what hardware works and what fails.
Local — 30 minutes from your office
Based in Marbach SG. On-site in Liechtenstein, Rheintal, St Gallen, and Zurich. Face to face, not a ticket queue.
Ready to bring AI inside your walls?
Book a free 30-minute discovery call. We will assess your use case, data landscape, and compliance needs — no commitment, no sales pitch.
Schedule a discovery call
Frequently asked questions
What hardware do I need?
For the Starter tier, a single server with one NVIDIA GPU (RTX 4090 or A6000, CHF 8-15k). We provide a detailed hardware specification and can handle procurement. If you already have suitable hardware, we use that.
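A rough rule of thumb behind that hardware advice: weight memory is roughly parameters times bytes per weight. The sketch below shows the arithmetic; figures are illustrative sizing estimates, not a quote, and real usage adds KV cache, activations, and runtime overhead on top.

```python
# Back-of-envelope GPU memory for LLM weights: parameters x bytes per weight.
# Illustrative only -- actual usage adds KV cache, activations, and overhead.

def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in decimal gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 13B model quantised to 4 bits needs roughly 6.5 GB for weights alone,
# so it fits comfortably on a 24 GB card such as an RTX 4090.
print(weight_memory_gb(13, 4))   # 6.5
# A 70B model at 4 bits already exceeds a single 24 GB GPU.
print(weight_memory_gb(70, 4))   # 35.0
```

This is why the Starter tier pairs 7-13B models with a single consumer or workstation GPU, while the larger Professional-tier models call for more VRAM.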
Which models do you deploy?
Open-source models: Llama 3, Mistral, Qwen, DeepSeek, and others. We select the best model for your use case and language requirements. No proprietary lock-in — you can switch models at any time.
Is it really fully on-premise?
Yes. The entire system runs on your hardware, in your network. No data leaves your firewall. No cloud API calls. No telemetry. We can operate in fully air-gapped environments if required.
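From the application's point of view, "fully on-premise" simply means the client talks to an internal host instead of a cloud endpoint. The sketch below assumes an OpenAI-compatible local server (for example vLLM or Ollama expose such an API); the host name and model name are placeholders, not a real deployment.

```python
# Building a chat request against a local, on-premise LLM endpoint.
# Assumes an OpenAI-compatible server on the internal network;
# "llm.internal" and the model name are hypothetical placeholders.
import json
import urllib.request

LOCAL_ENDPOINT = "http://llm.internal:8000/v1/chat/completions"

def build_request(question: str) -> urllib.request.Request:
    payload = {
        "model": "llama-3-8b-instruct",  # placeholder local model name
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        LOCAL_ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Summarise our Q3 compliance report.")
# No external API key, no cloud dependency: the request resolves and
# terminates entirely inside your own network.
```

The same code works air-gapped, because nothing in the path requires internet access.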
How does it compare to ChatGPT Enterprise or Azure OpenAI?
Cloud solutions route your data through external servers, even with Swiss region options. Our deployment keeps everything local. You also avoid per-token costs — once deployed, your AI runs at fixed infrastructure cost regardless of usage volume.
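The fixed-cost point can be made with simple break-even arithmetic. All figures below are illustrative assumptions — not actual cloud list prices and not an ANULUM quote — chosen only to show the shape of the comparison.

```python
# Hypothetical break-even between per-token cloud billing and a fixed
# on-premise monthly cost. Both rates are assumptions for illustration.

CLOUD_COST_PER_1M_TOKENS = 15.0   # CHF, assumed blended input/output rate
ONPREM_MONTHLY_FIXED = 4_000.0    # CHF, e.g. a Professional-tier retainer

def cloud_monthly_cost(tokens_per_month: float) -> float:
    """Cloud spend scales linearly with usage."""
    return tokens_per_month / 1e6 * CLOUD_COST_PER_1M_TOKENS

def breakeven_tokens_per_month() -> float:
    """Monthly token volume where fixed on-prem cost equals cloud spend."""
    return ONPREM_MONTHLY_FIXED / CLOUD_COST_PER_1M_TOKENS * 1e6

# Under these assumed rates, above roughly 267M tokens per month the
# fixed on-premise cost undercuts per-token billing -- and on-prem cost
# stays flat as usage grows further.
```

The break-even point moves with the rates you plug in, but the structure does not: cloud cost grows with usage, on-premise cost does not.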
Do you support FINMA and GxP compliance?
Yes. Enterprise tier includes full compliance documentation: data flow diagrams, access controls, audit trails, and retention policies aligned with FINMA circulars and GxP requirements. We have experience with regulated environments.
How long does deployment take?
Starter: 2-3 weeks once hardware is ready. Professional: 4-6 weeks. Enterprise: 8-12 weeks including compliance documentation. Timelines depend on hardware procurement and data preparation.