PiSovereign Documentation

A self-hosted, privacy-first AI assistant platform — deploy anywhere with Docker Compose.

Welcome to the official PiSovereign documentation. This guide covers everything from first deployment to production operations and development.


Introduction

PiSovereign runs a complete AI assistant stack on your own hardware. All inference stays local via Ollama — no data ever leaves your network. It deploys as a set of Docker containers on any Linux or macOS host and is optimized for the Raspberry Pi 5 with Hailo-10H NPU.

Core Principles:

  • Privacy First — All processing happens locally on your hardware
  • GDPR Compliant — No data leaves your network
  • Open Source — MIT licensed, fully auditable, #![forbid(unsafe_code)]
  • Extensible — Clean Architecture with Ports & Adapters

Key Features

Feature | Description
Local LLM Inference | Ollama with dynamic model routing by task complexity
Signal & WhatsApp | Bidirectional messaging with voice message support
Voice Processing | Local STT (whisper.cpp) and TTS (Piper), optional OpenAI fallback
Calendar & Contacts | CalDAV/CardDAV (Baïkal, Radicale, Nextcloud)
Email | IMAP/SMTP with any provider (Gmail, Outlook, Proton Mail)
Weather & Transit | Open-Meteo forecasts, German public transit via HAFAS
Web Search | Brave Search with automatic DuckDuckGo fallback
Persistent Memory | RAG with embeddings, decay, deduplication, XChaCha20 encryption
Reminders | Natural language scheduling with morning briefings
Agentic Mode | Multi-agent orchestration for complex tasks with parallel sub-agents
Secret Management | HashiCorp Vault with AppRole authentication
Observability | Prometheus, Grafana, Loki, OpenTelemetry
Docker Compose | Single-command deployment with optional monitoring and CalDAV profiles

User Guide

Document | Description
Getting Started | 5-minute Docker deployment
Hardware Setup | Raspberry Pi 5 + Hailo-10H assembly
Docker Setup | Detailed deployment and operations guide
Vault Setup | Secret management with HashiCorp Vault
Configuration | All config.toml options
External Services | WhatsApp, email, CalDAV, search setup
Signal Setup | Signal messenger registration
Reminder System | Reminders and morning briefings
Troubleshooting | Common issues and solutions

Developer Guide

Document | Description
Architecture | Clean Architecture overview
Memory System | RAG pipeline and encryption
Contributing | Development setup and workflow
Crate Reference | All 16 workspace crates documented
API Reference | REST API with OpenAPI spec

Operations & Security

Document | Description
Production Deployment | TLS, production config, multi-arch builds
Monitoring | Prometheus, Grafana, Loki, alerting
Backup & Restore | Data protection and recovery
Security Hardening | Application, network, and Vault security

Getting Help

Features at a Glance

A teenager-friendly guide to PiSovereign’s architecture and features

What is PiSovereign? It’s your own private AI assistant that runs on your computer (or a Raspberry Pi) instead of sending your data to the cloud. Think of it as having ChatGPT, but it lives in your house and keeps all your conversations private.

This page explains all the cool stuff PiSovereign can do using simple terms and real-world comparisons.


How It’s Built (Architecture)

PiSovereign is organized like a well-run school where each department has clear responsibilities and rules about who talks to whom.

The Layer Cake

┌─────────────────────────────────────────────────────────────┐
│  🖥️  PRESENTATION  (What you see and interact with)        │
│      Web UI, REST API, Command Line                         │
├─────────────────────────────────────────────────────────────┤
│  🔌  INFRASTRUCTURE  (The plumbing)                         │
│      Database, Cache, Secrets, Metrics                      │
├─────────────────────────────────────────────────────────────┤
│  🔗  INTEGRATION  (Connections to the outside world)        │
│      WhatsApp, Signal, Email, Calendar, Weather, Transit    │
├─────────────────────────────────────────────────────────────┤
│  🧠  AI  (The smart stuff)                                  │
│      LLM Inference, Speech-to-Text, Text-to-Speech          │
├─────────────────────────────────────────────────────────────┤
│  ⚙️  APPLICATION  (Business logic)                          │
│      Services, Use Cases, Rules                             │
├─────────────────────────────────────────────────────────────┤
│  💎  DOMAIN  (Core rules and data)                          │
│      Entities, Value Objects, Commands                      │
└─────────────────────────────────────────────────────────────┘

The Golden Rule: Inner layers never depend on outer layers. The Domain layer doesn’t care if you’re using WhatsApp or a CLI — it just knows about messages and conversations.

Architecture Patterns Explained

Pattern | Real-World Analogy | What It Does
Clean Architecture | A school with separate buildings for classes, admin, and sports | Keeps code organized so the AI brain doesn’t need to know about databases
Ports & Adapters | Universal phone charger that fits any outlet | Different services (WhatsApp, Email) plug in without changing the core code
Decorator Chain | Matryoshka (Russian nesting) dolls | Each layer wraps the previous one, adding features like caching or sanitization
Dependency Injection | LEGO bricks that snap together | Easy to swap real services for test versions without rewriting code
Event-Driven | Waiter who takes your order while the kitchen cooks | Background tasks run without making you wait for responses
Circuit Breaker | Electrical fuse that prevents house fires | When a service fails repeatedly, stop trying and use a backup plan
Multi-Layer Cache | Sticky notes (fast) + notebook (permanent) | Frequently used data stays in memory; everything else on disk
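As a rough illustration of how Ports & Adapters and the Decorator Chain fit together, here is a minimal Rust sketch. All names (WeatherPort, StubWeatherApi, CachingWeather) are invented for this example and are not PiSovereign's actual types:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// The "port": an interface the application core depends on.
trait WeatherPort {
    fn forecast(&self, city: &str) -> String;
}

/// An "adapter": one concrete implementation of the port.
struct StubWeatherApi;

impl WeatherPort for StubWeatherApi {
    fn forecast(&self, city: &str) -> String {
        format!("{city}: 18C, partly cloudy")
    }
}

/// A "decorator": wraps any WeatherPort and adds caching on top,
/// without the wrapped implementation or the core code changing.
struct CachingWeather<P: WeatherPort> {
    inner: P,
    cache: RefCell<HashMap<String, String>>,
}

impl<P: WeatherPort> WeatherPort for CachingWeather<P> {
    fn forecast(&self, city: &str) -> String {
        if let Some(hit) = self.cache.borrow().get(city) {
            return hit.clone();
        }
        let fresh = self.inner.forecast(city);
        self.cache.borrow_mut().insert(city.to_string(), fresh.clone());
        fresh
    }
}

fn main() {
    let weather = CachingWeather {
        inner: StubWeatherApi,
        cache: RefCell::new(HashMap::new()),
    };
    let first = weather.forecast("Berlin");  // cache miss: hits the adapter
    let second = weather.forecast("Berlin"); // cache hit: served from memory
    assert_eq!(first, second);
    println!("{first}");
}
```

Because the core only ever talks to the `WeatherPort` trait, swapping the stub for a real API client, or stacking more decorators (sanitization, metrics), needs no changes to the calling code.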

Feature Quick Reference

Here’s everything PiSovereign can do, explained simply:

⚡ Performance Features

Feature | What It Does | Real-World Analogy | Why It’s Cool
Adaptive Model Routing | Sends easy questions to small, fast AI; hard questions to bigger AI | Express checkout vs. full-service lane at the grocery store | 4× faster for simple questions
Semantic Caching | Remembers similar questions you asked before, even if worded differently | A teacher who remembers “What’s 2+2?” and “Two plus two equals?” are the same question | No waiting for repeat questions
Multi-Layer Cache | Stores answers in fast memory + disk backup | Sticky notes on your desk (fast) + a notebook in your drawer (permanent) | Under 1ms for cached answers
In-Process Event Bus | Handles background work (saving memories, logging) without slowing your reply | A restaurant where the waiter takes your order while another waiter clears tables | 100-500ms saved per message
Proactive Pre-Computation | Prepares common answers before you ask | A friend who checks the weather before your camping trip | Instant morning briefings
Template Responder | Answers trivial questions instantly without using AI | Automated phone menu for simple requests | Under 10ms for “Hello!”
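The Template Responder idea can be sketched in a few lines of Rust. The patterns and canned replies below are invented for illustration; the real responder's list is different:

```rust
// Sketch of a template responder: trivial inputs get a canned reply
// and never touch the LLM. Returning None means "fall through to the
// full inference pipeline".
fn template_response(input: &str) -> Option<&'static str> {
    // Normalize: lowercase, keep only alphanumeric characters.
    let normalized: String = input
        .to_lowercase()
        .chars()
        .filter(|c| c.is_alphanumeric())
        .collect();

    match normalized.as_str() {
        "hello" | "hi" | "hey" => Some("Hello! How can I help?"),
        "thanks" | "thankyou" => Some("You're welcome!"),
        "ping" => Some("pong"),
        _ => None, // not trivial: let the LLM handle it
    }
}

fn main() {
    // "Hello!" is answered instantly; a real question is not.
    assert_eq!(template_response("Hello!"), Some("Hello! How can I help?"));
    assert_eq!(template_response("Will it rain tomorrow?"), None);
    println!("template responder ok");
}
```

Since no model is loaded and no network call is made, this path is essentially free, which is where the sub-10ms figure comes from.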

🧠 AI Features

Feature | What It Does | Real-World Analogy | Why It’s Cool
ReAct Agent (Tool Calling) | AI can use 18 tools: check weather, search web, read calendar, send emails | An assistant who can look things up instead of just guessing | AI acts, not just talks
Multi-Agent Orchestration | Multiple AIs work together on complex tasks | A group project where each person handles their specialty | Parallel work = faster results
RAG Memory System | Remembers your preferences, name, and past conversations | A personal diary that the AI actually reads | “Hey, I remember you like dark mode!”
Fact Extraction | Automatically pulls important facts from conversations | Highlighting key points in a textbook | Never forgets important stuff
Complexity Classification | Figures out how hard a question is before answering | A teacher deciding if it’s a pop quiz or a final exam | Right-sized AI for every question
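Complexity Classification can be sketched as a simple heuristic router. The word-count threshold and keyword list here are invented for illustration; the real classifier is more sophisticated:

```rust
// Sketch: classify a question into a model tier before inference.
#[derive(Debug, PartialEq)]
enum Tier {
    Simple,  // routed to a small, fast model
    Complex, // routed to a larger model
}

fn classify(question: &str) -> Tier {
    let words = question.split_whitespace().count();
    // Hypothetical markers of multi-step requests.
    let multi_step = ["and then", "compare", "summarize", "plan"]
        .iter()
        .any(|kw| question.to_lowercase().contains(kw));

    // Short, single-intent questions go to the small model.
    if words <= 12 && !multi_step {
        Tier::Simple
    } else {
        Tier::Complex
    }
}

fn main() {
    assert_eq!(classify("What's the weather tomorrow?"), Tier::Simple);
    assert_eq!(
        classify("Compare next week's forecasts for Berlin and Hamburg and plan my trip"),
        Tier::Complex
    );
    println!("classifier ok");
}
```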

🔒 Security Features

Feature | What It Does | Real-World Analogy | Why It’s Cool
Prompt Injection Defense | Blocks 60+ patterns of attempts to trick the AI | A bouncer checking IDs at a club entrance | Stops “ignore your instructions” attacks
Output Sanitization | Hides sensitive info (passwords, credit cards, emails) from responses | A TV censor bleeping out swear words | PII protection with 17 detection patterns
Context Sanitization | Cleans external data (web results, tool outputs) before feeding to AI | Airport security scanning luggage | Blocks hidden malicious instructions
Secret Management | Stores API keys and passwords in a secure vault | A safe with a combination lock | Secrets never appear in logs
Encryption at Rest | Encrypts your memories and conversations on disk | A locked diary with a key only you have | XChaCha20-Poly1305 encryption
Rate Limiting | Prevents abuse by limiting requests per minute | A “take a number” system at a deli counter | Auto-cleanup of old entries
HMAC Tool Receipts | Signs tool results to detect tampering | A wax seal on a medieval letter | Cryptographic proof nothing was changed
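Output Sanitization boils down to scanning a response for secret-shaped strings before it leaves the system. This sketch redacts one illustrative pattern (Vault-style `hvs.` tokens); the real sanitizer covers many more detection patterns:

```rust
// Sketch: redact anything that looks like a Vault token before the
// response reaches the user. Pattern and function name are illustrative.
fn redact_vault_tokens(text: &str) -> String {
    text.split_whitespace()
        .map(|word| {
            if word.starts_with("hvs.") {
                "[REDACTED]"
            } else {
                word
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let leaked = "Your token is hvs.abc123 please keep it safe";
    let clean = redact_vault_tokens(leaked);
    assert_eq!(clean, "Your token is [REDACTED] please keep it safe");
    println!("{clean}");
}
```

Running every outgoing response through a chain of such filters is what guarantees that even a prompt-injected model cannot leak credentials verbatim.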

🔊 Speech Features

Feature | What It Does | Real-World Analogy | Why It’s Cool
Speech-to-Text (STT) | Converts voice messages to text | A court stenographer | Local processing via Whisper
Text-to-Speech (TTS) | Reads responses aloud | An audiobook narrator | Piper voices for natural speech
Hybrid Provider | Falls back to OpenAI if local processing fails | Having a backup phone charger | 99.9% uptime even when hardware struggles

Integrations (External Services)

PiSovereign connects to 8 external services. Each one plugs in via the Ports & Adapters pattern, so adding new ones is easy.

Service | What You Can Do | Example Commands
WhatsApp | Send and receive messages via WhatsApp Cloud API | “Send Mom: Don’t forget the groceries!”
Signal | Private encrypted messaging via signal-cli | “Message my Signal group: meeting at 5pm”
Calendar (CalDAV) | View, create, and manage events on any CalDAV server | “What’s on my calendar this week?”
Contacts (CardDAV) | Look up phone numbers and emails | “What’s Sarah’s email address?”
Email (IMAP/SMTP) | Read inbox, search, draft, and send emails | “Any new emails from GitHub?”
Weather (Open-Meteo) | Current conditions and 7-day forecast | “Will it rain tomorrow in Berlin?”
Web Search (Brave/DDG) | Search the internet privately | “Search for vegan pasta recipes”
Transit (HAFAS) | German public transport schedules | “Next train from Munich to Hamburg?”

How a Request Flows Through the System

Here’s what happens when you ask “What’s the weather tomorrow?”:

1. 📱 You send message via Web UI / WhatsApp / Signal
         │
         ▼
2. 🚦 Adaptive Model Routing classifies complexity
   │   → "weather question" = Simple tier
   │
   ▼
3. 💾 Check Semantic Cache
   │   → Similar question asked before? Return cached answer!
   │   → No hit? Continue...
   │
   ▼
4. 🧠 ReAct Agent decides to use the weather tool
   │   → Calls Open-Meteo API
   │   → Sanitizes the result (removes any hidden tricks)
   │
   ▼
5. 🤖 Small AI model (gemma3:1b) formats the response
         │
         ▼
6. 📤 Output Sanitizer checks for leaked secrets
         │
         ▼
7. 💬 Response sent back to you: "Tomorrow: 18°C, partly cloudy"
         │
         ▼
8. 📝 Event Bus (background): Save to cache, extract facts, log metrics

Total time: ~500ms (vs. 5-8 seconds without optimizations)


The 16 Crates (Code Modules)

PiSovereign is split into 16 Rust crates (think of them as LEGO sets that snap together):

Layer | Crate | One-Line Description
Domain | domain | Core rules: what is a message, user, conversation?
Application | application | Business logic: how do we handle a chat request?
AI | ai_core | Ollama LLM inference and model routing
AI | ai_speech | Speech-to-text and text-to-speech
Infrastructure | infrastructure | Database, cache, secrets, metrics adapters
Integration | integration_whatsapp | WhatsApp Cloud API connector
Integration | integration_signal | Signal messenger via signal-cli
Integration | integration_caldav | CalDAV calendar protocol
Integration | integration_carddav | CardDAV contacts protocol
Integration | integration_email | IMAP/SMTP email
Integration | integration_weather | Open-Meteo weather API
Integration | integration_websearch | Brave Search + DuckDuckGo fallback
Integration | integration_transit | German public transit (HAFAS)
Presentation | presentation_http | REST API with Axum web framework
Presentation | presentation_cli | Command-line interface
Presentation | presentation_web | SolidJS web frontend

Technology Stack

Category | Technology | Why We Use It
Language | Rust 2024 | Fast, safe, no garbage collector pauses
Async Runtime | Tokio | Handle thousands of requests concurrently
Web Framework | Axum | Type-safe, fast HTTP handling
Frontend | SolidJS + Tailwind CSS | Reactive UI without React’s overhead
Database | PostgreSQL + pgvector | SQL + vector similarity search
Cache | Moka (memory) + Redb (disk) | Multi-layer for speed + persistence
LLM | Ollama | Run AI models locally, no cloud needed
Secrets | HashiCorp Vault | Enterprise-grade secret storage
Containers | Docker Compose | Easy deployment with profiles
Observability | Prometheus + Grafana + Loki | Metrics, dashboards, logs

Glossary

Term | Simple Definition
LLM | Large Language Model — the AI brain that generates text
RAG | Retrieval-Augmented Generation — giving the AI context from your memories
Embedding | Converting text to numbers so computers can measure similarity
Inference | The process of the AI generating a response
Port | An interface (contract) that says “I need X capability”
Adapter | A concrete implementation that fulfills a port’s contract
Decorator | A wrapper that adds behavior to something without changing it
Circuit Breaker | Pattern that stops calling a failing service to let it recover
Event Bus | A message highway where components publish and subscribe to events
STT/TTS | Speech-to-Text / Text-to-Speech
CalDAV/CardDAV | Calendar / Contact protocols (like HTTP for calendars)
HAFAS | German public transit API standard
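The Circuit Breaker entry in the glossary can be made concrete with a small Rust sketch. The threshold, error strings, and type names are invented for illustration:

```rust
// Sketch of a circuit breaker: after N consecutive failures the
// breaker "opens" and calls are refused until it is reset, giving
// the failing service time to recover.
struct CircuitBreaker {
    consecutive_failures: u32,
    threshold: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { consecutive_failures: 0, threshold }
    }

    fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }

    /// Run `op` unless the breaker is open; track success/failure.
    fn call<T>(&mut self, op: impl FnOnce() -> Result<T, String>) -> Result<T, String> {
        if self.is_open() {
            return Err("circuit open: using fallback".into());
        }
        match op() {
            Ok(v) => {
                self.consecutive_failures = 0; // success resets the count
                Ok(v)
            }
            Err(e) => {
                self.consecutive_failures += 1;
                Err(e)
            }
        }
    }
}

fn main() {
    let mut breaker = CircuitBreaker::new(3);
    for _ in 0..3 {
        let _ = breaker.call(|| Err::<(), _>("service down".to_string()));
    }
    // After 3 failures the breaker is open; the service is no longer called.
    assert!(breaker.is_open());
    let refused = breaker.call(|| Ok::<_, String>(42));
    assert!(refused.is_err());
    println!("breaker open after repeated failures");
}
```

Production implementations usually also reset the breaker after a cooldown period ("half-open" state); that is omitted here for brevity.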

Learn More

Want the full technical details? Check out these pages:

Getting Started

Get PiSovereign running in under 5 minutes

PiSovereign is deployed as a set of Docker containers using Docker Compose. This is the only supported installation method.

Prerequisites

  • Docker Engine 24+ with Docker Compose v2
  • 8 GB RAM recommended (4 GB minimum)
  • 20 GB disk space (models + data)
  • A domain name with DNS pointing to your server (for HTTPS)

Quick Start

# Clone the repository
git clone https://github.com/twohreichel/PiSovereign.git
cd PiSovereign/docker

# Create your environment file
cp .env.example .env
nano .env  # Set PISOVEREIGN_DOMAIN and TRAEFIK_ACME_EMAIL

# Start all core services
docker compose up -d

# Initialize Vault (first run only — save the output!)
docker compose exec vault /vault/init.sh

# Wait for model download to complete
docker compose logs -f ollama-init

What Gets Deployed

Service | Description
PiSovereign | AI assistant application
Traefik | HTTPS reverse proxy with Let’s Encrypt
Vault | Secret management (API keys, passwords)
Ollama | LLM inference engine
Signal-CLI | Signal messenger integration
Whisper | Speech-to-text processing
Piper | Text-to-speech synthesis

Post-Setup

  1. Store secrets in Vault — See Vault Setup
  2. Register Signal number — See Signal Setup
  3. Configure integrations — See External Services
  4. Enable monitoring (optional) — docker compose --profile monitoring up -d

Verify Installation

# Check all services are running
docker compose ps

# Test the health endpoint
curl https://your-domain.example.com/health

# Check individual services
curl https://your-domain.example.com/health/inference
curl https://your-domain.example.com/health/vault

Next Steps

Hardware Setup

Hardware assembly guide for Raspberry Pi 5 with Hailo-10H AI HAT+

This guide covers the physical hardware setup. For software installation, see the Docker Setup guide.

Required Components

Component | Recommended Model | Notes
Raspberry Pi 5 | 8 GB RAM variant | 4 GB works but limits concurrent operations
Hailo AI HAT+ 2 | Hailo-10H (26 TOPS) | Mounts via 40-pin GPIO + PCIe
Power Supply | Official 27W USB-C | Required for HAT+ power delivery
Cooling | Active Cooler for Pi 5 | Essential for sustained AI inference
Storage | NVMe SSD (256 GB+) | Via Hailo HAT+ PCIe or separate HAT
MicroSD Card | 32 GB+ Class 10 | For boot (if not using NVMe boot)
Case | Official Pi 5 Case (tall) | Must accommodate HAT+ height

Assembly Instructions

Important: Always work on a static-free surface and handle boards by edges only.

Step 1: Prepare the Raspberry Pi

  1. Unbox the Raspberry Pi 5
  2. Attach the Active Cooler:
    • Remove the protective film from the thermal pad
    • Align with the CPU and press firmly
    • Connect the 4-pin fan connector to the FAN header

Step 2: Install the Hailo AI HAT+

  1. Locate the 40-pin GPIO header on the Pi
  2. Align the Hailo HAT+ with the GPIO pins
  3. Gently press down until fully seated (approximately 3mm gap)
  4. Connect the PCIe FPC cable:
    • Open the Pi 5’s PCIe connector latch
    • Insert the flat cable (contacts facing down)
    • Close the latch to secure

Step 3: Install Storage (Optional NVMe)

If using the Hailo HAT+ built-in M.2 slot:

  1. Insert NVMe SSD into M.2 slot (M key, 2242/2280)
  2. Secure with the provided screw

Step 4: Enclose and Power

  1. Place assembly in case
  2. Connect Ethernet cable (recommended over WiFi for production)
  3. Connect power supply

OS Installation

Flash Raspberry Pi OS

  1. Install Raspberry Pi Imager on your computer

  2. Choose Device: Raspberry Pi 5

  3. Choose OS: Raspberry Pi OS Lite (64-bit)

  4. Click Edit Settings:

    • Set hostname: pisovereign
    • Set username and strong password
    • Enable SSH with public-key authentication
    • Set your timezone
  5. Flash to SD card / NVMe

First Boot

# SSH into the Pi
ssh pi@pisovereign.local

# Update system
sudo apt update && sudo apt full-upgrade -y

# Install Docker (required for PiSovereign)
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER

# Log out and back in for group change
exit

Configure Boot (Optional NVMe)

sudo raspi-config
  • Advanced Options → Boot Order → NVMe/USB Boot

Next Steps

Once hardware is assembled and Docker is installed, proceed to the Docker Setup guide for PiSovereign deployment.

Docker Setup

Production deployment guide using Docker Compose

PiSovereign runs as a set of Docker containers orchestrated by Docker Compose. This is the recommended and only supported deployment method.

Prerequisites

  • Docker Engine 24+ and Docker Compose v2
  • 4 GB+ RAM (8 GB recommended)
  • 20 GB+ free disk space

Install Docker if not already installed:

# Raspberry Pi / Debian / Ubuntu
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
# Log out and back in

# macOS
brew install --cask docker

Quick Start

# 1. Clone the repository
git clone https://github.com/twohreichel/PiSovereign.git
cd PiSovereign/docker

# 2. Configure environment
cp .env.example .env
# Edit .env with your domain and email for TLS certificates
nano .env

# 3. Start core services
docker compose up -d

# 4. Initialize Vault (first time only)
docker compose exec vault /vault/init.sh
# Save the unseal key and root token printed to stdout!

# 5. Wait for Ollama model download
docker compose logs -f ollama-init

PiSovereign is now running at https://your-domain.example.com.

Architecture

The deployment consists of these core services:

Service | Purpose | Port | URL
pisovereign | Main application server | 3000 (internal) | http://localhost/ via Traefik
traefik | Reverse proxy + TLS | 80, 443 | http://localhost:80
vault | Secret management | 8200 (internal) | Internal only
ollama | LLM inference engine | 11434 (internal) | Internal only
signal-cli | Signal messenger daemon | Unix socket | Internal only
whisper | Speech-to-text (STT) | 8081 (internal) | Internal only
piper | Text-to-speech (TTS) | 8082 (internal) | Internal only

Monitoring Stack (profile: monitoring)

Service | Purpose | Port | URL
prometheus | Metrics collection & alerting | 9090 | http://localhost:9090
grafana | Dashboards & visualization | 3000 (internal) | http://localhost/grafana via Traefik
loki | Log aggregation | 3100 (internal) | Internal only
promtail | Log shipping agent | n/a | Internal only
node-exporter | Host metrics exporter | 9100 (internal) | Internal only
otel-collector | OpenTelemetry Collector | 4317/4318 (internal) | Internal only

CalDAV Server (profile: caldav)

Service | Purpose | Port | URL
baikal | CalDAV/CardDAV server | 80 (internal) | http://localhost/caldav via Traefik

Key Endpoints

Endpoint | Description
http://localhost/health | Application health check
http://localhost/metrics/prometheus | Prometheus metrics scrape target
http://localhost/grafana | Grafana dashboards (monitoring profile)
http://localhost/caldav | Baïkal CalDAV web UI (caldav profile)
http://localhost:9090 | Prometheus web UI (monitoring profile)
http://localhost:9090/targets | Prometheus scrape target status

Configuration

Environment Variables

Edit docker/.env before starting:

# Your domain (required for TLS)
PISOVEREIGN_DOMAIN=pi.example.com

# Email for Let's Encrypt certificates
TRAEFIK_ACME_EMAIL=you@example.com

# Vault root token (set after vault init)
VAULT_TOKEN=hvs.xxxxx

# Container image version
PISOVEREIGN_VERSION=latest

# Email provider preset: proton (default), gmail, or custom
EMAIL_PROVIDER=proton

Note: On first startup, PiSovereign automatically populates 32 default system commands and validates Vault credentials, logging warnings for any missing or invalid secrets. Check the container logs after first startup to verify all integrations are configured correctly.

Application Config

The main application config is at docker/config/config.toml. All service hostnames use Docker network names (e.g., ollama:11434).

See Configuration Reference for all options.

Storing Secrets in Vault

After Vault initialization, store your integration secrets:

# Enter Vault container
docker compose exec vault sh

# Store WhatsApp credentials
vault kv put secret/pisovereign/whatsapp \
  access_token="your-meta-token" \
  app_secret="your-app-secret"

# Store Brave Search API key
vault kv put secret/pisovereign/websearch \
  api_key="your-brave-api-key"

# Store CalDAV credentials
vault kv put secret/pisovereign/caldav \
  password="your-caldav-password"

# Store email credentials (IMAP/SMTP)
vault kv put secret/pisovereign/email \
  password="your-email-password"

Docker Compose Profiles

Additional services are available via profiles (see tables above for URLs):

Monitoring Stack

docker compose --profile monitoring up -d

CalDAV Server

docker compose --profile caldav up -d

All Profiles

docker compose --profile monitoring --profile caldav up -d

Signal Registration (Docker)

Signal requires a one-time registration before messages can be sent/received.

1. Set your phone number

Edit docker/.env and set your phone number in E.164 format:

SIGNAL_CLI_NUMBER=+491701234567

This automatically configures the PiSovereign application; the same number can also be stored in Vault for secure persistence (see step 4).

2. Register with Signal

# Register via SMS
docker compose exec signal-cli signal-cli -a +491701234567 register

# Or register via voice call
docker compose exec signal-cli signal-cli -a +491701234567 register --voice

3. Verify the code

# Enter the verification code received via SMS/voice
docker compose exec signal-cli signal-cli -a +491701234567 verify 123-456

4. Store in Vault (optional)

For production, store the phone number in Vault so it’s managed centrally:

docker compose exec vault vault kv put secret/pisovereign/signal \
  phone_number="+491701234567"

The application loads the phone number in this priority order:

  1. config.toml: [signal] phone_number = "..."
  2. Environment variable: PISOVEREIGN_SIGNAL__PHONE_NUMBER (set via .env)
  3. Vault: secret/pisovereign/signal, key phone_number
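For option 1, the config.toml entry would look like this (illustrative snippet following the section and key names in the priority list above):

```toml
# docker/config/config.toml -- illustrative snippet
[signal]
phone_number = "+491701234567"  # E.164 format, same number registered with signal-cli
```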

5. Restart and verify

docker compose restart pisovereign
docker compose logs pisovereign | grep -i signal

For the full Signal setup guide, see Signal Setup.

Operations

Updating

cd docker

# Pull latest images
docker compose pull

# Recreate containers with new images
docker compose up -d

Vault Management

# Check Vault status
docker compose exec vault vault status

# Unseal after restart (use key from init)
docker compose exec vault vault operator unseal <UNSEAL_KEY>

# Read a secret
docker compose exec vault vault kv get secret/pisovereign/whatsapp

Logs

# Follow all logs
docker compose logs -f

# Specific service
docker compose logs -f pisovereign

# Last 100 lines
docker compose logs --tail=100 pisovereign

Backup

# Stop services
docker compose down

# Backup volumes
docker run --rm -v pisovereign-data:/data -v $(pwd):/backup \
  alpine tar czf /backup/pisovereign-backup-$(date +%Y%m%d).tar.gz /data

# Restart
docker compose up -d

Troubleshooting

See the Troubleshooting guide for common issues.

GPU Acceleration

By default, Ollama runs CPU-only inside Docker. For GPU-accelerated inference:

  • macOS (Metal): Run Ollama natively and set OLLAMA_BASE_URL in .env
  • Linux (NVIDIA): Use docker compose -f compose.yml -f compose.gpu-nvidia.yml up -d
  • Linux (AMD/ROCm): Create a compose.override.yml with the ROCm image

See the full GPU Acceleration guide for setup instructions.

GPU Acceleration

Run Ollama with GPU acceleration for faster LLM inference

By default, PiSovereign runs Ollama inside a Docker container using CPU-only inference. With GPU acceleration, inference speed improves dramatically — especially for larger models like qwen2.5:14b or qwen2.5:32b.

Platform Overview

Platform | GPU Access | Method
macOS (Apple Silicon / Intel) | Metal | Native Ollama (hybrid mode)
Linux + NVIDIA GPU | CUDA | Compose override file
Linux + AMD GPU | ROCm | Manual compose override
Raspberry Pi + Hailo | NPU | See Hardware Setup

macOS — Native Ollama with Metal GPU

Docker Desktop on macOS runs containers inside a Linux VM and cannot pass through the Metal GPU. To use GPU acceleration, run Ollama natively on the host and point PiSovereign’s Docker container at it.

1. Install Ollama

brew install ollama

2. Start Ollama

ollama serve

Ollama will listen on http://localhost:11434 and automatically use Metal for GPU-accelerated inference on Apple Silicon (M1/M2/M3/M4) or Intel Macs.

3. Pull the inference model

# Default model (recommended for 16 GB+ RAM)
ollama pull qwen2.5:14b

# Embedding model (required)
ollama pull nomic-embed-text

4. Configure Docker environment

Edit docker/.env and set:

OLLAMA_BASE_URL=http://host.docker.internal:11434

This tells the PiSovereign container to connect to the native Ollama instance via Docker’s host.docker.internal bridge (already configured in compose.yml via extra_hosts).

5. Start PiSovereign

# From the repository root
just docker-up

# Or directly
cd docker && docker compose up -d

Note: The Ollama Docker container will still start but is unused. It runs idle with minimal resource consumption. The PiSovereign container connects to native Ollama via the configured OLLAMA_BASE_URL.

Verify GPU is active

# Check Ollama is using Metal
ollama ps
# Should show "metal" in the processor column

# Test inference
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:14b",
  "prompt": "Hello",
  "stream": false
}'

Linux — NVIDIA GPU

On Linux with an NVIDIA GPU, Ollama runs inside Docker with full GPU passthrough via the NVIDIA Container Toolkit.

1. Install NVIDIA Container Toolkit

# Add the NVIDIA repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

2. Verify GPU is visible to Docker

docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

This should display your GPU model, driver version, and CUDA version.

3. Start with GPU override

# From the repository root
just docker-up-gpu

# Or directly
cd docker && docker compose -f compose.yml -f compose.gpu-nvidia.yml up -d

This merges compose.gpu-nvidia.yml into the Ollama service, adding NVIDIA GPU device reservations and higher resource limits. The same ollama service is used — only the resource configuration is overridden.

4. Verify GPU inference

# Check GPU layers are loaded
docker compose -f compose.yml -f compose.gpu-nvidia.yml exec ollama ollama ps
# Should show GPU layers in the "processor" column

# Check NVIDIA GPU usage
docker compose -f compose.yml -f compose.gpu-nvidia.yml exec ollama nvidia-smi

GPU Resource Limits

The GPU override file (compose.gpu-nvidia.yml) configures higher resource limits than CPU-only:

Setting | CPU-only | GPU (NVIDIA)
Memory limit | 12 GB | 24 GB
CPU limit | 4.0 | 8.0
Parallel requests | 1 | 2
Loaded models | 1 | 2

Adjust these in docker/compose.gpu-nvidia.yml to match your hardware.


Linux — AMD GPU (ROCm)

AMD GPU support requires the ROCm-specific Ollama image and device mappings. This is not provided as a built-in profile due to the different base image, but can be configured manually:

1. Install ROCm drivers

Follow the AMD ROCm installation guide.

2. Create a compose override

Create docker/compose.override.yml:

services:
  ollama:
    image: ollama/ollama:rocm
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    group_add:
      - video
      - render
    deploy:
      resources:
        limits:
          memory: 24G
          cpus: "8.0"
    environment:
      - OLLAMA_NUM_PARALLEL=2
      - OLLAMA_MAX_LOADED_MODELS=2
      - OLLAMA_FLASH_ATTENTION=1

3. Start services

cd docker && docker compose up -d

Docker Compose automatically merges compose.yml with compose.override.yml.


Model Configuration

The inference model is configurable via the OLLAMA_MODEL environment variable in docker/.env. The ollama-init container pulls this model on first start.

VRAM / RAM | Model | Parameter
8 GB | qwen2.5:7b | OLLAMA_MODEL=qwen2.5:7b
16 GB | qwen2.5:14b | OLLAMA_MODEL=qwen2.5:14b (default)
24 GB+ | qwen2.5:32b | OLLAMA_MODEL=qwen2.5:32b

To change the model:

# Edit docker/.env
OLLAMA_MODEL=qwen2.5:32b

# Restart ollama-init to pull the new model
cd docker && docker compose restart ollama-init

# Or pull manually
just docker-model-pull qwen2.5:32b

The embedding model (nomic-embed-text) is always pulled regardless of the OLLAMA_MODEL setting.


Troubleshooting

macOS: Ollama not reachable from Docker

# Verify Ollama is running
curl http://localhost:11434/api/tags

# Verify Docker can reach the host
docker run --rm --add-host=host.docker.internal:host-gateway \
  curlimages/curl curl -s http://host.docker.internal:11434/api/tags

# Check .env is correct
grep OLLAMA_BASE_URL docker/.env
# Should show: OLLAMA_BASE_URL=http://host.docker.internal:11434

NVIDIA: GPU not visible in container

# Check NVIDIA driver is loaded
nvidia-smi

# Check Container Toolkit is installed
nvidia-ctk --version

# Check Docker runtime
docker info | grep -i nvidia

# Test GPU access
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

Model download fails

# Check ollama-init logs
docker compose logs ollama-init

# Pull manually
docker compose exec ollama ollama pull qwen2.5:14b

# Or via Justfile
just docker-model-pull qwen2.5:14b

Performance is slow despite GPU

# Verify GPU layers are being used
ollama ps
# The "processor" column should show "gpu" or "metal", not "cpu"

# Check if model fits in VRAM — if it spills to RAM, inference slows down
# Reduce model size if VRAM is insufficient

HashiCorp Vault Setup

Secure secret management for PiSovereign using HashiCorp Vault

Vault is included in the Docker Compose stack and initialized automatically on first run. This guide covers how secrets are structured, how to store them, and how Vault integrates with PiSovereign.

Overview

HashiCorp Vault provides centralized secret management with encryption at rest and in transit, fine-grained access control, audit logging, and secret rotation. PiSovereign’s Docker Compose setup includes Vault with automatic initialization via the vault-init sidecar container.

How It Works

┌─────────────────────────────────────────────────────┐
│                    PiSovereign                       │
│  ┌─────────────────────────────────────────────┐   │
│  │           ChainedSecretStore                 │   │
│  │  ┌─────────────┐    ┌──────────────────┐   │   │
│  │  │ VaultSecret │ →  │ EnvironmentSecret │   │   │
│  │  │   Store     │    │     Store         │   │   │
│  │  └─────────────┘    └──────────────────┘   │   │
│  └─────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────────────────┐
│                 HashiCorp Vault                      │
│  ┌──────────────┐  ┌─────────────┐  ┌───────────┐ │
│  │ KV v2 Engine │  │   AppRole   │  │  Audit    │ │
│  │              │  │    Auth     │  │   Log     │ │
│  └──────────────┘  └─────────────┘  └───────────┘ │
└─────────────────────────────────────────────────────┘

PiSovereign uses a ChainedSecretStore that tries multiple backends in order:

  1. Vault (primary) — Production secrets stored securely
  2. Environment variables (fallback) — Overrides for development or CI
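
The chained lookup can be sketched as follows. This is a minimal illustration of the resolution order, not PiSovereign's actual Rust implementation; the class names and `get` interface are assumptions, and a dict stands in for the Vault backend:

```python
import os

class EnvironmentSecretStore:
    """Fallback store: maps logical names to PISOVEREIGN_* env vars."""
    def get(self, name):
        return os.environ.get("PISOVEREIGN_" + name.upper())

class DictSecretStore:
    """Stands in for the Vault-backed store in this sketch."""
    def __init__(self, secrets):
        self.secrets = secrets
    def get(self, name):
        return self.secrets.get(name)

class ChainedSecretStore:
    """Tries each backend in order and returns the first hit."""
    def __init__(self, *stores):
        self.stores = stores
    def get(self, name):
        for store in self.stores:
            value = store.get(name)
            if value is not None:
                return value
        return None

vault = DictSecretStore({"whatsapp_access_token": "from-vault"})
chain = ChainedSecretStore(vault, EnvironmentSecretStore())
print(chain.get("whatsapp_access_token"))  # "from-vault"
```

Because Vault is consulted first, an environment variable only takes effect for secrets that are absent from Vault.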

Initialization

Vault is initialized on first deployment via the Docker Compose init container. Run manually if needed:

cd docker
docker compose exec vault /vault/init.sh

Important: Save the unseal key and root token printed to stdout. Loss of the unseal key means loss of access to secrets.

After a container restart, Vault may need to be unsealed:

docker compose exec vault vault operator unseal <UNSEAL_KEY>

Storing Secrets

Store integration credentials in Vault after initialization:

# Enter the Vault container
docker compose exec vault sh

# WhatsApp credentials
vault kv put secret/pisovereign/whatsapp \
    access_token="your-meta-access-token" \
    app_secret="your-app-secret"

# Email credentials (IMAP/SMTP password or Bridge password)
vault kv put secret/pisovereign/email \
    password="your-email-password"

# CalDAV credentials
vault kv put secret/pisovereign/caldav \
    username="your-username" \
    password="your-password"

# OpenAI API key (for speech fallback)
vault kv put secret/pisovereign/openai \
    api_key="sk-your-openai-key"

# Brave Search API key
vault kv put secret/pisovereign/websearch \
    brave_api_key="BSA-your-key"

# Signal phone number
vault kv put secret/pisovereign/signal \
    phone_number="+491701234567"

# Verify a secret
vault kv get secret/pisovereign/whatsapp

Secret Paths

PiSovereign expects secrets at these paths:

| Secret | Vault Path | Key | Environment Variable Fallback |
|---|---|---|---|
| WhatsApp Access Token | secret/pisovereign/whatsapp | access_token | PISOVEREIGN_WHATSAPP_ACCESS_TOKEN |
| WhatsApp App Secret | secret/pisovereign/whatsapp | app_secret | PISOVEREIGN_WHATSAPP_APP_SECRET |
| Email Password | secret/pisovereign/email | password | PISOVEREIGN_EMAIL_PASSWORD |
| CalDAV Username | secret/pisovereign/caldav | username | PISOVEREIGN_CALDAV_USERNAME |
| CalDAV Password | secret/pisovereign/caldav | password | PISOVEREIGN_CALDAV_PASSWORD |
| OpenAI API Key | secret/pisovereign/openai | api_key | PISOVEREIGN_OPENAI_API_KEY |
| Brave Search Key | secret/pisovereign/websearch | brave_api_key | PISOVEREIGN_WEBSEARCH_BRAVE_API_KEY |
| Signal Phone Number | secret/pisovereign/signal | phone_number | PISOVEREIGN_SIGNAL__PHONE_NUMBER |

AppRole Authentication

For production, use AppRole instead of the root token. AppRole provides short-lived tokens with scoped permissions.

Create Policy

docker compose exec vault sh

vault policy write pisovereign - <<EOF
path "secret/data/pisovereign/*" {
  capabilities = ["read"]
}
path "secret/metadata/pisovereign/*" {
  capabilities = ["list"]
}
path "auth/token/renew-self" {
  capabilities = ["update"]
}
EOF

Configure AppRole

vault auth enable approle

vault write auth/approle/role/pisovereign \
    token_policies="pisovereign" \
    token_ttl=1h \
    token_max_ttl=4h \
    secret_id_ttl=720h \
    secret_id_num_uses=0

# Get Role ID
vault read auth/approle/role/pisovereign/role-id

# Generate Secret ID
vault write -f auth/approle/role/pisovereign/secret-id

Then configure PiSovereign to use AppRole in config.toml:

[vault]
address = "http://vault:8200"
role_id = "12345678-1234-1234-1234-123456789012"
secret_id = "abcd1234-abcd-1234-abcd-abcd12345678"
mount_path = "secret"
timeout_secs = 5

Tip: Store secret_id as an environment variable rather than in the config file:

export PISOVEREIGN_VAULT_SECRET_ID="abcd1234-..."
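
PiSovereign performs the AppRole login itself. For reference, the exchange against Vault's HTTP API (`POST /v1/auth/approle/login`, which returns the token under `auth.client_token`) looks like this minimal sketch; the helper names are illustrative:

```python
import json
import urllib.request

def build_approle_login(vault_addr, role_id, secret_id):
    """Build the AppRole login request (Vault HTTP API:
    POST /v1/auth/approle/login). Returns (url, json_body)."""
    url = vault_addr.rstrip("/") + "/v1/auth/approle/login"
    body = json.dumps({"role_id": role_id, "secret_id": secret_id})
    return url, body

def approle_login(vault_addr, role_id, secret_id):
    """Exchange AppRole credentials for a short-lived client token."""
    url, body = build_approle_login(vault_addr, role_id, secret_id)
    req = urllib.request.Request(
        url, data=body.encode(), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Vault returns the token under auth.client_token
        return json.load(resp)["auth"]["client_token"]

url, body = build_approle_login("http://vault:8200", "role-id", "secret-id")
print(url)  # http://vault:8200/v1/auth/approle/login
```

The returned token is what the `token_ttl` and `token_max_ttl` settings above apply to; when it expires, the client logs in again with the same role ID and secret ID.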

Operations

Secret Rotation

Update a secret without downtime — PiSovereign reads the latest version automatically:

vault kv put secret/pisovereign/whatsapp \
    access_token="new-access-token" \
    app_secret="same-app-secret"

View secret versions or rollback:

vault kv metadata get secret/pisovereign/whatsapp
vault kv rollback -version=2 secret/pisovereign/whatsapp

Backup

# Backup Vault data volume
docker run --rm -v docker_vault-data:/data -v $(pwd):/backup \
  alpine tar czf /backup/vault-backup-$(date +%Y%m%d).tar.gz /data

For disaster recovery, ensure you have the unseal key and root token stored securely in a separate location.


Troubleshooting

Cannot connect to Vault

docker compose exec vault vault status
docker compose logs vault

Permission denied

# Verify the token has the correct policy
docker compose exec vault vault token lookup
docker compose exec vault vault policy read pisovereign

Secret not found

# Verify the secret exists
docker compose exec vault vault kv get secret/pisovereign/whatsapp

# Check the mount path
docker compose exec vault vault secrets list

Vault sealed after restart

docker compose exec vault vault operator unseal <UNSEAL_KEY>

Next Steps

Configuration Reference

⚙️ Complete reference for all PiSovereign configuration options

This document covers every configuration option available in config.toml.

Overview

PiSovereign uses a layered configuration system:

  1. Default values - Built into the application
  2. Configuration file - config.toml in the working directory
  3. Environment variables - Override config file values (prefix: PISOVEREIGN_)
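
The precedence of the three layers can be sketched as a simple merge, later layers winning (a hedged illustration; key names are flattened here for brevity, not PiSovereign's internal representation):

```python
def load_config(defaults, file_values, env_overrides):
    """Sketch of the three-layer precedence: defaults < config.toml < env."""
    merged = dict(defaults)
    merged.update(file_values)   # config.toml overrides built-in defaults
    merged.update(env_overrides) # PISOVEREIGN_* env vars override everything
    return merged

config = load_config(
    defaults={"server.port": 3000, "server.host": "127.0.0.1"},
    file_values={"server.port": 8080},
    env_overrides={"server.host": "0.0.0.0"},  # e.g. PISOVEREIGN_SERVER_HOST
)
print(config)  # {'server.port': 8080, 'server.host': '0.0.0.0'}
```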

Configuration File Location

The application loads config.toml from the current working directory:

# Default location (relative to working directory)
./config.toml

Environment Variable Mapping

Config values can be overridden using environment variables:

[server]
port = 3000

# Becomes:
PISOVEREIGN_SERVER_PORT=3000

Nested values use double underscores:

[speech.local_stt]
threads = 4

# Becomes:
PISOVEREIGN_SPEECH_LOCAL_STT__THREADS=4

Environment Settings

# Application environment: "development" or "production"
# In production:
#   - JSON logging is enforced
#   - Security warnings block startup (unless PISOVEREIGN_ALLOW_INSECURE_CONFIG=true)
#   - TLS verification is enforced
environment = "development"

| Value | Description |
|---|---|
| development | Relaxed security, human-readable logs |
| production | Strict security, JSON logs, TLS enforced |

Server Settings

[server]
# Network interface to bind to
# "127.0.0.1" = localhost only (recommended for security)
# "0.0.0.0" = all interfaces (use behind reverse proxy)
host = "127.0.0.1"

# HTTP port
port = 3000

# Enable CORS (Cross-Origin Resource Sharing)
cors_enabled = true

# Allowed CORS origins
# Empty array = allow all (WARNING in production)
# Example: ["https://app.example.com", "https://admin.example.com"]
allowed_origins = []

# Graceful shutdown timeout (seconds)
# Time to wait for active requests to complete
shutdown_timeout_secs = 30

# Log format: "json" or "text"
# In production mode, defaults to "json" even if set to "text"
log_format = "text"

# Secure session cookies (requires HTTPS)
# Set to false for local HTTP development
secure_cookies = false

# Maximum request body size for JSON payloads (optional, bytes)
# max_body_size_json_bytes = 1048576  # 1MB

# Maximum request body size for audio uploads (optional, bytes)
# max_body_size_audio_bytes = 10485760  # 10MB

| Option | Type | Default | Description |
|---|---|---|---|
| host | String | 127.0.0.1 | Bind address |
| port | Integer | 3000 | HTTP port |
| cors_enabled | Boolean | true | Enable CORS |
| allowed_origins | Array | [] | CORS allowed origins |
| shutdown_timeout_secs | Integer | 30 | Shutdown grace period |
| log_format | String | text | Log output format |
| secure_cookies | Boolean | false | Secure cookie mode (HTTPS) |
| max_body_size_json_bytes | Integer | 1048576 | (Optional) Max JSON payload size |
| max_body_size_audio_bytes | Integer | 10485760 | (Optional) Max audio upload size |

Inference Engine

[inference]
# Ollama-compatible server URL
# Works with both hailo-ollama (Raspberry Pi) and standard Ollama (macOS)
base_url = "http://localhost:11434"

# Default model for inference
default_model = "qwen2.5:1.5b"

# Request timeout (milliseconds)
timeout_ms = 60000

# Maximum tokens to generate
max_tokens = 2048

# Sampling temperature (0.0 = deterministic, 2.0 = creative)
temperature = 0.7

# Top-p (nucleus) sampling (0.0-1.0)
top_p = 0.9

# System prompt (optional)
# system_prompt = "You are a helpful AI assistant."

| Option | Type | Default | Range | Description |
|---|---|---|---|---|
| base_url | String | http://localhost:11434 | - | Inference server URL |
| default_model | String | qwen2.5:1.5b | - | Model identifier |
| timeout_ms | Integer | 60000 | 1000-300000 | Request timeout |
| max_tokens | Integer | 2048 | 1-8192 | Max generation length |
| temperature | Float | 0.7 | 0.0-2.0 | Randomness |
| top_p | Float | 0.9 | 0.0-1.0 | Nucleus sampling |
| system_prompt | String | None | - | (Optional) System prompt |

Security Settings

[security]
# Whitelisted phone numbers for WhatsApp
# Empty = allow all, Example: ["+491234567890", "+491234567891"]
whitelisted_phones = []

# API Keys (hashed with Argon2id)
# Generate hashed keys using: pisovereign-cli hash-api-key <your-key>
# Migrate existing plaintext keys: pisovereign-cli migrate-keys --input config.toml --dry-run
#
# [[security.api_keys]]
# hash = "$argon2id$v=19$m=19456,t=2,p=1$..."
# user_id = "550e8400-e29b-41d4-a716-446655440000"
#
# [[security.api_keys]]
# hash = "$argon2id$v=19$m=19456,t=2,p=1$..."
# user_id = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

# Trusted reverse proxies (IP addresses) - optional
# Add your proxy IPs here if behind a reverse proxy
# trusted_proxies = ["127.0.0.1", "::1"]

# Rate limiting
rate_limit_enabled = true
rate_limit_rpm = 120  # Requests per minute per IP

# TLS settings for outbound connections
tls_verify_certs = true
connection_timeout_secs = 30
min_tls_version = "1.2"  # "1.2" or "1.3"

| Option | Type | Default | Description |
|---|---|---|---|
| whitelisted_phones | Array | [] | (Optional) Allowed phone numbers |
| api_keys | Array | [] | API key definitions with Argon2id hash |
| trusted_proxies | Array | - | (Optional) Trusted reverse proxy IPs |
| rate_limit_enabled | Boolean | true | Enable rate limiting |
| rate_limit_rpm | Integer | 120 | Requests/minute/IP |
| tls_verify_certs | Boolean | true | Verify TLS certificates for outbound connections |
| connection_timeout_secs | Integer | 30 | Connection timeout for external services |
| min_tls_version | String | 1.2 | Minimum TLS version ("1.2" or "1.3") |
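
The `rate_limit_rpm` behaviour amounts to a per-IP request counter over a one-minute window. A minimal fixed-window sketch (not PiSovereign's actual limiter, which may use a different algorithm):

```python
import time
from collections import defaultdict

class RateLimiter:
    """Fixed-window limiter: allow at most `rpm` requests per IP per minute."""
    def __init__(self, rpm=120):
        self.rpm = rpm
        self.windows = defaultdict(lambda: [0, 0])  # ip -> [window_start, count]

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        window = self.windows[ip]
        if now - window[0] >= 60:
            window[0], window[1] = now, 0  # start a fresh minute window
        window[1] += 1
        return window[1] <= self.rpm

limiter = RateLimiter(rpm=2)
print(limiter.allow("10.0.0.1", now=0))   # True
print(limiter.allow("10.0.0.1", now=1))   # True
print(limiter.allow("10.0.0.1", now=2))   # False (3rd request in window)
print(limiter.allow("10.0.0.1", now=61))  # True (window reset)
```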

Prompt Security

Protects against prompt injection and other AI security threats.

[prompt_security]
# Enable prompt security analysis
enabled = true

# Sensitivity level: "low", "medium", or "high"
# - low: Only block high-confidence threats
# - medium: Block medium and high confidence threats (recommended)
# - high: Block all detected threats including low confidence
sensitivity = "medium"

# Block requests when security threats are detected
block_on_detection = true

# Maximum violations before auto-blocking an IP
max_violations_before_block = 3

# Time window for counting violations (seconds)
violation_window_secs = 3600  # 1 hour

# How long to block an IP after exceeding max violations (seconds)
block_duration_secs = 86400  # 24 hours

# Immediately block IPs that send critical-level threats
auto_block_on_critical = true

# Custom patterns to detect (in addition to built-in patterns) - optional
# custom_patterns = ["DROP TABLE", "eval("]

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | Enable prompt security analysis |
| sensitivity | String | medium | Detection level: "low", "medium", or "high" |
| block_on_detection | Boolean | true | Block requests when threats detected |
| max_violations_before_block | Integer | 3 | Violations before IP auto-block |
| violation_window_secs | Integer | 3600 | Time window for counting violations |
| block_duration_secs | Integer | 86400 | IP block duration after violations |
| auto_block_on_critical | Boolean | true | Auto-block critical threats immediately |
| custom_patterns | Array | - | (Optional) Custom threat detection patterns |
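
The interaction of these settings — a sliding violation window, an auto-block threshold, and immediate blocking on critical threats — can be sketched as follows (an illustration of the documented policy, not the actual implementation):

```python
class ViolationTracker:
    """After max_violations within window_secs, block the IP for block_secs.
    Critical threats block immediately (auto_block_on_critical)."""
    def __init__(self, max_violations=3, window_secs=3600, block_secs=86400):
        self.max_violations = max_violations
        self.window_secs = window_secs
        self.block_secs = block_secs
        self.violations = {}     # ip -> list of violation timestamps
        self.blocked_until = {}  # ip -> time at which the block expires

    def is_blocked(self, ip, now):
        return self.blocked_until.get(ip, 0) > now

    def record_violation(self, ip, now, critical=False):
        if critical:  # auto_block_on_critical: skip the counting window
            self.blocked_until[ip] = now + self.block_secs
            return
        recent = [t for t in self.violations.get(ip, [])
                  if now - t < self.window_secs]
        recent.append(now)
        self.violations[ip] = recent
        if len(recent) >= self.max_violations:
            self.blocked_until[ip] = now + self.block_secs

tracker = ViolationTracker()
for t in (0, 10, 20):                 # three violations within the window
    tracker.record_violation("10.0.0.9", now=t)
print(tracker.is_blocked("10.0.0.9", now=30))     # True
print(tracker.is_blocked("10.0.0.9", now=86421))  # False (block expired)
```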

API Key Authentication

API keys are stored as Argon2id hashes rather than plaintext. Use the CLI tools to generate new keys and migrate existing ones.

Generate a new hashed key:

pisovereign-cli hash-api-key <your-api-key>

Migrate existing plaintext keys:

pisovereign-cli migrate-keys --input config.toml --dry-run
pisovereign-cli migrate-keys --input config.toml --output config-new.toml

Configuration:

[[security.api_keys]]
hash = "$argon2id$v=19$m=19456,t=2,p=1$..."
user_id = "550e8400-e29b-41d4-a716-446655440000"

Usage:

curl -H "Authorization: Bearer <your-api-key>" http://localhost:3000/v1/chat

Memory & Knowledge Storage

Persistent AI memory for RAG-based context retrieval. Stores interactions, facts, preferences, and corrections using embeddings for semantic similarity search.

[memory]
# Enable memory storage (default: true)
# enabled = true

# Enable RAG context retrieval (default: true)
# enable_rag = true

# Enable automatic learning from interactions (default: true)
# enable_learning = true

# Number of memories to retrieve for RAG context (default: 5)
# rag_limit = 5

# Minimum similarity threshold for RAG retrieval (0.0-1.0, default: 0.5)
# rag_threshold = 0.5

# Similarity threshold for memory deduplication (0.0-1.0, default: 0.85)
# merge_threshold = 0.85

# Minimum importance score to keep memories (default: 0.1)
# min_importance = 0.1

# Decay factor for memory importance over time (default: 0.95)
# decay_factor = 0.95

# Enable content encryption (default: true)
# enable_encryption = true

# Path to encryption key file (generated if not exists)
# encryption_key_path = "memory_encryption.key"

[memory.embedding]
# Embedding model name (default: nomic-embed-text)
# model = "nomic-embed-text"

# Embedding dimension (default: 384 for nomic-embed-text)
# dimension = 384

# Request timeout in milliseconds (default: 30000)
# timeout_ms = 30000

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | (Optional) Enable memory storage |
| enable_rag | Boolean | true | (Optional) Enable RAG context retrieval |
| enable_learning | Boolean | true | (Optional) Auto-learn from interactions |
| rag_limit | Integer | 5 | (Optional) Number of memories for RAG |
| rag_threshold | Float | 0.5 | (Optional) Min similarity for RAG (0.0-1.0) |
| merge_threshold | Float | 0.85 | (Optional) Similarity for deduplication (0.0-1.0) |
| min_importance | Float | 0.1 | (Optional) Min importance to keep memories |
| decay_factor | Float | 0.95 | (Optional) Importance decay over time |
| enable_encryption | Boolean | true | (Optional) Encrypt stored content |
| encryption_key_path | String | memory_encryption.key | (Optional) Encryption key file path |

Embedding Settings:

| Option | Type | Default | Description |
|---|---|---|---|
| embedding.model | String | nomic-embed-text | (Optional) Embedding model name |
| embedding.dimension | Integer | 384 | (Optional) Embedding vector dimension |
| embedding.timeout_ms | Integer | 30000 | (Optional) Request timeout |
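
The interplay of `decay_factor` and `min_importance` — each memory's importance shrinks over time, and memories that fall below the floor are pruned — can be sketched like this (an illustration of the documented behaviour, with a made-up in-memory representation):

```python
def decay_and_prune(memories, decay_factor=0.95, min_importance=0.1):
    """Apply one decay step to each memory's importance and drop any
    memory whose importance falls below min_importance."""
    kept = []
    for memory in memories:
        memory["importance"] *= decay_factor
        if memory["importance"] >= min_importance:
            kept.append(memory)
    return kept

memories = [
    {"content": "user prefers metric units", "importance": 0.9},
    {"content": "one-off smalltalk", "importance": 0.102},
]
memories = decay_and_prune(memories)
print(len(memories))  # 1 -- the low-importance memory decayed below 0.1
```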

Database & Cache

Database

[database]
# SQLite database file path
path = "pisovereign.db"

# Connection pool size
max_connections = 5

# Auto-run migrations on startup
run_migrations = true

| Option | Type | Default | Description |
|---|---|---|---|
| path | String | pisovereign.db | Database file path |
| max_connections | Integer | 5 | Pool size |
| run_migrations | Boolean | true | Auto-migrate |

Cache

PiSovereign uses a 3-layer caching architecture:

  1. L1 (Moka) - In-memory cache for fastest access
  2. L2 (Redb) - Persistent disk cache for exact-match lookups
  3. L3 (Semantic) - pgvector-based similarity cache for semantically equivalent queries
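
The lookup order above can be sketched as a tiered cache where a hit in a lower layer is promoted upward. This is a simplified illustration: dicts stand in for the real backends (Moka, Redb, pgvector), and the semantic L3 layer is omitted:

```python
class TieredCache:
    """Sketch of the L1 -> L2 lookup order with promotion on L2 hits.
    L3 (semantic, pgvector-based) is omitted from this sketch."""
    def __init__(self):
        self.l1 = {}  # in-memory (fastest)
        self.l2 = {}  # persistent exact-match

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            self.l1[key] = self.l2[key]  # promote to L1 on an L2 hit
            return self.l2[key]
        return None

    def put(self, key, value):
        self.l1[key] = value
        self.l2[key] = value

cache = TieredCache()
cache.put("greeting", "hello")
cache.l1.clear()               # simulate an L1 eviction (or a restart)
print(cache.get("greeting"))   # "hello" -- served from L2, promoted to L1
```
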
[cache]
# Enable caching (disable for debugging)
enabled = true

# TTL values (seconds)
ttl_short_secs = 300       # 5 minutes - frequently changing
ttl_medium_secs = 3600     # 1 hour - moderately stable
ttl_long_secs = 86400      # 24 hours - stable data

# LLM response caching
ttl_llm_dynamic_secs = 3600   # Dynamic content (briefings)
ttl_llm_stable_secs = 86400   # Stable content (help text)

# L1 (in-memory) cache size
l1_max_entries = 10000

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | Enable caching |
| ttl_short_secs | Integer | 300 | Short TTL |
| ttl_medium_secs | Integer | 3600 | Medium TTL |
| ttl_long_secs | Integer | 86400 | Long TTL |
| ttl_llm_dynamic_secs | Integer | 3600 | Dynamic LLM TTL |
| ttl_llm_stable_secs | Integer | 86400 | Stable LLM TTL |
| l1_max_entries | Integer | 10000 | Max memory cache entries |

Semantic Cache

The semantic cache provides an additional layer that matches queries based on embedding similarity rather than exact string matching. This enables cache hits for semantically equivalent queries like:

  • “What’s the weather?” ≈ “How’s the weather today?”
  • “Tell me about the capital of France” ≈ “What is Paris?”
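
A semantic hit amounts to comparing the query's embedding against cached embeddings by cosine similarity and accepting the best match only if it clears `similarity_threshold`. A minimal sketch (the embedding vectors here are made up; real entries use nomic-embed-text vectors):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_hit(query_vec, cached, threshold=0.92):
    """Return the cached answer whose embedding is most similar to the
    query, if the similarity clears the threshold; otherwise None."""
    best = max(cached, key=lambda e: cosine_similarity(query_vec, e["vec"]))
    if cosine_similarity(query_vec, best["vec"]) >= threshold:
        return best["answer"]
    return None

cached = [{"vec": [1.0, 0.0], "answer": "sunny"}]
print(semantic_hit([0.99, 0.05], cached))  # "sunny" (similarity ~0.999)
print(semantic_hit([0.0, 1.0], cached))    # None (orthogonal query)
```

A higher threshold (closer to 1.0) trades cache hit rate for answer accuracy, which is why time-sensitive queries are excluded via `bypass_patterns` rather than by tightening the threshold.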
[cache.semantic]
# Enable semantic caching
enabled = true

# Minimum cosine similarity for cache hit (0.0-1.0)
# Higher = stricter matching, lower = more cache hits
similarity_threshold = 0.92

# TTL for cached entries (hours)
ttl_hours = 48

# Maximum cached entries
max_entries = 10000

# Patterns that bypass semantic cache (time-sensitive queries)
bypass_patterns = ["weather", "time", "date", "today", "tomorrow", "now", "latest", "current", "recent"]

# How often to evict expired entries (minutes)
eviction_interval_minutes = 60

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | Enable semantic caching |
| similarity_threshold | Float | 0.92 | Minimum cosine similarity (0.0-1.0) |
| ttl_hours | Integer | 48 | Time-to-live in hours |
| max_entries | Integer | 10000 | Maximum cache entries |
| bypass_patterns | Array | See above | Queries containing these words skip cache |
| eviction_interval_minutes | Integer | 60 | Expired entry cleanup interval |

Integrations

Messenger Selection

PiSovereign supports one messenger at a time:

# Choose one: "whatsapp", "signal", or "none"
messenger = "whatsapp"

| Value | Description |
|---|---|
| whatsapp | Use WhatsApp Business API (webhooks) |
| signal | Use Signal via signal-cli (polling) |
| none | Disable messenger integration |

WhatsApp Business

[whatsapp]
# Meta Graph API access token (store in Vault)
# access_token = "your-access-token"

# Phone number ID from WhatsApp Business
# phone_number_id = "your-phone-number-id"

# App secret for webhook signature verification
# app_secret = "your-app-secret"

# Verify token for webhook setup
# verify_token = "your-verify-token"

# Require webhook signature verification
signature_required = true

# Meta Graph API version
api_version = "v18.0"

# Phone numbers allowed to send messages (empty = allow all)
# whitelist = ["+1234567890"]

# Conversation Persistence Settings
[whatsapp.persistence]
# Enable conversation persistence (default: true)
# enabled = true

# Enable encryption for stored messages (default: true)
# enable_encryption = true

# Enable RAG context retrieval from memory system (default: true)
# enable_rag = true

# Enable automatic learning from interactions (default: true)
# enable_learning = true

# Maximum days to retain conversations (optional, unlimited if not set)
# retention_days = 90

# Maximum messages per conversation before FIFO truncation (optional)
# max_messages_per_conversation = 1000

# Number of recent messages to use as context (default: 50)
# context_window = 50

| Option | Type | Default | Description |
|---|---|---|---|
| access_token | String | - | (Optional) Meta Graph API token (store in Vault) |
| phone_number_id | String | - | (Optional) WhatsApp Business phone number ID |
| app_secret | String | - | (Optional) Webhook signature secret |
| verify_token | String | - | (Optional) Webhook verification token |
| signature_required | Boolean | true | Require webhook signature verification |
| api_version | String | v18.0 | Meta Graph API version |
| whitelist | Array | [] | (Optional) Allowed phone numbers |

Persistence Options:

| Option | Type | Default | Description |
|---|---|---|---|
| persistence.enabled | Boolean | true | (Optional) Store conversations in database |
| persistence.enable_encryption | Boolean | true | (Optional) Encrypt stored messages |
| persistence.enable_rag | Boolean | true | (Optional) Enable RAG context retrieval |
| persistence.enable_learning | Boolean | true | (Optional) Auto-learn from interactions |
| persistence.retention_days | Integer | - | (Optional) Max retention days (unlimited if not set) |
| persistence.max_messages_per_conversation | Integer | - | (Optional) Max messages before truncation |
| persistence.context_window | Integer | 50 | (Optional) Recent messages for context |

Signal Messenger

[signal]
# Your phone number registered with Signal (E.164 format)
phone_number = "+1234567890"

# Path to signal-cli JSON-RPC socket
socket_path = "/var/run/signal-cli/socket"

# Path to signal-cli data directory (optional)
# data_path = "/var/lib/signal-cli"

# Connection timeout in milliseconds
timeout_ms = 30000

# Phone numbers allowed to send messages (empty = allow all)
# whitelist = ["+1234567890", "+0987654321"]

# Conversation Persistence Settings
[signal.persistence]
# Enable conversation persistence (default: true)
# enabled = true

# Enable encryption for stored messages (default: true)
# enable_encryption = true

# Enable RAG context retrieval from memory system (default: true)
# enable_rag = true

# Enable automatic learning from interactions (default: true)
# enable_learning = true

# Maximum days to retain conversations (optional, unlimited if not set)
# retention_days = 90

# Maximum messages per conversation before FIFO truncation (optional)
# max_messages_per_conversation = 1000

# Number of recent messages to use as context (default: 50)
# context_window = 50

| Option | Type | Default | Description |
|---|---|---|---|
| phone_number | String | - | Your Signal phone number (E.164) |
| socket_path | String | /var/run/signal-cli/socket | signal-cli daemon socket |
| data_path | String | - | (Optional) signal-cli data directory |
| timeout_ms | Integer | 30000 | Connection timeout |
| whitelist | Array | [] | (Optional) Allowed phone numbers |

Persistence Options:

| Option | Type | Default | Description |
|---|---|---|---|
| persistence.enabled | Boolean | true | (Optional) Store conversations in database |
| persistence.enable_encryption | Boolean | true | (Optional) Encrypt stored messages |
| persistence.enable_rag | Boolean | true | (Optional) Enable RAG context retrieval |
| persistence.enable_learning | Boolean | true | (Optional) Auto-learn from interactions |
| persistence.retention_days | Integer | - | (Optional) Max retention days (unlimited if not set) |
| persistence.max_messages_per_conversation | Integer | - | (Optional) Max messages before truncation |
| persistence.context_window | Integer | 50 | (Optional) Recent messages for context |

📖 See Signal Setup Guide for installation instructions.

Speech Processing

Voice message support for speech-to-text (STT) and text-to-speech (TTS).

Cloud Provider (OpenAI):

  • Works on all platforms
  • Requires API key

Local Provider (whisper.cpp + Piper):

  • Raspberry Pi: Models in /usr/local/share/{whisper,piper}/
  • macOS: Models in ~/Library/Application Support/{whisper,piper}/
  • Install whisper.cpp: brew install whisper-cpp (Mac) or build from source (Pi)
  • Install Piper: Download from https://github.com/rhasspy/piper/releases
[speech]
# Speech provider: "openai" (cloud) or "local" (whisper.cpp + Piper)
# provider = "openai"

# OpenAI API key for Whisper (STT) and TTS
# openai_api_key = "sk-..."

# OpenAI API base URL (for custom endpoints)
# openai_base_url = "https://api.openai.com/v1"

# Speech-to-text model (OpenAI Whisper)
# stt_model = "whisper-1"

# Text-to-speech model
# tts_model = "tts-1"

# Default TTS voice: alloy, echo, fable, onyx, nova, shimmer
# default_voice = "nova"

# Output audio format: opus, ogg, mp3, wav
# output_format = "opus"

# Request timeout in milliseconds
# timeout_ms = 60000

# Maximum audio duration in milliseconds (25 min for Whisper)
# max_audio_duration_ms = 1500000

# Response format preference: mirror, text, voice
# response_format = "mirror"

# TTS speaking speed (0.25 to 4.0)
# speed = 1.0

| Option | Type | Default | Description |
|---|---|---|---|
| provider | String | openai | (Optional) Speech provider: "openai" or "local" |
| openai_api_key | String | - | (Optional) OpenAI API key (store in Vault) |
| openai_base_url | String | https://api.openai.com/v1 | (Optional) OpenAI API base URL |
| stt_model | String | whisper-1 | (Optional) Speech-to-text model |
| tts_model | String | tts-1 | (Optional) Text-to-speech model |
| default_voice | String | nova | (Optional) TTS voice (alloy, echo, fable, onyx, nova, shimmer) |
| output_format | String | opus | (Optional) Audio format (opus, ogg, mp3, wav) |
| timeout_ms | Integer | 60000 | (Optional) Request timeout |
| max_audio_duration_ms | Integer | 1500000 | (Optional) Max audio duration (25 minutes) |
| response_format | String | mirror | (Optional) Response format (mirror, text, voice) |
| speed | Float | 1.0 | (Optional) TTS speaking speed (0.25 to 4.0) |

Weather

[weather]
# Open-Meteo API (free, no key required)
# base_url = "https://api.open-meteo.com/v1"

# Connection timeout in seconds
# timeout_secs = 30

# Number of forecast days (1-16)
# forecast_days = 7

# Cache TTL in minutes
# cache_ttl_minutes = 30

# Default location (when user has no profile)
# default_location = { latitude = 52.52, longitude = 13.405 }  # Berlin

| Option | Type | Default | Description |
|---|---|---|---|
| base_url | String | https://api.open-meteo.com/v1 | (Optional) Open-Meteo API URL |
| timeout_secs | Integer | 30 | (Optional) Request timeout |
| forecast_days | Integer | 7 | (Optional) Forecast days (1-16) |
| cache_ttl_minutes | Integer | 30 | (Optional) Cache TTL |
| default_location | Object | - | (Optional) Default location { latitude, longitude } |

CalDAV Calendar

[caldav]
# CalDAV server URL (Baïkal, Radicale, Nextcloud)
# server_url = "https://cal.example.com"
# When using Baïkal via Docker (setup --baikal):
# server_url = "http://baikal:80/dav.php"

# Authentication (store in Vault)
# username = "your-username"
# password = "your-password"

# Default calendar path (optional)
# calendar_path = "/calendars/user/default"

# TLS verification
# verify_certs = true

# Connection timeout in seconds
# timeout_secs = 30

| Option | Type | Default | Description |
|---|---|---|---|
| server_url | String | - | (Optional) CalDAV server URL |
| username | String | - | (Optional) Username for authentication (store in Vault) |
| password | String | - | (Optional) Password for authentication (store in Vault) |
| calendar_path | String | /calendars/user/default | (Optional) Default calendar path |
| verify_certs | Boolean | true | (Optional) Verify TLS certificates |
| timeout_secs | Integer | 30 | (Optional) Connection timeout |

Email (IMAP/SMTP)

PiSovereign supports any email provider that offers IMAP/SMTP access, including Gmail, Outlook, Proton Mail (via Bridge), and custom servers. Authentication is supported via password or OAuth2 (XOAUTH2).

Migration note: The config section was previously named [proton]. The old name still works (via a serde alias) but [email] is the canonical name going forward.

Quick setup with provider presets:

The easiest way to configure email is using the provider field, which automatically sets sensible defaults for IMAP/SMTP hosts and ports:

[email]
provider = "gmail"    # or "proton" or "custom"
email = "user@gmail.com"
password = "app-password"

Available providers:

| Provider | IMAP Host | IMAP Port | SMTP Host | SMTP Port |
|---|---|---|---|---|
| proton | 127.0.0.1 | 1143 | 127.0.0.1 | 1025 |
| gmail | imap.gmail.com | 993 | smtp.gmail.com | 465 |
| custom | (must specify) | (must specify) | (must specify) | (must specify) |

Explicit imap_host, imap_port, smtp_host, smtp_port values always override provider presets.
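
The preset/override precedence amounts to a simple merge: start from the preset for the chosen provider, then apply any explicit values. A hedged sketch (the preset values come from the table above; the function name is illustrative):

```python
PRESETS = {
    "proton": {"imap_host": "127.0.0.1", "imap_port": 1143,
               "smtp_host": "127.0.0.1", "smtp_port": 1025},
    "gmail":  {"imap_host": "imap.gmail.com", "imap_port": 993,
               "smtp_host": "smtp.gmail.com", "smtp_port": 465},
    "custom": {},  # custom: all hosts/ports must be specified explicitly
}

def resolve_email_config(config):
    """Merge the provider preset with explicit overrides from config.toml."""
    resolved = dict(PRESETS[config.get("provider", "proton")])
    resolved.update({k: v for k, v in config.items() if k != "provider"})
    return resolved

print(resolve_email_config({"provider": "gmail", "smtp_port": 587}))
# Gmail IMAP preset kept; smtp_port explicitly overridden to 587
```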

Full configuration:

[email]
# Provider preset: "proton" (default), "gmail", or "custom"
# provider = "proton"

# IMAP server host (overrides provider preset)
# imap_host = "imap.gmail.com"       # Gmail
# imap_host = "outlook.office365.com" # Outlook
# imap_host = "127.0.0.1"            # Proton Bridge

# IMAP server port (993 for TLS, 1143 for Proton Bridge STARTTLS)
# imap_port = 993

# SMTP server host
# smtp_host = "smtp.gmail.com"       # Gmail
# smtp_host = "smtp.office365.com"   # Outlook
# smtp_host = "127.0.0.1"           # Proton Bridge

# SMTP server port (465 for TLS, 587 for STARTTLS, 1025 for Proton Bridge)
# smtp_port = 465

# Email address
# email = "user@gmail.com"

# Authentication: password or OAuth2
# For password-based auth (app passwords, Bridge passwords):
# password = "app-password"

# For OAuth2 (Gmail, Outlook):
# [email.auth]
# type = "oauth2"
# access_token = "ya29.your-token"

# TLS configuration
[email.tls]
# Verify TLS certificates (set false for self-signed certs like Proton Bridge)
# verify_certificates = true

# Minimum TLS version
# min_tls_version = "1.2"

# Custom CA certificate path (optional)
# ca_cert_path = "/path/to/ca.pem"

| Option | Type | Default | Description |
|---|---|---|---|
| provider | String | proton | (Optional) Provider preset: proton, gmail, or custom. Sets default host/port values. |
| imap_host | String | 127.0.0.1 | (Optional) IMAP server host (overrides provider preset) |
| imap_port | Integer | 1143 | (Optional) IMAP server port (overrides provider preset) |
| smtp_host | String | 127.0.0.1 | (Optional) SMTP server host (overrides provider preset) |
| smtp_port | Integer | 1025 | (Optional) SMTP server port (overrides provider preset) |
| email | String | - | (Optional) Email address (store in Vault) |
| password | String | - | (Optional) Password (store in Vault) |
| auth.type | String | password | (Optional) Auth method: password or oauth2 |
| auth.access_token | String | - | (Optional) OAuth2 access token (store in Vault) |
| tls.verify_certificates | Boolean | true | (Optional) Verify TLS certificates |
| tls.min_tls_version | String | 1.2 | (Optional) Minimum TLS version |
| tls.ca_cert_path | String | - | (Optional) Custom CA certificate path |

Provider-specific examples:

Gmail

[email]
provider = "gmail"
email = "user@gmail.com"
# Use an App Password (not your Google account password)
# Generate at: https://myaccount.google.com/apppasswords
password = "xxxx xxxx xxxx xxxx"

Outlook / Microsoft 365

[email]
provider = "custom"
imap_host = "outlook.office365.com"
imap_port = 993
smtp_host = "smtp.office365.com"
smtp_port = 587
email = "user@outlook.com"
password = "your-app-password"

Proton Mail (via Bridge)

[email]
provider = "proton"  # default — uses Bridge at 127.0.0.1
email = "user@proton.me"
# Use the Bridge password (from Bridge UI), NOT your Proton account password
password = "bridge-password"

[email.tls]
verify_certificates = false  # Bridge uses self-signed certs
Web Search

[websearch]
# Brave Search API key (required for primary provider)
# Get your key at: https://brave.com/search/api/
# api_key = "BSA-your-brave-api-key"

# Maximum results per search query (default: 5)
max_results = 5

# Request timeout in seconds (default: 30)
timeout_secs = 30

# Enable DuckDuckGo fallback if Brave fails (default: true)
fallback_enabled = true

# Safe search: "off", "moderate", "strict" (default: "moderate")
safe_search = "moderate"

# Country code for localized results (e.g., "US", "DE", "GB")
country = "DE"

# Language code for results (e.g., "en", "de", "fr")
language = "de"

# Rate limit: requests per minute (default: 60)
rate_limit_rpm = 60

# Cache TTL in minutes (default: 30)
cache_ttl_minutes = 30

| Option | Type | Default | Description |
|---|---|---|---|
| api_key | String | - | (Optional) Brave Search API key (store in Vault) |
| max_results | Integer | 5 | (Optional) Max search results (1-10) |
| timeout_secs | Integer | 30 | (Optional) Request timeout |
| fallback_enabled | Boolean | true | (Optional) Enable DuckDuckGo fallback |
| safe_search | String | moderate | (Optional) Safe search: "off", "moderate", "strict" |
| country | String | DE | (Optional) Country code for results |
| language | String | de | (Optional) Language code for results |
| rate_limit_rpm | Integer | 60 | (Optional) Rate limit (requests/minute) |
| cache_ttl_minutes | Integer | 30 | (Optional) Cache time-to-live |

Security Note: Store the Brave API key in Vault rather than config.toml:

vault kv put secret/pisovereign/websearch brave_api_key="BSA-..."

Public Transit (ÖPNV)

Provides public transit routing for German transport networks via the transport.rest API (a HAFAS wrapper). Used for "How do I get to X?" queries and location-based reminders.

[transit]
# Base URL for transport.rest API (default: v6.db.transport.rest)
# base_url = "https://v6.db.transport.rest"

# Request timeout in seconds
# timeout_secs = 10

# Maximum number of journey results
# max_results = 3

# Cache TTL in minutes
# cache_ttl_minutes = 5

# Include transit info in location-based reminders
# include_in_reminders = true

# Transport modes to include:
# products_bus = true
# products_suburban = true  # S-Bahn
# products_subway = true    # U-Bahn
# products_tram = true
# products_regional = true  # RB/RE
# products_national = false # ICE/IC

# User's home location for route calculations
# home_location = { latitude = 52.52, longitude = 13.405 }  # Berlin

| Option | Type | Default | Description |
|---|---|---|---|
| base_url | String | https://v6.db.transport.rest | (Optional) transport.rest API URL |
| timeout_secs | Integer | 10 | (Optional) Request timeout |
| max_results | Integer | 3 | (Optional) Max journey results |
| cache_ttl_minutes | Integer | 5 | (Optional) Cache TTL |
| include_in_reminders | Boolean | true | (Optional) Include in location reminders |
| products_bus | Boolean | true | (Optional) Include bus routes |
| products_suburban | Boolean | true | (Optional) Include S-Bahn |
| products_subway | Boolean | true | (Optional) Include U-Bahn |
| products_tram | Boolean | true | (Optional) Include tram |
| products_regional | Boolean | true | (Optional) Include regional trains (RB/RE) |
| products_national | Boolean | false | (Optional) Include national trains (ICE/IC) |
| home_location | Object | - | (Optional) Home location { latitude, longitude } |

Reminder System

Configures the proactive reminder system including CalDAV sync, custom reminders, and scheduling settings.

[reminder]
# Maximum number of snoozes per reminder
# max_snooze = 5

# Default snooze duration in minutes
# default_snooze_minutes = 15

# How far in advance to create reminders from CalDAV events (minutes)
# caldav_reminder_lead_time_minutes = 30

# Interval for checking due reminders (seconds)
# check_interval_secs = 60

# CalDAV sync interval (minutes)
# caldav_sync_interval_minutes = 15

# Morning briefing time (HH:MM format)
# morning_briefing_time = "07:00"

# Enable morning briefing
# morning_briefing_enabled = true

| Option | Type | Default | Description |
|---|---|---|---|
| max_snooze | Integer | 5 | (Optional) Max snoozes per reminder |
| default_snooze_minutes | Integer | 15 | (Optional) Default snooze duration |
| caldav_reminder_lead_time_minutes | Integer | 30 | (Optional) CalDAV event advance notice |
| check_interval_secs | Integer | 60 | (Optional) How often to check for due reminders |
| caldav_sync_interval_minutes | Integer | 15 | (Optional) CalDAV sync frequency |
| morning_briefing_time | String | 07:00 | (Optional) Morning briefing time (HH:MM) |
| morning_briefing_enabled | Boolean | true | (Optional) Enable daily morning briefing |

Model Selector (Deprecated)

Deprecated since v0.6.0: Use [model_routing] instead. See Adaptive Model Routing.

The old [model_selector] section with small_model / large_model is still accepted but will be removed in a future release.


Adaptive Model Routing

Routes requests to different LLM models based on complexity. See the dedicated Adaptive Model Routing page for full documentation.

[model_routing]
enabled = true

[model_routing.models]
trivial = "template"
simple = "gemma3:1b"
moderate = "gemma3:4b"
complex = "gemma3:12b"

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | false | Enable adaptive routing |
| models.trivial | String | "template" | Model for trivial tier (usually "template") |
| models.simple | String | "gemma3:1b" | Small model for simple queries |
| models.moderate | String | "gemma3:4b" | Medium model for moderate queries |
| models.complex | String | "gemma3:12b" | Large model for complex queries |
| classification.confidence_threshold | Float | 0.6 | Below this, upgrade tier |

Telemetry

[telemetry]
# Enable OpenTelemetry export
enabled = false

# OTLP endpoint (Tempo, Jaeger)
# otlp_endpoint = "http://localhost:4317"

# Sampling ratio (0.0-1.0, 1.0 = all traces)
# sample_ratio = 1.0

# Service name for traces
# service_name = "pisovereign"

# Log level filter (e.g., "info", "debug", "pisovereign=debug,tower_http=info")
# log_filter = "pisovereign=info,tower_http=info"

# Batch export timeout in seconds
# export_timeout_secs = 30

# Maximum batch size for trace export
# max_batch_size = 512

# Graceful fallback to console-only logging if OTLP collector is unavailable.
# When true (default), the application starts with console logging if the collector
# cannot be reached. Set to false to require a working collector in production.
# graceful_fallback = true

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | false | Enable OpenTelemetry export |
| otlp_endpoint | String | http://localhost:4317 | (Optional) OTLP collector endpoint |
| sample_ratio | Float | 1.0 | (Optional) Trace sampling ratio (0.0-1.0) |
| service_name | String | pisovereign | (Optional) Service name for traces |
| log_filter | String | pisovereign=info,tower_http=info | (Optional) Log level filter |
| export_timeout_secs | Integer | 30 | (Optional) Batch export timeout |
| max_batch_size | Integer | 512 | (Optional) Max batch size for export |
| graceful_fallback | Boolean | true | (Optional) Fall back to console logging if collector unavailable |

Resilience

Degraded Mode

[degraded_mode]
# Enable fallback when backend unavailable
enabled = true

# Message returned during degraded mode
unavailable_message = "I'm currently experiencing technical difficulties. Please try again in a moment."

# Cooldown before retrying primary backend (seconds)
retry_cooldown_secs = 30

# Number of failures before entering degraded mode
failure_threshold = 3

# Number of successes required to exit degraded mode
success_threshold = 2

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | Enable degraded mode fallback |
| unavailable_message | String | See above | Message returned during degraded mode |
| retry_cooldown_secs | Integer | 30 | Cooldown before retrying primary backend |
| failure_threshold | Integer | 3 | Failures before entering degraded mode |
| success_threshold | Integer | 2 | Successes to exit degraded mode |
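
The two thresholds implement a simple circuit-breaker style state machine: consecutive failures trip degraded mode, consecutive successes restore normal operation. A minimal sketch of that logic (illustrative Python, not the actual implementation; the retry_cooldown_secs timer is omitted):

```python
class DegradedMode:
    """Tracks consecutive failures/successes against the configured thresholds."""

    def __init__(self, failure_threshold=3, success_threshold=2):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.failures = 0
        self.successes = 0
        self.degraded = False

    def record_failure(self):
        self.failures += 1
        self.successes = 0
        if self.failures >= self.failure_threshold:
            self.degraded = True  # start returning unavailable_message

    def record_success(self):
        self.successes += 1
        self.failures = 0
        if self.degraded and self.successes >= self.success_threshold:
            self.degraded = False  # primary backend is healthy again


mode = DegradedMode()
for _ in range(3):
    mode.record_failure()
print(mode.degraded)  # True after 3 consecutive failures
mode.record_success()
mode.record_success()
print(mode.degraded)  # False after 2 consecutive successes
```

Note that a single success resets the failure counter, so only an unbroken run of failures enters degraded mode.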

Retry Configuration

Exponential backoff for retrying failed requests.

[retry]
# Initial delay before first retry in milliseconds
initial_delay_ms = 100

# Maximum delay between retries in milliseconds
max_delay_ms = 10000

# Multiplier for exponential backoff (delay = initial * multiplier^attempt)
multiplier = 2.0

# Maximum number of retry attempts
max_retries = 3

| Option | Type | Default | Description |
|---|---|---|---|
| initial_delay_ms | Integer | 100 | Initial retry delay (milliseconds) |
| max_delay_ms | Integer | 10000 | Maximum retry delay (milliseconds) |
| multiplier | Float | 2.0 | Exponential backoff multiplier |
| max_retries | Integer | 3 | Maximum retry attempts |

Formula: delay = min(initial_delay * multiplier^attempt, max_delay)
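
With the defaults above, the schedule can be computed directly from the formula (illustrative Python):

```python
def retry_delays(initial_delay_ms=100, max_delay_ms=10000, multiplier=2.0, max_retries=3):
    """delay = min(initial_delay * multiplier^attempt, max_delay), one entry per attempt."""
    return [min(initial_delay_ms * multiplier**attempt, max_delay_ms)
            for attempt in range(max_retries)]


print(retry_delays())               # [100.0, 200.0, 400.0]
print(retry_delays(max_retries=8))  # last entry capped at 10000.0
```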


Health Checks

[health]
# Global timeout for all health checks in seconds
global_timeout_secs = 5

# Service-specific timeout overrides (uncomment to customize):
# inference_timeout_secs = 10
# email_timeout_secs = 5
# calendar_timeout_secs = 5
# weather_timeout_secs = 5

| Option | Type | Default | Description |
|---|---|---|---|
| global_timeout_secs | Integer | 5 | Global timeout for all health checks |
| inference_timeout_secs | Integer | 5 | (Optional) Inference service timeout override |
| email_timeout_secs | Integer | 5 | (Optional) Email service timeout override |
| calendar_timeout_secs | Integer | 5 | (Optional) Calendar service timeout override |
| weather_timeout_secs | Integer | 5 | (Optional) Weather service timeout override |

Event Bus

The in-process event bus decouples post-processing from the user-facing response path. When enabled, background handlers perform fact extraction, audit logging, conversation persistence verification, and metrics collection asynchronously, reducing perceived latency by 100–500 ms per request.
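
Conceptually, the request path publishes an event to a bounded channel and returns immediately, while handlers drain it in the background. A rough sketch of that decoupling (illustrative Python with an asyncio queue; the real bus is an in-process Rust broadcast channel):

```python
import asyncio


async def demo():
    bus = asyncio.Queue(maxsize=1024)  # mirrors channel_capacity
    handled = []

    async def metrics_handler():
        # Background handler: consumes events off the response path.
        while True:
            event = await bus.get()
            handled.append(event["type"])
            bus.task_done()

    task = asyncio.create_task(metrics_handler())

    # The user-facing path publishes and returns without waiting:
    await bus.put({"type": "chat_completed"})

    await bus.join()  # demo only; the real response path never blocks here
    task.cancel()
    return handled


print(asyncio.run(demo()))  # ['chat_completed']
```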

[events]
# Enable or disable the event bus (default: true)
enabled = true

# Broadcast channel buffer capacity (default: 1024)
# Increase if handlers can't keep up under high load.
channel_capacity = 1024

# Error handling policy: "log" or "retry" (default: "log", reserved for future use)
# handler_error_policy = "log"

# Retry settings (reserved for future use)
# max_retry_attempts = 3
# retry_delay_ms = 500

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | true | Enable or disable the event bus |
| channel_capacity | Integer | 1024 | Broadcast channel buffer size. Values 256–4096 suit most workloads |
| handler_error_policy | String | "log" | (Reserved) "log" = log-and-continue, "retry" = retry with backoff |
| max_retry_attempts | Integer | 3 | (Reserved) Max retries when policy is "retry" |
| retry_delay_ms | Integer | 500 | (Reserved) Base delay between retries in milliseconds |

Background handlers spawned automatically:

| Handler | Requires | Purpose |
|---|---|---|
| FactExtractionHandler | Memory context | Extracts structured facts from conversations via LLM |
| AuditLogHandler | Database | Records audit trail entries for chat/command/security events |
| ConversationPersistenceHandler | Conversation store | Verifies conversation integrity after each interaction |
| MetricsHandler | (always) | Feeds event data into the metrics collector |

Tip: Set enabled = false to disable all background processing and fall back to synchronous inline behavior.


Agentic Mode

Multi-agent orchestration for complex tasks. When enabled, the system decomposes complex user requests into parallel sub-tasks, each handled by an independent AI agent.

Note: Requires enabled = true in the [agent.tool_calling] section.

[agentic]
# Enable agentic mode (default: false)
enabled = false

# Maximum concurrent sub-agents running in parallel
max_concurrent_sub_agents = 4

# Maximum sub-agents spawned per task
max_sub_agents_per_task = 10

# Total timeout for the entire agentic task (minutes)
total_timeout_minutes = 30

# Timeout for each individual sub-agent (minutes)
sub_agent_timeout_minutes = 10

# Operations that require user approval before execution
# Example: ["send_email", "delete_contact", "execute_code"]
require_approval_for = []

| Option | Type | Default | Description |
|---|---|---|---|
| enabled | Boolean | false | Enable agentic multi-agent orchestration |
| max_concurrent_sub_agents | Integer | 4 | Max sub-agents running in parallel |
| max_sub_agents_per_task | Integer | 10 | Max sub-agents per task |
| total_timeout_minutes | Integer | 30 | Total task timeout (minutes) |
| sub_agent_timeout_minutes | Integer | 10 | Per sub-agent timeout (minutes) |
| require_approval_for | Array | [] | Operations requiring user approval |

Vault Integration

[vault]
# Vault server address
# address = "http://127.0.0.1:8200"

# AppRole authentication (recommended)
# role_id = "your-role-id"
# secret_id = "your-secret-id"

# Or token authentication
# token = "hvs.your-token"

# KV engine mount path
# mount_path = "secret"

# Request timeout in seconds
# timeout_secs = 5

# Vault Enterprise namespace (optional)
# namespace = "admin/pisovereign"

| Option | Type | Default | Description |
|---|---|---|---|
| address | String | http://127.0.0.1:8200 | (Optional) Vault server address |
| role_id | String | - | (Optional) AppRole role ID (recommended) |
| secret_id | String | - | (Optional) AppRole secret ID |
| token | String | - | (Optional) Vault token (alternative to AppRole) |
| mount_path | String | secret | (Optional) KV engine mount path |
| timeout_secs | Integer | 5 | (Optional) Request timeout |
| namespace | String | - | (Optional) Vault Enterprise namespace |

Environment Variables

All configuration options can be set via environment variables. Use __ (double underscore) as the nesting separator to avoid conflicts with field names containing underscores (e.g., phone_number):

| Config Path | Environment Variable |
|---|---|
| server.port | PISOVEREIGN_SERVER__PORT |
| inference.base_url | PISOVEREIGN_INFERENCE__BASE_URL |
| signal.phone_number | PISOVEREIGN_SIGNAL__PHONE_NUMBER |
| database.path | PISOVEREIGN_DATABASE__PATH |
| vault.address | PISOVEREIGN_VAULT__ADDRESS |
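
The mapping is mechanical: split the config path on dots, uppercase each segment, join with __, and prefix PISOVEREIGN_. A small helper illustrating the rule (hypothetical, for clarity):

```python
def to_env_var(config_path: str) -> str:
    """Map a config.toml path like 'signal.phone_number' to its env var name."""
    return "PISOVEREIGN_" + "__".join(seg.upper() for seg in config_path.split("."))


print(to_env_var("server.port"))          # PISOVEREIGN_SERVER__PORT
print(to_env_var("signal.phone_number"))  # PISOVEREIGN_SIGNAL__PHONE_NUMBER
```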

Special variables:

| Variable | Description |
|---|---|
| PISOVEREIGN_ALLOW_INSECURE_CONFIG | Allow insecure settings in production |
| RUST_LOG | Log level override |

Example Configurations

Development

environment = "development"

[server]
host = "127.0.0.1"
port = 3000
log_format = "text"

[inference]
base_url = "http://localhost:11434"
default_model = "qwen2.5:1.5b"

[database]
path = "./dev.db"

[cache]
enabled = false  # Disable for debugging

[security]
rate_limit_enabled = false
tls_verify_certs = false

Production

environment = "production"

[server]
host = "127.0.0.1"  # Behind reverse proxy
port = 3000
log_format = "json"
cors_enabled = true
allowed_origins = ["https://app.example.com"]

[inference]
base_url = "http://localhost:11434"
default_model = "qwen2.5:1.5b"
timeout_ms = 120000

[database]
path = "/var/lib/pisovereign/pisovereign.db"
max_connections = 10

[security]
rate_limit_enabled = true
rate_limit_rpm = 30
min_tls_version = "1.3"

[prompt_security]
enabled = true
sensitivity = "high"
block_on_detection = true

[vault]
address = "https://vault.internal:8200"
role_id = "..."
mount_path = "secret"

[telemetry]
enabled = true
otlp_endpoint = "http://tempo:4317"
sample_ratio = 0.1

Minimal (Quick Start)

environment = "development"

[server]
port = 3000

[inference]
base_url = "http://localhost:11434"
default_model = "qwen2.5:1.5b"

[database]
path = "pisovereign.db"

Adaptive Model Routing

Complexity-based request routing to reduce latency and resource usage

Overview

Model routing classifies every incoming message into one of four complexity tiers and routes it to an appropriately sized LLM model — or answers trivially without calling any model at all.

| Tier | Default Model | Typical Latency | Use Case |
|---|---|---|---|
| Trivial | template (no LLM) | <10 ms | Greetings, thanks, farewells |
| Simple | gemma3:1b | ~0.5 s | Short factual questions |
| Moderate | gemma3:4b | ~2 s | Multi-turn conversations, explanations |
| Complex | gemma3:12b | ~6 s | Code generation, analysis, creative writing |

Goal: Route 60–70% of queries to the Trivial or Simple tier, reducing average response time from ~8 s to ~3 s.

Configuration

Enable in config.toml:

[model_routing]
enabled = true

[model_routing.models]
trivial = "template"       # No LLM call
simple = "gemma3:1b"
moderate = "gemma3:4b"
complex = "gemma3:12b"

[model_routing.classification]
confidence_threshold = 0.6
max_simple_words = 15
max_simple_chars = 100
max_moderate_sentences = 5
complex_min_words = 50
complex_keywords = [
    "code", "implement", "explain", "analyze",
    "compare", "debug", "refactor", "translate"
]
trivial_patterns = [
    "^hi$", "^hello$", "^hey$", "^hallo$",
    "^moin$", "^danke$", "^thanks$"
]

[model_routing.templates]
greeting = ["Hello! How can I help?", "Hallo! Wie kann ich helfen?"]
farewell = ["Goodbye!", "Tschüss!"]
thanks = ["You're welcome!", "Gerne!"]
help = ["I can help with questions, tasks, weather, transit, and more."]
system_info = ["PiSovereign — your private AI assistant."]
unknown = ["How can I help you?", "Wie kann ich Ihnen helfen?"]

Docker Compose

When routing is enabled, Ollama needs to keep multiple models loaded. Set in compose.yml:

OLLAMA_MAX_LOADED_MODELS: 2

This allows the small and large models to stay warm in memory simultaneously.

How Classification Works

The rule-based classifier runs synchronously (no LLM call) and takes <1 ms:

  1. Trivial detection: Regex patterns, emoji-only, empty input → instant template
  2. Complex detection: Code patterns (backticks, keywords), high word count (≥50), configured keywords → large model
  3. Simple detection: Short messages (≤15 words, ≤100 chars), single sentence, no conversation history → small model
  4. Moderate fallback: Everything else, or follow-up messages in an ongoing conversation
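
The decision order above can be sketched in a few lines (illustrative Python using the default thresholds; the single-sentence check and emoji detection are omitted):

```python
import re

TRIVIAL_PATTERNS = [r"^hi$", r"^hello$", r"^danke$", r"^thanks$"]
COMPLEX_KEYWORDS = {"code", "implement", "explain", "analyze", "debug"}


def classify(message: str, has_history: bool = False) -> str:
    text = message.strip().lower()
    words = text.split()

    # 1. Trivial: pattern match (or empty input) -> answered from a template
    if not text or any(re.fullmatch(p, text) for p in TRIVIAL_PATTERNS):
        return "trivial"
    # 2. Complex: code markers, >= 50 words, or configured keywords
    if "`" in text or len(words) >= 50 or COMPLEX_KEYWORDS & set(words):
        return "complex"
    # 3. Simple: short message (<= 15 words, <= 100 chars) with no history
    if len(words) <= 15 and len(text) <= 100 and not has_history:
        return "simple"
    # 4. Everything else, including follow-ups, is moderate
    return "moderate"


print(classify("hi"))                                 # trivial
print(classify("What is the capital of France?"))     # simple
print(classify("Please debug this function for me"))  # complex
```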

Confidence & Tier Upgrades

Each classification includes a confidence score (0.0–1.0). When confidence falls below the confidence_threshold (default: 0.6), the classifier upgrades to the next higher tier:

  • Simple → Moderate
  • Moderate → Complex

This ensures borderline cases use a more capable model rather than risk a poor response.
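
The upgrade step amounts to bumping the tier index when confidence is low (sketch, assuming the tier ordering Trivial < Simple < Moderate < Complex):

```python
TIERS = ["trivial", "simple", "moderate", "complex"]


def apply_confidence(tier: str, confidence: float, threshold: float = 0.6) -> str:
    """Upgrade borderline classifications one tier; trivial and complex never move."""
    if confidence < threshold and tier in ("simple", "moderate"):
        return TIERS[TIERS.index(tier) + 1]
    return tier


print(apply_confidence("simple", 0.4))     # moderate
print(apply_confidence("moderate", 0.55))  # complex
print(apply_confidence("simple", 0.9))     # simple
```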

Metrics

Model routing exposes Prometheus metrics at /metrics/prometheus:

model_routing_requests_total{tier="trivial"} 142
model_routing_requests_total{tier="simple"} 89
model_routing_requests_total{tier="moderate"} 45
model_routing_requests_total{tier="complex"} 24
model_routing_template_hits_total 142
model_routing_upgrades_total 12

The JSON /metrics endpoint also includes a model_routing object when routing is enabled.

Decorator Chain

When model routing is enabled, the inference decorator chain becomes:

Per tier:
  OllamaInferenceAdapter(tier_model)
    → DegradedInferenceAdapter (per-tier circuit breaker)

ModelRoutingAdapter
  → classifies message → selects tier adapter
  → delegates to appropriate tier

CachedInferenceAdapter (shared across all tiers)
  → SanitizedInferencePort (shared output filter)
    → ChatService

When disabled, the chain is the standard single-model path:

OllamaInferenceAdapter → Degraded → Cached → Sanitized → ChatService
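
Both chains are plain decorator composition over one inference interface. A stripped-down sketch (illustrative Python; the degraded-mode and routing layers are omitted, and the real code composes Rust trait objects behind InferencePort):

```python
class Adapter:
    def generate(self, prompt: str) -> str:
        raise NotImplementedError


class Ollama(Adapter):
    """Innermost adapter: the actual model call (stubbed here)."""
    def __init__(self, model):
        self.model = model

    def generate(self, prompt):
        return f"[{self.model}] answer"


class Cached(Adapter):
    """Memoizes responses; shared across all tiers in the real chain."""
    def __init__(self, inner):
        self.inner, self.cache = inner, {}

    def generate(self, prompt):
        if prompt not in self.cache:
            self.cache[prompt] = self.inner.generate(prompt)
        return self.cache[prompt]


class Sanitized(Adapter):
    """Output filter sitting directly under ChatService."""
    def __init__(self, inner):
        self.inner = inner

    def generate(self, prompt):
        return self.inner.generate(prompt).replace("secret", "[redacted]")


# Ollama -> Cached -> Sanitized, read inside-out:
chain = Sanitized(Cached(Ollama("gemma3:4b")))
print(chain.generate("hello"))  # [gemma3:4b] answer
```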

Backward Compatibility

  • The old [model_selector] configuration is deprecated since v0.6.0
  • Setting model_routing.enabled = false (or omitting the section) preserves the original single-model behavior
  • No breaking changes to the InferencePort trait or HTTP API

External Services Setup

Configure WhatsApp, Signal, Email, CalDAV/CardDAV, OpenAI, and Brave Search integrations

Messenger Selection

PiSovereign supports one messenger at a time:

messenger = "signal"     # Signal via signal-cli (default)
messenger = "whatsapp"   # WhatsApp Business API
messenger = "none"       # Disable messenger integration
| Messenger | Use Case |
|---|---|
| Signal | Privacy-focused, polling-based, no public URL needed |
| WhatsApp | Business integration, webhook-based, requires public URL |

WhatsApp Business

PiSovereign uses the WhatsApp Business API for bidirectional messaging.

Meta Business Account

  1. Create a Meta Business Account
  2. Create a Meta Developer Account

WhatsApp App Setup

  1. Create an app at developers.facebook.com/apps (type: Business)
  2. Add the WhatsApp product
  3. In WhatsApp → Getting Started, note the Phone Number ID and generate an Access Token
  4. For a permanent token: Business Settings → System Users → create Admin → generate token with whatsapp_business_messaging permission
  5. Note the App Secret from App Settings → Basic

Webhook Configuration

PiSovereign needs a public URL for WhatsApp webhooks. The Docker Compose stack uses Traefik for this automatically.

Configure in Meta Developer Console:

  1. WhatsApp → Configuration → Edit Webhooks
  2. Callback URL: https://your-domain.com/v1/webhooks/whatsapp
  3. Verify Token: your chosen verify_token
  4. Subscribe to: messages, message_template_status_update

PiSovereign Configuration

Store credentials in Vault:

docker compose exec vault vault kv put secret/pisovereign/whatsapp \
    access_token="your-access-token" \
    app_secret="your-app-secret"

Add to config.toml:

[whatsapp]
phone_number_id = "your-phone-number-id"
verify_token = "your-verify-token"
signature_required = true
api_version = "v18.0"
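
With signature_required = true, each incoming webhook must carry Meta's X-Hub-Signature-256 header: an HMAC-SHA256 of the raw request body keyed with the App Secret, prefixed with sha256=. A sketch of the check PiSovereign performs (illustrative Python, not the actual implementation):

```python
import hashlib
import hmac


def verify_signature(app_secret: str, raw_body: bytes, header: str) -> bool:
    """Validate Meta's X-Hub-Signature-256 header against the raw payload."""
    expected = "sha256=" + hmac.new(app_secret.encode(), raw_body,
                                    hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels
    return hmac.compare_digest(expected, header)


body = b'{"object":"whatsapp_business_account"}'
sig = "sha256=" + hmac.new(b"my-app-secret", body, hashlib.sha256).hexdigest()
print(verify_signature("my-app-secret", body, sig))  # True
print(verify_signature("wrong-secret", body, sig))   # False
```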

Signal Messenger

Signal provides privacy-focused messaging with end-to-end encryption, polling-based delivery (no public URL required), and voice message support.

For the full setup guide, see Signal Setup.

Quick config:

messenger = "signal"

[signal]
phone_number = "+1234567890"
socket_path = "/var/run/signal-cli/socket"

Email Integration (IMAP/SMTP)

PiSovereign supports any provider with standard IMAP/SMTP access. Use the provider field for automatic host/port configuration, or specify hosts and ports manually.

Provider Quick Reference

| Provider | provider Value | IMAP Host | IMAP Port | SMTP Host | SMTP Port | Auth |
|---|---|---|---|---|---|---|
| Gmail | gmail | imap.gmail.com | 993 | smtp.gmail.com | 465 | App Password |
| Outlook | custom | outlook.office365.com | 993 | smtp.office365.com | 587 | App Password |
| Proton Mail | proton | 127.0.0.1 | 1143 | 127.0.0.1 | 1025 | Bridge Password |

Gmail: Enable IMAP in Gmail settings, then generate an App Password (requires 2-Step Verification).

Outlook: Enable IMAP in settings, generate an App Password at account.microsoft.com/security if 2FA is enabled.

Proton Mail: Requires Proton Bridge running on the host. Use the Bridge Password shown in Bridge UI — not your Proton account password. Set verify_certificates = false since Bridge uses self-signed certs.

Configuration

Store the password in Vault:

docker compose exec vault vault kv put secret/pisovereign/email \
    password="your-email-password"

Example configs — choose one:

# Gmail (using provider preset)
[email]
provider = "gmail"
email = "yourname@gmail.com"

# Proton Mail (default provider — via Bridge)
[email]
provider = "proton"
email = "yourname@proton.me"
[email.tls]
verify_certificates = false

# Outlook (custom provider with explicit hosts)
[email]
provider = "custom"
imap_host = "outlook.office365.com"
imap_port = 993
smtp_host = "smtp.office365.com"
smtp_port = 587
email = "yourname@outlook.com"

Migration note: The config section was previously named [proton]. The old name still works but [email] is the canonical name going forward.


CalDAV / CardDAV (Baïkal)

Baïkal is a lightweight, self-hosted CalDAV/CardDAV server included in the Docker Compose stack as an optional profile.

Docker Setup

docker compose --profile caldav up -d

This starts Baïkal at http://localhost/caldav (via Traefik). PiSovereign accesses it internally via the Docker network at http://baikal:80/dav.php.

Security: Baïkal is not directly exposed to the internet. All access is through the Docker network or localhost.

Auto-recreation: PiSovereign automatically re-creates calendars and address books if they return 404 errors (e.g., after a Baïkal database reset or re-initialization). No manual intervention is needed.

Initial Setup

  1. Open http://localhost/caldav in your browser
  2. Complete the setup wizard, set an admin password, choose SQLite
  3. Create a user under Users and Resources
  4. Create a calendar via any CalDAV client or the admin interface

Configuration

Store credentials in Vault (optional):

docker compose exec vault vault kv put secret/pisovereign/caldav \
    username="your-username" \
    password="your-password"

Add to config.toml:

[caldav]
server_url = "http://baikal:80/dav.php"
username = "your-username"
password = "your-password"
calendar_path = "/calendars/username/default/"
verify_certs = true
timeout_secs = 30

CardDAV for contacts uses the same server and credentials — PiSovereign automatically discovers the address book.


OpenAI API

OpenAI is used as an optional cloud fallback for speech processing (STT/TTS).

Setup

  1. Create an account at platform.openai.com
  2. Add a payment method and set usage limits (recommended: $10–20/month)
  3. Create an API key at platform.openai.com/api-keys

Store in Vault:

docker compose exec vault vault kv put secret/pisovereign/openai \
    api_key="sk-your-openai-key"

Configuration

[speech]
provider = "hybrid"  # Local first, OpenAI fallback

openai_base_url = "https://api.openai.com/v1"
stt_model = "whisper-1"
tts_model = "tts-1"
default_voice = "nova"
timeout_ms = 60000

[speech.hybrid]
prefer_local = true
allow_cloud_fallback = true

For maximum privacy (no cloud at all):

[speech]
provider = "local"

[speech.hybrid]
prefer_local = true
allow_cloud_fallback = false

Brave Search API

Brave Search enables web search with source citations. DuckDuckGo is used as an automatic fallback.

Setup

  1. Sign up at brave.com/search/api — the Free tier (2,000 queries/month) is sufficient for personal use
  2. Create an API key in the dashboard

Store in Vault:

docker compose exec vault vault kv put secret/pisovereign/websearch \
    brave_api_key="BSA-your-brave-api-key"

Configuration

[websearch]
api_key = "BSA-your-brave-api-key"
max_results = 5
timeout_secs = 30
fallback_enabled = true
safe_search = "moderate"
country = "DE"
language = "de"

DuckDuckGo’s Instant Answer API is used automatically when Brave is unavailable, rate-limited, or not configured. No API key required. To disable the fallback:

[websearch]
fallback_enabled = false
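
The fallback amounts to a try-primary-then-secondary chain. A sketch under the assumption that the Brave client raises on rate limits or connection errors (function names here are hypothetical):

```python
class RateLimited(Exception):
    pass


def search(query, brave, duckduckgo, fallback_enabled=True):
    """Try Brave first; fall back to DuckDuckGo on failure when enabled."""
    try:
        return brave(query)
    except (RateLimited, ConnectionError):
        if not fallback_enabled:
            raise  # fallback disabled: surface the error
        return duckduckgo(query)


def brave_down(q):
    raise RateLimited()


print(search("rust async", brave_down, lambda q: f"ddg:{q}"))  # ddg:rust async
```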

Verify All Integrations

# Check all services
docker compose exec pisovereign pisovereign-cli status

# Or via HTTP
curl https://your-domain.example.com/ready/all | jq

Troubleshooting

WhatsApp webhook not receiving messages

  • Verify callback URL is publicly accessible
  • Check verify_token matches between config and Meta console
  • Ensure webhook is subscribed to messages

Email connection refused

  • Verify host and port match your provider
  • For Proton: ensure Bridge is running on the host
  • Check password type (App Password for Gmail/Outlook, Bridge Password for Proton)

CalDAV authentication failed

  • Verify username/password
  • Check calendar_path format — must match user and calendar name in Baïkal

Signal Messenger Setup

📱 Connect Signal messenger to PiSovereign via Docker

PiSovereign uses signal-cli as a Docker container to send and receive Signal messages. This guide covers the complete setup process.

Prerequisites

  • Docker must be running (docker compose up -d in the docker/ directory)
  • Signal app installed on your smartphone and registered with a phone number
  • qrencode installed on the host (for QR code display)
  • Phone number stored in .env or Vault

Installing qrencode

macOS:

brew install qrencode

Debian / Raspberry Pi:

sudo apt-get install qrencode

Linking Your Signal Account

signal-cli is connected as a linked device to your existing Signal account (similar to Signal Desktop). No new account is created.

⚠️ Important: The link command outputs a sgnl:// URI that must be converted into a QR code. You cannot pipe the output directly to qrencode, because qrencode waits for EOF — by that time the link process has already terminated and the URI has expired. Therefore, two separate terminal commands must be used.

Step 1: Start the Link Process

Open a terminal and run:

docker exec -it pisovereign-signal-cli signal-cli --config /var/lib/signal-cli link -n "PiSovereign" | tee /tmp/signal-uri.txt

This command:

  1. Starts the link process, which keeps running until the QR code is scanned or the URI expires
  2. Streams the sgnl:// URI to the terminal and captures it to /tmp/signal-uri.txt via tee

Step 2: Display the QR Code and Scan

Once the URI is displayed, open a second terminal and generate the QR code:

head -1 /tmp/signal-uri.txt | tr -d '\n' | qrencode -t ANSIUTF8

Now quickly scan with your phone:

  1. Open Signal on your smartphone
  2. Go to Settings → Linked Devices → Link New Device
  3. Scan the QR code shown in the terminal
  4. Confirm the link on your phone

💡 The link process from Step 1 is still running in its terminal, waiting for the scan. If the QR code has expired, simply repeat both steps.

After a successful scan, restart the container:

cd docker/
docker compose restart signal-cli

The logs should no longer show a NotRegisteredException:

docker compose logs signal-cli

Configuration

Phone Number

The Signal phone number must be known to PiSovereign. Use one of the following methods:

Option A: .env file (in the docker/ directory):

PISOVEREIGN_SIGNAL__PHONE_NUMBER=+491234567890

Option B: Vault:

vault kv put secret/pisovereign signal_phone_number="+491234567890"

config.toml

messenger = "signal"

[signal]
phone_number = "+491234567890"        # E.164 format
socket_path = "/var/run/signal-cli/socket"
timeout_ms = 30000

Environment Variables

export PISOVEREIGN_MESSENGER=signal
export PISOVEREIGN_SIGNAL__PHONE_NUMBER=+491234567890
export PISOVEREIGN_SIGNAL__SOCKET_PATH=/var/run/signal-cli/socket

Troubleshooting

Socket Already in Use

Failed to bind socket /var/run/signal-cli/socket: Address already in use

Cause: A stale socket from a previous run persists in the Docker volume.

Solution: The container uses an entrypoint script that automatically cleans up the socket before starting. If the error still occurs:

docker compose restart signal-cli

NotRegisteredException

WARN MultiAccountManager - Ignoring +49...: User is not registered.

Cause: signal-cli has not been linked to a Signal account.

Solution: Complete the account linking procedure.

Expired QR Code

Cause: qrencode waits for EOF. When piping signal-cli link | qrencode, the QR code is only displayed after the link process terminates — at which point the URI is already invalid.

Solution: Redirect the URI to a file (Step 1) and display it as a QR code separately (Step 2). See Linking Your Signal Account.

Daemon Connection Failed

# Check the socket
docker exec pisovereign-signal-cli ls -la /var/run/signal-cli/socket

# Check container logs
docker compose logs signal-cli

Security

  • Signal messages are end-to-end encrypted
  • signal-cli stores cryptographic keys locally in the signal-cli-data volume
  • The socket (signal-cli-socket) is shared only within the Docker network

Backup

The signal-cli data should be backed up regularly:

docker run --rm -v docker_signal-cli-data:/data -v $(pwd):/backup \
  alpine tar czf /backup/signal-cli-backup.tar.gz -C /data .

See Backup & Restore for complete backup procedures.

Reminder System

PiSovereign includes a proactive reminder system that helps you stay on top of appointments, tasks, and custom reminders. The system integrates with CalDAV calendars and provides beautiful German-language notifications via WhatsApp or Signal.

Features

  • Calendar Integration: Automatically creates reminders from CalDAV events
  • Custom Reminders: Create personal reminders with natural language
  • Smart Notifications: Beautiful formatted messages with emoji and key information
  • Location Support: Google Maps links and ÖPNV transit connections for location-based events
  • Snooze Management: Snooze reminders up to 5 times (configurable)
  • Morning Briefing: Daily summary of your upcoming appointments

Natural Language Commands

Creating Reminders

"Erinnere mich morgen um 10 Uhr an den Arzttermin"
"Remind me tomorrow at 3pm to call mom"
"Erinnere mich in 2 Stunden an die Wäsche"

Listing Reminders

"Zeige meine Erinnerungen"
"Welche Termine habe ich heute?"
"Liste alle aktiven Erinnerungen"

Snoozing Reminders

"Erinnere mich nochmal in 15 Minuten"
"Snooze für eine Stunde"

Acknowledging Reminders

"Ok, danke!"
"Erledigt"

Deleting Reminders

"Lösche die Erinnerung zum Arzttermin"

Transit Connections

When you have an appointment at a specific location, PiSovereign can automatically include ÖPNV (public transit) connections in your reminder:

📅 **Meeting mit Hans**
📍 Alexanderplatz 1, Berlin
🕒 Morgen um 14:00 Uhr

🚇 **So kommst du hin:**
🚌 Bus 200 → S-Bahn S5 → U-Bahn U2
   Abfahrt: 13:22 (38 min)
   Ankunft: 14:00

🗺️ [Auf Google Maps öffnen](https://www.google.com/maps/...)

Searching Transit Routes

You can also search for transit connections directly:

"Wie komme ich zum Hauptbahnhof?"
"ÖPNV Verbindung nach Alexanderplatz"

Configuration

Add the following sections to your config.toml:

Transit Configuration

[transit]
# Include transit info in location-based reminders
include_in_reminders = true

# Your home location for route calculations
home_location = { latitude = 52.52, longitude = 13.405 }

# Transport modes to include
products_bus = true
products_suburban = true    # S-Bahn
products_subway = true      # U-Bahn
products_tram = true
products_regional = true    # RB/RE
products_national = false   # ICE/IC

Reminder Configuration

[reminder]
# Maximum number of snoozes per reminder (default: 5)
max_snooze = 5

# Default snooze duration in minutes (default: 15)
default_snooze_minutes = 15

# How far in advance to create reminders from CalDAV events
caldav_reminder_lead_time_minutes = 30

# Interval for checking due reminders (seconds)
check_interval_secs = 60

# CalDAV sync interval (minutes)
caldav_sync_interval_minutes = 15

# Morning briefing settings
morning_briefing_time = "07:00"
morning_briefing_enabled = true

CalDAV Configuration

For calendar integration, you need a CalDAV server (such as Baïkal, Radicale, or Nextcloud):

[caldav]
server_url = "https://cal.example.com/dav.php"
username = "your-username"
password = "your-password"
calendar_path = "/calendars/user/default"

Reminder Sources

Reminders can come from two sources:

  1. CalDAV Events: Automatically synced from your calendar
  2. Custom Reminders: Created via natural language commands

CalDAV events include the original event details (title, time, location) while custom reminders are more flexible and can include any text.

Notification Format

Reminders are formatted as beautiful German messages with:

  • Bold headers for event titles
  • Emoji prefixes for quick scanning (📅 📍 🕒)
  • Time formatting relative to now (“in 30 Minuten”)
  • Location links to Google Maps
  • Transit info for getting there

Example reminder notification:

📅 **Zahnarzt Dr. Müller**
📍 Friedrichstraße 123, Berlin
🕒 Heute um 15:00 (in 2 Stunden)

🗺️ Auf Google Maps öffnen

Morning Briefing

When enabled, you receive a daily summary at the configured time (default 7:00 AM):

☀️ **Guten Morgen!**

📅 **Heute hast du 3 Termine:**

1. 09:00 - Team Meeting (Büro)
2. 12:30 - Mittagessen mit Lisa (Restaurant Mitte)
3. 16:00 - Arzttermin (Praxis Dr. Schmidt)

🌤️ Wetter: 18°C, leicht bewölkt

📋 **Offene Erinnerungen:**
- Geburtstagskarte für Mama kaufen
- Wäsche abholen

Snooze Limits

Each reminder can be snoozed up to max_snooze times (default: 5). After that, the system will indicate that no more snoozes are available:

⏰ Diese Erinnerung wurde bereits 5x verschoben.
Bitte bestätige oder lösche sie.

Status Tracking

Reminders go through these states:

  • Pending: Waiting for the remind time
  • Sent: Notification was delivered
  • Acknowledged: User confirmed receipt
  • Snoozed: User requested a later reminder
  • Deleted: User removed the reminder

You can list reminders filtered by status using commands like “zeige alle erledigten Erinnerungen” (“show all completed reminders”).
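A minimal sketch of this lifecycle as a Rust enum; type and method names are illustrative, not the actual PiSovereign domain types:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReminderStatus {
    Pending,
    Sent,
    Acknowledged,
    Snoozed,
    Deleted,
}

impl ReminderStatus {
    /// States in which the scheduler still owes the user a notification.
    pub fn awaits_delivery(self) -> bool {
        matches!(self, ReminderStatus::Pending | ReminderStatus::Snoozed)
    }

    /// Terminal states require no further scheduler action.
    pub fn is_terminal(self) -> bool {
        matches!(self, ReminderStatus::Acknowledged | ReminderStatus::Deleted)
    }
}

fn main() {
    assert!(ReminderStatus::Snoozed.awaits_delivery());
    assert!(!ReminderStatus::Sent.is_terminal());
    assert!(ReminderStatus::Deleted.is_terminal());
}
```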

Troubleshooting

Solutions for common issues with PiSovereign

Quick Diagnostics

Run these commands first to identify the problem:

# Check all containers are running
docker compose ps

# Health check
curl http://localhost/health | jq

# Detailed readiness
curl http://localhost/ready/all | jq

# Recent logs
docker compose logs --tail=100 pisovereign

# System resources
docker stats --no-stream

Hailo AI HAT+

Device not detected

Symptom: Hailo device not available inside the container

Diagnosis:

# Check device files on the host
ls -la /dev/hailo*

# Check kernel module on the host
lsmod | grep hailo

# Check PCIe
lspci | grep -i hailo

Solutions:

  1. Check physical connection — ensure the HAT+ is fully seated on GPIO pins, PCIe FPC cable is connected, and you are using the 27W USB-C power supply

  2. Reinstall drivers on the host:

    sudo apt remove --purge 'hailo-*'
    sudo apt autoremove
    sudo reboot
    sudo apt install hailo-h10-all
    sudo reboot
    
  3. Check device passthrough — ensure docker-compose.yml maps /dev/hailo0 into the container
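A sketch of what that mapping looks like in docker-compose.yml (service name assumed):

```yaml
services:
  pisovereign:
    devices:
      - /dev/hailo0:/dev/hailo0
```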

Hailo firmware error

# Reset the device (on host)
sudo hailortcli fw-control reset

# Update firmware
sudo apt update && sudo apt upgrade hailo-firmware

Inference Problems

Inference timeout

Diagnosis:

# Test Ollama directly inside Docker
docker compose exec ollama curl -s http://localhost:11434/api/generate \
  -d '{"model":"qwen2.5-1.5b-instruct","prompt":"Hi","stream":false}'

Solutions:

  1. Increase timeout:

    [inference]
    timeout_ms = 120000  # 2 minutes
    
  2. Use a smaller model:

    [inference]
    default_model = "llama3.2-1b-instruct"
    

Model not found

# List models
docker compose exec ollama ollama list

# Pull missing model
docker compose exec ollama ollama pull qwen2.5-1.5b-instruct

Poor response quality

Adjust in config.toml:

[inference]
max_tokens = 4096
temperature = 0.5  # Lower = more focused

If model routing is enabled, ensure complex queries use a capable model:

[model_routing.models]
complex = "gemma3:12b"

Model routing — wrong tier selected

Check Prometheus metrics to see tier distribution:

curl -s http://localhost:3000/metrics/prometheus | grep model_routing

If too many requests go to the Simple tier, lower max_simple_words or add more complex_keywords:

[model_routing.classification]
max_simple_words = 10
complex_keywords = ["code", "implement", "explain", "analyze", "compare"]

Model routing — Ollama out of memory

When multiple models are loaded, Ollama may run out of RAM. Reduce the number of concurrent models:

# compose.yml
OLLAMA_MAX_LOADED_MODELS: 1

Or use smaller models for the Simple and Moderate tiers.


Network & Connectivity

Connection refused

Diagnosis:

# Check containers
docker compose ps

# Check Traefik is routing
docker compose logs traefik | tail -20

# Test direct container access
docker compose exec pisovereign curl -s http://localhost:3000/health

Solutions:

  1. Check bind address in config.toml:

    [server]
    host = "0.0.0.0"
    
  2. Check Traefik configuration — verify domain and routing rules in docker/traefik/dynamic.yml

TLS/SSL errors

  • Development: Use http://localhost (Traefik handles TLS for external access)
  • Production: Ensure your domain’s DNS points to the server, and Let’s Encrypt can reach port 80 for validation
  • Self-signed certs (e.g., Proton Bridge): set verify_certificates = false in the relevant config section

Database Issues

Database locked

Cause: Multiple concurrent writers to SQLite

Solutions:

  1. Ensure single PiSovereign instance:

    docker compose ps | grep pisovereign
    # Should show exactly one instance
    
  2. Verify WAL mode:

    docker compose exec pisovereign sqlite3 /data/pisovereign.db "PRAGMA journal_mode;"
    # Should return "wal"
    

Migration failed

# Backup current database
docker compose exec pisovereign cp /data/pisovereign.db /data/pisovereign-backup.db

# Reset database (LOSES DATA — restore from backup afterward)
docker compose exec pisovereign rm /data/pisovereign.db
docker compose restart pisovereign

Database corruption

# Attempt recovery
docker compose exec pisovereign sh -c \
  'sqlite3 /data/pisovereign.db ".recover" | sqlite3 /data/pisovereign-recovered.db'

Integration Problems

WhatsApp

Webhook verification failed:

  1. URL must be publicly accessible — test with curl from an external network
  2. verify_token in config.toml must match the Meta developer console
  3. HTTPS must be configured (Traefik handles this)

Messages not received:

  1. Check webhook is subscribed to the messages field in Meta console
  2. Verify phone number is whitelisted (for test numbers)
  3. Check logs: docker compose logs pisovereign | grep -i whatsapp

Email (IMAP/SMTP)

Connection refused:

# Test IMAP from the container
docker compose exec pisovereign openssl s_client -connect imap.gmail.com:993
  • Verify host/port match your provider (see External Services)
  • For Proton Bridge: ensure Bridge is running on the host
  • If using the provider field, explicit imap_host/smtp_host values override presets

Authentication failed:

  • Gmail: Use an App Password, not your account password
  • Outlook: Use an App Password if 2FA is enabled
  • Proton Mail: Use the Bridge Password from the Bridge UI, not your Proton account password

Migrating from [proton] config: The old [proton] config section still works via a serde alias. If you see “duplicate field” errors, ensure you don’t have both [proton] and [email] sections in your config.
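A sketch of the two equivalent forms (host value is illustrative); define only one of them:

```toml
# Legacy form, still accepted via the serde alias:
# [proton]
# imap_host = "127.0.0.1"

# Current form (use this; never define both sections):
[email]
imap_host = "127.0.0.1"
```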

CalDAV

401 Unauthorized:

docker compose exec pisovereign curl -u username:password \
  http://baikal:80/dav.php/calendars/username/

Verify user exists in Baïkal admin at http://localhost/caldav.

404 Not Found — PiSovereign automatically re-creates missing calendars and address books. If you still see 404 errors:

  • Verify calendar_path matches your Baïkal user and calendar name
  • Check that the user has permissions to create calendars
  • List calendars to verify:
docker compose exec pisovereign curl -u username:password -X PROPFIND \
  http://baikal:80/dav.php/calendars/username/

Speech Processing

Whisper (STT) fails

# Check Whisper container
docker compose logs whisper

# Test directly
docker compose exec whisper curl -s http://localhost:8081/health
  • Verify the container has enough memory (~500 MB for base model)
  • Check audio format (mono 16 kHz WAV preferred)

Piper (TTS) fails

# Check Piper container
docker compose logs piper

# Test directly
docker compose exec piper curl -s http://localhost:8082/health
  • Verify voice model files are mounted correctly
  • Check container logs for ONNX runtime errors

Memory System (RAG)

Memories not being retrieved

  1. Check that enable_rag = true in [memory] config
  2. Verify rag_threshold isn’t too high — try lowering to 0.3
  3. Ensure embeddings are generated: GET /v1/memories/stats should show entries with embeddings
  4. Confirm Ollama is running with the nomic-embed-text embedding model
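Combining the first two checks, a minimal fragment might look like this (the key names enable_rag and rag_threshold come from this guide; the value is a starting point, not a recommendation):

```toml
[memory]
enable_rag = true
rag_threshold = 0.3  # lower threshold = more permissive retrieval
```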

Encryption key errors

  • “Read-only file system”: Ensure encryption_key_path points to a writable directory (e.g., /app/data/memory_encryption.key in Docker)
  • Lost encryption key: Encrypted memories cannot be recovered. Delete the key file, clear the memories and memory_embeddings tables, and let PiSovereign generate a new key on startup

Memory decay not running

The decay task runs automatically every 24 hours. Check logs for memory decay task entries. You can also trigger it manually via POST /v1/memories/decay.


System Commands

Commands not auto-populating

On first startup, 32 default system commands should be auto-discovered. If empty:

  1. Check logs for system_command_discovery entries
  2. Verify the database migration 14_system_commands.sql ran successfully
  3. Check GET /v1/commands/catalog/count — should return {"count": 32}

Startup warnings about Vault credentials

PiSovereign validates Vault credentials at startup and logs diagnostic warnings for missing or empty fields. Look for the log line “Some configuration fields are empty after secret resolution”, which lists the affected fields. Store the missing secrets in Vault using just docker-vault-check.


Performance Issues

High memory usage

docker stats --no-stream

Reduce resource consumption in config.toml:

[cache]
l1_max_entries = 1000

[database]
max_connections = 3

Slow response times

  1. Check inference latency — the model may be too large for your hardware
  2. Enable caching: [cache] enabled = true
  3. Use SSD storage — SD card I/O is a common bottleneck on Raspberry Pi

Getting Help

Collect Diagnostic Information

Before reporting an issue, gather:

# Container status
docker compose ps

# PiSovereign version
docker compose exec pisovereign pisovereign-server --version

# Recent logs
docker compose logs --since "1h" > pisovereign-logs.txt

# System info
uname -a
docker --version
free -h
df -h

Report an Issue

Architecture

🏗️ System design and architectural patterns in PiSovereign

This document explains the architectural decisions, design patterns, and structure of PiSovereign.


Overview

PiSovereign follows Clean Architecture (also known as Hexagonal Architecture or Ports & Adapters) to achieve:

  • Independence from frameworks - Business logic doesn’t depend on Axum, SQLite, or any external library
  • Testability - Core logic can be tested without infrastructure
  • Flexibility - Adapters can be swapped without changing business rules
  • Maintainability - Clear boundaries between concerns
┌─────────────────────────────────────────────────────────────────┐
│                     External World                               │
│  (HTTP Clients, WhatsApp, Email Servers, AI Hardware)           │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                   Presentation Layer                             │
│  ┌─────────────────┐          ┌─────────────────┐              │
│  │ presentation_   │          │ presentation_   │              │
│  │     http        │          │     cli         │              │
│  │  (Axum API)     │          │  (Clap CLI)     │              │
│  └─────────────────┘          └─────────────────┘              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                   Application Layer                              │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    application                            │   │
│  │  (Services, Use Cases, Orchestration, Port Definitions)  │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
┌──────────────────┐ ┌──────────────┐ ┌──────────────────────────┐
│   Domain Layer   │ │  AI Layer    │ │   Infrastructure Layer   │
│  ┌────────────┐  │ │ ┌──────────┐ │ │ ┌──────────────────────┐ │
│  │   domain   │  │ │ │ ai_core  │ │ │ │    infrastructure    │ │
│  │ (Entities, │  │ │ │(Inference│ │ │ │  (Adapters, Repos,   │ │
│  │  Values,   │  │ │ │ Engine)  │ │ │ │  Cache, DB, Vault)   │ │
│  │ Commands)  │  │ │ └──────────┘ │ │ └──────────────────────┘ │
│  └────────────┘  │ │ ┌──────────┐ │ │                          │
│                  │ │ │ai_speech │ │ │  ┌──────────────────┐   │
│                  │ │ │(STT/TTS) │ │ │  │  integration_*   │   │
│                  │ │ └──────────┘ │ │  │ (WhatsApp, Mail, │   │
│                  │ │              │ │  │  Calendar, etc.) │   │
│                  │ └──────────────┘ │  └──────────────────┘   │
└──────────────────┘                  └──────────────────────────┘

Clean Architecture

Layer Responsibilities

| Layer | Crates | Responsibility |
|---|---|---|
| Domain | domain | Core business entities, value objects, commands, domain errors |
| Application | application | Use cases, service orchestration, port definitions |
| Infrastructure | infrastructure, integration_* | Adapters for external systems (DB, cache, APIs) |
| AI | ai_core, ai_speech | AI-specific logic (inference, speech processing) |
| Presentation | presentation_http, presentation_cli | User interfaces (REST API, CLI) |

Dependency Rule

Inner layers NEVER depend on outer layers

domain          → (no dependencies on other PiSovereign crates)
application     → domain
ai_core         → domain, application (ports)
ai_speech       → domain, application (ports)
infrastructure  → domain, application (ports)
integration_*   → domain, application (ports)
presentation_*  → domain, application, infrastructure, ai_*, integration_*

This means:

  • domain knows nothing about databases, HTTP, or external services
  • application defines what it needs via ports (traits), not how it’s done
  • Only presentation crates wire everything together
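As a simplified, synchronous sketch of that idea (the real SecretStore port is async; the names and helper here are illustrative):

```rust
// Simplified, synchronous sketch of ports and adapters. The real
// SecretStore port is async; names here are illustrative only.
use std::collections::HashMap;

/// The "port": the application layer states WHAT it needs, not how.
trait SecretStore {
    fn get_secret(&self, path: &str) -> Option<String>;
}

/// A test "adapter": an in-memory fake that needs no infrastructure.
struct InMemorySecretStore(HashMap<String, String>);

impl SecretStore for InMemorySecretStore {
    fn get_secret(&self, path: &str) -> Option<String> {
        self.0.get(path).cloned()
    }
}

/// Application logic depends only on the trait, so it can be unit-tested
/// without a running Vault instance.
fn resolve_api_key(store: &dyn SecretStore) -> String {
    store.get_secret("api/key").unwrap_or_else(|| "missing".to_string())
}

fn main() {
    let mut secrets = HashMap::new();
    secrets.insert("api/key".to_string(), "s3cr3t".to_string());
    assert_eq!(resolve_api_key(&InMemorySecretStore(secrets)), "s3cr3t");
    assert_eq!(resolve_api_key(&InMemorySecretStore(HashMap::new())), "missing");
}
```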

Crate Dependencies

Dependency Graph

graph TB
    subgraph "Presentation"
        HTTP[presentation_http]
        CLI[presentation_cli]
    end
    
    subgraph "Integration"
        WA[integration_whatsapp]
        PM[integration_email]
        CAL[integration_caldav]
        WX[integration_weather]
    end
    
    subgraph "Infrastructure"
        INFRA[infrastructure]
    end
    
    subgraph "AI"
        CORE[ai_core]
        SPEECH[ai_speech]
    end
    
    subgraph "Core"
        APP[application]
        DOM[domain]
    end
    
    HTTP --> APP
    HTTP --> INFRA
    HTTP --> CORE
    HTTP --> SPEECH
    HTTP --> WA
    HTTP --> PM
    HTTP --> CAL
    HTTP --> WX
    
    CLI --> APP
    CLI --> INFRA
    
    WA --> APP
    WA --> DOM
    
    PM --> APP
    PM --> DOM
    
    CAL --> APP
    CAL --> DOM
    
    WX --> APP
    WX --> DOM
    
    INFRA --> APP
    INFRA --> DOM
    
    CORE --> APP
    CORE --> DOM
    
    SPEECH --> APP
    SPEECH --> DOM
    
    APP --> DOM

Workspace Structure

PiSovereign/
├── Cargo.toml              # Workspace manifest
├── crates/
│   ├── domain/             # Core business logic (no external deps)
│   │   ├── Cargo.toml
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── entities/   # User, Conversation, Message, etc.
│   │       ├── values/     # UserId, MessageContent, etc.
│   │       ├── commands/   # UserCommand, SystemCommand
│   │       └── errors.rs   # Domain errors
│   │
│   ├── application/        # Use cases and ports
│   │   ├── Cargo.toml
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── services/   # ConversationService, CommandService, etc.
│   │       └── ports/      # Trait definitions (InferencePort, etc.)
│   │
│   ├── infrastructure/     # Framework-dependent implementations
│   │   ├── Cargo.toml
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── adapters/   # VaultSecretStore, etc.
│   │       ├── cache/      # MokaCache, RedbCache
│   │       ├── persistence/# SQLite repositories
│   │       └── telemetry/  # OpenTelemetry setup
│   │
│   ├── ai_core/            # Inference engine
│   │   └── src/
│   │       ├── hailo/      # Hailo-Ollama client
│   │       └── selector/   # Model routing
│   │
│   ├── ai_speech/          # Speech processing
│   │   └── src/
│   │       ├── providers/  # Hybrid, Local, OpenAI
│   │       └── converter/  # Audio format conversion
│   │
│   ├── integration_*/      # External service adapters
│   │
│   └── presentation_*/     # User interfaces

Port/Adapter Pattern

Ports (Interfaces)

Ports are traits defined in application/src/ports/ that describe what the application needs:

// application/src/ports/inference.rs
#[async_trait]
pub trait InferencePort: Send + Sync {
    async fn generate(
        &self,
        prompt: &str,
        options: InferenceOptions,
    ) -> Result<InferenceResponse, InferenceError>;
    
    // Boxed stream: `impl Trait` cannot appear in an #[async_trait]
    // method's return type.
    async fn generate_stream(
        &self,
        prompt: &str,
        options: InferenceOptions,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, InferenceError>> + Send>>, InferenceError>;
    
    async fn health_check(&self) -> Result<bool, InferenceError>;
}
// application/src/ports/secret_store.rs
#[async_trait]
pub trait SecretStore: Send + Sync {
    async fn get_secret(&self, path: &str) -> Result<Option<String>, SecretError>;
    async fn health_check(&self) -> Result<bool, SecretError>;
}
// application/src/ports/memory_context.rs — RAG context injection
#[async_trait]
pub trait MemoryContextPort: Send + Sync {
    async fn retrieve_context(
        &self,
        user_id: &UserId,
        query: &str,
        limit: usize,
    ) -> Result<Vec<MemoryContext>, MemoryError>;
}
// application/src/ports/embedding.rs — Vector embeddings
#[async_trait]
pub trait EmbeddingPort: Send + Sync {
    async fn embed(&self, text: &str) -> Result<Vec<f32>, EmbeddingError>;
}
// application/src/ports/encryption.rs — Content encryption at rest
pub trait EncryptionPort: Send + Sync {
    fn encrypt(&self, plaintext: &[u8]) -> Result<Vec<u8>, EncryptionError>;
    fn decrypt(&self, ciphertext: &[u8]) -> Result<Vec<u8>, EncryptionError>;
}

Adapters (Implementations)

Adapters implement ports and live in infrastructure/ or integration_*/:

// infrastructure/src/adapters/vault_secret_store.rs
pub struct VaultSecretStore {
    client: VaultClient,
    mount_path: String,
}

#[async_trait]
impl SecretStore for VaultSecretStore {
    async fn get_secret(&self, path: &str) -> Result<Option<String>, SecretError> {
        let full_path = format!("{}/{}", self.mount_path, path);
        self.client.read_secret(&full_path).await
    }
    
    async fn health_check(&self) -> Result<bool, SecretError> {
        self.client.health().await
    }
}
// infrastructure/src/adapters/env_secret_store.rs
pub struct EnvironmentSecretStore {
    prefix: Option<String>,
}

#[async_trait]
impl SecretStore for EnvironmentSecretStore {
    async fn get_secret(&self, path: &str) -> Result<Option<String>, SecretError> {
        // Convert "database/password" to "DATABASE_PASSWORD"
        let env_key = self.path_to_env_var(path);
        Ok(std::env::var(&env_key).ok())
    }
    
    async fn health_check(&self) -> Result<bool, SecretError> {
        Ok(true) // Environment is always available
    }
}

Example: Secret Store

The ChainedSecretStore demonstrates the adapter pattern:

// infrastructure/src/adapters/chained_secret_store.rs
pub struct ChainedSecretStore {
    stores: Vec<Box<dyn SecretStore>>,
}

impl ChainedSecretStore {
    pub fn new() -> Self {
        Self { stores: Vec::new() }
    }
    
    pub fn add_store(mut self, store: impl SecretStore + 'static) -> Self {
        self.stores.push(Box::new(store));
        self
    }
}

#[async_trait]
impl SecretStore for ChainedSecretStore {
    async fn get_secret(&self, path: &str) -> Result<Option<String>, SecretError> {
        for store in &self.stores {
            if let Ok(Some(secret)) = store.get_secret(path).await {
                return Ok(Some(secret));
            }
        }
        Ok(None)
    }
    
    async fn health_check(&self) -> Result<bool, SecretError> {
        // Healthy if at least one underlying store responds
        for store in &self.stores {
            if let Ok(true) = store.health_check().await {
                return Ok(true);
            }
        }
        Ok(false)
    }
}

Usage in application:

// Wiring in presentation layer
let secret_store = ChainedSecretStore::new()
    .add_store(VaultSecretStore::new(vault_config)?)
    .add_store(EnvironmentSecretStore::new(Some("PISOVEREIGN")));

let command_service = CommandService::new(
    Arc::new(secret_store),  // Injected as trait object
    // ... other dependencies
);

Data Flow

Example: Intent Routing Pipeline

User input is routed through a multi-stage pipeline that minimizes LLM calls. Each stage acts as a progressively more expensive filter:

1. User Input: "Hey, it's Andreas. I'm naming you Macci."
   │
   ▼
2. Conversational Filter (zero LLM cost)
   │  Regex-based detection of greetings, introductions, small talk.
   │  If matched → skip to chat (no workflow/intent parsing).
   │
   ▼ (not conversational)
3. Quick Pattern Matching
   │  Regex patterns for well-known commands (e.g., "remind me",
   │  "search for", "send email"). Fast, deterministic.
   │
   ▼ (no quick match)
4. Guarded Workflow Detection
   │  Only invoked when input has ≥8 words AND contains ≥2
   │  workflow-hint keywords ("create", "plan", "distribute", etc.).
   │  Uses LLM to detect multi-step workflows.
   │
   ▼ (not a workflow)
5. LLM Intent Parsing
   │  Full LLM-based intent classification with confidence score.
   │  Post-validated by keyword presence per intent category.
   │  Intents below 0.7 confidence are downgraded to chat.
   │
   ▼
6. Dispatch to appropriate handler or fall through to chat
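The "cheapest filter first" idea can be sketched as plain Rust. Keyword lists are illustrative; only the ≥8-words / ≥2-hints guard mirrors stage 4 above:

```rust
// Sketch of the staged routing described above. The real pipeline uses
// regexes and an LLM; simple string checks stand in here.
fn route(input: &str) -> &'static str {
    let lower = input.to_lowercase();

    // Stage 2: conversational filter (zero LLM cost).
    if ["hi", "hey", "hello"].iter().any(|g| lower.starts_with(g)) {
        return "chat";
    }

    // Stage 3: quick pattern matching for well-known commands.
    if lower.starts_with("remind me") || lower.starts_with("search for") {
        return "command";
    }

    // Stage 4: workflow guard: only pay for an LLM call when the input is
    // long enough AND contains at least two workflow-hint keywords.
    let words = lower.split_whitespace().count();
    let hints = ["create", "plan", "distribute"]
        .iter()
        .filter(|k| lower.contains(**k))
        .count();
    if words >= 8 && hints >= 2 {
        return "workflow-detection (LLM)";
    }

    // Stage 5: fall through to full LLM intent parsing.
    "intent-parsing (LLM)"
}

fn main() {
    assert_eq!(route("Hey, it's Andreas."), "chat");
    assert_eq!(route("remind me to buy a birthday card"), "command");
    assert_eq!(
        route("please create a plan and distribute the tasks to everyone"),
        "workflow-detection (LLM)"
    );
    assert_eq!(route("what is the capital of France"), "intent-parsing (LLM)");
}
```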

Example: Chat Request

1. HTTP Request arrives at /v1/chat
   │
   ▼
2. presentation_http extracts request, validates auth
   │
   ▼
3. Calls ConversationService.send_message() [application layer]
   │
   ▼
4. ConversationService:
   ├── Loads conversation from ConversationRepository [port]
   ├── Calls InferencePort.generate() [port]
   └── Saves message via ConversationRepository [port]
   │
   ▼
5. InferencePort implementation (ai_core::HailoClient):
   ├── Sends request to Hailo-Ollama
   └── Returns response
   │
   ▼
6. Response flows back through layers
   │
   ▼
7. HTTP Response returned to client

Example: WhatsApp Voice Message

1. WhatsApp Webhook POST to /v1/webhooks/whatsapp
   │
   ▼
2. integration_whatsapp validates signature, parses message
   │
   ▼
3. VoiceMessageService.process() [application layer]
   │
   ├── Download audio via WhatsAppPort
   ├── Convert format via AudioConverter [ai_speech]
   ├── Transcribe via SpeechPort (STT)
   ├── Process text via CommandService
   ├── (Optional) Synthesize via SpeechPort (TTS)
   └── Send response via WhatsAppPort
   │
   ▼
4. Response sent back to user via WhatsApp

Key Design Decisions

1. Async-First

All I/O operations are async using Tokio:

#[async_trait]
pub trait InferencePort: Send + Sync {
    async fn generate(&self, ...) -> Result<..., ...>;
}

Rationale: Maximizes throughput on limited Raspberry Pi resources.

2. Error Handling via thiserror

Each layer defines its own error types:

// domain/src/errors.rs
#[derive(Debug, thiserror::Error)]
pub enum DomainError {
    #[error("Invalid message content: {0}")]
    InvalidContent(String),
}

// application/src/errors.rs
#[derive(Debug, thiserror::Error)]
pub enum ServiceError {
    #[error("Domain error: {0}")]
    Domain(#[from] DomainError),
    #[error("Inference failed: {0}")]
    Inference(String),
}

Rationale: Clear error boundaries, easy conversion between layers.

3. Feature Flags

Optional features reduce binary size:

# Cargo.toml
[features]
default = ["http"]
http = ["axum", "tower", ...]
cli = ["clap", ...]
speech = ["whisper", "piper", ...]

Rationale: Raspberry Pi has limited storage; include only what’s needed.

4. Configuration via config Crate

Layered configuration (defaults → file → env vars):

let config = Config::builder()
    .add_source(config::File::with_name("config"))
    .add_source(config::Environment::with_prefix("PISOVEREIGN"))
    .build()?;

Rationale: Flexibility for different deployment scenarios.

5. Multi-Layer Caching

Request → L1 (Moka, in-memory) → L2 (Redb, persistent) → L3 (Semantic, pgvector) → Backend

| Layer | Type | Storage | Match Method | Use Case |
|---|---|---|---|---|
| L1 | MokaCache | In-memory | Exact string | Hot data, sub-ms access |
| L2 | RedbCache | Disk | Exact string | Warm data, persists across restarts |
| L3 | PgSemanticCache | PostgreSQL/pgvector | Cosine similarity | Semantically equivalent queries |

Decorator Chain Order:

SanitizedInferencePort (outermost)
  └─ CachedInferenceAdapter (exact L1+L2)
       └─ SemanticCachedInferenceAdapter (similarity L3)
            └─ DegradedInferenceAdapter
                 └─ OllamaInferenceAdapter (innermost)

Rationale: Minimize latency and reduce load on inference engine. The semantic layer catches queries that are phrased differently but mean the same thing, significantly improving cache hit rates.
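The L3 match criterion is cosine similarity between the query embedding and a cached query's embedding. A minimal sketch (vector values are made up; real embeddings come from the embedding model):

```rust
// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    // Semantically close queries sit above the reuse threshold;
    // unrelated ones sit far below it.
    assert!(cosine_similarity(&[0.9, 0.1, 0.0], &[0.8, 0.2, 0.0]) > 0.9);
    assert!(cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]) < 0.1);
}
```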

6. In-Process Event Bus

Post-processing work (fact extraction, audit logging, metrics) runs asynchronously via an in-process event bus backed by tokio::sync::broadcast:

ChatService / AgentService
        │
        ▼ publish(DomainEvent)
 ┌──────────────────────┐
 │  TokioBroadcastBus   │
 └──────────────────────┘
    │    │    │    │
    ▼    ▼    ▼    ▼
  Fact  Audit Conv Metrics
  Ext.  Log   Pers. Handler

Key properties:

  • Fire-and-forget — handlers never block the response path
  • DomainEvent enum defined in the domain layer (7 variants)
  • EventBusPort / EventSubscriberPort defined in application ports
  • TokioBroadcastEventBus adapter in infrastructure
  • Handlers spawned conditionally based on available dependencies
  • Channel overflow → Lagged warning, not data loss for the publisher

Rationale: Moves 100–500 ms of per-request post-processing off the critical path, crucial on resource-constrained Raspberry Pi hardware.
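A dependency-free sketch of the fire-and-forget publish path. The real bus uses tokio::sync::broadcast; std mpsc channels stand in here so the example runs without external crates, and the event variant is illustrative:

```rust
// Each subscriber gets its own channel and its own copy of every event;
// the publisher never blocks on a slow or dropped subscriber.
use std::sync::mpsc;

#[derive(Clone, Debug, PartialEq)]
enum DomainEvent {
    MessageStored { conversation_id: u64 },
}

struct EventBus {
    subscribers: Vec<mpsc::Sender<DomainEvent>>,
}

impl EventBus {
    fn subscribe(&mut self) -> mpsc::Receiver<DomainEvent> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    /// Fire-and-forget: send errors are ignored, analogous to the
    /// Lagged warning in the broadcast-based implementation.
    fn publish(&self, event: &DomainEvent) {
        for tx in &self.subscribers {
            let _ = tx.send(event.clone());
        }
    }
}

fn main() {
    let mut bus = EventBus { subscribers: Vec::new() };
    let audit_log = bus.subscribe();
    let metrics = bus.subscribe();

    bus.publish(&DomainEvent::MessageStored { conversation_id: 42 });

    // Every subscriber receives its own copy of the event.
    assert_eq!(audit_log.recv().unwrap(), DomainEvent::MessageStored { conversation_id: 42 });
    assert_eq!(metrics.recv().unwrap(), DomainEvent::MessageStored { conversation_id: 42 });
}
```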

7. Agentic Multi-Agent Orchestration

Complex tasks can be decomposed into parallel sub-tasks, each executed by an independent AI agent:

User Request: "Plan my trip to Berlin next week"
        │
        ▼ POST /v1/agentic/tasks
 ┌──────────────────────────┐
 │  AgenticOrchestrator     │
 │  (application service)   │
 └──────────────────────────┘
    │        │        │
    ▼        ▼        ▼
 SubAgent  SubAgent  SubAgent
 (weather) (calendar)(transit)
    │        │        │
    └────────┴────────┘
             │
             ▼
      Aggregated Result

Key properties:

  • Wave-based parallel execution with configurable concurrency limits
  • Dependency tracking between sub-tasks
  • Individual sub-agent timeouts and total task timeouts
  • Real-time progress via SSE streaming (/v1/agentic/tasks/{id}/stream)
  • Task cancellation support
  • Approval gates for sensitive operations
  • Domain entities in domain, orchestration in application, event bus in infrastructure, REST handlers in presentation_http, UI in presentation_web
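Wave-based scheduling with dependency tracking can be sketched as follows; the task names and helper function are examples, not the real orchestrator API:

```rust
// Group sub-tasks into parallel "waves": every task whose dependencies
// are already complete runs in the same wave.
use std::collections::HashSet;

fn waves(tasks: &[(&'static str, Vec<&'static str>)]) -> Vec<Vec<&'static str>> {
    let mut done: HashSet<&str> = HashSet::new();
    let mut remaining = tasks.to_vec();
    let mut result = Vec::new();
    while !remaining.is_empty() {
        // Ready = all dependencies satisfied by earlier waves.
        let (ready, rest): (Vec<_>, Vec<_>) = remaining
            .into_iter()
            .partition(|(_, deps)| deps.iter().all(|d| done.contains(d)));
        if ready.is_empty() {
            break; // unresolvable (cyclic) dependencies
        }
        done.extend(ready.iter().map(|(name, _)| *name));
        result.push(ready.into_iter().map(|(name, _)| name).collect());
        remaining = rest;
    }
    result
}

fn main() {
    let tasks = vec![
        ("weather", vec![]),
        ("calendar", vec![]),
        ("transit", vec!["calendar"]), // needs the appointments first
    ];
    assert_eq!(waves(&tasks), vec![vec!["weather", "calendar"], vec!["transit"]]);
}
```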

Further Reading

Web Frontend

🌐 SolidJS-based progressive web application for PiSovereign

The web frontend provides a modern chat interface for interacting with PiSovereign through any browser. It is built with SolidJS and embedded directly into the Rust binary at compile time via rust-embed.


Overview

PiSovereign’s web frontend is a single-page application (SPA) that communicates with the backend via REST and Server-Sent Events (SSE). Key design goals:

  • Zero external hosting — Assets are embedded in the Rust binary, no separate web server needed
  • Offline-capable — PWA with service worker for offline resilience
  • Privacy-first — No external CDNs, fonts, or analytics; everything is self-contained
  • Lightweight — ~200 KB production bundle with code splitting

Technology Stack

| Technology | Version | Purpose |
|---|---|---|
| SolidJS | 1.9.x | Reactive UI framework |
| SolidJS Router | 0.15.x | Client-side routing |
| TypeScript | 5.7 | Type safety (strict mode) |
| Vite | 6.x | Build tool & dev server |
| Tailwind CSS | 4.x | Utility-first styling |
| Vitest | 3.x | Unit & component testing |
| vite-plugin-pwa | 1.x | Service worker generation |

Architecture

Directory Structure

crates/presentation_web/
├── Cargo.toml              # Rust crate manifest
├── dist/                   # Vite build output (gitignored)
├── frontend/               # SolidJS source code
│   ├── index.html          # HTML entry point
│   ├── package.json        # Node dependencies
│   ├── tsconfig.json       # TypeScript configuration
│   ├── vite.config.ts      # Vite build configuration
│   ├── vitest.config.ts    # Test configuration
│   └── src/
│       ├── index.tsx        # Application entry point
│       ├── app.tsx          # Root component with router
│       ├── api/             # REST & SSE API clients
│       ├── components/      # Reusable UI components
│       │   └── ui/          # Base UI primitives
│       ├── hooks/           # Reactive hooks
│       ├── lib/             # Utilities (cn, format, sanitize)
│       ├── pages/           # Route page components
│       ├── stores/          # Global state stores
│       └── types/           # TypeScript type definitions
└── src/                    # Rust source code
    ├── lib.rs              # Crate root
    ├── assets.rs           # rust-embed asset struct
    ├── csp.rs              # Content Security Policy
    ├── handler.rs          # Static file handler with caching
    └── routes.rs           # SPA router & axum integration

Component Architecture

The frontend follows a layered component architecture:

Pages (routes)
  └── Composed Components (chat, settings panels)
        └── UI Primitives (Button, Card, Modal, Badge, Spinner)
              └── Utility Functions (cn, format, sanitize)

Pages are lazy-loaded via solid-router for code splitting:

  • / — Chat interface (main interaction view)
  • /settings — Application settings
  • /commands — Available bot commands
  • /memories — Memory inspection
  • /audit — Audit log viewer
  • /health — System health dashboard

UI Primitives (components/ui/) are unstyled, composable building blocks:

  • Button — With variants (default, outline, ghost, destructive) and sizes
  • Card — Container with header/content/footer slots
  • Modal — Dialog with focus trap and backdrop
  • Badge — Status indicators with color variants
  • Spinner — Loading indicators

State Management

Global state uses SolidJS signals organized into stores:

| Store | Purpose |
|---|---|
| auth.store | Authentication state, token management |
| chat.store | Conversations, messages, active chat |
| theme.store | Dark/light mode preference |
| toast.store | Notification queue |

Stores are accessed via the StoreProvider context, available throughout the component tree.

API Client Layer

The api/ directory contains typed REST clients:

| Client | Endpoint | Purpose |
|---|---|---|
| client.ts | - | Base HTTP client with auth headers |
| chat.api.ts | /api/v1/chat | Send messages, SSE streaming |
| health.api.ts | /api/v1/health | System health status |
| memories.api.ts | /api/v1/memories | Memory CRUD |
| audit.api.ts | /api/v1/audit | Audit log queries |
| commands.api.ts | /api/v1/commands | Bot command listing |
| settings.api.ts | /api/v1/settings | User preferences |

All API calls go through client.ts, which handles:

  • Bearer token injection from the auth store
  • Base URL resolution (same origin in production, proxy in dev)
  • JSON serialization/deserialization
  • Error response mapping

Development

Prerequisites

  • Node.js ≥ 22 (LTS recommended)
  • npm ≥ 10

Getting Started

# Install dependencies
just web-install

# Start development server with hot reload
just web-dev

The Vite dev server starts on http://localhost:5173 and proxies API requests to the backend at http://localhost:3000.

Available Commands

All frontend tasks are available via just:

| Command | Description |
|---|---|
| just web-install | Install npm dependencies |
| just web-build | Production build → dist/ |
| just web-dev | Start Vite dev server with HMR |
| just web-lint | Run ESLint checks |
| just web-lint-fix | Auto-fix ESLint issues |
| just web-fmt | Format with Prettier |
| just web-test | Run Vitest test suite |
| just web-test-coverage | Run tests with coverage report |
| just web-typecheck | TypeScript type checking |

Development Workflow

  1. Start the backend: just run or cargo run
  2. Start the frontend dev server: just web-dev
  3. Open http://localhost:5173 in your browser
  4. Edit SolidJS components — changes are reflected instantly via HMR

The Vite dev server proxies /api/* requests to localhost:3000, so you get the full API available during development.

Build & Deployment

Production Build

just web-build

This runs vite build which outputs optimized assets to crates/presentation_web/dist/. The build:

  • Tree-shakes unused code
  • Minifies JS/CSS
  • Adds content hashes to filenames for cache busting
  • Generates PWA service worker
  • Code-splits routes for lazy loading
  • Outputs ~200 KB total (gzipped ~60 KB)

Rust Integration

The presentation_web crate embeds the dist/ directory at compile time using rust-embed:

#[derive(Embed)]
#[folder = "dist/"]
pub struct FrontendAssets;

The Rust handler layer provides:

  • Content-type detection — MIME types based on file extension
  • Cache control — Immutable caching for hashed assets, no-cache for HTML
  • ETag support — Conditional requests via If-None-Match
  • CSP headers — Content Security Policy for XSS protection
  • SPA fallback — Unknown routes serve index.html for client-side routing
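As a rough illustration, the content-type and cache-control decisions boil down to small pure functions like these (a sketch only — the function names and the MIME subset are ours, not the actual presentation_web code):

```rust
/// Map a file extension to a MIME type (illustrative subset).
fn mime_for(path: &str) -> &'static str {
    match path.rsplit('.').next() {
        Some("html") => "text/html",
        Some("js") => "text/javascript",
        Some("css") => "text/css",
        Some("json") => "application/json",
        Some("svg") => "image/svg+xml",
        Some("webmanifest") => "application/manifest+json",
        _ => "application/octet-stream",
    }
}

/// Hashed assets are immutable and can be cached forever;
/// HTML must always be revalidated so new deployments are picked up.
fn cache_control_for(path: &str) -> &'static str {
    if path.ends_with(".html") {
        "no-cache"
    } else {
        "public, max-age=31536000, immutable"
    }
}
```

Because filenames carry content hashes, a changed asset gets a new URL, so the aggressive immutable caching is safe.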

Important: You must run just web-build before cargo build so the dist/ directory is populated. The Docker build handles this automatically.

Docker Build

The Dockerfile uses a multi-stage build:

# Stage 1: Build frontend
FROM node:22-alpine AS frontend-builder
WORKDIR /app
COPY crates/presentation_web/frontend/ .
RUN npm ci && npm run build

# Stage 2: Build Rust binary
FROM rust:1.93-slim-bookworm AS builder
WORKDIR /app
COPY . .
COPY --from=frontend-builder /app/dist/ crates/presentation_web/dist/
RUN cargo build --release

# Stage 3: Runtime
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/pisovereign .

This ensures the frontend is always built fresh and embedded into the binary.

Testing

Unit & Component Tests

Tests use Vitest with @solidjs/testing-library for component tests and MSW (Mock Service Worker) for API mocking.

# Run all tests
just web-test

# Run with coverage
just web-test-coverage

Test structure mirrors the source layout:

frontend/src/
├── lib/__tests__/          # Utility tests (cn, format, sanitize)
├── stores/__tests__/       # Store logic tests
├── api/__tests__/          # API client tests
└── components/ui/__tests__/ # Component rendering tests

The current suite comprises 93 tests across utilities, stores, API clients, and UI components.

End-to-End Tests (Playwright)

The project includes Playwright-based E2E journey tests that simulate real user interactions against a live application instance on localhost:3000. These tests cover every page, CRUD operation, and user action in the frontend.

Setup

# Install Playwright browsers (one-time)
just web-e2e-install

# Ensure the Docker stack is running
just docker-up

Running E2E Tests

# Run all journey tests
just web-e2e

# Run with interactive UI (for debugging)
just web-e2e-ui

# View the last HTML report
just web-e2e-report

Skipping LLM-Dependent Tests

Tests tagged @llm (Chat, Agentic) require Ollama to be running. To skip them:

cd crates/presentation_web/frontend && npx playwright test --grep-invert @llm

Architecture

frontend/e2e/
├── global-setup.ts              # Authenticates once, saves session cookie
├── fixtures/
│   └── auth.fixture.ts          # Pre-authenticated page fixture
├── reporters/
│   └── bugreport.reporter.ts    # Auto-generates bug reports on failure
├── helpers/
│   ├── navigation.helper.ts     # Sidebar navigation, page-load utilities
│   └── form.helper.ts           # Form fills, modal helpers, test IDs
└── journeys/
    ├── auth.journey.spec.ts         # Login, logout, session persistence
    ├── dashboard.journey.spec.ts    # Stat cards, quick actions, sections
    ├── chat.journey.spec.ts         # Send message, SSE streaming (@llm)
    ├── conversations.journey.spec.ts # List, search, delete conversations
    ├── commands.journey.spec.ts     # Parse, execute, catalog CRUD
    ├── approvals.journey.spec.ts    # List, approve, deny requests
    ├── contacts.journey.spec.ts     # Full CRUD + search
    ├── calendar.journey.spec.ts     # Views, event CRUD, date navigation
    ├── tasks.journey.spec.ts        # Task CRUD, filters, completion
    ├── kanban.journey.spec.ts       # Board columns, filter buttons
    ├── memory.journey.spec.ts       # Memory CRUD, search, decay, stats
    ├── agentic.journey.spec.ts      # Multi-agent task lifecycle (@llm)
    ├── mailing.journey.spec.ts      # Email list, refresh, mark read
    └── system.journey.spec.ts       # Status, models, health checks

Writing New Journey Tests

Journey tests follow a consistent pattern using test.step() for structured reproduction steps:

import { test, expect } from '../fixtures/auth.fixture';

test.describe('Feature Journey', () => {
  test('complete user flow', async ({ page }) => {
    await test.step('navigate to the page', async () => {
      await page.goto('/feature');
      await page.waitForLoadState('networkidle');
    });

    await test.step('perform user action', async () => {
      await page.locator('button:has-text("Action")').click();
      await expect(page.locator('text=Result')).toBeVisible();
    });
  });
});

Key conventions:

  • File naming: {feature}.journey.spec.ts
  • Test steps: Use test.step() — these feed the bugreport reporter for clear reproduction steps
  • Data cleanup: Create test data with unique IDs (testId() helper) and clean up in afterAll or at the end of the test
  • Timeouts: Use generous timeouts (60s) for LLM/SSE-dependent tests and tag them with @llm
  • Resilience: Use .catch(() => false) for optional UI elements that may or may not exist depending on backend state

Automatic Bug Reports

When a test fails, the custom BugreportReporter writes a detailed markdown file to bugreports/:

  • Title and test metadata (file, line, browser, duration)
  • Steps to reproduce extracted from test.step() annotations
  • Error message and stack trace
  • Screenshot paths (captured on failure)
  • Environment details (OS, Node.js version)

Bug reports are named YYYY-MM-DD-e2e-{test-slug}.md for chronological ordering.

Code Quality

The project enforces strict code quality standards:

  • TypeScript strict mode — All strict checks enabled, including exactOptionalPropertyTypes
  • ESLint — SolidJS-specific rules + TypeScript checks (0 errors required)
  • Prettier — Consistent formatting
  • Pre-commit checksjust pre-commit runs lint, typecheck, and tests

Quality gates are integrated into the just quality and just pre-commit recipes, which run both frontend and backend checks together.

PWA Support

The frontend is a Progressive Web App with:

  • Service Worker — Generated by vite-plugin-pwa using Workbox
  • Offline support — Cached assets served when offline
  • Installable — Add to home screen on mobile devices
  • Web manifest — App name, icons, theme colors defined in manifest.webmanifest

The service worker uses a cache-first strategy for static assets and network-first for API calls.

Styling

Styling uses Tailwind CSS v4 with a custom design system:

  • CSS custom properties for theming (dark/light mode)
  • Utility classes for layout, spacing, typography
  • cn() helper — Merges Tailwind classes with conflict resolution via tailwind-merge + clsx
  • No external CSS frameworks — Everything is built from Tailwind utilities

The color palette follows a navy/slate theme matching PiSovereign’s brand identity.

Security

The embedded frontend includes several security measures:

  • Content Security Policy (CSP) — Restricts script sources, style sources, and connections
  • No inline scripts — All JavaScript is loaded from hashed asset files
  • Same-origin API calls — No cross-origin requests by design
  • No external dependencies at runtime — Fonts, icons, and all assets are self-hosted
  • Auth token handling — Tokens stored in memory (SolidJS signals), not localStorage

See the Security Hardening guide for production deployment recommendations.

AI Memory System

PiSovereign includes a persistent AI memory system that enables your assistant to remember facts, preferences, and past interactions. This creates a more personalized and contextually aware experience.

Overview

The memory system provides:

  • Persistent Storage: All interactions can be stored in PostgreSQL with encryption at rest
  • Semantic Search (RAG): Retrieve relevant memories based on meaning, not just keywords
  • Automatic Learning: The AI learns from conversations automatically
  • Memory Decay: Less important or rarely accessed memories fade over time
  • Deduplication: Similar memories are merged to prevent redundancy
  • Content Encryption: Sensitive data is encrypted at rest using XChaCha20-Poly1305

How It Works

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   User Query    │────▶│   RAG Retrieval  │────▶│  Context + Query│
│  "What's my     │     │  (Top 5 similar) │     │  sent to LLM    │
│   favorite..."  │     └──────────────────┘     └─────────────────┘
└─────────────────┘              │                        │
                                 │                        ▼
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Stored Memory   │◀────│  Learning Phase  │◀────│   AI Response   │
│ (Encrypted)     │     │ (Q&A + Metadata) │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘

1. RAG Context Retrieval

When you ask a question:

  1. The query is converted to an embedding vector using nomic-embed-text
  2. Similar memories are found using cosine similarity search
  3. The top N most relevant memories are sorted by type priority (corrections and facts first) and injected into the prompt with an instructive preamble that explicitly tells the LLM to treat them as known facts
  4. Full memory content is used (not truncated summaries), with a 2,000-character budget to stay within the token window
  5. The AI generates a response with full context
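Steps 1–2 can be sketched in a few lines — cosine similarity over embedding vectors, then a threshold-filtered top-N selection (illustrative code; in the real system the similarity search runs inside PostgreSQL via pgvector):

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Pick the top-N memories whose similarity clears the threshold,
/// most similar first (mirrors rag_limit / rag_threshold).
fn top_n(query: &[f64], memories: &[(&str, Vec<f64>)], n: usize, threshold: f64) -> Vec<String> {
    let mut scored: Vec<(f64, &str)> = memories
        .iter()
        .map(|(text, emb)| (cosine_similarity(query, emb), *text))
        .filter(|(s, _)| *s >= threshold)
        .collect();
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().take(n).map(|(_, t)| t.to_string()).collect()
}
```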

2. Automatic Learning

After each response (including streamed responses):

  1. The Q&A pair is evaluated for importance using lightweight heuristics (no LLM call):
    • AI naming cues (“nenn dich”, “your name is”, “du heißt”) → +0.40
    • Identity cues (“my name is”, “I live in”, “ich heiße”) → +0.35
    • Correction cues (“that’s wrong”, “please remember”, “eigentlich”) → +0.30
    • Preference cues (“I prefer”, “I like”, “ich mag”) → +0.25
    • Word count adjustments (longer = more valuable)
    • Final score clamped to [0.2, 0.9]
  2. The memory type is automatically classified (priority order):
    • AI naming signals → Fact
    • Correction signals → Correction
    • Preference signals → Preference
    • Identity/fact signals → Fact
    • Default → Context
  3. Embeddings are generated for semantic search
  4. If a similar memory exists (>85% similarity), they’re merged (on plaintext, before encryption)
  5. Content is encrypted before storage
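A minimal sketch of the heuristic scorer (cue lists abridged; the 0.3 baseline and the omission of word-count adjustments are our simplifications, not the exact importance.rs logic):

```rust
/// Heuristic importance estimate — keyword cues plus clamping,
/// no LLM call. Baseline of 0.3 is an assumption for illustration.
fn estimate_importance(text: &str) -> f64 {
    let t = text.to_lowercase();
    let mut score: f64 = 0.3;
    if t.contains("your name is") || t.contains("nenn dich") { score += 0.40; }
    if t.contains("my name is") || t.contains("ich heiße") { score += 0.35; }
    if t.contains("that's wrong") || t.contains("please remember") { score += 0.30; }
    if t.contains("i prefer") || t.contains("ich mag") { score += 0.25; }
    score.clamp(0.2, 0.9) // final score stays in [0.2, 0.9]
}
```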

Note: Both the HTTP chat endpoint (ChatService) and the messenger path (MemoryEnhancedChat) use the same shared heuristic module (importance.rs) for consistent importance estimation and type classification.

3. Memory Types

| Type | Purpose | Example |
| --- | --- | --- |
| Fact | General knowledge | “Paris is the capital of France” |
| Preference | User preferences | “User prefers dark mode” |
| Correction | Feedback/corrections | “Actually, the meeting is Tuesday not Monday” |
| ToolResult | API/tool outputs | “Weather API returned: 22°C, sunny” |
| Context | Conversation context | “Q: What time is it? A: 3:00 PM” |

4. Relevance Scoring

When memories are retrieved for RAG context, they are ranked using a combined relevance score that balances three factors:

relevance_score = similarity × 0.50  +  importance × 0.20  +  freshness × 0.30

Where:

  • similarity (50%): Cosine similarity between query and memory embeddings (0.0–1.0)
  • importance (20%): Current importance after decay (0.0–1.0), with per-type floors
  • freshness (30%): Exponential decay based on time since last access: e^(-0.01 × hours). Memories from seconds ago score ~1.0, from one day ago ~0.79, from one week ago ~0.19.

This ensures that memories from the current conversation session (stored moments ago) dominate when relevant, while long-term knowledge still contributes via the similarity and importance terms.
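The scoring formula translates directly to code (a sketch; freshness uses the e^(-0.01 × hours) decay described above):

```rust
/// Combined relevance score: 50% similarity, 20% importance, 30% freshness.
fn relevance(similarity: f64, importance: f64, hours_since_access: f64) -> f64 {
    let freshness = (-0.01 * hours_since_access).exp();
    similarity * 0.50 + importance * 0.20 + freshness * 0.30
}
```

With equal similarity and importance, a memory stored seconds ago outranks a week-old one purely through the freshness term.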

After scoring, memories are sorted by type priority before injection:

  1. Corrections (highest priority — user explicitly corrected the AI)
  2. Facts (identity, names, important knowledge)
  3. Preferences
  4. Context
  5. Tool Results

Configuration

Add to your config.toml:

[memory]
# Enable memory storage
enabled = true

# Enable RAG context retrieval
enable_rag = true

# Enable automatic learning from interactions
enable_learning = true

# Number of memories to retrieve for RAG context
rag_limit = 5

# Minimum similarity threshold for RAG retrieval (0.0-1.0)
rag_threshold = 0.5

# Similarity threshold for memory deduplication (0.0-1.0)
merge_threshold = 0.85

# Minimum importance score to keep memories
min_importance = 0.1

# Decay factor for memory importance over time
decay_factor = 0.95

# Enable content encryption
enable_encryption = true

# Path to encryption key file (generated if not exists)
encryption_key_path = "memory_encryption.key"

[memory.embedding]
# Embedding model name
model = "nomic-embed-text"

# Embedding dimension
dimension = 384

# Request timeout in milliseconds
timeout_ms = 30000

Memory Decay

Memory importance decays over time using an Ebbinghaus-inspired model with per-type modifiers that ensure critical memories resist forgetting:

stability      = 1.0 + ln(1 + access_count)
type_modifier  = memory_type.decay_modifier()
effective_rate = (base_decay_rate × type_modifier) / stability
reinforcement  = min(access_count × 0.005, 0.08)

new_importance = max(
    importance × e^(-effective_rate × days) + reinforcement,
    memory_type.importance_floor()
)

Type-specific modifiers

| Memory Type | Decay Modifier | Importance Floor | Effect |
| --- | --- | --- | --- |
| Correction | 0.50 | 0.35 | Decays very slowly, never drops below 0.35 |
| Fact | 0.70 | 0.30 | Decays slowly, never drops below 0.30 |
| Preference | 0.80 | 0.25 | Moderate decay |
| Tool Result | 1.00 | 0.10 | Normal decay, ephemeral |
| Context | 1.00 | 0.10 | Normal decay, ephemeral |

This mirrors the human brain: corrections and facts are “episodic memories” that the brain retains much longer than transient working-memory items.

Other factors:

  • base_decay_rate: Derived from decay_factor (default: 0.95)
  • stability: Grows logarithmically with access count — 1.0 with no accesses, roughly doubling after two accesses, with diminishing returns
  • reinforcement: A small bonus (up to 0.08) that prevents heavily-used memories from vanishing entirely
  • days_since_access: Time elapsed since the memory was last retrieved

Memories below min_importance are automatically cleaned up.
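Putting the decay formula together as a single function (a sketch — in the real code base_rate is derived from decay_factor rather than passed in directly):

```rust
/// One decay step from the Ebbinghaus-inspired model above.
/// `base_rate` stands in for the value derived from `decay_factor`.
fn decay(importance: f64, days: f64, access_count: u32,
         base_rate: f64, type_modifier: f64, floor: f64) -> f64 {
    // Frequently accessed memories are more "stable" and decay more slowly.
    let stability = 1.0 + (1.0 + access_count as f64).ln();
    let effective_rate = (base_rate * type_modifier) / stability;
    // Small bonus (capped at 0.08) so heavily used memories never vanish.
    let reinforcement = (access_count as f64 * 0.005).min(0.08);
    // Exponential decay, then clamp to the per-type importance floor.
    (importance * (-effective_rate * days).exp() + reinforcement).max(floor)
}
```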

Security

Content Encryption

All memory content and summaries are encrypted using:

  • Algorithm: XChaCha20-Poly1305 (AEAD)
  • Key Size: 256 bits
  • Nonce Size: 192 bits (unique per encryption)

The encryption key is stored at encryption_key_path and auto-generated if missing.

⚠️ Important: Backup your encryption key! Without it, encrypted memories cannot be recovered.

Embedding Vectors

Embedding vectors are stored unencrypted to enable similarity search. They reveal:

  • Semantic similarity between memories
  • General topic clustering

They do NOT reveal:

  • Actual content
  • Specific details

Embedding Models

The system supports various Ollama embedding models:

| Model | Dimensions | Use Case |
| --- | --- | --- |
| nomic-embed-text | 384 | Default, balanced |
| mxbai-embed-large | 1024 | Higher accuracy |
| bge-m3 | 1024 | Multilingual |

To use a different model:

[memory.embedding]
model = "mxbai-embed-large"
dimension = 1024

Database Schema

Memories are stored in PostgreSQL with pgvector for similarity search:

-- Main memories table
CREATE TABLE memories (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    conversation_id UUID,
    content TEXT NOT NULL,      -- Encrypted
    summary TEXT NOT NULL,       -- Encrypted
    importance DOUBLE PRECISION NOT NULL,
    memory_type TEXT NOT NULL,
    tags JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    accessed_at TIMESTAMPTZ NOT NULL,
    access_count INTEGER DEFAULT 0,
    embedding vector(384)        -- pgvector column for similarity search
);

-- IVFFlat index for fast cosine similarity search
CREATE INDEX idx_memories_embedding ON memories
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Full-text search index
CREATE INDEX idx_memories_fts ON memories
    USING gin (to_tsvector('english', content || ' ' || summary));

Manual Memory Management

You can manually store specific information:

// Store a fact
memory_service.store_fact(user_id, "User's birthday is March 15", 0.9).await?;

// Store a preference
memory_service.store_preference(user_id, "Prefers metric units", 0.8).await?;

// Store a correction
memory_service.store_correction(user_id, "Actually prefers tea, not coffee", 1.0).await?;

Maintenance

Applying Decay

Memory decay runs as an automatic background task (daily by default); the per-day decay rate is derived from the decay_factor setting. You can also trigger it manually:

let decayed = memory_service.apply_decay().await?;
println!("Decayed {} memories", decayed.len());

Or via the REST API:

curl -X POST -H "Authorization: Bearer $API_KEY" \
  http://localhost:3000/v1/memories/decay

Cleaning Up Low-Importance Memories

let deleted = memory_service.cleanup_low_importance().await?;
println!("Deleted {} memories", deleted);

Statistics

let stats = memory_service.stats(&user_id).await?;
println!("Total: {}, With embeddings: {}, Avg importance: {:.2}",
    stats.total_count, stats.with_embeddings, stats.avg_importance);

Troubleshooting

Memories Not Being Retrieved

  1. Check that enable_rag = true
  2. Verify rag_threshold isn’t too high (try 0.3)
  3. Ensure embeddings are generated (check with_embeddings in stats)
  4. Confirm Ollama is running with the embedding model

High Memory Usage

  1. Lower rag_limit to reduce context size
  2. Run cleanup_low_importance() more frequently
  3. Increase min_importance threshold
  4. Reduce decay_factor for faster decay

Encryption Key Lost

If you lose the encryption key, encrypted memories cannot be recovered.

To start fresh:

  1. Delete memory_encryption.key
  2. Clear the memories and memory_embeddings tables
  3. A new key will be generated on next startup

Architecture

The memory system follows the ports-and-adapters pattern:

  • MemoryContextPort — the primary port interface used by ChatService to inject RAG context into prompts. Implementations receive a query string and return relevant memory snippets.
  • MemoryService — the core service that orchestrates embedding generation, semantic search, encryption, and storage. Requires three ports:
    • MemoryPort — persistence (PostgreSQL adapter)
    • EmbeddingPort — vector generation (Ollama adapter using nomic-embed-text)
    • EncryptionPort — content encryption (ChaChaEncryptionAdapter or NoOpEncryption)

// The MemoryContextPort trait signature
#[async_trait]
pub trait MemoryContextPort: Send + Sync {
    async fn retrieve_context(
        &self,
        user_id: &UserId,
        query: &str,
        limit: usize,
    ) -> Result<Vec<MemoryContext>, MemoryError>;
}

API Endpoints

See the API Reference for full REST API documentation covering:

  • GET /v1/memories — list memories
  • POST /v1/memories — create a memory
  • GET /v1/memories/search?q=... — semantic search
  • GET /v1/memories/stats — storage statistics
  • POST /v1/memories/decay — trigger decay
  • GET /v1/memories/{id} — get specific memory
  • DELETE /v1/memories/{id} — delete memory

LLM Tool Calling (ReAct Agent)

PiSovereign includes a ReAct (Reason + Act) agent that enables the LLM to autonomously invoke tools — weather lookups, calendar queries, web searches, and more — instead of relying solely on rigid command parsing.

How It Works

When a user sends a general question (AgentCommand::Ask), the system follows this flow:

  1. Collect tools — The ToolRegistry asks each wired port which tool definitions are available (e.g., if no weather port is configured, get_weather is omitted).
  2. LLM + tools — The conversation history and tool JSON schemas are sent to Ollama’s /api/chat endpoint with the tools parameter.
  3. Parse response — The LLM either returns a final text response or requests one or more tool calls.
  4. Execute tools — If tool calls are returned, the ToolExecutor dispatches each call to the appropriate port, collects results, and appends them as MessageRole::Tool messages to the conversation.
  5. Loop — Steps 2–4 repeat until the LLM produces a final response or a configurable iteration limit / timeout is reached.

User → LLM (with tool schemas)
         ├─ Final text → done
         └─ Tool calls → execute → append results → loop back to LLM
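The loop shape can be sketched with a stubbed LLM and tool executor (simplified and synchronous; the real ReActAgentService is async and works with typed tool calls rather than plain strings):

```rust
/// Either a final answer or a batch of requested tool calls (simplified).
enum LlmTurn {
    Final(String),
    ToolCalls(Vec<String>),
}

/// Skeleton of the ReAct loop: call the LLM, execute any requested tools,
/// append their results to the transcript, and repeat up to `max_iterations`.
fn react_loop(
    user_msg: &str,
    mut llm: impl FnMut(&[String]) -> LlmTurn,
    mut execute_tool: impl FnMut(&str) -> String,
    max_iterations: usize,
) -> Option<String> {
    let mut transcript: Vec<String> = vec![format!("user: {}", user_msg)];
    for _ in 0..max_iterations {
        match llm(&transcript) {
            LlmTurn::Final(text) => return Some(text),
            LlmTurn::ToolCalls(calls) => {
                for call in calls {
                    let result = execute_tool(&call);
                    // Tool results flow back as MessageRole::Tool messages.
                    transcript.push(format!("tool:{} -> {}", call, result));
                }
            }
        }
    }
    None // iteration limit reached without a final answer
}
```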

Architecture

The implementation follows Clean Architecture:

| Layer | Component | Crate |
| --- | --- | --- |
| Domain | ToolDefinition | domain |
| Domain | ToolCall, ToolResult, ToolCallingResult | domain |
| Domain | MessageRole::Tool, ChatMessage::tool() | domain |
| Application | ToolRegistryPort | application |
| Application | ToolExecutorPort | application |
| Application | InferencePort::generate_with_tools() | application |
| Application | ReActAgentService | application |
| Infrastructure | ToolRegistry | infrastructure |
| Infrastructure | ToolExecutor | infrastructure |
| Infrastructure | OllamaInferenceAdapter (extended) | infrastructure |
| Presentation | Wired in main.rs, used in chat handlers | presentation_http |

Available Tools

The following 18 tools are registered when their corresponding ports are wired:

| Tool | Port Required | Description |
| --- | --- | --- |
| get_weather | WeatherPort | Current weather and forecast |
| search_web | WebSearchPort | Web search via Brave / DuckDuckGo |
| list_calendar_events | CalendarPort | List upcoming calendar events |
| create_calendar_event | CalendarPort | Create a new calendar event |
| search_contacts | ContactPort | Search contacts by name/email |
| get_contact | ContactPort | Get full contact details by ID |
| list_tasks | TaskPort | List tasks/todos with filters |
| create_task | TaskPort | Create a new task |
| complete_task | TaskPort | Mark a task as completed |
| create_reminder | ReminderPort | Schedule a reminder |
| list_reminders | ReminderPort | List active reminders |
| search_transit | TransitPort | Search public transit connections |
| store_memory | MemoryStore | Store a fact in long-term memory |
| recall_memory | MemoryStore | Recall facts from memory |
| execute_code | CodeExecutionPort | Run code in a sandboxed container |
| search_emails | EmailPort | Search emails by query |
| draft_email | EmailPort + DraftStorePort | Draft an email |
| send_email | EmailPort | Send an email |

Configuration

Add to config.toml:

[agent.tool_calling]
# Enable/disable the ReAct agent (default: true)
enabled = true

# Maximum ReAct loop iterations before forcing a final answer
max_iterations = 5

# Timeout per individual tool execution (seconds)
iteration_timeout_secs = 30

# Total timeout for the entire ReAct loop (seconds)
total_timeout_secs = 120

# Run tool calls in parallel when multiple are requested
parallel_tool_execution = true

# Tools that require user approval before execution (future use)
require_approval_for = []

When enabled = false, the system falls back to the standard ChatService::chat_with_context flow without any tool calling.

Relationship to AgentService

The ReAct agent runs alongside the existing AgentService:

  • AgentService handles all structured commands (AgentCommand variants like GetWeather, SearchWeb, CreateTask, etc.) via pattern matching and dedicated handler methods.
  • ReActAgentService handles general questions (AgentCommand::Ask) by letting the LLM decide which tools to call.

The command parsing flow remains unchanged — AgentService::parse_command() still classifies user input. Only Ask commands are routed through the ReAct agent when it’s enabled.

Extending with New Tools

To add a new tool:

  1. Define the port in crates/application/src/ports/ (if not already existing).
  2. Add a tool definition in ToolRegistry — create a def_your_tool() method returning a ToolDefinition with parameter schemas.
  3. Add execution logic in ToolExecutor — create an exec_your_tool() method that extracts arguments, calls the port, and formats the result.
  4. Wire the port in ToolRegistry::collect_tools() and ToolExecutor::execute() dispatch.
  5. Connect in main.rs — pass the port Arc to both ToolRegistry and ToolExecutor via with_your_port() builder methods.
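Step 2 might look roughly like this (the struct shape and the get_stock_price tool are hypothetical stand-ins for the actual domain types):

```rust
/// Simplified stand-in for the domain `ToolDefinition` (field names illustrative).
struct ToolDefinition {
    name: &'static str,
    description: &'static str,
    /// JSON Schema for the arguments, exactly as the LLM will see it.
    parameters: &'static str,
}

/// A hypothetical `def_*` method producing the tool's schema.
fn def_get_stock_price() -> ToolDefinition {
    ToolDefinition {
        name: "get_stock_price",
        description: "Look up the latest price for a ticker symbol",
        parameters: r#"{"type":"object","properties":{"symbol":{"type":"string"}},"required":["symbol"]}"#,
    }
}
```

The matching `exec_*` method would extract `symbol` from the LLM's arguments, call the port, and format the result as a tool message.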

Decorator Forwarding

All inference port decorators forward generate_with_tools() to their inner adapter:

  • SanitizedInferencePort — forwards directly (no sanitization for tool iterations)
  • CachedInferenceAdapter — forwards without caching (tool iterations are non-deterministic)
  • SemanticCachedInferenceAdapter — forwards without semantic caching
  • DegradedInferenceAdapter — forwards with circuit-breaker tracking
  • ModelRoutingAdapter — routes to the most capable (fallback) model
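The forwarding pattern itself is simple — a decorator that implements the same port and delegates to its inner adapter (a synchronous sketch; the real trait is async and the type names here are stand-ins):

```rust
/// Simplified, synchronous stand-in for the inference port.
trait InferencePort {
    fn generate_with_tools(&self, prompt: &str) -> String;
}

/// Stand-in for the concrete Ollama adapter.
struct OllamaStub;
impl InferencePort for OllamaStub {
    fn generate_with_tools(&self, prompt: &str) -> String {
        format!("response to: {}", prompt)
    }
}

/// A caching-style decorator that deliberately does NOT cache tool
/// iterations — it only forwards, as described for CachedInferenceAdapter.
struct CachedDecorator<P: InferencePort> {
    inner: P,
}
impl<P: InferencePort> InferencePort for CachedDecorator<P> {
    fn generate_with_tools(&self, prompt: &str) -> String {
        // Tool iterations are non-deterministic, so skip the cache entirely.
        self.inner.generate_with_tools(prompt)
    }
}
```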

Relationship to Agentic Mode

The ReAct agent handles single-turn tool calling — one user query, one LLM loop deciding which tools to invoke. Agentic Mode extends this to multi-agent orchestration:

| Aspect | ReAct Agent | Agentic Mode |
| --- | --- | --- |
| Scope | Single query | Complex multi-step task |
| Agents | 1 LLM loop | Multiple parallel sub-agents |
| Endpoint | POST /v1/chat | POST /v1/agentic/tasks |
| Progress | Synchronous or SSE chat stream | SSE task progress stream |
| Config | [agent.tool_calling] | [agentic] |

Each agentic sub-agent internally uses the same ReAct tool-calling loop. The orchestrator (AgenticOrchestrator) decomposes the user’s request, spawns sub-agents, and aggregates their results.

See API Reference — Agentic Tasks for endpoint documentation.

Contributing

🤝 Guidelines for contributing to PiSovereign

Thank you for your interest in contributing to PiSovereign! This guide will help you get started.

Code of Conduct

This project adheres to a Code of Conduct. By participating, you are expected to:

  • Be respectful and inclusive
  • Accept constructive criticism gracefully
  • Focus on what’s best for the community
  • Show empathy towards others

Development Setup

Prerequisites

| Requirement | Version | Notes |
| --- | --- | --- |
| Rust | 1.93.0+ | Edition 2024 |
| Just | Latest | Command runner |
| SQLite | 3.x | Development database |
| FFmpeg | 5.x+ | Audio processing |

Environment Setup

  1. Clone the repository
git clone https://github.com/twohreichel/PiSovereign.git
cd PiSovereign
  2. Install Rust toolchain
# Install rustup if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install required components
rustup component add rustfmt clippy

# Install nightly for docs (optional)
rustup toolchain install nightly
  3. Install Just
# macOS
brew install just

# Linux
cargo install just
  4. Install development dependencies
# macOS
brew install sqlite ffmpeg

# Ubuntu/Debian
sudo apt install libsqlite3-dev ffmpeg pkg-config libssl-dev
  5. Verify setup
# Run quality checks
just quality

# Build the project
just build

Running Tests

# Run all tests
just test

# Run tests with output
just test-verbose

# Run specific crate tests
cargo test -p domain
cargo test -p application

# Run integration tests
cargo test --test '*' -- --ignored

# Generate coverage report
just coverage

Code Style

Rust Formatting

We use rustfmt with custom configuration:

# Format all code
just fmt

# Check formatting (CI will fail if not formatted)
just fmt-check

Configuration in rustfmt.toml:

edition = "2024"
max_width = 100
use_small_heuristics = "Default"
imports_granularity = "Crate"
group_imports = "StdExternalCrate"

Clippy Lints

We enforce strict Clippy lints:

# Run clippy
just lint

# Auto-fix issues
just lint-fix

Key lint categories enabled:

  • clippy::pedantic - Strict lints
  • clippy::nursery - Experimental but useful lints
  • clippy::cargo - Cargo.toml best practices

Commit Messages

We follow Conventional Commits:

<type>(<scope>): <description>

[optional body]

[optional footer(s)]

Types:

| Type | Description |
| --- | --- |
| feat | New feature |
| fix | Bug fix |
| docs | Documentation only |
| style | Code style (formatting, no logic change) |
| refactor | Code change that neither fixes a bug nor adds a feature |
| perf | Performance improvement |
| test | Adding or updating tests |
| chore | Maintenance tasks |

Examples:

feat(api): add streaming chat endpoint

Implements SSE-based streaming for /v1/chat/stream endpoint.
Supports token-by-token response streaming for better UX.

Closes #123

fix(inference): handle timeout gracefully

Previously, inference timeouts caused a panic. Now returns
a proper error response with retry information.

Documentation

All public APIs must be documented:

/// Processes a user message and returns an AI response.
///
/// This method handles the full conversation flow including:
/// - Loading conversation context
/// - Calling the inference engine
/// - Persisting the response
///
/// # Arguments
///
/// * `conversation_id` - Optional ID to continue existing conversation
/// * `message` - The user's message content
///
/// # Returns
///
/// Returns the AI's response or an error if processing fails.
///
/// # Errors
///
/// - `ServiceError::Inference` - If the inference engine is unavailable
/// - `ServiceError::Database` - If conversation persistence fails
///
/// # Examples
///
/// ```rust,ignore
/// let response = service.send_message(
///     Some(conversation_id),
///     "What's the weather?".to_string(),
/// ).await?;
/// ```
pub async fn send_message(
    &self,
    conversation_id: Option<ConversationId>,
    message: String,
) -> Result<Message, ServiceError> {
    // ...
}

Pull Request Process

Before You Start

  1. Check existing issues/PRs

    • Look for related issues or PRs
    • Comment on the issue you want to work on
  2. Create an issue first (for features)

    • Describe the feature
    • Discuss approach before implementing
  3. Fork and branch

    git checkout -b feat/my-feature
    # or
    git checkout -b fix/issue-123
    

Creating a PR

  1. Ensure quality checks pass

    just pre-commit
    
  2. Write/update tests

    • Add tests for new functionality
    • Ensure existing tests still pass
  3. Update documentation

    • Update relevant docs in docs/
    • Add doc comments to new public APIs
  4. Push and create PR

    git push origin feat/my-feature
    
  5. Fill out PR template

    • Description of changes
    • Related issues
    • Testing performed
    • Breaking changes (if any)

PR Template

## Description
Brief description of what this PR does.

## Related Issues
Fixes #123
Related to #456

## Type of Change
- [ ] Bug fix (non-breaking)
- [ ] New feature (non-breaking)
- [ ] Breaking change
- [ ] Documentation update

## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manually tested on Raspberry Pi

## Checklist
- [ ] Code follows project style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
- [ ] No new warnings

Review Process

  1. Automated checks must pass:

    • Format check (rustfmt)
    • Lint check (clippy)
    • Tests (all platforms)
    • Coverage (no significant decrease)
    • Security scan (cargo-deny)
  2. Human review:

    • At least one maintainer approval required
    • Address all review comments
  3. Merge:

    • Squash and merge for clean history
    • Delete branch after merge

Development Workflow

Common Tasks

# Full quality check (run before pushing)
just quality

# Quick pre-commit check
just pre-commit

# Run the server locally
just run

# Run CLI commands
just cli status
just cli chat "Hello"

# Generate and view documentation
just docs

# Clean build artifacts
just clean

Project Structure

PiSovereign/
├── crates/                 # Rust crates
│   ├── domain/            # Core business logic
│   ├── application/       # Use cases, services
│   ├── infrastructure/    # External adapters
│   ├── ai_core/          # Inference engine
│   ├── ai_speech/        # Speech processing
│   ├── integration_*/    # Service integrations
│   └── presentation_*/   # HTTP API, CLI
├── docs/                  # mdBook documentation
├── grafana/              # Monitoring configuration
├── migrations/           # Database migrations
└── .github/              # CI/CD workflows

Adding a New Feature

  1. Domain layer (if new entities/values needed)

    # Edit crates/domain/src/entities/mod.rs
    # Add new entity module
    
  2. Application layer (service logic)

    # Add port trait in crates/application/src/ports/
    # Add service in crates/application/src/services/
    
  3. Infrastructure layer (adapters)

    # Implement port in crates/infrastructure/src/adapters/
    
  4. Presentation layer (API endpoints)

    # Add handler in crates/presentation_http/src/handlers/
    # Add route in crates/presentation_http/src/router.rs
    
  5. Tests

    # Unit tests alongside code
    # Integration tests in crates/*/tests/
    

Database Migrations

# Create new migration
cat > migrations/V007__my_migration.sql << 'EOF'
-- Description of migration
CREATE TABLE my_table (
    id TEXT PRIMARY KEY,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
EOF

# Migrations run automatically on startup (if enabled)
# Or manually:
pisovereign-cli migrate
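Migration files must apply in version order. As a hedged illustration (not the project's actual loader), a runner for the `V###__name.sql` convention shown above could order files like this:

```python
import re

MIGRATION_RE = re.compile(r"^V(\d+)__(.+)\.sql$")

def ordered_migrations(filenames):
    """Sort migration files by their numeric version prefix."""
    versioned = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match:  # ignore files that don't follow the naming convention
            versioned.append((int(match.group(1)), name))
    return [name for _, name in sorted(versioned)]
```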

Getting Help

Thank you for contributing! 🎉

Crate Reference

📦 Detailed documentation of all PiSovereign crates

This document provides comprehensive documentation for each crate in the PiSovereign workspace.

Overview

PiSovereign consists of 12 crates organized by architectural layer:

| Layer | Crates | Purpose |
|-------|--------|---------|
| Domain | domain | Core business logic, entities, value objects |
| Application | application | Use cases, services, port definitions |
| Infrastructure | infrastructure | Database, cache, secrets, telemetry |
| AI | ai_core, ai_speech | Inference engine, speech processing |
| Integration | integration_* | External service adapters |
| Presentation | presentation_* | HTTP API, CLI |

Domain Layer

domain

Purpose: Contains the core business logic and defines the ubiquitous language of the application.

Dependencies: No other workspace crates; only foundational libraries (uuid, chrono, thiserror), as the examples below show

Entities

| Entity | Description |
|--------|-------------|
| User | Represents a system user with profile information |
| Conversation | A chat conversation containing messages |
| Message | A single message in a conversation |
| ApprovalRequest | Pending approval for sensitive operations |
| AuditEntry | Audit log entry for compliance |
| CalendarEvent | Calendar event representation |
| EmailMessage | Email representation |
| WeatherData | Weather information |

// Example: Conversation entity
pub struct Conversation {
    pub id: ConversationId,
    pub title: Option<String>,
    pub system_prompt: Option<String>,
    pub messages: Vec<Message>,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}

Value Objects

| Value Object | Description |
|--------------|-------------|
| UserId | Unique user identifier (UUID) |
| ConversationId | Unique conversation identifier |
| MessageContent | Validated message content |
| TenantId | Multi-tenant identifier |
| PhoneNumber | Validated phone number |

// Example: UserId value object
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct UserId(Uuid);

impl UserId {
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }
    
    pub fn from_uuid(uuid: Uuid) -> Self {
        Self(uuid)
    }
}

Commands

| Command | Description |
|---------|-------------|
| UserCommand | Commands from users (Briefing, Ask, Help, etc.) |
| SystemCommand | Internal system commands |

// User command variants
pub enum UserCommand {
    MorningBriefing,
    CreateCalendarEvent { title: String, start: DateTime<Utc>, end: DateTime<Utc> },
    SummarizeInbox { count: usize },
    DraftEmail { to: String, subject: String },
    SendEmail { draft_id: String },
    Ask { query: String },
    Echo { message: String },
    Help,
}

Domain Errors

#[derive(Debug, thiserror::Error)]
pub enum DomainError {
    #[error("Invalid message content: {0}")]
    InvalidContent(String),
    
    #[error("Conversation not found: {0}")]
    ConversationNotFound(ConversationId),
    
    #[error("User not authorized: {0}")]
    Unauthorized(String),
}

Application Layer

application

Purpose: Orchestrates use cases by coordinating domain entities and infrastructure through port interfaces.

Dependencies: domain

Services

| Service | Description |
|---------|-------------|
| AgentService | Intent routing pipeline (conversational filter → quick patterns → workflow detection → LLM intent) |
| ChatService | LLM chat with RAG context injection and automatic memory storage |
| ConversationService | Manages conversations and messages |
| VoiceMessageService | STT → LLM → TTS pipeline |
| CommandService | Parses and executes user commands |
| MemoryService | Memory storage, semantic search, encryption, decay, and deduplication |
| ApprovalService | Handles approval workflows |
| BriefingService | Generates morning briefings |
| CalendarService | Calendar operations |
| EmailService | Email operations |
| HealthService | System health checks |

Command Parser Modules

| Module | Description |
|--------|-------------|
| conversational_filter | Zero-LLM-cost regex filter for greetings, introductions, and small talk |
| llm | LLM-based intent parsing with confidence scoring and keyword post-validation |
| workflow_parser | Multi-step workflow detection with hardened negative examples |

// Example: ConversationService
pub struct ConversationService<R, I>
where
    R: ConversationRepository,
    I: InferencePort,
{
    repository: Arc<R>,
    inference: Arc<I>,
}

impl<R, I> ConversationService<R, I>
where
    R: ConversationRepository,
    I: InferencePort,
{
    pub async fn send_message(
        &self,
        conversation_id: Option<ConversationId>,
        content: String,
    ) -> Result<Message, ServiceError> {
        // 1. Load or create conversation
        // 2. Build prompt with context
        // 3. Call inference engine
        // 4. Save and return response
    }
}

Ports (Trait Definitions)

| Port | Description |
|------|-------------|
| InferencePort | LLM inference operations |
| ConversationRepository | Conversation persistence |
| MemoryPort | Memory persistence (store, search, decay) |
| MemoryContextPort | RAG context injection into prompts |
| EmbeddingPort | Embedding vector generation |
| EncryptionPort | Content encryption/decryption |
| SecretStore | Secret management |
| CachePort | Caching abstraction |
| CalendarPort | Calendar operations |
| EmailPort | Email operations |
| WeatherPort | Weather data |
| SpeechPort | STT/TTS operations |
| WhatsAppPort | WhatsApp messaging |
| ApprovalRepository | Approval persistence |
| AuditRepository | Audit logging |

// Example: InferencePort
#[async_trait]
pub trait InferencePort: Send + Sync {
    async fn generate(
        &self,
        prompt: &str,
        options: InferenceOptions,
    ) -> Result<InferenceResponse, InferenceError>;
    
    async fn generate_stream(
        &self,
        prompt: &str,
        options: InferenceOptions,
    ) -> Result<Pin<Box<dyn Stream<Item = Result<String, InferenceError>> + Send>>, InferenceError>;
    
    async fn health_check(&self) -> Result<bool, InferenceError>;
    
    fn default_model(&self) -> &str;
}

Infrastructure Layer

infrastructure

Purpose: Provides concrete implementations of application ports for external systems.

Dependencies: domain, application

Adapters

| Adapter | Implements | Description |
|---------|------------|-------------|
| VaultSecretStore | SecretStore | HashiCorp Vault KV v2 |
| EnvironmentSecretStore | SecretStore | Environment variables |
| ChainedSecretStore | SecretStore | Multi-backend fallback |
| Argon2PasswordHasher | PasswordHasher | Secure password hashing |

// Example: VaultSecretStore usage
let vault = VaultSecretStore::new(VaultConfig {
    address: "http://127.0.0.1:8200".to_string(),
    role_id: Some("...".to_string()),
    secret_id: Some("...".to_string()),
    mount_path: "secret".to_string(),
    ..Default::default()
})?;

let secret = vault.get_secret("pisovereign/whatsapp/access_token").await?;

Cache

| Component | Description |
|-----------|-------------|
| MokaCache | L1 in-memory cache (fast, volatile) |
| RedbCache | L2 persistent cache (survives restarts) |
| TieredCache | Combines L1 + L2 with fallback |

// TieredCache usage
let cache = TieredCache::new(
    MokaCache::new(10_000),  // 10k entries max
    RedbCache::new("/var/lib/pisovereign/cache.redb")?,
);

// Write-through to both layers
cache.set("key", "value", Duration::from_secs(3600)).await?;

// Read checks L1 first, then L2
let value = cache.get("key").await?;

Persistence

| Component | Description |
|-----------|-------------|
| PgConversationRepository | Conversation storage |
| PgApprovalRepository | Approval request storage |
| PgAuditRepository | Audit log storage |
| PgUserRepository | User profile storage |

Other Components

| Component | Description |
|-----------|-------------|
| TelemetrySetup | OpenTelemetry initialization |
| CronScheduler | Cron-based task scheduling |
| TeraTemplates | Template rendering |
| RetryExecutor | Exponential backoff retry |
| SecurityValidator | Config validation |
| ModelRoutingAdapter | Adaptive 4-tier model routing (replaces ai_core::ModelSelector) |
| RuleBasedClassifier | Rule-based complexity classification |
| TemplateResponder | Instant template responses for trivial queries |
| ModelRoutingMetrics | Atomic counters for routing observability |

AI Crates

ai_core

Purpose: Inference engine abstraction and Hailo-Ollama client.

Dependencies: domain, application

Components

| Component | Description |
|-----------|-------------|
| HailoClient | Hailo-Ollama HTTP client |
| ModelSelector | Dynamic model routing (Deprecated — use infrastructure::ModelRoutingAdapter) |

// HailoClient usage
let client = HailoClient::new(InferenceConfig {
    base_url: "http://localhost:11434".to_string(),
    default_model: "gemma3:12b".to_string(),
    timeout_ms: 60000,
    ..Default::default()
})?;

let response = client.generate(
    "What is the capital of France?",
    InferenceOptions::default(),
).await?;

ai_speech

Purpose: Speech-to-Text and Text-to-Speech processing.

Dependencies: domain, application

Providers

| Provider | Description |
|----------|-------------|
| HybridSpeechProvider | Local first, cloud fallback |
| LocalSttProvider | whisper.cpp integration |
| LocalTtsProvider | Piper integration |
| OpenAiSpeechProvider | OpenAI Whisper & TTS |

// HybridSpeechProvider usage
let speech = HybridSpeechProvider::new(SpeechConfig {
    provider: SpeechProviderType::Hybrid,
    prefer_local: true,
    allow_cloud_fallback: true,
    ..Default::default()
})?;

// Transcribe audio
let text = speech.transcribe(&audio_data, "en").await?;

// Synthesize speech
let audio = speech.synthesize("Hello, world!", "en").await?;

Audio Conversion

| Component | Description |
|-----------|-------------|
| AudioConverter | FFmpeg-based format conversion |

Supported formats: OGG/Opus, MP3, WAV, FLAC, M4A, WebM


Integration Crates

integration_whatsapp

Purpose: WhatsApp Business API integration.

Dependencies: domain, application

Components

| Component | Description |
|-----------|-------------|
| WhatsAppClient | Meta Graph API client |
| WebhookHandler | Incoming message handler |
| SignatureValidator | HMAC-SHA256 verification |

// WhatsAppClient usage
let whatsapp = WhatsAppClient::new(WhatsAppConfig {
    access_token: "...".to_string(),
    phone_number_id: "...".to_string(),
    api_version: "v18.0".to_string(),
})?;

// Send text message
whatsapp.send_text("+1234567890", "Hello!").await?;

// Send audio message
whatsapp.send_audio("+1234567890", &audio_data).await?;

integration_email

Purpose: Generic email integration via IMAP/SMTP, supporting any provider (Gmail, Outlook, Proton Mail, and custom servers).

Dependencies: domain, application

Components

| Component | Description |
|-----------|-------------|
| ImapClient | Email reading via IMAP |
| SmtpClient | Email sending via SMTP |
| EmailProviderConfig | Provider-agnostic configuration |
| AuthMethod | Password or OAuth2 (XOAUTH2) authentication |
| ProviderPreset | Pre-configured settings for Proton, Gmail, Outlook |
| ReconnectingClient | Connection resilience with auto-reconnect |

use integration_email::{EmailProviderConfig, AuthMethod, ProviderPreset};

// Proton Mail via Bridge
let proton = EmailProviderConfig::with_credentials("user@proton.me", "bridge-password")
    .with_imap("127.0.0.1", 1143)
    .with_smtp("127.0.0.1", 1025);

// Gmail with OAuth2
let gmail = EmailProviderConfig::with_oauth2("user@gmail.com", "ya29.access-token")
    .with_preset(ProviderPreset::Gmail);

// Outlook with app password
let outlook = EmailProviderConfig::with_credentials("user@outlook.com", "app-password")
    .with_preset(ProviderPreset::Outlook);

integration_caldav

Purpose: CalDAV calendar integration.

Dependencies: domain, application

Components

| Component | Description |
|-----------|-------------|
| CalDavClient | CalDAV protocol client |
| ICalParser | iCalendar parsing |

// CalDavClient usage
let calendar = CalDavClient::new(CalDavConfig {
    server_url: "https://cal.example.com/dav.php".to_string(),
    username: "user".to_string(),
    password: "pass".to_string(),
    calendar_path: "/calendars/user/default/".to_string(),
})?;

// Fetch events
let events = calendar.get_events(start_date, end_date).await?;

// Create event
calendar.create_event(CalendarEvent {
    title: "Meeting".to_string(),
    start: start_time,
    end: end_time,
    ..Default::default()
}).await?;

integration_weather

Purpose: Open-Meteo weather API integration.

Dependencies: domain, application

Components

| Component | Description |
|-----------|-------------|
| OpenMeteoClient | Weather API client |

// OpenMeteoClient usage
let weather = OpenMeteoClient::new(WeatherConfig {
    base_url: "https://api.open-meteo.com/v1".to_string(),
    forecast_days: 7,
    cache_ttl_minutes: 30,
})?;

// Get current weather
let current = weather.get_current(52.52, 13.405).await?;

// Get forecast
let forecast = weather.get_forecast(52.52, 13.405).await?;

Presentation Crates

presentation_http

Purpose: HTTP REST API using Axum.

Dependencies: All crates (orchestration layer)

Handlers

| Handler | Endpoint | Description |
|---------|----------|-------------|
| health | GET /health | Liveness probe |
| ready | GET /ready | Readiness with inference status |
| chat | POST /v1/chat | Send chat message |
| chat_stream | POST /v1/chat/stream | Streaming chat (SSE) |
| commands | POST /v1/commands | Execute command |
| webhooks | POST /v1/webhooks/whatsapp | WhatsApp webhook |
| metrics | GET /metrics/prometheus | Prometheus metrics |

Middleware

| Middleware | Description |
|------------|-------------|
| RateLimiter | Request rate limiting |
| ApiKeyAuth | API key authentication |
| RequestId | Request correlation ID |
| Cors | CORS handling |

Binaries

  • pisovereign-server - HTTP server binary

presentation_cli

Purpose: Command-line interface using Clap.

Dependencies: Core crates

Commands

| Command | Description |
|---------|-------------|
| status | Show system status |
| chat | Send chat message |
| command | Execute command |
| backup | Database backup |
| restore | Database restore |
| migrate | Run migrations |
| openapi | Export OpenAPI spec |

# Examples
pisovereign-cli status
pisovereign-cli chat "Hello"
pisovereign-cli command "briefing"
pisovereign-cli backup --output backup.db
pisovereign-cli openapi --output openapi.json

Binaries

  • pisovereign-cli - CLI binary

Cargo Docs

For detailed API documentation, see the auto-generated Cargo docs:

Generate locally:

just docs
# Opens browser at target/doc/presentation_http/index.html

API Reference

📡 REST API documentation for PiSovereign

This document provides complete REST API documentation including authentication, endpoints, and the OpenAPI specification.

Overview

Base URL

http://localhost:3000      # Development
https://your-domain.com    # Production (behind Traefik)

Content Type

All requests and responses use JSON:

Content-Type: application/json
Accept: application/json

Request ID

Every response includes a correlation ID for debugging:

X-Request-Id: 550e8400-e29b-41d4-a716-446655440000

Include this when reporting issues.


Authentication

API Key Authentication

Protected endpoints require an API key in the Authorization header:

Authorization: Bearer sk-your-api-key

Configuration

API keys are mapped to user IDs in config.toml:

[security.api_key_users]
"sk-abc123def456" = "550e8400-e29b-41d4-a716-446655440000"
"sk-xyz789ghi012" = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

Example Request

curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123def456" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

Authentication Errors

| Status | Code | Description |
|--------|------|-------------|
| 401 | UNAUTHORIZED | Missing or invalid API key |
| 403 | FORBIDDEN | Valid key, but action not allowed |

{
  "error": {
    "code": "UNAUTHORIZED",
    "message": "Invalid or missing API key",
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Rate Limiting

Rate limiting is applied per IP address.

| Configuration | Default |
|---------------|---------|
| rate_limit_rpm | 120 requests/minute |

Headers

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1707321600

Rate Limited Response

HTTP/1.1 429 Too Many Requests
Retry-After: 30
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Please retry after 30 seconds.",
    "retry_after": 30
  }
}
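A well-behaved client should back off using the headers above before retrying. As an illustrative sketch (not an official SDK helper), the wait time can be derived from Retry-After or, failing that, the reset timestamp:

```python
def backoff_seconds(headers, now):
    """Choose how long to sleep after a 429 response.

    Prefers Retry-After (seconds), falls back to X-RateLimit-Reset
    (a Unix timestamp), then a conservative one-second default.
    """
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    if "X-RateLimit-Reset" in headers:
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 1.0
```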

Endpoints

Health & Status

GET /health

Liveness probe. Returns 200 if the server is running.

Authentication: None required

Response: 200 OK

{
  "status": "ok"
}

GET /ready

Readiness probe with inference engine status.

Authentication: None required

Response: 200 OK (healthy) or 503 Service Unavailable

{
  "status": "ready",
  "inference": {
    "healthy": true,
    "model": "qwen2.5-1.5b-instruct",
    "latency_ms": 45
  }
}

GET /ready/all

Extended health check with all service statuses.

Authentication: None required

Response: 200 OK

{
  "status": "ready",
  "services": {
    "inference": { "healthy": true, "latency_ms": 45 },
    "database": { "healthy": true, "latency_ms": 2 },
    "cache": { "healthy": true },
    "whatsapp": { "healthy": true, "latency_ms": 120 },
    "email": { "healthy": true, "latency_ms": 89 },
    "calendar": { "healthy": true, "latency_ms": 35 },
    "weather": { "healthy": true, "latency_ms": 180 }
  },
  "latency_percentiles": {
    "p50_ms": 45,
    "p90_ms": 120,
    "p99_ms": 250
  }
}

Chat

POST /v1/chat

Send a message and receive a response.

Authentication: Required

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| message | string | Yes | User message |
| conversation_id | string | No | Continue existing conversation |
| system_prompt | string | No | Override system prompt |
| model | string | No | Override default model |
| temperature | float | No | Sampling temperature (0.0-2.0) |
| max_tokens | integer | No | Maximum response tokens |

{
  "message": "What's the weather in Berlin?",
  "conversation_id": "conv-123",
  "temperature": 0.7
}

Response: 200 OK

{
  "id": "msg-456",
  "conversation_id": "conv-123",
  "role": "assistant",
  "content": "Currently in Berlin, it's 15°C with partly cloudy skies...",
  "model": "qwen2.5-1.5b-instruct",
  "tokens": {
    "prompt": 45,
    "completion": 128,
    "total": 173
  },
  "created_at": "2026-02-07T10:30:00Z"
}

POST /v1/chat/stream

Streaming chat using Server-Sent Events (SSE).

Authentication: Required

Request Body: Same as /v1/chat

Response: 200 OK (text/event-stream)

event: message
data: {"delta": "Currently"}

event: message
data: {"delta": " in Berlin"}

event: message
data: {"delta": ", it's 15°C"}

event: done
data: {"tokens": {"prompt": 45, "completion": 128, "total": 173}}

Example (JavaScript):

// EventSource only supports GET requests, so POST-based SSE streams
// are read with fetch and a streaming body reader:
const response = await fetch('/v1/chat/stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ message: 'Hello' })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each chunk holds one or more "data: {...}" lines
  for (const line of decoder.decode(value).split('\n')) {
    if (line.startsWith('data: ')) {
      const { delta } = JSON.parse(line.slice('data: '.length));
      if (delta) process.stdout.write(delta);
    }
  }
}
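Outside the browser, any HTTP client can consume the stream by applying the SSE framing shown above (event/data lines separated by blank lines). A minimal parsing sketch:

```python
def parse_sse(lines):
    """Group SSE lines into (event, data) pairs.

    Events are separated by blank lines; the event type defaults
    to "message" when no event: line is present.
    """
    events = []
    event, data = "message", []
    for line in lines:
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            events.append((event, "\n".join(data)))
            event, data = "message", []
    return events
```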

Commands

POST /v1/commands

Execute a command and get the result.

Authentication: Required

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| command | string | Yes | Command to execute |
| args | object | No | Command arguments |

{
  "command": "briefing"
}

Response: 200 OK

{
  "command": "MorningBriefing",
  "status": "completed",
  "result": {
    "weather": "15°C, partly cloudy",
    "calendar": [
      {"time": "09:00", "title": "Team standup"},
      {"time": "14:00", "title": "Client meeting"}
    ],
    "emails": {
      "unread": 5,
      "important": 2
    }
  },
  "executed_at": "2026-02-07T07:00:00Z"
}

Available Commands:

| Command | Description | Arguments |
|---------|-------------|-----------|
| briefing | Morning briefing | None |
| weather | Current weather | location (optional) |
| calendar | Today’s events | days (default: 1) |
| emails | Email summary | count (default: 10) |
| help | List commands | None |

POST /v1/commands/parse

Parse a command without executing it.

Authentication: Required

Request Body:

{
  "input": "create meeting tomorrow at 3pm"
}

Response: 200 OK

{
  "parsed": true,
  "command": {
    "type": "CreateCalendarEvent",
    "title": "meeting",
    "start": "2026-02-08T15:00:00Z",
    "end": "2026-02-08T16:00:00Z"
  },
  "confidence": 0.92,
  "requires_approval": true
}

System Command Catalog

The system command catalog provides a discoverable set of shell commands that can be executed on the host system. On first startup, PiSovereign automatically populates 32 default commands (disk usage, system info, network tools, etc.) stored in PostgreSQL.

GET /v1/commands/catalog

List all commands in the catalog.

Authentication: Required

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| limit | integer | No | Maximum results (default: 100) |
| offset | integer | No | Pagination offset (default: 0) |

Response: 200 OK

[
  {
    "id": "default-disk-free",
    "name": "Disk Free Space",
    "description": "Show available disk space on all mounts",
    "command": "df -h",
    "category": "filesystem",
    "risk_level": "safe",
    "os": "linux",
    "requires_approval": false,
    "created_at": "2026-02-24T08:50:08Z",
    "updated_at": "2026-02-24T08:50:08Z"
  }
]
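To walk the full catalog with the limit/offset parameters above, a client can page until a short page comes back. A hedged sketch, with the HTTP call injected as fetch_page (an assumption, standing in for whatever client you use):

```python
def iter_catalog(fetch_page, limit=100):
    """Yield every catalog entry by paging with limit/offset.

    fetch_page(limit, offset) must return one page as a list;
    iteration stops at the first short page.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit
```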

GET /v1/commands/catalog/search

Search the catalog by keyword.

Authentication: Required

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| q | string | Yes | Search query (matches name and description) |

Response: 200 OK — returns matching commands (same format as listing).

GET /v1/commands/catalog/count

Get the total number of catalog entries.

Authentication: Required

Response: 200 OK

{
  "count": 32
}

GET /v1/commands/catalog/{id}

Get a specific catalog command by ID.

Authentication: Required

Response: 200 OK — returns a single command object.

POST /v1/commands/catalog

Create a custom catalog command.

Authentication: Required

Request Body:

{
  "name": "Check Logs",
  "description": "Tail the last 100 lines of syslog",
  "command": "tail -n 100 /var/log/syslog",
  "category": "system",
  "risk_level": "safe",
  "os": "linux",
  "requires_approval": false
}

Response: 201 Created

POST /v1/commands/catalog/{id}/execute

Execute a command from the catalog. Commands with requires_approval: true will create an approval request instead of executing immediately.

Authentication: Required

Response: 200 OK

DELETE /v1/commands/catalog/{id}

Delete a catalog command.

Authentication: Required

Response: 204 No Content


Memory

The memory API manages the RAG (Retrieval-Augmented Generation) knowledge store. Memories are automatically used to enrich chat context.

GET /v1/memories

List all stored memories.

Authentication: Required

Response: 200 OK

[
  {
    "id": "uuid",
    "content": "The user prefers dark mode",
    "summary": "UI preference: dark mode",
    "memory_type": "Preference",
    "importance": 0.8,
    "access_count": 5,
    "tags": ["ui", "preference"],
    "created_at": "2026-02-24T08:50:00Z",
    "updated_at": "2026-02-24T09:00:00Z"
  }
]

POST /v1/memories

Create a new memory entry.

Authentication: Required

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| content | string | Yes | Memory content text |
| summary | string | Yes | Short summary |
| memory_type | string | No | Type: fact, preference, tool_result, correction, context (default: context) |
| importance | float | No | Importance score 0.0–1.0 (default: 0.5) |
| tags | string[] | No | Optional tags |

Response: 201 Created

GET /v1/memories/search

Search memories by semantic similarity.

Authentication: Required

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| q | string | Yes | Search query |

Response: 200 OK — returns matching memories ranked by relevance.

GET /v1/memories/stats

Get memory storage statistics.

Authentication: Required

Response: 200 OK

{
  "total": 42,
  "by_type": [
    {"memory_type": "Fact", "count": 15},
    {"memory_type": "Preference", "count": 8},
    {"memory_type": "Tool Result", "count": 10},
    {"memory_type": "Correction", "count": 2},
    {"memory_type": "Context", "count": 7}
  ]
}

POST /v1/memories/decay

Trigger a manual memory importance decay cycle. Reduces the importance of older, less-accessed memories.

Authentication: Required

Response: 200 OK
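The exact decay formula is an internal detail of MemoryService. Purely as a hedged illustration of the idea (assumed exponential half-life decay softened by access count, not the project's actual implementation):

```python
def decayed_importance(importance, age_days, access_count,
                       half_life_days=30.0):
    """Illustrative decay: halve importance every half_life_days,
    dampened for memories that are accessed often. Result is
    clamped to the documented 0.0-1.0 importance range."""
    decay = 0.5 ** (age_days / half_life_days)
    boost = min(1.0, 0.1 * access_count)
    return max(0.0, min(1.0, importance * (decay + (1 - decay) * boost)))
```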

GET /v1/memories/{id}

Get a specific memory by ID.

Authentication: Required

Response: 200 OK

DELETE /v1/memories/{id}

Delete a specific memory.

Authentication: Required

Response: 204 No Content


Agentic Tasks

Multi-agent task orchestration. Decompose complex requests into parallel sub-tasks executed by independent AI agents.

Note: Requires [agentic] enabled = true in config.toml.

POST /v1/agentic/tasks

Create a new agentic task for multi-agent processing.

Authentication: Required

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| description | string | Yes | Task description in natural language |
| require_approval | boolean | No | Require approval before sub-agent execution (default: false) |

{
  "description": "Plan my trip to Berlin next week — check weather, find transit options, and create calendar events",
  "require_approval": false
}

Response: 201 Created

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "planning",
  "created_at": "2026-03-03T10:00:00Z"
}

GET /v1/agentic/tasks/{task_id}

Get the current status and results of an agentic task.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "description": "Plan my trip to Berlin",
  "plan_summary": "3 sub-tasks: weather, transit, calendar",
  "sub_agents": [
    { "id": "sa-1", "description": "Check Berlin weather", "status": "completed" },
    { "id": "sa-2", "description": "Search transit", "status": "completed" },
    { "id": "sa-3", "description": "Create events", "status": "completed" }
  ],
  "result": "Your Berlin trip is planned: ...",
  "created_at": "2026-03-03T10:00:00Z"
}

GET /v1/agentic/tasks/{task_id}/stream

Stream real-time progress updates via Server-Sent Events (SSE).

Authentication: Required

Response: 200 OK (text/event-stream)

event: task_started
data: {"task_id": "550e8400-...", "description": "Plan my trip to Berlin"}

event: plan_created
data: {"task_id": "550e8400-...", "sub_tasks": [...]}

event: sub_agent_started
data: {"sub_agent_id": "sa-1", "description": "Check Berlin weather"}

event: sub_agent_completed
data: {"sub_agent_id": "sa-1", "result": "15°C, partly cloudy"}

event: task_completed
data: {"task_id": "550e8400-...", "result": "Your Berlin trip is planned: ..."}

POST /v1/agentic/tasks/{task_id}/cancel

Cancel a running agentic task and all its sub-agents.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "cancelled"
}
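Clients that don't use the SSE stream typically poll the status endpoint until the task reaches a terminal state. A hedged sketch, with the HTTP call injected as fetch_task (an assumption) so the loop itself is self-contained:

```python
import time

TERMINAL = {"completed", "failed", "cancelled"}

def wait_for_task(fetch_task, task_id, poll_s=2.0, timeout_s=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_task(task_id) until the task reaches a terminal status.

    Raises TimeoutError if the task is still running after timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        task = fetch_task(task_id)
        if task["status"] in TERMINAL:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {task['status']}")
        sleep(poll_s)
```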

System

GET /v1/system/status

Get system status and resource usage.

Authentication: Required

Response: 200 OK

{
  "version": "0.1.0",
  "uptime_seconds": 86400,
  "environment": "production",
  "resources": {
    "memory_used_mb": 256,
    "cpu_percent": 15.5,
    "database_size_mb": 42
  },
  "statistics": {
    "requests_total": 15420,
    "inference_requests": 8930,
    "cache_hit_rate": 0.73
  }
}

GET /v1/system/models

List available inference models.

Authentication: Required

Response: 200 OK

{
  "models": [
    {
      "id": "qwen2.5-1.5b-instruct",
      "name": "Qwen 2.5 1.5B Instruct",
      "parameters": "1.5B",
      "context_length": 4096,
      "default": true
    },
    {
      "id": "llama3.2-1b-instruct",
      "name": "Llama 3.2 1B Instruct",
      "parameters": "1B",
      "context_length": 4096,
      "default": false
    }
  ]
}

Webhooks

POST /v1/webhooks/whatsapp

WhatsApp webhook endpoint for incoming messages.

Authentication: Signature verification via X-Hub-Signature-256 header

Verification Request (GET):

GET /v1/webhooks/whatsapp?hub.mode=subscribe&hub.verify_token=your-token&hub.challenge=challenge123

Response: The hub.challenge value

Message Webhook (POST):

{
  "object": "whatsapp_business_account",
  "entry": [{
    "changes": [{
      "value": {
        "messages": [{
          "from": "+1234567890",
          "type": "text",
          "text": { "body": "Hello" }
        }]
      }
    }]
  }]
}

Response: 200 OK
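The X-Hub-Signature-256 header carries "sha256=" followed by a hex HMAC-SHA256 of the raw request body, keyed with your app secret. A minimal verification sketch (the function and parameter names are illustrative):

```python
import hashlib
import hmac

def verify_whatsapp_signature(app_secret: bytes, body: bytes, header: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body.

    Uses a constant-time comparison to avoid timing side channels.
    """
    expected = "sha256=" + hmac.new(app_secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)
```

Verify against the raw bytes of the request body, before any JSON parsing, or the digest will not match.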


Metrics

GET /metrics

JSON metrics for monitoring.

Authentication: None required

Response: 200 OK

{
  "uptime_seconds": 86400,
  "http": {
    "requests_total": 15420,
    "requests_success": 15100,
    "requests_client_error": 280,
    "requests_server_error": 40,
    "active_requests": 3,
    "response_time_avg_ms": 125
  },
  "inference": {
    "requests_total": 8930,
    "requests_success": 8850,
    "requests_failed": 80,
    "time_avg_ms": 450,
    "tokens_total": 1250000,
    "healthy": true
  }
}

GET /metrics/prometheus

Prometheus-compatible metrics.

Authentication: None required

Response: 200 OK (text/plain)

# HELP app_uptime_seconds Application uptime in seconds
# TYPE app_uptime_seconds counter
app_uptime_seconds 86400

# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{status="success"} 15100
http_requests_total{status="client_error"} 280
http_requests_total{status="server_error"} 40

# HELP inference_time_ms_bucket Inference time histogram
# TYPE inference_time_ms_bucket histogram
inference_time_ms_bucket{le="100"} 1200
inference_time_ms_bucket{le="250"} 4500
inference_time_ms_bucket{le="500"} 7200
inference_time_ms_bucket{le="1000"} 8500
inference_time_ms_bucket{le="+Inf"} 8930
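Cumulative histograms like the one above yield approximate latency percentiles by linear interpolation inside the bucket that crosses the target rank, the same idea behind PromQL's histogram_quantile(). An illustrative sketch:

```python
def histogram_quantile(q, buckets):
    """Approximate the q-quantile from cumulative (upper_bound, count) buckets.

    Interpolates linearly inside the crossing bucket; the +Inf bucket
    falls back to the highest finite bound.
    """
    total = buckets[-1][1]
    target = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= target:
            if bound == float("inf"):
                return prev_bound
            return prev_bound + (bound - prev_bound) * (target - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound
```

Applied to the sample buckets above, the median inference time comes out just under 250 ms.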

Error Handling

Error Response Format

All errors follow this format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "details": {},
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Error Codes

| HTTP Status | Code | Description |
|-------------|------|-------------|
| 400 | BAD_REQUEST | Invalid request body or parameters |
| 401 | UNAUTHORIZED | Missing or invalid authentication |
| 403 | FORBIDDEN | Authenticated but not authorized |
| 404 | NOT_FOUND | Resource not found |
| 422 | VALIDATION_ERROR | Request validation failed |
| 429 | RATE_LIMITED | Too many requests |
| 500 | INTERNAL_ERROR | Server error |
| 502 | UPSTREAM_ERROR | External service error |
| 503 | SERVICE_UNAVAILABLE | Service temporarily unavailable |

Validation Errors

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed",
    "details": {
      "fields": [
        {"field": "message", "error": "cannot be empty"},
        {"field": "temperature", "error": "must be between 0.0 and 2.0"}
      ]
    }
  }
}
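Because every error shares the envelope above, clients can centralize error handling in one helper. A hedged sketch (ApiError is an illustrative name, not part of any official SDK):

```python
class ApiError(Exception):
    """Raised for any non-2xx PiSovereign response."""

    def __init__(self, status, code, message, request_id=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.request_id = request_id  # include this when reporting issues

def raise_for_api_error(status, payload):
    """Return the payload on success; raise ApiError on failure."""
    if status < 400:
        return payload
    err = payload.get("error", {})
    raise ApiError(status, err.get("code", "UNKNOWN"),
                   err.get("message", ""), err.get("request_id"))
```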

OpenAPI Specification

Interactive Documentation

When the server is running, access interactive API documentation:

  • Swagger UI: http://localhost:3000/swagger-ui/
  • ReDoc: http://localhost:3000/redoc/

Export OpenAPI Spec

# Via CLI
pisovereign-cli openapi --output openapi.json

# Via API (if enabled)
curl http://localhost:3000/api-docs/openapi.json

OpenAPI 3.1 Specification

The full specification is available at:

  • Development: /api-docs/openapi.json
  • GitHub Pages: /api/openapi.json
Example OpenAPI Excerpt
openapi: 3.1.0
info:
  title: PiSovereign API
  description: Local AI Assistant REST API
  version: 0.1.0
  license:
    name: MIT
    url: https://opensource.org/licenses/MIT

servers:
  - url: http://localhost:3000
    description: Development server

security:
  - bearerAuth: []

paths:
  /v1/chat:
    post:
      summary: Send chat message
      operationId: chat
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
        '401':
          $ref: '#/components/responses/Unauthorized'

components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: API key authentication

  schemas:
    ChatRequest:
      type: object
      required:
        - message
      properties:
        message:
          type: string
          description: User message
          example: "What's the weather?"
        conversation_id:
          type: string
          format: uuid
          description: Continue existing conversation

SDK Examples

cURL

# Chat
curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

# Command
curl -X POST http://localhost:3000/v1/commands \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"command": "briefing"}'

Python

import requests

API_URL = "http://localhost:3000"
API_KEY = "sk-abc123"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# Chat
response = requests.post(
    f"{API_URL}/v1/chat",
    headers=headers,
    json={"message": "What's the weather?"}
)
print(response.json()["content"])

JavaScript/TypeScript

const API_URL = "http://localhost:3000";
const API_KEY = "sk-abc123";

async function chat(message: string): Promise<string> {
  const response = await fetch(`${API_URL}/v1/chat`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ message }),
  });
  
  const data = await response.json();
  return data.content;
}

Production Deployment

Deploy PiSovereign for production use with TLS, monitoring, and hardened configuration

Overview

PiSovereign is deployed via Docker Compose. The stack includes Traefik for automatic TLS via Let’s Encrypt, Vault for secrets, Ollama for inference, and all supporting services.

Internet
    │
    ▼
┌─────────────┐
│   Traefik   │ ← TLS termination, Let's Encrypt
│  (Reverse   │
│   Proxy)    │
└─────────────┘
    │ HTTP (internal)
    ▼
┌─────────────┐     ┌─────────────┐
│ PiSovereign │ ──▶ │   Ollama    │
│   Server    │     │  (isolated) │
└─────────────┘     └─────────────┘
    │
    ▼
┌─────────────┐     ┌─────────────┐
│  Prometheus │ ──▶ │   Grafana   │
│   Metrics   │     │  Dashboard  │
└─────────────┘     └─────────────┘

Pre-Deployment Checklist

  • Docker Engine 24+ with Compose v2 installed
  • Vault initialized and secrets stored (Vault Setup)
  • Domain name with DNS A record pointing to your server
  • Firewall allows ports 80 and 443 (inbound)
  • Backup strategy defined (Backup & Restore)

Deployment

Refer to the Docker Setup guide for the step-by-step deployment process. The key commands are:

cd PiSovereign/docker

cp .env.example .env
nano .env  # Set PISOVEREIGN_DOMAIN and TRAEFIK_ACME_EMAIL

docker compose up -d
docker compose exec vault /vault/init.sh

Enable All Profiles

docker compose --profile monitoring --profile caldav up -d

Multi-Architecture Builds

PiSovereign images support both ARM64 (Raspberry Pi) and AMD64 (x86 servers):

docker pull --platform linux/arm64 ghcr.io/twohreichel/pisovereign:latest
docker pull --platform linux/amd64 ghcr.io/twohreichel/pisovereign:latest

TLS Configuration

Traefik with Let’s Encrypt

TLS is handled automatically by Traefik. The Docker Compose stack includes Traefik with HTTP challenge for Let’s Encrypt certificates. Key requirements:

  1. DNS A record pointing to your server’s public IP
  2. Ports 80 and 443 open in your firewall
  3. Valid email for Let’s Encrypt notifications (set in .env as TRAEFIK_ACME_EMAIL)

Certificate auto-renewal is handled by Traefik — no manual intervention required.

TLS Hardening

For stricter TLS settings, edit docker/traefik/dynamic.yml:

tls:
  options:
    default:
      minVersion: VersionTLS13
      cipherSuites:
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256
      curvePreferences:
        - X25519
        - CurveP384
      sniStrict: true

Production Configuration

Key settings for production in docker/config/config.toml:

environment = "production"

[server]
host = "0.0.0.0"
port = 3000
log_format = "json"
cors_enabled = true
allowed_origins = ["https://your-domain.example.com"]
shutdown_timeout_secs = 30

[inference]
base_url = "http://ollama:11434"
default_model = "gemma3:12b"
timeout_ms = 120000

[security]
rate_limit_enabled = true
rate_limit_rpm = 120
min_tls_version = "1.3"
tls_verify_certs = true

[database]
url = "postgres://pisovereign:pisovereign@postgres:5432/pisovereign"
max_connections = 10
run_migrations = true

[cache]
enabled = true
ttl_short_secs = 300
ttl_medium_secs = 3600
ttl_long_secs = 86400
l1_max_entries = 10000

[vault]
address = "http://vault:8200"
mount_path = "secret"
timeout_secs = 5

[degraded_mode]
enabled = true
unavailable_message = "Service temporarily unavailable. Please try again."
failure_threshold = 3
success_threshold = 2

[health]
global_timeout_secs = 5

See the Configuration Reference for all available options.
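The [degraded_mode] settings describe a circuit-breaker pattern: after failure_threshold consecutive failures the service starts returning unavailable_message, and success_threshold consecutive successes restore normal operation. A minimal sketch of that logic (an illustration, not PiSovereign's actual implementation):

```python
class DegradedMode:
    """Circuit-breaker sketch mirroring the [degraded_mode] thresholds."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 2):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.failures = 0
        self.successes = 0
        self.degraded = False

    def record_failure(self) -> None:
        self.failures += 1
        self.successes = 0
        if self.failures >= self.failure_threshold:
            self.degraded = True  # start serving unavailable_message

    def record_success(self) -> None:
        self.successes += 1
        self.failures = 0
        if self.degraded and self.successes >= self.success_threshold:
            self.degraded = False  # recover to normal operation


breaker = DegradedMode()
for _ in range(3):
    breaker.record_failure()
print(breaker.degraded)  # True: three consecutive failures opened the breaker
```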


Deployment Verification

After deployment, verify everything is working:

# 1. Check all containers are running
docker compose ps

# 2. Check health endpoint
curl https://your-domain.example.com/health

# 3. Check all services are ready
curl https://your-domain.example.com/ready/all | jq

# 4. Test chat endpoint
curl -X POST https://your-domain.example.com/v1/chat \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}' | jq

# 5. Check TLS certificate
openssl s_client -connect your-domain.example.com:443 -brief

# 6. Check metrics
curl http://localhost:3000/metrics/prometheus | head -20

Expected results:

  • Health returns {"status": "ok"}
  • Ready shows all services healthy
  • Chat returns an AI response
  • TLS shows a valid certificate

Advanced: Non-Docker Deployment

If you prefer to run PiSovereign without Docker, build the binaries directly:

cargo build --release
# Binaries: target/release/pisovereign-server, target/release/pisovereign-cli

You are responsible for managing Ollama, Vault, Signal-CLI, Whisper, Piper, and reverse proxy setup yourself. The Docker Compose stack in docker/compose.yml serves as the reference architecture.


Next Steps

Monitoring

Prometheus metrics, Grafana dashboards, Loki log aggregation, and alerting

Overview

The monitoring stack is included in Docker Compose and activated with a single profile flag:

docker compose --profile monitoring up -d

This starts Prometheus, Grafana, Loki, Promtail, Node Exporter, and the OpenTelemetry Collector — all pre-configured to scrape PiSovereign metrics and collect logs.

┌─────────────────┐
│   PiSovereign   │
│  /metrics/      │
│  prometheus     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────┐
│   Prometheus    │────▶│    Grafana      │
│   (Metrics)     │     │  (Dashboards)   │
└─────────────────┘     └─────────────────┘

┌─────────────────┐     ┌─────────────────┐
│    Promtail     │────▶│      Loki       │
│  (Log Shipper)  │     │  (Log Storage)  │
└─────────────────┘     └─────────────────┘

Resource Usage (Raspberry Pi 5)

| Component  | Memory  | Storage/Day |
|------------|---------|-------------|
| Prometheus | ~100 MB | ~50 MB      |
| Grafana    | ~150 MB | Minimal     |
| Loki       | ~200 MB | ~100 MB     |
| Promtail   | ~30 MB  |             |
| Total      | ~480 MB | ~150 MB     |

Accessing Dashboards

After enabling the monitoring profile:

| Service    | URL                                   |
|------------|---------------------------------------|
| Grafana    | http://localhost/grafana (via Traefik) |
| Prometheus | http://localhost:9090                 |

Default Grafana credentials are admin / admin (change on first login). Dashboards and data sources are auto-provisioned — no manual setup required.


Prometheus Metrics

PiSovereign exposes metrics at /metrics/prometheus:

Application Metrics

| Metric             | Type    | Description         |
|--------------------|---------|---------------------|
| app_uptime_seconds | Counter | Application uptime  |
| app_version_info   | Gauge   | Version information |

HTTP Metrics

| Metric                           | Type      | Description                |
|----------------------------------|-----------|----------------------------|
| http_requests_total              | Counter   | Total HTTP requests        |
| http_requests_success_total      | Counter   | 2xx responses              |
| http_requests_client_error_total | Counter   | 4xx responses              |
| http_requests_server_error_total | Counter   | 5xx responses              |
| http_requests_active             | Gauge     | Active requests            |
| http_response_time_avg_ms        | Gauge     | Average response time      |
| http_response_time_ms_bucket     | Histogram | Response time distribution |
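These counters and the histogram combine into the usual rate queries. Hedged PromQL sketches (metric names come from the table; the 5-minute window is an assumption):

```promql
# Requests per second
rate(http_requests_total[5m])

# Server error ratio
rate(http_requests_server_error_total[5m]) / rate(http_requests_total[5m])

# P99 latency from the histogram
histogram_quantile(0.99, sum by (le) (rate(http_response_time_ms_bucket[5m])))
```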

Inference Metrics

| Metric                           | Type      | Description                 |
|----------------------------------|-----------|-----------------------------|
| inference_requests_total         | Counter   | Total inference requests    |
| inference_requests_success_total | Counter   | Successful inferences       |
| inference_requests_failed_total  | Counter   | Failed inferences           |
| inference_time_avg_ms            | Gauge     | Average inference time      |
| inference_time_ms_bucket         | Histogram | Inference time distribution |
| inference_tokens_total           | Counter   | Total tokens generated      |
| inference_healthy                | Gauge     | Health status (0/1)         |

Cache Metrics

| Metric             | Type    | Description        |
|--------------------|---------|--------------------|
| cache_hits_total   | Counter | Cache hits         |
| cache_misses_total | Counter | Cache misses       |
| cache_size         | Gauge   | Current cache size |

Model Routing Metrics

These metrics are only present when [model_routing] is enabled.

| Metric                                   | Type    | Description                                        |
|------------------------------------------|---------|----------------------------------------------------|
| model_routing_requests_total{tier="..."} | Counter | Requests per tier (trivial/simple/moderate/complex) |
| model_routing_template_hits_total        | Counter | Trivial queries answered by template               |
| model_routing_upgrades_total             | Counter | Tier upgrades due to low confidence                |
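Per-tier traffic can be broken down with PromQL sketches like these (metric and label names from the table; window choice is an assumption):

```promql
# Requests per second, by routing tier
sum by (tier) (rate(model_routing_requests_total[5m]))

# Share of trivial queries answered straight from a template
rate(model_routing_template_hits_total[5m])
  / rate(model_routing_requests_total{tier="trivial"}[5m])
```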

Grafana Dashboard Panels

The pre-built PiSovereign dashboard includes:

Overview Row

| Panel             | Description                 |
|-------------------|-----------------------------|
| Uptime            | Application uptime counter  |
| Inference Status  | Health indicator            |
| Total Requests    | Cumulative request count    |
| Active Requests   | Current in-flight requests  |
| Avg Response Time | Mean latency                |
| Total Tokens      | LLM tokens generated        |

HTTP Requests Row

| Panel                     | Visualization | Description              |
|---------------------------|---------------|--------------------------|
| Request Rate              | Time series   | Requests/second over time |
| Status Distribution       | Pie chart     | Success/error breakdown  |
| Response Time P50/P90/P99 | Stat          | Latency percentiles      |

Inference Row

| Panel             | Visualization | Description          |
|-------------------|---------------|----------------------|
| Inference Rate    | Time series   | Inferences/second    |
| Inference Latency | Gauge         | Current avg latency  |
| Token Rate        | Time series   | Tokens/second        |
| Model Usage       | Table         | Per-model statistics |

System Row

| Panel        | Description            |
|--------------|------------------------|
| CPU Usage    | System CPU utilization |
| Memory Usage | RAM usage              |
| Disk I/O     | Storage throughput     |
| Network I/O  | Network traffic        |

Alerting

Alert rules are pre-configured in docker/prometheus/rules/ (if present) or can be added:

# prometheus/rules/pisovereign.yml
groups:
  - name: pisovereign
    rules:
      - alert: PiSovereignDown
        expr: up{job="pisovereign"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "PiSovereign is down"

      - alert: InferenceEngineUnhealthy
        expr: inference_healthy == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Inference engine is unhealthy"

      - alert: HighResponseTime
        expr: http_response_time_avg_ms > 5000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Average response time is {{ $value }}ms"

      - alert: HighErrorRate
        expr: rate(http_requests_server_error_total[5m]) / rate(http_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Server error rate is {{ $value | humanizePercentage }}"

      - alert: InferenceFailures
        expr: rate(inference_requests_failed_total[5m]) / rate(inference_requests_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Inference failure rate is {{ $value | humanizePercentage }}"

Log Aggregation

Loki and Promtail are included in the monitoring profile. Logs from all Docker containers are automatically collected and available in Grafana under the Loki data source.

To query logs in Grafana:

  1. Go to Explore → select Loki data source
  2. Use LogQL queries:
{container="pisovereign"} |= "error"
{container="ollama"} | json | level="error"

Resource Optimization

If running on constrained hardware, tune these settings:

# In docker/prometheus/prometheus.yml
global:
  scrape_interval: 30s  # Increase from 15s to reduce load

# Prometheus storage flags (in compose.yml command)
--storage.tsdb.retention.time=3d    # Reduce from 7d
--storage.tsdb.retention.size=500MB # Cap storage

# In docker/loki/loki.yml
limits_config:
  retention_period: 72h  # 3 days instead of 7

Troubleshooting

Metrics not appearing

# Check PiSovereign exposes metrics
curl http://localhost:3000/metrics/prometheus

# Check Prometheus scrape targets
curl http://localhost:9090/api/v1/targets

Grafana dashboard empty

  1. Verify time range includes recent data
  2. Check Prometheus data source is connected (Settings → Data Sources)
  3. Query Prometheus directly at http://localhost:9090/graph

Next Steps

Backup & Restore

💾 Protect your PiSovereign data with comprehensive backup strategies

This guide covers backup procedures, automated backups, and disaster recovery.

Overview

Backup strategy overview:

| Component     | Method        | Frequency | Retention                      |
|---------------|---------------|-----------|--------------------------------|
| Database      | pg_dump       | Daily     | 7 daily, 4 weekly, 12 monthly  |
| Configuration | File copy     | On change | 5 versions                     |
| Vault Secrets | Vault backup  | Weekly    | 4 weekly                       |
| Full System   | SD/NVMe image | Monthly   | 3 monthly                      |

What to Back Up

Critical Data

| Path                               | Contents                             | Priority |
|------------------------------------|--------------------------------------|----------|
| PostgreSQL database (via pg_dump)  | Conversations, approvals, audit logs | High     |
| /etc/pisovereign/config.toml       | Application configuration            | High     |
| /opt/vault/data                    | Vault storage (if local)             | High     |

Important Data

| Path                             | Contents              | Priority |
|----------------------------------|-----------------------|----------|
| /var/lib/pisovereign/cache.redb  | Persistent cache      | Medium   |
| /opt/hailo/models                | Downloaded models     | Medium   |
| /etc/pisovereign/env             | Environment overrides | Medium   |

Can Be Recreated

| Path               | Contents        | Priority |
|--------------------|-----------------|----------|
| Prometheus data    | Metrics         | Low      |
| Grafana dashboards | Can reimport    | Low      |
| Log files          | Historical only | Low      |

Database Backup

Manual Backup

Using the PiSovereign CLI:

# Simple local backup
pisovereign-cli backup \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --output /backup/pisovereign-$(date +%Y%m%d).sql

# With timestamp
pisovereign-cli backup \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --output /backup/pisovereign-$(date +%Y%m%d_%H%M%S).sql

# Compressed backup
pisovereign-cli backup \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --output - | gzip > /backup/pisovereign-$(date +%Y%m%d).sql.gz

Using pg_dump directly:

# Custom format backup (most flexible, supports parallel restore)
pg_dump -Fc -h postgres -U pisovereign -d pisovereign \
  -f /backup/pisovereign-$(date +%Y%m%d).dump

# Plain SQL backup
pg_dump -h postgres -U pisovereign -d pisovereign \
  -f /backup/pisovereign-$(date +%Y%m%d).sql

Automated Backups

Create backup script:

sudo nano /usr/local/bin/pisovereign-backup.sh
#!/bin/bash
set -euo pipefail

# Configuration
BACKUP_DIR="/backup/pisovereign"
DB_URL="postgres://pisovereign:pisovereign@postgres:5432/pisovereign"
RETENTION_DAILY=7
RETENTION_WEEKLY=4
RETENTION_MONTHLY=12

# Create directories
mkdir -p "$BACKUP_DIR"/{daily,weekly,monthly}

# Timestamp
DATE=$(date +%Y%m%d)
DAY_OF_WEEK=$(date +%u)
DAY_OF_MONTH=$(date +%d)

# Daily backup (custom format for flexible restore)
DAILY_FILE="$BACKUP_DIR/daily/pisovereign-$DATE.dump.gz"
echo "Creating daily backup: $DAILY_FILE"
pg_dump -Fc -d "$DB_URL" | gzip > "$DAILY_FILE"

# Weekly backup (Sunday)
if [ "$DAY_OF_WEEK" -eq 7 ]; then
    WEEKLY_FILE="$BACKUP_DIR/weekly/pisovereign-week$(date +%V)-$DATE.dump.gz"
    echo "Creating weekly backup: $WEEKLY_FILE"
    cp "$DAILY_FILE" "$WEEKLY_FILE"
fi

# Monthly backup (1st of month)
if [ "$DAY_OF_MONTH" -eq "01" ]; then
    MONTHLY_FILE="$BACKUP_DIR/monthly/pisovereign-$(date +%Y%m).dump.gz"
    echo "Creating monthly backup: $MONTHLY_FILE"
    cp "$DAILY_FILE" "$MONTHLY_FILE"
fi

# Cleanup old backups
echo "Cleaning up old backups..."
find "$BACKUP_DIR/daily" -name "*.dump.gz" -mtime +$RETENTION_DAILY -delete
find "$BACKUP_DIR/weekly" -name "*.dump.gz" -mtime +$((RETENTION_WEEKLY * 7)) -delete
find "$BACKUP_DIR/monthly" -name "*.dump.gz" -mtime +$((RETENTION_MONTHLY * 30)) -delete

# Backup config
CONFIG_BACKUP="$BACKUP_DIR/config/config-$DATE.toml"
mkdir -p "$BACKUP_DIR/config"
cp /etc/pisovereign/config.toml "$CONFIG_BACKUP"
find "$BACKUP_DIR/config" -name "*.toml" -mtime +30 -delete

echo "Backup completed successfully"
sudo chmod +x /usr/local/bin/pisovereign-backup.sh

Schedule with cron:

sudo crontab -e
# Daily backup at 2 AM
0 2 * * * /usr/local/bin/pisovereign-backup.sh >> /var/log/pisovereign-backup.log 2>&1
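The tiering logic in the script above (daily always, weekly on Sundays, monthly on the 1st) is a grandfather-father-son scheme. The same rules in a testable Python sketch:

```python
from datetime import date

def backup_tiers(d: date) -> set:
    """Which retention tiers a backup taken on day `d` belongs to.

    Mirrors the shell script: daily always, weekly on Sunday (ISO day 7),
    monthly on the 1st of the month.
    """
    tiers = {"daily"}
    if d.isoweekday() == 7:
        tiers.add("weekly")
    if d.day == 1:
        tiers.add("monthly")
    return tiers

print(backup_tiers(date(2026, 2, 1)))  # a Sunday on the 1st: all three tiers
```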

S3-Compatible Storage

S3 Configuration

PiSovereign CLI supports S3-compatible storage (AWS S3, MinIO, Backblaze B2):

# Environment variables
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

Or in configuration file:

# /etc/pisovereign/backup.toml
[s3]
bucket = "pisovereign-backups"
region = "eu-central-1"
endpoint = "https://s3.eu-central-1.amazonaws.com"
# For MinIO or Backblaze B2:
# endpoint = "https://s3.example.com"

S3 Backup Commands

# Backup to S3
pisovereign-cli backup \
  --s3-bucket pisovereign-backups \
  --s3-region eu-central-1 \
  --s3-prefix daily/ \
  --s3-access-key "$AWS_ACCESS_KEY_ID" \
  --s3-secret-key "$AWS_SECRET_ACCESS_KEY"

# With custom endpoint (MinIO)
pisovereign-cli backup \
  --s3-bucket pisovereign-backups \
  --s3-endpoint https://minio.local:9000 \
  --s3-access-key "$MINIO_ACCESS_KEY" \
  --s3-secret-key "$MINIO_SECRET_KEY"

# List backups in S3
aws s3 ls s3://pisovereign-backups/daily/

Automated S3 backup script:

#!/bin/bash
set -euo pipefail

DATE=$(date +%Y%m%d)

# Upload to S3
pisovereign-cli backup \
  --s3-bucket pisovereign-backups \
  --s3-region eu-central-1 \
  --s3-prefix "daily/pisovereign-$DATE.dump.gz" \
  --s3-access-key "$AWS_ACCESS_KEY_ID" \
  --s3-secret-key "$AWS_SECRET_ACCESS_KEY"

# Configure S3 lifecycle for automatic cleanup (one-time setup)
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket pisovereign-backups \
#   --lifecycle-configuration file://lifecycle.json

S3 lifecycle policy (lifecycle.json):

{
  "Rules": [
    {
      "ID": "DeleteOldDailyBackups",
      "Status": "Enabled",
      "Filter": { "Prefix": "daily/" },
      "Expiration": { "Days": 7 }
    },
    {
      "ID": "DeleteOldWeeklyBackups",
      "Status": "Enabled",
      "Filter": { "Prefix": "weekly/" },
      "Expiration": { "Days": 30 }
    },
    {
      "ID": "DeleteOldMonthlyBackups",
      "Status": "Enabled",
      "Filter": { "Prefix": "monthly/" },
      "Expiration": { "Days": 365 }
    }
  ]
}

Full System Backup

SD Card / NVMe Image

Create full system image for disaster recovery:

# Identify storage device
lsblk

# Create image (run from another system or boot USB)
sudo dd if=/dev/mmcblk0 of=/backup/pisovereign-full-$(date +%Y%m%d).img bs=4M status=progress

# Compress (takes a while)
gzip /backup/pisovereign-full-$(date +%Y%m%d).img

Incremental System Backup

Using rsync for incremental backups:

#!/bin/bash
# /usr/local/bin/pisovereign-system-backup.sh

BACKUP_DIR="/backup/system"
DATE=$(date +%Y%m%d)
LATEST="$BACKUP_DIR/latest"

mkdir -p "$BACKUP_DIR/$DATE"

rsync -aHAX --delete \
  --exclude='/proc/*' \
  --exclude='/sys/*' \
  --exclude='/dev/*' \
  --exclude='/tmp/*' \
  --exclude='/run/*' \
  --exclude='/mnt/*' \
  --exclude='/media/*' \
  --exclude='/backup/*' \
  --link-dest="$LATEST" \
  / "$BACKUP_DIR/$DATE/"

rm -f "$LATEST"
ln -s "$BACKUP_DIR/$DATE" "$LATEST"

Restore Procedures

Database Restore

# Stop the service
sudo systemctl stop pisovereign

# Create a backup of the current database (just in case)
pg_dump -Fc -h postgres -U pisovereign -d pisovereign \
  -f /tmp/pisovereign-pre-restore.dump

# Restore from backup (custom format)
gunzip -c /backup/pisovereign/daily/pisovereign-20260207.dump.gz > /tmp/restore.dump
pg_restore -h postgres -U pisovereign -d pisovereign --clean --if-exists /tmp/restore.dump
rm /tmp/restore.dump

# Or using CLI
pisovereign-cli restore \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --input /backup/pisovereign-20260207.dump

# Verify database connectivity and integrity
pg_isready -h postgres -U pisovereign -d pisovereign
psql -h postgres -U pisovereign -d pisovereign -c "SELECT 1;"

# Start service
sudo systemctl start pisovereign

# Verify
pisovereign-cli status

Restore from S3

# Download from S3
aws s3 cp s3://pisovereign-backups/daily/pisovereign-20260207.dump.gz /tmp/

# Or using CLI
pisovereign-cli restore \
  --s3-bucket pisovereign-backups \
  --s3-key daily/pisovereign-20260207.dump.gz \
  --s3-region eu-central-1

Configuration Restore

# Restore config
sudo cp /backup/pisovereign/config/config-20260207.toml /etc/pisovereign/config.toml

# Verify syntax
pisovereign-cli config validate

# Restart service
sudo systemctl restart pisovereign

Disaster Recovery

Complete system recovery procedure:

  1. Flash fresh Raspberry Pi OS
# On another computer, flash SD card
# Use Raspberry Pi Imager
  2. Basic system setup
# SSH in, update system
sudo apt update && sudo apt upgrade -y
  3. Restore from full image (if available)
# On another system
gunzip -c pisovereign-full-20260207.img.gz | sudo dd of=/dev/mmcblk0 bs=4M status=progress
  4. Or restore components
# Install PiSovereign
# (Follow installation guide)

# Restore configuration
sudo mkdir -p /etc/pisovereign
sudo cp config.toml.backup /etc/pisovereign/config.toml

# Restore database
pisovereign-cli restore \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --input pisovereign-backup.dump

# Restore Vault (if using local Vault)
sudo tar -xzf vault-backup.tar.gz -C /opt/vault/

# Start services
sudo systemctl start pisovereign

Backup Verification

Verify Database Backup

# Check file integrity
gzip -t /backup/pisovereign/daily/pisovereign-20260207.dump.gz && echo "OK"

# Test restore to a temporary database
createdb -h postgres -U pisovereign pisovereign_verify
gunzip -c /backup/pisovereign/daily/pisovereign-20260207.dump.gz | \
  pg_restore -h postgres -U pisovereign -d pisovereign_verify
psql -h postgres -U pisovereign -d pisovereign_verify \
  -c "SELECT COUNT(*) FROM conversations;"
dropdb -h postgres -U pisovereign pisovereign_verify

Automated Verification

#!/bin/bash
# /usr/local/bin/verify-backup.sh

BACKUP_FILE="/backup/pisovereign/daily/pisovereign-$(date +%Y%m%d).dump.gz"

if [ ! -f "$BACKUP_FILE" ]; then
    echo "ERROR: Today's backup not found!"
    exit 1
fi

# Verify gzip integrity
if ! gzip -t "$BACKUP_FILE" 2>/dev/null; then
    echo "ERROR: Backup file is corrupted!"
    exit 1
fi

# Verify database integrity by test-restoring to a temporary database
createdb -h postgres -U pisovereign pisovereign_verify
gunzip -c "$BACKUP_FILE" | pg_restore -h postgres -U pisovereign -d pisovereign_verify 2>&1
INTEGRITY=$(psql -h postgres -U pisovereign -d pisovereign_verify -tAc "SELECT 1;" 2>&1)
dropdb -h postgres -U pisovereign pisovereign_verify

if [ "$INTEGRITY" != "1" ]; then
    echo "ERROR: Database integrity check failed: $INTEGRITY"
    exit 1
fi

echo "Backup verification passed"

Add to cron:

# Verify backup at 3 AM (after 2 AM backup)
0 3 * * * /usr/local/bin/verify-backup.sh || echo "Backup verification failed!" | mail -s "PiSovereign Backup Alert" admin@example.com

Retention Policy

| Type    | Retention | Storage Estimate |
|---------|-----------|------------------|
| Daily   | 7 days    | ~70 MB           |
| Weekly  | 4 weeks   | ~40 MB           |
| Monthly | 12 months | ~120 MB          |
| Total   | -         | ~230 MB          |
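The totals follow directly from the retention counts, assuming roughly 10 MB per compressed dump (the figure the daily row implies):

```python
PER_BACKUP_MB = 10  # assumption: ~10 MB per compressed dump

retention = {"daily": 7, "weekly": 4, "monthly": 12}
estimates = {tier: count * PER_BACKUP_MB for tier, count in retention.items()}
total = sum(estimates.values())

print(estimates)  # {'daily': 70, 'weekly': 40, 'monthly': 120}
print(total)      # 230 (MB), matching the table
```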

Cleanup Script

#!/bin/bash
# /usr/local/bin/cleanup-backups.sh

BACKUP_DIR="/backup/pisovereign"

# Remove old daily backups (older than 7 days)
find "$BACKUP_DIR/daily" -name "*.dump.gz" -mtime +7 -delete

# Remove old weekly backups (older than 28 days)
find "$BACKUP_DIR/weekly" -name "*.dump.gz" -mtime +28 -delete

# Remove old monthly backups (older than 365 days)
find "$BACKUP_DIR/monthly" -name "*.dump.gz" -mtime +365 -delete

# Remove old config backups (older than 30 days)
find "$BACKUP_DIR/config" -name "*.toml" -mtime +30 -delete

# Report disk usage
echo "Backup disk usage:"
du -sh "$BACKUP_DIR"/*

Quick Reference

Backup Commands

# Local backup
pisovereign-cli backup \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --output /backup/pisovereign.dump

# S3 backup
pisovereign-cli backup --s3-bucket mybucket --s3-prefix daily/

# Verify backup
pg_restore --list /backup/pisovereign.dump

Restore Commands

# Local restore
pisovereign-cli restore \
  --database-url postgres://pisovereign:pisovereign@postgres:5432/pisovereign \
  --input /backup/pisovereign.dump

# S3 restore
pisovereign-cli restore --s3-bucket mybucket --s3-key daily/pisovereign.dump

Monitoring Backup Health

Add to Prometheus (note: file_mtime is not a built-in metric — export it yourself, e.g. via node_exporter's textfile collector):

# prometheus/rules/backups.yml
groups:
  - name: backups
    rules:
      - alert: BackupMissing
        expr: time() - file_mtime{path="/backup/pisovereign/daily/latest.dump.gz"} > 86400
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Daily backup is missing"
          description: "No backup created in the last 24 hours"

Next Steps

Security Hardening

Production security guide for PiSovereign deployments

Security Architecture

┌─────────────────────────────────────────────────┐
│  Network: Traefik TLS 1.3 + Docker isolation    │
├─────────────────────────────────────────────────┤
│  Application: Rate limiting, auth, validation   │
├─────────────────────────────────────────────────┤
│  Secrets: HashiCorp Vault, encrypted storage    │
├─────────────────────────────────────────────────┤
│  Host: SSH hardened, firewall, auto-updates     │
└─────────────────────────────────────────────────┘

Principles: Defense in depth — least privilege — fail secure — audit everything.


Host Security Basics

Docker provides process isolation, but the host still needs hardening. Apply these essentials on any machine running PiSovereign:

| Area     | Action |
|----------|--------|
| SSH      | Disable password auth, use Ed25519 keys, set PermitRootLogin no, consider a non-default port |
| Firewall | Allow only SSH + 443 (HTTPS). On Linux: ufw default deny incoming && ufw allow 22/tcp && ufw allow 443/tcp && ufw enable |
| Fail2ban | apt install fail2ban — protects SSH and can monitor Docker logs for repeated 401/429 responses |
| Updates  | Enable automatic security updates (unattended-upgrades on Debian/Ubuntu) |
| Users    | Lock root (passwd -l root), use a personal account with sudo |

For comprehensive OS hardening, refer to the CIS Benchmark for your distribution.


Application Security

Rate Limiting

[security]
rate_limit_enabled = true
rate_limit_rpm = 120          # Per IP per minute

[api]
max_request_size_bytes = 1048576  # 1 MB
request_timeout_secs = 30

API Authentication

Generate and store API keys in Vault:

docker compose exec vault vault kv put secret/pisovereign/api-keys \
  admin="$(openssl rand -base64 32)"

All requests require Authorization: Bearer <api-key>. Invalid keys return a generic 401 — no information leakage. Rate limiting is applied per key.
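The openssl rand -base64 32 call above can be reproduced with Python's secrets module, e.g. when generating additional keys (illustrative; key naming and storage are up to you):

```python
import base64
import secrets

def generate_api_key() -> str:
    """32 random bytes, base64-encoded -- same shape as `openssl rand -base64 32`."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")

key = generate_api_key()
print(len(key))  # 44 characters (32 bytes -> 44 base64 chars)
```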

Input Validation

PiSovereign validates all inputs automatically:

  • Maximum lengths enforced on all string fields
  • Content-type verification
  • JSON schema validation
  • Path traversal protection
  • SQL injection prevention via parameterized queries
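As an illustration of the first two checks, a minimal validator sketch (the limit and error messages are hypothetical, not PiSovereign's actual values):

```python
MAX_MESSAGE_LEN = 4096  # hypothetical limit, for illustration only

def validate_chat_request(body: dict) -> str:
    """Validate an incoming chat body: required field, type, and max length."""
    message = body.get("message")
    if not isinstance(message, str):
        raise ValueError("message is required and must be a string")
    if len(message) > MAX_MESSAGE_LEN:
        raise ValueError(f"message exceeds {MAX_MESSAGE_LEN} characters")
    return message
```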

Container Isolation

Docker Compose provides process-level isolation. The default stack additionally:

  • Runs Ollama on an internal: true network (ollama-internal) — no direct external access
  • Binds services to 127.0.0.1 where possible (Baïkal, Vault UI)
  • Uses read-only filesystem mounts for config files
  • Limits container capabilities via Docker defaults

Vault Security

PiSovereign uses a ChainedSecretStore — Vault is the primary store with config.toml as fallback. See Vault Setup for initial configuration.

Seal/Unseal

The Docker stack auto-initializes and auto-unseals Vault for convenience. In production, consider:

  • Manual unseal: Remove the vault-init container, unseal interactively after each restart
  • Key splitting (Shamir’s Secret Sharing): vault operator init -key-shares=5 -key-threshold=3 — distribute shares to different people/locations
  • Cloud KMS auto-unseal: Use AWS KMS, GCP KMS, or Azure Key Vault for unattended unseal without storing keys locally

Token Management

PiSovereign uses AppRole authentication with short-lived tokens:

# Tokens expire after 1 hour, max 4 hours
docker compose exec vault vault write auth/approle/role/pisovereign \
  token_policies="pisovereign" \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_ttl=24h

Best practices:

  • Use short TTLs (1 hour default is good)
  • Rotate secret IDs regularly
  • Never log tokens
  • Revoke tokens on application shutdown

Audit Logging

docker compose exec vault vault audit enable file \
  file_path=/vault/logs/audit.log

Network Security

TLS Configuration

Traefik handles TLS termination. Harden the defaults:

# docker/traefik/dynamic.yml
tls:
  options:
    default:
      minVersion: VersionTLS13
      cipherSuites:
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256
      curvePreferences:
        - X25519
        - CurveP384
      sniStrict: true

In config.toml:

[security]
min_tls_version = "1.3"
tls_verify_certs = true
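For any client code you write against the API, the same floor can be enforced locally. A sketch using Python's stdlib ssl module (PiSovereign itself is Rust and reads these values from config.toml; this is just the client-side analogue):

```python
import ssl

# Client-side equivalent of min_tls_version = "1.3"
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3
ctx.check_hostname = True           # analogous to tls_verify_certs = true
ctx.verify_mode = ssl.CERT_REQUIRED
```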

Network Isolation

The Docker Compose stack defines two networks:

| Network             | Type            | Purpose                              |
|---------------------|-----------------|--------------------------------------|
| pisovereign-network | bridge          | Main service communication           |
| ollama-internal     | internal bridge | Isolates Ollama — no external access |

Traefik is the only service exposed to the host network. All other services communicate internally.


Security Monitoring

Configure structured JSON logging:

[logging]
level = "info"
format = "json"
include_request_id = true
include_user_id = true

Key events to monitor:

  • Failed authentication attempts (401s)
  • Rate limit triggers (429s)
  • Vault access failures
  • Unusual request patterns

See Monitoring for Prometheus alert rules covering HighFailedAuthRate and RateLimitTriggered.


Incident Response

  1. Isolate — stop external access: docker compose down or firewall deny-all
  2. Preserve evidence — copy container logs: docker compose logs > incident-$(date +%Y%m%d).log
  3. Rotate credentials:
    docker compose exec vault vault kv put secret/pisovereign/api-keys \
      admin="$(openssl rand -base64 32)"
    
  4. Review access — check Docker logs, Vault audit log, SSH lastlog
  5. Restore from known-good backup if needed

Security Checklist

Initial Setup

  • Host SSH uses key-only authentication
  • Firewall allows only required ports
  • Automatic security updates enabled
  • Default passwords changed

Application

  • Rate limiting enabled
  • API keys stored in Vault
  • TLS 1.3 minimum enforced
  • Logs do not contain secrets

Vault

  • Unseal keys secured (not on same host in production)
  • AppRole configured with short TTLs
  • Audit logging enabled

Ongoing

  • Monthly credential rotation
  • Review Vault audit logs
  • Keep Docker images updated
  • Review container security scans

References

📚 External resources and documentation references

This page collects official documentation, tutorials, and resources referenced throughout the PiSovereign documentation.


Hardware

Raspberry Pi 5

| Resource                      | Description                       |
|-------------------------------|-----------------------------------|
| Raspberry Pi 5 Product Page   | Official product information      |
| Raspberry Pi 5 Documentation  | Hardware specifications and setup |
| Raspberry Pi OS               | Operating system downloads        |
| Raspberry Pi Imager           | SD card flashing tool             |
| GPIO Pinout                   | Interactive pinout reference      |

Hailo AI Accelerator

| Resource                        | Description                        |
|---------------------------------|------------------------------------|
| Hailo-10H AI HAT+ Product Page  | Official product information       |
| Hailo Developer Zone            | SDKs, tools, and documentation     |
| HailoRT SDK 4.20 Documentation  | Runtime SDK reference              |
| Hailo Model Zoo                 | Pre-compiled models                |
| Hailo-Ollama GitHub             | Ollama-compatible inference server |

Storage

| Resource                 | Description       |
|--------------------------|-------------------|
| NVMe SSD Compatibility   | NVMe boot support |
| PCIe HAT+ Documentation  | PCIe expansion    |

Rust Ecosystem

Language & Tools

| Resource                      | Description                    |
|-------------------------------|--------------------------------|
| The Rust Programming Language | Official Rust book             |
| Rust by Example               | Learn Rust through examples    |
| Rust API Guidelines           | Best practices for API design  |
| Rust Edition Guide            | Edition migration guide        |
| rustup Documentation          | Toolchain manager              |
| Cargo Book                    | Package manager documentation  |

Frameworks Used

| Resource              | Description                 |
|-----------------------|-----------------------------|
| Axum Documentation    | Web framework               |
| Tokio Documentation   | Async runtime               |
| SQLx Documentation    | Async SQL toolkit           |
| Serde Documentation   | Serialization framework     |
| Tower Documentation   | Middleware framework        |
| Tracing Documentation | Application instrumentation |
| Clap Documentation    | Command-line parser         |
| Reqwest Documentation | HTTP client                 |
| Utoipa Documentation  | OpenAPI generation          |

Testing & Quality

| Resource               | Description        |
|------------------------|--------------------|
| Rust Testing           | Testing in Rust    |
| cargo-tarpaulin        | Code coverage tool |
| cargo-deny             | Dependency linting |
| Clippy Lints           | Lint reference     |
| Rustfmt Configuration  | Formatter options  |

Security

HashiCorp Vault

ResourceDescription
Vault DocumentationOfficial documentation
Vault Getting StartedBeginner tutorials
KV Secrets Engine v2Key-value secrets
AppRole Auth MethodApplication authentication
Vault Security ModelSecurity architecture
Vault Production HardeningProduction best practices

System Security

ResourceDescription
CIS BenchmarksSecurity configuration guides
OWASP API Security Top 10API security risks
Mozilla SSL ConfigurationTLS configuration generator
SSH Hardening GuideSSH security
Fail2ban DocumentationIntrusion prevention

Cryptography

| Resource | Description |
|----------|-------------|
| RustCrypto | Pure Rust crypto implementations |
| ring Documentation | Crypto library |
| Argon2 Specification | Password hashing |

APIs & Integrations

AI & Language Models

| Resource | Description |
|----------|-------------|
| OpenAI API Reference | OpenAI API docs |
| Ollama API | Ollama REST API |
| LLM Tokenization | Understanding tokenizers |

Communication

| Resource | Description |
|----------|-------------|
| WhatsApp Business API | WhatsApp Cloud API |
| WhatsApp Webhooks | Webhook setup |

Email

| Resource | Description |
|----------|-------------|
| Proton Bridge | Proton Mail IMAP/SMTP bridge |
| Gmail IMAP | Gmail IMAP/SMTP settings |
| Outlook IMAP | Outlook IMAP/SMTP settings |
| IMAP RFC 3501 | IMAP protocol |
| SMTP RFC 5321 | SMTP protocol |
| XOAUTH2 SASL | OAuth2 for IMAP/SMTP |

Calendar

| Resource | Description |
|----------|-------------|
| CalDAV RFC 4791 | CalDAV protocol |
| iCalendar RFC 5545 | iCalendar format |
| Baïkal Server | CalDAV/CardDAV server |

Weather

| Resource | Description |
|----------|-------------|
| Open-Meteo API | Free weather API |

Infrastructure

Docker

| Resource | Description |
|----------|-------------|
| Docker Documentation | Official docs |
| Docker Compose | Multi-container apps |
| Docker on Raspberry Pi | ARM installation |

Reverse Proxy

| Resource | Description |
|----------|-------------|
| Traefik Documentation | Cloud-native proxy |
| Let’s Encrypt | Free TLS certificates |
| Nginx Documentation | Web server/proxy |

Monitoring

| Resource | Description |
|----------|-------------|
| Prometheus Documentation | Metrics collection |
| Grafana Documentation | Visualization |
| Loki Documentation | Log aggregation |
| OpenTelemetry | Observability framework |

Databases

| Resource | Description |
|----------|-------------|
| PostgreSQL 17 Documentation | Relational database |
| pgvector | Vector similarity search for PostgreSQL |

Development Tools

VS Code

| Resource | Description |
|----------|-------------|
| rust-analyzer | Rust language server |
| CodeLLDB | Debugger |
| Even Better TOML | TOML support |

GitHub

| Resource | Description |
|----------|-------------|
| GitHub Actions | CI/CD platform |
| Release Please | Release automation |
| GitHub Pages | Static site hosting |

Documentation

| Resource | Description |
|----------|-------------|
| mdBook Documentation | Documentation tool |
| rustdoc Book | Rust documentation |

Standards & Specifications

| Resource | Description |
|----------|-------------|
| OpenAPI Specification | API description format |
| JSON Schema | JSON validation |
| Semantic Versioning | Version numbering |
| Keep a Changelog | Changelog format |
| Conventional Commits | Commit message format |

Community

| Resource | Description |
|----------|-------------|
| Rust Users Forum | Community forum |
| Rust Discord | Chat community |
| This Week in Rust | Weekly newsletter |
| Raspberry Pi Forums | Hardware community |

💡 Tip: Many of these resources are updated regularly. Always check that you are reading the latest version of a project's documentation before implementing a feature against it.