Monitoring

Prometheus metrics, Grafana dashboards, Loki log aggregation, and alerting

Overview

The monitoring stack is included in Docker Compose and activated with a single profile flag:

docker compose --profile monitoring up -d

This starts Prometheus, Grafana, Loki, Promtail, Node Exporter, and the OpenTelemetry Collector — all pre-configured to scrape PiSovereign metrics and collect logs.
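Profiles are plain Compose configuration; a hypothetical fragment (service names and images are illustrative — the actual file may differ) shows how services are gated behind the `monitoring` profile:

```yaml
services:
  prometheus:
    image: prom/prometheus
    profiles: ["monitoring"]   # only started with --profile monitoring
  grafana:
    image: grafana/grafana
    profiles: ["monitoring"]
```

Services without a `profiles` key start on every `docker compose up`; services with one start only when their profile is requested.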

┌─────────────────┐
│   PiSovereign   │
│  /metrics/      │
│  prometheus     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────┐
│   Prometheus    │────▶│    Grafana      │
│   (Metrics)     │     │  (Dashboards)   │
└─────────────────┘     └─────────────────┘

┌─────────────────┐     ┌─────────────────┐
│    Promtail     │────▶│      Loki       │
│  (Log Shipper)  │     │  (Log Storage)  │
└─────────────────┘     └─────────────────┘

Resource Usage (Raspberry Pi 5)

| Component  | Memory  | Storage/Day |
|------------|---------|-------------|
| Prometheus | ~100 MB | ~50 MB      |
| Grafana    | ~150 MB | Minimal     |
| Loki       | ~200 MB | ~100 MB     |
| Promtail   | ~30 MB  | Minimal     |
| **Total**  | ~480 MB | ~150 MB     |

Accessing Dashboards

After enabling the monitoring profile:

| Service    | URL |
|------------|-----|
| Grafana    | http://localhost/grafana (via Traefik) |
| Prometheus | http://localhost:9090 |

Default Grafana credentials are admin / admin (change on first login). Dashboards and data sources are auto-provisioned — no manual setup required.


Prometheus Metrics

PiSovereign exposes metrics at /metrics/prometheus:

Application Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `app_uptime_seconds` | Counter | Application uptime |
| `app_version_info` | Gauge | Version information |

HTTP Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `http_requests_total` | Counter | Total HTTP requests |
| `http_requests_success_total` | Counter | 2xx responses |
| `http_requests_client_error_total` | Counter | 4xx responses |
| `http_requests_server_error_total` | Counter | 5xx responses |
| `http_requests_active` | Gauge | Active requests |
| `http_response_time_avg_ms` | Gauge | Average response time |
| `http_response_time_ms_bucket` | Histogram | Response time distribution |
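The histogram and counter metrics above support the usual PromQL recipes; the sketches below are standard query shapes, not queries taken from the shipped dashboards:

```promql
# P99 latency from the histogram buckets (5-minute window)
histogram_quantile(0.99, sum by (le) (rate(http_response_time_ms_bucket[5m])))

# Request rate and server-error ratio
rate(http_requests_total[5m])
rate(http_requests_server_error_total[5m]) / rate(http_requests_total[5m])
```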

Inference Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `inference_requests_total` | Counter | Total inference requests |
| `inference_requests_success_total` | Counter | Successful inferences |
| `inference_requests_failed_total` | Counter | Failed inferences |
| `inference_time_avg_ms` | Gauge | Average inference time |
| `inference_time_ms_bucket` | Histogram | Inference time distribution |
| `inference_tokens_total` | Counter | Total tokens generated |
| `inference_healthy` | Gauge | Health status (0/1) |
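Two hedged PromQL sketches over these counters (the same shapes appear in the alert rules further down):

```promql
# Inference failure ratio over the last 5 minutes
rate(inference_requests_failed_total[5m]) / rate(inference_requests_total[5m])

# Token throughput (tokens/second)
rate(inference_tokens_total[5m])
```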

Cache Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `cache_hits_total` | Counter | Cache hits |
| `cache_misses_total` | Counter | Cache misses |
| `cache_size` | Gauge | Current cache size |
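The hit ratio follows directly from the two counters; a minimal Python sketch of the arithmetic (function name is illustrative, not part of the codebase):

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return the cache hit ratio, guarding against division by zero
    when the cache has not been exercised yet."""
    total = hits + misses
    return hits / total if total else 0.0

print(cache_hit_ratio(900, 100))  # 0.9
```

In PromQL the equivalent is `cache_hits_total / (cache_hits_total + cache_misses_total)`.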

Model Routing Metrics

These metrics are only present when [model_routing] is enabled.

| Metric | Type | Description |
|--------|------|-------------|
| `model_routing_requests_total{tier="..."}` | Counter | Requests per tier (trivial/simple/moderate/complex) |
| `model_routing_template_hits_total` | Counter | Trivial queries answered by template |
| `model_routing_upgrades_total` | Counter | Tier upgrades due to low confidence |
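All of the metrics above are served in the Prometheus text exposition format. A rough Python sketch of reading that format (this simplified parser ignores timestamps and assumes label values contain no spaces — real clients should use an exposition-format library):

```python
def parse_prometheus_text(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition output into {series: value}.
    Comment lines (# HELP / # TYPE) are skipped; labels stay part of
    the series key, matching how PromQL displays series."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples

example = """\
# HELP app_uptime_seconds Application uptime
# TYPE app_uptime_seconds counter
app_uptime_seconds 12345
http_requests_total{method="GET"} 42
"""
metrics = parse_prometheus_text(example)
print(metrics["app_uptime_seconds"])  # 12345.0
```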

Grafana Dashboard Panels

The pre-built PiSovereign dashboard includes:

Overview Row

| Panel | Description |
|-------|-------------|
| Uptime | Application uptime counter |
| Inference Status | Health indicator |
| Total Requests | Cumulative request count |
| Active Requests | Current in-flight requests |
| Avg Response Time | Mean latency |
| Total Tokens | LLM tokens generated |

HTTP Requests Row

| Panel | Visualization | Description |
|-------|---------------|-------------|
| Request Rate | Time series | Requests/second over time |
| Status Distribution | Pie chart | Success/error breakdown |
| Response Time P50/P90/P99 | Stat | Latency percentiles |

Inference Row

| Panel | Visualization | Description |
|-------|---------------|-------------|
| Inference Rate | Time series | Inferences/second |
| Inference Latency | Gauge | Current avg latency |
| Token Rate | Time series | Tokens/second |
| Model Usage | Table | Per-model statistics |

System Row

| Panel | Description |
|-------|-------------|
| CPU Usage | System CPU utilization |
| Memory Usage | RAM usage |
| Disk I/O | Storage throughput |
| Network I/O | Network traffic |

Alerting

Alert rules ship pre-configured in docker/prometheus/rules/; if that directory is absent, add them yourself:

# prometheus/rules/pisovereign.yml
groups:
  - name: pisovereign
    rules:
      - alert: PiSovereignDown
        expr: up{job="pisovereign"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "PiSovereign is down"

      - alert: InferenceEngineUnhealthy
        expr: inference_healthy == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Inference engine is unhealthy"

      - alert: HighResponseTime
        expr: http_response_time_avg_ms > 5000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Average response time is {{ $value }}ms"

      - alert: HighErrorRate
        expr: rate(http_requests_server_error_total[5m]) / rate(http_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Server error rate is {{ $value | humanizePercentage }}"

      - alert: InferenceFailures
        expr: rate(inference_requests_failed_total[5m]) / rate(inference_requests_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Inference failure rate is {{ $value | humanizePercentage }}"

Log Aggregation

Loki and Promtail are included in the monitoring profile. Logs from all Docker containers are automatically collected and available in Grafana under the Loki data source.

To query logs in Grafana:

  1. Go to Explore → select Loki data source
  2. Use LogQL queries:
{container="pisovereign"} |= "error"
{container="ollama"} | json | level="error"
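Beyond simple filtering, LogQL can turn logs into metrics. Two hedged sketches using the same container labels as above:

```logql
# Error lines per second across PiSovereign logs
rate({container="pisovereign"} |= "error" [5m])

# Log volume broken down by level (for JSON-structured logs)
sum by (level) (count_over_time({container="ollama"} | json [5m]))
```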

Resource Optimization

If running on constrained hardware, tune these settings:

# In docker/prometheus/prometheus.yml
global:
  scrape_interval: 30s  # Increase from 15s to reduce load

# Prometheus storage flags (in compose.yml command)
--storage.tsdb.retention.time=3d    # Reduce from 7d
--storage.tsdb.retention.size=500MB # Cap storage

# In docker/loki/loki.yml
limits_config:
  retention_period: 72h  # 3 days instead of 7
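A quick sanity check on these numbers, assuming the ~50 MB/day Prometheus ingest from the resource-usage table: the 500 MB size cap alone would hold about ten days of data, so with `retention.time=3d` the time limit binds first, at roughly 150 MB of TSDB on disk:

```python
prometheus_mb_per_day = 50   # estimate from the resource-usage table
size_cap_mb = 500            # --storage.tsdb.retention.size
retention_days = 3           # --storage.tsdb.retention.time

days_until_cap = size_cap_mb / prometheus_mb_per_day      # 10.0
steady_state_mb = retention_days * prometheus_mb_per_day  # 150
print(days_until_cap, steady_state_mb)
```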

Troubleshooting

Metrics not appearing

# Check PiSovereign exposes metrics
curl http://localhost:3000/metrics/prometheus

# Check Prometheus scrape targets
curl http://localhost:9090/api/v1/targets
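The targets endpoint answers with JSON. A small Python sketch for listing unhealthy scrape targets (the sample payload below mirrors the shape of the real `/api/v1/targets` response; the hostnames are illustrative):

```python
import json

def down_targets(payload: dict) -> list[str]:
    """Return scrape URLs of targets Prometheus reports as not 'up'."""
    return [
        t["scrapeUrl"]
        for t in payload["data"]["activeTargets"]
        if t["health"] != "up"
    ]

sample = json.loads("""
{"status": "success",
 "data": {"activeTargets": [
   {"scrapeUrl": "http://pisovereign:3000/metrics/prometheus", "health": "up"},
   {"scrapeUrl": "http://node-exporter:9100/metrics", "health": "down"}]}}
""")
print(down_targets(sample))  # ['http://node-exporter:9100/metrics']
```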

Grafana dashboard empty

  1. Verify time range includes recent data
  2. Check Prometheus data source is connected (Settings → Data Sources)
  3. Query Prometheus directly at http://localhost:9090/graph

Next Steps