API Reference

📡 REST API documentation for PiSovereign

This document provides complete REST API documentation including authentication, endpoints, and the OpenAPI specification.

Overview

Base URL

http://localhost:3000      # Development
https://your-domain.com    # Production (behind Traefik)

Content Type

All requests and responses use JSON:

Content-Type: application/json
Accept: application/json

Request ID

Every response includes a correlation ID for debugging:

X-Request-Id: 550e8400-e29b-41d4-a716-446655440000

Include this when reporting issues.


Authentication

API Key Authentication

Protected endpoints require an API key in the Authorization header:

Authorization: Bearer sk-your-api-key

Configuration

API keys are mapped to user IDs in config.toml:

[security.api_key_users]
"sk-abc123def456" = "550e8400-e29b-41d4-a716-446655440000"
"sk-xyz789ghi012" = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

Example Request

curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123def456" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'
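The same request can be assembled in Python with only the standard library. This is a minimal sketch: `build_chat_request` is a hypothetical helper name, not part of PiSovereign, and it only constructs the request so that the headers can be inspected before anything is sent.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, message: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /v1/chat request."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat",
        data=body,
        method="POST",
        headers={
            # The API key travels as a bearer token, as described above.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request.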

Authentication Errors

Status  Code          Description
401     UNAUTHORIZED  Missing or invalid API key
403     FORBIDDEN     Valid key, but action not allowed

{
  "error": {
    "code": "UNAUTHORIZED",
    "message": "Invalid or missing API key",
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Rate Limiting

Rate limiting is applied per IP address.

Configuration   Default
rate_limit_rpm  120 requests/minute

Headers

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1707321600

Rate Limited Response

HTTP/1.1 429 Too Many Requests
Retry-After: 30

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Please retry after 30 seconds.",
    "retry_after": 30
  }
}
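A client can use these headers to decide how long to back off. The sketch below makes two assumptions that this document does not confirm: `X-RateLimit-Reset` is a Unix timestamp in seconds, and a one-second fallback is acceptable when neither header is usable. `retry_delay_seconds` is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before retrying; 0.0 if not rate limited."""
    if status != 429:
        return 0.0
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    # Prefer Retry-After, which states the delay directly in seconds.
    retry_after = lowered.get("retry-after", "")
    if retry_after.isdigit():
        return float(retry_after)
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    reset = lowered.get("x-ratelimit-reset", "")
    if reset.isdigit():
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return 1.0  # assumed fallback when no usable header is present
```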

Endpoints

Health & Status

GET /health

Liveness probe. Returns 200 if the server is running.

Authentication: None required

Response: 200 OK

{
  "status": "ok"
}

GET /ready

Readiness probe with inference engine status.

Authentication: None required

Response: 200 OK (healthy) or 503 Service Unavailable

{
  "status": "ready",
  "inference": {
    "healthy": true,
    "model": "qwen2.5-1.5b-instruct",
    "latency_ms": 45
  }
}

GET /ready/all

Extended health check with all service statuses.

Authentication: None required

Response: 200 OK

{
  "status": "ready",
  "services": {
    "inference": { "healthy": true, "latency_ms": 45 },
    "database": { "healthy": true, "latency_ms": 2 },
    "cache": { "healthy": true },
    "whatsapp": { "healthy": true, "latency_ms": 120 },
    "email": { "healthy": true, "latency_ms": 89 },
    "calendar": { "healthy": true, "latency_ms": 35 },
    "weather": { "healthy": true, "latency_ms": 180 }
  },
  "latency_percentiles": {
    "p50_ms": 45,
    "p90_ms": 120,
    "p99_ms": 250
  }
}
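A monitoring script might reduce this payload to the list of failing services. A minimal sketch, assuming the response shape shown above; `unhealthy_services` is a hypothetical helper, and treating a missing `healthy` flag as unhealthy is a design choice of this example.

```python
from typing import Any, Dict, List


def unhealthy_services(payload: Dict[str, Any]) -> List[str]:
    """Return the sorted names of services that do not report healthy: true."""
    services = payload.get("services", {})
    return sorted(
        name for name, info in services.items()
        if not info.get("healthy", False)  # a missing flag counts as unhealthy
    )
```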

Chat

POST /v1/chat

Send a message and receive a response.

Authentication: Required

Request Body:

Field            Type     Required  Description
message          string   Yes       User message
conversation_id  string   No        Continue existing conversation
system_prompt    string   No        Override system prompt
model            string   No        Override default model
temperature      float    No        Sampling temperature (0.0-2.0)
max_tokens       integer  No        Maximum response tokens

{
  "message": "What's the weather in Berlin?",
  "conversation_id": "conv-123",
  "temperature": 0.7
}

Response: 200 OK

{
  "id": "msg-456",
  "conversation_id": "conv-123",
  "role": "assistant",
  "content": "Currently in Berlin, it's 15°C with partly cloudy skies...",
  "model": "qwen2.5-1.5b-instruct",
  "tokens": {
    "prompt": 45,
    "completion": 128,
    "total": 173
  },
  "created_at": "2026-02-07T10:30:00Z"
}

POST /v1/chat/stream

Streaming chat using Server-Sent Events (SSE).

Authentication: Required

Request Body: Same as /v1/chat

Response: 200 OK (text/event-stream)

event: message
data: {"delta": "Currently"}

event: message
data: {"delta": " in Berlin"}

event: message
data: {"delta": ", it's 15°C"}

event: done
data: {"tokens": {"prompt": 45, "completion": 128, "total": 173}}

Example (JavaScript):

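// EventSource only supports GET without custom headers, so this example
// uses fetch() and reads the SSE stream manually (Node 18+ ES module).
const response = await fetch('http://localhost:3000/v1/chat/stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ message: 'Hello' })
});

const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buffer = '';
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += value;
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep a trailing partial line for the next chunk
  for (const line of lines) {
    if (!line.startsWith('data:')) continue;
    const data = JSON.parse(line.slice(5));
    if (data.delta) process.stdout.write(data.delta);
  }
}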
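For Python clients, the event format shown above can be decoded with a small parser. This is a sketch that assumes every `data:` payload is a JSON object, as in the sample stream; `parse_sse` and `collect_reply` are hypothetical helper names. It works on any iterable of text lines, so it can be fed from an HTTP response or from a list in a test.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, decoded_json) for each event in an SSE line stream."""
    event, data_parts = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space that follows the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:  # flush an event that was not followed by a blank line
        yield event, json.loads("\n".join(data_parts))


def collect_reply(lines: Iterable[str]) -> Tuple[str, Optional[Dict[str, int]]]:
    """Join all deltas into the full reply and return it with the token usage."""
    text: List[str] = []
    usage = None
    for event, data in parse_sse(lines):
        if event == "message":
            text.append(data["delta"])
        elif event == "done":
            usage = data.get("tokens")
    return "".join(text), usage
```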

Commands

POST /v1/commands

Execute a command and get the result.

Authentication: Required

Request Body:

Field    Type    Required  Description
command  string  Yes       Command to execute
args     object  No        Command arguments

{
  "command": "briefing"
}

Response: 200 OK

{
  "command": "MorningBriefing",
  "status": "completed",
  "result": {
    "weather": "15°C, partly cloudy",
    "calendar": [
      {"time": "09:00", "title": "Team standup"},
      {"time": "14:00", "title": "Client meeting"}
    ],
    "emails": {
      "unread": 5,
      "important": 2
    }
  },
  "executed_at": "2026-02-07T07:00:00Z"
}

Available Commands:

Command   Description       Arguments
briefing  Morning briefing  None
weather   Current weather   location (optional)
calendar  Today’s events    days (default: 1)
emails    Email summary     count (default: 10)
help      List commands     None

POST /v1/commands/parse

Parse a command without executing it.

Authentication: Required

Request Body:

{
  "input": "create meeting tomorrow at 3pm"
}

Response: 200 OK

{
  "parsed": true,
  "command": {
    "type": "CreateCalendarEvent",
    "title": "meeting",
    "start": "2026-02-08T15:00:00Z",
    "end": "2026-02-08T16:00:00Z"
  },
  "confidence": 0.92,
  "requires_approval": true
}

System Command Catalog

The system command catalog provides a discoverable set of shell commands that can be executed on the host system. On first startup, PiSovereign automatically populates 32 default commands (disk usage, system info, network tools, etc.) stored in PostgreSQL.

GET /v1/commands/catalog

List all commands in the catalog.

Authentication: Required

Query Parameters:

Parameter  Type     Required  Description
limit      integer  No        Maximum results (default: 100)
offset     integer  No        Pagination offset (default: 0)

Response: 200 OK

[
  {
    "id": "default-disk-free",
    "name": "Disk Free Space",
    "description": "Show available disk space on all mounts",
    "command": "df -h",
    "category": "filesystem",
    "risk_level": "safe",
    "os": "linux",
    "requires_approval": false,
    "created_at": "2026-02-24T08:50:08Z",
    "updated_at": "2026-02-24T08:50:08Z"
  }
]

GET /v1/commands/catalog/search

Search the catalog by keyword.

Authentication: Required

Query Parameters:

Parameter  Type    Required  Description
q          string  Yes       Search query (matches name and description)

Response: 200 OK — returns matching commands (same format as listing).

GET /v1/commands/catalog/count

Get the total number of catalog entries.

Authentication: Required

Response: 200 OK

{
  "count": 32
}
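The count can be combined with the `limit` and `offset` parameters of the listing endpoint to page through the whole catalog. A minimal sketch with hypothetical helper names (`catalog_page_url`, `page_offsets`); it only builds URLs and offsets and performs no requests.

```python
from typing import List
from urllib.parse import urlencode


def catalog_page_url(base_url: str, limit: int = 100, offset: int = 0) -> str:
    """Build the listing URL for one page of the command catalog."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    query = urlencode({"limit": limit, "offset": offset})
    return f"{base_url.rstrip('/')}/v1/commands/catalog?{query}"


def page_offsets(total: int, limit: int = 100) -> List[int]:
    """Return the offsets needed to fetch `total` entries in pages of `limit`."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    return list(range(0, total, limit))
```

With the 32 default commands and a page size of 10, `page_offsets(32, 10)` yields four pages.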

GET /v1/commands/catalog/{id}

Get a specific catalog command by ID.

Authentication: Required

Response: 200 OK — returns a single command object.

POST /v1/commands/catalog

Create a custom catalog command.

Authentication: Required

Request Body:

{
  "name": "Check Logs",
  "description": "Tail the last 100 lines of syslog",
  "command": "tail -n 100 /var/log/syslog",
  "category": "system",
  "risk_level": "safe",
  "os": "linux",
  "requires_approval": false
}

Response: 201 Created

POST /v1/commands/catalog/{id}/execute

Execute a command from the catalog. Commands with requires_approval: true will create an approval request instead of executing immediately.

Authentication: Required

Response: 200 OK

DELETE /v1/commands/catalog/{id}

Delete a catalog command.

Authentication: Required

Response: 204 No Content


Memory

The memory API manages the RAG (Retrieval-Augmented Generation) knowledge store. Memories are automatically used to enrich chat context.

GET /v1/memories

List all stored memories.

Authentication: Required

Response: 200 OK

[
  {
    "id": "uuid",
    "content": "The user prefers dark mode",
    "summary": "UI preference: dark mode",
    "memory_type": "Preference",
    "importance": 0.8,
    "access_count": 5,
    "tags": ["ui", "preference"],
    "created_at": "2026-02-24T08:50:00Z",
    "updated_at": "2026-02-24T09:00:00Z"
  }
]

POST /v1/memories

Create a new memory entry.

Authentication: Required

Request Body:

Field        Type      Required  Description
content      string    Yes       Memory content text
summary      string    Yes       Short summary
memory_type  string    No        Type: fact, preference, tool_result, correction, context (default: context)
importance   float     No        Importance score 0.0–1.0 (default: 0.5)
tags         string[]  No        Optional tags

Response: 201 Created
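Validating the body on the client side avoids a round trip for obviously invalid input. The sketch below applies the field rules from the table above; `build_memory_payload` is a hypothetical helper, and the server's actual validation may be stricter or looser than this.

```python
from typing import Any, Dict, List, Optional

# Allowed values as listed in the request body table.
MEMORY_TYPES = ("fact", "preference", "tool_result", "correction", "context")


def build_memory_payload(content: str, summary: str,
                         memory_type: str = "context",
                         importance: float = 0.5,
                         tags: Optional[List[str]] = None) -> Dict[str, Any]:
    """Return a JSON-serializable body for POST /v1/memories."""
    if not content or not summary:
        raise ValueError("content and summary are required")
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"memory_type must be one of {MEMORY_TYPES}")
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0.0 and 1.0")
    payload: Dict[str, Any] = {
        "content": content,
        "summary": summary,
        "memory_type": memory_type,
        "importance": importance,
    }
    if tags:  # tags are optional, so omit the key when none are given
        payload["tags"] = list(tags)
    return payload
```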

GET /v1/memories/search

Search memories by semantic similarity.

Authentication: Required

Query Parameters:

ParameterTypeRequiredDescription
qstringYesSearch query

Response: 200 OK — returns matching memories ranked by relevance.

GET /v1/memories/stats

Get memory storage statistics.

Authentication: Required

Response: 200 OK

{
  "total": 42,
  "by_type": [
    {"memory_type": "Fact", "count": 15},
    {"memory_type": "Preference", "count": 8},
    {"memory_type": "Tool Result", "count": 10},
    {"memory_type": "Correction", "count": 2},
    {"memory_type": "Context", "count": 7}
  ]
}

POST /v1/memories/decay

Trigger a manual memory importance decay cycle. Reduces the importance of older, less-accessed memories.

Authentication: Required

Response: 200 OK

GET /v1/memories/{id}

Get a specific memory by ID.

Authentication: Required

Response: 200 OK

DELETE /v1/memories/{id}

Delete a specific memory.

Authentication: Required

Response: 204 No Content


Agentic Tasks

Multi-agent task orchestration. Decompose complex requests into parallel sub-tasks executed by independent AI agents.

Note: Requires [agentic] enabled = true in config.toml.

POST /v1/agentic/tasks

Create a new agentic task for multi-agent processing.

Authentication: Required

Request Body:

Field             Type     Required  Description
description       string   Yes       Task description in natural language
require_approval  boolean  No        Require approval before sub-agent execution (default: false)

{
  "description": "Plan my trip to Berlin next week — check weather, find transit options, and create calendar events",
  "require_approval": false
}

Response: 201 Created

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "planning",
  "created_at": "2026-03-03T10:00:00Z"
}

GET /v1/agentic/tasks/{task_id}

Get the current status and results of an agentic task.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "description": "Plan my trip to Berlin",
  "plan_summary": "3 sub-tasks: weather, transit, calendar",
  "sub_agents": [
    { "id": "sa-1", "description": "Check Berlin weather", "status": "completed" },
    { "id": "sa-2", "description": "Search transit", "status": "completed" },
    { "id": "sa-3", "description": "Create events", "status": "completed" }
  ],
  "result": "Your Berlin trip is planned: ...",
  "created_at": "2026-03-03T10:00:00Z"
}
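Clients that do not use the streaming endpoint can poll this endpoint until the task reaches a final state. The sketch below is written under assumptions: only `planning`, `completed`, and `cancelled` appear in this document, so the `failed` status is a guess, and `wait_for_task` is a hypothetical helper. The HTTP call is injected as a callable so the loop stays independent of any HTTP library.

```python
import time
from typing import Any, Callable, Dict

# "completed" and "cancelled" appear in this document; "failed" is an assumption.
TERMINAL_STATUSES = {"completed", "cancelled", "failed"}


def wait_for_task(fetch_status: Callable[[], Dict[str, Any]],
                  interval: float = 1.0,
                  max_polls: int = 60,
                  sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch_status until the task is in a terminal state, then return it."""
    for attempt in range(max_polls):
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if attempt < max_polls - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"task still running after {max_polls} polls")
```

Here `fetch_status` would wrap a GET request to the task URL and return the decoded JSON body.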

GET /v1/agentic/tasks/{task_id}/stream

Stream real-time progress updates via Server-Sent Events (SSE).

Authentication: Required

Response: 200 OK (text/event-stream)

event: task_started
data: {"task_id": "550e8400-...", "description": "Plan my trip to Berlin"}

event: plan_created
data: {"task_id": "550e8400-...", "sub_tasks": [...]}

event: sub_agent_started
data: {"sub_agent_id": "sa-1", "description": "Check Berlin weather"}

event: sub_agent_completed
data: {"sub_agent_id": "sa-1", "result": "15°C, partly cloudy"}

event: task_completed
data: {"task_id": "550e8400-...", "result": "Your Berlin trip is planned: ..."}

POST /v1/agentic/tasks/{task_id}/cancel

Cancel a running agentic task and all its sub-agents.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "cancelled"
}

System

GET /v1/system/status

Get system status and resource usage.

Authentication: Required

Response: 200 OK

{
  "version": "0.1.0",
  "uptime_seconds": 86400,
  "environment": "production",
  "resources": {
    "memory_used_mb": 256,
    "cpu_percent": 15.5,
    "database_size_mb": 42
  },
  "statistics": {
    "requests_total": 15420,
    "inference_requests": 8930,
    "cache_hit_rate": 0.73
  }
}

GET /v1/system/models

List available inference models.

Authentication: Required

Response: 200 OK

{
  "models": [
    {
      "id": "qwen2.5-1.5b-instruct",
      "name": "Qwen 2.5 1.5B Instruct",
      "parameters": "1.5B",
      "context_length": 4096,
      "default": true
    },
    {
      "id": "llama3.2-1b-instruct",
      "name": "Llama 3.2 1B Instruct",
      "parameters": "1B",
      "context_length": 4096,
      "default": false
    }
  ]
}

Webhooks

POST /v1/webhooks/whatsapp

WhatsApp webhook endpoint for incoming messages.

Authentication: Signature verification via X-Hub-Signature-256 header

Verification Request (GET):

GET /v1/webhooks/whatsapp?hub.mode=subscribe&hub.verify_token=your-token&hub.challenge=challenge123

Response: The hub.challenge value

Message Webhook (POST):

{
  "object": "whatsapp_business_account",
  "entry": [{
    "changes": [{
      "value": {
        "messages": [{
          "from": "+1234567890",
          "type": "text",
          "text": { "body": "Hello" }
        }]
      }
    }]
  }]
}

Response: 200 OK
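The signature check can be sketched as follows, assuming the usual Meta webhook convention: the `X-Hub-Signature-256` header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body, keyed with the app secret. This document does not confirm that PiSovereign implements the check exactly this way, and `verify_signature` is a hypothetical helper name.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, header_value: str, app_secret: str) -> bool:
    """Check an X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run on the bytes exactly as received, because re-serializing the parsed JSON can change whitespace and invalidate the signature.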


Metrics

GET /metrics

JSON metrics for monitoring.

Authentication: None required

Response: 200 OK

{
  "uptime_seconds": 86400,
  "http": {
    "requests_total": 15420,
    "requests_success": 15100,
    "requests_client_error": 280,
    "requests_server_error": 40,
    "active_requests": 3,
    "response_time_avg_ms": 125
  },
  "inference": {
    "requests_total": 8930,
    "requests_success": 8850,
    "requests_failed": 80,
    "time_avg_ms": 450,
    "tokens_total": 1250000,
    "healthy": true
  }
}

GET /metrics/prometheus

Prometheus-compatible metrics.

Authentication: None required

Response: 200 OK (text/plain)

# HELP app_uptime_seconds Application uptime in seconds
# TYPE app_uptime_seconds counter
app_uptime_seconds 86400

# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{status="success"} 15100
http_requests_total{status="client_error"} 280
http_requests_total{status="server_error"} 40

# HELP inference_time_ms_bucket Inference time histogram
# TYPE inference_time_ms_bucket histogram
inference_time_ms_bucket{le="100"} 1200
inference_time_ms_bucket{le="250"} 4500
inference_time_ms_bucket{le="500"} 7200
inference_time_ms_bucket{le="1000"} 8500
inference_time_ms_bucket{le="+Inf"} 8930
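For quick checks without a Prometheus server, the text output can be parsed directly. This is a minimal sketch that handles only the simple `name{labels} value` lines shown above; it ignores comments, timestamps, and escaped label values, and `parse_metrics` is a hypothetical helper name.

```python
import re
from typing import Dict, Tuple

LabelSet = Tuple[Tuple[str, str], ...]

_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> Dict[Tuple[str, LabelSet], float]:
    """Map (metric_name, sorted_label_pairs) to the sample value."""
    samples: Dict[Tuple[str, LabelSet], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, labels, value = match.groups()
        key = (name, tuple(sorted(_LABEL.findall(labels or ""))))
        samples[key] = float(value)
    return samples
```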

Error Handling

Error Response Format

All errors follow this format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "details": {},
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Error Codes

HTTP Status  Code                 Description
400          BAD_REQUEST          Invalid request body or parameters
401          UNAUTHORIZED         Missing or invalid authentication
403          FORBIDDEN            Authenticated but not authorized
404          NOT_FOUND            Resource not found
422          VALIDATION_ERROR     Request validation failed
429          RATE_LIMITED         Too many requests
500          INTERNAL_ERROR       Server error
502          UPSTREAM_ERROR       External service error
503          SERVICE_UNAVAILABLE  Service temporarily unavailable

Validation Errors

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed",
    "details": {
      "fields": [
        {"field": "message", "error": "cannot be empty"},
        {"field": "temperature", "error": "must be between 0.0 and 2.0"}
      ]
    }
  }
}
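Because every error shares one envelope, a client can map it to a single exception type. A minimal sketch, assuming the envelope shown above; `ApiError` and `parse_error` are hypothetical names, and the `UNKNOWN` fallback code is an assumption for bodies that lack an `error` object.

```python
from typing import Any, Dict, Optional


class ApiError(Exception):
    """Client-side representation of the error envelope."""

    def __init__(self, status: int, code: str, message: str,
                 details: Optional[Dict[str, Any]] = None,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details or {}
        self.request_id = request_id

    def field_errors(self) -> Dict[str, str]:
        """Return {field: error} for validation errors, else an empty dict."""
        return {f["field"]: f["error"] for f in self.details.get("fields", [])}


def parse_error(status: int, body: Dict[str, Any]) -> ApiError:
    """Build an ApiError from an HTTP status and a decoded JSON error body."""
    err = body.get("error") or {}
    return ApiError(
        status,
        err.get("code", "UNKNOWN"),  # assumed fallback, not defined by the API
        err.get("message", ""),
        err.get("details"),
        err.get("request_id"),
    )
```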

OpenAPI Specification

Interactive Documentation

When the server is running, access interactive API documentation:

  • Swagger UI: http://localhost:3000/swagger-ui/
  • ReDoc: http://localhost:3000/redoc/

Export OpenAPI Spec

# Via CLI
pisovereign-cli openapi --output openapi.json

# Via API (if enabled)
curl http://localhost:3000/api-docs/openapi.json

OpenAPI 3.1 Specification

The full specification is available at:

  • Development: /api-docs/openapi.json
  • GitHub Pages: /api/openapi.json

Example OpenAPI Excerpt

openapi: 3.1.0
info:
  title: PiSovereign API
  description: Local AI Assistant REST API
  version: 0.1.0
  license:
    name: MIT
    url: https://opensource.org/licenses/MIT

servers:
  - url: http://localhost:3000
    description: Development server

security:
  - bearerAuth: []

paths:
  /v1/chat:
    post:
      summary: Send chat message
      operationId: chat
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
        '401':
          $ref: '#/components/responses/Unauthorized'

components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: API key authentication

  schemas:
    ChatRequest:
      type: object
      required:
        - message
      properties:
        message:
          type: string
          description: User message
          example: "What's the weather?"
        conversation_id:
          type: string
          format: uuid
          description: Continue existing conversation

SDK Examples

cURL

# Chat
curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

# Command
curl -X POST http://localhost:3000/v1/commands \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"command": "briefing"}'

Python

import requests

API_URL = "http://localhost:3000"
API_KEY = "sk-abc123"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# Chat
response = requests.post(
    f"{API_URL}/v1/chat",
    headers=headers,
    json={"message": "What's the weather?"}
)
print(response.json()["content"])

JavaScript/TypeScript

const API_URL = "http://localhost:3000";
const API_KEY = "sk-abc123";

async function chat(message: string): Promise<string> {
  const response = await fetch(`${API_URL}/v1/chat`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ message }),
  });
  
  const data = await response.json();
  return data.content;
}