API Reference

📡 REST API documentation for PiSovereign

This document provides complete REST API documentation including authentication, endpoints, and the OpenAPI specification.

Overview
Authentication
- API Key Authentication
- Error Responses
Rate Limiting
Endpoints
- Health & Status
- Chat
- Commands
- System Command Catalog
- Memory
- Agentic Tasks
- System
- Webhooks
- Metrics
Error Handling
OpenAPI Specification

Authentication

API Key Authentication

Protected endpoints require an API key in the Authorization header:

Authorization: Bearer sk-your-api-key

Configuration

API keys are mapped to user IDs in config.toml:

[security.api_key_users]
"sk-abc123def456" = "550e8400-e29b-41d4-a716-446655440000"
"sk-xyz789ghi012" = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

Example Request

curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123def456" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

Authentication Errors

Status	Code	Description
401	`UNAUTHORIZED`	Missing or invalid API key
403	`FORBIDDEN`	Valid key, but action not allowed

{
  "error": {
    "code": "UNAUTHORIZED",
    "message": "Invalid or missing API key",
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Rate Limiting

Rate limiting is applied per IP address.

Configuration	Default
`rate_limit_rpm`	120 requests/minute

Headers

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1707321600

Rate Limited Response

HTTP/1.1 429 Too Many Requests
Retry-After: 30

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Please retry after 30 seconds.",
    "retry_after": 30
  }
}
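
A client can use these fields to decide how long to back off. The following is a minimal sketch, assuming `Retry-After` carries a number of seconds (as in the example above) rather than an HTTP date. The function name and the order in which the three sources are consulted are illustrative choices, not documented behaviour.

```python
import time


def seconds_until_retry(status_code, headers, body, now=None):
    """Return how long to wait before retrying, or 0.0 if not rate limited.

    Preference order (an assumption): Retry-After header, then
    error.retry_after in the JSON body, then the X-RateLimit-Reset
    Unix timestamp.
    """
    if status_code != 429:
        return 0.0
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    retry_after = (body or {}).get("error", {}).get("retry_after")
    if retry_after is not None:
        return float(retry_after)
    if "X-RateLimit-Reset" in headers:
        now = time.time() if now is None else now
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 1.0  # arbitrary fallback delay when no hint is present


print(seconds_until_retry(429, {"Retry-After": "30"}, None))
```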

Endpoints

Health & Status

GET /health

Liveness probe. Returns 200 if the server is running.

Authentication: None required

Response: 200 OK

{
  "status": "ok"
}

GET /ready

Readiness probe with inference engine status.

Authentication: None required

Response: 200 OK (healthy) or 503 Service Unavailable

{
  "status": "ready",
  "inference": {
    "healthy": true,
    "model": "qwen2.5-1.5b-instruct",
    "latency_ms": 45
  }
}

GET /ready/all

Extended health check with all service statuses.

Authentication: None required

Response: 200 OK

{
  "status": "ready",
  "services": {
    "inference": { "healthy": true, "latency_ms": 45 },
    "database": { "healthy": true, "latency_ms": 2 },
    "cache": { "healthy": true },
    "whatsapp": { "healthy": true, "latency_ms": 120 },
    "email": { "healthy": true, "latency_ms": 89 },
    "calendar": { "healthy": true, "latency_ms": 35 },
    "weather": { "healthy": true, "latency_ms": 180 }
  },
  "latency_percentiles": {
    "p50_ms": 45,
    "p90_ms": 120,
    "p99_ms": 250
  }
}
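
A monitoring script might reduce this response to the services that need attention. The sketch below works on an already-decoded response body; the helper names are hypothetical, and the sample data is a shortened copy of the response above with `weather` marked unhealthy for demonstration.

```python
def unhealthy_services(ready_all):
    """Return sorted names of services whose "healthy" flag is not true."""
    services = ready_all.get("services", {})
    return sorted(name for name, info in services.items() if not info.get("healthy"))


def slowest_service(ready_all):
    """Return (name, latency_ms) for the slowest service that reports latency."""
    timed = [
        (name, info["latency_ms"])
        for name, info in ready_all.get("services", {}).items()
        if "latency_ms" in info  # "cache" reports no latency in the example
    ]
    return max(timed, key=lambda item: item[1]) if timed else None


sample = {
    "status": "ready",
    "services": {
        "inference": {"healthy": True, "latency_ms": 45},
        "database": {"healthy": True, "latency_ms": 2},
        "cache": {"healthy": True},
        "weather": {"healthy": False, "latency_ms": 180},  # altered for the demo
    },
}
print(unhealthy_services(sample))
print(slowest_service(sample))
```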

Chat

POST /v1/chat

Send a message and receive a response.

Authentication: Required

Request Body:

Field	Type	Required	Description
`message`	string	Yes	User message
`conversation_id`	string	No	Continue existing conversation
`system_prompt`	string	No	Override system prompt
`model`	string	No	Override default model
`temperature`	float	No	Sampling temperature (0.0-2.0)
`max_tokens`	integer	No	Maximum response tokens

{
  "message": "What's the weather in Berlin?",
  "conversation_id": "conv-123",
  "temperature": 0.7
}

Response: 200 OK

{
  "id": "msg-456",
  "conversation_id": "conv-123",
  "role": "assistant",
  "content": "Currently in Berlin, it's 15°C with partly cloudy skies...",
  "model": "qwen2.5-1.5b-instruct",
  "tokens": {
    "prompt": 45,
    "completion": 128,
    "total": 173
  },
  "created_at": "2026-02-07T10:30:00Z"
}
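
To keep request bodies consistent with the field table above, a client could assemble them with a small helper. This is a sketch under stated assumptions: `build_chat_payload` is a hypothetical name, and the client-side checks simply mirror the documented temperature range and the non-empty `message` rule from the Validation Errors example; the server remains the authority on validation.

```python
def build_chat_payload(message, conversation_id=None, system_prompt=None,
                       model=None, temperature=None, max_tokens=None):
    """Build a /v1/chat request body, omitting optional fields left unset."""
    if not message:
        raise ValueError("message cannot be empty")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    optional = {
        "conversation_id": conversation_id,
        "system_prompt": system_prompt,
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    payload = {"message": message}
    # Only send fields the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


print(build_chat_payload("What's the weather in Berlin?",
                         conversation_id="conv-123", temperature=0.7))
```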

POST /v1/chat/stream

Streaming chat using Server-Sent Events (SSE).

Authentication: Required

Request Body: Same as /v1/chat

Response: 200 OK (text/event-stream)

event: message
data: {"delta": "Currently"}

event: message
data: {"delta": " in Berlin"}

event: message
data: {"delta": ", it's 15°C"}

event: done
data: {"tokens": {"prompt": 45, "completion": 128, "total": 173}}

Example (JavaScript):

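// EventSource only issues GET requests and cannot set headers or a body,
// so this example uses fetch and reads the SSE stream manually (Node 18+, ES module).
const response = await fetch('http://localhost:3000/v1/chat/stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk-...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ message: 'Hello' })
});

const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buffer = '';
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += value;
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep any incomplete trailing line for the next chunk
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const data = JSON.parse(line.slice(6));
    if (data.delta) process.stdout.write(data.delta); // the "done" event has no delta
  }
}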
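
The event format shown above can also be consumed from Python. The sketch below parses only the `event:` and `data:` fields that appear in the example and is not a complete SSE implementation; the function names are hypothetical. It works on any iterable of text lines, for instance the lines of a streamed HTTP response.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event_name, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:  # stream ended without a trailing blank line
        yield event_name, json.loads("\n".join(data_lines))


def collect_reply(lines):
    """Concatenate message deltas; return (text, token usage or None)."""
    parts, tokens = [], None
    for event_name, data in iter_sse_events(lines):
        if event_name == "message":
            parts.append(data["delta"])
        elif event_name == "done":
            tokens = data.get("tokens")
    return "".join(parts), tokens


# The first two message events and the done event from the example above.
stream = [
    'event: message', 'data: {"delta": "Currently"}', '',
    'event: message', 'data: {"delta": " in Berlin"}', '',
    'event: done',
    'data: {"tokens": {"prompt": 45, "completion": 128, "total": 173}}', '',
]
print(collect_reply(stream))
```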

Commands

POST /v1/commands

Execute a command and get the result.

Authentication: Required

Request Body:

Field	Type	Required	Description
`command`	string	Yes	Command to execute
`args`	object	No	Command arguments

{
  "command": "briefing"
}

Response: 200 OK

{
  "command": "MorningBriefing",
  "status": "completed",
  "result": {
    "weather": "15°C, partly cloudy",
    "calendar": [
      {"time": "09:00", "title": "Team standup"},
      {"time": "14:00", "title": "Client meeting"}
    ],
    "emails": {
      "unread": 5,
      "important": 2
    }
  },
  "executed_at": "2026-02-07T07:00:00Z"
}

Available Commands:

Command	Description	Arguments
`briefing`	Morning briefing	None
`weather`	Current weather	`location` (optional)
`calendar`	Today’s events	`days` (default: 1)
`emails`	Email summary	`count` (default: 10)
`help`	List commands	None

POST /v1/commands/parse

Parse a command without executing it.

Authentication: Required

Request Body:

{
  "input": "create meeting tomorrow at 3pm"
}

Response: 200 OK

{
  "parsed": true,
  "command": {
    "type": "CreateCalendarEvent",
    "title": "meeting",
    "start": "2026-02-08T15:00:00Z",
    "end": "2026-02-08T16:00:00Z"
  },
  "confidence": 0.92,
  "requires_approval": true
}
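
One way a client might act on this response is to gate execution on the `parsed`, `confidence`, and `requires_approval` fields. The following sketch is illustrative only: `next_action` and the 0.8 threshold are assumptions made for the example, not PiSovereign settings.

```python
def next_action(parse_result, min_confidence=0.8):
    """Classify a /v1/commands/parse response as 'reject', 'confirm' or 'execute'.

    min_confidence is an illustrative threshold chosen for this sketch.
    """
    if not parse_result.get("parsed"):
        return "reject"
    if parse_result.get("confidence", 0.0) < min_confidence:
        return "confirm"  # too uncertain to run without asking the user
    if parse_result.get("requires_approval"):
        return "confirm"  # the server flagged the command as needing approval
    return "execute"


print(next_action({"parsed": True, "confidence": 0.92, "requires_approval": True}))
```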

System Command Catalog

The system command catalog provides a discoverable set of shell commands that can be executed on the host system. On first startup, PiSovereign automatically populates the catalog with 32 default commands (disk usage, system info, network tools, etc.), which are stored in PostgreSQL.

GET /v1/commands/catalog

List all commands in the catalog.

Authentication: Required

Query Parameters:

Parameter	Type	Required	Description
`limit`	integer	No	Maximum results (default: 100)
`offset`	integer	No	Pagination offset (default: 0)

Response: 200 OK

[
  {
    "id": "default-disk-free",
    "name": "Disk Free Space",
    "description": "Show available disk space on all mounts",
    "command": "df -h",
    "category": "filesystem",
    "risk_level": "safe",
    "os": "linux",
    "requires_approval": false,
    "created_at": "2026-02-24T08:50:08Z",
    "updated_at": "2026-02-24T08:50:08Z"
  }
]
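
The `limit` and `offset` parameters allow a client to walk the whole catalog page by page. The sketch below keeps the HTTP call behind a caller-supplied `fetch_page(limit, offset)` function so the paging logic can be shown on its own. Stopping when a page comes back shorter than requested is an assumption about how the endpoint paginates, and all names are hypothetical.

```python
def iter_catalog(fetch_page, page_size=100):
    """Yield every catalog command by walking limit/offset pages."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:  # a short (or empty) page means we are done
            break
        offset += page_size


# Stand-in for the HTTP call, backed by 32 fake entries.
FAKE_CATALOG = [{"id": f"cmd-{i}"} for i in range(32)]


def fake_fetch(limit, offset):
    return FAKE_CATALOG[offset:offset + limit]


print(len(list(iter_catalog(fake_fetch, page_size=10))))
```

In real use, `fetch_page` would issue `GET /v1/commands/catalog` with the two query parameters and return the decoded JSON array.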

GET /v1/commands/catalog/search

Search the catalog by keyword.

Authentication: Required

Query Parameters:

Parameter	Type	Required	Description
`q`	string	Yes	Search query (matches name and description)

Response: 200 OK — returns matching commands (same format as listing).

GET /v1/commands/catalog/count

Get the total number of catalog entries.

Authentication: Required

Response: 200 OK

{
  "count": 32
}

GET /v1/commands/catalog/{id}

Get a specific catalog command by ID.

Authentication: Required

Response: 200 OK — returns a single command object.

POST /v1/commands/catalog

Create a custom catalog command.

Authentication: Required

Request Body:

{
  "name": "Check Logs",
  "description": "Tail the last 100 lines of syslog",
  "command": "tail -n 100 /var/log/syslog",
  "category": "system",
  "risk_level": "safe",
  "os": "linux",
  "requires_approval": false
}

Response: 201 Created

POST /v1/commands/catalog/{id}/execute

Execute a command from the catalog. Commands with requires_approval: true will create an approval request instead of executing immediately.

Authentication: Required

Response: 200 OK

DELETE /v1/commands/catalog/{id}

Delete a catalog command.

Authentication: Required

Response: 204 No Content

Memory

The memory API manages the RAG (Retrieval-Augmented Generation) knowledge store. Memories are automatically used to enrich chat context.

GET /v1/memories

List all stored memories.

Authentication: Required

Response: 200 OK

[
  {
    "id": "uuid",
    "content": "The user prefers dark mode",
    "summary": "UI preference: dark mode",
    "memory_type": "Preference",
    "importance": 0.8,
    "access_count": 5,
    "tags": ["ui", "preference"],
    "created_at": "2026-02-24T08:50:00Z",
    "updated_at": "2026-02-24T09:00:00Z"
  }
]

POST /v1/memories

Create a new memory entry.

Authentication: Required

Request Body:

Field	Type	Required	Description
`content`	string	Yes	Memory content text
`summary`	string	Yes	Short summary
`memory_type`	string	No	Type: `fact`, `preference`, `tool_result`, `correction`, `context` (default: `context`)
`importance`	float	No	Importance score 0.0–1.0 (default: 0.5)
`tags`	string[]	No	Optional tags

Response: 201 Created

GET /v1/memories/search

Search memories by semantic similarity.

Authentication: Required

Query Parameters:

Parameter	Type	Required	Description
`q`	string	Yes	Search query

Response: 200 OK — returns matching memories ranked by relevance.

GET /v1/memories/stats

Get memory storage statistics.

Authentication: Required

Response: 200 OK

{
  "total": 42,
  "by_type": [
    {"memory_type": "Fact", "count": 15},
    {"memory_type": "Preference", "count": 8},
    {"memory_type": "Tool Result", "count": 10},
    {"memory_type": "Correction", "count": 2},
    {"memory_type": "Context", "count": 7}
  ]
}

POST /v1/memories/decay

Trigger a manual memory importance decay cycle. Reduces the importance of older, less-accessed memories.

Authentication: Required

Response: 200 OK

GET /v1/memories/{id}

Get a specific memory by ID.

Authentication: Required

Response: 200 OK

DELETE /v1/memories/{id}

Delete a specific memory.

Authentication: Required

Response: 204 No Content

Agentic Tasks

Multi-agent task orchestration. Decompose complex requests into parallel sub-tasks executed by independent AI agents.

Note: Requires [agentic] enabled = true in config.toml.

POST /v1/agentic/tasks

Create a new agentic task for multi-agent processing.

Authentication: Required

Request Body:

Field	Type	Required	Description
`description`	string	Yes	Task description in natural language
`require_approval`	boolean	No	Require approval before sub-agent execution (default: false)

{
  "description": "Plan my trip to Berlin next week — check weather, find transit options, and create calendar events",
  "require_approval": false
}

Response: 201 Created

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "planning",
  "created_at": "2026-03-03T10:00:00Z"
}

GET /v1/agentic/tasks/{task_id}

Get the current status and results of an agentic task.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "description": "Plan my trip to Berlin",
  "plan_summary": "3 sub-tasks: weather, transit, calendar",
  "sub_agents": [
    { "id": "sa-1", "description": "Check Berlin weather", "status": "completed" },
    { "id": "sa-2", "description": "Search transit", "status": "completed" },
    { "id": "sa-3", "description": "Create events", "status": "completed" }
  ],
  "result": "Your Berlin trip is planned: ...",
  "created_at": "2026-03-03T10:00:00Z"
}
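
Clients that do not use the streaming endpoint can poll this status endpoint until the task finishes. The following is a minimal sketch: `wait_for_task` is a hypothetical helper, the HTTP call sits behind a caller-supplied `fetch_status()` function, and only `completed` and `cancelled` appear as end states in this document, so treating `failed` as terminal is an assumption.

```python
import time

TERMINAL_STATUSES = {"completed", "cancelled", "failed"}  # "failed" is assumed


def wait_for_task(fetch_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_status() until the task reaches a terminal status.

    fetch_status must return the decoded JSON body of the status endpoint.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish in time")
        sleep(interval)


# Simulated sequence of responses in place of real HTTP calls.
responses = iter([
    {"status": "planning"},
    {"status": "completed", "result": "Your Berlin trip is planned: ..."},
])
print(wait_for_task(lambda: next(responses), sleep=lambda seconds: None)["status"])
```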

GET /v1/agentic/tasks/{task_id}/stream

Stream real-time progress updates via Server-Sent Events (SSE).

Authentication: Required

Response: 200 OK (text/event-stream)

event: task_started
data: {"task_id": "550e8400-...", "description": "Plan my trip to Berlin"}

event: plan_created
data: {"task_id": "550e8400-...", "sub_tasks": [...]}

event: sub_agent_started
data: {"sub_agent_id": "sa-1", "description": "Check Berlin weather"}

event: sub_agent_completed
data: {"sub_agent_id": "sa-1", "result": "15°C, partly cloudy"}

event: task_completed
data: {"task_id": "550e8400-...", "result": "Your Berlin trip is planned: ..."}

POST /v1/agentic/tasks/{task_id}/cancel

Cancel a running agentic task and all its sub-agents.

Authentication: Required

Response: 200 OK

{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "cancelled"
}

System

GET /v1/system/status

Get system status and resource usage.

Authentication: Required

Response: 200 OK

{
  "version": "0.1.0",
  "uptime_seconds": 86400,
  "environment": "production",
  "resources": {
    "memory_used_mb": 256,
    "cpu_percent": 15.5,
    "database_size_mb": 42
  },
  "statistics": {
    "requests_total": 15420,
    "inference_requests": 8930,
    "cache_hit_rate": 0.73
  }
}

GET /v1/system/models

List available inference models.

Authentication: Required

Response: 200 OK

{
  "models": [
    {
      "id": "qwen2.5-1.5b-instruct",
      "name": "Qwen 2.5 1.5B Instruct",
      "parameters": "1.5B",
      "context_length": 4096,
      "default": true
    },
    {
      "id": "llama3.2-1b-instruct",
      "name": "Llama 3.2 1B Instruct",
      "parameters": "1B",
      "context_length": 4096,
      "default": false
    }
  ]
}

Webhooks

POST /v1/webhooks/whatsapp

WhatsApp webhook endpoint for incoming messages.

Authentication: Signature verification via X-Hub-Signature-256 header

Verification Request (GET):

GET /v1/webhooks/whatsapp?hub.mode=subscribe&hub.verify_token=your-token&hub.challenge=challenge123

Response: The hub.challenge value

Message Webhook (POST):

{
  "object": "whatsapp_business_account",
  "entry": [{
    "changes": [{
      "value": {
        "messages": [{
          "from": "+1234567890",
          "type": "text",
          "text": { "body": "Hello" }
        }]
      }
    }]
  }]
}

Response: 200 OK
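
For testing an integration, it can help to reproduce the signature check and message extraction locally. The sketch below assumes the usual Meta webhook convention, in which `X-Hub-Signature-256` holds `sha256=` followed by the hex HMAC-SHA256 of the raw request body keyed with the app secret. How PiSovereign stores that secret is not covered here, and the function names are hypothetical.

```python
import hashlib
import hmac


def verify_signature(app_secret, raw_body, signature_header):
    """Check an X-Hub-Signature-256 value of the form 'sha256=<hex digest>'."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    provided = signature_header[len("sha256="):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), provided.encode())


def extract_text_messages(payload):
    """Return (sender, body) pairs for text messages in a webhook payload."""
    found = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            for msg in change.get("value", {}).get("messages", []):
                if msg.get("type") == "text":
                    found.append((msg["from"], msg["text"]["body"]))
    return found


sample_payload = {
    "object": "whatsapp_business_account",
    "entry": [{"changes": [{"value": {"messages": [
        {"from": "+1234567890", "type": "text", "text": {"body": "Hello"}}
    ]}}]}],
}
print(extract_text_messages(sample_payload))
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization.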

Metrics

GET /metrics

JSON metrics for monitoring.

Authentication: None required

Response: 200 OK

{
  "uptime_seconds": 86400,
  "http": {
    "requests_total": 15420,
    "requests_success": 15100,
    "requests_client_error": 280,
    "requests_server_error": 40,
    "active_requests": 3,
    "response_time_avg_ms": 125
  },
  "inference": {
    "requests_total": 8930,
    "requests_success": 8850,
    "requests_failed": 80,
    "time_avg_ms": 450,
    "tokens_total": 1250000,
    "healthy": true
  }
}

GET /metrics/prometheus

Prometheus-compatible metrics.

Authentication: None required

Response: 200 OK (text/plain)

# HELP app_uptime_seconds Application uptime in seconds
# TYPE app_uptime_seconds counter
app_uptime_seconds 86400

# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{status="success"} 15100
http_requests_total{status="client_error"} 280
http_requests_total{status="server_error"} 40

# HELP inference_time_ms_bucket Inference time histogram
# TYPE inference_time_ms_bucket histogram
inference_time_ms_bucket{le="100"} 1200
inference_time_ms_bucket{le="250"} 4500
inference_time_ms_bucket{le="500"} 7200
inference_time_ms_bucket{le="1000"} 8500
inference_time_ms_bucket{le="+Inf"} 8930
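
Histogram buckets in this format are cumulative: each `le` series counts every observation at or below that bound. As a small worked example, the sketch below parses the bucket lines above and converts them to per-interval counts. The regular expression is written for exactly the series name shown here, and the helper names are hypothetical.

```python
import re

BUCKET_RE = re.compile(r'^inference_time_ms_bucket\{le="([^"]+)"\}\s+(\d+)$')


def parse_buckets(text):
    """Return [(upper_bound, cumulative_count)] sorted by upper bound."""
    buckets = []
    for line in text.splitlines():
        match = BUCKET_RE.match(line.strip())
        if match:
            # float() accepts "+Inf" and returns infinity.
            buckets.append((float(match.group(1)), int(match.group(2))))
    return sorted(buckets)


def per_bucket_counts(buckets):
    """Convert cumulative counts to the count that falls in each interval."""
    counts, previous = [], 0
    for bound, cumulative in buckets:
        counts.append((bound, cumulative - previous))
        previous = cumulative
    return counts


metrics_text = '''
inference_time_ms_bucket{le="100"} 1200
inference_time_ms_bucket{le="250"} 4500
inference_time_ms_bucket{le="500"} 7200
inference_time_ms_bucket{le="1000"} 8500
inference_time_ms_bucket{le="+Inf"} 8930
'''
print(per_bucket_counts(parse_buckets(metrics_text)))
```

With the sample values, 3300 requests (4500 minus 1200) took between 100 ms and 250 ms, and 430 took longer than one second.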

Error Handling

Error Response Format

All errors follow this format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "details": {},
    "request_id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Error Codes

HTTP Status	Code	Description
400	`BAD_REQUEST`	Invalid request body or parameters
401	`UNAUTHORIZED`	Missing or invalid authentication
403	`FORBIDDEN`	Authenticated but not authorized
404	`NOT_FOUND`	Resource not found
422	`VALIDATION_ERROR`	Request validation failed
429	`RATE_LIMITED`	Too many requests
500	`INTERNAL_ERROR`	Server error
502	`UPSTREAM_ERROR`	External service error
503	`SERVICE_UNAVAILABLE`	Service temporarily unavailable

Validation Errors

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed",
    "details": {
      "fields": [
        {"field": "message", "error": "cannot be empty"},
        {"field": "temperature", "error": "must be between 0.0 and 2.0"}
      ]
    }
  }
}
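
Because every error shares this envelope, a client can convert failed responses into a single exception type. The following sketch assumes the structure shown in this section; `ApiError` and `error_from_response` are hypothetical names, and the `UNKNOWN` fallback code is an invention for responses that lack the envelope.

```python
class ApiError(Exception):
    """Client-side representation of the documented error envelope."""

    def __init__(self, status, code, message, field_errors=None, request_id=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.field_errors = field_errors or {}
        self.request_id = request_id


def error_from_response(status, body):
    """Build an ApiError from an HTTP status and a decoded JSON error body."""
    error = (body or {}).get("error", {})
    fields = error.get("details", {}).get("fields", [])
    return ApiError(
        status,
        error.get("code", "UNKNOWN"),  # fallback code is not part of the API
        error.get("message", ""),
        field_errors={item["field"]: item["error"] for item in fields},
        request_id=error.get("request_id"),
    )


validation_body = {
    "error": {
        "code": "VALIDATION_ERROR",
        "message": "Request validation failed",
        "details": {"fields": [
            {"field": "message", "error": "cannot be empty"},
            {"field": "temperature", "error": "must be between 0.0 and 2.0"},
        ]},
    }
}
err = error_from_response(422, validation_body)
print(err, err.field_errors)
```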

OpenAPI Specification

Interactive Documentation

When the server is running, access interactive API documentation:

Swagger UI: http://localhost:3000/swagger-ui/
ReDoc: http://localhost:3000/redoc/

Export OpenAPI Spec

# Via CLI
pisovereign-cli openapi --output openapi.json

# Via API (if enabled)
curl http://localhost:3000/api-docs/openapi.json

OpenAPI 3.1 Specification

The full specification is available at:

Development: /api-docs/openapi.json
GitHub Pages: /api/openapi.json

Example OpenAPI Excerpt

openapi: 3.1.0
info:
  title: PiSovereign API
  description: Local AI Assistant REST API
  version: 0.1.0
  license:
    name: MIT
    url: https://opensource.org/licenses/MIT

servers:
  - url: http://localhost:3000
    description: Development server

security:
  - bearerAuth: []

paths:
  /v1/chat:
    post:
      summary: Send chat message
      operationId: chat
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
        '401':
          $ref: '#/components/responses/Unauthorized'

components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: API key authentication

  schemas:
    ChatRequest:
      type: object
      required:
        - message
      properties:
        message:
          type: string
          description: User message
          example: "What's the weather?"
        conversation_id:
          type: string
          format: uuid
          description: Continue existing conversation

SDK Examples

cURL

# Chat
curl -X POST http://localhost:3000/v1/chat \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

# Command
curl -X POST http://localhost:3000/v1/commands \
  -H "Authorization: Bearer sk-abc123" \
  -H "Content-Type: application/json" \
  -d '{"command": "briefing"}'

Python

import requests

API_URL = "http://localhost:3000"
API_KEY = "sk-abc123"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# Chat
response = requests.post(
    f"{API_URL}/v1/chat",
    headers=headers,
    json={"message": "What's the weather?"}
)
print(response.json()["content"])

JavaScript/TypeScript

const API_URL = "http://localhost:3000";
const API_KEY = "sk-abc123";

async function chat(message: string): Promise<string> {
  const response = await fetch(`${API_URL}/v1/chat`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ message }),
  });
  
  const data = await response.json();
  return data.content;
}
