Overview

z3rno-server is a FastAPI application that exposes Z3rno’s memory engine over HTTP. It imports z3rno-core for business logic and adds authentication, rate limiting, audit logging, and async task processing.

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/memories | Store a new memory |
| POST | /v1/memories/recall | Semantic search over memories |
| DELETE | /v1/memories/{id} | Soft-delete a memory |
| DELETE | /v1/memories/batch | Batch delete memories |
| DELETE | /v1/memories/{id}/gdpr | GDPR hard delete |
| GET | /v1/memories/{id}/versions | Temporal version history |
| POST | /v1/memories/recall/as-of | Point-in-time recall |
| GET | /v1/audit | Query audit log |
| GET | /v1/health | Health check |
| GET | /v1/health/ready | Readiness probe (DB connected) |

Middleware Chain

Requests pass through these layers in order:
Request → CORS → Request ID → Rate Limiter → Auth → RLS Injection → Handler → Audit Log → Response
  1. CORS — Configurable allowed origins
  2. Request ID — Attaches X-Request-Id for tracing
  3. Rate Limiter — Token bucket per API key (default: 100 req/min)
  4. Auth — Validates Authorization: Bearer <api_key>, resolves tenant
  5. RLS Injection — Sets app.current_tenant on the DB connection
  6. Audit Log — Records operation, latency, and result count
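
The ordering above can be pictured as nested wrappers: each inbound layer does its work and then calls the next one, while the audit step runs after the handler has returned. The sketch below is framework-free Python, not z3rno-server's actual code. `wrap`, `with_audit`, and `build_chain` are hypothetical names, and only the order of the layers is taken from the list above.

```python
from typing import Callable

Handler = Callable[[dict], dict]

# Inbound order taken from the chain above; everything else is illustrative.
INBOUND_LAYERS = ["cors", "request_id", "rate_limiter", "auth", "rls_injection"]


def wrap(name: str, inner: Handler) -> Handler:
    """Return a handler that records `name`, then delegates to `inner`."""
    def layer(request: dict) -> dict:
        request["trace"].append(name)
        return inner(request)
    return layer


def with_audit(inner: Handler) -> Handler:
    """Run the inner handler first, then record the audit step on the way out."""
    def layer(request: dict) -> dict:
        response = inner(request)
        request["trace"].append("audit_log")
        return response
    return layer


def build_chain(handler: Handler) -> Handler:
    # Audit wraps the handler directly, so it fires right after the handler returns.
    app = with_audit(handler)
    # Wrap from the innermost layer outward so the first listed layer runs first.
    for name in reversed(INBOUND_LAYERS):
        app = wrap(name, app)
    return app


def handler(request: dict) -> dict:
    request["trace"].append("handler")
    return {"status": 200}


request = {"trace": []}
response = build_chain(handler)(request)
print(" → ".join(request["trace"]))
```

If the real layers are registered with FastAPI's `add_middleware`, keep in mind that the most recently added middleware becomes the outermost one, so registration order would be the reverse of the request order shown here.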

Authentication

API keys are passed via the Authorization header:
curl -X POST https://api.z3rno.dev/v1/memories \
  -H "Authorization: Bearer z3_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "agent-1", "content": "User likes Python"}'
Keys are prefixed: z3_live_ (production) or z3_test_ (sandbox). Sandbox keys operate on isolated data and have relaxed rate limits.
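
For callers working in Python, the curl request above can be built with the standard library. This is a sketch rather than an official client: `key_environment` and `build_store_request` are hypothetical helpers, `z3_test_example` is a placeholder key, and the request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.z3rno.dev"  # host taken from the curl example above


def key_environment(api_key: str) -> str:
    """Classify a key by its documented prefix."""
    if api_key.startswith("z3_live_"):
        return "production"
    if api_key.startswith("z3_test_"):
        return "sandbox"
    raise ValueError("unrecognized API key prefix")


def build_store_request(api_key: str, agent_id: str, content: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/memories request."""
    body = json.dumps({"agent_id": agent_id, "content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memories",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder sandbox key, for illustration only.
req = build_store_request("z3_test_example", "agent-1", "User likes Python")
print(req.get_method(), req.full_url, key_environment("z3_test_example"))
# To send it: urllib.request.urlopen(req), which needs a valid key and network access.
```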

Rate Limiting

| Tier | Requests/min | Burst |
| --- | --- | --- |
| Free | 30 | 50 |
| Pro | 300 | 500 |
| Enterprise | Custom | Custom |

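
One way to read the two columns is as token bucket parameters: Requests/min as the refill rate and Burst as the bucket capacity. That mapping is an assumption, and the sketch below is a generic in-memory token bucket, not the server's limiter (which would more plausibly keep its state in Valkey). `TokenBucket` and `check` are hypothetical names.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Token bucket: refills at `rate_per_min`, holds at most `burst` tokens."""

    def __init__(self, rate_per_min: float, burst: float, now: Optional[float] = None) -> None:
        self.rate_per_sec = rate_per_min / 60.0
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; False means the caller would get a 429."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per API key, using the Free tier numbers from the table.
buckets: Dict[str, TokenBucket] = {}


def check(api_key: str, now: Optional[float] = None) -> bool:
    if api_key not in buckets:
        buckets[api_key] = TokenBucket(rate_per_min=30, burst=50, now=now)
    return buckets[api_key].allow(now)


# 60 back-to-back requests at the same instant: only the burst of 50 passes.
allowed = sum(check("z3_test_example", now=0.0) for _ in range(60))
print(allowed)
```

Under this reading, a Free tier key can spend up to 50 requests at once and then regains one request every two seconds.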
Rate limit headers are returned on every response:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 297
X-RateLimit-Reset: 1706140800
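
A client can use these headers to decide how long to back off. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the text does not state. `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying, or 0.0 if quota remains.

    `headers` is treated as a plain mapping with the exact header names shown
    above; a real HTTP client would normally give case-insensitive access.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers["X-RateLimit-Reset"])  # assumed Unix epoch seconds
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)


sample = {
    "X-RateLimit-Limit": "300",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1706140800",
}
# Pretend the current time is 30 seconds before the reset.
print(seconds_until_reset(sample, now=1706140770))
```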

Celery Workers

Background tasks are processed by Celery workers with Valkey as the broker:

| Task | Schedule | Description |
| --- | --- | --- |
| ttl_expiration | Every 1 min | Expire memories past their TTL |
| importance_decay | Every 1 hour | Reduce importance of unused memories |
| embedding_backfill | On demand | Re-embed memories after model upgrade |
| graph_maintenance | Every 6 hours | Prune orphaned graph edges |

# Start a worker
celery -A z3rno_server.tasks worker --loglevel=info --concurrency=4
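
The recurring schedules in the table could be expressed as a Celery beat schedule, where each entry names a task and an interval in seconds. This is a guess at the shape, not the project's configuration: the dotted task paths are assumed from the module name in the worker command, and the entry keys are made up for illustration.

```python
# Hypothetical beat schedule; task paths are assumed, not taken from z3rno-server.
MINUTE = 60.0
HOUR = 60 * MINUTE

beat_schedule = {
    "ttl-expiration": {
        "task": "z3rno_server.tasks.ttl_expiration",
        "schedule": 1 * MINUTE,
    },
    "importance-decay": {
        "task": "z3rno_server.tasks.importance_decay",
        "schedule": 1 * HOUR,
    },
    "graph-maintenance": {
        "task": "z3rno_server.tasks.graph_maintenance",
        "schedule": 6 * HOUR,
    },
}
# embedding_backfill runs on demand, so it has no schedule entry.

# With a Celery app object, the dict would be applied as:
#   app.conf.beat_schedule = beat_schedule
for name, entry in beat_schedule.items():
    print(f"{name}: every {entry['schedule']:.0f}s")
```

If the schedules are in fact driven by Celery beat, a scheduler process would need to run alongside the worker (for example `celery -A z3rno_server.tasks beat`), since workers on their own only consume tasks.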

Configuration

Environment variables:
# Required
DATABASE_URL=postgresql+asyncpg://user:pass@host:5432/z3rno
VALKEY_URL=redis://localhost:6379/0

# Optional
Z3RNO_EMBEDDING_MODEL=text-embedding-3-small   # OpenAI model
Z3RNO_EMBEDDING_API_KEY=sk-...                  # OpenAI key
Z3RNO_RATE_LIMIT_DEFAULT=100                    # req/min
Z3RNO_CORS_ORIGINS=https://app.z3rno.dev        # comma-separated
Z3RNO_LOG_LEVEL=info                            # debug|info|warning|error
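
To show how these variables might be consumed, here is a small loader that fails fast when a required variable is missing. It is a sketch, not the server's settings module: `Settings` and `load_settings` are hypothetical, and the values shown above for the optional variables are treated as defaults, which the text only confirms for the rate limit.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional

REQUIRED = ("DATABASE_URL", "VALKEY_URL")
LOG_LEVELS = {"debug", "info", "warning", "error"}


@dataclass(frozen=True)
class Settings:
    database_url: str
    valkey_url: str
    embedding_model: str
    embedding_api_key: Optional[str]
    rate_limit_default: int
    cors_origins: List[str]
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    log_level = env.get("Z3RNO_LOG_LEVEL", "info").lower()
    if log_level not in LOG_LEVELS:
        raise ValueError(f"unsupported Z3RNO_LOG_LEVEL: {log_level!r}")
    # Comma-separated list; surrounding whitespace and empty items are dropped.
    origins = [o.strip() for o in env.get("Z3RNO_CORS_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        database_url=env["DATABASE_URL"],
        valkey_url=env["VALKEY_URL"],
        embedding_model=env.get("Z3RNO_EMBEDDING_MODEL", "text-embedding-3-small"),
        embedding_api_key=env.get("Z3RNO_EMBEDDING_API_KEY"),
        rate_limit_default=int(env.get("Z3RNO_RATE_LIMIT_DEFAULT", "100")),
        cors_origins=origins,
        log_level=log_level,
    )


settings = load_settings({
    "DATABASE_URL": "postgresql+asyncpg://user:pass@host:5432/z3rno",
    "VALKEY_URL": "redis://localhost:6379/0",
    "Z3RNO_CORS_ORIGINS": "https://app.z3rno.dev, https://admin.example.com",
})
print(settings.rate_limit_default, settings.cors_origins)
```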

Running Locally

# Clone and install
git clone https://github.com/the-ai-project-co/z3rno-server
cd z3rno-server
uv sync

# Start dependencies
docker compose up -d postgres valkey

# Run migrations
alembic upgrade head

# Start the server
uvicorn z3rno_server.main:app --reload --port 8000

Error Responses

All errors follow RFC 7807 (Problem Details for HTTP APIs; since superseded by RFC 9457, which keeps the same core fields):
{
  "type": "https://docs.z3rno.dev/errors/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 300 requests per minute.",
  "instance": "/v1/memories/recall"
}
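
Because every error shares this shape, a client can parse it into one structure and raise a single exception type. The sketch below is illustrative only: `Problem`, `Z3rnoAPIError`, and `parse_problem` are hypothetical names, not part of any published SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Problem:
    type: str
    title: str
    status: int
    detail: Optional[str] = None
    instance: Optional[str] = None


class Z3rnoAPIError(Exception):
    """Hypothetical client-side exception wrapping a problem document."""

    def __init__(self, problem: Problem) -> None:
        super().__init__(f"{problem.status} {problem.title}: {problem.detail}")
        self.problem = problem


def parse_problem(body: str) -> Problem:
    """Parse a Problem Details JSON body into a Problem."""
    data = json.loads(body)
    return Problem(
        # RFC 7807 treats a missing "type" as "about:blank".
        type=data.get("type", "about:blank"),
        title=data.get("title", ""),
        status=int(data.get("status", 0)),
        detail=data.get("detail"),
        instance=data.get("instance"),
    )


body = """{
  "type": "https://docs.z3rno.dev/errors/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 300 requests per minute.",
  "instance": "/v1/memories/recall"
}"""
problem = parse_problem(body)
error = Z3rnoAPIError(problem)
print(error)
```

Branching on `problem.type` rather than on the human-readable `title` tends to be more robust, since the type URI is meant to be the stable identifier for an error class.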