Overview
z3rno-server is a FastAPI application that exposes Z3rno’s memory engine over HTTP. It imports z3rno-core for business logic and adds authentication, rate limiting, audit logging, and async task processing.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/memories | Store a new memory |
| POST | /v1/memories/recall | Semantic search over memories |
| DELETE | /v1/memories/{id} | Soft-delete a memory |
| DELETE | /v1/memories/batch | Batch delete memories |
| DELETE | /v1/memories/{id}/gdpr | GDPR hard delete |
| GET | /v1/memories/{id}/versions | Temporal version history |
| POST | /v1/memories/recall/as-of | Point-in-time recall |
| GET | /v1/audit | Query audit log |
| GET | /v1/health | Health check |
| GET | /v1/health/ready | Readiness probe (DB connected) |
Middleware Chain
Requests pass through these layers in order:
Request → CORS → Request ID → Rate Limiter → Auth → RLS Injection → Handler → Audit Log → Response
- CORS: Configurable allowed origins
- Request ID: Attaches X-Request-Id for tracing
- Rate Limiter: Token bucket per API key (default: 100 req/min)
- Auth: Validates Authorization: Bearer <api_key>, resolves tenant
- RLS Injection: Sets app.current_tenant on the DB connection
- Audit Log: Records operation, latency, and result count
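The ordering above can be illustrated with a small, framework-free sketch. It does not use FastAPI or the server's real middleware; the `layer` and `build_chain` helpers are hypothetical names introduced only for this example. The point is that each layer wraps the next one, so a request enters the layers in the listed order and the audit log runs only after the handler has returned.

```python
from typing import Callable, List

Handler = Callable[[dict], dict]


def layer(name: str, trace: List[str], after: bool = False) -> Callable[[Handler], Handler]:
    """Return a wrapper that records when the named layer runs (hypothetical helper)."""
    def wrap(next_handler: Handler) -> Handler:
        def call(request: dict) -> dict:
            if not after:
                trace.append(name)       # layer acts on the way in
            response = next_handler(request)
            if after:
                trace.append(name)       # layer acts on the way out
            return response
        return call
    return wrap


def build_chain(trace: List[str]) -> Handler:
    def handler(request: dict) -> dict:
        trace.append("Handler")
        return {"status": 200}

    # The audit log wraps the handler directly and records after it returns.
    app = layer("Audit Log", trace, after=True)(handler)
    # Wrap from the innermost layer outwards so CORS ends up outermost.
    for name in reversed(["CORS", "Request ID", "Rate Limiter", "Auth", "RLS Injection"]):
        app = layer(name, trace)(app)
    return app


trace: List[str] = []
response = build_chain(trace)({"path": "/v1/memories"})
print(" -> ".join(trace))
```

One consequence of this ordering, if the real server follows it, is that a request rejected by the rate limiter never reaches the auth or database layers.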
Authentication
API keys are passed as a Bearer token in the Authorization header:
curl -X POST https://api.z3rno.dev/v1/memories \
  -H "Authorization: Bearer z3_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "agent-1", "content": "User likes Python"}'
Each key carries a prefix that identifies its environment: z3_live_ (production) or z3_test_ (sandbox). Sandbox keys operate on isolated data and have relaxed rate limits.
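For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal illustration, not an official client: `key_environment` and `build_store_request` are hypothetical helper names, and `z3_test_example` is a placeholder key. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

BASE_URL = "https://api.z3rno.dev"  # host taken from the curl example


def key_environment(api_key: str) -> str:
    """Map the documented key prefixes to their environment."""
    if api_key.startswith("z3_live_"):
        return "production"
    if api_key.startswith("z3_test_"):
        return "sandbox"
    raise ValueError("API key must start with z3_live_ or z3_test_")


def build_store_request(api_key: str, agent_id: str, content: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/memories request."""
    key_environment(api_key)  # fail early on a malformed key
    body = json.dumps({"agent_id": agent_id, "content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memories",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_store_request("z3_test_example", "agent-1", "User likes Python")
# To actually send it: urllib.request.urlopen(request)
```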
Rate Limiting
| Tier | Requests/min | Burst |
|---|---|---|
| Free | 30 | 50 |
| Pro | 300 | 500 |
| Enterprise | Custom | Custom |
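To make the two columns concrete, here is a minimal token bucket sketch. It assumes that "Requests/min" is the refill rate and "Burst" is the bucket capacity; that mapping is an interpretation of the table, not something this page confirms, and the `TokenBucket` class is hypothetical rather than the server's implementation. The clock is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Illustrative token bucket: refills continuously, capped at the burst size."""

    def __init__(self, rate_per_min: float, burst: int, now: float = 0.0) -> None:
        self.refill_per_sec = rate_per_min / 60.0
        self.capacity = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.updated_at = now

    def allow(self, now: float) -> bool:
        """Return True and consume one token if a request is allowed at time `now`."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Free tier values from the table: 30 requests/min, burst of 50.
free = TokenBucket(rate_per_min=30, burst=50)
allowed_at_start = sum(free.allow(now=0.0) for _ in range(60))
allowed_after_10s = sum(free.allow(now=10.0) for _ in range(60))
print(allowed_at_start, allowed_after_10s)
```

Under these assumptions a Free-tier client can send 50 requests immediately, after which it regains one request every two seconds. A real limiter would keep one bucket per API key and read the time from a monotonic clock such as `time.monotonic()`.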
Rate limit headers are returned on every response:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 297
X-RateLimit-Reset: 1706140800
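A client can use these headers to decide how long to pause before retrying. The sketch below assumes that X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this page does not state; `seconds_to_wait` is a hypothetical helper.

```python
from typing import Mapping


def seconds_to_wait(headers: Mapping[str, str], now: float) -> float:
    """Return how long to pause before the next request, based on the rate limit headers."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > 0:
        return 0.0  # budget left, no need to wait
    reset_at = int(headers["X-RateLimit-Reset"])  # assumed: Unix timestamp in seconds
    return max(0.0, reset_at - now)


headers = {
    "X-RateLimit-Limit": "300",
    "X-RateLimit-Remaining": "297",
    "X-RateLimit-Reset": "1706140800",
}
exhausted = dict(headers, **{"X-RateLimit-Remaining": "0"})

print(seconds_to_wait(headers, now=1706140790))
print(seconds_to_wait(exhausted, now=1706140790))
```

HTTP header names are case-insensitive, so a real client should read them through its HTTP library's header object rather than a plain dict.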
Celery Workers
Background tasks are processed by Celery workers with Valkey as the broker:
| Task | Schedule | Description |
|---|---|---|
| ttl_expiration | Every 1 min | Expire memories past their TTL |
| importance_decay | Every 1 hour | Reduce importance of unused memories |
| embedding_backfill | On demand | Re-embed memories after model upgrade |
| graph_maintenance | Every 6 hours | Prune orphaned graph edges |
# Start a worker
celery -A z3rno_server.tasks worker --loglevel=info --concurrency=4
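The schedules in the table could be expressed in the shape that Celery's `beat_schedule` setting expects, which accepts a `timedelta` for the `schedule` field. This is only a sketch: the fully qualified task names are assumptions derived from the `z3rno_server.tasks` module in the command above, and the actual project may register its tasks differently. The snippet is a plain dictionary so it runs without Celery installed.

```python
from datetime import timedelta

# Hypothetical task paths, assuming tasks live in the z3rno_server.tasks module.
BEAT_SCHEDULE = {
    "ttl_expiration": {
        "task": "z3rno_server.tasks.ttl_expiration",
        "schedule": timedelta(minutes=1),
    },
    "importance_decay": {
        "task": "z3rno_server.tasks.importance_decay",
        "schedule": timedelta(hours=1),
    },
    "graph_maintenance": {
        "task": "z3rno_server.tasks.graph_maintenance",
        "schedule": timedelta(hours=6),
    },
    # embedding_backfill runs on demand, so it has no periodic entry.
}

# With a Celery application object this would be applied as:
#   app.conf.beat_schedule = BEAT_SCHEDULE
for name, entry in BEAT_SCHEDULE.items():
    print(name, int(entry["schedule"].total_seconds()), "seconds")
```

In Celery, periodic entries are dispatched by a beat scheduler process, not by the workers themselves. How z3rno-server runs that scheduler is not described on this page.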
Configuration
Environment variables:
# Required
DATABASE_URL=postgresql+asyncpg://user:pass@host:5432/z3rno
VALKEY_URL=redis://localhost:6379/0
# Optional
Z3RNO_EMBEDDING_MODEL=text-embedding-3-small # OpenAI model
Z3RNO_EMBEDDING_API_KEY=sk-... # OpenAI key
Z3RNO_RATE_LIMIT_DEFAULT=100 # req/min
Z3RNO_CORS_ORIGINS=https://app.z3rno.dev # comma-separated
Z3RNO_LOG_LEVEL=info # debug|info|warning|error
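As an illustration of how these variables might be validated at startup, the sketch below loads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical and not part of z3rno-server. The rate limit default of 100 comes from the middleware description; treating the other sample values as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

LOG_LEVELS = {"debug", "info", "warning", "error"}


@dataclass(frozen=True)
class Settings:
    database_url: str
    valkey_url: str
    embedding_model: str
    embedding_api_key: str
    rate_limit_default: int
    cors_origins: List[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate configuration; raise if a required variable is missing."""
    missing = [name for name in ("DATABASE_URL", "VALKEY_URL") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    log_level = env.get("Z3RNO_LOG_LEVEL", "info").lower()  # assumed default
    if log_level not in LOG_LEVELS:
        raise ValueError(f"Unsupported Z3RNO_LOG_LEVEL: {log_level}")
    # Comma-separated list; blank entries are dropped.
    origins = [o.strip() for o in env.get("Z3RNO_CORS_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        database_url=env["DATABASE_URL"],
        valkey_url=env["VALKEY_URL"],
        embedding_model=env.get("Z3RNO_EMBEDDING_MODEL", "text-embedding-3-small"),  # assumed default
        embedding_api_key=env.get("Z3RNO_EMBEDDING_API_KEY", ""),
        rate_limit_default=int(env.get("Z3RNO_RATE_LIMIT_DEFAULT", "100")),
        cors_origins=origins,
        log_level=log_level,
    )


example = load_settings({
    "DATABASE_URL": "postgresql+asyncpg://user:pass@host:5432/z3rno",
    "VALKEY_URL": "redis://localhost:6379/0",
    "Z3RNO_CORS_ORIGINS": "https://app.z3rno.dev, http://localhost:3000",
})
print(example.cors_origins, example.rate_limit_default)
```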
Running Locally
# Clone and install
git clone https://github.com/the-ai-project-co/z3rno-server
cd z3rno-server
uv sync
# Start dependencies
docker compose up -d postgres valkey
# Run migrations
alembic upgrade head
# Start the server
uvicorn z3rno_server.main:app --reload --port 8000
Error Responses
All errors are returned as RFC 7807 Problem Details objects (RFC 7807 has since been obsoleted by RFC 9457, which keeps the same core fields):
{
  "type": "https://docs.z3rno.dev/errors/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 300 requests per minute.",
  "instance": "/v1/memories/recall"
}
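Because every error shares this shape, a client can turn any error body into a single exception type. The sketch below is a hypothetical helper, not part of an official SDK; `Z3rnoApiError` and `parse_problem` are names invented for this example. The `about:blank` fallback for a missing `type` follows the Problem Details specification.

```python
import json


class Z3rnoApiError(Exception):
    """Exception built from a Problem Details response body (illustrative)."""

    def __init__(self, problem: dict) -> None:
        self.type = problem.get("type", "about:blank")  # spec default when absent
        self.title = problem.get("title", "")
        self.status = problem.get("status")
        self.detail = problem.get("detail", "")
        self.instance = problem.get("instance")
        super().__init__(f"{self.status} {self.title}: {self.detail}")

    @property
    def is_rate_limited(self) -> bool:
        return self.status == 429


def parse_problem(body: str) -> Z3rnoApiError:
    """Parse a JSON error body into an exception instance."""
    return Z3rnoApiError(json.loads(body))


body = """{
  "type": "https://docs.z3rno.dev/errors/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "You have exceeded 300 requests per minute.",
  "instance": "/v1/memories/recall"
}"""
error = parse_problem(body)
print(error, error.is_rate_limited)
```

Branching on `status` or `type` instead of on the human-readable `title` keeps client code stable if the wording of error messages changes.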