Test Environment
- Hardware: Apple M-series (Rosetta 2 emulation for PostgreSQL)
- PostgreSQL: 17.x with pgvector 0.8.x
- Embedding dimensions: 1536 (text-embedding-3-small)
- Note: All timings include Rosetta overhead. Native ARM builds are estimated to run 30-40% faster.
Semantic recall latency using HNSW index (m=16, ef_construction=128, ef_search=64):
| Dataset Size | p50 Latency | p95 Latency | p99 Latency | Recall@10 |
|---|---|---|---|---|
| 10,000 memories | 1.8 ms | 3.2 ms | 4.1 ms | 0.98 |
| 50,000 memories | 2.9 ms | 5.1 ms | 6.8 ms | 0.97 |
| 100,000 memories | 4.2 ms | 7.3 ms | 9.5 ms | 0.96 |
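For readers reproducing these columns, the following is a minimal sketch of how percentile latencies and Recall@10 could be computed from raw measurements. The nearest-rank percentile method, the helper names, and the sample values are assumptions for illustration; the benchmark harness itself is not shown in this document.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k neighbours that the approximate search also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must not be empty")
    found = set(approx_ids[:k])
    return len(truth & found) / len(truth)


# Illustrative values only, not taken from the benchmark run.
latencies_ms = [1.6, 1.7, 1.8, 1.8, 1.9, 2.0, 2.4, 3.0, 3.2, 4.1]
summary = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}

exact_top10 = list(range(1, 11))               # ids from an exact (sequential) scan
approx_top10 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]  # ids from the approximate index
recall = recall_at_k(approx_top10, exact_top10)
```

Recall@10 here compares the approximate index result against an exact nearest-neighbour scan over the same data, so a value of 0.96 means that on average 9.6 of the 10 true neighbours were returned.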
Native ARM Estimates
| Dataset Size | p50 (est.) | p95 (est.) |
|---|---|---|
| 10,000 memories | ~1.1 ms | ~2.0 ms |
| 50,000 memories | ~1.8 ms | ~3.1 ms |
| 100,000 memories | ~2.5 ms | ~4.4 ms |
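As a quick sanity check, the latency reduction implied by these estimates can be worked out from the two tables. The sketch below only does that arithmetic; it does not claim to reproduce how the estimates were derived. The implied reductions come out at roughly 37-40%.

```python
# (p50, p95) latencies in ms, copied from the measured and estimated tables.
rosetta = {10_000: (1.8, 3.2), 50_000: (2.9, 5.1), 100_000: (4.2, 7.3)}
native_est = {10_000: (1.1, 2.0), 50_000: (1.8, 3.1), 100_000: (2.5, 4.4)}


def implied_reduction(measured_ms, estimated_ms):
    """Fractional latency reduction implied by an estimate relative to a measurement."""
    return 1 - estimated_ms / measured_ms


reductions = {
    size: tuple(
        implied_reduction(measured, estimated)
        for measured, estimated in zip(rosetta[size], native_est[size])
    )
    for size in rosetta
}

for size, (p50, p95) in reductions.items():
    print(f"{size:>7,} memories: p50 -{p50:.0%}, p95 -{p95:.0%}")
```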
IVFFlat vs HNSW Comparison
Tested at 100K memories with top_k=10:
| Metric | IVFFlat (lists=100, probes=10) | HNSW (m=16, ef_search=64) |
|---|---|---|
| p50 latency | 6.1 ms | 4.2 ms |
| p95 latency | 11.4 ms | 7.3 ms |
| Recall@10 | 0.91 | 0.96 |
| Index build time | 12.3 s | 45.7 s |
| Index size on disk | 234 MB | 312 MB |
| Incremental insert | Requires retrain | Immediate |
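Decision: HNSW chosen for production. Its higher recall (0.96 vs 0.91), lower query latency, and ability to absorb inserts without retraining outweigh the larger index size (312 MB vs 234 MB) and slower initial build (45.7 s vs 12.3 s).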
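A minimal sketch of what the chosen configuration could look like in pgvector, expressed as small Python helpers that build the SQL. The `WITH (m, ef_construction)` options and the `hnsw.ef_search` setting are pgvector's own; the table name `memories`, the column name `embedding`, and the cosine operator class are assumptions, since the actual schema and distance metric are not given here.

```python
def hnsw_index_sql(table, column, m=16, ef_construction=128, opclass="vector_cosine_ops"):
    """Build a pgvector CREATE INDEX statement for an HNSW index.

    Identifiers are interpolated directly, so pass trusted constants only.
    """
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_hnsw_idx "
        f"ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def ef_search_sql(ef_search=64):
    """Build the session-level setting that trades query latency for recall."""
    return f"SET hnsw.ef_search = {ef_search};"


# Hypothetical table and column names, for illustration only.
create_stmt = hnsw_index_sql("memories", "embedding")
tune_stmt = ef_search_sql(64)
```

`m` and `ef_construction` are fixed when the index is built, while `ef_search` can be changed per session, which makes it the cheaper knob to adjust when tuning recall against latency.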
Append-only audit table with BRIN index on created_at:
| Table Size | Insert (p50) | Range Query 24h (p50) | Range Query 7d (p50) |
|---|---|---|---|
| 100K rows | 0.3 ms | 1.2 ms | 3.8 ms |
| 1M rows | 0.3 ms | 1.4 ms | 5.1 ms |
| 10M rows | 0.4 ms | 1.9 ms | 8.7 ms |
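Insert latency stays nearly flat (0.3-0.4 ms p50) from 100K to 10M rows because writes are append-only. BRIN indexing keeps range scans efficient at 10M rows: the 7-day query grows from 3.8 ms to 8.7 ms while the table grows 100x.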
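The reason a block-range index suits an append-only table can be illustrated with a toy model: each range of consecutive rows is summarised by its minimum and maximum `created_at`, and a range query skips every block whose summary falls outside the window. This is a conceptual sketch, not PostgreSQL's implementation; the class and function names, the integer timestamps, and the block size of 4 rows are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BlockSummary:
    start: int   # index of the first row in the block range
    end: int     # index one past the last row in the block range
    min_ts: int  # smallest created_at in the range
    max_ts: int  # largest created_at in the range


def build_summaries(timestamps, rows_per_range=4):
    """Summarise consecutive row ranges by their min/max timestamp."""
    summaries = []
    for start in range(0, len(timestamps), rows_per_range):
        chunk = timestamps[start:start + rows_per_range]
        summaries.append(BlockSummary(start, start + len(chunk), min(chunk), max(chunk)))
    return summaries


def range_query(timestamps, summaries, lo, hi):
    """Return matching rows plus the number of rows actually inspected."""
    hits, scanned = [], 0
    for s in summaries:
        if s.max_ts < lo or s.min_ts > hi:
            continue  # the whole block lies outside the window, so skip it
        for ts in timestamps[s.start:s.end]:
            scanned += 1
            if lo <= ts <= hi:
                hits.append(ts)
    return hits, scanned


created_at = list(range(100))  # append-only: timestamps arrive in order
summaries = build_summaries(created_at)
hits, scanned = range_query(created_at, summaries, lo=90, hi=99)
```

Because rows are written in timestamp order, each block covers a narrow, non-overlapping time span, so a recent-window query inspects only the last few blocks (12 of 100 rows in this toy run) regardless of how large the table grows.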
Store Operation (End-to-End)
Full store including embedding generation, DB insert, and graph edge creation:
| Component | Time |
|---|---|
| Embedding API call | 80-150 ms |
| DB insert (memory + audit) | 1.2 ms |
| Graph edge creation | 0.8 ms |
| Total | ~85-155 ms |
Embedding generation dominates. With local embeddings (e.g., ONNX), total drops to ~5 ms.
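A per-component breakdown like the one above can be collected by timing each stage separately. The sketch below shows one way to do that; `store_memory` and its three callables (`embed`, `insert_row`, `create_edges`) are hypothetical stand-ins, not the project's actual API.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record the wall-clock duration of a stage in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000.0


def store_memory(text, embed, insert_row, create_edges):
    """Run the three store stages and return the new id with per-stage timings in ms."""
    timings = {}
    with timed("embedding", timings):
        vector = embed(text)
    with timed("db_insert", timings):
        memory_id = insert_row(text, vector)
    with timed("graph_edges", timings):
        create_edges(memory_id)
    timings["total"] = sum(timings.values())
    return memory_id, timings


# Stub stages so the sketch runs without a database or an embedding API.
memory_id, timings = store_memory(
    "hello",
    embed=lambda text: [0.0] * 1536,
    insert_row=lambda text, vector: 1,
    create_edges=lambda memory_id: None,
)
```

Passing the stages in as callables makes it straightforward to swap the remote embedding call for a local model and compare the two totals with the same harness.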
Test Suite Results
Phase 1 + Phase 2 combined test run:
========================= test session starts =========================
collected 646 items
tests/unit/ ... 412 passed
tests/integration/ ... 189 passed
tests/performance/ ... 45 passed
================ 646 passed, 0 failed, 0 warnings ================
Total time: 127.4s
| Category | Tests | Pass Rate |
|---|---|---|
| Unit tests | 412 | 100% |
| Integration tests | 189 | 100% |
| Performance tests | 45 | 100% |
| Total | 646 | 100% |
Rosetta Overhead Note
All benchmarks were collected on Apple Silicon under Rosetta 2 emulation (x86_64 PostgreSQL binary). Based on comparison testing:
- CPU-bound operations (embedding similarity computation): ~35% overhead
- I/O-bound operations (disk reads, network): ~5-10% overhead
- Mixed workloads (typical Z3rno queries): ~20-30% overhead
Production deployments on native x86_64 or native ARM PostgreSQL builds avoid this emulation overhead and should see correspondingly lower latencies. The benchmarks above should therefore be read as conservative, worst-case figures.