Benchmarks - Z3rno

Test Environment

Hardware: Apple M-series (Rosetta 2 emulation for PostgreSQL)
PostgreSQL: 17.x with pgvector 0.8.x
Embedding dimensions: 1536 (text-embedding-3-small)
Note: All timings include Rosetta 2 overhead. Native ARM builds are estimated to be 30-40% faster.

HNSW Vector Search Performance

Semantic recall latency using HNSW index (m=16, ef_construction=128, ef_search=64):

Dataset Size	p50 Latency	p95 Latency	p99 Latency	Recall@10
10,000 memories	1.8 ms	3.2 ms	4.1 ms	0.98
50,000 memories	2.9 ms	5.1 ms	6.8 ms	0.97
100,000 memories	4.2 ms	7.3 ms	9.5 ms	0.96
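For reference, the index parameters above map onto pgvector's HNSW options roughly as shown below. This is a minimal sketch rather than the project's actual migration: the table name `memories`, the column name `embedding`, and the cosine operator class are assumptions, since the benchmark notes do not state the schema or the distance metric.

```python
# Sketch only: table/column names and the cosine operator class are
# assumptions. The numeric parameters are the ones quoted in the text.
HNSW_M = 16
HNSW_EF_CONSTRUCTION = 128
HNSW_EF_SEARCH = 64


def hnsw_index_sql(table: str = "memories", column: str = "embedding") -> str:
    """Build a pgvector CREATE INDEX statement for an HNSW index."""
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {HNSW_M}, ef_construction = {HNSW_EF_CONSTRUCTION});"
    )


def hnsw_search_setting_sql() -> str:
    """Build the session-level setting that controls query-time search breadth."""
    return f"SET hnsw.ef_search = {HNSW_EF_SEARCH};"


print(hnsw_index_sql())
print(hnsw_search_setting_sql())
```

`m` and `ef_construction` are fixed when the index is built, while `hnsw.ef_search` is set per session and can be raised or lowered to trade latency against recall without rebuilding.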

Native ARM Estimates

Dataset Size	p50 (est.)	p95 (est.)
10,000 memories	~1.1 ms	~2.0 ms
50,000 memories	~1.8 ms	~3.1 ms
100,000 memories	~2.5 ms	~4.4 ms
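As a quick consistency check, the reduction implied by these estimates can be worked out from the two tables. The sketch below only does arithmetic on the published figures. It does not reproduce how the estimates were derived, which the notes do not describe.

```python
# Values copied from the measured and estimated tables above, in milliseconds.
# Each tuple is (p50, p95).
emulated = {10_000: (1.8, 3.2), 50_000: (2.9, 5.1), 100_000: (4.2, 7.3)}
native_est = {10_000: (1.1, 2.0), 50_000: (1.8, 3.1), 100_000: (2.5, 4.4)}


def implied_reduction(measured_ms: float, estimated_ms: float) -> float:
    """Fraction by which the estimate is lower than the measured latency."""
    return (measured_ms - estimated_ms) / measured_ms


reductions = {
    size: tuple(
        implied_reduction(m, e) for m, e in zip(emulated[size], native_est[size])
    )
    for size in emulated
}

for size, (p50, p95) in reductions.items():
    print(f"{size:>7,} memories: p50 {p50:.0%} lower, p95 {p95:.0%} lower")
```

Every row works out to a latency reduction of between about 37% and 41%.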

IVFFlat vs HNSW Comparison

Tested at 100K memories with top_k=10:

Metric	IVFFlat (lists=100, probes=10)	HNSW (m=16, ef_search=64)
p50 latency	6.1 ms	4.2 ms
p95 latency	11.4 ms	7.3 ms
Recall@10	0.91	0.96
Index build time	12.3 s	45.7 s
Index size on disk	234 MB	312 MB
Incremental insert	Requires retrain	Immediate

Decision: HNSW was chosen for production. Its higher recall (0.96 vs 0.91), lower latency, and no-retrain inserts outweigh the larger index (312 MB vs 234 MB) and the slower initial build (45.7 s vs 12.3 s).
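The Recall@10 figures compare approximate results against exact nearest neighbours. The notes do not say how recall was computed here, so the following is a sketch of the conventional definition, assuming the ground truth comes from an exact (non-indexed) search. The function name and the sample ids are illustrative.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int = 10) -> float:
    """Fraction of the exact top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must contain at least one id")
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Illustrative ids: the approximate search misses one true neighbour (id 10)
# and returns an unrelated row (id 42) in its place.
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(approx, exact, k=10))
```

A reported value such as 0.96 would then be the mean of this per-query score over a set of test queries.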

Audit Log Performance

Append-only audit table with BRIN index on created_at:

Table Size	Insert (p50)	Range Query 24h (p50)	Range Query 7d (p50)
100K rows	0.3 ms	1.2 ms	3.8 ms
1M rows	0.3 ms	1.4 ms	5.1 ms
10M rows	0.4 ms	1.9 ms	8.7 ms

Insert latency stays nearly constant (0.3-0.4 ms) because writes are append-only. The BRIN index keeps range scans efficient at 10M rows: the 7-day query p50 grows from 3.8 ms to 8.7 ms while the table grows 100x.
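A BRIN index suits this table because append-only inserts keep `created_at` closely correlated with physical row order, so each block range covers a narrow time window. The statements below are a minimal sketch: the table name `audit_log` is an assumption, and only the `created_at` column comes from the description above.

```python
# Sketch only: `audit_log` is a hypothetical table name. Identifiers are
# interpolated directly, so these helpers must not receive untrusted input.
def brin_index_sql(table: str = "audit_log", column: str = "created_at") -> str:
    """Build a CREATE INDEX statement for a BRIN index on a timestamp column."""
    return f"CREATE INDEX ON {table} USING brin ({column});"


def range_query_sql(
    table: str = "audit_log",
    column: str = "created_at",
    window: str = "24 hours",
) -> str:
    """Build a time-window query of the kind timed in the table above."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} >= now() - interval '{window}' "
        f"ORDER BY {column};"
    )


print(brin_index_sql())
print(range_query_sql(window="24 hours"))
print(range_query_sql(window="7 days"))
```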

Store Operation (End-to-End)

Full store including embedding generation, DB insert, and graph edge creation:

Component	Time
Embedding API call	80-150 ms
DB insert (memory + audit)	1.2 ms
Graph edge creation	0.8 ms
Total	~82-152 ms

The embedding API call dominates end-to-end latency. With local embeddings (e.g., an ONNX model), the total drops to ~5 ms.
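A per-component breakdown like the one above can be gathered with a small timing helper. The sketch below is illustrative only: the stage names mirror the table, but the `time.sleep` calls are placeholders standing in for the real embedding, insert, and graph steps, which are not shown in these notes.

```python
import time
from contextlib import contextmanager

timings_ms: dict[str, float] = {}


@contextmanager
def timed(stage: str):
    """Record the wall-clock duration of a pipeline stage in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings_ms[stage] = (time.perf_counter() - start) * 1000.0


# Placeholder stages: the sleeps are stand-ins, not Z3rno calls.
with timed("embedding"):
    time.sleep(0.010)
with timed("db_insert"):
    time.sleep(0.001)
with timed("graph_edges"):
    time.sleep(0.001)

total_ms = sum(timings_ms.values())
for stage, ms in timings_ms.items():
    print(f"{stage:<12} {ms:7.2f} ms ({ms / total_ms:.0%} of total)")
```

Timing each stage separately makes it clear which component to optimise first, which here would be the embedding call.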

Test Suite Results

Phase 1 + Phase 2 combined test run:

========================= test session starts =========================
collected 646 items

tests/unit/          ... 412 passed
tests/integration/   ... 189 passed
tests/performance/   ... 45 passed

================ 646 passed, 0 failed, 0 warnings ================

Total time: 127.4s

Category	Tests	Pass Rate
Unit tests	412	100%
Integration tests	189	100%
Performance tests	45	100%
Total	646	100%

Rosetta Overhead Note

All benchmarks were collected on Apple Silicon with an x86_64 PostgreSQL binary running under Rosetta 2 emulation. Comparison testing suggests the following overhead:

CPU-bound operations (embedding similarity computation): ~35% overhead
I/O-bound operations (disk reads, network): ~5-10% overhead
Mixed workloads (typical Z3rno queries): ~20-30% overhead

Production deployments on native x86_64 or native ARM PostgreSQL builds should see proportionally lower latencies. The figures above are measurements taken under emulation, so they should be read as conservative upper bounds rather than typical production numbers.
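When reproducing these numbers, it helps to confirm which architecture the PostgreSQL server was built for. The string returned by `SELECT version()` includes the build platform, and the sketch below extracts it. The helper names and the sample strings are illustrative, and the host architecture is passed in explicitly because a Python process that is itself running under Rosetta would report `x86_64`.

```python
import re
from typing import Optional

# Matches the build platform in strings such as
# "PostgreSQL 17.0 on x86_64-apple-darwin23.0.0, compiled by ...".
_ARCH_PATTERN = re.compile(r" on ([A-Za-z0-9_]+)-")


def server_arch(version_string: str) -> Optional[str]:
    """Extract the CPU architecture from the output of SELECT version()."""
    match = _ARCH_PATTERN.search(version_string)
    return match.group(1) if match else None


def runs_under_emulation(version_string: str, host_arch: str) -> bool:
    """True when an x86_64 server binary runs on an ARM host."""
    return (
        server_arch(version_string) == "x86_64"
        and host_arch in ("arm64", "aarch64")
    )


# Illustrative version strings, not captured from the benchmark machine.
emulated_sample = "PostgreSQL 17.0 on x86_64-apple-darwin23.0.0, compiled by clang"
native_sample = "PostgreSQL 17.0 on aarch64-apple-darwin23.0.0, compiled by clang"

print(server_arch(emulated_sample), runs_under_emulation(emulated_sample, "arm64"))
print(server_arch(native_sample), runs_under_emulation(native_sample, "arm64"))
```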