HX-SDP · architecture
The architecture behind the structural data platform.
HX-SDP ingests dense vectors and related data, represents them as economy-SVD latent factors plus an SQ8 rerank sidecar, and serves cache, vector, feature, search, retention, and observability workflows from the same GPU-native runtime.
8 eliminated · 3 collapsed · 1 simplified
hx-engine (engine) · hx-gate (gate)
put · get · query · search · serve
What HX-SDP is
One runtime where every access pattern reads the same representation.
Traditional stacks duplicate the same data into caches, vector databases, search indexes, feature stores, streams, and observability systems. HX-SDP collapses those copies into one representation and one serving surface.
01
Input
Dense vectors, features, streams
02
Atlas
classify structure + policy
03
Latent
Z(N,r) + V_T(r,D)
04
SQ8
int8 sidecar rerank
05
Serve
cache · vectors · features · search
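As a rough illustration of stages 03 and 04, the sketch below builds the latent factors Z(N,r) and V_T(r,D) with an economy SVD and encodes an int8 sidecar. It is a minimal NumPy sketch, not HX-SDP's implementation: the function names `build_latent`, `sq8_encode`, and `sq8_decode` are hypothetical, the symmetric per-dimension quantization scheme is an assumption, and the Atlas stage is not modeled.

```python
import numpy as np

def build_latent(X, r):
    """Economy SVD of X (N, D), truncated to rank r.

    Returns Z with shape (N, r) and V_T with shape (r, D), so X ~= Z @ V_T.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :r] * s[:r]   # fold singular values into the left factor
    V_T = Vt[:r]
    return Z, V_T

def sq8_encode(X):
    """Symmetric per-dimension int8 quantization (assumed scheme)."""
    scale = np.abs(X).max(axis=0) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)   # guard all-zero dimensions
    codes = np.clip(np.rint(X / scale), -127, 127).astype(np.int8)
    return codes, scale

def sq8_decode(codes, scale):
    """Approximate reconstruction of the original D-space vectors."""
    return codes.astype(np.float32) * scale

# Synthetic data with exact rank r, so the truncation loses nothing here.
rng = np.random.default_rng(0)
N, D, r = 200, 32, 8
X = (rng.standard_normal((N, r)) @ rng.standard_normal((r, D))).astype(np.float32)

Z, V_T = build_latent(X, r)
codes, scale = sq8_encode(X)
```

On real data the truncation is lossy, which is why the sidecar keeps a quantized copy in the original D-space for reranking.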
Service replacement map
Twelve workflows become one platform boundary.
HX-SDP distinguishes what is eliminated, what is architecturally collapsed, and what is simplified. The result is a concrete operational claim without overstating what remains.
| Workflow | Traditional vendors | Outcome | How HX-SDP handles it |
|---|---|---|---|
| KV cache | Redis, Memcached | Eliminated | Representations are the values; L0/L1/L2 cache hierarchy serves hot reads. |
| Feature store | Feast, Tecton | Eliminated | Online/offline split disappears; features are versioned representations. |
| Search index | Elasticsearch, OpenSearch | Eliminated | BM25, trie, fuzzy, metadata filters, and hybrid search read the same store. |
| Vector DB | Pinecone, Weaviate, Milvus | Eliminated | Tier-1 latent scan plus Tier-2 SQ8 rerank replaces ANN index fleets. |
| ETL pipeline | Airflow, dbt, Spark | Eliminated | Representation is the transformation; no multi-sink DAG. |
| Event stream | Kafka, Kinesis | Eliminated | At-least-once ingest, DLQ behavior, snapshot lineage, and direct representation. WAL remains a roadmap recovery mechanism. |
| Stream retention | Confluent, MSK | Eliminated | Compressed representations make retention economics tractable. |
| API gateway | Kong, Apigee | Eliminated | hx-gate handles auth, ACL, rate limiting, billing, audit, and proxy. |
| GPU cluster | DGX, P5/G6 fleets | Collapsed | Single GPU serving for validated scale envelopes; sharding remains an extension path. |
| KV offload | PagedAttention, NVMe spillover | Collapsed | Latent factors and SQ8 sidecar shrink memory footprint before paging is needed. |
| Observability | Datadog, Splunk | Collapsed | One service surface with Prometheus metrics, telemetry aggregation, and JSONL audit. |
| Training pipeline | SageMaker data staging | Simplified | Training compute remains; feature materialization and ETL-to-training shrink. |
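The footprint argument in the KV offload row can be checked with simple arithmetic on the shapes Z(N,r), V_T(r,D), and an N×D int8 sidecar. The sketch below is illustrative only: the function `footprint_bytes` is hypothetical, the sizes N, D, and r are made-up inputs rather than benchmark figures, fp32 latent storage is an assumption, and quantization scales and metadata are ignored.

```python
def footprint_bytes(n, d, r, latent_bytes=4):
    """Byte counts for a dense fp32 store versus latent factors plus SQ8."""
    dense = n * d * 4                          # fp32 baseline, 4 bytes per component
    latent = (n * r + r * d) * latent_bytes    # Z(N,r) + V_T(r,D)
    sidecar = n * d                            # SQ8: one int8 per component
    return {
        "dense_fp32": dense,
        "latent": latent,
        "sq8": sidecar,
        "total": latent + sidecar,
    }

# Illustrative sizes only (assumed, not taken from any HX-SDP benchmark).
fp = footprint_bytes(n=1_000_000, d=768, r=64)
ratio = fp["total"] / fp["dense_fp32"]
print(fp, round(ratio, 3))
```

With these assumed sizes the int8 sidecar dominates the total, so the combined footprint sits near one third of the fp32 baseline; a smaller rank r changes the total only slightly.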
Production architecture
SVD-latent + SQ8 is the production hot path. QTT is the broader core.
HX-SDP benchmark claims are tied to the SVD-latent + SQ8 path. QTT remains part of the HolonomiX technology core and an alternate ingest path for callers that already hold TT cores.
Hot path
Dense X → Z + V_T → SQ8 sidecar.
Queries project q into rank-r space, scan Z · w, then rerank candidates from SQ8 in original D-space. No dense materialization in the compute path.
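The two-tier query described above can be sketched in NumPy as follows. This is a minimal CPU sketch under stated assumptions, not the GPU runtime: the function `query`, the parameter `k`, the inner-product scoring, and the per-dimension int8 scales are all hypothetical choices made for illustration. Only the `rerank_k` candidate rows of the sidecar are touched, and the scales are folded into the query so the full N×D matrix is never rebuilt.

```python
import numpy as np

def query(q, Z, V_T, codes, scale, k=10, rerank_k=100):
    """Tier-1 latent scan followed by a Tier-2 SQ8 rerank."""
    w = V_T @ q                                   # project q into rank-r space
    scores = Z @ w                                # latent scan over all N rows
    rerank_k = min(rerank_k, scores.shape[0])
    cand = np.argpartition(-scores, rerank_k - 1)[:rerank_k]
    # Rescore candidates in original D-space from int8 codes;
    # folding scale into q avoids decoding rows to float vectors.
    rescored = codes[cand].astype(np.float32) @ (scale * q)
    order = np.argsort(-rescored)[:k]
    return cand[order], rescored[order]

# Synthetic unit-norm data with exact rank r (assumed sizes).
rng = np.random.default_rng(0)
N, D, r = 500, 32, 8
X = rng.standard_normal((N, r)) @ rng.standard_normal((r, D))
X = (X / np.linalg.norm(X, axis=1, keepdims=True)).astype(np.float32)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
Z, V_T = U[:, :r] * s[:r], Vt[:r]
scale = np.abs(X).max(axis=0) / 127.0
codes = np.rint(X / scale).astype(np.int8)

ids, scores = query(X[17], Z, V_T, codes, scale)
```

Querying with a stored row should return that row among the top results, since its inner product with itself is the largest possible for unit-norm vectors.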
w = V_T @ q
scores = Z @ w
candidates = topk(scores, rerank_k=100)
final = sq8_rescore(candidates, q)
Operational surface
hx-gate, hx-engine, Redis.
hx-gate handles tenant auth, namespace ACL, rate limiting, CU billing, audit, and WebSocket proxy. hx-engine runs the GPU serving runtime. Redis holds shared gate state.
load balancer
→ hx-gate :8080
→ Redis :6379
→ hx-engine :8000
→ GPU + /var/lib/holonomix
Fit boundaries
Clear qualification is part of the product.
HX-SDP is strong when the workload can exploit structure and the buyer can operate a bounded GPU-native deployment. It is not a generic managed-cloud vector database replacement for every team today.
Strong fit
Not the right surface yet
Precision tiers
Pick the tier from the evidence boundary.
The benchmark page carries the detailed tables. The product page gives the operating interpretation and directs exact-recall buyers to fp32/fp64, while describing fp16 as a scale envelope with explicit recall bounds.
Exact
fp64
Exact-recall path. Use when approximate answers are unacceptable.
Production
fp32
Default production tier. Full recall, balanced throughput.
Scale
fp16
Scale envelope. Recall floor disclosed; use with bounded calibration.
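One way to read the three tiers is as a small decision rule. The sketch below is a hypothetical helper, not part of any HX-SDP API: the names `Tier` and `pick_tier` and both boolean inputs are assumptions that only restate the guidance above.

```python
from enum import Enum

class Tier(Enum):
    EXACT = "fp64"        # exact-recall path
    PRODUCTION = "fp32"   # default production tier
    SCALE = "fp16"        # scale envelope with a disclosed recall floor

def pick_tier(approx_unacceptable: bool, accepts_recall_floor: bool) -> Tier:
    """Map recall requirements to a precision tier (illustrative rule)."""
    if approx_unacceptable:
        return Tier.EXACT
    if accepts_recall_floor:
        return Tier.SCALE
    return Tier.PRODUCTION

choice = pick_tier(approx_unacceptable=False, accepts_recall_floor=False)
print(choice.name, choice.value)
```

The rule deliberately checks the exact-recall requirement first, so a workload that cannot tolerate approximation never lands on fp16 even if scale is also a concern.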
Next step
Map your stack to a precision tier.
Send the workload, current services, scale, recall tolerance, rebuild cadence, and deployment constraints. The intake maps those inputs to a bounded evaluation path.