Technical Overview
RNDA is a data architecture protocol, not a product. This page covers the mathematical foundations, empirical benchmarks across 30+ data types, and architectural claims. It is intended for engineers and researchers evaluating RNDA for licensing or integration.
Core Architectural Principle
"Uncompressed data is not a state that exists in the RNDA system — not during storage, transit, computation, or output. Input data is encoded to a compact binary signature, the raw data is permanently discarded, and contextually appropriate outputs are reconstructed on demand from signature similarity. Reconstruction is generative, not retrieval."
This inverts a basic assumption of conventional data systems, which treat the uncompressed form as the canonical source of truth. RNDA has no canonical uncompressed form: the signature store is the source of truth, and all outputs are reconstructed derivatives. Raw data cannot be retrieved because it no longer exists after encoding.
Mathematical Foundation
Signature Properties
Input data of arbitrary size and type is encoded to a fixed 256-byte binary signature. The encoding satisfies three properties:
Encoding Flow
Query → signature similarity search → contextual reconstruction (not retrieval)
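The similarity-search step of this flow can be sketched as follows. This is an illustration only: the source does not specify the similarity metric, so `bit_similarity` (bit agreement rescaled to the range -1.0 to 1.0), the `top_k` helper, and the in-memory `store` dictionary are all hypothetical names and assumptions, and the generative reconstruction step is not modeled.

```python
SIGNATURE_BYTES = 256  # fixed signature size stated in the text


def bit_similarity(a: bytes, b: bytes) -> float:
    """Assumed metric: 1.0 for identical signatures, -1.0 for bitwise complements."""
    if len(a) != SIGNATURE_BYTES or len(b) != SIGNATURE_BYTES:
        raise ValueError("signatures must be exactly 256 bytes")
    # Count the bits that differ between the two signatures.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - 2.0 * differing / (SIGNATURE_BYTES * 8)


def top_k(query_sig: bytes, store: dict[str, bytes], k: int = 3) -> list[tuple[str, float]]:
    """Return the k stored labels most similar to the query signature."""
    scored = [(label, bit_similarity(query_sig, sig)) for label, sig in store.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy signature store (hypothetical contents, not real RNDA signatures).
store = {
    "all_zero": bytes(SIGNATURE_BYTES),
    "all_one": bytes([0xFF]) * SIGNATURE_BYTES,
    "half": bytes([0x0F]) * SIGNATURE_BYTES,
}
results = top_k(bytes(SIGNATURE_BYTES), store, k=2)
print(results)
```

A linear scan like this is only meant to show the shape of the query step; a production signature store would presumably use an index rather than comparing against every record.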
Compression Ratio
Compression ratio = raw input size in bytes ÷ 256. The signature is always 256 bytes regardless of input size. A 36MB genomic FASTQ file and a 2.6MB LiDAR point cloud both encode to the same 256-byte output.
| Raw Input | Example | Raw Size | Signature | Compression |
|---|---|---|---|---|
| Genomic FASTQ | NCBI sequence file | 36MB | 256 bytes | 140,835x |
| Medical image | NIH chest X-ray | 415KB | 256 bytes | 1,618x |
| AV sensor frame | nuScenes multi-sensor | 2.6MB | 256 bytes | 6,366x |
| O&G well log | Norwegian LAS file | 5.2MB | 256 bytes | 20,448x |
| Video clip | UCF-101 action clip | 478KB | 256 bytes | 1,868x |
| Financial data | S&P 500 OHLCV 5yr | 2.9MB | 256 bytes | 11,153x |
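As a quick check of the arithmetic, the ratio defined above can be computed directly. The function names below are illustrative, not part of any RNDA interface; the only fact taken from the text is the fixed 256-byte signature size.

```python
SIGNATURE_BYTES = 256  # fixed signature size stated in the text


def compression_ratio(raw_bytes: int) -> float:
    """Compression ratio = raw input size in bytes / 256."""
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    return raw_bytes / SIGNATURE_BYTES


def implied_raw_bytes(ratio: float) -> float:
    """Invert the formula: the raw size in bytes that a reported ratio implies."""
    return ratio * SIGNATURE_BYTES


print(compression_ratio(1_000_000))  # 3906.25
print(implied_raw_bytes(140_835))    # 36053760
```

The second call shows that the 140,835x figure reported for the genomic FASTQ row corresponds to a raw file of about 36 MB.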
Domain-Specific Encoding
RNDA uses domain-appropriate semantic encoders to extract meaningful structure from each data type before signature generation. Different domains require different encoders — a genomic sequence and an audio recording carry meaning in fundamentally different ways.
The RNDA signature format is encoder-agnostic: adding support for a new data type requires training a new encoder, but does not require reindexing existing signatures. All 30+ proven domains share the same 256-byte signature store and query interface.
Discrimination Gap
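The discrimination gap measures the semantic separability of the signature space for a given domain. It is the difference between self-similarity and mean cross-category similarity:
Gap = self-similarity − mean cross-category similarity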
Self-similarity (same input vs itself) = 1.0 by definition. Mean cross-category similarity measures how different unrelated inputs appear in signature space. A gap of 0.9 means same-category signatures are near 1.0 while different-category signatures are near 0.1.
Gaps above 1.0 occur when mean cross-category similarity is negative, meaning that different-category signatures lie on opposite sides of the embedding space. The separation then exceeds what unrelated (zero-similarity) signatures would give. This is a property of the encoder, not a measurement error.
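A minimal sketch of the gap computation, assuming cosine similarity over real-valued signature vectors. The source does not name the similarity metric, so `cosine`, `discrimination_gap`, and the toy vectors are hypothetical and serve only to show how a gap of 1.0 or above can arise.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def discrimination_gap(groups):
    """Gap = self-similarity (1.0) minus mean similarity across different categories."""
    cross = [
        cosine(a, b)
        for (_, vecs_a), (_, vecs_b) in combinations(groups.items(), 2)
        for a in vecs_a
        for b in vecs_b
    ]
    if not cross:
        raise ValueError("need at least two non-empty categories")
    return 1.0 - sum(cross) / len(cross)


# Toy vectors, not real signatures.
orthogonal = {"a": [[1.0, 0.0]], "b": [[0.0, 1.0]]}
opposed = {"a": [[1.0, 0.0]], "b": [[-1.0, 0.0]]}
print(discrimination_gap(orthogonal))  # 1.0
print(discrimination_gap(opposed))     # 2.0
```

Under this assumed metric, unrelated (orthogonal) categories give a gap of exactly 1.0, and any gap above 1.0 requires negative mean cross-category similarity.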
Empirical Results — 30+ Domains
All results measured end-to-end through the full encode → discard → query pipeline. All datasets are publicly available and verifiable. No synthetic data used in any POC.
| Domain | Gap | Compression | Dataset |
|---|---|---|---|
| AV Multi-Sensor Fusion | 1.092 | 6,366x | nuScenes — 300 synchronized frames |
| Network Traffic | 1.049 | 169x | Real PCAP captures — 4 protocol types |
| Climate & Weather | 1.056 | 23x | 76 global stations, full year |
| Supply Chain IoT | 1.027 | 991x | NAB industrial sensor benchmark |
| RF Signals | 1.009 | 5.8x | RadioML — 480,000 IQ samples, 7 modulations |
| Video | 0.998 | 1,868x | UCF-101 — 175 clips, 35 action categories |
| Financial Markets | 0.986 | 11,153x | S&P 500 OHLCV — 50 tickers, 5 years |
| Genomic FASTQ | 0.962 | 140,835x | NCBI — 28 files across 6 species |
| 3D Brain MRI | 0.963 | 4,305x | Real T1/fMRI clinical scans — 405 scans |
| Audio | 0.952 | 547x | Real FLAC recordings — 2,000 files |
| DNA / Genomics | 0.939 | 14,774x | NCBI RefSeq — 300 bacterial genomes |
| Satellite Imagery | 0.937 | 192x | EuroSAT — 27,000 Sentinel-2 images |
| Seismic Waveforms | 0.968 | 14.3x | IRIS/EarthScope — 10,000 waveforms |
| Retail / CPG | 0.947 | 21x | UCI Online Retail — 541K transactions |
| Telecom Subscribers | 0.970 | structured | IBM Telco — 7,043 profiles |
| EEG Brain Signals | 0.862 | 7,431x | Clinical EDF recordings — 62 files |
| Medical Imaging | 0.816 | 1,618x | NIH ChestX-ray14 — 2,000 labeled scans |
| O&G Well Logs | 0.731 | 20,448x | FORCE 2020 — 118 Norwegian LAS files |
| Legal Documents | 0.524 | 89x | SEC EDGAR — 29 10-K annual filings |
Full results across all 30+ domains available at rnda.io/poc. All raw data permanently discarded after encoding in every POC.
Patent Claims Summary
US Provisional Patent Application #64/036,090 covers 26 claims across method, system, and apparatus categories. Key independent claims:
API Reference
The RNDA API is available at api.rnda.io. Enterprise access available via key-based authentication. Contact us for enterprise keys.
/api/encode: Encode any data to a 256-byte signature. Raw data permanently discarded.
{ "text": "string", "label": "string (optional)" }
/api/query: Query the signature store, return most similar records by semantic similarity.
{ "query": "string", "contexts": ["general"] }
/api/encode/batch: Batch encode multiple records in a single request.
{ "records": [{ "text": "string", "label": "string" }] }
/api/health: Health check and signature count.
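The request bodies listed above could be assembled along these lines. This is a sketch under stated assumptions: the source gives the host, the paths, and the JSON shapes, but not the HTTP method, the URL scheme, or the authentication header, so `POST`, `https`, and the `Authorization: Bearer` header are guesses to be checked against the actual API documentation. The `build_request` helper is hypothetical, and the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.rnda.io"  # host from the text; the https scheme is an assumption


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON request for one of the documented paths."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",  # assumption: the source does not state the HTTP method
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: header format is not documented
        },
    )


encode_req = build_request(
    "/api/encode", {"text": "example record", "label": "demo"}, "YOUR_KEY"
)
query_req = build_request(
    "/api/query", {"query": "example", "contexts": ["general"]}, "YOUR_KEY"
)
# To send a request: urllib.request.urlopen(encode_req)
```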
Licensing & Integration
RNDA is available for enterprise licensing. Each deployment is a dedicated instance scoped to your data type. POC required before licensing — contact us to discuss.
US Patent Application #64/036,090 | Priority Date April 11, 2026 | ZiggyTech Ventures