PATENT PENDING — US #64/036,090 · US #64/038,577

Technical Overview

RNDA is a data architecture protocol, not a product. This page covers the mathematical foundations, empirical benchmarks across 30+ data types, and architectural claims. It is intended for engineers and researchers evaluating RNDA for licensing or integration.

Core Architectural Principle

"Uncompressed data is not a state that exists in the RNDA system — not during storage, transit, computation, or output. Input data is encoded to a compact binary signature, the raw data is permanently discarded, and contextually appropriate outputs are reconstructed on demand from signature similarity. Reconstruction is generative, not retrieval."

This inverts a basic assumption of conventional data systems, which treat the uncompressed form as the canonical source of truth. RNDA has no canonical uncompressed form: the signature store is the source of truth, and all outputs are reconstructed derivatives. Raw data cannot be retrieved because it no longer exists after encoding.

Mathematical Foundation

Signature Properties

Input data of arbitrary size and type is encoded to a fixed 256-byte binary signature. The encoding satisfies three properties:

Deterministic: The same input always produces the same signature.
Similarity-preserving: Semantically similar inputs produce similar signatures.
One-way: Given a signature, recovering the original input is computationally infeasible, with approximately 10⁷² possible inputs per signature.
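
The first two properties can be illustrated with a toy encoder. The sketch below uses SimHash over whitespace tokens as a stand-in; it is not the RNDA encoder, and the names `toy_signature` and `bit_similarity` are hypothetical. It only shows what a deterministic, similarity-preserving, fixed 256-byte signature can look like, and it makes no claim about the one-way property.

```python
import hashlib

SIGNATURE_BYTES = 256                 # signature size stated in the text
SIGNATURE_BITS = SIGNATURE_BYTES * 8  # 2048 bits


def toy_signature(text: str) -> bytes:
    """SimHash-style stand-in encoder (an assumption, not RNDA's method)."""
    counts = [0] * SIGNATURE_BITS
    for token in text.lower().split():
        # Expand each token to 2048 pseudo-random bits.
        digest = hashlib.shake_256(token.encode("utf-8")).digest(SIGNATURE_BYTES)
        bits = int.from_bytes(digest, "big")
        for i in range(SIGNATURE_BITS):
            counts[i] += 1 if (bits >> i) & 1 else -1
    # Each output bit is the majority vote of the token bits at that position.
    value = 0
    for i, count in enumerate(counts):
        if count > 0:
            value |= 1 << i
    return value.to_bytes(SIGNATURE_BYTES, "big")


def bit_similarity(a: bytes, b: bytes) -> float:
    """Fraction of matching bits between two signatures, in [0, 1]."""
    differing = bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")
    return 1.0 - differing / (len(a) * 8)


base = toy_signature("the quick brown fox jumps over the lazy dog")
near = toy_signature("the quick brown fox leaps over the lazy dog")
far = toy_signature("quarterly revenue exceeded analyst forecasts")
print(len(base), bit_similarity(base, near), bit_similarity(base, far))
```

Inputs of any length map to 256 bytes, repeated calls return identical bytes, and texts sharing most tokens score higher than unrelated texts. A production encoder would need to capture semantics rather than token overlap.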

Encoding Flow

Raw input (any size, any type) → domain encoder → 256-byte signature → raw input permanently discarded
Query → signature similarity search → contextual reconstruction (not retrieval)
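
The two flows above can be sketched as a small in-memory store. This is a minimal illustration under stated assumptions: `SignatureStore`, `hamming_similarity`, and `placeholder_encoder` are hypothetical names, the similarity measure is assumed to be bitwise agreement, and the placeholder encoder is a plain hash that is deterministic but not similarity-preserving. The reconstruction step is not specified in this overview, so the sketch stops at returning the best-matching label and its similarity.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Encoder = Callable[[bytes], bytes]


def hamming_similarity(a: bytes, b: bytes) -> float:
    """Fraction of matching bits between two equal-length signatures."""
    diff = bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")
    return 1.0 - diff / (len(a) * 8)


@dataclass
class SignatureStore:
    encoder: Encoder
    records: List[Tuple[bytes, str]] = field(default_factory=list)

    def encode(self, raw: bytes, label: str) -> bytes:
        signature = self.encoder(raw)
        if len(signature) != 256:
            raise ValueError("encoder must return a 256-byte signature")
        # Only the signature and label are kept; `raw` is never stored.
        self.records.append((signature, label))
        return signature

    def query(self, raw_query: bytes) -> Optional[Tuple[str, float]]:
        """Return (label, similarity) of the closest stored signature."""
        if not self.records:
            return None
        q = self.encoder(raw_query)
        sig, label = max(self.records, key=lambda r: hamming_similarity(q, r[0]))
        return label, hamming_similarity(q, sig)


def placeholder_encoder(raw: bytes) -> bytes:
    # Stand-in only: a real domain encoder would preserve semantic similarity.
    return hashlib.shake_256(raw).digest(256)


store = SignatureStore(placeholder_encoder)
store.encode(b"ACGTACGT", "genomic sample")
store.encode(b"\x00\x01\x02\x03", "sensor frame")
print(store.query(b"ACGTACGT"))
```

Injecting the encoder keeps the store independent of any one data type, which matches the flow's separation of domain encoder and signature store.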

Compression Ratio

Compression ratio = raw input size in bytes ÷ 256. The signature is always 256 bytes regardless of input size. A 36MB genomic FASTQ file and a 2.6MB LiDAR point cloud both encode to the same 256-byte output.

| Raw Input | Example | Raw Size | Signature | Compression |
|---|---|---|---|---|
| Genomic FASTQ | NCBI sequence file | 36MB | 256 bytes | 140,835x |
| Medical image | NIH chest X-ray | 415KB | 256 bytes | 1,618x |
| AV sensor frame | nuScenes multi-sensor | 2.6MB | 256 bytes | 6,366x |
| O&G well log | Norwegian LAS file | 5.2MB | 256 bytes | 20,448x |
| Video clip | UCF-101 action clip | 478KB | 256 bytes | 1,868x |
| Financial data | S&P 500 OHLCV 5yr | 2.9MB | 256 bytes | 11,153x |
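
The ratio formula can be checked with a few lines of arithmetic. The helper name `compression_ratio` is hypothetical, and the example assumes "478KB" means 478,000 bytes; the listed sizes are rounded, so a computed ratio will only approximate the listed figure.

```python
SIGNATURE_BYTES = 256  # fixed signature size stated in the text


def compression_ratio(raw_size_bytes: int) -> float:
    """Compression ratio = raw input size in bytes / 256."""
    if raw_size_bytes <= 0:
        raise ValueError("raw size must be positive")
    return raw_size_bytes / SIGNATURE_BYTES


# A 478KB video clip, taken here as 478,000 bytes (assumed unit).
print(compression_ratio(478_000))  # 1867.1875
```

That lands close to the 1,868x listed for the UCF-101 clip; the small difference is consistent with the file size being rounded to the nearest kilobyte.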

Domain-Specific Encoding

RNDA uses domain-appropriate semantic encoders to extract meaningful structure from each data type before signature generation. Different domains require different encoders — a genomic sequence and an audio recording carry meaning in fundamentally different ways.

The RNDA signature format is encoder-agnostic: adding support for a new data type requires training a new encoder, but does not require reindexing existing signatures. All 30+ proven domains share the same 256-byte signature store and query interface.

Genomic / DNA: Biological sequence semantics
Medical Imaging: Clinical visual features
Financial Markets: Temporal pattern structure
Autonomous Vehicles: Multi-sensor spatial fusion
Audio / Video: Temporal-spectral features
Network Traffic: Protocol flow patterns
RF Signals: IQ modulation characteristics
Seismic / Well Logs: Geophysical waveform structure
Text / Legal: Semantic language features
21+ more domains: Retail, climate, BCI, pharma…
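
One way to picture the encoder-agnostic design is a registry that maps a domain name to an encoder, with every encoder required to emit the same 256-byte format. The sketch below is an assumption about how such a registry could look, not RNDA's implementation; `register_encoder`, `encode`, and `placeholder_encoder` are hypothetical names, and the placeholder is a domain-salted hash standing in for a trained encoder.

```python
import hashlib
from typing import Callable, Dict

SIGNATURE_BYTES = 256
Encoder = Callable[[bytes], bytes]

_ENCODERS: Dict[str, Encoder] = {}


def register_encoder(domain: str, encoder: Encoder) -> None:
    """Add support for a domain without touching existing signatures."""
    _ENCODERS[domain] = encoder


def encode(domain: str, raw: bytes) -> bytes:
    try:
        encoder = _ENCODERS[domain]
    except KeyError:
        raise ValueError(f"no encoder registered for domain {domain!r}") from None
    signature = encoder(raw)
    if len(signature) != SIGNATURE_BYTES:
        raise ValueError("encoders must emit 256-byte signatures")
    return signature


def placeholder_encoder(domain: str) -> Encoder:
    # Stand-in for a trained domain encoder (not similarity-preserving).
    salt = domain.encode("utf-8") + b"\x00"
    return lambda raw: hashlib.shake_256(salt + raw).digest(SIGNATURE_BYTES)


register_encoder("genomic", placeholder_encoder("genomic"))
register_encoder("audio", placeholder_encoder("audio"))

before = encode("genomic", b"ACGT")
register_encoder("text", placeholder_encoder("text"))  # new domain added later
print(encode("genomic", b"ACGT") == before)  # existing signatures are unaffected
```

Because the store only ever sees 256-byte values, registering a new domain changes nothing about signatures already produced by other encoders.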

Discrimination Gap

The discrimination gap measures the semantic separability of the signature space for a given domain:

gap = self_similarity − mean_cross_category_similarity

Self-similarity (same input vs itself) = 1.0 by definition. Mean cross-category similarity measures how different unrelated inputs appear in signature space. A gap of 0.9 means same-category signatures are near 1.0 while different-category signatures are near 0.1.

Gaps above 1.0 occur when the mean cross-category similarity is negative, which requires a similarity measure that can take negative values. A negative mean indicates that different-category signatures lie on opposite sides of the embedding space. This is a property of the encoder, not a measurement error.

> 1.0: Maximum separation
0.9 – 1.0: Elite tier
0.7 – 0.9: Enterprise-grade
0.5 – 0.7: Functional
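
The gap formula and tiers can be sketched in code. The overview does not name the similarity measure, so cosine similarity over real-valued embeddings is assumed here because it can go negative, as gaps above 1.0 require. The function names and the handling of tier boundaries (a value of exactly 0.9 counts as Elite tier, for example) are also assumptions.

```python
import math
from itertools import combinations
from statistics import mean
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / norm


def discrimination_gap(by_category: Dict[str, List[Vector]]) -> float:
    """gap = self_similarity - mean_cross_category_similarity, with self = 1.0."""
    if len(by_category) < 2:
        raise ValueError("need at least two categories")
    cross = [
        cosine_similarity(u, v)
        for us, vs in combinations(by_category.values(), 2)
        for u in us
        for v in vs
    ]
    return 1.0 - mean(cross)


def tier(gap: float) -> str:
    if gap > 1.0:
        return "Maximum separation"
    if gap >= 0.9:
        return "Elite tier"
    if gap >= 0.7:
        return "Enterprise-grade"
    if gap >= 0.5:
        return "Functional"
    return "Below listed tiers"


orthogonal = {"a": [[1.0, 0.0]], "b": [[0.0, 1.0]]}
opposed = {"a": [[1.0, 0.0]], "b": [[-1.0, 0.0]]}
print(discrimination_gap(orthogonal), tier(discrimination_gap(orthogonal)))
print(discrimination_gap(opposed), tier(discrimination_gap(opposed)))
```

Orthogonal categories give a gap of exactly 1.0, and the gap only exceeds 1.0 once cross-category vectors point in opposing directions.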

Empirical Results — 30+ Domains

All results were measured end-to-end through the full encode → discard → query pipeline. All datasets are publicly available and verifiable. No synthetic data was used in any proof of concept (POC).

| Domain | Gap | Compression | Dataset |
|---|---|---|---|
| AV Multi-Sensor Fusion | 1.092 | 6,366x | nuScenes — 300 synchronized frames |
| Network Traffic | 1.049 | 169x | Real PCAP captures — 4 protocol types |
| Climate & Weather | 1.056 | 23x | 76 global stations, full year |
| Supply Chain IoT | 1.027 | 991x | NAB industrial sensor benchmark |
| RF Signals | 1.009 | 5.8x | RadioML — 480,000 IQ samples, 7 modulations |
| Video | 0.998 | 1,868x | UCF-101 — 175 clips, 35 action categories |
| Financial Markets | 0.986 | 11,153x | S&P 500 OHLCV — 50 tickers, 5 years |
| Genomic FASTQ | 0.962 | 140,835x | NCBI — 28 files across 6 species |
| 3D Brain MRI | 0.963 | 4,305x | Real T1/fMRI clinical scans — 405 scans |
| Audio | 0.952 | 547x | Real FLAC recordings — 2,000 files |
| DNA / Genomics | 0.939 | 14,774x | NCBI RefSeq — 300 bacterial genomes |
| Satellite Imagery | 0.937 | 192x | EuroSAT — 27,000 Sentinel-2 images |
| Seismic Waveforms | 0.968 | 14.3x | IRIS/EarthScope — 10,000 waveforms |
| Retail / CPG | 0.947 | 21x | UCI Online Retail — 541K transactions |
| Telecom Subscribers | 0.970 | structured | IBM Telco — 7,043 profiles |
| EEG Brain Signals | 0.862 | 7,431x | Clinical EDF recordings — 62 files |
| Medical Imaging | 0.816 | 1,618x | NIH ChestX-ray14 — 2,000 labeled scans |
| O&G Well Logs | 0.731 | 20,448x | FORCE 2020 — 118 Norwegian LAS files |
| Legal Documents | 0.524 | 89x | SEC EDGAR — 29 10-K annual filings |

Full results across all 30+ domains are available at rnda.io/poc. All raw data was permanently discarded after encoding in every POC.

Patent Claims Summary

US Provisional Patent Application #64/036,090 covers 26 claims across method, system, and apparatus categories. Key independent claims:

Claim 1 (Core Method): A method wherein input data is encoded to compact signatures, raw data is permanently discarded, and contextually appropriate outputs are generated from signature similarity without retrieving the input data.
Claim 2 (System): A system comprising an encoding module that discards input data after encoding, a signature store containing no raw data, and a reconstruction engine generating different outputs per query context.
Claim 5 (Context-Dependent Reconstruction): A method wherein the same signature store produces different valid outputs for different context parameters, and no output constitutes retrieval of original input data.
Claim 22 (AI Infrastructure Cost Reduction): A system reducing AI infrastructure costs by at least 10x by storing only compact signatures of training data, model weights, and user data without retaining uncompressed forms.
Claim 23 (Behavioral Identity Preservation): A method enabling interaction with a human behavioral identity model after the subject is no longer able to participate, without any stored record of original behavioral events.

API Reference

The RNDA API is available at api.rnda.io. Enterprise access uses key-based authentication; contact us for enterprise keys.

POST /api/encode

Encode any data to a 256-byte signature. The raw data is permanently discarded.

{ "text": "string", "label": "string (optional)" }

POST /api/query

Query the signature store and return the most similar records by semantic similarity.

{ "query": "string", "contexts": ["general"] }

POST /api/encode/batch

Batch encode multiple records in a single request.

{ "records": [{ "text": "string", "label": "string" }] }

GET /api/health

Health check and signature count.
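
The request bodies above can be assembled with the Python standard library. This is a hedged sketch: the host and paths come from this page, but the `https` scheme, the `Authorization: Bearer` header, and the helper name `build_post` are assumptions, since the page says only that access is key-based. The requests are built but not sent, and no response format is assumed.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://api.rnda.io"  # host from the text; the https scheme is assumed


def build_post(path: str, payload: Dict[str, Any],
               api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an RNDA endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Assumption: the header name is not documented on this page, so a
        # Bearer token is used as a placeholder.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


encode_req = build_post("/api/encode", {"text": "hello world", "label": "demo"})
query_req = build_post("/api/query", {"query": "hello", "contexts": ["general"]})
batch_req = build_post(
    "/api/encode/batch",
    {"records": [{"text": "first", "label": "a"}, {"text": "second", "label": "b"}]},
)

# Sending is left to the caller, e.g. urllib.request.urlopen(encode_req).
print(encode_req.get_method(), encode_req.full_url)
```

Keeping request construction separate from sending makes the payload shapes easy to inspect before any enterprise key is involved.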

Licensing & Integration

RNDA is available for enterprise licensing. Each deployment is a dedicated instance scoped to your data type. A POC is required before licensing; contact us to discuss.

US Patent Application #64/036,090 | Priority Date April 11, 2026 | ZiggyTech Ventures