PATENT PENDING — US #64/036,090 · US #64/038,577

Technical Overview

RNDA is a data architecture protocol, not a product. This page covers the mathematical foundations, empirical benchmarks across 30+ data types, and architectural claims. It is written for engineers and researchers evaluating RNDA for licensing or integration.

Core Architectural Principle

"Uncompressed data is not a state that exists in the RNDA system — not during storage, transit, computation, or output. Input data is encoded to a compact binary signature, the raw data is permanently discarded, and contextually appropriate outputs are reconstructed on demand from signature similarity. Reconstruction is generative, not retrieval."

This inverts a fundamental assumption of conventional data systems, which treat the uncompressed form as the canonical source of truth. RNDA has no canonical uncompressed form: the signature store is the source of truth, and all outputs are reconstructed derivatives. Raw data cannot be retrieved because it does not exist.

Mathematical Foundation

Signature Properties

Input data of arbitrary size and type is encoded to a fixed 256-byte binary signature. The encoding satisfies three properties:

Deterministic: the same input always produces the same signature.
Similarity-preserving: semantically similar inputs produce similar signatures.
One-way: given a signature, recovering the original input is computationally infeasible, with approximately 10⁷² possible inputs per signature.

Encoding Flow

Raw input (any size, any type) → domain encoder → 256-byte signature → raw input permanently discarded
Query → signature similarity search → contextual reconstruction (not retrieval)
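The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RNDA encoder: `toy_encode` is a hypothetical stand-in that uses SimHash over word tokens to produce a deterministic, similarity-preserving 256-byte signature, and `SignatureStore`, `similarity`, and the Hamming-based score are names and choices made up for this example.

```python
import hashlib

SIGNATURE_BYTES = 256
SIGNATURE_BITS = SIGNATURE_BYTES * 8  # 2048 bits


def toy_encode(text: str) -> bytes:
    """Hypothetical stand-in encoder (SimHash over word tokens), NOT the RNDA encoder."""
    counts = [0] * SIGNATURE_BITS
    for token in text.lower().split():
        # SHAKE-256 yields a deterministic 2048-bit pattern for each token.
        digest = hashlib.shake_256(token.encode("utf-8")).digest(SIGNATURE_BYTES)
        bits = int.from_bytes(digest, "big")
        for i in range(SIGNATURE_BITS):
            counts[i] += 1 if (bits >> i) & 1 else -1
    value = 0
    for i, count in enumerate(counts):
        if count > 0:  # majority vote per bit position
            value |= 1 << i
    return value.to_bytes(SIGNATURE_BYTES, "big")


def similarity(a: bytes, b: bytes) -> float:
    """Map Hamming distance to [-1, 1]; identical signatures score 1.0."""
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    hamming = bin(diff).count("1")
    return 1.0 - 2.0 * hamming / SIGNATURE_BITS


class SignatureStore:
    """Holds (label, signature) pairs only; the raw input is never stored."""

    def __init__(self):
        self._records = []

    def encode(self, text: str, label: str) -> None:
        self._records.append((label, toy_encode(text)))
        # No reference to `text` is kept after this point.

    def query(self, text: str, top_k: int = 1):
        q = toy_encode(text)
        scored = [(label, similarity(q, sig)) for label, sig in self._records]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


store = SignatureStore()
store.encode("the patient chest x-ray shows clear lungs", "medical")
store.encode("quarterly revenue and stock price closing volume", "finance")
best_label, best_score = store.query("chest x-ray of the patient lungs")[0]
```

The sketch stops at the similarity search. The contextual reconstruction step is not specified on this page, so it is left out rather than guessed.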

Compression Ratio

Compression ratio = raw input size in bytes ÷ 256. The signature is always 256 bytes regardless of input size. A 36MB genomic FASTQ file and a 2.6MB LiDAR point cloud each encode to a signature of the same 256-byte size; the signatures themselves differ.

Raw Input	Example	Raw Size	Signature	Compression
Genomic FASTQ	NCBI sequence file	36MB	256 bytes	140,835x
Medical image	NIH chest X-ray	415KB	256 bytes	1,618x
AV sensor frame	nuScenes multi-sensor	2.6MB	256 bytes	6,366x
O&G well log	Norwegian LAS file	5.2MB	256 bytes	20,448x
Video clip	UCF-101 action clip	478KB	256 bytes	1,868x
Financial data	S&P 500 OHLCV 5yr	2.9MB	256 bytes	11,153x
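The ratio in the table is plain division by the fixed signature size. A minimal sketch follows; the function names are illustrative, and the byte count in the usage line is an assumed exact size chosen to match the rounded 478KB figure, since the page gives raw sizes only to two or three significant digits.

```python
SIGNATURE_BYTES = 256  # fixed signature size stated on this page


def compression_ratio(raw_size_bytes: int) -> float:
    """Raw input size in bytes divided by the 256-byte signature size."""
    if raw_size_bytes <= 0:
        raise ValueError("raw_size_bytes must be positive")
    return raw_size_bytes / SIGNATURE_BYTES


def format_ratio(ratio: float) -> str:
    """Format a ratio in the style used by the table, e.g. '1,868x'."""
    return f"{ratio:,.0f}x"


# Assumed exact size for a clip listed as roughly 478KB.
example = format_ratio(compression_ratio(478_208))
```

Because the denominator is constant, the ratio grows linearly with input size and says nothing by itself about how much of the input's information the signature retains.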

Domain-Specific Encoding

RNDA uses domain-appropriate semantic encoders to extract meaningful structure from each data type before signature generation. Different domains require different encoders — a genomic sequence and an audio recording carry meaning in fundamentally different ways.

The RNDA signature format is encoder-agnostic: adding support for a new data type requires training a new encoder, but does not require reindexing existing signatures. All 30+ proven domains share the same 256-byte signature store and query interface.

Genomic / DNA: biological sequence semantics
Medical Imaging: clinical visual features
Financial Markets: temporal pattern structure
Autonomous Vehicles: multi-sensor spatial fusion
Audio / Video: temporal-spectral features
Network Traffic: protocol flow patterns
RF Signals: IQ modulation characteristics
Seismic / Well Logs: geophysical waveform structure
Text / Legal: semantic language features
21+ more domains: retail, climate, BCI, pharma…
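One way to picture the encoder-agnostic design is a registry that maps each domain to its own encoder while every encoder writes to a single shared store. The sketch below is an assumption-laden illustration: `EncoderRegistry` and `placeholder_encoder` are hypothetical names, and the placeholder is a plain hash that is deterministic and fixed-size but not similarity-preserving, standing in for a trained domain encoder.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

SIGNATURE_BYTES = 256
Encoder = Callable[[bytes], bytes]


class EncoderRegistry:
    """Hypothetical registry: one encoder per domain, one shared signature store."""

    def __init__(self) -> None:
        self._encoders: Dict[str, Encoder] = {}
        self.store: List[Tuple[str, bytes]] = []  # (domain, signature)

    def register(self, domain: str, encoder: Encoder) -> None:
        # Adding a domain only adds an entry; existing signatures are untouched.
        self._encoders[domain] = encoder

    def encode(self, domain: str, raw: bytes) -> bytes:
        signature = self._encoders[domain](raw)
        if len(signature) != SIGNATURE_BYTES:
            raise ValueError(f"{domain} encoder must emit {SIGNATURE_BYTES} bytes")
        self.store.append((domain, signature))
        return signature


def placeholder_encoder(salt: bytes) -> Encoder:
    """Placeholder only: a salted hash, NOT a similarity-preserving encoder."""
    def encode(raw: bytes) -> bytes:
        return hashlib.shake_256(salt + raw).digest(SIGNATURE_BYTES)
    return encode


registry = EncoderRegistry()
registry.register("genomic", placeholder_encoder(b"genomic"))
registry.encode("genomic", b"ACGTACGT")

before = list(registry.store)
registry.register("audio", placeholder_encoder(b"audio"))  # new domain added later
after = list(registry.store)
```

The length check is the only contract the store imposes, which is what lets a new encoder be added without reindexing.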

Discrimination Gap

The discrimination gap measures the semantic separability of the signature space for a given domain:

gap = self_similarity − mean_cross_category_similarity

Self-similarity (an input compared with itself) is 1.0 by definition. Mean cross-category similarity is the average similarity between signatures of inputs from different categories; the lower it is, the further apart unrelated inputs sit in signature space. A gap of 0.9 therefore corresponds to a mean cross-category similarity of 0.1.

Gaps above 1.0 occur when mean cross-category similarity is negative, which indicates that different-category signatures lie on opposite sides of the embedding space. A gap of exactly 1.0 corresponds to a mean cross-category similarity of 0; a gap above 1.0 goes beyond that. Such values are a property of the encoder, not a measurement error.
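The gap formula can be worked through on toy vectors. The sketch below assumes cosine similarity as the similarity measure, which this page does not specify; it is used here because it ranges over [-1, 1] and so can produce the negative cross-category values that give gaps above 1.0. The function names and the two-dimensional vectors are illustrative only.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def discrimination_gap(items):
    """items: list of (category, vector). Returns 1.0 minus mean cross-category similarity."""
    cross = [
        cosine(u, v)
        for (cat_u, u), (cat_v, v) in combinations(items, 2)
        if cat_u != cat_v
    ]
    if not cross:
        raise ValueError("need at least two categories")
    self_similarity = 1.0  # an input compared with itself
    return self_similarity - sum(cross) / len(cross)


orthogonal = discrimination_gap([("a", [1.0, 0.0]), ("b", [0.0, 1.0])])
opposed = discrimination_gap([("a", [1.0, 0.0]), ("b", [-1.0, 0.0])])
```

With orthogonal categories the mean cross-category similarity is 0 and the gap is 1.0; with opposed categories it is -1 and the gap reaches 2.0, the maximum under this measure.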

> 1.0

Maximum separation

0.9 – 1.0

Elite tier

0.7 – 0.9

Enterprise-grade

0.5 – 0.7

Functional
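The tiers above translate into a simple threshold lookup. This sketch makes two assumptions the page leaves open: each band includes its lower bound (so exactly 1.0 falls in the elite tier), and gaps below 0.5 have no named tier and return `None`.

```python
from typing import Optional


def gap_tier(gap: float) -> Optional[str]:
    """Map a discrimination gap to the tier names listed above."""
    if gap > 1.0:
        return "Maximum separation"
    if gap >= 0.9:  # assumed: lower bound inclusive
        return "Elite tier"
    if gap >= 0.7:
        return "Enterprise-grade"
    if gap >= 0.5:
        return "Functional"
    return None  # no tier is named below 0.5


tiers = {gap: gap_tier(gap) for gap in (1.092, 0.962, 0.816, 0.524)}
```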

Empirical Results — 30+ Domains

All results were measured end-to-end through the full encode → discard → query pipeline. All datasets are publicly available and verifiable. No synthetic data was used in any POC.

Domain	Gap	Compression	Dataset
AV Multi-Sensor Fusion	1.092	6,366x	nuScenes — 300 synchronized frames
Network Traffic	1.049	169x	Real PCAP captures — 4 protocol types
Climate & Weather	1.056	23x	76 global stations, full year
Supply Chain IoT	1.027	991x	NAB industrial sensor benchmark
RF Signals	1.009	5.8x	RadioML — 480,000 IQ samples, 7 modulations
Video	0.998	1,868x	UCF-101 — 175 clips, 35 action categories
Financial Markets	0.986	11,153x	S&P 500 OHLCV — 50 tickers, 5 years
Genomic FASTQ	0.962	140,835x	NCBI — 28 files across 6 species
3D Brain MRI	0.963	4,305x	Real T1/fMRI clinical scans — 405 scans
Audio	0.952	547x	Real FLAC recordings — 2,000 files
DNA / Genomics	0.939	14,774x	NCBI RefSeq — 300 bacterial genomes
Satellite Imagery	0.937	192x	EuroSAT — 27,000 Sentinel-2 images
Seismic Waveforms	0.968	14.3x	IRIS/EarthScope — 10,000 waveforms
Retail / CPG	0.947	21x	UCI Online Retail — 541K transactions
Telecom Subscribers	0.970	structured	IBM Telco — 7,043 profiles
EEG Brain Signals	0.862	7,431x	Clinical EDF recordings — 62 files
Medical Imaging	0.816	1,618x	NIH ChestX-ray14 — 2,000 labeled scans
O&G Well Logs	0.731	20,448x	FORCE 2020 — 118 Norwegian LAS files
Legal Documents	0.524	89x	SEC EDGAR — 29 10-K annual filings

Full results across all 30+ domains are available at rnda.io/poc. In every POC, all raw data was permanently discarded after encoding.

Patent Claims Summary

US Provisional Patent Application #64/036,090 covers 26 claims across method, system, and apparatus categories. Key independent claims:

Claim 1

Core Method

A method wherein input data is encoded to compact signatures, raw data is permanently discarded, and contextually appropriate outputs are generated from signature similarity without retrieving the input data.

Claim 2

System

A system comprising an encoding module that discards input data after encoding, a signature store containing no raw data, and a reconstruction engine generating different outputs per query context.

Claim 5

Context-Dependent Reconstruction

A method wherein the same signature store produces different valid outputs for different context parameters, and no output constitutes retrieval of original input data.

Claim 22

AI Infrastructure Cost Reduction

A system reducing AI infrastructure costs by at least 10x by storing only compact signatures of training data, model weights, and user data without retaining uncompressed forms.

Claim 23

Behavioral Identity Preservation

A method enabling interaction with a human behavioral identity model after the subject is no longer able to participate, without any stored record of original behavioral events.

API Reference

The RNDA API is available at api.rnda.io. Enterprise access uses key-based authentication; contact us for enterprise keys.

POST /api/encode

Encode any data to a 256-byte signature. The raw data is permanently discarded.

{ "text": "string", "label": "string (optional)" }

POST /api/query

Query the signature store and return the most similar records, ranked by semantic similarity.

{ "query": "string", "contexts": ["general"] }

POST /api/encode/batch

Batch encode multiple records in a single request.

{ "records": [{ "text": "string", "label": "string" }] }

GET /api/health

Health check and signature count.
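The four endpoints above can be exercised with Python's standard library. The sketch below only builds the requests. Several details are assumptions not stated on this page: the `https` scheme, the `Authorization: Bearer` header for the enterprise key, JSON responses, and the helper names `build_request` and `send`. Paths and request bodies are taken from the listings above, with made-up sample values.

```python
import json
import urllib.request

BASE_URL = "https://api.rnda.io"  # host from this page; https scheme is assumed


def build_request(method, path, payload=None, api_key=None):
    """Build (but do not send) a request for one of the endpoints listed above."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if api_key:
        # Assumed header format; the page only says access is key-based.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request, timeout=10):
    """Send a request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


encode_req = build_request("POST", "/api/encode", {"text": "sample record", "label": "demo"})
query_req = build_request("POST", "/api/query", {"query": "sample", "contexts": ["general"]})
batch_req = build_request(
    "POST",
    "/api/encode/batch",
    {"records": [{"text": "first", "label": "a"}, {"text": "second", "label": "b"}]},
)
health_req = build_request("GET", "/api/health")
```

Calling `send(health_req)` would perform the health check; response fields are not documented here, so the sketch does not parse them further.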

Full API Documentation →

Licensing & Integration

RNDA is available for enterprise licensing. Each deployment is a dedicated instance scoped to your data type. A POC is required before licensing; contact us to discuss.

Contact for Licensing →

US Patent Application #64/036,090 | Priority Date April 11, 2026 | ZiggyTech Ventures