SpadeBrite
DEPL / M-MIC Lightpaper
"A Spatiotemporal Learnability Control Framework for Deepfake Resilience"
| Term | Expansion | Role |
|---|---|---|
| DEPL | Dynamic Entropy Perturbation Layer | Signal and temporal perturbation layer |
| M-MIC | Multi-Modal Entropy Input Core | Entropy and temporal orchestration core that guides DEPL |
Document notice
This Lightpaper is an early-stage technical overview from SpadeBrite LLC. It describes research direction, conceptual architecture, and evaluation intent—not validated product claims or production guarantees. Metrics, thresholds, and adversary models remain under active study.
What we do not claim
The following boundaries apply to this document and to any experimental implementation derived from it.
- We do not claim to make deepfakes impossible.
- We do not claim full protection against all models, pipelines, or future adversaries.
- We do not claim production-grade robustness, certification, or legal suitability yet.
- Our current focus is measurable reduction in learnability—increasing the cost and lowering the reliability of automated identity extraction—while preserving human-usable media.
Overview
SpadeBrite is the company and product suite (Lyre, Phantom, Seneca). DEPL / M-MIC is the underlying research technology: a pre-training defense framework focused on reducing how effectively AI systems can learn identity-bearing patterns from audio and video.
Most public discourse on deepfakes centers on detection after harm is possible. DEPL / M-MIC addresses the complementary question: can we change the data itself so that cloning and imitation become harder before a model is trained or fine-tuned?
The framework operates upstream of model training:
- Input: identity-bearing media (voice, face video, and related modalities over time).
- Process: entropy-guided, content-aware perturbation scheduled across space and time.
- Output: protected media that remains usable to humans but presents a less stable, less compressible target to learning systems.
SpadeBrite products (e.g. Lyre for voice, Phantom for image/video) are intended to apply this research direction in product form. DEPL / M-MIC names the underlying technical stack—not a single shipped feature flag.
Problem statement
Deepfake risk splits into two problem domains:
| Domain | Timing | Typical tools | Limitation |
|---|---|---|---|
| Before the fact | Before training, cloning, or fine-tuning | Learnability control, perturbation, dataset hygiene | Hard to measure; must preserve human quality |
| After the fact | After synthetic media exists | Detection, verification, provenance, takedown | Reactive; identity may already be extracted |
DEPL / M-MIC focuses primarily on before-the-fact learnability control.
Once a high-fidelity clone exists, downstream defenses face an asymmetric game: detectors chase generators, and provenance can be stripped or forged. Reducing extractable identity stability at the source shifts effort earlier—where publishers still control the artifact.
This does not replace detection or provenance. It complements them by lowering the signal quality available to attacker-like learning pipelines in the first place.
Core thesis
Modern AI systems learn by compressing stable, recurring patterns in data: timbre clusters, facial geometry, gesture habits, prosodic rhythm, and cross-modal correlations. Identity is not only what appears in a frame or spectrogram—it is what repeats predictably across samples and time.
DEPL introduces controlled spatiotemporal instability to reduce model learnability while preserving human usability. M-MIC supplies the entropy and scheduling logic that keeps perturbations from collapsing into static, filterable noise.
Canonical formulation:
DEPL / M-MIC is a spatiotemporal learnability control framework that degrades identity stability in data while preserving human usability.
Operational goals (stated modestly):
- Increase representation drift under attacker-like encoders.
- Decrease reliable recovery of speaker/face embeddings from protected samples.
- Maintain perceptual acceptability for human listeners and viewers.
- Increase cost for adversaries (data, compute, tuning) without promising impossibility.
System architecture
Data Transform Pipeline
Input Media (Audio / Video)
            │
            ▼
┌───────────────────────┐
│ Content Analysis Layer│  features, rhythm, identity-bearing regions
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│         M-MIC         │  entropy + temporal orchestration
│ (Entropy Input Core)  │
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│         DEPL          │  signal + temporal perturbation
│ (Perturbation Layer)  │
└───────────┬───────────┘
            ▼
     Protected Media
            │
            ▼
┌───────────────────────┐
│   Attack Simulation   │  attacker-like encoders / clone proxies
│         Layer         │
└───────────┬───────────┘
            ▼
   Learnability Score
            │
            ▼
┌───────────────────────┐
│    Adaptation Loop    │  adjust intensity, targets, schedules
└───────────────────────┘
Layer Responsibilities
Content Analysis Layer
Inspects incoming media to derive context features used by M-MIC and DEPL: modality, segment boundaries, speech vs silence, face track quality, motion density, and coarse rhythm descriptors. This layer does not apply perturbation; it informs where and when perturbation is worth spending perceptual budget.
M-MIC (Multi-Modal Entropy Input Core)
Aggregates entropy from multiple sources and maps it to dynamic perturbation schedules across time and modality. M-MIC is the orchestration brain: it prevents DEPL from repeating the same transform in a way adversaries can learn or average away.
DEPL (Dynamic Entropy Perturbation Layer)
Applies the actual signal and temporal perturbations to audio waveforms, spectral representations, and video frames/sequences. DEPL executes the schedule; it does not choose entropy sources—that is M-MIC’s role.
Attack Simulation Layer
Runs attacker-like probes on protected output: embedding extractors, lightweight clone proxies, temporal consistency checks, and compression survivability tests. This layer estimates whether protection holds under realistic—not hypothetical—pipelines.
Learnability Score
A composite metric (or family of metrics) summarizing how much identity signal remains machine-learnable after protection. Lower scores indicate reduced extractability under the configured attack suite—not “safe forever.”
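As a rough illustration of what a composite could look like, the sketch below takes per-probe scores in [0, 1] and returns their weighted mean. The probe names, the weights, and the weighted-mean form itself are assumptions made for the example; this document does not fix how the score is composed.

```python
def learnability_score(probe_scores: dict[str, float],
                       weights: dict[str, float]) -> float:
    """Weighted mean of per-probe scores in [0, 1].

    Lower means less identity signal was recovered by the configured probes.
    """
    if set(probe_scores) != set(weights):
        raise ValueError("every probe needs exactly one weight")
    for name, value in probe_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(probe_scores[name] * weights[name] for name in probe_scores)
    return weighted / total_weight


# Hypothetical probe names, weights, and values, chosen only for the example.
weights = {"speaker_similarity": 2.0, "temporal_consistency": 1.0}
before = learnability_score(
    {"speaker_similarity": 0.9, "temporal_consistency": 0.8}, weights)
after = learnability_score(
    {"speaker_similarity": 0.6, "temporal_consistency": 0.7}, weights)
print(f"before={before:.3f} after={after:.3f}")
```

Because the result depends on which probes and weights are configured, two scores are only comparable when they come from the same configuration.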
Adaptation Loop
Closes the loop: scores feed back into M-MIC and DEPL parameters (intensity, targeting, scheduling). The loop is offline-first in R&D (batch tuning); online adaptation is a later engineering concern.
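The division of responsibilities above can be sketched as plain function composition. Every name below, including the toy stand-in functions, is hypothetical; the sketch only illustrates the data flow, in which analysis never modifies the media, M-MIC only produces a schedule, DEPL only executes it, and the attack layer only measures.

```python
from typing import Callable, Sequence

Media = Sequence[float]      # stand-in for decoded audio samples
Context = dict               # features from the content analysis layer
Schedule = Sequence[float]   # per-sample intensities chosen by M-MIC


def run_pipeline(media: Media,
                 analyze: Callable[[Media], Context],
                 orchestrate: Callable[[Context], Schedule],
                 perturb: Callable[[Media, Schedule], Media],
                 probe: Callable[[Media, Media], float]) -> tuple[Media, float]:
    context = analyze(media)              # inspect only, no modification
    schedule = orchestrate(context)       # M-MIC: decide when and how strongly
    protected = perturb(media, schedule)  # DEPL: execute the schedule
    measurement = probe(media, protected) # attacker-like probe on the output
    return protected, measurement


# Toy stand-ins so the sketch runs end to end; none of these are real layers.
def toy_analyze(media):
    return {"length": len(media)}


def toy_orchestrate(context):
    return [0.1 if i % 2 else 0.0 for i in range(context["length"])]


def toy_perturb(media, schedule):
    return [x * (1.0 + s) for x, s in zip(media, schedule)]


def toy_probe(original, protected):
    # Mean absolute difference: a crude placeholder, not a learnability score.
    return sum(abs(a - b) for a, b in zip(original, protected)) / len(original)


protected, measurement = run_pipeline(
    [1.0, 1.0, 1.0, 1.0], toy_analyze, toy_orchestrate, toy_perturb, toy_probe)
```

Passing the layers in as callables keeps each one replaceable, which matters when the attack suite or the perturbation family changes between experiments.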
M-MIC: entropy and temporal orchestration
M-MIC is not a single hash or RNG call. It is a minimal entropy stack designed for reproducible experiments and future hardware integration:
| Source | Role |
|---|---|
| Cryptographic entropy | Unpredictable schedule seeds; resists trivial replay |
| Temporal entropy | Time-varying phase offsets; couples perturbation to media timeline |
| Signal-derived entropy | Features from the content itself; ties noise to local structure |
| Optional device/sensor entropy | Future path for capture-time binding (not required for MVP) |
Design principle
Entropy is not used as raw random noise dumped on the signal. It is used to generate dynamic perturbation schedules—when, where, and how strongly DEPL acts.
Why this matters: Static perturbations become a learnable artifact. Adversaries can denoise, fine-tune around, or invert fixed patterns. Entropy keeps the defense non-stationary across files and time.
Entropy principle
Entropy is the mechanism that prevents perturbation patterns from collapsing into predictability.
M-MIC outputs schedules and control parameters consumed by DEPL—when, where, and how strongly to perturb—keyed to content analysis.
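One way to picture the mapping from entropy sources to a schedule is a keyed hash over the segment index and a digest of the segment's content features. The sketch below assumes HMAC-SHA256 as the mixing function, which this document does not specify; `SegmentControl`, `derive_schedule`, and the feature byte strings are hypothetical names for illustration.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentControl:
    index: int        # segment position on the media timeline
    intensity: float  # perturbation strength in [0, max_intensity]
    phase: float      # phase offset in [0, 1)


def _unit_float(chunk: bytes) -> float:
    """Map 8 bytes to a float in [0, 1) using the top 53 bits."""
    return (int.from_bytes(chunk, "big") >> 11) / 2**53


def derive_schedule(seed: bytes, segment_features: list[bytes],
                    max_intensity: float = 0.1) -> list[SegmentControl]:
    """Derive one control record per segment from seed, time, and content."""
    schedule = []
    for i, features in enumerate(segment_features):
        # Temporal entropy: the segment index. Signal-derived entropy: a
        # digest of the content features. Cryptographic entropy: the seed.
        message = i.to_bytes(8, "big") + hashlib.sha256(features).digest()
        digest = hmac.new(seed, message, hashlib.sha256).digest()
        schedule.append(SegmentControl(
            index=i,
            intensity=_unit_float(digest[:8]) * max_intensity,
            phase=_unit_float(digest[8:16]),
        ))
    return schedule


seed = secrets.token_bytes(32)                # fresh seed per file
features = [b"voiced", b"silence", b"voiced"]  # placeholder feature encodings
schedule = derive_schedule(seed, features)
```

With the seed stored, the same schedule can be regenerated for a reproducible experiment; with a fresh seed per file, two files with identical content still receive different schedules.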
DEPL: signal and temporal perturbation
DEPL implements controlled, bounded distortions on identity-bearing structures. Perturbations are designed to be:
- Sparse in perceptual dimensions humans weight heavily.
- Dense in dimensions embedding models exploit.
- Temporally structured so rhythm and continuity are disrupted for machines more than for humans.
Audio (Lyre-aligned)
Primary targets for learnability reduction:
| Target | Rationale |
|---|---|
| Formants | Speaker timbre; stable across utterances |
| Harmonics | Periodic structure exploited by vocoders |
| Phase relationships | Often ignored by listeners; used in analysis |
| Pitch stability | F0 tracks feed speaker embeddings |
| Cadence | Macro timing of speech units |
| Prosody | Stress, intonation, emotional contour |
| Speaker embedding stability | Direct objective for clone pipelines |
Video (Phantom-aligned)
| Target | Rationale |
|---|---|
| Facial landmarks | Geometry for face swap and reenactment |
| Texture consistency | Skin micro-texture in generative models |
| Motion continuity | Optical-flow–friendly identity cues |
| Blink timing | Subtle biometric rhythm |
| Mouth movement timing | Lip-sync and talking-head models |
| Expression transitions | Dynamics between neutral and expressive states |
| Identity embedding stability | Face encoder invariance targets |
DEPL may operate in multiple signal domains depending on experiment stage. This document does not prescribe a specific transform or pipeline—only the intent: degrade stability where models compress identity.
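Since no transform is prescribed, the following is only a stand-in showing what "bounded" and "scheduled" can mean in code: a slow gain wobble whose depth and phase are set per segment. It does not model formants, phase relationships, or any other target in the tables above, and the function name and parameters are hypothetical.

```python
import math


def perturb_segments(samples: list[float], segment_len: int,
                     intensities: list[float],
                     phases: list[float]) -> list[float]:
    """Apply a per-segment sinusoidal gain wobble to a mono signal.

    Each output sample differs from its input by at most
    intensity * |input|, so the distortion is bounded by construction.
    """
    if segment_len <= 0 or len(intensities) != len(phases) or not intensities:
        raise ValueError("need a positive segment length and matching controls")
    out = []
    for n, x in enumerate(samples):
        # Samples past the last scheduled segment reuse the last controls.
        seg = min(n // segment_len, len(intensities) - 1)
        position = (n % segment_len) / segment_len  # 0..1 within the segment
        wobble = math.sin(2 * math.pi * (position + phases[seg]))
        out.append(x * (1.0 + intensities[seg] * wobble))
    return out


# A 440 Hz tone at 16 kHz, split into four 25 ms segments.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
protected = perturb_segments(tone, 400,
                             intensities=[0.05, 0.02, 0.05, 0.0],
                             phases=[0.0, 0.25, 0.5, 0.75])
```

A segment with intensity 0.0 passes through unchanged, which is how a schedule can spare regions where the perceptual budget is better left unspent.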
Rhythm as a core learnability surface
Rhythm is the temporal signature of data—how identity manifests over time, not only in static snapshots.
| Modality | Rhythm examples |
|---|---|
| Audio | Cadence, pauses, syllable timing, prosodic phrasing |
| Video | Blink timing, gesture timing, expression onset/offset, head motion |
| Behavioral (future) | Typing cadence, scrolling rhythm, response latency |
AI systems learn not only what data is, but how it behaves over time. Clone pipelines exploit temporal regularities: consistent pause lengths, predictable blink rates, stable mouth–audio alignment.
DEPL / M-MIC therefore treats rhythm as a first-class learnability surface. M-MIC schedules perturbations to disrupt temporal regularities that embeddings summarize into a compact identity code—without making speech or video feel randomly “glitchy” to humans.
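To make "disrupting temporal regularities" concrete, the sketch below shifts pause durations by a bounded random amount and compares the coefficient of variation before and after. The 40 ms bound, the pause values, and the choice of coefficient of variation as a regularity measure are assumptions for illustration, not measured or recommended settings.

```python
import random
import statistics


def jitter_pauses(pauses_ms: list[float], max_shift_ms: float,
                  rng: random.Random) -> list[float]:
    """Shift each pause by a bounded random amount, never below zero."""
    return [max(0.0, p + rng.uniform(-max_shift_ms, max_shift_ms))
            for p in pauses_ms]


def coefficient_of_variation(values: list[float]) -> float:
    """Population standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.pstdev(values) / mean


# An artificially regular speaker: every pause lasts exactly 200 ms.
pauses = [200.0] * 20
rng = random.Random(0)  # fixed seed only so the example is repeatable
jittered = jitter_pauses(pauses, max_shift_ms=40.0, rng=rng)

cv_before = coefficient_of_variation(pauses)
cv_after = coefficient_of_variation(jittered)
```

In a real schedule the random source would come from M-MIC rather than a fixed seed, and the bound would be set by listening tests rather than chosen by hand.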
| Human perception | Machine learning |
|---|---|
| Integrates over ~100 ms+ | Exploits ms–s regularities |
| Forgives local jitter | Averages stable rhythms |
| Attends to semantics | Compresses timing stats |
Technical model
Conceptual formulation (implementation-agnostic):
Original data: x
Protected data: x' = T_DEPL(x, E, T, C)
| Symbol | Meaning |
|---|---|
| T_DEPL | Perturbation transformation (DEPL family); drawn as T(·) in the diagram at the end of this section |
| E | Entropy input vector (from M-MIC) |
| T | Temporal schedule (segment weights, phases) |
| C | Content/context features (from analysis layer) |
Objective (stated in words):
- Maximize machine representation drift under a defined attack suite.
- Minimize human perceptual degradation subject to quality constraints.
There is no claim of a closed-form optimum. In practice, the system searches parameter settings that improve a learnability score while staying above a perceptual floor (MOS proxies, lip-sync error bounds, etc.).
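The search described above can be pictured as a constrained selection over sweep results: discard settings that fall below the perceptual floor, then take the lowest learnability score among the rest. The record fields, the floor of 3.8, and all numbers below are invented for illustration and are not measurements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SweepResult:
    intensity: float     # perturbation strength that was tried
    learnability: float  # lower = less identity signal recovered
    quality: float       # perceptual proxy, higher = better (e.g. a MOS proxy)


def select_setting(results: list[SweepResult],
                   quality_floor: float) -> Optional[SweepResult]:
    """Return the feasible result with the lowest learnability, if any."""
    feasible = [r for r in results if r.quality >= quality_floor]
    return min(feasible, key=lambda r: r.learnability, default=None)


# Invented sweep values, for illustration only.
sweep = [
    SweepResult(intensity=0.00, learnability=1.00, quality=4.5),
    SweepResult(intensity=0.05, learnability=0.80, quality=4.3),
    SweepResult(intensity=0.10, learnability=0.62, quality=4.0),
    SweepResult(intensity=0.20, learnability=0.40, quality=3.2),
]
chosen = select_setting(sweep, quality_floor=3.8)
```

Here the strongest setting has the lowest learnability score but falls below the floor, so the selection settles on the strongest setting that remains acceptable. Returning `None` when nothing is feasible makes that outcome explicit.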
     ┌─────────────┐
x ──►│  T(·) DEPL  │──► x'
     └──────▲──────┘
            │ E, T, C
     ┌──────┴──────┐
     │    M-MIC    │
     └─────────────┘
Attack simulation layer
Protection that is never tested against attacker-like models is indistinguishable from hope. The attack simulation layer stress-tests protected output using pipelines chosen to represent realistic extraction—not every possible future model.
Example metrics
| Metric | What it approximates |
|---|---|
| Speaker embedding similarity | Voice clone feasibility |
| Face embedding similarity | Face swap / reenactment feasibility |
| Temporal consistency | Stability across frames or segments |
| Clone risk score | Composite risk from proxy attackers |
| Learnability score | Overall extractability under suite |
| Perturbation survival score | Robustness after compression/transcode |
Scores are relative (before vs after protection, across parameter sweeps). Absolute thresholds will be calibrated empirically in R&D.
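The embedding similarity rows are commonly computed as cosine similarity between vectors from a fixed encoder. The sketch below assumes embeddings arrive as plain lists of floats; the three vectors are made up, and `embedding_drift` is a hypothetical helper name.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embedding_drift(reference: list[float], candidate: list[float]) -> float:
    """1 - cosine similarity: 0 means the same direction as the reference."""
    return 1.0 - cosine_similarity(reference, candidate)


# Made-up 3-dimensional embeddings; real encoders use far more dimensions.
reference = [0.9, 0.1, 0.4]       # enrolled speaker, unprotected clip
same_speaker = [0.8, 0.2, 0.5]    # another unprotected clip, same speaker
protected = [0.2, 0.9, 0.1]       # the same clip after protection

baseline_drift = embedding_drift(reference, same_speaker)
protected_drift = embedding_drift(reference, protected)
```

Comparing `protected_drift` against `baseline_drift`, rather than against zero, reflects the relative framing: natural variation between clips of the same speaker already produces some drift.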
Note
The attack suite must evolve as public encoders and generative models improve. A score is valid only for the versioned benchmark it was measured against.
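One way to enforce this in tooling is to attach the suite version to every score and refuse comparisons across versions. The record layout and version strings below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRecord:
    clip_id: str
    suite_version: str   # version of the attack suite that produced the score
    learnability: float


def relative_change(before: ScoreRecord, after: ScoreRecord) -> float:
    """Relative change in learnability; negative means a reduction."""
    if before.clip_id != after.clip_id:
        raise ValueError("records describe different clips")
    if before.suite_version != after.suite_version:
        raise ValueError("scores from different suite versions are not comparable")
    if before.learnability == 0:
        raise ValueError("baseline learnability is zero; change is undefined")
    return (after.learnability - before.learnability) / before.learnability


# Illustrative values only.
baseline = ScoreRecord("clip-001", "suite-0.1", 0.80)
protected = ScoreRecord("clip-001", "suite-0.1", 0.60)
change = relative_change(baseline, protected)
```

Raising an error instead of returning a number keeps stale scores from silently entering a report after the suite is updated.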
Adaptation loop
The adaptation loop implements continuous improvement within an experiment or product build:
Protected Data → Attack Simulation → Score → Adjustment → (repeat)
Adjustable parameters
| Parameter | Effect |
|---|---|
| Perturbation intensity | Strength vs perceptual cost |
| Perturbation location | Which features or regions are touched |
| Entropy mapping | How M-MIC seeds DEPL schedules |
| Temporal schedule | When perturbations activate |
| Feature targeting | Audio vs video specific masks |
Early R&D runs this loop offline on curated clips. Production systems may use frozen profiles validated on benchmarks before release.
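A minimal offline version of the loop, adjusting only perturbation intensity, might look like the sketch below. The step rule (raise intensity until the perceptual floor is hit, keep the best feasible setting) is an assumption for illustration, and `toy_evaluate` is an invented response curve standing in for the protect, attack, and score steps.

```python
from typing import Callable, Optional


def adapt_intensity(evaluate: Callable[[float], tuple[float, float]],
                    start: float = 0.02, step: float = 0.02,
                    quality_floor: float = 3.8,
                    max_rounds: int = 20) -> Optional[tuple[float, float, float]]:
    """Return (intensity, learnability, quality) for the best feasible round.

    `evaluate(intensity)` must return (learnability, quality) for one
    protect, attack, score pass at that intensity.
    """
    best = None
    intensity = start
    for _ in range(max_rounds):
        learnability, quality = evaluate(intensity)
        if quality < quality_floor:
            break  # perceptual budget exhausted; stop raising intensity
        if best is None or learnability < best[1]:
            best = (intensity, learnability, quality)
        intensity += step
    return best


def toy_evaluate(intensity: float) -> tuple[float, float]:
    # Invented curves: learnability falls and quality falls as intensity rises.
    learnability = 1.0 / (1.0 + 10.0 * intensity)
    quality = 4.5 - 10.0 * intensity
    return learnability, quality


best = adapt_intensity(toy_evaluate)
```

The tuple returned by the last feasible round is the kind of value that could be frozen into a profile once validated on a benchmark.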
MVP scope
Phase focus: audio first (Lyre-aligned). Video perturbation follows in prototype form; full multimodal coupling is not required for the first measurable milestone.
MVP pipeline
Audio Input
│
▼
Feature Extraction
│
▼
Entropy Scheduling (M-MIC)
│
▼
Perturbation (DEPL)
│
▼
Embedding Comparison
│
▼
Learnability Score
Success criteria (MVP)
| Criterion | Target intent |
|---|---|
| Human acceptability | Audio remains listenable; no obvious artifacts at nominal settings |
| Embedding drift | Measurable decrease in similarity to pre-protection embeddings under fixed encoders |
| Compression survival | Perturbations remain effective after lossy codecs (e.g. AAC, Opus) at typical bitrates |
| Learnability score | Documented decrease vs baseline on versioned benchmark clips |
Failure modes to log explicitly: intensity too low (no drift), too high (audible damage), encoder mismatch (wrong attack model), schedule collapse (repeating pattern).
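Three of these failure modes can be flagged mechanically from a run's outputs, as sketched below. The thresholds, the exact-repeat test for schedule collapse, and all names are assumptions for illustration. Encoder mismatch is listed in the enum but not detected here, since it requires comparing results against encoders outside the run.

```python
from enum import Enum


class FailureMode(Enum):
    INTENSITY_TOO_LOW = "no drift"
    INTENSITY_TOO_HIGH = "audible damage"
    ENCODER_MISMATCH = "wrong attack model"
    SCHEDULE_COLLAPSE = "repeating pattern"


def has_short_period(values: list[float], max_period: int,
                     tol: float = 1e-9) -> bool:
    """True if the sequence repeats with some period <= max_period."""
    n = len(values)
    for period in range(1, min(max_period, n - 1) + 1):
        if all(abs(values[i] - values[i + period]) <= tol
               for i in range(n - period)):
            return True
    return False


def diagnose(drift: float, quality: float, schedule: list[float],
             min_drift: float, quality_floor: float) -> list[FailureMode]:
    """Collect the failure modes that this run's numbers point to."""
    modes = []
    if drift < min_drift:
        modes.append(FailureMode.INTENSITY_TOO_LOW)
    if quality < quality_floor:
        modes.append(FailureMode.INTENSITY_TOO_HIGH)
    if has_short_period(schedule, max_period=len(schedule) // 2):
        modes.append(FailureMode.SCHEDULE_COLLAPSE)
    return modes


# Illustrative run: weak drift and a schedule that alternates two values.
modes = diagnose(drift=0.01, quality=4.2, schedule=[0.1, 0.2] * 3,
                 min_drift=0.05, quality_floor=3.8)
```

Logging the enum values alongside the run parameters gives later parameter sweeps a record of why a setting was rejected.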
Comparison to existing approaches
| Approach | When it acts | What it protects | Relation to DEPL / M-MIC |
|---|---|---|---|
| Detection systems | After synthetic media exists | Consumers and platforms from fooled judgment | Reactive; does not reduce training signal at source |
| Watermarking / provenance | At capture or publish | Authenticity and lineage | Does not prevent learning from unmarked copies |
| Nightshade / Glaze (class) | Pre-training (images) | Visual style / artist identity via data poisoning | Image-focused; different threat model and perceptual tradeoffs |
| Encryption | At rest / in transit | Access confidentiality | Does not control what plaintext teaches if decrypted for training |
Key distinction:
Encryption protects who can access the data. DEPL controls what the data can teach.
DEPL / M-MIC is closest in spirit to learnability reduction and adversarial data shaping, extended explicitly to spatiotemporal identity media (voice and video) with rhythm-aware scheduling.
Limitations
Honest constraints on the current program:
- Models adapt. New encoders, larger pretraining, and attacker fine-tuning can erode any fixed perturbation strategy.
- Perturbations may be filtered. Denoising, heavy compression, cropping, and manual cleanup may remove or dilute defenses.
- Human usability caps intensity. Perceptual floors limit how aggressive DEPL can be on consumer-facing exports.
- Empirical validation is incomplete. Claims in this document are hypotheses until benchmarked on versioned suites.
- Real-world robustness is untested at scale across devices, languages, accents, lighting, and codec chains.
We describe outcomes with language like reduce, degrade, increase cost, and lower reliability—not guarantee or prevent all.
Roadmap
| Phase | Focus | Deliverable |
|---|---|---|
| 1 | Audio MVP | End-to-end Lyre pipeline with learnability score on benchmark clips |
| 2 | Video prototype | Phantom perturbation on face video; embedding drift metrics |
| 3 | Attack simulation benchmark suite | Versioned encoders, reports, regression tracking |
| 4 | Adaptive perturbation engine | Closed-loop tuning with frozen production profiles |
| 5 | API / SDK | Integrations for publishers and tools |
| 6 | Website demo | Public showcase + controlled preview of full flows |
Roadmap order may shift based on empirical results. No phase implies “complete protection.”
Collaboration
DEPL / M-MIC sits at the intersection of signal processing, machine learning, and security engineering. SpadeBrite may engage collaborators in areas such as audio and video ML, adversarial evaluation, privacy engineering, and product development—under explicit written agreement only.
This repository is not an open-source project. Reading this Lightpaper does not grant rights to implement, reproduce, or commercialize the described systems.
Contact
For collaboration inquiries: spadebrite.com/contact
Research notice
| Field | Status |
|---|---|
| Rights | Proprietary — SpadeBrite LLC |
| Research status | Semi-research / builder phase |
| Document version | 1.1 |
| Last updated | 2026-05-18 |
This Lightpaper may be updated as research matures. External references should cite the document version and date.
Notice
This document and the technologies described herein are proprietary to SpadeBrite LLC.
This Lightpaper is provided for informational purposes only. It does not grant any license or right to use, reproduce, implement, distribute, or create derivative works from the described systems without explicit written permission from SpadeBrite LLC.
Document metadata
- Entity: SpadeBrite LLC
- Technology: DEPL / M-MIC (learnability control framework)
- Related products: Lyre (audio), Phantom (image/video), Seneca (text, research)
- Repository path: docs/Lightpaper.md
- Website: spadebrite.com/lightpaper
© 2026 SpadeBrite LLC. All rights reserved.