SpadeBrite

DEPL / M-MIC Lightpaper

"A Spatiotemporal Learnability Control Framework for Deepfake Resilience"

Term	Expansion	Role
DEPL	Dynamic Entropy Perturbation Layer	Signal and temporal perturbation layer
M-MIC	Multi-Modal Entropy Input Core	Entropy and temporal orchestration core that guides DEPL

Document notice
This Lightpaper is an early-stage technical overview from SpadeBrite LLC. It describes research direction, conceptual architecture, and evaluation intent—not validated product claims or production guarantees. Metrics, thresholds, and adversary models remain under active study.

What we do not claim

The following boundaries apply to this document and to any experimental implementation derived from it.

We do not claim to make deepfakes impossible.
We do not claim full protection against all models, pipelines, or future adversaries.
We do not claim production-grade robustness, certification, or legal suitability yet.
Our current focus is measurable reduction in learnability—increasing the cost and lowering the reliability of automated identity extraction—while preserving human-usable media.

Overview

SpadeBrite is the company and product suite (Lyre, Phantom, Seneca). DEPL / M-MIC is the underlying research technology: a pre-training defense framework focused on reducing how effectively AI systems can learn identity-bearing patterns from audio and video.

Most public discourse on deepfakes centers on detection after harm is possible. DEPL / M-MIC addresses the complementary question: can we change the data itself so that cloning and imitation become harder before a model is trained or fine-tuned?

The framework operates upstream of model training:

Input: identity-bearing media (voice, face video, and related modalities over time).
Process: entropy-guided, content-aware perturbation scheduled across space and time.
Output: protected media that remains usable to humans but presents a less stable, less compressible target to learning systems.

SpadeBrite products (e.g. Lyre for voice, Phantom for image/video) are intended to apply this research direction in product form. DEPL / M-MIC names the underlying technical stack—not a single shipped feature flag.

Problem statement

Deepfake risk splits into two problem domains:

Domain	Timing	Typical tools	Limitation
Before the fact	Before training, cloning, or fine-tuning	Learnability control, perturbation, dataset hygiene	Hard to measure; must preserve human quality
After the fact	After synthetic media exists	Detection, verification, provenance, takedown	Reactive; identity may already be extracted

DEPL / M-MIC focuses primarily on before-the-fact learnability control.

Once a high-fidelity clone exists, downstream defenses face an asymmetric game: detectors chase generators, and provenance can be stripped or forged. Reducing the stability of the extractable identity signal at the source moves the defense earlier, to the point where publishers still control the artifact.

This does not replace detection or provenance. It complements them by lowering the signal quality available to attacker-like learning pipelines in the first place.

Core thesis

Modern AI systems learn by compressing stable, recurring patterns in data: timbre clusters, facial geometry, gesture habits, prosodic rhythm, and cross-modal correlations. Identity is not only what appears in a frame or spectrogram—it is what repeats predictably across samples and time.

DEPL introduces controlled spatiotemporal instability to reduce model learnability while preserving human usability. M-MIC supplies the entropy and scheduling logic that keeps perturbations from collapsing into static, filterable noise.

Canonical formulation:

DEPL / M-MIC is a spatiotemporal learnability control framework that degrades identity stability in data while preserving human usability.

Operational goals (stated modestly):

Increase representation drift under attacker-like encoders.
Decrease reliable recovery of speaker/face embeddings from protected samples.
Maintain perceptual acceptability for human listeners and viewers.
Increase cost for adversaries (data, compute, tuning) without promising impossibility.

System architecture

Data Transform Pipeline

Input Media (Audio / Video)
        │
        ▼
┌───────────────────────┐
│ Content Analysis Layer│  features, rhythm, identity-bearing regions
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│ M-MIC                 │  entropy + temporal orchestration
│ (Entropy Input Core)  │
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│ DEPL                  │  signal + temporal perturbation
│ (Perturbation Layer)  │
└───────────┬───────────┘
            ▼
     Protected Media
            │
            ▼
┌───────────────────────┐
│ Attack Simulation     │  attacker-like encoders / clone proxies
│ Layer                 │
└───────────┬───────────┘
            ▼
   Learnability Score
            │
            ▼
┌───────────────────────┐
│ Adaptation Loop       │  adjust intensity, targets, schedules
└───────────────────────┘

Layer Responsibilities

Content Analysis Layer

Inspects incoming media to derive context features used by M-MIC and DEPL: modality, segment boundaries, speech vs silence, face track quality, motion density, and coarse rhythm descriptors. This layer does not apply perturbation; it informs where and when perturbation is worth spending perceptual budget.

M-MIC (Multi-Modal Entropy Input Core)

Aggregates entropy from multiple sources and maps it to dynamic perturbation schedules across time and modality. M-MIC is the orchestration brain: it prevents DEPL from repeating the same transform in a way adversaries can learn or average away.

DEPL (Dynamic Entropy Perturbation Layer)

Applies the actual signal and temporal perturbations to audio waveforms, spectral representations, and video frames/sequences. DEPL executes the schedule; it does not choose entropy sources—that is M-MIC’s role.

Attack Simulation Layer

Runs attacker-like probes on protected output: embedding extractors, lightweight clone proxies, temporal consistency checks, and compression survivability tests. This layer estimates whether protection holds under realistic—not hypothetical—pipelines.

Learnability Score

A composite metric (or family of metrics) summarizing how much identity signal remains machine-learnable after protection. Lower scores indicate reduced extractability under the configured attack suite—not “safe forever.”
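
As a minimal sketch of what such a composite could look like, the snippet below takes a weighted mean of per-metric values. The metric names, the weights, the [0, 1] normalization, and the sample numbers are all assumptions made for illustration; the actual score definition remains under study.

```python
from typing import Mapping


def learnability_score(metrics: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted mean of per-metric extractability values.

    Assumption: every metric is pre-normalized to [0, 1], where 1.0 means
    "identity fully recoverable" and 0.0 means "no usable identity signal".
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = 0.0
    for name, weight in weights.items():
        value = metrics[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"metric {name!r} out of range: {value}")
        weighted_sum += weight * value
    return weighted_sum / total_weight


# Hypothetical numbers, not measured results.
weights = {"speaker_similarity": 2.0, "temporal_consistency": 1.0}
baseline = {"speaker_similarity": 0.92, "temporal_consistency": 0.88}
protected = {"speaker_similarity": 0.61, "temporal_consistency": 0.70}

before = learnability_score(baseline, weights)
after = learnability_score(protected, weights)
```

A lower `after` than `before` would indicate reduced extractability under this particular metric set only.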

Adaptation Loop

Closes the loop: scores feed back into M-MIC and DEPL parameters (intensity, targeting, scheduling). The loop is offline-first in R&D (batch tuning); online adaptation is a later engineering concern.

M-MIC: entropy and temporal orchestration

M-MIC is not a single hash or RNG call. It is a minimal entropy stack designed for reproducible experiments and future hardware integration:

Source	Role
Cryptographic entropy	Unpredictable schedule seeds; resists trivial replay
Temporal entropy	Time-varying phase offsets; couples perturbation to media timeline
Signal-derived entropy	Features from the content itself; ties noise to local structure
Optional device/sensor entropy	Future path for capture-time binding (not required for MVP)

Design principle
Entropy is not used as raw random noise dumped on the signal. It is used to generate dynamic perturbation schedules—when, where, and how strongly DEPL acts.

Why this matters: Static perturbations become a learnable artifact. Adversaries can denoise, fine-tune around, or invert fixed patterns. Entropy keeps the defense non-stationary across files and time.

Entropy principle
Entropy is the mechanism that prevents perturbation patterns from collapsing into predictability.

M-MIC outputs the schedules and control parameters that DEPL consumes, keyed to content analysis.
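
One hedged way to picture schedule generation: key an HMAC with a secret seed and bind each output to a content digest and a segment index, so intensities vary across files and across time while staying replayable from a recorded seed. The function name `derive_schedule`, the use of HMAC-SHA256, and the feature summary string are assumptions for this sketch, not the M-MIC design.

```python
import hashlib
import hmac
import secrets
import struct


def derive_schedule(seed: bytes, content_digest: bytes,
                    num_segments: int, max_intensity: float = 1.0) -> list[float]:
    """Map a secret seed plus a content digest to per-segment intensities.

    The HMAC message binds the content digest (signal-derived input) and the
    segment index (temporal input), so the output differs per file and per
    segment but is reproducible when the same seed is replayed.
    """
    schedule = []
    for index in range(num_segments):
        message = content_digest + struct.pack(">I", index)
        block = hmac.new(seed, message, hashlib.sha256).digest()
        # Keep the top 53 bits so the float lands exactly in [0, 1).
        unit = (int.from_bytes(block[:8], "big") >> 11) / 2**53
        schedule.append(unit * max_intensity)
    return schedule


seed = secrets.token_bytes(32)  # cryptographic entropy for this run
# Hypothetical stand-in for features produced by the content analysis layer.
content_digest = hashlib.sha256(b"hypothetical per-clip feature summary").digest()
schedule = derive_schedule(seed, content_digest, num_segments=8)
```

Storing `seed` alongside an experiment lets the same schedule be regenerated later; a fresh seed per file keeps schedules from repeating across files.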

DEPL: signal and temporal perturbation

DEPL implements controlled, bounded distortions on identity-bearing structures. Perturbations are designed to be:

Sparse in perceptual dimensions humans weight heavily.
Dense in dimensions embedding models exploit.
Temporally structured so rhythm and continuity are disrupted for machines more than for humans.

Audio (Lyre-aligned)

Primary targets for learnability reduction:

Target	Rationale
Formants	Speaker timbre; stable across utterances
Harmonics	Periodic structure exploited by vocoders
Phase relationships	Often ignored by listeners; used in analysis
Pitch stability	F0 tracks feed speaker embeddings
Cadence	Macro timing of speech units
Prosody	Stress, intonation, emotional contour
Speaker embedding stability	Direct objective for clone pipelines

Video (Phantom-aligned)

Target	Rationale
Facial landmarks	Geometry for face swap and reenactment
Texture consistency	Skin micro-texture in generative models
Motion continuity	Optical-flow–friendly identity cues
Blink timing	Subtle biometric rhythm
Mouth movement timing	Lip-sync and talking-head models
Expression transitions	Dynamics between neutral and expressive states
Identity embedding stability	Face encoder invariance targets

DEPL may operate in multiple signal domains depending on experiment stage. This document does not prescribe a specific transform or pipeline—only the intent: degrade stability where models compress identity.

Rhythm as a core learnability surface

Rhythm is the temporal signature of data—how identity manifests over time, not only in static snapshots.

Modality	Rhythm examples
Audio	Cadence, pauses, syllable timing, prosodic phrasing
Video	Blink timing, gesture timing, expression onset/offset, head motion
Behavioral (future)	Typing cadence, scrolling rhythm, response latency

AI systems learn not only what data is, but how it behaves over time. Clone pipelines exploit temporal regularities: consistent pause lengths, predictable blink rates, stable mouth–audio alignment.

DEPL / M-MIC therefore treats rhythm as a first-class learnability surface. M-MIC schedules perturbations to disrupt temporal regularities that embeddings summarize into a compact identity code—without making speech or video feel randomly “glitchy” to humans.

  Human perception          Machine learning
  ─────────────────         ─────────────────
  Integrates over ~100ms+   Exploits ms–s regularities
  Forgives local jitter     Averages stable rhythms
  Attends to semantics      Compresses timing stats
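
To make "temporal regularity" concrete, the sketch below measures how regular a sequence of event times is (coefficient of variation of the gaps) and applies a small bounded jitter. The helper names, the blink timestamps, and the 50 ms bound are illustrative assumptions; `random.Random` is a non-cryptographic stand-in for an M-MIC schedule.

```python
import random
import statistics


def interval_cv(event_times: list[float]) -> float:
    """Coefficient of variation of the gaps between consecutive events.

    0.0 means perfectly regular timing; larger values mean less regular.
    """
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    return statistics.pstdev(gaps) / statistics.fmean(gaps)


def jitter_events(event_times: list[float], max_shift: float,
                  rng: random.Random) -> list[float]:
    """Shift each event by at most max_shift seconds without reordering."""
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    if 2 * max_shift >= min(gaps):
        raise ValueError("max_shift too large to preserve event order")
    return [t + rng.uniform(-max_shift, max_shift) for t in event_times]


# Hypothetical blink onsets in seconds: perfectly periodic, hence easy to model.
blinks = [0.0, 3.0, 6.0, 9.0, 12.0]
rng = random.Random(0)  # NOT cryptographic; placeholder for a real schedule
jittered = jitter_events(blinks, max_shift=0.05, rng=rng)

cv_before = interval_cv(blinks)
cv_after = interval_cv(jittered)
```

Whether a given jitter bound is imperceptible to viewers is an empirical question that this sketch does not answer.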

Technical model

Conceptual formulation (implementation-agnostic):

Original data: x

Protected data: x' = T(x, E, S, C)

Symbol	Meaning
`T`	Perturbation transformation (DEPL family)
`E`	Entropy input vector (from M-MIC)
`S`	Temporal schedule (segment weights, phases)
`C`	Content/context features (from analysis layer)

Objective (stated in words):

Maximize machine representation drift under a defined attack suite.
Minimize human perceptual degradation subject to quality constraints.

There is no claim of a closed-form optimum. In practice, the system searches for parameter settings that lower the learnability score while staying above a perceptual floor (MOS proxies, lip-sync error bounds, etc.).

        ┌─────────────┐
   x ──►│  T(·) DEPL  │──► x'
        └──────▲──────┘
               │ E, S, C
        ┌──────┴──────┐
        │    M-MIC    │
        └─────────────┘
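
One hedged way to write the objective in symbols, under assumptions introduced only for this sketch: \theta denotes the tunable DEPL / M-MIC parameters, \mathcal{A} the versioned attack suite of attacker-like encoders f, D an embedding drift measure, Q a perceptual quality measure, and q_{\min} the perceptual floor. The temporal schedule is written S here to keep it distinct from the transformation T.

```latex
\[
\begin{aligned}
\max_{\theta}\quad & \frac{1}{|\mathcal{A}|}\sum_{f\in\mathcal{A}} D\bigl(f(x),\,f(x')\bigr) \\
\text{subject to}\quad & Q(x, x') \ge q_{\min}, \\
\text{where}\quad & x' = T_{\theta}(x, E, S, C).
\end{aligned}
\]
```

Averaging over the suite is one possible aggregation; a worst-case (minimum over f) variant would be a stricter alternative.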

Attack simulation layer

Protection that is never tested against attacker-like models is indistinguishable from hope. The attack simulation layer stress-tests protected output using pipelines chosen to represent realistic extraction—not every possible future model.

Example metrics

Metric	What it approximates
Speaker embedding similarity	Voice clone feasibility
Face embedding similarity	Face swap / reenactment feasibility
Temporal consistency	Stability across frames or segments
Clone risk score	Composite risk from proxy attackers
Learnability score	Overall extractability under suite
Perturbation survival score	Robustness after compression/transcode

Scores are relative (before vs after protection, across parameter sweeps). Absolute thresholds will be calibrated empirically in R&D.
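
A relative score of this kind can be sketched with cosine similarity between embeddings, comparing similarity to an enrolled reference before and after protection. The function names and the three-dimensional vectors are illustrative assumptions; real speaker or face embeddings would come from a fixed, versioned encoder.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-norm embedding")
    return dot / (norm_a * norm_b)


def relative_drift(reference: Sequence[float], original: Sequence[float],
                   protected: Sequence[float]) -> float:
    """Fractional drop in similarity to the reference caused by protection."""
    before = cosine_similarity(reference, original)
    after = cosine_similarity(reference, protected)
    if before <= 0.0:
        raise ValueError("original sample does not match the reference")
    return (before - after) / before


# Hypothetical toy embeddings, not encoder output.
reference = [1.0, 0.0, 0.0]
original = [0.9, 0.1, 0.0]
protected = [0.6, 0.6, 0.2]
drift = relative_drift(reference, original, protected)
```

A positive `drift` means the protected sample sits further from the reference identity than the original did, under this one encoder.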

Note
The attack suite must evolve as public encoders and generative models improve. A score is valid only for the versioned benchmark it was measured against.

Adaptation loop

The adaptation loop iteratively tunes protection parameters within an experiment or product build:

Protected Data → Attack Simulation → Score → Adjustment → (repeat)

Adjustable parameters

Parameter	Effect
Perturbation intensity	Strength vs perceptual cost
Perturbation location	Which features or regions are touched
Entropy mapping	How M-MIC seeds DEPL schedules
Temporal schedule	When perturbations activate
Feature targeting	Audio- vs. video-specific masks

Early R&D runs this loop offline on curated clips. Production systems may use frozen profiles validated on benchmarks before release.
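
An offline version of this loop can be sketched as a constrained search over one parameter: pick the candidate intensity with the lowest learnability score among those that stay at or above a quality floor. The function `tune_intensity` and both toy scoring functions are hypothetical stand-ins; in practice the scores would come from the attack simulation layer and a perceptual quality proxy.

```python
from typing import Callable, Iterable, Optional


def tune_intensity(candidates: Iterable[float],
                   learnability: Callable[[float], float],
                   quality: Callable[[float], float],
                   quality_floor: float) -> Optional[float]:
    """Return the candidate with the lowest learnability score whose quality
    stays at or above the floor, or None if no candidate qualifies."""
    best = None
    best_score = float("inf")
    for intensity in candidates:
        if quality(intensity) < quality_floor:
            continue  # too damaging for humans; reject this setting
        score = learnability(intensity)
        if score < best_score:
            best, best_score = intensity, score
    return best


# Hypothetical response curves, for illustration only.
def toy_learnability(intensity: float) -> float:
    return 1.0 / (1.0 + 4.0 * intensity)  # stronger perturbation, lower score


def toy_quality(intensity: float) -> float:
    return 5.0 - 3.0 * intensity  # MOS-like scale, drops with intensity


candidates = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
chosen = tune_intensity(candidates, toy_learnability, toy_quality, quality_floor=4.0)
```

With these toy curves the search settles on the strongest setting that still clears the floor. A frozen production profile would correspond to recording `chosen` together with the benchmark version it was tuned against.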

MVP scope

Phase focus: audio first (Lyre-aligned). Video perturbation follows in prototype form; full multimodal coupling is not required for the first measurable milestone.

MVP pipeline

Audio Input
     │
     ▼
Feature Extraction
     │
     ▼
Entropy Scheduling (M-MIC)
     │
     ▼
Perturbation (DEPL)
     │
     ▼
Embedding Comparison
     │
     ▼
Learnability Score

Success criteria (MVP)

Criterion	Target intent
Human acceptability	Audio remains listenable; no obvious artifacts at nominal settings
Embedding drift	Measurable decrease in similarity to pre-protection embeddings under fixed encoders
Compression survival	Perturbations remain effective after lossy codecs (e.g. AAC, Opus) at typical bitrates
Learnability score	Documented decrease vs baseline on versioned benchmark clips
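
The compression survival criterion can be expressed as a simple ratio: how much of the embedding drift measured before a lossy transcode is still present afterwards. The function name, the clamping to [0, 1], and the sample numbers are assumptions for this sketch.

```python
def survival_score(drift_before_codec: float, drift_after_codec: float) -> float:
    """Fraction of embedding drift that survives a lossy transcode.

    Clamped to [0, 1]: 1.0 means the codec removed none of the drift,
    0.0 means the codec removed all of it.
    """
    if drift_before_codec <= 0.0:
        raise ValueError("no drift to preserve before transcoding")
    ratio = drift_after_codec / drift_before_codec
    return max(0.0, min(1.0, ratio))


# Hypothetical drift values, not measurements.
survival = survival_score(drift_before_codec=0.30, drift_after_codec=0.21)
```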

Failure modes to log explicitly: intensity too low (no drift), too high (audible damage), encoder mismatch (wrong attack model), schedule collapse (repeating pattern).
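
A minimal sketch of how three of these failure modes might be flagged per run, assuming a drift value, a quality value, and the schedule are available. The thresholds, names, and exact-repeat check are illustrative assumptions, and low drift can have causes other than intensity. Encoder mismatch is left out because detecting it needs benchmark metadata rather than per-run numbers.

```python
from enum import Enum
from typing import Sequence


class FailureMode(Enum):
    INTENSITY_TOO_LOW = "intensity too low (no drift)"
    INTENSITY_TOO_HIGH = "intensity too high (audible damage)"
    SCHEDULE_COLLAPSE = "schedule collapse (repeating pattern)"


def has_short_period(schedule: Sequence[float], max_period: int) -> bool:
    """True if the schedule repeats exactly with a period of at most max_period."""
    length = len(schedule)
    for period in range(1, min(max_period, length // 2) + 1):
        if all(schedule[i] == schedule[i + period] for i in range(length - period)):
            return True
    return False


def diagnose(drift: float, quality: float, schedule: Sequence[float],
             min_drift: float, quality_floor: float,
             max_period: int = 4) -> list[FailureMode]:
    """Collect every failure mode that the run's numbers point to."""
    modes = []
    if drift < min_drift:
        modes.append(FailureMode.INTENSITY_TOO_LOW)
    if quality < quality_floor:
        modes.append(FailureMode.INTENSITY_TOO_HIGH)
    if has_short_period(schedule, max_period):
        modes.append(FailureMode.SCHEDULE_COLLAPSE)
    return modes


# Hypothetical run: almost no drift, and a schedule that alternates two values.
flags = diagnose(drift=0.01, quality=4.5,
                 schedule=[0.1, 0.2, 0.1, 0.2, 0.1, 0.2],
                 min_drift=0.05, quality_floor=4.0)
```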

Comparison to existing approaches

Approach	When it acts	What it protects	Relation to DEPL / M-MIC
Detection systems	After synthetic media exists	Consumers and platforms from fooled judgment	Reactive; does not reduce training signal at source
Watermarking / provenance	At capture or publish	Authenticity and lineage	Does not prevent learning from unmarked copies
Nightshade / Glaze (class)	Pre-training (images)	Visual style / artist identity via style cloaking (Glaze) or data poisoning (Nightshade)	Image-focused; different threat model and perceptual tradeoffs
Encryption	At rest / in transit	Access confidentiality	Does not control what plaintext teaches if decrypted for training

Key distinction:

Encryption governs who can access the data. DEPL controls what the data can teach.

DEPL / M-MIC is closest in spirit to learnability reduction and adversarial data shaping, extended explicitly to spatiotemporal identity media (voice and video) with rhythm-aware scheduling.

Limitations

Honest constraints on the current program:

Models adapt. New encoders, larger pretraining, and attacker fine-tuning can erode any fixed perturbation strategy.
Perturbations may be filtered. Denoising, heavy compression, cropping, and manual cleanup may remove or dilute defenses.
Human usability caps intensity. Perceptual floors limit how aggressive DEPL can be on consumer-facing exports.
Empirical validation is incomplete. Claims in this document are hypotheses until benchmarked on versioned suites.
Real-world robustness is untested at scale across devices, languages, accents, lighting, and codec chains.

We describe outcomes with language like reduce, degrade, increase cost, and lower reliability—not guarantee or prevent all.

Roadmap

Phase	Focus	Deliverable
1	Audio MVP	End-to-end Lyre pipeline with learnability score on benchmark clips
2	Video prototype	Phantom perturbation on face video; embedding drift metrics
3	Attack simulation benchmark suite	Versioned encoders, reports, regression tracking
4	Adaptive perturbation engine	Closed-loop tuning with frozen production profiles
5	API / SDK	Integrations for publishers and tools
6	Website demo	Public showcase + controlled preview of full flows

Roadmap order may shift based on empirical results. No phase implies “complete protection.”

Collaboration

DEPL / M-MIC sits at the intersection of signal processing, machine learning, and security engineering. SpadeBrite may engage collaborators in areas such as audio and video ML, adversarial evaluation, privacy engineering, and product development—under explicit written agreement only.

This repository is not an open-source project. Reading this Lightpaper does not grant rights to implement, reproduce, or commercialize the described systems.

Contact
For collaboration inquiries: spadebrite.com/contact

Research notice

Field	Status
Rights	Proprietary — SpadeBrite LLC
Research status	Semi-research / builder phase
Document version	1.1
Last updated	2026-05-18

This Lightpaper may be updated as research matures. External references should cite the document version and date.

Notice

This document and the technologies described herein are proprietary to SpadeBrite LLC.

This Lightpaper is provided for informational purposes only. It does not grant any license or right to use, reproduce, implement, distribute, or create derivative works from the described systems without explicit written permission from SpadeBrite LLC.

Document metadata

Entity: SpadeBrite LLC
Technology: DEPL / M-MIC (learnability control framework)
Related products: Lyre (audio), Phantom (image/video), Seneca (text, research)
Repository path: docs/Lightpaper.md
Website: spadebrite.com/lightpaper