SpadeBrite

DEPL / M-MIC Lightpaper

"A Spatiotemporal Learnability Control Framework for Deepfake Resilience"

| Term | Expansion | Role |
| --- | --- | --- |
| DEPL | Dynamic Entropy Perturbation Layer | Signal and temporal perturbation layer |
| M-MIC | Multi-Modal Entropy Input Core | Entropy and temporal orchestration core that guides DEPL |

Document notice

This Lightpaper is an early-stage technical overview from SpadeBrite LLC. It describes research direction, conceptual architecture, and evaluation intent—not validated product claims or production guarantees. Metrics, thresholds, and adversary models remain under active study.


What we do not claim

The following boundaries apply to this document and to any experimental implementation derived from it.

  • We do not claim to make deepfakes impossible.
  • We do not claim full protection against all models, pipelines, or future adversaries.
  • We do not claim production-grade robustness, certification, or legal suitability yet.
  • Our current focus is measurable reduction in learnability—increasing the cost and lowering the reliability of automated identity extraction—while preserving human-usable media.

Overview

SpadeBrite is the company and product suite (Lyre, Phantom, Seneca). DEPL / M-MIC is the underlying research technology: a pre-training defense framework focused on reducing how effectively AI systems can learn identity-bearing patterns from audio and video.

Most public discourse on deepfakes centers on detection after harm is possible. DEPL / M-MIC addresses the complementary question: can we change the data itself so that cloning and imitation become harder before a model is trained or fine-tuned?

The framework operates upstream of model training:

  • Input: identity-bearing media (voice, face video, and related modalities over time).
  • Process: entropy-guided, content-aware perturbation scheduled across space and time.
  • Output: protected media that remains usable to humans but presents a less stable, less compressible target to learning systems.

SpadeBrite products (e.g. Lyre for voice, Phantom for image/video) are intended to apply this research direction in product form. DEPL / M-MIC names the underlying technical stack—not a single shipped feature flag.


Problem statement

Deepfake risk splits into two problem domains:

| Domain | Timing | Typical tools | Limitation |
| --- | --- | --- | --- |
| Before the fact | Before training, cloning, or fine-tuning | Learnability control, perturbation, dataset hygiene | Hard to measure; must preserve human quality |
| After the fact | After synthetic media exists | Detection, verification, provenance, takedown | Reactive; identity may already be extracted |

DEPL / M-MIC focuses primarily on before-the-fact learnability control.

Once a high-fidelity clone exists, downstream defenses face an asymmetric game: detectors chase generators, and provenance can be stripped or forged. Reducing the stability of extractable identity cues at the source shifts the effort earlier, to the point where publishers still control the artifact.

This does not replace detection or provenance. It complements them by lowering the signal quality available to attacker-like learning pipelines in the first place.


Core thesis

Modern AI systems learn by compressing stable, recurring patterns in data: timbre clusters, facial geometry, gesture habits, prosodic rhythm, and cross-modal correlations. Identity is not only what appears in a frame or spectrogram—it is what repeats predictably across samples and time.

DEPL introduces controlled spatiotemporal instability to reduce model learnability while preserving human usability. M-MIC supplies the entropy and scheduling logic that keeps perturbations from collapsing into static, filterable noise.

Canonical formulation:

DEPL / M-MIC is a spatiotemporal learnability control framework that degrades identity stability in data while preserving human usability.

Operational goals (stated modestly):

  • Increase representation drift under attacker-like encoders.
  • Decrease reliable recovery of speaker/face embeddings from protected samples.
  • Maintain perceptual acceptability for human listeners and viewers.
  • Increase cost for adversaries (data, compute, tuning) without promising impossibility.

System architecture

Data Transform Pipeline

Input Media (Audio / Video)
        │
        ▼
┌───────────────────────┐
│ Content Analysis Layer│  features, rhythm, identity-bearing regions
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│ M-MIC                 │  entropy + temporal orchestration
│ (Entropy Input Core)  │
└───────────┬───────────┘
            ▼
┌───────────────────────┐
│ DEPL                  │  signal + temporal perturbation
│ (Perturbation Layer)  │
└───────────┬───────────┘
            ▼
     Protected Media
            │
            ▼
┌───────────────────────┐
│ Attack Simulation     │  attacker-like encoders / clone proxies
│ Layer                 │
└───────────┬───────────┘
            ▼
   Learnability Score
            │
            ▼
┌───────────────────────┐
│ Adaptation Loop       │  adjust intensity, targets, schedules
└───────────────────────┘

Layer Responsibilities

Content Analysis Layer

Inspects incoming media to derive context features used by M-MIC and DEPL: modality, segment boundaries, speech vs silence, face track quality, motion density, and coarse rhythm descriptors. This layer does not apply perturbation; it informs where and when perturbation is worth spending perceptual budget.
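
As a rough illustration of the kind of context feature this layer could produce, the sketch below labels audio frames as speech or silence from frame energy and merges neighbouring frames into segments. The names `Segment`, `frame_rms`, and `segment_speech`, the 160-sample frame, and the 0.02 threshold are all assumptions made for this example; a real analysis layer would likely use a proper voice activity detector.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    start: int       # first sample index (inclusive)
    end: int         # last sample index (exclusive)
    is_speech: bool  # True if frame energy reached the threshold


def frame_rms(samples, frame_len):
    """Root-mean-square energy of consecutive non-overlapping frames."""
    energies = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energies.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return energies


def segment_speech(samples, frame_len=160, threshold=0.02):
    """Label frames by energy, then merge equal neighbours into segments.

    frame_len and threshold are illustrative values, not tuned settings.
    """
    segments = []
    for index, rms in enumerate(frame_rms(samples, frame_len)):
        start = index * frame_len
        end = min(start + frame_len, len(samples))
        active = rms >= threshold
        if segments and segments[-1].is_speech == active:
            segments[-1].end = end  # extend the current segment
        else:
            segments.append(Segment(start, end, active))
    return segments
```

The output is only a map of where perturbation could be spent; nothing in this step modifies the signal.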

M-MIC (Multi-Modal Entropy Input Core)

Aggregates entropy from multiple sources and maps it to dynamic perturbation schedules across time and modality. M-MIC is the orchestration brain: it prevents DEPL from repeating the same transform in a way adversaries can learn or average away.

DEPL (Dynamic Entropy Perturbation Layer)

Applies the actual signal and temporal perturbations to audio waveforms, spectral representations, and video frames/sequences. DEPL executes the schedule; it does not choose entropy sources—that is M-MIC’s role.

Attack Simulation Layer

Runs attacker-like probes on protected output: embedding extractors, lightweight clone proxies, temporal consistency checks, and compression survivability tests. This layer estimates whether protection holds under realistic—not hypothetical—pipelines.

Learnability Score

A composite metric (or family of metrics) summarizing how much identity signal remains machine-learnable after protection. Lower scores indicate reduced extractability under the configured attack suite—not “safe forever.”
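
One simple way to form such a composite is a weighted mean of per-probe scores. The sketch below assumes each probe reports a value in [0, 1] where higher means more identity signal remains; the function name, the probe names, and the weights are hypothetical and not taken from any SpadeBrite implementation.

```python
def learnability_score(probe_scores, weights):
    """Weighted mean of per-probe scores, each expected in [0, 1].

    Higher means more identity signal remains extractable under the suite.
    Both arguments are dicts keyed by probe name (names are illustrative).
    """
    if set(probe_scores) != set(weights):
        raise ValueError("every probe needs exactly one weight")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, value in probe_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} is outside [0, 1]: {value}")
    return sum(probe_scores[k] * weights[k] for k in probe_scores) / total


# Example with two assumed probes; speaker similarity is weighted 3:1.
score = learnability_score(
    {"speaker_similarity": 0.8, "temporal_consistency": 0.6},
    {"speaker_similarity": 3.0, "temporal_consistency": 1.0},
)
```

A weighted mean is only one option. A worst-case (maximum) aggregate would be more conservative, because a single strong probe could not be averaged away.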

Adaptation Loop

Closes the loop: scores feed back into M-MIC and DEPL parameters (intensity, targeting, scheduling). The loop is offline-first in R&D (batch tuning); online adaptation is a later engineering concern.


M-MIC: entropy and temporal orchestration

M-MIC is not a single hash or RNG call. It is a minimal entropy stack designed for reproducible experiments and future hardware integration:

| Source | Role |
| --- | --- |
| Cryptographic entropy | Unpredictable schedule seeds; resists trivial replay |
| Temporal entropy | Time-varying phase offsets; couples perturbation to media timeline |
| Signal-derived entropy | Features from the content itself; ties noise to local structure |
| Optional device/sensor entropy | Future path for capture-time binding (not required for MVP) |

Design principle

Entropy is not used as raw random noise dumped on the signal. It is used to generate dynamic perturbation schedules—when, where, and how strongly DEPL acts.

Why this matters: Static perturbations become a learnable artifact. Adversaries can denoise, fine-tune around, or invert fixed patterns. Entropy keeps the defense non-stationary across files and time.
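
A toy experiment can make the averaging argument concrete. In the sketch below, content is modelled as zero-mean Gaussian noise (a strong simplifying assumption), and the attacker's only tool is averaging many protected clips. All names and constants are illustrative.

```python
import random


def mean_clip(clips):
    """Per-sample average across clips of equal length."""
    return [sum(column) / len(clips) for column in zip(*clips)]


def projection(estimate, pattern):
    """Fraction of `pattern` present in `estimate` (1.0 means fully recovered)."""
    return sum(e * p for e, p in zip(estimate, pattern)) / sum(p * p for p in pattern)


def random_pattern(length, amplitude, rng):
    return [rng.choice((-amplitude, amplitude)) for _ in range(length)]


rng = random.Random(0)  # fixed seed so the toy run is repeatable
length, n_clips, amplitude = 64, 2000, 0.1

# Toy stand-in for content: zero-mean Gaussian samples, independent per clip.
clips = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(n_clips)]

# Case 1: the same perturbation pattern is added to every clip.
static = random_pattern(length, amplitude, rng)
static_protected = [[c + p for c, p in zip(clip, static)] for clip in clips]

# Case 2: every clip gets its own independently drawn pattern.
per_clip = [random_pattern(length, amplitude, rng) for _ in range(n_clips)]
varying_protected = [
    [c + p for c, p in zip(clip, pattern)]
    for clip, pattern in zip(clips, per_clip)
]

# Attacker's estimate: average many protected clips so the content cancels out.
static_recovered = projection(mean_clip(static_protected), static)
varying_recovered = projection(mean_clip(varying_protected), per_clip[0])
```

In this toy setting, `static_recovered` should land close to 1 (the fixed pattern is almost fully recovered and could be subtracted), while `varying_recovered` should stay close to 0. Real media and real attackers are far less idealized, so this shows only the mechanism, not a measured result.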

Entropy principle

Entropy is the mechanism that prevents perturbation patterns from collapsing into predictability.

M-MIC outputs schedules and control parameters consumed by DEPL—when, where, and how strongly to perturb—keyed to content analysis.
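
The sketch below shows one way the three MVP entropy sources could be combined into a per-segment intensity: a per-file secret seed keys an HMAC over the segment's timeline position and a digest of its content. This is an assumed construction for illustration, not the M-MIC design; `segment_intensity` and `build_schedule` are hypothetical names.

```python
import hashlib
import hmac
import secrets
import struct


def segment_intensity(seed, segment_index, start_ms, content_digest):
    """Derive a per-segment intensity in [0, 1).

    seed           : cryptographic entropy (bytes), e.g. one value per file
    segment_index  : temporal entropy, position on the media timeline
    start_ms       : temporal entropy, segment start time in milliseconds
    content_digest : signal-derived entropy, bytes summarising local content
    """
    message = struct.pack(">IQ", segment_index, start_ms) + content_digest
    digest = hmac.new(seed, message, hashlib.sha256).digest()
    # Keep the top 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def build_schedule(seed, segments):
    """segments: iterable of (start_ms, raw_bytes) pairs from content analysis."""
    schedule = []
    for index, (start_ms, raw) in enumerate(segments):
        content_digest = hashlib.sha256(raw).digest()
        schedule.append(segment_intensity(seed, index, start_ms, content_digest))
    return schedule


# Example: a fresh random seed per file and three 20 ms segments.
seed = secrets.token_bytes(32)
schedule = build_schedule(seed, [(0, b"\x01\x02"), (20, b"\x03\x04"), (40, b"\x05\x06")])
```

With a recorded seed the schedule is reproducible for experiments. Without the seed, the values should not be predictable from the media alone, which is the property the design principle asks for.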


DEPL: signal and temporal perturbation

DEPL implements controlled, bounded distortions on identity-bearing structures. Perturbations are designed to be:

  • Sparse in perceptual dimensions humans weight heavily.
  • Dense in dimensions embedding models exploit.
  • Temporally structured so rhythm and continuity are disrupted for machines more than for humans.

Audio (Lyre-aligned)

Primary targets for learnability reduction:

| Target | Rationale |
| --- | --- |
| Formants | Speaker timbre; stable across utterances |
| Harmonics | Periodic structure exploited by vocoders |
| Phase relationships | Often ignored by listeners; used in analysis |
| Pitch stability | F0 tracks feed speaker embeddings |
| Cadence | Macro timing of speech units |
| Prosody | Stress, intonation, emotional contour |
| Speaker embedding stability | Direct objective for clone pipelines |

Video (Phantom-aligned)

| Target | Rationale |
| --- | --- |
| Facial landmarks | Geometry for face swap and reenactment |
| Texture consistency | Skin micro-texture in generative models |
| Motion continuity | Optical-flow–friendly identity cues |
| Blink timing | Subtle biometric rhythm |
| Mouth movement timing | Lip-sync and talking-head models |
| Expression transitions | Dynamics between neutral and expressive states |
| Identity embedding stability | Face encoder invariance targets |

DEPL may operate in multiple signal domains depending on experiment stage. This document does not prescribe a specific transform or pipeline—only the intent: degrade stability where models compress identity.
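
Since no transform is prescribed, the sketch below uses a deliberately simple placeholder, a bounded per-segment gain change, only to show how a schedule of intensities in [0, 1] could drive a perturbation whose size is capped. The function name and the 5% bound are assumptions. A gain change on its own is unlikely to disturb a speaker encoder, so this stands in for whatever transform an experiment actually uses.

```python
def apply_gain_perturbation(samples, schedule, segment_len, max_depth=0.05):
    """Scale each segment by (1 + depth), with depth in [-max_depth, +max_depth].

    schedule    : one intensity in [0, 1] per segment (0.5 means no change)
    segment_len : number of samples per segment
    max_depth   : hard bound on the relative change (illustrative value)
    """
    if not schedule:
        raise ValueError("schedule must contain at least one intensity")
    protected = []
    for i, sample in enumerate(samples):
        # Samples past the last scheduled segment reuse the final intensity.
        segment = min(i // segment_len, len(schedule) - 1)
        depth = (2.0 * schedule[segment] - 1.0) * max_depth
        protected.append(sample * (1.0 + depth))
    return protected


# Example: two segments of two samples each, at the extremes of the schedule.
out = apply_gain_perturbation([1.0, 1.0, 1.0, 1.0], [0.0, 1.0], segment_len=2)
```

Every output sample differs from its input by at most `max_depth` times the input magnitude, which is the "bounded" part. A usable transform would also need to smooth the steps at segment boundaries, since abrupt gain changes can be audible.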


Rhythm as a core learnability surface

Rhythm is the temporal signature of data—how identity manifests over time, not only in static snapshots.

| Modality | Rhythm examples |
| --- | --- |
| Audio | Cadence, pauses, syllable timing, prosodic phrasing |
| Video | Blink timing, gesture timing, expression onset/offset, head motion |
| Behavioral (future) | Typing cadence, scrolling rhythm, response latency |

AI systems learn not only what data is, but how it behaves over time. Clone pipelines exploit temporal regularities: consistent pause lengths, predictable blink rates, stable mouth–audio alignment.

DEPL / M-MIC therefore treats rhythm as a first-class learnability surface. M-MIC schedules perturbations to disrupt temporal regularities that embeddings summarize into a compact identity code—without making speech or video feel randomly “glitchy” to humans.

  Human perception          Machine learning
  ─────────────────         ─────────────────
  Integrates over ~100ms+   Exploits ms–s regularities
  Forgives local jitter     Averages stable rhythms
  Attends to semantics      Compresses timing stats
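
To make "temporal regularity" measurable, one assumed proxy is the coefficient of variation of pause durations: a speaker with very consistent pauses scores near zero. The sketch below computes that proxy and applies a bounded random jitter to pause lengths. The function names and the 30 ms bound are illustrative, and whether such jitter is perceptually acceptable would have to be checked in listening tests.

```python
import random
import statistics


def timing_regularity(durations_ms):
    """Coefficient of variation (stdev / mean); lower means more regular timing."""
    mean = statistics.fmean(durations_ms)
    if mean == 0:
        raise ValueError("mean duration must be non-zero")
    return statistics.pstdev(durations_ms) / mean


def jitter_pauses(durations_ms, max_jitter_ms, rng):
    """Shift each pause by a bounded random offset, never below zero."""
    return [
        max(0.0, d + rng.uniform(-max_jitter_ms, max_jitter_ms))
        for d in durations_ms
    ]


# Example: perfectly regular 200 ms pauses, jittered by at most 30 ms each.
pauses = [200.0] * 50
jittered = jitter_pauses(pauses, max_jitter_ms=30.0, rng=random.Random(1))
before, after = timing_regularity(pauses), timing_regularity(jittered)
```

Here `before` is 0.0 and `after` is positive, so the timing statistic an encoder might summarize is no longer constant. In a full system the offsets would presumably come from the M-MIC schedule, not from a bare random generator.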

Technical model

Conceptual formulation (implementation-agnostic):

Original data: x

Protected data: x' = T(x, E, τ, C)

| Symbol | Meaning |
| --- | --- |
| T | Perturbation transformation (DEPL family) |
| E | Entropy input vector (from M-MIC) |
| τ | Temporal schedule (segment weights, phases) |
| C | Content/context features (from analysis layer) |

Objective (stated in words):

  • Maximize machine representation drift under a defined attack suite.
  • Minimize human perceptual degradation subject to quality constraints.

There is no claim of a closed-form optimum. In practice, the system searches for parameter settings that lower the learnability score while staying above a perceptual floor (MOS proxies, lip-sync error bounds, etc.).

        ┌─────────────┐
   x ──►│  T(·) DEPL  │──► x'
        └──────▲──────┘
               │ E, τ, C
        ┌──────┴──────┐
        │    M-MIC    │
        └─────────────┘
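
The objective stated in words can also be written as a constrained search. The formulation below is one possible reading, not a definition taken from the framework. The temporal schedule is written as \tau to keep it distinct from the transform T, and every other symbol beyond x, x', T, E, and C is introduced here as an assumption: \theta for the tunable parameters of T, A for the attack suite, f_a for the encoder used by probe a, d for a distance between embeddings, q for a perceptual quality proxy, and q_min for the perceptual floor.

```latex
\begin{aligned}
x' &= T_{\theta}(x, E, \tau, C) \\
\max_{\theta}\quad & \frac{1}{|A|} \sum_{a \in A} d\big(f_a(x),\, f_a(x')\big) \\
\text{subject to}\quad & q(x, x') \ge q_{\min}
\end{aligned}
```

Averaging over the suite is a choice. Taking the minimum over a in A instead would optimize against the strongest probe.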

Attack simulation layer

Protection that is never tested against attacker-like models is indistinguishable from hope. The attack simulation layer stress-tests protected output using pipelines chosen to represent realistic extraction—not every possible future model.

Example metrics

| Metric | What it approximates |
| --- | --- |
| Speaker embedding similarity | Voice clone feasibility |
| Face embedding similarity | Face swap / reenactment feasibility |
| Temporal consistency | Stability across frames or segments |
| Clone risk score | Composite risk from proxy attackers |
| Learnability score | Overall extractability under suite |
| Perturbation survival score | Robustness after compression/transcode |

Scores are relative (before vs after protection, across parameter sweeps). Absolute thresholds will be calibrated empirically in R&D.
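
A before-versus-after comparison for the embedding similarity metrics might look like the sketch below. It assumes embeddings are plain lists of floats produced by some fixed encoder that is not shown here; `cosine_similarity` and `relative_drift` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def relative_drift(reference, unprotected, protected):
    """Drop in similarity to a reference identity caused by protection.

    reference   : embedding of enrolment media for the same identity
    unprotected : embedding of the clip before protection
    protected   : embedding of the same clip after protection
    """
    before = cosine_similarity(reference, unprotected)
    after = cosine_similarity(reference, protected)
    return before - after


# Example with toy 2-D embeddings: protection rotates the clip away from the reference.
drift = relative_drift([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

A positive value means the protected clip sits further from the reference identity than the unprotected one did. The number is meaningful only relative to the same encoder and the same clips.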

Note

The attack suite must evolve as public encoders and generative models improve. A score is valid only for the versioned benchmark it was measured against.
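
One lightweight way to enforce this in tooling is to store the suite version with every score and refuse comparisons across versions. The record layout and names below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRecord:
    clip_id: str        # benchmark clip the score was measured on
    suite_version: str  # version tag of the attack suite (format is illustrative)
    score: float        # learnability score; lower means less extractable


def score_delta(baseline, candidate):
    """Change in score from baseline to candidate; negative is an improvement.

    Raises ValueError when the two records are not directly comparable.
    """
    if baseline.suite_version != candidate.suite_version:
        raise ValueError("scores come from different attack suite versions")
    if baseline.clip_id != candidate.clip_id:
        raise ValueError("scores come from different clips")
    return candidate.score - baseline.score


# Example: same clip and suite, before and after protection.
delta = score_delta(
    ScoreRecord("clip-001", "suite-v1", 0.8),
    ScoreRecord("clip-001", "suite-v1", 0.5),
)
```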


Adaptation loop

The adaptation loop implements continuous improvement within an experiment or product build:

Protected Data → Attack Simulation → Score → Adjustment → (repeat)

Adjustable parameters

| Parameter | Effect |
| --- | --- |
| Perturbation intensity | Strength vs perceptual cost |
| Perturbation location | Which features or regions are touched |
| Entropy mapping | How M-MIC seeds DEPL schedules |
| Temporal schedule | When perturbations activate |
| Feature targeting | Audio-specific vs video-specific masks |

Early R&D runs this loop offline on curated clips. Production systems may use frozen profiles validated on benchmarks before release.
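
An offline version of this loop can be as simple as a sweep over one parameter. The sketch below picks the intensity with the lowest learnability score among candidates that stay above a quality floor. The `learnability` and `quality` callables stand in for the attack simulation and a perceptual proxy, and the toy linear functions in the example are invented purely to exercise the code.

```python
def tune_intensity(candidates, learnability, quality, quality_floor):
    """Return (intensity, score) for the best admissible candidate, or None.

    candidates    : iterable of intensities to try
    learnability  : callable, intensity -> learnability score (lower is better)
    quality       : callable, intensity -> perceptual quality proxy
    quality_floor : minimum acceptable quality
    """
    best = None
    for intensity in candidates:
        if quality(intensity) < quality_floor:
            continue  # violates the perceptual floor, so skip it
        score = learnability(intensity)
        if best is None or score < best[1]:
            best = (intensity, score)
    return best


# Toy stand-ins: stronger perturbation lowers both learnability and quality.
best = tune_intensity(
    candidates=[0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    learnability=lambda i: 1.0 - 0.8 * i,
    quality=lambda i: 5.0 - 3.0 * i,
    quality_floor=4.0,
)
```

With these toy functions the sweep settles on an intensity of 0.3, the strongest setting that still clears the floor. The selected setting could then be frozen as a profile, in line with the note on production systems above.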


MVP scope

Phase focus: audio first (Lyre-aligned). Video perturbation follows in prototype form; full multimodal coupling is not required for the first measurable milestone.

MVP pipeline

Audio Input
     │
     ▼
Feature Extraction
     │
     ▼
Entropy Scheduling (M-MIC)
     │
     ▼
Perturbation (DEPL)
     │
     ▼
Embedding Comparison
     │
     ▼
Learnability Score

Success criteria (MVP)

| Criterion | Target intent |
| --- | --- |
| Human acceptability | Audio remains listenable; no obvious artifacts at nominal settings |
| Embedding drift | Measurable decrease in similarity to pre-protection embeddings under fixed encoders |
| Compression survival | Perturbations remain effective after lossy codecs (e.g. AAC, Opus) at typical bitrates |
| Learnability score | Documented decrease vs baseline on versioned benchmark clips |

Failure modes to log explicitly: intensity too low (no drift), too high (audible damage), encoder mismatch (wrong attack model), schedule collapse (repeating pattern).
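
These four failure modes lend themselves to explicit checks in an experiment harness. The sketch below is one assumed shape for such a check. The thresholds, identifiers, and the exact-repetition test for schedule collapse are illustrative, and a real check would probably need a tolerance when comparing schedule values.

```python
from enum import Enum


class FailureMode(Enum):
    INTENSITY_TOO_LOW = "intensity too low (no drift)"
    INTENSITY_TOO_HIGH = "intensity too high (audible damage)"
    ENCODER_MISMATCH = "encoder mismatch (wrong attack model)"
    SCHEDULE_COLLAPSE = "schedule collapse (repeating pattern)"


def schedule_repeats(schedule):
    """True if the schedule is an exact repetition of a shorter prefix."""
    n = len(schedule)
    for period in range(1, n // 2 + 1):
        if n % period == 0 and schedule == schedule[:period] * (n // period):
            return True
    return False


def detect_failures(drift, quality, encoder_id, expected_encoder_id,
                    schedule, min_drift, quality_floor):
    """Return the list of failure modes triggered by one experiment run."""
    failures = []
    if drift < min_drift:
        failures.append(FailureMode.INTENSITY_TOO_LOW)
    if quality < quality_floor:
        failures.append(FailureMode.INTENSITY_TOO_HIGH)
    if encoder_id != expected_encoder_id:
        failures.append(FailureMode.ENCODER_MISMATCH)
    if schedule_repeats(list(schedule)):
        failures.append(FailureMode.SCHEDULE_COLLAPSE)
    return failures


# Example: too little drift and a schedule that repeats every two segments.
failures = detect_failures(
    drift=0.01, quality=4.5, encoder_id="enc-a", expected_encoder_id="enc-a",
    schedule=[0.2, 0.7, 0.2, 0.7], min_drift=0.05, quality_floor=4.0,
)
```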


Comparison to existing approaches

| Approach | When it acts | What it protects | Relation to DEPL / M-MIC |
| --- | --- | --- | --- |
| Detection systems | After synthetic media exists | Consumers and platforms from being deceived | Reactive; does not reduce training signal at source |
| Watermarking / provenance | At capture or publication | Authenticity and lineage | Does not prevent learning from unmarked copies |
| Nightshade / Glaze (class) | Pre-training (images) | Visual style / artist identity via data poisoning or style cloaking | Image-focused; different threat model and perceptual tradeoffs |
| Encryption | At rest / in transit | Access confidentiality | Does not control what plaintext teaches if decrypted for training |

Key distinction:

Encryption protects who can access the data. DEPL controls what the data can teach.

DEPL / M-MIC is closest in spirit to learnability reduction and adversarial data shaping, extended explicitly to spatiotemporal identity media (voice and video) with rhythm-aware scheduling.


Limitations

Honest constraints on the current program:

  • Models adapt. New encoders, larger pretraining, and attacker fine-tuning can erode any fixed perturbation strategy.
  • Perturbations may be filtered. Denoising, heavy compression, cropping, and manual cleanup may remove or dilute defenses.
  • Human usability caps intensity. Perceptual floors limit how aggressive DEPL can be on consumer-facing exports.
  • Empirical validation is incomplete. Claims in this document are hypotheses until benchmarked on versioned suites.
  • Real-world robustness is untested at scale across devices, languages, accents, lighting, and codec chains.

We describe outcomes with language like reduce, degrade, increase cost, and lower reliability—not guarantee or prevent all.


Roadmap

| Phase | Focus | Deliverable |
| --- | --- | --- |
| 1 | Audio MVP | End-to-end Lyre pipeline with learnability score on benchmark clips |
| 2 | Video prototype | Phantom perturbation on face video; embedding drift metrics |
| 3 | Attack simulation benchmark suite | Versioned encoders, reports, regression tracking |
| 4 | Adaptive perturbation engine | Closed-loop tuning with frozen production profiles |
| 5 | API / SDK | Integrations for publishers and tools |
| 6 | Website demo | Public showcase + controlled preview of full flows |

Roadmap order may shift based on empirical results. No phase implies “complete protection.”


Collaboration

DEPL / M-MIC sits at the intersection of signal processing, machine learning, and security engineering. SpadeBrite may engage collaborators in areas such as audio and video ML, adversarial evaluation, privacy engineering, and product development—under explicit written agreement only.

This repository is not an open-source project. Reading this Lightpaper does not grant rights to implement, reproduce, or commercialize the described systems.

Contact

For collaboration inquiries: spadebrite.com/contact


Research notice

| Field | Status |
| --- | --- |
| Rights | Proprietary (SpadeBrite LLC) |
| Research status | Semi-research / builder phase |
| Document version | 1.1 |
| Last updated | 2026-05-18 |

This Lightpaper may be updated as research matures. External references should cite the document version and date.


Notice

This document and the technologies described herein are proprietary to SpadeBrite LLC.

This Lightpaper is provided for informational purposes only. It does not grant any license or right to use, reproduce, implement, distribute, or create derivative works from the described systems without explicit written permission from SpadeBrite LLC.


Document metadata

  • Entity: SpadeBrite LLC
  • Technology: DEPL / M-MIC (learnability control framework)
  • Related products: Lyre (audio), Phantom (image/video), Seneca (text, research)
  • Repository path: docs/Lightpaper.md
  • Website: spadebrite.com/lightpaper

© 2026 SpadeBrite LLC. All rights reserved.