It’s 2025. Why Is "Undo" Still Missing From Data Infrastructure?

2025-05-01 · 5 min read

Untangling data errors feels like delicate surgery, but our tools often force us to use sledgehammers. True "Undo" requires precision.

“What if I could just undo that pipeline run - without corrupting history, or reprocessing everything, or doing manual surgery across five tables?”

Every data engineer has felt that sinking moment. A job runs wrong, and you wish you could just undo it. Cleanly. Transparently.

But you can’t.

We have undo in docs, code, even email. Yet in data infrastructure – systems holding critical decisions – undo is practically impossible. Mistakes mean scrambling rollbacks, expensive reruns, or hoping no one notices.

This isn’t just inconvenient; it’s a failure of imagination. Our tools are fast and scalable, but not forgiving.

It’s 2025. Why can’t our petabyte-scale systems let us undo?

📐 Who is this post for?

Engineers who’ve wished for Ctrl+Z after a pipeline run.
Platform teams building brittle rollback logic.
Architects tired of constant, risky backfills.
Anyone who thinks fixing data shouldn’t require heroic effort.

(This builds on the idea that lakehouse time travel isn’t enough because current platforms don’t handle time and corrections properly.)

TL;DR - Undo Is The Missing Primitive

🔁 Undo matters. It enables confidence over fear.
❌ But it’s missing. Most platforms focus on destructive updates, not reversible history.
💥 We pay for it: risky rewrites, lost time, slowed progress.
🧩 3 kinds of undo: Immediate revert, Targeted reversal (undoing past mistakes without disrupting later work), Amendment (correcting past logic semantically).
🧱 Real undo needs: Immutability, Determinism, Bitemporality, Lineage.
🚀 With undo: move faster, fix fearlessly, trust your platform.

It’s 2025. Undo shouldn’t still be bolted on. Let’s build it in — for good.

📚 What We’ll Cover

🧠 Why Undo Changes Everything
🚫 The Root Cause: Why Undo Is Missing
🎯 The Three Kinds of Undo Explained
⚙️ What Real Undo Requires (The Foundation)
💡 The Impact: Life With Built-In Undo
🔧 AprioriDB: Building Undo from the Ground Up

🧠 Why Undo Changes Everything

The lack of undo forces engineers to work defensively, terrified of irreversible mistakes.

Built-in undo isn’t just a feature; it changes the mindset:

Move faster: Try risky changes, knowing you can revert.
Debug smarter: Backtrack transactions to inspect past states directly.
Fix cleanly: Amend flawed historical logic without complex backfills or manual merges.
Model evolution: Represent corrections or changing understanding without rewriting history, using logical forks.

Undo shifts the focus from fear to curiosity, from damage control to recoverable change.

Undo isn’t inherently impossible. The problem is that our infrastructure forgets the how and why behind the data, which makes safe reversal impractical. We compensate with brittle workarounds like backfills and manual data surgery – pale imitations of true undo.

🚫 The Root Cause: Why Undo Is Missing

Why can’t most data systems handle undo gracefully, especially for past mistakes?

Because they weren’t designed for it.

They prioritize throughput and the current state, typically assuming:

Data is mutable (overwritable).
Updates are destructive (past states are lost or obscured).
Focus is on the result, not the reason (how and why the data was produced isn’t durably tracked).

Without a native concept of an editable yet auditable transaction history, reversing changes becomes complex and risky.

The system lacks the context – the logic, inputs, dependencies, and temporal validity – needed for surgical corrections. This forces us into heavy-handed reconstruction (reruns, backfills) instead of true reversal.

We need foundations that remember.

🎯 The Three Kinds of Undo Explained

True undo addresses different needs:

1. Immediate Revert

Need: Undo the very last transaction immediately after realizing a mistake.
Analogy: git revert HEAD, Ctrl+Z.
Requires: Easy access to the previous state.
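
To make the first kind concrete, here is a minimal sketch of an immediate revert over an append-only version history. The `VersionedTable` class and its methods are hypothetical names invented for illustration, not any platform's API, and full snapshots are used only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Hypothetical append-only version history for a small key/value table."""

    # Each entry is a full snapshot; snapshots are never modified in place.
    versions: list[dict] = field(default_factory=lambda: [{}])

    @property
    def current(self) -> dict:
        return self.versions[-1]

    def commit(self, changes: dict) -> int:
        """Apply changes on top of the current snapshot as a new version."""
        self.versions.append({**self.current, **changes})
        return len(self.versions) - 1

    def revert_last(self) -> int:
        """Undo the latest commit by appending a copy of the prior snapshot.

        Like `git revert HEAD`, this adds history instead of erasing it.
        """
        if len(self.versions) < 2:
            raise ValueError("nothing to revert")
        self.versions.append(dict(self.versions[-2]))
        return len(self.versions) - 1


table = VersionedTable()
table.commit({"orders": 100})
table.commit({"orders": -5})   # the bad run
table.revert_last()

print(table.current)         # {'orders': 100}
print(len(table.versions))   # 4: the bad version is still on record
```

The revert is itself a new version, so the mistake and its reversal both stay visible in the history.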

2. Targeted Reversal

Need: Undo a specific mistake from the past (e.g., days ago) without affecting subsequent valid work.
Analogy: Surgically removing one faulty historical step.
Requires: Bitemporal modeling (system vs. valid time) and lineage to isolate the impact.
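
One way to picture a targeted reversal: if every transaction is recorded as a pure function, a past mistake can be excluded and the later, valid work replayed on top. The sketch below assumes exactly that; `Tx` and `replay` are hypothetical names, and it naively replays the whole log where a real system would presumably use lineage to recompute only what the mistake touched.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, int]


@dataclass(frozen=True)
class Tx:
    """Hypothetical log entry: an id plus the logic that produced the change."""
    tx_id: str
    apply: Callable[[State], State]  # pure function: old state -> new state


def replay(log: list[Tx], reverted: frozenset[str] = frozenset()) -> State:
    """Rebuild state from the log, skipping transactions marked as reverted."""
    state: State = {}
    for tx in log:
        if tx.tx_id not in reverted:
            state = tx.apply(state)
    return state


log = [
    Tx("load", lambda s: {**s, "revenue": 100}),
    Tx("bad_fx", lambda s: {**s, "revenue": s["revenue"] * 0}),    # the mistake
    Tx("add_fees", lambda s: {**s, "revenue": s["revenue"] + 7}),  # valid later work
]

print(replay(log))                                  # {'revenue': 7}
print(replay(log, reverted=frozenset({"bad_fx"})))  # {'revenue': 107}
```

The log itself is never edited. The reversal is just a different reading of the same history, and the later `add_fees` step survives intact.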

3. Amendment (Correction)

Need: Correct the meaning or logic of a past transaction, reflecting an improved understanding, while preserving the record of the original mistake.
Analogy: Issuing a formal correction to a published work.
Requires: Semantic forks in the logical timeline, anchoring the corrected state to the original valid time. This is a semantic edit.
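
An amendment can be sketched as a new revision of a transaction's logic stored alongside the original rather than in place of it. Everything here (`AmendableTx`, `Revision`, the fee example) is an assumption made up for illustration. It shows the shape of the idea, not how any particular engine implements semantic forks.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Revision:
    number: int
    reason: str
    logic: Callable[[int], int]  # pure function over an amount in cents


class AmendableTx:
    """Hypothetical transaction whose logic can be amended, never overwritten."""

    def __init__(self, tx_id: str, logic: Callable[[int], int]) -> None:
        self.tx_id = tx_id
        self.revisions: list[Revision] = [Revision(0, "original", logic)]

    def amend(self, logic: Callable[[int], int], reason: str) -> Revision:
        """Append corrected logic; earlier revisions stay on record."""
        revision = Revision(len(self.revisions), reason, logic)
        self.revisions.append(revision)
        return revision

    def evaluate(self, amount_cents: int, revision: Optional[int] = None) -> int:
        """Use the latest revision by default, or a specific one for audits."""
        chosen = self.revisions[-1] if revision is None else self.revisions[revision]
        return chosen.logic(amount_cents)


net = AmendableTx("net_revenue", lambda cents: cents * 80 // 100)
net.amend(lambda cents: cents * 75 // 100, reason="fee was 25%, not 20%")

print(net.evaluate(20_000))              # 15000, using the amended logic
print(net.evaluate(20_000, revision=0))  # 16000, the original is still answerable
print([(r.number, r.reason) for r in net.revisions])
```

Keeping the reason next to each revision is what turns a silent overwrite into a formal correction.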

Handling these requires deep system support, far beyond simple rollbacks.

Most systems treat data like a chalkboard: once you erase and rewrite, there’s no trace of what was there.

But the real world needs something closer to a ledger — with a margin for footnotes, corrections, and receipts.

⚙️ What Real Undo Requires (The Foundation)

Supporting true undo isn’t about UI; it’s about the engine’s core capabilities:

🧱 Immutability

Every operation — logic, inputs, context — must be durably recorded.
Recorded history is never overwritten, so downstream dependencies stay structurally traceable.

⏱ Determinism

The system must guarantee that recomputing with the same context yields the same result.
No hidden randomness or ambient side effects.
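
A small sketch of what "same context, same result" can look like in practice: the only source of randomness is a seed carried in the recorded context, and a fingerprint of the output lets a replay be checked against the original run. The `fingerprint` and `sample_rows` helpers and the `context` layout are hypothetical, chosen for illustration.

```python
import hashlib
import json
import random


def fingerprint(obj) -> str:
    """Stable hash of a JSON-serializable value."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def sample_rows(rows: list[int], context: dict) -> list[int]:
    """Sampling step whose randomness comes only from the recorded context."""
    rng = random.Random(context["seed"])  # no global RNG, no wall clock
    return sorted(rng.sample(rows, context["k"]))


context = {"seed": 42, "k": 3}  # recorded alongside the transaction
rows = list(range(10))

first = fingerprint(sample_rows(rows, context))
second = fingerprint(sample_rows(rows, context))

print(first == second)  # True: the replay matches the original run
```

If the step had called the module-level `random.sample` or read the current time, the two fingerprints could differ and a replay could no longer be trusted.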

🕰 Bitemporality

Track both system time (when something was written) and valid time (what it was meant to represent).
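Enable corrections without paradoxes: a fix written today can target last month’s valid time while the original record stays queryable as it was known then.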
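
The two time axes are easier to see in a tiny example. This sketch assumes a hypothetical `Fact` record carrying both a valid date and a recorded date, plus an `as_of` lookup that answers "what did we believe about day X, as of day Y?"

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    key: str
    value: int
    valid_on: date     # valid time: the day the value describes
    recorded_on: date  # system time: the day it was written down


def as_of(facts: list[Fact], key: str, valid_on: date, known_on: date) -> Optional[int]:
    """Value for `key` on `valid_on`, using only what was recorded by `known_on`."""
    candidates = [
        f for f in facts
        if f.key == key and f.valid_on == valid_on and f.recorded_on <= known_on
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.recorded_on).value


facts = [
    Fact("daily_revenue", 900, date(2025, 4, 1), date(2025, 4, 2)),    # original, wrong
    Fact("daily_revenue", 1000, date(2025, 4, 1), date(2025, 4, 20)),  # correction
]

# What the April 10 report saw for April 1:
print(as_of(facts, "daily_revenue", date(2025, 4, 1), known_on=date(2025, 4, 10)))  # 900
# What we believe today about April 1:
print(as_of(facts, "daily_revenue", date(2025, 4, 1), known_on=date(2025, 5, 1)))   # 1000
```

Both answers are correct for their point of view, which is why the correction does not need to erase the original.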

🔍 Lineage

Knowing what depends on what is crucial for scoping the impact of any reversal.
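
As a rough illustration, impact analysis over lineage can be as simple as a graph walk. The dataset names and the `edges` mapping below are invented for the example. The function is a plain breadth-first search, one of several reasonable ways to do this.

```python
from collections import deque


def downstream(edges: dict[str, list[str]], start: str) -> list[str]:
    """Everything that directly or transitively depends on `start`, in BFS order."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage: each key feeds the datasets listed as its value.
edges = {
    "raw_orders": ["clean_orders"],
    "clean_orders": ["daily_revenue", "customer_ltv"],
    "daily_revenue": ["exec_dashboard"],
    "raw_customers": ["customer_ltv"],
}

print(downstream(edges, "clean_orders"))
# ['daily_revenue', 'customer_ltv', 'exec_dashboard']
print(downstream(edges, "raw_customers"))
# ['customer_ltv']
```

A mistake in `raw_customers` touches one dataset, while a mistake in `clean_orders` touches three. That difference is what separates a surgical fix from a full backfill.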

💡 The Impact: Life With Built-In Undo

Undo isn’t just convenience; it’s a trust primitive that transforms data work:

Confidence over Fear: Faster innovation as mistakes become less costly.
Targeted Fixes: Precise, audited corrections replace slow, risky backfills.
Trustworthy History: Clear audit trails with explicit revisions, not silent overwrites.
Simplified Operations: Less brittle custom logic for rollbacks or manual data surgery.
Focus on Meaning: More time spent ensuring logic is correct, less time managing reconstruction plumbing.

Without undo, teams become defensive, workarounds obscure truth, and progress slows. Undo fosters safety, agility, and ultimately, trust in the data.

🔧 AprioriDB: Building Undo from the Ground Up

Bolting undo onto mutable foundations leads to the painful workarounds we normalize today.

At AprioriDB, we’re building a system where correction, reproducibility, and auditability are native — not patched on.

Semantic Transactions: Recording the evaluable logic and context.
Native Bitemporality: Tracking system and valid time explicitly.
Preserved Logic & Provenance: Enabling safe amendment and replay.
Enforced Determinism: Guaranteeing reliable, repeatable operations.

These principles allow AprioriDB to support immediate, targeted, and amendment undo as first-class operations.

Undo is the missing safety net — and the path to fearless data systems. If you’re building tomorrow’s data stack and you’re tired of pretending rollbacks are good enough, let’s build something better. Together.

It’s 2025. It’s time our tools caught up.

👉 Learn more and get involved at https://aprioridb.com.

Written by Jenny Kwan, co-founder and CTO of AprioriDB.

Follow me on Bluesky and LinkedIn.

What do you think?

Ever wished for a real undo button in your data pipelines?
What painful workarounds have you used instead?
How would your work change with safe, native undo?

Share your war stories and thoughts in the comments. I’d love to hear from you. 👇