Understanding Distributed Transactions
Deep dive into distributed transactions, trade-offs, and practical patterns.
Distributed transactions are one of the trickiest problems in distributed systems. This post explores the challenges and practical solutions.
The Problem
In a single database, ACID transactions are straightforward. But what happens when your operation spans multiple databases, services, or geographic regions?
Consider a payment system:
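- Step 1: Money is debited from the sender's account at Bank A
- Step 2: The same amount is credited to the recipient's account at Bank B
If step 1 succeeds but step 2 fails, the money has disappeared. If step 2 succeeds but step 1 fails, money has been created out of nothing. If both fail, the balances stay consistent, but the caller still needs a reliable way to find that out and retry. The goal is that either both steps take effect or neither does.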
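To make the failure mode concrete, here is a minimal in-memory sketch. The two ledgers, the account names, and the `naiveTransfer` helper are all hypothetical stand-ins for two independent banks; nothing here talks to a real system.

```typescript
// Hypothetical stand-ins: each ledger plays the role of a separate bank's database.
type Ledger = Record<string, number>;
const bankA: Ledger = { alice: 100 };
const bankB: Ledger = { bob: 0 };

function debit(ledger: Ledger, account: string, amount: number): void {
  const balance = ledger[account];
  if (balance === undefined || balance < amount) {
    throw new Error(`cannot debit ${amount} from ${account}`);
  }
  ledger[account] = balance - amount;
}

function credit(ledger: Ledger, account: string, amount: number): void {
  const balance = ledger[account];
  if (balance === undefined) {
    throw new Error(`unknown account ${account}`);
  }
  ledger[account] = balance + amount;
}

function totalMoney(): number {
  let total = 0;
  for (const account of Object.keys(bankA)) total += bankA[account];
  for (const account of Object.keys(bankB)) total += bankB[account];
  return total;
}

// Two independent writes with nothing tying them together.
function naiveTransfer(from: string, to: string, amount: number): void {
  debit(bankA, from, amount); // takes effect at Bank A immediately
  credit(bankB, to, amount);  // may throw after the debit has already happened
}

const totalBefore = totalMoney();
let transferFailed = false;
try {
  naiveTransfer("alice", "carol", 30); // "carol" does not exist at Bank B
} catch (err) {
  transferFailed = true;
}
const totalAfter = totalMoney();
```

After the failed transfer, `totalAfter` is 30 lower than `totalBefore`: the debit took effect, the credit never did, and nothing in the code notices.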
CAP Theorem
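The CAP theorem says a distributed data store can guarantee at most two of these three properties at the same time:
- Consistency: Every read receives the most recent write or an error
- Availability: Every request gets a non-error response, though not necessarily the most recent write
- Partition tolerance: The system keeps working even when messages between nodes are dropped or delayed
In real systems, network partitions will happen, so partition tolerance is not optional. The practical choice is what to give up while a partition lasts: Consistency or Availability.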
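One way to picture that choice is a replica that has lost contact with the primary. The sketch below is a toy model, not how any particular database works; the `Replica` class and its `mode` setting are made up for illustration.

```typescript
type Mode = "CP" | "AP";

interface ReadResult {
  ok: boolean;
  value?: number;
  possiblyStale?: boolean;
}

// Hypothetical replica that receives writes from a primary.
class Replica {
  private mode: Mode;
  private value = 0;
  private partitioned = false; // true when the primary is unreachable

  constructor(mode: Mode) {
    this.mode = mode;
  }

  setPartitioned(partitioned: boolean): void {
    this.partitioned = partitioned;
  }

  // Replicated writes are lost while the network is partitioned.
  applyReplicatedWrite(value: number): void {
    if (!this.partitioned) this.value = value;
  }

  read(): ReadResult {
    if (!this.partitioned) {
      return { ok: true, value: this.value, possiblyStale: false };
    }
    if (this.mode === "CP") {
      // Choose consistency: refuse to answer rather than risk a stale read.
      return { ok: false };
    }
    // Choose availability: answer with whatever we have, which may be stale.
    return { ok: true, value: this.value, possiblyStale: true };
  }
}

const cpReplica = new Replica("CP");
const apReplica = new Replica("AP");
cpReplica.applyReplicatedWrite(1);
apReplica.applyReplicatedWrite(1);

// The network splits, then the primary accepts a write of 2 that never arrives.
cpReplica.setPartitioned(true);
apReplica.setPartitioned(true);
cpReplica.applyReplicatedWrite(2);
apReplica.applyReplicatedWrite(2);

const cpRead = cpReplica.read(); // refuses to answer
const apRead = apReplica.read(); // answers with the old value 1
```

Neither answer is wrong. The CP replica stays correct by being unavailable, and the AP replica stays available by serving a value that may be out of date.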
Practical Patterns
Two-Phase Commit (2PC)
The traditional approach:
- Prepare phase: Ask all participants if they can commit
- Commit phase: All commit or all rollback
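Pros: Strong consistency
Cons: Participants hold locks from prepare until the final decision arrives, and if the coordinator crashes after they have voted yes, they stay blocked until it recovers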
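The two phases can be sketched as a small coordinator loop. This is a simplified, synchronous, in-memory illustration: the `Participant` interface, `twoPhaseCommit`, and `makeParticipant` are hypothetical names. A real coordinator would talk to participants over the network, persist its decision, and handle timeouts, none of which is shown here.

```typescript
// Hypothetical participant; in practice this would be a remote resource manager.
interface Participant {
  name: string;
  prepare(): boolean; // vote: true means "yes, I can commit"
  commit(): void;
  rollback(): void;
}

type Outcome = "committed" | "rolled_back";

function twoPhaseCommit(participants: Participant[]): Outcome {
  // Phase 1 (prepare): ask each participant to vote, stopping at the first "no".
  const prepared: Participant[] = [];
  let allYes = true;
  for (const p of participants) {
    let vote = false;
    try {
      vote = p.prepare();
    } catch (err) {
      vote = false; // a participant that errors is treated as voting "no"
    }
    if (vote) {
      prepared.push(p);
    } else {
      allYes = false;
      break;
    }
  }

  // Phase 2 (commit): commit everywhere, or roll back everyone who prepared.
  if (allYes) {
    for (const p of participants) p.commit();
    return "committed";
  }
  for (const p of prepared) p.rollback();
  return "rolled_back";
}

// Test double that records every call it receives.
function makeParticipant(name: string, canCommit: boolean, calls: string[]): Participant {
  return {
    name,
    prepare: () => {
      calls.push(`${name}:prepare`);
      return canCommit;
    },
    commit: () => {
      calls.push(`${name}:commit`);
    },
    rollback: () => {
      calls.push(`${name}:rollback`);
    },
  };
}

const happyCalls: string[] = [];
const happyOutcome = twoPhaseCommit([
  makeParticipant("bankA", true, happyCalls),
  makeParticipant("bankB", true, happyCalls),
]);

const abortCalls: string[] = [];
const abortOutcome = twoPhaseCommit([
  makeParticipant("bankA", true, abortCalls),
  makeParticipant("bankB", false, abortCalls),
]);
```

In the second run Bank B votes no, so Bank A, which had already prepared, is told to roll back and nobody commits. The blocking problem lives between the two phases: a participant that has voted yes cannot decide on its own and has to wait for the coordinator.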
Saga Pattern
Break the transaction into smaller, compensatable steps:
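    // debitAccount and creditAccount are assumed to be provided by the account service
    async function transferMoney(from: string, to: string, amount: number) {
      // Track whether the debit completed so we only compensate work that was done
      let debited = false;
      try {
        // Step 1: Debit from source
        await debitAccount(from, amount);
        debited = true;
        // Step 2: Credit to destination (might fail)
        await creditAccount(to, amount);
      } catch (error) {
        if (debited) {
          // Compensating transaction: Refund the debit
          await creditAccount(from, amount);
        }
        throw error;
      }
    }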
Pros: No blocking, can handle long-running transactions
Cons: Other readers can see intermediate states until the saga finishes, and every step needs compensation logic that can itself fail
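The same idea extends to more than two steps: run each step in order, remember which ones completed, and on failure run their compensations in reverse. The sketch below is one possible shape for such a runner. `SagaStep` and `runSaga` are hypothetical names, and the steps only record what they did instead of calling real services. It also assumes compensations always succeed, which a production saga cannot assume.

```typescript
// Hypothetical step shape: an action plus the compensation that undoes it.
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  compensate: () => Promise<void>;
}

async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  try {
    for (const step of steps) {
      await step.action();
      completed.push(step); // only completed steps are ever compensated
    }
  } catch (error) {
    // Undo completed steps, most recent first.
    for (const step of completed.reverse()) {
      await step.compensate();
    }
    throw error;
  }
}

// Example run: the debit succeeds, the credit fails, so the debit is refunded.
const trace: string[] = [];
const transferSteps: SagaStep[] = [
  {
    name: "debit source",
    action: async () => {
      trace.push("debit");
    },
    compensate: async () => {
      trace.push("refund");
    },
  },
  {
    name: "credit destination",
    action: async () => {
      throw new Error("destination unavailable");
    },
    compensate: async () => {
      trace.push("reverse credit");
    },
  },
];

const sagaOutcome: Promise<string> = runSaga(transferSteps).then(
  () => "completed",
  () => "compensated",
);
```

The failing step's own compensation never runs because it never completed, which is the same bookkeeping a hand-written two-step transfer needs to avoid refunding a debit that never happened.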
Event Sourcing
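Instead of storing only the current state, store every change as an immutable event in an append-only log. The current state is derived by replaying those events: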
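    interface TransferEvent {
      type: 'transfer_initiated' | 'transfer_completed' | 'transfer_failed';
      fromAccount: string;
      toAccount: string;
      amount: number;
      timestamp: Date;
    }

    // Assumed shape of the incoming request (not defined in the original snippet)
    interface Transfer {
      from: string;
      to: string;
      amount: number;
    }

    // Assumed in-memory stand-in for an append-only store; swap in a real event log client
    const eventLog = {
      entries: [] as TransferEvent[],
      async append(events: TransferEvent[]) {
        this.entries.push(...events);
      },
    };

    async function applyTransfer(transfer: Transfer) {
      // Record the intent as an event instead of mutating balances directly
      const events: TransferEvent[] = [{
        type: 'transfer_initiated',
        fromAccount: transfer.from,
        toAccount: transfer.to,
        amount: transfer.amount,
        timestamp: new Date(),
      }];
      await eventLog.append(events);
      // Async processing can replay these events
    }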
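Replaying is what turns the log back into state. Here is a rough sketch of a replay function that folds events into account balances. It repeats the `TransferEvent` shape from above so it stands alone; the `replay` function, the `Balances` type, and the rule that only `transfer_completed` events move money are assumptions made for illustration.

```typescript
// Same shape as the TransferEvent above, repeated so this sketch is self-contained.
interface TransferEvent {
  type: "transfer_initiated" | "transfer_completed" | "transfer_failed";
  fromAccount: string;
  toAccount: string;
  amount: number;
  timestamp: Date;
}

type Balances = Record<string, number>;

// Derive balances by applying events in order to a copy of the starting state.
// Assumption: only completed transfers change balances.
function replay(events: TransferEvent[], initial: Balances): Balances {
  const balances: Balances = {};
  for (const account of Object.keys(initial)) {
    balances[account] = initial[account];
  }
  for (const event of events) {
    if (event.type !== "transfer_completed") continue;
    balances[event.fromAccount] = (balances[event.fromAccount] ?? 0) - event.amount;
    balances[event.toAccount] = (balances[event.toAccount] ?? 0) + event.amount;
  }
  return balances;
}

const at = new Date(0); // fixed timestamp keeps the example deterministic
const history: TransferEvent[] = [
  { type: "transfer_initiated", fromAccount: "alice", toAccount: "bob", amount: 30, timestamp: at },
  { type: "transfer_completed", fromAccount: "alice", toAccount: "bob", amount: 30, timestamp: at },
  { type: "transfer_initiated", fromAccount: "alice", toAccount: "bob", amount: 500, timestamp: at },
  { type: "transfer_failed", fromAccount: "alice", toAccount: "bob", amount: 500, timestamp: at },
];

const opening: Balances = { alice: 100, bob: 0 };
const current = replay(history, opening);
```

Because `replay` never mutates its inputs, the same log can be replayed from any starting point, which is also what makes the log usable as an audit trail.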
Choosing the Right Approach
- Same database? Use traditional ACID transactions
- Few services, strong consistency needed? Two-phase commit
- Microservices, eventual consistency OK? Saga or event sourcing
- Need audit trail? Event sourcing
Conclusion
There's no one-size-fits-all solution. The key is understanding the trade-offs and choosing what's right for your specific constraints.