How TrueTime Achieves Bounded Uncertainty
TrueTime’s tight uncertainty bounds (1-7ms) are achieved through a combination of expensive hardware and clever algorithms.
The Physical Infrastructure
Google deploys two independent time reference systems in each datacenter:
1. GPS Time Masters
- GPS receivers with dedicated antennas on datacenter roofs
- Receive time signals directly from GPS satellites
- GPS provides ~100 nanosecond accuracy (when signal is clear)
- Failure modes: Antenna failure, jamming, atmospheric interference
2. Atomic Clocks
- Cesium or rubidium atomic clocks installed in datacenters
- Extremely stable: drift ~1 microsecond per week
- Independent of external signals
- Failure modes: Hardware failure, gradual drift over months
Why Both?
GPS and atomic clocks have independent failure modes. If GPS signals are jammed, atomic clocks keep ticking. If an atomic clock fails, GPS provides backup. This redundancy is critical for achieving 99.9999% uptime.
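As a back-of-envelope illustration of why independent failure modes matter, consider the sketch below. The per-source availability figures are invented purely for the arithmetic; they are not Google's actual numbers.

```python
# Hypothetical figures purely for illustration: suppose each time source
# is independently unavailable 0.1% of the time.
p_gps_down    = 0.001
p_atomic_down = 0.001

# With independent failure modes, time is lost only if both fail at once
p_both_down  = p_gps_down * p_atomic_down   # 0.000001
availability = 1 - p_both_down              # 0.999999 → "six nines"
print(f"{availability:.6%}")                # 99.999900%
```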
Time Synchronization Architecture
```
        GPS Satellites
           ↓  ↓  ↓
┌───────────────────────┐
│   GPS Time Masters    │
│   (multiple per DC)   │
└───────────┬───────────┘
            │
┌───────────────────────┐
│     Atomic Clocks     │
│   (multiple per DC)   │
└───────────┬───────────┘
            │
╔═══════════╧═══════════╗
║     Time Masters      ║
║    (10-20 per DC)     ║
╚═══════════╤═══════════╝
            │
  Synchronize via Marzullo's Algorithm
            │
    ┌───────┼───────┐
    ↓       ↓       ↓
[Server 1] [Server 2] ... [Server N]
```

Synchronization Process:
- Every 30 seconds: Servers poll multiple Time Masters
- Collect timestamps: Get time readings from ~10 Time Masters
- Run Marzullo’s Algorithm: Detect and exclude faulty clocks (outliers)
- Compute uncertainty: Calculate the uncertainty interval based on:
  - Network round-trip time
  - Last sync interval
  - Local clock drift rate
  - Discrepancies between Time Masters
The TrueTime Daemon
Here’s how each server maintains its TrueTime interval:
```python
class TrueTimeDaemon:
    def __init__(self):
        self.local_clock_drift_rate = 200   # worst-case local drift, microseconds per second
        self.last_sync_time = 0
        self.last_sync_uncertainty = 0      # seconds

    def poll_time_masters(self):
        """Synchronize with Time Masters using Marzullo's algorithm."""
        timestamps = []

        # Poll multiple Time Masters
        for master in TIME_MASTERS:
            t_send = local_time()
            response = master.get_time()
            t_receive = local_time()

            # Account for network delay: the master's reading could have been
            # taken anywhere within the round trip
            rtt = t_receive - t_send
            master_time = response.timestamp
            uncertainty = response.uncertainty + (rtt / 2)

            timestamps.append((master_time, uncertainty))

        # Marzullo's algorithm: find the smallest interval that overlaps
        # with a majority of the reported [earliest, latest] intervals
        intervals = [(t - u, t + u) for t, u in timestamps]
        agreed_time, agreed_uncertainty = marzullos_algorithm(intervals)

        self.last_sync_time = agreed_time
        self.last_sync_uncertainty = agreed_uncertainty

    def now(self) -> TTinterval:
        """
        Compute the current TrueTime interval.

        Uncertainty grows linearly since the last sync due to clock drift.
        """
        time_since_sync = local_time() - self.last_sync_time
        drift_uncertainty = time_since_sync * self.local_clock_drift_rate / 1_000_000  # μs → s

        total_uncertainty = self.last_sync_uncertainty + drift_uncertainty

        now = local_time()
        return TTinterval(
            earliest=now - total_uncertainty,
            latest=now + total_uncertainty,
        )
```

Result: Across Google's network, the uncertainty ε is typically:
- Average: 4ms
- Best case: 1ms (right after sync)
- Worst case: 7ms (just before next sync)
Marzullo’s Algorithm: Detecting Faulty Clocks
Marzullo’s algorithm is the secret sauce that allows TrueTime to tolerate faulty time sources.
The Problem
When you poll 10 Time Masters, you get 10 different time readings:
```
Master 1: [100.0, 100.2]
Master 2: [100.1, 100.3]
Master 3: [100.0, 100.2]
Master 4: [ 99.5,  99.7]   ← clearly wrong!
Master 5: [100.1, 100.3]
...
```

How do you determine which masters are correct and which are faulty?
The Solution
Marzullo’s algorithm finds the smallest interval that overlaps with a majority of reported intervals:
```python
def marzullos_algorithm(intervals):
    """
    Find the smallest interval overlapping with the largest number of inputs.

    intervals: list of (earliest, latest) pairs, one per Time Master.
    Returns: (agreed_time, uncertainty)
    """
    events = []

    # Create events for interval starts and ends
    for earliest, latest in intervals:
        events.append((earliest, +1))   # interval starts
        events.append((latest, -1))     # interval ends

    # Sort by time; at equal times, process starts before ends so that
    # intervals touching at a single point still count as overlapping
    events.sort(key=lambda e: (e[0], -e[1]))

    best_count = 0
    current_count = 0
    best_start = None
    best_end = None

    # Sweep the events, tracking how many intervals are open at once
    for i, (time, delta) in enumerate(events):
        current_count += delta

        if current_count > best_count:
            # A new maximum overlap begins here and lasts until the next event
            best_count = current_count
            best_start = time
            best_end = events[i + 1][0]

    # Return the midpoint and half-width of the best-supported interval
    agreed_time = (best_start + best_end) / 2
    uncertainty = (best_end - best_start) / 2

    return agreed_time, uncertainty
```

Example Execution:
```
Input intervals:
  Master 1: [100.0, 100.4]
  Master 2: [100.2, 100.6]
  Master 3: [100.1, 100.5]
  Master 4: [ 99.5,  99.9]   ← outlier!
  Master 5: [100.3, 100.7]

Overlap count along the timeline:
   99.5    100.0   100.1   100.2   100.3   100.4   100.7
    |        |       |       |       |       |       |
    1        1       2       3       4       3       1
                                     ↑
                             maximum overlap

Result: [100.3, 100.4] with 4 masters agreeing
Agreed time: 100.35, uncertainty: ±0.05
```

Key Properties:
- Fault tolerance: Tolerates up to ⌊(n-1)/2⌋ faulty masters, as long as a majority of sources report intervals containing the true time
- Optimal: Finds the smallest interval consistent with the maximum number of sources
- Fast: O(n log n), dominated by the sort
- Resistant to arbitrary faults: A faulty clock can report any interval it likes; with a correct majority, the result stays anchored to their agreement
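As a sanity check, a sketch like the `marzullos_algorithm` above can be run directly on the five intervals from the example execution (the values are the illustrative ones used in this section):

```python
intervals = [
    (100.0, 100.4),  # Master 1
    (100.2, 100.6),  # Master 2
    (100.1, 100.5),  # Master 3
    ( 99.5,  99.9),  # Master 4 (the outlier)
    (100.3, 100.7),  # Master 5
]

agreed_time, uncertainty = marzullos_algorithm(intervals)
print(agreed_time, uncertainty)   # ≈ 100.35 ±0.05; Master 4 is effectively ignored
```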
Important
Redundancy Budget: Google deploys 10-20 Time Masters per datacenter specifically to tolerate failures. Even if half fail, TrueTime continues working with bounded uncertainty.
How Uncertainty Grows Between Syncs
Between synchronizations, uncertainty grows due to clock drift:
Uncertainty over time:
```
ε (ms)
 7 |                              /
 6 |                         /
 5 |                    /
 4 |               /        ← average ε
 3 |          /
 2 |     /
 1 |/
   +-----|-----|-----|-----|-----|-----|--> time
  sync   5s    10s   15s   20s   25s   30s
                                    (next sync)
```

Formula: ε(t) = ε₀ + (drift_rate × time_since_sync)
- ε₀ = uncertainty immediately after sync (~1ms)
- drift_rate = local clock drift (~200 μs/second)
- time_since_sync = seconds since last sync (max 30s)
Maximum uncertainty: 1ms + (200 μs/s × 30s) = 1ms + 6ms = 7ms
This is why Google synchronizes every 30 seconds - to keep ε bounded!
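To make the sawtooth concrete, here is a small self-contained sketch of the formula above. The constants (1ms base uncertainty, 200 μs/s drift, 30s sync interval) come from this section; the function name is an illustrative choice.

```python
def epsilon_ms(seconds_since_sync: float,
               base_uncertainty_ms: float = 1.0,
               drift_us_per_s: float = 200.0) -> float:
    """ε(t) = ε₀ + drift_rate × time_since_sync, expressed in milliseconds."""
    return base_uncertainty_ms + (drift_us_per_s * seconds_since_sync) / 1000.0

# Uncertainty over one 30-second sync cycle:
for t in (0, 15, 30):
    print(f"{t:2d}s after sync: ε ≈ {epsilon_ms(t):.1f} ms")
# 0s → 1.0 ms (best case), 15s → 4.0 ms (average), 30s → 7.0 ms (worst case)
```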
The Cost of Infrastructure
Replicating TrueTime requires significant investment:
Hardware Costs (per datacenter)
| Component | Unit cost | Quantity | Total |
|---|---|---|---|
| Atomic clocks | $50,000 | 2-4 | $100k-$200k |
| GPS receivers | $5,000 | 4-6 | $20k-$30k |
| GPS antennas | $1,000 | 4-6 | $4k-$6k |
| Time Master servers | $10,000 | 10-20 | $100k-$200k |
| Redundant networking | $10,000 | - | $10k-$50k |
| Total per DC | | | $234k-$486k |
For Google’s ~100 datacenters: $23M-$48M in hardware alone
Operational Costs
- Maintenance: Atomic clocks require specialized technicians
- Monitoring: 24/7 monitoring of time drift and failures
- Replacement: Components fail and need replacement
- Expertise: Deep knowledge of time synchronization protocols
Annual operational cost: $5M-$10M
Warning
Why Google Can Do This: The cost of deploying atomic clocks and GPS receivers in every datacenter is significant. Most companies cannot afford this infrastructure, which is why TrueTime-style systems remain rare.
Failure Scenarios and Handling
TrueTime is designed to handle various failure modes gracefully:
1. GPS Signal Loss
```
Scenario: Construction blocks GPS antenna
Action:   Fall back to atomic clocks
Impact:   Slight uncertainty increase (~2-3ms)
Duration: Until GPS restored
```

2. Atomic Clock Failure

```
Scenario: Atomic clock hardware failure
Action:   Use remaining atomic clocks + GPS
Impact:   Minimal (redundancy)
Duration: Until replacement installed
```

3. Network Partition

```
Scenario: Time Master unreachable
Action:   Marzullo's algorithm excludes it
Impact:   Minimal if majority reachable
Duration: Until network restored
```

4. Catastrophic: All Time Masters Fail

```
Scenario: Datacenter-wide failure
Action:   ε grows unbounded, system halts writes
Impact:   SEVERE - no new transactions
Duration: Until manual intervention
```

The Final Safety: If uncertainty exceeds a threshold (~100ms), Spanner stops accepting writes rather than violating external consistency. Availability is sacrificed to maintain correctness.
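A minimal sketch of that last safety rule, reusing the `TrueTimeDaemon` from earlier: the ~100ms threshold is the one mentioned above, while the constant name and function are illustrative assumptions, not Spanner's actual implementation.

```python
# Illustrative only: the ~100ms threshold comes from the text above;
# everything else is a hypothetical sketch, not Spanner's real code.
MAX_EPSILON = 0.100  # seconds

def can_accept_writes(daemon: TrueTimeDaemon) -> bool:
    """Refuse new writes once the uncertainty bound is no longer trustworthy."""
    interval = daemon.now()
    epsilon = (interval.latest - interval.earliest) / 2
    return epsilon <= MAX_EPSILON
```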
Tip - Next Steps
Now that you understand how TrueTime achieves bounded uncertainty, let’s see how Google Spanner uses it to provide external consistency across datacenters.
Summary
- TrueTime relies on GPS + atomic clocks for redundancy
- Marzullo’s algorithm detects and excludes faulty time sources
- Uncertainty is typically 1-7ms, growing between syncs
- Infrastructure costs roughly $23M-$48M in hardware, plus $5M-$10M per year in operations, at Google's scale
- Graceful degradation: System tolerates multiple failures
- Safety over availability: Halts writes if uncertainty unbounded
The physical infrastructure is what makes TrueTime possible. Without atomic clocks and GPS, the tight uncertainty bounds simply cannot be achieved.