links
Google research paper introducing Percolator, a system built on Bigtable for incrementally processing large datasets using distributed transactions and a notification-driven computation model. It replaced the MapReduce-based batch pipeline that previously built Google's search index, so the index can be updated continuously as individual pages are crawled instead of waiting for a full global rebuild. Percolator provides cross-row, cross-table transactions with snapshot isolation to keep the data consistent, and "observers" (pieces of code) are triggered by changes to specific columns to propagate updates through the indexing pipeline. This architecture underpins the shift from the "Google Dance" (monthly index refreshes) to the Caffeine update, providing the infrastructure for near-real-time discovery of content and backlinks, although the "propagation wave" through the various ranking layers still prevents instantaneous global ranking changes.
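
To make the observer idea concrete, here is a minimal in-memory sketch in Python. It is a toy under stated assumptions, not Percolator's actual API: `ToyTable`, `observe`, `extract_links`, `update_backlinks`, and the column names `raw`, `links`, and `backlinks` are all hypothetical, and the sketch leaves out distributed transactions and snapshot isolation entirely. It only shows how a write to one column can trigger code that writes other columns, so a single crawled page propagates through a small pipeline without a global rebuild.

```python
from collections import defaultdict, deque


class ToyTable:
    """In-memory stand-in for a Bigtable-like store (hypothetical, not Percolator's API)."""

    def __init__(self):
        self.cells = {}                     # (row, column) -> value
        self.observers = defaultdict(list)  # column -> list of callbacks
        self.pending = deque()              # queued notifications: (row, column)

    def observe(self, column, callback):
        """Register a callback to run whenever `column` changes in any row."""
        self.observers[column].append(callback)

    def write(self, row, column, value):
        """Store a value; queue a notification only if the value actually changed."""
        if self.cells.get((row, column)) != value:
            self.cells[(row, column)] = value
            self.pending.append((row, column))

    def run(self):
        """Drain notifications; observers may write and thereby queue more work."""
        while self.pending:
            row, column = self.pending.popleft()
            for callback in self.observers[column]:
                callback(self, row, self.cells[(row, column)])


def extract_links(table, url, html):
    # Hypothetical observer on "raw": pull out tokens of the form href=<target>.
    links = sorted(word[5:] for word in html.split() if word.startswith("href="))
    table.write(url, "links", tuple(links))


def update_backlinks(table, url, links):
    # Hypothetical observer on "links": record `url` as a backlink of each target.
    for target in links:
        existing = table.cells.get((target, "backlinks"), ())
        table.write(target, "backlinks", tuple(sorted(set(existing) | {url})))


table = ToyTable()
table.observe("raw", extract_links)
table.observe("links", update_backlinks)

# "Crawl" one page: only the rows it touches are recomputed.
table.write("a.example", "raw", "hello href=b.example href=c.example")
table.run()

# A later crawl of a second page updates b.example incrementally.
table.write("d.example", "raw", "href=b.example")
table.run()

print(table.cells[("b.example", "backlinks")])
```

Each crawl touches only the rows reachable from the changed page, which is the contrast with a batch rebuild. In the real system the observer's reads and writes would run inside a transaction so that concurrent updates to the same row stay consistent; this sketch is single-threaded and does not model that.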