Papers
Influential papers in distributed systems and databases that I recommend reading.
-
MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat · OSDI 2004
Introduced the map/reduce programming model for processing large datasets on clusters of commodity machines; it inspired Hadoop and shaped later big data systems.
-
Dynamo: Amazon's Highly Available Key-value Store
Giuseppe DeCandia et al. · SOSP 2007
Combined and popularized existing techniques such as consistent hashing, vector clocks, and eventual consistency in a production key-value store.
-
Bigtable: A Distributed Storage System for Structured Data
Fay Chang et al. · OSDI 2006
Describes the design of Google's distributed storage system for structured data, whose data model inspired HBase and Cassandra.
-
In Search of an Understandable Consensus Algorithm (Raft)
Diego Ongaro and John Ousterhout · USENIX ATC 2014
A more understandable alternative to Paxos for implementing distributed consensus.
-
The Ubiquitous B-Tree
Douglas Comer · ACM Computing Surveys 1979
Classic survey of B-trees and their variants, the data structure fundamental to database indexing.