Distributed And Cloud-Based Storage Systems
https://ceres.cs.umd.edu/818
TTh 5:00-6:15, CSI 2120


The guiding philosophy of this course is that the best way to learn about real systems is to build one. We will gain an in-depth understanding of the issues involved in designing and deploying large-scale distributed file systems. In the course of this investigation we will be tackling a variety of topics, such as peer-to-peer systems, remote procedure calls, multi-threading, consensus protocols, cloud systems, layered systems (supporting high-level consistency guarantees on top of cloud services), and security as it relates to such systems.

Announcements:

Professor

Pete Keleher <keleher@cs.umd.edu> (include "818" in all correspondence)
Office hours: By appt.

Information

The class will consist of lectures by the instructor, student project presentations, a final, and a series of six programming projects, all in the language Go (fear not if you don't know anything about Go; we'll all be learning together). The end goal is to have built a full-scale, reliable, highly available, and secure distributed file system, using both local disks and cloud services as backing stores. My lectures will be split between those describing the tools we will use to build our file systems and those based on recent research in the literature (such as papers from FAST, OSDI, NSDI, and SOSP).

Examples of technologies we may use include FUSE (and MacFUSE), key-value stores like Bolt, gkvlite, diskv, or leveldb-go, the Amazon Simple Storage Service (and its Go binding), Google's Protocol Buffers or JSON (from Go), Google's Go language, Paxos, SQLite, and Snappy.

      Note: this paper list will change slightly as the semester goes on.

Tuesday Thursday
Aug 29
Reading: A Tour of Go, and Effective Go

Solve the following puzzle, copy your solution into a fresh playground, and send me the "Share" url before class Thursday.

(intro notes)

Aug 31
Intro/Go

"Immutability Changes Everything"

(intro2 notes)

Sep 5
Intro/Go
"The Design and Implementation of a Log-Structured File System"
     and
"A Low-bandwidth Network File System"

(LFS notes, LBFS notes)

Sep 7
Event orderings.
Global system event orderings, and time.

P1: Go, and UbiStore due Sunday.

Sep 12
"The Google File System"
     and
"GFS: Evolution on Fast-forward"

(GFS notes)

Sep 14
Versioning
"Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"
     and
"Deciding when to forget in the Elephant file system"

(paxos/byz notes)

Sep 19
"Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications"
     and
"Wide-area cooperative storage with CFS"

(chord notes)

Sep 21
"The worldwide computer"
     and
"OceanStore: An Architecture for Global-Scale Persistent Storage"

P2: Separation of Concerns, due Sunday.

Sep 26
Consensus
"Paxos Made Simple"
     and
"Paxos Made Live: an Engineering Perspective"
     
(paxos/byz notes)
Sep 28
Consensus
"In search of an understandable consensus algorithm"
     
(raft quiz)
Oct 3
Consensus
Crypto

"Egalitarian paxos"

Oct 5
"Scalable Causal Consistency for Wide-Area Storage with COPS"
     and
 "Bolt-on causal consistency"

P3: Crypto, due Sunday.

Oct 10
Databases. No reading.
Oct 12
More Databases. No reading.
Oct 17
Databases (slides) and P4
Oct 19
"Dynamo: Amazon's highly available key-value store" - armaan
     
(slides)
Oct 24
"Salt: Combining ACID and BASE in a Distributed Database"
     and
Correctness Anomalies Under Serializable Isolation (no blog)
     and
"HAT, not CAP: Towards Highly Available Transactions"
Oct 26

P4: Anti-Entropy, and Containers, due Sunday.

Oct 31
RAMCloud
"Fast crash recovery in RAMCloud"
     and
"Implementing Linearizability at Large Scale and Low Latency"
Nov 2
Spanner
"Spanner: Google's Globally-Distributed Database" - nethanial
(slides)
     and
"Living Without Atomic Clocks" (CockroachDB)
(notes)
Nov 7
Fault Tolerance and Security
"Practical Byzantine Fault Tolerance" (notes)
     and
"SPORC: Group Collaboration using Untrusted Cloud Resources" - jefferson
Nov 9
"Calvin: fast distributed transactions for partitioned database systems"
Nov 14
"Tango: Distributed data structures over a shared log"
     and
"Communicating Sequential Processes" - jason
Nov 16
"XFaaS: Hyperscale and Low Cost Serverless Functions at Meta" - abhinav
     and
"The Fuzzylog: a Partially Ordered Shared Log"
Nov 21
Pinesgiving Holiday
Nov 23
Thanksgiving Holiday

P5: Raft, due Sunday.

Nov 28
"Aegean: Replication beyond the client-server model" - olivia
     and
"SLOG: Serializable, Low-latency, Geo-replicated Transactions"
Nov 30
"Log-structured Protocols in Delos" - yancheng
Dec 5
"HyperDex: A Distributed, Searchable Key-Value Store" - om
     and
"Rabia: Simplifying State-Machine Replication Through Randomization" - mingwei
Dec 7
"MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation" - Yinuo
(presentation)
     and
"Exploiting Nil-Externality for Fast Replicated Storage" - Dev

P6: Log-Structured Objects, due midnight Dec 16.

Final Exam: due midnight Dec 16.

Late Policies

All projects will have a due date, and a late due date two days later.
  • Do each project by yourself. Sadly, each semester we can and do detect and fail those who do not abide by this policy. You may ask, and answer, general questions on Piazza.
  • A project turned in after the due date, but by the late due date, loses 20% of the maximum score. Anything turned in after the late due date receives a zero.

Attendance and general grading policies

Students are responsible for all material covered, and all announcements, deadlines, policies, etc., discussed in lecture and discussion section, regardless of whether they were in class to hear the information or not. It’s understood that students may occasionally have to miss class for various reasons, but email and office hours are not intended as a replacement for class attendance. Consequently, only students who typically and regularly attend class will receive assistance during office hours.

Coursework will count toward the final grade according to the following percentages:

  1. Projects: 70%
    • There will be six projects: the first three are worth 10% each, P4 and P5 are worth 15% each, and P6 is worth 10%.
    • You must earn at least half credit on each project to pass the course.
    • You are required to upload a blog entry before each class except the first. More details in class.
  2. Paper presentation / class participation / blog entries: 10%
  3. Final exam: 20%
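
As a rough illustration of how these percentages combine, here is a minimal Go sketch. The function finalGrade and its inputs (fractional scores between 0 and 1) are hypothetical, and the half-credit check is one reading of the rule above; the actual grade computation may differ.

```go
package main

import "fmt"

// projectWeights holds each project's share of the final grade in percent,
// in order P1..P6, as listed above (they sum to 70).
var projectWeights = []float64{10, 10, 10, 15, 15, 10}

const (
	participationWeight = 10.0 // paper presentation / participation / blog entries
	finalExamWeight     = 20.0
)

// finalGrade (hypothetical helper) combines fractional scores in [0, 1] into
// a percentage. The boolean reports whether every project earned at least
// half credit, which the policy above requires in order to pass.
func finalGrade(projects []float64, participation, exam float64) (float64, bool) {
	if len(projects) != len(projectWeights) {
		return 0, false
	}
	total := participation*participationWeight + exam*finalExamWeight
	halfCreditOnAll := true
	for i, score := range projects {
		total += score * projectWeights[i]
		if score < 0.5 {
			halfCreditOnAll = false
		}
	}
	return total, halfCreditOnAll
}

func main() {
	// Example scores invented for illustration only.
	grade, ok := finalGrade([]float64{1, 0.9, 0.8, 0.4, 1, 1}, 1, 0.9)
	fmt.Printf("grade: %.1f%%, half credit on every project: %v\n", grade, ok)
}
```

In the example, the 0.4 on P4 trips the half-credit rule even though the weighted total is otherwise high.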

Academic integrity

The Campus Senate has adopted a policy asking students to include the following statement on each examination or assignment in every course: “I pledge on my honor that I have not given or received any unauthorized assistance on this examination (or assignment).” Consequently, you will be requested to include this pledge on each exam and project. You may review the University’s Code of Academic Integrity for yourself at
https://www.faculty.umd.edu/teach/integrity.html
