
Commit 653458e

Formatting and typos
1 parent 60990ae commit 653458e

File tree

- html/decandia2007dynamo.html
- html/diaconu2013hekaton.html
- html/terry1995managing.html
- papers/decandia2007dynamo.md
- papers/terry1995managing.md

5 files changed: +38 -28 lines changed

html/decandia2007dynamo.html

Lines changed: 23 additions & 12 deletions

@@ -11,23 +11,34 @@
 <a href="../">Papers</a>
 </div>
 <div id="container">
-<h2 id="dynamo-amazons-highly-available-key-value-store-2007"><a href="https://scholar.google.com/scholar?cluster=5432858092023181552&amp;hl=en&amp;as_sdt=0,5">Dynamo: Amazon's Highly Available Key-value Store (2007)</a></h2>
-<p><strong>Overview.</strong> Amazon has a service-oriented infrastructure which consists of a large number of networked services, each with a strict <em>SLA</em>: a formal contract between the clients and server which guarantees the server meet certain performance benchmarks (e.g. 99.9% of responses are within 500 milliseconds). Amazon's user-facing business model makes it more important to meet the SLAs by providing availability, scalability, and low-latency than it is to provide strong consistency. Dynamo is Amazon's distributed higly-available eventually consistent zero-hop distributed hash table (a.k.a. key-value store) that uses consistent hashing, vector clocks, quorums, gossip, and more.</p>
-<p><strong>System Interface.</strong> Dynamo is a key-value store where the values are arbitrary blobs of data. Users can issue <code>get(key)</code> requests which returns either an object or a list of conflicting objects and a context. If multiple objects are returned, the user is responsible for merging them. Moreover, users can issue <code>put(key, context, value)</code> requests where <code>context</code> is used to maintain version clocks.</p>
-<p><strong>Partitioning Algorithm.</strong> Dynamo uses consistent hashing to partition data very similarly to Chord. Data is hashed into a circular space. Nodes are broken down into virtual nodes, each of which is randomly provided a point in the circular key space. Each node is responsible for all the keys between it and its predecessor. The number of virtual nodes at each physical node can be tuned according to the capacity of the node.</p>
-<p><strong>Replication.</strong> Data is sent to a <em>coordinator</em> which writes the data locally and also sends the data to N-1 other nodes. Moreover, each data item has a <em>preference list</em> of nodes where it should be written, and each node in the system knows the preference list for all data items.</p>
-<p><strong>Data Versioning.</strong> Data in Dynamo is timestamped with a vector clock. If a write <code>a</code> happens before a write <code>b</code>, then the two writes can be reconciled trivially; this is known as <em>syntactic reconciliation</em>. However, if <code>a</code> and <code>b</code> are concurrent, then the system or the user has to perform <em>semantic reconciliation</em>. To avoid vector clocks of unbounded size, vector clocks are given a maximum size, and each entry in a vector clock is timestamped with a physical time. When the vector clock exceeds its maximum size, the oldest entry is evicted.</p>
-<p><strong>Execution of <code>get()</code> and <code>put()</code>.</strong> To execute a <code>get()</code> or <code>put()</code>, a Dynamo client can</p>
+<h1 id="dynamo-amazons-highly-available-key-value-store-2007"><a href="https://scholar.google.com/scholar?cluster=5432858092023181552&amp;hl=en&amp;as_sdt=0,5">Dynamo: Amazon's Highly Available Key-value Store (2007)</a></h1>
+<h2 id="overview">Overview</h2>
+<p>Amazon has a service-oriented infrastructure which consists of a large number of networked services, each with a strict <em>SLA</em>: a formal contract between the clients and server which guarantees the server meet certain performance benchmarks (e.g. 99.9% of responses are within 500 milliseconds). Amazon's user-facing business model makes it more important to meet the SLAs by providing availability, scalability, and low-latency than it is to provide strong consistency. Dynamo is Amazon's distributed higly-available eventually consistent zero-hop distributed hash table (a.k.a. key-value store) that uses consistent hashing, vector clocks, quorums, gossip, and more.</p>
+<h2 id="system-interface">System Interface</h2>
+<p>Dynamo is a key-value store where the values are arbitrary blobs of data. Users can issue <code>get(key)</code> requests which returns either an object or a list of conflicting objects and a context. If multiple objects are returned, the user is responsible for merging them. Moreover, users can issue <code>put(key, context, value)</code> requests where <code>context</code> is used to maintain version clocks.</p>
+<h2 id="partitioning-algorithm">Partitioning Algorithm</h2>
+<p>Dynamo uses consistent hashing to partition data very similarly to Chord. Data is hashed into a circular space. Nodes are broken down into virtual nodes, each of which is randomly provided a point in the circular key space. Each node is responsible for all the keys between it and its predecessor. The number of virtual nodes at each physical node can be tuned according to the capacity of the node.</p>
+<h2 id="replication">Replication</h2>
+<p>Data is sent to a <em>coordinator</em> which writes the data locally and also sends the data to N-1 other nodes. Moreover, each data item has a <em>preference list</em> of nodes where it should be written, and each node in the system knows the preference list for all data items.</p>
+<h2 id="data-versioning">Data Versioning</h2>
+<p>Data in Dynamo is timestamped with a vector clock. If a write <code>a</code> happens before a write <code>b</code>, then the two writes can be reconciled trivially; this is known as <em>syntactic reconciliation</em>. However, if <code>a</code> and <code>b</code> are concurrent, then the system or the user has to perform <em>semantic reconciliation</em>. To avoid vector clocks of unbounded size, vector clocks are given a maximum size, and each entry in a vector clock is timestamped with a physical time. When the vector clock exceeds its maximum size, the oldest entry is evicted.</p>
+<h2 id="execution-of-get-and-put">Execution of <code>get()</code> and <code>put()</code></h2>
+<p>To execute a <code>get()</code> or <code>put()</code>, a Dynamo client can</p>
 <ol style="list-style-type: decimal">
 <li>Issue a request to a load balancer, or</li>
 <li>issue it itself if it is a partition aware client (more on this later).</li>
 </ol>
 <p>Dynamo uses quorums to write data. A read must be acknowledged by <code>R</code> servers, a write must be acknowledged by <code>W</code> servers, and <code>R + W &gt; N</code>.</p>
-<p><strong>Handling Failures.</strong> Dynamo uses a <em>sloppy quorum</em> where data can be stored at a node outside its preference list. The data is tagged with the node where the data should be, and the node transfers it there eventually. Moreover, preference lists span multiple data centers.</p>
-<p><strong>Handling Permanent Failures.</strong> Nodes user Merkle trees to determine what state has diverged from one another.</p>
-<p><strong>Membership and Failure Detection.</strong> Membership changes are initiated manually by a human. Nodes gossip membership information and use it transfer data to the newly joined and removed nodes. There are also seed nodes in the ring which nodes always gossip with to avoid a split ring.</p>
-<p><strong>Implementation.</strong> Dynamo is implemented with a pluggable storage engine and uses a SEDA architecture implemented in Java.</p>
-<p><strong>Experiences and Lessons Learned.</strong> Amazon has learned a lot from its experience with Dynamo:</p>
+<h2 id="handling-failures">Handling Failures</h2>
+<p>Dynamo uses a <em>sloppy quorum</em> where data can be stored at a node outside its preference list. The data is tagged with the node where the data should be, and the node transfers it there eventually. Moreover, preference lists span multiple data centers.</p>
+<h2 id="handling-permanent-failures">Handling Permanent Failures</h2>
+<p>Nodes user Merkle trees to determine what state has diverged from one another.</p>
+<h2 id="membership-and-failure-detection">Membership and Failure Detection</h2>
+<p>Membership changes are initiated manually by a human. Nodes gossip membership information and use it transfer data to the newly joined and removed nodes. There are also seed nodes in the ring which nodes always gossip with to avoid a split ring.</p>
+<h2 id="implementation">Implementation</h2>
+<p>Dynamo is implemented with a pluggable storage engine and uses a SEDA architecture implemented in Java.</p>
+<h2 id="experiences-and-lessons-learned">Experiences and Lessons Learned</h2>
+<p>Amazon has learned a lot from its experience with Dynamo:</p>
 <ul>
 <li><em>Balancing performance and durability.</em> Improving durability can decrease performance. For example, if we want to write to <code>W</code> nodes, then increasing <code>W</code> decreases the availability of the system. Dynamo allows writes to be buffered by nodes, rather than written to their disks to increase availability at the cost of durability.</li>
 <li><em>Ensuring uniform load.</em> Assuming there are enough hot keys, hashing data into a circular key space should ensure uniform load. However, the partitioning scheme described above where each node is divided into some number of virtual nodes, the virtual nodes are placed randomly on the ring, and the node placement determines data partitioning has some downsides. Two alternatives are to divide the key space into equal sized partitions and give each node a random number of virtual nodes. Or, to divide the key space into equal sized partitions and adjust the total number of tokens as nodes join and leave the system.</li>
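As an illustration of the partitioning scheme this file summarizes, here is a minimal Python sketch. All names (`Ring`, `ring_hash`, `num_vnodes`) are invented for this example and MD5 is an arbitrary choice; the paper describes the scheme but not a concrete implementation:

```python
import bisect
import hashlib

def ring_hash(s: str) -> int:
    # Hash onto the circular key space [0, 2^128); MD5 is an arbitrary choice.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self) -> None:
        self.tokens: list[int] = []       # sorted virtual-node positions
        self.owner: dict[int, str] = {}   # token -> physical node

    def add_node(self, node: str, num_vnodes: int) -> None:
        # Each physical node gets several points on the ring; more virtual
        # nodes means a larger share of keys, so capacity is tunable per node.
        for i in range(num_vnodes):
            token = ring_hash(f"{node}#vnode{i}")
            bisect.insort(self.tokens, token)
            self.owner[token] = node

    def lookup(self, key: str) -> str:
        # A virtual node owns every key between its predecessor and itself,
        # i.e. a key belongs to the first token clockwise from its position.
        i = bisect.bisect_right(self.tokens, ring_hash(key)) % len(self.tokens)
        return self.owner[self.tokens[i]]

ring = Ring()
ring.add_node("node-a", 8)
ring.add_node("node-b", 16)   # twice the capacity of node-a
print(ring.lookup("shopping-cart:42"))
```

Lookup is a binary search over the sorted token list, and removing a physical node only reassigns the keys its virtual nodes owned, which is the point of consistent hashing.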

html/diaconu2013hekaton.html

Lines changed: 1 addition & 1 deletion

@@ -36,7 +36,7 @@ <h2 id="programmability-and-query-processing">Programmability and Query Processi
 <p>Hekaton does not compile query plans into a series of function calls. Instead, a query plan is compiled into a single function and operators are connected together via labels and gotos. This allows the code to bypass some otherwise unnecessary function calls. For example, when the query is initially executed, it jumps immediately to the leaves of the query plan rather than recursively calling down to them. Some code (e.g. sort and complicated arithmetic functions) is not generated.</p>
 <p>Hekaton stored procedures have some restrictions (e.g. the schema of the tables that a stored procedure reads must be fixed, the stored procedures must execute within a single transaction). To overcome some of these restrictions, SQL Server allows regular/unrestricted/interpreted stored procedures to read and write Hekaton tables.</p>
 <h2 id="transaction-management">Transaction Management</h2>
-<p>Hekaton supports snapshot isolation, repeatable read, and serializability all implemented with multiversion concurrency control. There are two conditions which can be checked during validation:</p>
+<p>Hekaton supports snapshot isolation, repeatable read, and serializability all implemented with optimistic multiversion concurrency control. There are two conditions which can be checked during validation:</p>
 <ol style="list-style-type: decimal">
 <li><strong>Read stability</strong>. All the versions that a transaction read must still be valid versions upon commit.</li>
 <li><strong>Phantom avoidance</strong>. All the scans a transaction made must be repeatable upon commit.</li>
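The two validation conditions in this hunk can be made concrete with a small sketch. This is a hypothetical Python illustration of optimistic multiversion validation, not Hekaton's code (Hekaton is a C engine inside SQL Server); every class and field name here is invented:

```python
# Hypothetical sketch of optimistic validation at commit time.
class Database:
    def __init__(self) -> None:
        self.versions: dict[str, int] = {}  # key -> version, bumped per write

    def write(self, key: str) -> None:
        self.versions[key] = self.versions.get(key, 0) + 1

class Transaction:
    def __init__(self, db: Database) -> None:
        self.db = db
        self.read_set: list[tuple[str, int]] = []  # (key, version) observed
        self.scan_set: list = []                   # (predicate, result set)

    def read(self, key: str) -> int:
        version = self.db.versions.get(key, 0)
        self.read_set.append((key, version))
        return version

    def scan(self, predicate) -> frozenset:
        rows = frozenset(k for k in self.db.versions if predicate(k))
        self.scan_set.append((predicate, rows))
        return rows

    def validate(self) -> bool:
        # Read stability: every version this transaction read must still be
        # the current version at commit time.
        reads_ok = all(self.db.versions.get(k, 0) == v
                       for k, v in self.read_set)
        # Phantom avoidance: repeating each scan must return the same rows,
        # so no row appeared or disappeared under the predicate.
        scans_ok = all(frozenset(k for k in self.db.versions if p(k)) == rows
                       for p, rows in self.scan_set)
        return reads_ok and scans_ok

db = Database()
db.write("x")
txn = Transaction(db)
txn.read("x")
txn.scan(lambda k: k.startswith("x"))
db.write("x")            # a concurrent writer updates x
print(txn.validate())    # False: read stability is violated
```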

html/terry1995managing.html

Lines changed: 1 addition & 1 deletion

@@ -29,7 +29,7 @@ <h3 id="merge-procedures">Merge Procedures</h3>
 <h2 id="replica-consistency">Replica Consistency</h2>
 <p>Every server maintains a logical timestamp that is roughly kept in correspondence with its physical time. Servers tag writes with an id of the form (timestamp, server id). These ids form a total order, and servers order writes with respect to it. Servers immediately apply writes whenever they are received, and these writes are <strong>tentative</strong>. Slowly, writes are deemed <strong>committed</strong> and ordered before the tentative writes. It's possible that a new write appears and is inserted in the middle of the sequence of writes. This forces a server to <em>undo</em> the affects of later writes. The undo process is described later.</p>
 <h2 id="write-stability-and-commitment">Write Stability and Commitment</h2>
-<p>When a write is applied by a server for the last time, it is considered <strong>stable</strong> (equivalently, committed). Clients can query servers to see which writes have been committed. How to servers commit writes? One approach is to commit a write whenever its timestamp is less than the current timestamp of all servers. Unfortunately, if any of the servers is disconnected, this strategy can delay commit. In Bayou, a single server is designated as the primary and determines the order in which writes are committed. If this primary becomes disconnected, other servers may not see committed data for a while.</p>
+<p>When a write is applied by a server for the last time, it is considered <strong>stable</strong> (equivalently, committed). Clients can query servers to see which writes have been committed. How do servers commit writes? One approach is to commit a write whenever its timestamp is less than the current timestamp of all servers. Unfortunately, if any of the servers is disconnected, this strategy can delay commit. In Bayou, a single server is designated as the primary and determines the order in which writes are committed. If this primary becomes disconnected, other servers may not see committed data for a while.</p>
 <h2 id="storage-system-implementation-issues">Storage System Implementation Issues</h2>
 <p>There are three main components to each server: a write log, a tuple store, and an undo log.</p>
 <ol style="list-style-type: decimal">
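The replica-consistency machinery this file describes (write ids of the form (timestamp, server id), tentative writes, undo and replay) can be sketched briefly. The following Python toy uses invented names and elides the tuple store and undo log; it only shows how the total order on ids drives log insertion and forces undo:

```python
import bisect

class Server:
    def __init__(self, server_id: int) -> None:
        self.server_id = server_id
        self.clock = 0   # logical clock, kept roughly in step with physical time
        # Tentative writes as ((timestamp, server_id), op); Python compares
        # these tuples lexicographically, so the ids form a total order.
        self.log: list[tuple[tuple[int, int], str]] = []

    def local_write(self, op: str) -> None:
        self.clock += 1
        self.receive(((self.clock, self.server_id), op))

    def receive(self, write: tuple[tuple[int, int], str]) -> None:
        (ts, _), _ = write
        self.clock = max(self.clock, ts)  # keep clocks loosely synchronized
        # Insert in id order. If the write sorts before already-applied
        # writes, a real server must undo those writes (via its undo log)
        # and replay them after the newcomer; here we just report it.
        pos = bisect.bisect_left(self.log, write)
        to_replay = self.log[pos:]
        self.log.insert(pos, write)
        if to_replay:
            print(f"undo and replay {len(to_replay)} later write(s)")

s = Server(server_id=1)
s.local_write("append A")        # id (1, 1)
s.local_write("append B")        # id (2, 1)
s.receive(((1, 0), "append C"))  # sorts before both: forces undo and replay
```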

papers/decandia2007dynamo.md

Lines changed: 12 additions & 13 deletions

@@ -1,5 +1,5 @@
-## [Dynamo: Amazon's Highly Available Key-value Store (2007)](https://scholar.google.com/scholar?cluster=5432858092023181552&hl=en&as_sdt=0,5)
-**Overview.**
+# [Dynamo: Amazon's Highly Available Key-value Store (2007)](https://scholar.google.com/scholar?cluster=5432858092023181552&hl=en&as_sdt=0,5)
+## Overview
 Amazon has a service-oriented infrastructure which consists of a large number
 of networked services, each with a strict *SLA*: a formal contract between the
 clients and server which guarantees the server meet certain performance
@@ -10,28 +10,28 @@ strong consistency. Dynamo is Amazon's distributed higly-available eventually
 consistent zero-hop distributed hash table (a.k.a. key-value store) that uses
 consistent hashing, vector clocks, quorums, gossip, and more.

-**System Interface.**
+## System Interface
 Dynamo is a key-value store where the values are arbitrary blobs of data. Users
 can issue `get(key)` requests which returns either an object or a list of
 conflicting objects and a context. If multiple objects are returned, the user
 is responsible for merging them. Moreover, users can issue `put(key, context,
 value)` requests where `context` is used to maintain version clocks.

-**Partitioning Algorithm.**
+## Partitioning Algorithm
 Dynamo uses consistent hashing to partition data very similarly to Chord. Data
 is hashed into a circular space. Nodes are broken down into virtual nodes, each
 of which is randomly provided a point in the circular key space. Each node is
 responsible for all the keys between it and its predecessor. The number of
 virtual nodes at each physical node can be tuned according to the capacity of
 the node.

-**Replication.**
+## Replication
 Data is sent to a *coordinator* which writes the data locally and also sends
 the data to N-1 other nodes. Moreover, each data item has a *preference list*
 of nodes where it should be written, and each node in the system knows the
 preference list for all data items.

-**Data Versioning.**
+## Data Versioning
 Data in Dynamo is timestamped with a vector clock. If a write `a` happens
 before a write `b`, then the two writes can be reconciled trivially; this is
 known as *syntactic reconciliation*. However, if `a` and `b` are concurrent,
@@ -40,7 +40,7 @@ vector clocks of unbounded size, vector clocks are given a maximum size, and
 each entry in a vector clock is timestamped with a physical time. When the
 vector clock exceeds its maximum size, the oldest entry is evicted.

-**Execution of `get()` and `put()`.**
+## Execution of `get()` and `put()`
 To execute a `get()` or `put()`, a Dynamo client can

 1. Issue a request to a load balancer, or
@@ -49,26 +49,26 @@ To execute a `get()` or `put()`, a Dynamo client can
 Dynamo uses quorums to write data. A read must be acknowledged by `R` servers,
 a write must be acknowledged by `W` servers, and `R + W > N`.

-**Handling Failures.**
+## Handling Failures
 Dynamo uses a *sloppy quorum* where data can be stored at a node outside its
 preference list. The data is tagged with the node where the data should be, and
 the node transfers it there eventually. Moreover, preference lists span
 multiple data centers.

-**Handling Permanent Failures.**
+## Handling Permanent Failures
 Nodes user Merkle trees to determine what state has diverged from one another.

-**Membership and Failure Detection.**
+## Membership and Failure Detection
 Membership changes are initiated manually by a human. Nodes gossip membership
 information and use it transfer data to the newly joined and removed nodes.
 There are also seed nodes in the ring which nodes always gossip with to avoid
 a split ring.

-**Implementation.**
+## Implementation
 Dynamo is implemented with a pluggable storage engine and uses a SEDA
 architecture implemented in Java.

-**Experiences and Lessons Learned.**
+## Experiences and Lessons Learned
 Amazon has learned a lot from its experience with Dynamo:

 - *Balancing performance and durability.* Improving durability can decrease
@@ -93,4 +93,3 @@ Amazon has learned a lot from its experience with Dynamo:
 - *Balancing foreground and background.* Dynamo uses a resource controller to
   implement admission control for background tasks, preventing them from
   interfering with important foreground tasks.
-
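As a companion to the data-versioning section of this file, here is a hedged Python sketch of vector-clock comparison and truncation. The function names are invented and Dynamo's internal representation is not public; the sketch only illustrates the rules the summary states:

```python
def descends(a: dict[str, int], b: dict[str, int]) -> bool:
    # True if version a has seen every event in b, i.e. b happened before a
    # (or the two are equal).
    return all(a.get(node, 0) >= n for node, n in b.items())

def reconcile(a: dict[str, int], b: dict[str, int]):
    if descends(a, b):
        return a      # syntactic reconciliation: a supersedes b
    if descends(b, a):
        return b      # syntactic reconciliation: b supersedes a
    return None       # concurrent: semantic reconciliation is required

def truncate(clock: dict[str, int], stamps: dict[str, float],
             max_size: int) -> dict[str, int]:
    # Bound the clock's size: each entry carries a physical timestamp in
    # `stamps`, and the oldest entry is evicted first. This can discard
    # causality information, turning some ordered versions into apparently
    # concurrent ones.
    while len(clock) > max_size:
        oldest = min(clock, key=lambda node: stamps[node])
        del clock[oldest]
    return clock

v1 = {"sx": 2, "sy": 1}
v2 = {"sx": 1, "sy": 1}
v3 = {"sx": 1, "sy": 2}
assert reconcile(v1, v2) == v1    # v2 happened before v1
assert reconcile(v1, v3) is None  # concurrent writes: caller must merge
```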

papers/terry1995managing.md

Lines changed: 1 addition & 1 deletion

@@ -72,7 +72,7 @@ described later.
 ## Write Stability and Commitment
 When a write is applied by a server for the last time, it is considered
 **stable** (equivalently, committed). Clients can query servers to see which
-writes have been committed. How to servers commit writes? One approach is to
+writes have been committed. How do servers commit writes? One approach is to
 commit a write whenever its timestamp is less than the current timestamp of all
 servers. Unfortunately, if any of the servers is disconnected, this strategy
 can delay commit. In Bayou, a single server is designated as the primary and
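The non-primary commit rule in this hunk fits in a couple of lines. A minimal sketch (the function name is invented for illustration):

```python
def committable(write_ts: int, server_clocks: list[int]) -> bool:
    # A write can commit once its timestamp is below every server's logical
    # clock: no server can still introduce a write ordered before it. One
    # disconnected (stalled) server blocks all commits, which is why Bayou
    # uses a designated primary instead.
    return write_ts < min(server_clocks)

assert committable(8, [12, 15, 9])       # every clock is past 8: stable
assert not committable(10, [12, 15, 9])  # the server at 9 may still write 10
```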
