CSE 662 Fall 2019 - $Bloom^L$

Sept. 10

### How is distributed consistency enforced?

1. Don't allow the program to get into an inconsistent state.
2. Detect inconsistencies and fix them after the fact.
3. Eventually converge to a consistent state.

### The CALM Principle

A monotonic program eventually converges naturally.

### Monotonicity

"Once you learn a fact, it never becomes false" (although you might never learn all available facts)

#### Computation under monotonicity

1. What facts do you know right now?
2. What facts can you compute given what you know?
3. Broadcast/react to newly discovered facts

### What causes concurrency violations?

A computation step needs a complete input before it can produce a complete output

The output is incorrect if...

• The input is incomplete, and...
• An incomplete output is not correct.

Two ways to prevent this:

• Avoid incomplete inputs
  • Block until you know all inputs are ready (Point of Order)
• Avoid computations where incomplete outputs are incorrect
  • Monotonic programs never produce incorrect outputs, just incomplete ones.

### Negation

$$R = \{A, B, C\}; S = \{C, D\}$$

Let's say that $T = R - S$

You know two facts about $T$: $A \in T$ and $B \in T$

If you ever learn that $A \in S$,
the "fact" that $A \in T$ becomes false
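A minimal plain-Ruby sketch of the example above, assuming facts about $S$ arrive over time and $T$ is recomputed after each arrival. The helper `difference` and the constant names are made up for illustration.

```ruby
require 'set'

# T = R - S, recomputed from whatever is currently known about R and S.
def difference(r, s)
  r - s
end

R_KNOWN = Set['A', 'B', 'C']
S_EARLY = Set['C', 'D']          # what we know about S at first
S_LATER = S_EARLY | Set['A']     # S after learning that A is in S

T_EARLY = difference(R_KNOWN, S_EARLY)   # contains A and B
T_LATER = difference(R_KNOWN, S_LATER)   # A has been retracted

puts T_EARLY.to_a.inspect
puts T_LATER.to_a.inspect

# A monotone output would only grow, so the earlier output would be a
# subset of the later one. Here it is not.
puts T_EARLY.subset?(T_LATER)
```

The input $S$ only grew, yet the output $T$ shrank: that is what makes set difference non-monotonic in its second argument.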

### Aggregation

$$R = \{1, 2, 3\}$$

Let's say that $T = \sum_{i \in R} i$

You know several facts about $T$ including: $T = 6$

If you ever learn that $4 \in R$,
the "fact" that $T = 6$ becomes false
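The same effect for aggregation, as a small plain-Ruby sketch. The helper `total` and the constants are hypothetical names. The last line is an aside: a lower bound on the sum stays true, under the extra assumption that $R$ only ever gains non-negative numbers.

```ruby
# T = sum of everything currently known to be in R.
def total(r)
  r.sum
end

R_EARLY = [1, 2, 3]
R_LATER = R_EARLY + [4]   # we learn that 4 is also in R

puts total(R_EARLY)   # the claim "T = 6"
puts total(R_LATER)   # the old claim is replaced, not extended

# By contrast, the weaker claim "T >= 6" survives the new input
# (assuming only non-negative numbers are ever added to R).
puts total(R_LATER) >= total(R_EARLY)
```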

### Datalog

Atoms: $Parent(A, B)$
($A$ is a $Parent$ of $B$)

Rules: $Ancestor(A, B)$ :- $Parent(A, B)$
(If $A$ is a $Parent$ of $B$, then $A$ is also an $Ancestor$ of $B$)

$Ancestor(A, B)$ :- $Parent(A, B)$

$Ancestor(A, C)$ :- $Parent(A, B), Ancestor(B, C)$

($A$ is an $Ancestor$ of $C$ if $A$ is a $Parent$ of $B$ and $B$ is an $Ancestor$ of $C$)

$Ancestor$ computes the transitive closure of $Parent$

No fact (atom) that you can ever learn will invalidate a fact that you've already learned.
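To make the two rules concrete, here is a naive bottom-up evaluation in plain Ruby. It is only a sketch, not how a Datalog engine is actually implemented. The function name `ancestors` and the sample family are assumptions.

```ruby
require 'set'

# Naive bottom-up evaluation of the two Ancestor rules.
# parent: a Set of [a, b] pairs meaning "a is a Parent of b".
def ancestors(parent)
  # Rule 1: Ancestor(A, B) :- Parent(A, B)
  ancestor = parent.dup
  loop do
    # Rule 2: Ancestor(A, C) :- Parent(A, B), Ancestor(B, C)
    derived = Set.new
    parent.each do |a, b|
      ancestor.each do |b2, c|
        derived << [a, c] if b == b2
      end
    end
    grown = ancestor | derived
    break if grown == ancestor   # fixpoint: nothing new can be derived
    ancestor = grown             # facts are only ever added, never removed
  end
  ancestor
end

# Hypothetical family: alice -> bob -> carol
PARENT = Set[%w[alice bob], %w[bob carol]]
puts ancestors(PARENT).to_a.inspect
```

Adding another `Parent` fact can only enlarge the result; every previously derived `Ancestor` fact remains in it.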

### Bloom

Datalog with timesteps and asynchronous events

| Symbol | Meaning |
|--------|---------|
| `<=` | Add a new fact right now |
| `<+` | Add a new fact in the next timestep |
| `<-` | Remove a fact from the next timestep |
| `<~` | Send a fact to another node |

### Example: Shortest Paths (Dijkstra's)

    require 'bud'

    class ShortestPaths
      include Bud

      state {
        table :link, [:from, :to] => [:cost]
        scratch :path, [:from, :to, :next_hop, :cost]
        scratch :min_cost, [:from, :to] => [:cost]
      }

      bloom {
        # Base case: every link is a one-hop path
        path <= link {|l| [l.from, l.to, l.to, l.cost]}
        # Recursive case: extend a known path with one more link at its start
        path <= (link*path).pairs(:to => :from) { |l,p|
          [l.from, p.to, l.to, l.cost + p.cost]
        }
        # Cheapest known path for each (from, to) pair
        min_cost <= path.group([:from, :to], min(:cost))
      }
    end

### Optimization


    path <= (link*path).pairs(:to => :from) { |l,p|
      [l.from, p.to, l.to, l.cost + p.cost]
    }


Compute only new facts: This step can be performed incrementally as path entries are added
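One way to picture "compute only new facts" is semi-naive evaluation: each round joins `link` against only the paths discovered in the previous round. The sketch below is plain Ruby rather than Bud, and makes no claim about how Bud's evaluator works internally. The names `all_paths`, `min_costs`, and `LINKS` are made up, and the link graph is assumed to be acyclic.

```ruby
require 'set'

# Semi-naive evaluation of the recursive path rule.
# links: Set of [from, to, cost]. Paths are [from, to, next_hop, cost].
# Assumes an acyclic link graph; with a cycle, ever-costlier paths keep appearing.
def all_paths(links)
  paths = links.map { |f, t, c| [f, t, t, c] }.to_set
  delta = paths                        # facts discovered in the last round
  until delta.empty?
    fresh = Set.new
    links.each do |lf, lt, lc|
      # Join each link against the new paths only, not against all paths.
      delta.each do |pf, pt, _hop, pc|
        fresh << [lf, pt, lt, lc + pc] if lt == pf
      end
    end
    delta = fresh - paths              # keep only facts not seen before
    paths |= delta
  end
  paths
end

# Cheapest cost per (from, to) pair, like the min_cost rule.
def min_costs(paths)
  paths.group_by { |f, t, _hop, _c| [f, t] }
       .transform_values { |ps| ps.map { |p| p[3] }.min }
end

LINKS = Set[['a', 'b', 1], ['b', 'c', 2], ['a', 'c', 5]]
puts min_costs(all_paths(LINKS)).inspect
```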

### Example: Quorum Voting


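    require 'bud'

    # QUORUM_SIZE and RET_ADDR are assumed to be constants defined elsewhere
    class QuorumVote
      include Bud

      state {
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        table :votes, [:voter_id]
        scratch :cnt, [] => [:cnt]
      }

      bloom {
        # Record every vote that arrives on the channel
        votes <= vote_chn {|v| [v.voter_id]}
        # Count the votes seen so far
        cnt <= votes.group(nil, count(:voter_id))
        # Announce the result once the count reaches the quorum
        result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
      }
    end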

### Problem!


    cnt <= votes.group(nil, count(:voter_id))
    result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}


cnt isn't monotonic: It can't be computed until all votes are present and available!

### Sets of facts are monotonic... but not the only thing that's monotonic

• Growing Integers
• Facts that become true
• Many aggregates (MAX, MIN, COUNT)
• Vector Clocks

There's a term for these... bounded join semilattices

### Bounded Join Semilattice

$$\langle S, \sqcup, \bot \rangle$$

$S$: A set (Integers, Boolean values, Sets of facts)

$\sqcup : S \times S \rightarrow S$: A 'merge' operation for elements of $S$

$\bot \in S$: A 'starting' element of $S$

### "Merge"

(Least upper bound)

• Associative: $(a \sqcup b) \sqcup c = a \sqcup (b \sqcup c)$
• Commutative: $a \sqcup b = b \sqcup a$
• Idempotent: $a \sqcup a = a$

Defines a partial order: $a \leq b$ if $a \sqcup b = b$
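The three laws and the induced order can be spot-checked on sample values. The sketch below is plain Ruby with made-up names (`semilattice_laws?`, `below?`); passing on a handful of samples is evidence, not a proof.

```ruby
require 'set'

# Candidate merge functions.
MAX_MERGE = ->(a, b) { [a, b].max }
SET_MERGE = ->(a, b) { a | b }
PLUS      = ->(a, b) { a + b }     # not a valid merge: it is not idempotent

# Check the three laws on every combination of the sample values.
def semilattice_laws?(merge, samples)
  samples.product(samples, samples).all? do |a, b, c|
    merge.(merge.(a, b), c) == merge.(a, merge.(b, c)) &&  # associative
      merge.(a, b) == merge.(b, a) &&                      # commutative
      merge.(a, a) == a                                    # idempotent
  end
end

# The order induced by a merge: a is below b when merging a into b changes nothing.
def below?(merge, a, b)
  merge.(a, b) == b
end

puts semilattice_laws?(MAX_MERGE, [1, 6, 7, 10])
puts semilattice_laws?(SET_MERGE, [Set[], Set[1], Set[2], Set[1, 2]])
puts semilattice_laws?(PLUS, [1, 2])
puts below?(SET_MERGE, Set[1], Set[1, 2])   # comparable
puts below?(SET_MERGE, Set[1], Set[2])      # incomparable: the order is only partial
```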

### Examples

$$\langle \mathbb R, \max, -\infty \rangle$$
$$\langle \mathbb R, \min, +\infty \rangle$$
$$\langle \mathbb B, \wedge, T \rangle$$
$$\langle \mathbb B, \vee, F \rangle$$
$$\langle sets\ of\ \mathbb R, \cup, \emptyset \rangle$$

New notion of 'fact': how far up the lattice's partial order a value has climbed.

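| Event | Alice | Bob | Carol | Dave |
|-------|-------|-----|-------|------|
| Initial State | 6 | 1 | 7 | 10 |
| $Alice \sqcup Bob$ | 6 | 6 | 7 | 10 |
| $Carol \sqcup Dave$ | 6 | 6 | 10 | 10 |
| $Alice \sqcup Dave$ | 10 | 6 | 10 | 10 |
| $Bob \sqcup Carol$ | 10 | 10 | 10 | 10 |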

The lmax lattice always goes up
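The sequence of exchanges above can be replayed in a few lines of plain Ruby. The helper names are hypothetical, and each exchange is assumed to leave both participants holding the merged (maximum) value.

```ruby
# A pairwise exchange leaves both participants with the merge (max) of their values.
def exchange!(state, x, y)
  merged = [state[x], state[y]].max
  state[x] = merged
  state[y] = merged
  state
end

# Apply a list of exchanges to a copy of the initial state.
def run_exchanges(initial, pairs)
  state = initial.dup
  pairs.each { |x, y| exchange!(state, x, y) }
  state
end

INITIAL = { alice: 6, bob: 1, carol: 7, dave: 10 }.freeze
PAIRS = [%i[alice bob], %i[carol dave], %i[alice dave], %i[bob carol]].freeze

# Print the state after each prefix of the exchange sequence.
(0..PAIRS.size).each do |n|
  puts run_exchanges(INITIAL, PAIRS.first(n)).inspect
end
```

No value ever decreases from one printed row to the next.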

### Programs don't rely exclusively on one type!

We need mappings between different lattice types

### Monotone Functions

$$f : S \rightarrow T$$

For any monotone $f$,
whenever $a \leq_S b$ then $f(a) \leq_T f(b)$

Monotone functions preserve partial orders

### Example Monotone Functions

• sizeof : $set \rightarrow \mathbb N$
• $\sum$ : $set\ of\ \mathbb R^+ \rightarrow \mathbb R^+$
• $\cap$ : $set \times set \rightarrow set$
• $>(\mathbb R)$ : $lmax \rightarrow \mathbb B$
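Composing two of these functions gives a monotone quorum test: take the size of the set of votes, then compare it against a threshold. The sketch below is plain Ruby, not Bud; `QUORUM`, `vote_count`, `quorum_reached?`, and `decisions` are made-up names, and the threshold uses `>=` rather than `>`.

```ruby
require 'set'

QUORUM = 3   # hypothetical threshold

# size : set -> lmax (monotone: a bigger set never has a smaller size)
def vote_count(votes)
  votes.size
end

# threshold : lmax -> bool (monotone: once true, a larger count keeps it true)
def quorum_reached?(count)
  count >= QUORUM
end

# Replay votes arriving one at a time and record the decision after each arrival.
def decisions(arrivals)
  seen = Set.new
  arrivals.map do |voter|
    seen << voter                  # duplicate deliveries are absorbed by the set
    quorum_reached?(vote_count(seen))
  end
end

puts decisions(%w[v1 v2 v2 v3 v4]).inspect
```

The count itself keeps changing as votes arrive, but the boolean answer moves from false to true at most once and never goes back.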

### If all computations in a program are monotone functions, the program is naturally eventually consistent

... but we can do better

### Morphism

$$f : S \rightarrow T$$

$f$ is a morphism if
$f$ is monotone and $f(a \sqcup b) = f(a) \sqcup f(b)$
($f$ commutes with $\sqcup$)

Morphisms are decomposable

### Example Morphisms

• $\cap$ : $set \times set \rightarrow set$
• $>(\mathbb R)$ : $lmax \rightarrow \mathbb B$
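The morphism condition can be spot-checked on sample values. In this plain-Ruby sketch all names are made up, and a passing check on a few samples is not a proof. It also shows that set size, although monotone, fails the condition.

```ruby
require 'set'

# Does f(a ⊔ b) == f(a) ⊔ f(b) hold for every pair of samples?
# merge_in and merge_out are the merges of the input and output lattices.
def morphism_on?(f, merge_in, merge_out, samples)
  samples.product(samples).all? do |a, b|
    f.(merge_in.(a, b)) == merge_out.(f.(a), f.(b))
  end
end

UNION = ->(a, b) { a | b }
MAX   = ->(a, b) { [a, b].max }
OR    = ->(a, b) { a || b }

AT_LEAST_3  = ->(n) { n >= 3 }          # lmax -> bool threshold test
ONLY_SMALL  = ->(s) { s & Set[1, 2] }   # set -> set, intersect with a fixed set
SIZE        = ->(s) { s.size }          # set -> lmax

NUMS = [0, 2, 3, 5]
SETS = [Set[], Set[1], Set[7], Set[1, 7]]

puts morphism_on?(AT_LEAST_3, MAX, OR, NUMS)
puts morphism_on?(ONLY_SMALL, UNION, UNION, SETS)
puts morphism_on?(SIZE, UNION, MAX, SETS)   # fails: two one-element sets can merge to size 2
```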

    path <= link {|l| [l.from, l.to, l.to, l.cost]}
    path <= (link*path).pairs(:to => :from) { |l,p|
      [l.from, p.to, l.to, l.cost + p.cost]
    }
    min_cost <= path.group([:from, :to]) { |group|
      group.project(:cost).min
    }

Morphisms (using bags & lmin): `(link * path).pairs`, `+`, `project`, `group`, `min`
### Incremental Computation

We need to update an input with new data and compute: $$f(old \sqcup new)$$

We (probably) already have $f(old)$.

Insight: Computing $f(old) \sqcup f(new)$ is probably cheaper.

... but is only correct if $f$ is a morphism
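A small plain-Ruby sketch of the shortcut, with made-up names. Here $f$ is a membership test, which maps sets (merged by union) to booleans (merged by or). The second half applies the same shortcut to set size, which is monotone but not a morphism.

```ruby
require 'set'

# f(set) = "does the set contain target?"
def contains_target?(set, target)
  set.include?(target)
end

OLD_FACTS = Set[1, 2, 3]
NEW_FACTS = Set[3, 4]
TARGET = 4

# Cached from an earlier round, before NEW_FACTS arrived.
CACHED = contains_target?(OLD_FACTS, TARGET)

# Full recomputation, f(old ⊔ new): looks at the whole merged set.
FULL = contains_target?(OLD_FACTS | NEW_FACTS, TARGET)

# Incremental update, f(old) ⊔ f(new): looks only at the new elements.
INCREMENTAL = CACHED || contains_target?(NEW_FACTS, TARGET)

puts FULL == INCREMENTAL

# Applying the same shortcut to size gives a different answer than recomputing.
SIZE_FULL = (OLD_FACTS | NEW_FACTS).size
SIZE_SHORTCUT = [OLD_FACTS.size, NEW_FACTS.size].max
puts SIZE_FULL == SIZE_SHORTCUT
```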

### Example: Set Lattice

    class Bud::SetLattice < Bud::Lattice
      wrapper_name :lset
      def initialize(x=[])
        @v = x.uniq # Remove duplicates from input
      end
      def merge(i)
        self.class.new(@v | i.reveal)
      end
      morph :intersect do |i|
        self.class.new(@v & i.reveal)
      end
      morph :contains? do |i|
        Bud::BoolLattice.new(@v.member? i)
      end
      monotone :size do
        Bud::MaxLattice.new(@v.size)
      end
    end

### Example: A Key Value Store

    class KvsReplica
      include Bud
      include KvsProtocol
      state { lmap :kv_store }
      bloom do
        # Fulfil any put requests
        kv_store <= kvput {|c| {c.key => c.val}}
        # Acknowledge any put requests
        kvput_resp <~ kvput {|c|
          [ c.reqid, c.client_addr, ip_port ]}
        # Respond to any get requests
        kvget_resp <~ kvget {|c|
          [ c.reqid, c.client_addr,
            kv_store.at(c.key), ip_port ]}
      end
    end
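The replica above relies on a map lattice whose merge combines entries key by key. A rough plain-Ruby picture of that idea follows; it is not Bud's `lmap` implementation. The values are assumed to be plain integers merged with max, and `merge_maps` and the replica contents are made up.

```ruby
# Merge two replica states key by key. Keys present on only one side are kept;
# keys present on both sides have their values merged (here: max).
def merge_maps(a, b)
  a.merge(b) { |_key, x, y| [x, y].max }
end

REPLICA_1 = { 'x' => 3, 'y' => 1 }.freeze
REPLICA_2 = { 'y' => 4, 'z' => 2 }.freeze

MERGED = merge_maps(REPLICA_1, REPLICA_2)
puts MERGED.inspect

# The merge order does not matter, and merging again changes nothing.
puts MERGED == merge_maps(REPLICA_2, REPLICA_1)
puts MERGED == merge_maps(MERGED, REPLICA_1)
```

Because the merge is commutative and idempotent, replicas that receive the same puts in different orders, or more than once, still end up with the same map.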