CSE 662 Fall 2019 - Legorithmics

### Legorithmics

#### CSE 662 Fall 2019

October 22

Same great algorithms,
awesome new hardware flavor

• Batching: Prefetch blocks of data to avoid random seeks.
• Partitioning: Group related blocks of data to minimize cross-partition work.
• Reordering: Cluster data accesses for better cache locality.

The hard part is picking where to apply the transformations and selecting values for each transformation's parameters.

Key Insights:

• The basic algorithms haven't changed since the 70s.
• Hardware changes very slowly.
• You have as much time as you need to design a hardware-specific algorithm.

Automate the search!

What do we need?

• A way to describe the algorithm.
• A way to describe the hardware.
• A cost model.
• Transformation rules.
• A (possibly exponential-time) optimizer.

### Typesystems

A collection of rules that assign a property called a type to the parts of a computer program: variables, expressions, etc...
[Wikipedia]

### Typesystems

A typesystem allows you to:

• Define interfaces between different parts of a program.
• Check that these parts have been connected consistently.
• Define global properties in terms of local properties.

### Typesystems

A type
($\tau := D\;|\;[\tau]\;|<\tau,\tau>\;|\;\tau\rightarrow\tau$)

A set of inference rules
($\frac{e\;:\;\tau}{[e]\;:\;[\tau]}$)

These example types are part of the monad algebra

### Types

| Type | Meaning |
|---|---|
| $D$ | Primitive type (int, float, etc.) |
| $[\tau]$ | An array of elements with type $\tau$. |
| $<\tau_1,\tau_2>$ | A pair of elements of types $\tau_1$ and $\tau_2$. |
| $\tau_1\rightarrow\tau_2$ | A function with one argument of type $\tau_1$ and one return value of type $\tau_2$. |

### Type Examples

$[ < int,float > ]$

$[int] \rightarrow float$

$< [int], [int] >\; \rightarrow [ < int,int > ]$

### Inference Rules

Defined over a language of expressions like $f(a, b)$.

$$\frac{a : \tau_a\;\;\;b : \tau_b}{f(a,b):\tau_f(\tau_a, \tau_b)}$$

If expression $a$ has type $\tau_a$ and expression $b$ has type $\tau_b$...

...then expression $f(a, b)$ has type $\tau_f(\tau_a, \tau_b)$.

### Inference Examples

$\frac{e: \tau}{[e] : [\tau]}$

$\frac{c: Bool\;\;\;e_1:\tau\;\;\;e_2:\tau}{(\textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2)\;:\; \tau}$

$\frac{e: < \tau_1, \tau_2 >\;\;\;i \in \{1,2\}}{e.i\; :\; \tau_i}$
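Read bottom-up, the rules above amount to a recursive type checker. The following is a minimal sketch, assuming a hypothetical tuple encoding of expressions and types that does not come from the slides. The rule for constructing a pair is also an assumption, added only so that the projection rule has something to work on.

```python
# Types: "int", "float", "bool" (primitives D), ("array", t), ("pair", t1, t2).
# Expressions (hypothetical encoding, for illustration only):
#   ("lit", v)          primitive literal
#   ("single", e)       [e]
#   ("if", c, e1, e2)   if c then e1 else e2
#   ("pair", e1, e2)    <e1, e2>
#   ("proj", e, i)      e.i with i in {1, 2}

def type_of(e):
    tag = e[0]
    if tag == "lit":
        v = e[1]
        if isinstance(v, bool):  # test bool first: bool is a subclass of int
            return "bool"
        if isinstance(v, int):
            return "int"
        if isinstance(v, float):
            return "float"
        raise TypeError(f"unsupported literal: {v!r}")
    if tag == "single":
        # e : t  implies  [e] : [t]
        return ("array", type_of(e[1]))
    if tag == "if":
        _, c, e1, e2 = e
        if type_of(c) != "bool":
            raise TypeError("condition must have type bool")
        t1, t2 = type_of(e1), type_of(e2)
        if t1 != t2:
            raise TypeError("both branches must have the same type")
        return t1
    if tag == "pair":
        return ("pair", type_of(e[1]), type_of(e[2]))
    if tag == "proj":
        _, inner, i = e
        t = type_of(inner)
        if not (isinstance(t, tuple) and t[0] == "pair") or i not in (1, 2):
            raise TypeError("projection needs a pair and an index in {1, 2}")
        return t[i]  # t = ("pair", t1, t2), so t[1] and t[2] are the components
    raise TypeError(f"unknown expression: {e!r}")
```

Each branch checks the premises of one rule and returns its conclusion, so an ill-typed expression fails at the first rule whose premises do not hold.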

A primitive language for describing data processing.

| Operator | Meaning |
|---|---|
| $\lambda x.e$ | Define a function with body $e$ that uses variable $x$. |
| $e_1\;e_2$ | Apply the function defined by $e_1$ to the value obtained from $e_2$. |
| $\textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2$ | If $c$ is true then evaluate $e_1$, and otherwise evaluate $e_2$. |

| Operator | Meaning |
|---|---|
| $< e_1, e_2 >$ | Construct a tuple from $e_1$ and $e_2$. |
| $e.i$ | Extract attribute $i$ from the tuple $e$. |
| $[e]$ | Construct a single-element array from $e$. |
| $[]$ | Construct an empty array. |
| $e_1 \sqcup e_2$ | Concatenate the arrays $e_1$ and $e_2$. |

$$\textbf{flatMap}(f : \tau_1 \rightarrow [\tau_2])(e : [\tau_1])$$

Apply function $f$ to every element of array $e$. Concatenate all of the arrays returned by $f$.

$$\textbf{foldL}(c : \tau_2, f : < \tau_2, \tau_1 > \rightarrow \tau_2)(e : [\tau_1])$$

Apply function $f$ to every element of array $e$, starting from the initial value $c$ and passing each invocation's return value to the next call (e.g., aggregation).

$$\textbf{for}(xB : [\tau_1] [k] \leftarrow e_{in} : [\tau_1])(e_{loop} : [\tau_2])$$

Extract blocks of size $k$ from $e_{in}$. For each block compute a flatMap using expression $e_{loop}$.
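The three collection operators map closely onto ordinary list code. This is a rough sketch in Python, assuming arrays are modelled as lists and pairs as 2-tuples; the function names `flat_map`, `fold_l`, and `for_blocks` are illustrative, not part of the slides.

```python
from functools import reduce

def flat_map(f):
    """flatMap(f)(e): apply f to every element, concatenate the returned lists."""
    return lambda e: [y for x in e for y in f(x)]

def fold_l(c, f):
    """foldL(c, f)(e): f receives an (accumulator, element) pair, starting from c."""
    return lambda e: reduce(lambda acc, x: f((acc, x)), e, c)

def for_blocks(k, body):
    """for(xB [k] <- e_in) body: cut e_in into blocks of size k, flatMap body over them."""
    def run(e_in):
        blocks = [e_in[i:i + k] for i in range(0, len(e_in), k)]
        return flat_map(body)(blocks)
    return run

# Usage: sum each block of two elements; the last block may be shorter.
block_sums = for_blocks(2, lambda xB: [sum(xB)])([1, 2, 3, 4, 5])
```

Currying the operators (parameters first, input array second) mirrors the two argument lists in the signatures above.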

### Example - Average

$$(\lambda tot.(tot.1 / tot.2))\;(\textbf{foldL}(< 0, 0 >,\; (\lambda < a, x >. < a.1 + x,\; a.2 + 1 >)))$$

Fold implements aggregation

Fold takes a 'previous' $a$ and a 'current' $x$

We need a sum, and a count

Initial sum and count are both 0

Postprocess with division ($\lambda$ creates a variable $tot$)
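For comparison, the same computation can be written with Python's `functools.reduce`, which plays the role of foldL here. This is only a sketch; the function name `average` and the use of tuples for pairs are assumptions.

```python
from functools import reduce

def average(values):
    # foldL(<0, 0>, lambda <a, x>. <a.1 + x, a.2 + 1>): accumulate (sum, count).
    tot = reduce(lambda a, x: (a[0] + x, a[1] + 1), values, (0, 0))
    # lambda tot. (tot.1 / tot.2): postprocess with division.
    return tot[0] / tot[1]
```

As with the expression above, an empty input leaves the count at 0, so the final division fails.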

### Cost Estimation

We need...

• Cardinality estimation
• A model of IO
• IO Cost relative to cardinality

### Cardinality Estimation

Basic Approach: Define a second type for tracking data sizes

$$\alpha\;:=\;[\alpha]^x\;|\;< \alpha_1, \alpha_2 >|\;c$$

e.g., $[ < 1, [1]^y > ]^x$ corresponds to:

• an array with $x$ elements, where each element consists of...
• a value of fixed-size 1, and...
• an array of $y$ fixed-size values.

### Cardinality Estimation

• $size(c) = c$
• $size([\alpha]^x) = x \cdot size(\alpha)$
• $size( < \alpha_1, \alpha_2 >) = size(\alpha_1) + size(\alpha_2)$
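The three equations translate directly into a recursive function. The sketch below assumes a hypothetical tuple encoding of cardinality types with concrete numbers in place of the symbolic $x$ and $y$.

```python
# Cardinality types (hypothetical encoding):
#   c                          a number: fixed-size value
#   ("array", alpha, x)        [alpha]^x
#   ("pair", alpha1, alpha2)   <alpha1, alpha2>

def size(alpha):
    if isinstance(alpha, (int, float)):
        return alpha                              # size(c) = c
    if alpha[0] == "array":
        _, elem, x = alpha
        return x * size(elem)                     # size([a]^x) = x * size(a)
    if alpha[0] == "pair":
        return size(alpha[1]) + size(alpha[2])    # sum of both components
    raise ValueError(f"unknown cardinality type: {alpha!r}")

# [<1, [1]^y>]^x with x = 10 and y = 3: 10 * (1 + 3 * 1) = 40
example = ("array", ("pair", 1, ("array", 1, 3)), 10)
```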

### Cardinality Estimation

$R(\Gamma, e)$ computes the cardinality type for $e$

$\Gamma : x \rightarrow \alpha$ is a context / scope

### For Loops

$$R\left(\Gamma, \textbf{for}(x [k] \leftarrow e_1)\; e_2\right) := \frac{cardinality(R(\Gamma, e_1))}{k}\cdot R(\Gamma', e_2)$$

$\Gamma' := \Gamma \cup \{x \mapsto [sizeofElement(R(\Gamma, e_1))]^k\}$

The cardinality is based on that of $e_2$

Repeated once for every time through the loop

And $e_2$ is evaluated in the context of a $k$ element array.

### If Then Else

$$R\left(\Gamma, \textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2\right) :=$$

$max(R\left(\Gamma, e_1\right), R\left(\Gamma, e_2\right))$

Pessimistic assumption of biggest possible size.
Avoids needing to estimate $p(c = true)$.
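The for-loop and if-then-else rules can be combined into one small estimator. This is a sketch under several assumptions that the slides do not state: the tuple encoding is hypothetical, "max" compares cardinality types by their total size, the empty array is modelled as zero elements of size 0, and a loop without a block size binds its variable to a single element.

```python
# Cardinality types: a number c, ("array", alpha, n) for [alpha]^n, ("pair", a1, a2).
# Expressions (hypothetical encoding):
#   ("var", name) | ("single", e) | ("empty",) | ("pair", e1, e2)
#   ("if", c, e1, e2) | ("for", x, k, e1, e2)   with k=None for an element-wise loop

def size(alpha):
    if isinstance(alpha, (int, float)):
        return alpha
    if alpha[0] == "array":
        return alpha[2] * size(alpha[1])
    return size(alpha[1]) + size(alpha[2])  # pair

def R(gamma, e):
    tag = e[0]
    if tag == "var":
        return gamma[e[1]]
    if tag == "single":
        return ("array", R(gamma, e[1]), 1)
    if tag == "empty":
        return ("array", 0, 0)
    if tag == "pair":
        return ("pair", R(gamma, e[1]), R(gamma, e[2]))
    if tag == "if":
        # Pessimistic: keep the larger branch and never look at the condition.
        return max(R(gamma, e[2]), R(gamma, e[3]), key=size)
    if tag == "for":
        _, x, k, e1, e2 = e
        _, elem, n = R(gamma, e1)                # e1 must be an array [elem]^n
        if k is None:
            bound, trips = elem, n               # x ranges over single elements
        else:
            bound, trips = ("array", elem, k), n / k
        _, body_elem, m = R({**gamma, x: bound}, e2)
        return ("array", body_elem, trips * m)   # body size, once per iteration
    raise ValueError(f"unknown expression: {e!r}")

# for(xB [10] <- T) for(x <- xB) if c then [x] else []
expr = ("for", "xB", 10, ("var", "T"),
        ("for", "x", None, ("var", "xB"),
         ("if", "c", ("single", ("var", "x")), ("empty",))))
estimate = R({"T": ("array", 1, 100)}, expr)
```

With a 100-element input, `estimate` describes an array of 100 fixed-size values: the filter is assumed to keep every element, which is exactly the pessimism the if-then-else rule builds in.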

### Example: Block-Nested-Loop Join

$$\begin{array}{ll}
\textbf{for}( xB [k_1] \leftarrow R ) & \text{Loop over blocks in the outer relation.} \\
\quad\textbf{for}( yB [k_2] \leftarrow S ) & \text{Loop over blocks in the inner relation.} \\
\quad\quad\textbf{for}( x \leftarrow xB ) & \text{Loop over elements in the outer block.} \\
\quad\quad\quad\textbf{for}( y \leftarrow yB ) & \text{Loop over elements in the inner block.} \\
\quad\quad\quad\quad\textbf{if}\;joinCond(x, y) & \text{Join test.} \\
\quad\quad\quad\quad\textbf{then}\;[< x, y >] & \\
\quad\quad\quad\quad\textbf{else}\;[] &
\end{array}$$

| Expression | Context | Result Size |
|---|---|---|
| $\textbf{for}( xB [k_1] \leftarrow R )$ | $\Gamma_1 = \{R \mapsto [1]^x, S \mapsto [1]^y\}$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
| $\textbf{for}( yB [k_2] \leftarrow S )$ | $\Gamma_2 = \Gamma_1 \cup \{xB \mapsto [1]^{k_1}\}$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
| $\textbf{for}( x \leftarrow xB )$ | $\Gamma_3 = \Gamma_2 \cup \{yB \mapsto [1]^{k_2}\}$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ |
| $\textbf{for}( y \leftarrow yB )$ | $\Gamma_4 = \Gamma_3 \cup \{x \mapsto 1\}$ | $[ < 1, 1 > ]^{k_2}$ |
| $\textbf{if}\;joinCond(x, y)$ | $\Gamma_5 = \Gamma_4 \cup \{y \mapsto 1\}$ | $[ < 1, 1 > ]^1$ |
| $\textbf{then}\;[< x, y >]$ | $\Gamma_5$ | $[ < 1, 1 > ]^1$ |
| $\textbf{else}\;[]$ | $\Gamma_5$ | $0$ |
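To make the loop structure concrete, here is the same join as a minimal Python sketch over in-memory lists. The function and parameter names are illustrative, and slicing a list only stands in for fetching a block from disk.

```python
def block_nested_loop_join(R, S, k1, k2, join_cond):
    result = []
    for i in range(0, len(R), k1):          # blocks of the outer relation
        xB = R[i:i + k1]
        for j in range(0, len(S), k2):      # blocks of the inner relation
            yB = S[j:j + k2]
            for x in xB:                    # elements of the outer block
                for y in yB:                # elements of the inner block
                    if join_cond(x, y):     # join test
                        result.append((x, y))
    return result

# Equi-join of two small relations with block sizes 2 and 2.
pairs = block_nested_loop_join([1, 2, 3, 4, 5], [2, 4, 6], 2, 2, lambda x, y: x == y)
```

If `join_cond` accepts every pair, the output has `len(R) * len(S)` tuples, which matches the pessimistic result size in the first row of the table.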

### IO Model

IO Costs have 2 components:

• $InitCom$: The cost of initializing a connection (e.g., seek time).
• $UnitTr$: The cost of transferring one unit of data.

Costs are defined for every pair of memory hierarchy levels:

• $UnitTr(HDD \rightarrow RAM)$ is the cost of transferring one unit of data from HDD into RAM.
• $InitCom(RAM \rightarrow HDD)$ is the cost of seeking before a write from RAM onto an HDD.
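One plausible way to combine the two components is to pay $InitCom$ once per block and $UnitTr$ once per unit of data. The slides do not spell out this formula, so the sketch below is an assumption, as are the function name and the example cost values.

```python
import math

def io_cost(n_units, block_size, init_com, unit_tr):
    """Assumed cost of moving n_units across one hierarchy boundary in blocks."""
    n_blocks = math.ceil(n_units / block_size)
    return n_blocks * init_com + n_units * unit_tr

# Hypothetical costs: InitCom = 10, UnitTr = 1 (arbitrary units).
unbatched = io_cost(1000, 1, 10, 1)    # one seek per element
batched = io_cost(1000, 100, 10, 1)    # one seek per 100-element block
```

Under this assumption the transfer term is the same either way, and batching only shrinks the number of initializations, which is why block size is the parameter worth tuning.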

| Expression | Result Size | HDD to RAM | RAM to HDD |
|---|---|---|---|
| $\textbf{for}( xB [k_1] \leftarrow R )$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $x+\frac{x}{k_1}y$ | $2xy$ |
| $\textbf{for}( yB [k_2] \leftarrow S )$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $y$ | $2k_1y$ |
| $\textbf{for}( x \leftarrow xB )$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ | $0$ | $2k_1k_2$ |
| $\textbf{for}( y \leftarrow yB )$ | $[ < 1, 1 > ]^{k_2}$ | $0$ | $(1+1)k_2$ |
| $\textbf{if}\;joinCond(x, y)$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)k_2$ |
| $\textbf{then}\;[< x, y >]$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)k_2$ |
| $\textbf{else}\;[]$ | $0$ | $0$ | $0$ |

Data placement: $R$, $S$, and $Result$ reside on HDD; $x$, $xB$, $y$, and $yB$ reside in RAM.

### Rewrite Rules

#### Batching

$$\textbf{for}(x [1] \leftarrow R)\; e \Rightarrow \textbf{for}(xB [k] \leftarrow R)\; \textbf{for}(x [1] \leftarrow xB)\; e$$

#### Reordering Iterators

$$\textbf{for}(x_1 [k_1] \leftarrow R_1)\;\textbf{for}(x_2 [k_2] \leftarrow R_2) \Rightarrow \textbf{for}(x_2 [k_2] \leftarrow R_2)\;\textbf{for}(x_1 [k_1] \leftarrow R_1)$$

#### Size-Dependent, Commutative Functions

$$f \Rightarrow (\lambda< x_1, x_2>.f(\textbf{if}\;|x_1|\leq |x_2|\;\textbf{then}\;< x_1, x_2 >\;\textbf{else}\;< x_2, x_1 >))$$

$$f \Rightarrow (\lambda< x_1, x_2>.f(\textbf{if}\;|x_1|\leq |x_2|\;\textbf{then}\;< x_2, x_1 >\;\textbf{else}\;< x_1, x_2 >))$$
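The first of these two rules (smaller input first) can be pictured as a wrapper around a commutative function. This is a sketch only: `smaller_first` and `intersect` are hypothetical names, and the hash-based intersection is just one example of an operation whose cost depends on which argument is smaller.

```python
def smaller_first(f):
    """Wrap a commutative two-argument f so the smaller input is passed first."""
    def wrapped(x1, x2):
        if len(x1) <= len(x2):
            return f(x1, x2)
        return f(x2, x1)
    return wrapped

def intersect(build, probe):
    # Build a hash set from the first input, then probe it with the second.
    table = set(build)
    return sorted(v for v in set(probe) if v in table)

intersect_small_build = smaller_first(intersect)
```

Because `intersect` is commutative, the wrapper changes only which side is used to build the hash set, not the result. The second rule is the mirror image and would pass the larger input first.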

### The Optimizer

Starting with $e$...
1. For every possible rewrite of $e$:
2. Use a linear optimizer to find the best values for the $k$ parameters.
3. If the rewrite improved the cost, then recurse on the rewritten expression.
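The search loop might be sketched as follows. Everything here is a stand-in: `rewrites(e)` is a hypothetical function that lists the expressions reachable by one rewrite rule, and `best_cost(e)` is assumed to return the cost of `e` after its block sizes have been tuned (the step the slides assign to a linear optimizer). The toy integers at the end are not real algorithm descriptions.

```python
def optimize(e, rewrites, best_cost):
    """Follow every cost-improving rewrite recursively; return the cheapest result."""
    base = best_cost(e)
    best_e, best_c = e, base
    for candidate in rewrites(e):
        if best_cost(candidate) < base:           # the rewrite improved the cost
            opt = optimize(candidate, rewrites, best_cost)
            c = best_cost(opt)
            if c < best_c:
                best_e, best_c = opt, c
    return best_e

# Toy stand-in: "expressions" are integers 0..10, a rewrite moves one step,
# and the tuned cost is the squared distance from 7.
def toy_rewrites(e):
    return [n for n in (e - 1, e + 1) if 0 <= n <= 10]

def toy_cost(e):
    return (e - 7) ** 2

best = optimize(0, toy_rewrites, toy_cost)
```

Recursing only on strict improvements guarantees termination over a finite set of expressions, but it can still explore many branches, and it can stop at a local optimum when every single rewrite makes the cost worse.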