Data Management Systems (including Relational Databases, Non-Relational Databases, and NoSQL storage systems) are the basis for any big data project. A data management system is responsible for storing data, enabling efficient access to that data, as well as mediating concurrent modifications. This class teaches data management systems both in terms of the basic principles of their design, and the practical challenges of implementing them. The course is built around a term-long programming assignment, in which you will build a system for answering SQL queries efficiently. Course lectures will focus on the conceptual basis for this system and how the techniques that you implement in the project generalize (e.g., to the use of NoSQL systems)
- Jan. 26: Introduction and Outline
( slides )
- Jan. 28: Relational Algebra
( slides | video )
- Feb. 2: SQL
( slides | video )
- Feb. 4: SQL to RA and Evaluation
( slides | video )
- Feb. 9: Project 1 Overview, Generalized RA
( slides | video )
- Feb. 11: Data Layout (Serialization, Paging, Columnar)
( slides | video )
- Feb. 16: Data Organization (Sorting, Tree Indexes)
( slides | video )
- Feb. 18: Hash Indexes, Joins, and Optimization
( slides )
- Feb. 23: Joins and Optimization
( slides )
- Feb. 25: Midterm 1 Review
( slides | video )
- Mar. 1: Midterm 1
- Mar. 3: External Algorithms
( slides | video )
- Mar. 8: Project 2 Review
( code | video )
- Mar. 10: Data Modeling (E/R and Constraints)
( slides | video )
- Mar. 15: Spring Break!
- Mar. 17: Spring Break!
- Mar. 22: Cost-Based-Optimization
( slides | video )
- Mar. 24: Transaction Correctness
( slides | videos )
- Mar. 29: Transactions-Locking
( slides | video )
- Mar. 31: Transactions-OCC, Versioning
( slides | video )
- Apr. 5: ARIES (Write-Ahead Logging, Undo-Logging, Recovery)
( slides | video )
- Apr. 7: Midterm 2 Review
( slides | video )
- Apr. 12: Midterm 2
- Apr. 14: Project 3 Review
( video )
- Apr. 19: Views
( slides | video )
- Apr. 21: Stream Queries
( slides )
- Apr. 26: Parallel Data
( slides )
- Apr. 28: Parallel Joins
( slides )
- May 3: Parallel Updates
( slides )
- May 5: Final Review
Content Outline
- Project 0 - Basic Setup
- Project 1 - Infrastructure & Query Evaluation
- Relational Algebra (Ch 2.4, 5.1)
- SQL (Ch 2.3, 6.1-6.4 and 16.1)
- Query Compilers (Ch 15.1-15.3, 16.1, 16.3)
- Data Modeling (Ch 2.1-2.2)
- Project 2 - Optimization & External Algorithms
- Algebraic Query Optimization (Ch 16.2)
- Join Algorithms (Ch 15.4, 15.5)
- Extended Relational Algebra (Ch 5.2)
- Buffering & External Algorithms (Ch 15.7-15.8)
- Physical Plans (Ch 16.7)
- Project 3 - Indexing & Physical Layout
- The Memory Hierarchy (Ch 13.1-13.3)
- Physical Design (Ch 13.5-13.7)
- Indexing (Ch 8.3, 14.1-14.4)
- Materialized Views (Ch 8.1-8.2, 8.5)
- Cost-Based Optimization (Ch 8.4, 16.4-16.6)
- Concepts (No Project)
- Failure Recovery (Ch 13.4, 19.1, 19.3)
- Updating Data (Ch 6.5, 13.8)
- Transactions (Ch 6.6, 18.1-18.2)
- Locking (Ch 18.3-18.7)
- Deadlocks (Ch 19.2)
- Lock-free Concurrency (Ch 18.8-18.9)
- Distributed Data Management (Ch 20)
- Uncertain Data Management
- Time permitting, other subjects will also be covered.
Assignment Submission
Homeworks will be collected in recitation on the day they are due, or by email to the instructor and TA(s) before class begins if you are unable to attend class. Late homeworks will not be accepted and will receive a grade of 0. Note that your lowest 2 homework grades will be dropped.
Projects are submitted through the online submission system using GIT. You may submit your assignments to be graded many times as you like. Late submissions for Checkpoint 0 will not be accepted. Submissions for checkpoints 1-3 will receive a 3/15 point penalty per day late. Late penalties are per-submission. Your group's grade is the highest of all grades received by your group for the project, so your grade can never decrease from additional submissions (even if they're late). Don't be afraid to experiment with new ideas once you've gotten a grade you're happy with!
Letter Grades
You will get a(n)... |
A |
A- |
B+ |
B |
B- |
C+ |
C |
C- |
D |
F |
If your number grade is at least... |
93 |
90 |
87 |
83 |
80 |
77 |
73 |
70 |
60 |
$-\infty$ |
But less than... |
$\infty$ |
93 |
90 |
87 |
83 |
80 |
77 |
73 |
70 |
59 |
Attendance
I see attendance as a privilege rather than a requirement: You (or your parents, or your scholarship, etc...) are paying me so that you can interrupt me with a question if when I say something stupid or confusing in class. There is no mandatory attendance. However, if you ask me a question by email or during office hours that I already addressed to the class's satisfaction, I reserve the right to tease you mercilessly.
That all being said, I will try to post lecture videos online so that you can catch up on what you missed.
Academic Integrity
Students may discuss and advise one another on their projects, but groups are expected to turn in their own work. Discussing concepts is permitted. Referencing code (e.g., from another group, or from stack overflow or github) is not. When in doubt, ask a TA or the instructor. Violations typically result in an F grade in the course for all students involved. The Department's policy on academic integrity can be reviewed at:
UB-CSE's Academic Integrity Policy
Medical Emergencies
Accommodations for medical emergencies will be made on a case-by-case basis. Requests for extensions based on medical emergencies must be accompanied by documentation of the emergency from student health services:
Student Health Services
Accessibility Resources
If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, please advise the instructor and contact the Office of Accessibility Resources during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations. The Office of Accessibility Resources can be reached at:
The Office of Accessibility Resources |
(716) 645-2608 | accessibility@buffalo.edu
Note that OAR gets extremely busy towards the end of the term. If you anticipate needing special testing accomodations, please make sure to contact them 2-4 weeks in advance.