Showcase - CMU 15-721 :: Advanced Database Systems (Spring 2019)

LLVM Execution Engine ★ Voted Best Project ★

Students: Amadou Ngom, Katia Villevald, Tanuj Nayak, Wan Shen Lim
Source Code: https://github.com/cmu-db/terrier/pull/371

This project aims to integrate the new LLVM-based execution engine with CMU's new, yet-to-be-named DBMS. We implemented a translation layer that converts a physical plan tree into our database-oriented domain-specific language. We then extended the execution engine with new functionality, such as index joins and nested loop joins.

Settings Manager ★ Voted Best Project ★

Students: Weichen Ke, Yuze Liao, Wenhao Huang
Source Code: https://github.com/cmu-db/terrier/pull/336

This project implements a new centralized SettingsManager component for the DBMS. It is designed to be part of the DBMS's overall self-driving architecture. Our implementation parses command-line arguments and a config file through gflags, provides a programmatic interface for accessing setting parameters, and provides macros for defining new parameters. When a setting value is modified, the DBMS enforces the parameter's constraints (mutability, type, and range), keeps track of the action state, and invokes the corresponding callback function.
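
The constraint checks and callback described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's C++ implementation: the names `Param`, `SettingsManager`, `define`, `get`, and `set` are hypothetical, and gflags parsing and action-state tracking are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union

Value = Union[int, float, bool, str]


@dataclass
class Param:
    """Hypothetical setting definition: value, mutability, range, callback."""
    name: str
    value: Value
    mutable: bool = True
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    callback: Optional[Callable[[Value, Value], None]] = None


class SettingsManager:
    """Hypothetical central registry for setting parameters."""

    def __init__(self) -> None:
        self._params: Dict[str, Param] = {}

    def define(self, param: Param) -> None:
        self._params[param.name] = param

    def get(self, name: str) -> Value:
        return self._params[name].value

    def set(self, name: str, new_value: Value) -> None:
        param = self._params[name]
        # Mutability constraint.
        if not param.mutable:
            raise PermissionError(f"{name} is immutable")
        # Type constraint: the new value must have the same type as the old one.
        if type(new_value) is not type(param.value):
            raise TypeError(f"{name} expects {type(param.value).__name__}")
        # Range constraint (only meaningful for numeric parameters).
        if param.min_value is not None and new_value < param.min_value:
            raise ValueError(f"{name} is below {param.min_value}")
        if param.max_value is not None and new_value > param.max_value:
            raise ValueError(f"{name} is above {param.max_value}")
        old_value = param.value
        param.value = new_value
        # Notify the owning component only after the change has been accepted.
        if param.callback is not None:
            param.callback(old_value, new_value)
```

In this sketch a rejected change raises before the stored value is touched, so the callback only ever sees values that passed every check.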

Interval Garbage Collector ★ Voted Best Project ★

Students: Huzaifa Abbasi, Pulkit Agarwal, Utkarsh Agarwal
Source Code: https://github.com/cmu-db/terrier/pull/340

This project implements SAP HANA-style interval garbage collection. The current system uses delta records, unlike HANA, which has append-only, newest-to-oldest version chains. Because of the delta records, no single record version is the source of truth for a transaction; instead, the deltas created after that transaction started have to be applied to reconstruct the correct version. This prohibits collecting individual tuple versions from the version chain. Instead, we implemented interval compaction on top of the earlier garbage collection mechanism: the undo records in an interval are compacted into a single new undo record. While performing the interval compaction, we unlink the old undo records. They are deallocated later, once no transaction can still access them. Interval compaction keeps version chains short and keeps the time needed to compute a version by applying delta changes low.
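
The idea behind interval compaction can be sketched as follows. This is a simplified Python model under stated assumptions, not the project's code: `UndoRecord`, `read_version`, and `compact` are hypothetical names, each undo record is assumed to store the before-image of the columns it changed, and the chain is assumed to be ordered newest to oldest. Unlinking and deferred deallocation are not modeled.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class UndoRecord:
    timestamp: int             # commit timestamp of the update that produced this delta
    before: Dict[str, object]  # column -> value before that update


def read_version(latest: Dict[str, object], chain: List[UndoRecord],
                 start_ts: int) -> Dict[str, object]:
    """Rebuild the tuple as seen by a transaction that started at start_ts."""
    version = dict(latest)
    for rec in chain:  # newest to oldest
        if rec.timestamp <= start_ts:
            break  # this update and all older ones are already visible
        version.update(rec.before)
    return version


def compact(chain: List[UndoRecord],
            active_start_ts: Iterable[int]) -> List[UndoRecord]:
    """Merge undo records that no active transaction can tell apart."""
    boundaries = sorted(active_start_ts)
    compacted: List[UndoRecord] = []
    last_interval = None
    for rec in chain:  # newest to oldest
        # Number of active transactions that started before this record;
        # records with the same count are applied by exactly the same readers.
        interval = bisect_left(boundaries, rec.timestamp)
        if compacted and interval == last_interval:
            # The older before-image overrides the newer one for shared columns.
            compacted[-1].before.update(rec.before)
        else:
            compacted.append(UndoRecord(rec.timestamp, dict(rec.before)))
            last_interval = interval
    return compacted
```

With active transactions that started at timestamps 5 and 25, a chain with records at timestamps 40, 30, 20, and 10 collapses into two records, and both transactions still reconstruct the same versions as before.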

Rule-based Query Rewriter

Students: Erik Sargent, Newton Xie, William Zhang
Source Code: https://github.com/cmu-db/peloton/pull/1496

This project aims to incorporate a cost-less, rule-based query rewriter into the Peloton system. It implements the core framework required for rule-based query rewriting, which leverages the existing Cascades-based optimizer. So far, we have implemented rules for short-circuiting, simplifying comparators, transitivity, and reducing expressions based on column nullability information from the catalog.
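
To make the rule categories concrete, here is a small Python sketch of three of them (short-circuiting, comparator simplification, and nullability-based reduction) over a toy expression tree. It is only an illustration under assumptions: the tuple encoding and the `rewrite` function are hypothetical, it does not use the Cascades framework, and transitivity is omitted.

```python
def is_int(value):
    """True for integer literals, excluding Python booleans."""
    return isinstance(value, int) and not isinstance(value, bool)


def rewrite(expr, not_null_columns):
    """Rewrite a toy expression bottom-up.

    Expressions are bool/int literals, column names (str), or tuples of the
    form (op, operand, ...). `not_null_columns` stands in for the catalog's
    nullability information.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [rewrite(arg, not_null_columns) for arg in args]

    # Nullability: a NOT NULL column can never satisfy IS NULL.
    if op == "IS NULL" and isinstance(args[0], str) and args[0] in not_null_columns:
        return False

    # Comparator simplification: fold comparisons between two constants.
    if op == "=" and is_int(args[0]) and is_int(args[1]):
        return args[0] == args[1]
    if op == "<" and is_int(args[0]) and is_int(args[1]):
        return args[0] < args[1]

    # Short-circuiting: a constant operand decides or drops out of AND/OR.
    if op == "AND":
        lhs, rhs = args
        if lhs is False or rhs is False:
            return False
        if lhs is True:
            return rhs
        if rhs is True:
            return lhs
    if op == "OR":
        lhs, rhs = args
        if lhs is True or rhs is True:
            return True
        if lhs is False:
            return rhs
        if rhs is False:
            return lhs

    return (op, *args)
```

For example, with `id` declared NOT NULL, the predicate `("AND", ("IS NULL", "id"), ("<", "a", 5))` reduces to `False` without consulting any cost model.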

Metrics Collection

Students: Wenxuan Qiu, Qidu He, Dongsheng Yang
Source Code: https://github.com/cmu-db/terrier/pull/338

This project aims to build an infrastructure for collecting internal statistics during execution in the CMU DBMS. This infrastructure allows data to be collected at multiple levels, including the database, transaction, and tuple levels. The collected statistics are stored in SQL tables, which can be accessed from the catalog. The statistics are expected to support the DBMS's autonomous, self-driving operation.

Non-Blocking Schema Change

Students: Yash Nannapaneni, Sai Kiriti Badam, Mister X
Source Code: https://github.com/cmu-db/terrier/pull/342

This project implements a lazy, non-blocking backend for schema changes that migrates a tuple to the new schema version only when one of the new columns is updated.
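
A rough Python sketch of the lazy migration idea is shown below. It is an illustration under assumptions rather than the project's design: the `LazyTable` class and its methods are hypothetical, only adding a column with a default value is modeled, and concurrency is ignored.

```python
class LazyTable:
    """Toy table whose tuples remember the schema version they were written in."""

    def __init__(self, columns):
        self.schemas = [list(columns)]  # schema versions, oldest first
        self.defaults = {}              # default value per added column
        self.rows = {}                  # key -> (schema version, values)

    def insert(self, key, values):
        self.rows[key] = (len(self.schemas) - 1, dict(values))

    def add_column(self, name, default):
        """Schema change: record a new version without touching any tuple."""
        self.schemas.append(self.schemas[-1] + [name])
        self.defaults[name] = default

    def version_of(self, key):
        return self.rows[key][0]

    def read(self, key):
        """Present the tuple in the latest schema; the stored tuple is not migrated."""
        _, values = self.rows[key]
        return {c: values.get(c, self.defaults.get(c)) for c in self.schemas[-1]}

    def update(self, key, column, value):
        latest = len(self.schemas) - 1
        if column not in self.schemas[latest]:
            raise KeyError(column)
        version, values = self.rows[key]
        if column not in self.schemas[version]:
            # The tuple's own schema lacks this column: migrate it now.
            values = {c: values.get(c, self.defaults.get(c))
                      for c in self.schemas[latest]}
            version = latest
        self.rows[key] = (version, {**values, column: value})
```

Here `add_column` touches no tuples, so the schema change never has to scan the table; a tuple pays the migration cost only on its first write to a new column.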

Add/Drop Index

Students: Jiaqi Zuo, Yesheng Ma, Xueyuan Zhao
Source Code: https://github.com/cmu-db/terrier/pull/337

This project adds support for creating and deleting indexes transactionally in both blocking and non-blocking modes. In non-blocking mode, creating or deleting an index does not block modifications to the indexed attributes while the operation is in progress, and consistency is still guaranteed.

Checkpoints & Recovery

Students: Zhaozhe Song, Yuning Zhang, Mengyang Lyu
Source Code: https://github.com/cmu-db/terrier/pull/341

This project implements consistent, complete checkpointing and adds support for recovering tables from write-ahead logs and checkpoints. Checkpoints are packed into checkpoint pages and written out asynchronously in the manner of SiloR. All column types are supported, including variable-length values that may not be inlined in the tuple itself.