LLVM Execution Engine ★ Voted Best Project ★
Students: Amadou Ngom, Katia Villevald, Tanuj Nayak, Wan Shen Lim
Source Code: https://github.com/cmu-db/terrier/pull/371
This project integrates the new LLVM-based execution engine with CMU's new yet-to-be-named DBMS. We implemented a translation layer that converts a physical plan tree into our database-oriented domain-specific language. We then extended the execution engine with new functionality, such as index joins and nested-loop joins.
Settings Manager ★ Voted Best Project ★
Students: Weichen Ke, Yuze Liao, Wenhao Huang
Source Code: https://github.com/cmu-db/terrier/pull/336
This project implements a new centralized SettingsManager component for the DBMS. It is designed to be part of the DBMS's overall self-driving architecture. Our implementation parses command-line arguments and config files through gflags, provides a programmatic interface for accessing setting parameters, and provides macros for defining new parameters. When a setting value is modified, the DBMS enforces the parameter's constraints (mutability, type, and range), keeps track of the action state, and invokes the corresponding callback function.
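The modify-then-validate flow described above can be sketched in a few lines. This is only an illustration in Python, not the project's C++ code: the `Param` and `SettingsManager` classes, their fields, and the example parameter names are all hypothetical, and gflags parsing and action-state tracking are left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Param:
    """One setting plus its constraints (hypothetical layout)."""
    name: str
    value: Any
    type_: type
    mutable: bool = True
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    callback: Optional[Callable[[Any, Any], None]] = None


class SettingsManager:
    """Central registry: every change goes through the same checks."""

    def __init__(self) -> None:
        self._params: Dict[str, Param] = {}

    def define(self, param: Param) -> None:
        self._params[param.name] = param

    def get(self, name: str) -> Any:
        return self._params[name].value

    def set(self, name: str, new_value: Any) -> None:
        p = self._params[name]
        # Constraint 1: mutability
        if not p.mutable:
            raise PermissionError(f"{name} cannot be changed at runtime")
        # Constraint 2: type
        if not isinstance(new_value, p.type_):
            raise TypeError(f"{name} expects {p.type_.__name__}")
        # Constraint 3: range
        if p.min_value is not None and new_value < p.min_value:
            raise ValueError(f"{name} must be >= {p.min_value}")
        if p.max_value is not None and new_value > p.max_value:
            raise ValueError(f"{name} must be <= {p.max_value}")
        old_value, p.value = p.value, new_value
        # Notify the owning component only after the value is accepted.
        if p.callback is not None:
            p.callback(old_value, new_value)


seen = []
settings = SettingsManager()
settings.define(Param("buffer_pool_size", 64, int, min_value=1, max_value=1024,
                      callback=lambda old, new: seen.append((old, new))))
settings.define(Param("port", 15721, int, mutable=False))
settings.set("buffer_pool_size", 128)
print(settings.get("buffer_pool_size"), seen)  # 128 [(64, 128)]
```

Running the checks before assigning the value means a rejected change leaves the old value in place and never triggers the callback.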
Interval Garbage Collector ★ Voted Best Project ★
Students: Huzaifa Abbasi, Pulkit Agarwal, Utkarsh Agarwal
Source Code: https://github.com/cmu-db/terrier/pull/340
This project implements SAP HANA-style interval garbage collection. The current system uses delta records, unlike HANA, which has append-only newest-to-oldest version chains. With delta records, no single record version is the source of truth for a transaction; instead, the deltas that occurred after the transaction started have to be applied to get the correct version. This prohibits collecting individual tuple versions from the version chain. So instead, we implemented interval compaction on top of the earlier garbage collection mechanism: the undo records in an interval are compacted to create a new undo record. While performing the interval compaction, we unlink the old undo records. They are deallocated later, after ensuring that no transaction is able to access them. Interval compaction keeps version chains short, so the time to compute a version by applying delta changes stays low.
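One way to picture interval compaction is the Python sketch below. It is a simplified model under stated assumptions, not the code from the pull request: `UndoRecord`, `read_version`, and `compact` are hypothetical names, each undo record is assumed to hold the before-image of the columns it changed, and the deferred unlinking and deallocation are not modeled. Two undo records fall in the same interval when no active transaction's start timestamp separates them, so every reader applies either both or neither and they can be merged.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UndoRecord:
    commit_ts: int
    before: Dict[str, object]  # column -> value before this update


def read_version(latest: Dict[str, object], chain: List[UndoRecord],
                 start_ts: int) -> Dict[str, object]:
    """Rebuild the tuple seen by a reader that started at start_ts.

    The chain is ordered newest to oldest; every update committed after
    the reader started is undone in that order.
    """
    row = dict(latest)
    for rec in chain:
        if rec.commit_ts <= start_ts:
            break
        row.update(rec.before)
    return row


def compact(chain: List[UndoRecord], active_ts: List[int]) -> List[UndoRecord]:
    """Merge undo records that no active reader can tell apart."""
    boundaries = sorted(active_ts)
    out: List[UndoRecord] = []
    last_interval = None
    for rec in chain:  # newest to oldest
        # Number of active readers that started before this update committed.
        interval = bisect_left(boundaries, rec.commit_ts)
        if out and interval == last_interval:
            # Same interval: the older before-image wins per column.
            out[-1].before.update(rec.before)
        else:
            out.append(UndoRecord(rec.commit_ts, dict(rec.before)))
            last_interval = interval
    return out


latest = {"a": 5, "b": 9}
chain = [
    UndoRecord(50, {"a": 4}),
    UndoRecord(40, {"b": 8}),
    UndoRecord(30, {"a": 3}),
    UndoRecord(20, {"a": 2, "b": 7}),
    UndoRecord(10, {"a": 1}),
]
active = [25, 45]
compacted = compact(chain, active)
print(len(chain), "->", len(compacted))  # 5 -> 3
```

In this example the readers at timestamps 25 and 45 reconstruct the same tuple from the compacted chain as from the original one, while walking fewer records.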
Rule-based Query Rewriter
Students: Erik Sargent, Newton Xie, William Zhang
Source Code: https://github.com/cmu-db/peloton/pull/1496
This project incorporates a rule-based query rewriter that does not rely on cost estimates into the Peloton system. It implements the core framework required for rule-based query rewriting, which leverages the existing Cascades-based optimizer. So far, we have implemented rules for short-circuiting, simplifying comparators, transitivity, and reducing expressions based on column nullability information from the catalog.
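To make the rule categories concrete, here is a minimal Python sketch of a bottom-up rewriter over a toy expression tree. It is not the Peloton implementation and does not use the Cascades framework: the tuple-based expression encoding, the `rewrite` function, and the `not_null_cols` argument (standing in for catalog nullability information) are assumptions made for illustration, and transitivity rules are omitted.

```python
import operator

TRUE, FALSE = ("const", True), ("const", False)
COMPARATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def is_bool(expr, value):
    """True only for the boolean constant `value` (not for 0 or 1)."""
    return expr[0] == "const" and expr[1] is value


def rewrite(expr, not_null_cols):
    """Simplify an expression tree bottom-up without any cost model."""
    kind = expr[0]
    if kind in ("const", "col"):
        return expr
    if kind == "is_null":
        child = rewrite(expr[1], not_null_cols)
        # Nullability rule: a NOT NULL column is never NULL.
        if child[0] == "col" and child[1] in not_null_cols:
            return FALSE
        return ("is_null", child)

    left = rewrite(expr[1], not_null_cols)
    right = rewrite(expr[2], not_null_cols)

    # Comparator simplification: fold comparisons of non-NULL constants.
    if (kind in COMPARATORS and left[0] == "const" and right[0] == "const"
            and left[1] is not None and right[1] is not None):
        return ("const", COMPARATORS[kind](left[1], right[1]))

    # Short-circuiting for AND / OR.
    if kind == "and":
        if is_bool(left, False) or is_bool(right, False):
            return FALSE
        if is_bool(left, True):
            return right
        if is_bool(right, True):
            return left
    if kind == "or":
        if is_bool(left, True) or is_bool(right, True):
            return TRUE
        if is_bool(left, False):
            return right
        if is_bool(right, False):
            return left
    return (kind, left, right)


# WHERE (1 = 1 AND a > 5) OR id IS NULL, with id declared NOT NULL
predicate = ("or",
             ("and", ("=", ("const", 1), ("const", 1)),
                     (">", ("col", "a"), ("const", 5))),
             ("is_null", ("col", "id")))
print(rewrite(predicate, not_null_cols={"id"}))
# ('>', ('col', 'a'), ('const', 5))
```

Because children are rewritten before their parent, a single pass is enough here: the constant comparison folds to TRUE, the AND drops it, and the OR drops the always-false nullability test.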
Metrics Collection
Students: Wenxuan Qiu, Qidu He, Dongsheng Yang
Source Code: https://github.com/cmu-db/terrier/pull/338
This project builds an infrastructure for collecting internal statistics during execution in the CMU DBMS. The infrastructure allows data to be collected at multiple levels, including the database, transaction, and tuple levels. The collected statistics are stored in SQL tables, which can be accessed from the catalog. These statistics are intended to support the DBMS's autonomous, self-driving operation.
Non-Blocking Schema Change
Students: Yash Nannapaneni, Sai Kiriti Badam, Mister X
Source Code: https://github.com/cmu-db/terrier/pull/342
This project implements a lazy, non-blocking backend for schema changes that migrates a tuple to the new schema version only when one of the new columns is updated.
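The lazy migration policy can be illustrated with a small in-memory model. The Python sketch below is an assumption-laden toy, not the project's storage code: `LazyTable` and its methods are hypothetical, only adding a column with a default value is modeled, and concurrency is ignored. A schema change only records a new schema version; a tuple keeps its old version until an update touches a column that the old version does not have.

```python
class LazyTable:
    """Toy table whose tuples migrate to a new schema only when needed."""

    def __init__(self, columns):
        self.schemas = [list(columns)]  # schema version -> column names
        self.defaults = {}              # added column -> default value
        self.rows = {}                  # key -> (schema version, values)

    def insert(self, key, values):
        self.rows[key] = (len(self.schemas) - 1, dict(values))

    def add_column(self, name, default=None):
        # Non-blocking: no existing tuple is touched here.
        self.schemas.append(self.schemas[-1] + [name])
        self.defaults[name] = default

    def version_of(self, key):
        return self.rows[key][0]

    def read(self, key):
        # Present every tuple in the latest layout, filling in defaults
        # for columns that an old-version tuple does not store.
        _, values = self.rows[key]
        return {c: values.get(c, self.defaults.get(c)) for c in self.schemas[-1]}

    def update(self, key, changes):
        unknown = set(changes) - set(self.schemas[-1])
        if unknown:
            raise KeyError(f"unknown columns: {sorted(unknown)}")
        version, values = self.rows[key]
        if any(c not in self.schemas[version] for c in changes):
            # The update touches a new column: migrate this tuple now.
            values = self.read(key)
            version = len(self.schemas) - 1
        self.rows[key] = (version, {**values, **changes})


table = LazyTable(["id", "name"])
table.insert(1, {"id": 1, "name": "ann"})
table.insert(2, {"id": 2, "name": "bob"})
table.add_column("age", default=0)
table.update(1, {"name": "anna"})  # old column only: stays at version 0
table.update(2, {"age": 30})       # new column: migrated to version 1
print(table.version_of(1), table.version_of(2))  # 0 1
```

Reads still return the new column for unmigrated tuples by substituting the default, which is what lets the migration itself be deferred.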
Add/Drop Index
Students: Jiaqi Zuo, Yesheng Ma, Xueyuan Zhao
Source Code: https://github.com/cmu-db/terrier/pull/337
This project adds support for creating and deleting indexes transactionally in both a blocking and a non-blocking manner. In the non-blocking mode, creating or deleting an index does not block any modification of the indexed attributes while the operation is in progress, and consistency is still guaranteed.
Checkpoints & Recovery
Students: Zhaozhe Song, Yuning Zhang, Mengyang Lyu
Source Code: https://github.com/cmu-db/terrier/pull/341
This project implements consistent, complete checkpointing and adds support for recovering tables from write-ahead logs and checkpoints. Checkpoints are packed into checkpoint pages and written out asynchronously in the style of SiloR. All column types are supported, including variable-length values that may not be inlined in the tuple itself.
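How a checkpoint and a write-ahead log combine during recovery can be sketched as follows. This Python fragment is a generic illustration rather than the project's SiloR-style implementation: the `recover` function and the `(commit_ts, key, value)` log record layout are hypothetical, and it assumes the checkpoint is consistent as of `checkpoint_ts`, meaning it reflects exactly the transactions that committed at or before that timestamp.

```python
from typing import Dict, List, Optional, Tuple

# (commit timestamp, key, new value); a value of None marks a delete.
LogRecord = Tuple[int, str, Optional[str]]


def recover(checkpoint_ts: int, checkpoint_rows: Dict[str, str],
            log: List[LogRecord]) -> Dict[str, str]:
    """Rebuild a table from a checkpoint plus the log written after it."""
    table = dict(checkpoint_rows)
    # Replay in commit order so later writes override earlier ones.
    for commit_ts, key, value in sorted(log, key=lambda rec: rec[0]):
        if commit_ts <= checkpoint_ts:
            continue  # already reflected in the checkpoint
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


checkpoint = {"a": "1", "b": "2"}  # state as of timestamp 10
wal = [
    (5, "a", "1"),
    (8, "b", "2"),
    (12, "a", "9"),
    (15, "b", None),
    (20, "c", "3"),
]
print(recover(10, checkpoint, wal))  # {'a': '9', 'c': '3'}
```

Skipping records at or before the checkpoint timestamp is what allows the log prefix covered by a checkpoint to be discarded once that checkpoint is durable.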