Common Table Expressions
Students: Preetansh Goyal, Rohan Aggarwal, Gautam JainSource Code: https://github.com/cmu-db/terrier/pull/885
This project adds support for Common Table Expressions (CTEs). The advantage of CTEs is that they enable the DBMS to store the output of a complex query in a temporary table that can be directly used by many nodes elsewhere in the physical plan. To tackle the different points of access to the complex query, we extended the query optimizer to choose the most appropriate plan. We also added the support for the creation, population, and destruction of a temporary table structure in the execution engine.
Checkpoints & Recovery
Students: Xuanxuan Ge, Kaige Liu, Tianlei PanSource Code: https://github.com/cmu-db/terrier/pull/857
We proposed and implemented a checkpoint recovery mechanism that builds upon the existing log recovery infrastructure. The checkpoint contains information of all the tables, which can be used to directly recover the database to its previous state without replaying the previous logs one by one. The database system finds the most recent checkpoint and applies log records to reach the full state. Our implementation provides 3x faster recovery speed in benchmarks and scales linearly with table size and transaction numbers.
Optimizer Cost Model
Students: Masha Oreshko, Kevin Geng, Vivian HuangSource Code: https://github.com/cmu-db/terrier/pull/889
This project implements the ANALYZE operator and a new cost model. ANALYZE uses existing analytic functions to calculate useful statistics for tables, which are then stored. The cost model which calculations are based on Postgres's cost model, uses the results from ANALYZE to primarily decide between joins.
Nested Queries
Students: Xinzhu Cai, Guancheng Li, Ian RominesSource Code: https://github.com/cmu-db/terrier/pull/864
This provides support for nested queries in the DBMS. It implements unnesting algorithms as rewriting rules in the query optimizer and new operators in the query execution engine. Four types of query nesting are supported.
NUMA-Aware Thread Pool
Students: Emmanuel Eppinger, Deepayan Patra, Ricky ZhouSource Code: https://github.com/cmu-db/terrier/pull/851
This project implements a NUMA-aware, resumable task execution framework for CMU's self-driving database. Our implementation includes a thread pool that allows for tasks to be executed on cores local to the data accessed, and integrates with system's latches to allow for tasks to be paused and later resumed, preventing busy-waiting. Our benchmarks observed performance improvements of up to 600x on worst-case high-contention workloads, and general-case performance improvements of 3-4x.
Schema Changes
Students: Sheng Xu, Ling Zhang, Ricky XuSource Code: https://github.com/cmu-db/terrier/pull/862
This project adds the schema change functionality and ALTER TABLE command to the DBMS. We adapted the whole pipeline (parser, binder, optimizer, planner, execution, and storage) to support simple ALTER TABLES commands such as add and drop column. We implemented schema change in a non-blocking manner, allowing for arbitrary number of schema changes that are executed concurrently with sql queries on the same table. We also implemented schema change in a lazy manner, by maintaining seperate physical tables for each schema version, and only migrating tuples to the latest version when they are updated. We wrote benchmarks that show concurrent scheme changes slow down sql queries on the same table by at most 10%.
Multi-threaded Queries
Students: Yinuo Pan, Zecheng He, Yuhong ZhangSource Code: https://github.com/cmu-db/terrier/pull/860
The project works on providing intra-operator multi-threaded functionalities starting with the sequential scan for CMU's database terrier. The new execution engine can generate tpl codes which invokes multiple threads to concurrently produce scan outputs for the upper-level operators within its pipeline.
Apache Arrow Block Compaction
Students: Vilas Bhat, Arvind Sai Krishnan, Abhijith Anilkumar, Paulina DavisonSource Code: https://github.com/cmu-db/terrier/pull/858
This project adds index updates to the block compaction process and support the ability of the execution engine to operate on compressed data. To do so, it migrates core block compactor functionality from directly modifying data table blocks to accessing data table functions through the execution engine using the DBMS's internal DSL.
Sequences
Students: Tianhan Hu, Zian Ke, Adrian Lo-Yu ChangSource Code: https://github.com/cmu-db/terrier/pull/897
This project adds the implementation of Postgres-style sequence to the CMU DBMS. It supports nextval and currval, implemented as built-in functions, with session-specific behaviors. Parameters like range and increment values can also be specified on a sequence.
Add/Drop Indexes
Students: Kunal Jobanputra, Alex Stanescu, Cal LavickaSource Code: https://github.com/cmu-db/terrier/pull/863
This project implements CREATE INDEX, which entailed adding, populating, and/or dropping the index in a transactionally consistent manner. Transactions that attempt to modify the table as the index is building using a sequential scan are blocked.
Constraints
Students: Yingjing Lu, Wuwen Wang, Yi ZhouSource Code: https://github.com/cmu-db/terrier/pull/861
This project implements PRIMARY KEY, UNIQUE, and FOREIGN KEY functionality for the new tin-memory DBMS.Out implementation supports constraint definition at table creation time for single column and multi column constraints. Constraints are verified during INSERT, DELETION, and UPDATE operations. There is additional support for CASCADE propagation per constraint definition. We also added future expandability for other constraints, such as EXCLUSION and CHECK.