Jan 16, 2023 |
Martin Luther King Day — No Class |
|
|
|
Jan 18, 2023 |
#01 — History of Databases
- M. Stonebraker, et al., What Goes Around Comes Around, in Readings in Database Systems, 4th Edition, 2006
(Optional)
- M. Stonebraker, et al., What Goes Around Comes Around... And Around (CMU Only), 2023
(Optional)
|
|
— |
|
Jan 23, 2023 |
#02 — Modern Analytical Database Systems (No In-Class Lecture)
|
|
— |
|
Jan 25, 2023 |
Legal Problems — No Class |
|
|
|
Jan 30, 2023 |
#03 — Storage Models & Data Layout
|
|
— |
|
Feb 01, 2023 |
#04 — OLAP Indexes
- B. Hentschel, et al., Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation, in SIGMOD, 2018
- C.Y. Chan, et al., Bitmap Index Design and Evaluation, in SIGMOD, 1998
(Optional)
- P.-A. Larson, et al., SQL Server Column Store Indexes, in SIGMOD, 2011
(Optional)
- L. Sidirourgos, et al., Column Imprints: A Secondary Index Structure, in SIGMOD, 2013
(Optional)
- J. Rao, et al., Cache Conscious Indexing for Decision-Support in Main Memory, in VLDB, 1999
(Optional)
- Y. Li, et al., BitWeaving: Fast Scans for Main Memory Data Processing, in SIGMOD, 2013
(Optional)
|
|
— |
|
Feb 06, 2023 |
#05 — Database Compression
- D. Abadi, et al., Integrating Compression and Execution in Column-Oriented Database Systems, in SIGMOD, 2006
- C. Binnig, et al., Dictionary-based Order-preserving String Compression for Main Memory Column Stores, in SIGMOD, 2009
(Optional)
- I. Müller, et al., Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems, in EDBT, 2014
(Optional)
- V. Raman, et al., How to Wring a Table Dry: Entropy Compression of Relations and Querying of Compressed Relations, in VLDB, 2006
(Optional)
- C. Liu, et al., Mostly Order Preserving Dictionaries, in ICDE, 2019
(Optional)
|
|
— |
|
Feb 08, 2023 |
#06 — Query Execution & Processing
- P. Boncz, et al., MonetDB/X100: Hyper-Pipelining Query Execution, in CIDR, 2005
- L. Shrinivas, et al., Materialization Strategies in the Vertica Analytic Database: Lessons Learned, in ICDE, 2013
(Optional)
- M. Kester, et al., Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe?, in SIGMOD, 2017
(Optional)
|
|
— |
|
Feb 13, 2023 |
#07 — Query Scheduling
- V. Leis, et al., Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age, in SIGMOD, 2014
- I. Psaroudakis, et al., Scaling Up Concurrent Main-Memory Column-Store Scans: Towards Adaptive NUMA-aware Data and Task Placement, in VLDB, 2015
(Optional)
- I. Psaroudakis, et al., Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads, in ADMS, 2013
(Optional)
- B. Wagner, et al., Self-Tuning Query Scheduling for Analytical Workloads, in SIGMOD, 2021
(Optional)
|
|
— |
|
Feb 15, 2023 |
#08 — Vectorized Execution
- H. Lang, et al., Make the Most out of Your SIMD Investments: Counter Control Flow Divergence in Compiled Query Pipelines, in VLDB Journal, 2020
- O. Polychroniou, et al., Rethinking SIMD Vectorization for In-Memory Databases, in SIGMOD, 2015
(Optional)
- T. Willhalm, et al., SIMD-scan: Ultra Fast In-memory Table Scan using On-chip Vector Processing Units, in VLDB, 2009
(Optional)
- P. Menon, et al., Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last, in VLDB, 2017
(Optional)
|
|
— |
|
Feb 20, 2023 |
#09 — Query Compilation
- T. Neumann, Efficiently Compiling Efficient Query Plans for Modern Hardware, in VLDB, 2011
- K. Krikellas, et al., Generating Code for Holistic Query Evaluation, in ICDE, 2010
(Optional)
- H. Pirk, et al., CPU and Cache Efficient Management of Memory-Resident Databases, in ICDE, 2013
(Optional)
- B. Raducanu, et al., Micro Adaptivity in Vectorwise, in SIGMOD, 2013
(Optional)
- A. Shaikhha, et al., How to Architect a Query Compiler, in SIGMOD, 2016
(Optional)
- A. Kohn, et al., Adaptive Execution of Compiled Queries, in ICDE, 2018
(Optional)
|
|
— |
|
Feb 22, 2023 |
#10 — Vectorization vs. Compilation
- T. Kersten, et al., Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask, in VLDB, 2018
- J. Sompolski, et al., Vectorization vs. Compilation in Query Execution, in DaMoN, 2011
(Optional)
- H. Lang, et al., Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation, in SIGMOD, 2016
(Optional)
|
|
— |
|
Feb 27, 2023 |
#11 — Parallel Join Algorithms (Hashing)
- S. Schuh, et al., An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory, in SIGMOD, 2016
- S. Richter, et al., A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing, in VLDB, 2015
(Optional)
- S. Blanas, et al., Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs, in SIGMOD, 2011
(Optional)
- C. Balkesen, et al., Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware, in ICDE, 2013
(Optional)
- M. Bandle, et al., To Partition, or Not to Partition, That is the Join Question in a Real System, in SIGMOD, 2021
(Optional)
|
|
— |
|
Mar 01, 2023 |
Student Project Proposals |
— |
— |
— |
Mar 06, 2023 |
Spring Break — No Class |
|
|
|
Mar 08, 2023 |
Spring Break — No Class |
|
|
|
Mar 13, 2023 |
#12 — Parallel Join Algorithms (Sorting)
- C. Balkesen, et al., Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited, in VLDB, 2013
- C. Kim, et al., Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs, in VLDB, 2009
(Optional)
- G. Graefe, et al., Sort vs. Hash Revisited, in TKDE, 1994
(Optional)
- M.-C. Albutiu, et al., Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems, in VLDB, 2012
(Optional)
|
|
— |
|
Mar 15, 2023 |
#13 — Multi-Way Joins
- M. Freitag, et al., Adopting Worst-Case Optimal Joins in Relational Database Systems, in VLDB, 2020
- H.Q. Ngo, et al., Skew Strikes Back: New Developments in the Theory of Join Algorithms, in SIGMOD Record, 2013
(Optional)
- C. Aberger, et al., LevelHeaded: A Unified Engine for Business Intelligence and Linear Algebra Querying, in ICDE, 2018
(Optional)
|
|
— |
|
Mar 20, 2023 |
#14 — Server-side Logic Execution
- K. Ramachandra, et al., Froid: Optimization of Imperative Programs in a Relational Database, in VLDB, 2017
- S. Gupta, et al., Aggify: Lifting the Curse of Cursor Loops using Custom Aggregates, in SIGMOD, 2020
(Optional)
- C. Duta, et al., Compiling PL/SQL Away, in CIDR, 2020
(Optional)
- S. Gupta, et al., Procedural Extensions of SQL: Understanding Their Usage in the Wild, in VLDB, 2021
(Optional)
- C. Duta, et al., Functional-Style SQL UDFs With a Capital ‘F’, in SIGMOD, 2020
(Optional)
|
|
— |
|
Mar 22, 2023 |
#15 — Networking Protocols
- M. Raasveldt, et al., Don't Hold My Data Hostage: A Case for Client Protocol Redesign, in VLDB, 2017
- F. Li, et al., Accelerating Relational Databases by Leveraging Remote Memory and RDMA, in SIGMOD, 2016
(Optional)
- C. Binnig, et al., The End of Slow Networks: It's Time for a Redesign, in VLDB, 2016
(Optional)
|
|
— |
|
Mar 27, 2023 |
#16 — Optimizer Implementation (Overview)
- S. Chaudhuri, An Overview of Query Optimization in Relational Systems, in PODS, 1998
- G. Graefe, et al., The Volcano Optimizer Generator: Extensibility and Efficient Search, in ICDE, 1993
(Optional)
- G. Graefe, The Cascades Framework for Query Optimization, in IEEE Data Engineering Bulletin, 1995
(Optional)
- M.A. Soliman, et al., Orca: A Modular Query Optimizer Architecture for Big Data, in SIGMOD, 2014
(Optional)
- L.D. Shapiro, et al., Exploiting Upper and Lower Bounds In Top-Down Query Optimization, in IDEAS, 2001
(Optional)
|
|
— |
|
Mar 29, 2023 |
#17 — Optimizer Implementation (Top-Down vs. Bottom-Up)
- Y. Xu, Efficiency in the Columbia Database Query Optimizer (pages 1-35), M.S. thesis, Portland State University, 1998
- J. Chen, et al., The MemSQL Query Optimizer, in VLDB, 2017
(Optional)
- G. Moerkotte, et al., Dynamic Programming Strikes Back, in SIGMOD, 2008
(Optional)
- T. Neumann, et al., The Complete Story of Joins (in HyPer), in BTW, 2017
(Optional)
- E. Begoli, et al., Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources, in SIGMOD, 2018
(Optional)
|
|
— |
|
Apr 03, 2023 |
#18 — Cost Models
- V. Leis, et al., How Good are Query Optimizers, Really?, in VLDB, 2015
- M. Stillger, et al., LEO - DB2's LEarning Optimizer, in VLDB, 2001
(Optional)
- Z. Yang, et al., Deep Unsupervised Cardinality Estimation, in VLDB, 2019
(Optional)
- J. Sun, et al., An End-to-End Learning-based Cost Estimator, in VLDB, 2019
(Optional)
- D. Vengerov, et al., Join Size Estimation Subject to Filter Conditions, in VLDB, 2015
(Optional)
- Y. Chen, et al., Two-Level Sampling for Join Size Estimation, in SIGMOD, 2017
(Optional)
|
|
— |
|
Apr 05, 2023 |
Student Project Updates |
— |
— |
— |
Apr 10, 2023 |
#19 — System Analysis (Dremel / BigQuery)
|
|
— |
|
Apr 12, 2023 |
#20 — System Analysis (Databricks / Spark)
- A. Behm, et al., Photon: A Fast Query Engine for Lakehouse Systems, in SIGMOD, 2022
- P. Jain, et al., Analyzing and Comparing Lakehouse Storage Systems, in CIDR, 2023
(Optional)
- M. Armbrust, et al., Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores, in VLDB, 2020
(Optional)
|
|
— |
|
Apr 17, 2023 |
#21 — System Analysis (Snowflake)
|
|
— |
|
Apr 19, 2023 |
#22 — System Analysis (DuckDB) (Guest Speaker: Mark Raasveldt)
|
|
— |
|
Apr 24, 2023 |
#23 — System Analysis (Velox)
|
|
— |
|
Apr 26, 2023 |
#24 — System Analysis (Amazon Redshift) (Guest Speaker: Ippokratis Pandis (PhD'07))
|
— |
— |
|