Cornell University
Library

eCommons

Robust Querying for Data Analysis and Processing

File(s)
Wei_cornellgrad_0058F_14345.pdf (3.4 MB)
Permanent Link(s)
https://doi.org/10.7298/hfbv-jj91
https://hdl.handle.net/1813/116614
Collections
Cornell Theses and Dissertations
Author
Wei, Ziyun
Abstract

Traditionally, users have relied on formal languages such as SQL for data analysis. However, these interfaces are not easily accessible to lay users without an IT background. This has led to the emergence of novel interfaces such as visual and natural language query interfaces. While these interfaces democratize data access, they can introduce ambiguity about the user's query intent in the frontend. Furthermore, executing these queries efficiently in the backend is not straightforward. Traditional systems typically optimize query execution by selecting a plan based on analytical cost models. However, these models can lead to suboptimal choices due to statistical inaccuracies. This thesis focuses on developing a robust data analysis platform that addresses these issues by making multiple query and plan choices instead of relying on a single one. I introduce three systems that facilitate robust data analysis and processing. The first system, MUVE, enables natural language queries through typed or voice input. It provides users with alternative query interpretations and optimizes the visual output to minimize the time required to identify the correct results. The second system, SkinnerMT, parallelizes adaptive query processing to improve efficiency and robustness. It utilizes different parallel methods, allocating threads to plan search or to execution on data partitions. The third system, ROME, strategically selects complementary plans for concurrent execution, increasing the likelihood that an optimal plan is among them. These systems contribute to the robustness of interactive data analysis systems by optimally selecting queries and plans in both the frontend and the backend.

Description
171 pages
Date Issued
2024-08
Committee Chair
Trummer, Immanuel
Committee Member
Banerjee, Siddhartha
Sun, Wen
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/16611844
