Cornell University
Library

eCommons

Applied Machine Learning and Bioinformatics Methods for Advanced Single-Cell Metabolic and Taxonomic Analysis for Environmental Applications

Access Restricted

Access to this document is restricted. Some items have been embargoed at the request of the author, but will be made publicly available after the "No Access Until" date.

During the embargo period, you may request access to the item by clicking the link to the restricted file(s) and completing the request form. If we have contact information for a Cornell author, we will contact the author and request permission to provide access. If we do not have contact information for a Cornell author, or the author denies or does not respond to our inquiry, we will not be able to provide access. For more information, review our policies for restricted content.

File(s)
Li_cornellgrad_0058F_14287.pdf (9.26 MB)
No Access Until
2026-06-17
Permanent Link(s)
https://doi.org/10.7298/ry1a-e672
https://hdl.handle.net/1813/115950
Collections
Cornell Theses and Dissertations
Author
Li, Guangyu
Abstract

Single-cell technology has emerged as a promising tool for high-resolution and fundamental studies in environmental microbiology, surpassing traditional cultivation-based and bulk-measurement methods. With the advancement of machine-learning methods, the taxonomic, metabolic, and functional analysis of datasets generated by single-cell technology has reached unprecedented levels of scale and efficiency. In this dissertation, several methods and pipelines integrating machine-learning and bioinformatics methods are proposed to improve sampling size assessment, taxonomic analysis, and metabolic function analysis of environmental datasets obtained through single-cell technologies. First, a sampling size assessment protocol was developed that does not require prior knowledge of population sizes, designed specifically for single-cell-based sampling from large communities like environmental microbial communities. This protocol aims to standardize sampling size assessments across all single-cell technologies, replacing conventional empirical estimations. Second, a standardized pipeline for single-cell Raman spectroscopy (SCRS) classification was developed, suitable for environmental applications, to unveil biochemical fingerprints linked to taxonomic and cell age differentiation. Third, an improved agent-based metabolic simulation model was developed to incorporate cell state heterogeneities, metabolic pathway switching, and metabolic phenotypes, providing unprecedented resolution for investigating phenotype-based microbial interactions, validated using single-cell phenotypic survey datasets such as SCRS. Fourth, a pipeline was applied to investigate microdiversity from 16S rRNA amplicon sequencing datasets, resolving operational taxonomic units (OTUs) into subgenus-level taxa. These resolved taxa offer better resolution for single-cell technologies and unveil co-occurrence patterns among similar environments, enhancing our understanding of microbial interactions. Lastly, metagenomic analysis was applied to compare taxonomic, functional, and core marker gene distinctions between Enhanced Biological Phosphorus Removal (EBPR) and Side-Stream EBPR systems. The results indicated that fine-scale microdiversity is more crucial than overall functional profiling and highlighted knowledge gaps regarding novel species of core functional organisms.

Description
271 pages
Date Issued
2024-05
Keywords
Computational Biology
Environmental Engineering
Environmental Microbiology
Machine Learning
Single-cell Phenotype
Single-cell Raman Spectroscopy
Committee Chair
Gu Leip, April
Committee Member
Weinberger, Kilian
Giometto, Andrea
Degree Discipline
Civil and Environmental Engineering
Degree Name
Ph. D., Civil and Environmental Engineering
Degree Level
Doctor of Philosophy
Rights
Attribution-ShareAlike 4.0 International
Rights URI
https://creativecommons.org/licenses/by-sa/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/16575598