APPLICATION OF RAPID DIAGNOSTICS AND NEXT-GENERATION AMPLICON SEQUENCING TO ADDRESS PRODUCT QUALITY IN THE FOOD INDUSTRY

A Dissertation Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Jonathan Harold Sogin
May 2023

© 2023 Jonathan Harold Sogin

Jonathan Harold Sogin, Ph.D.
Cornell University 2023

The cost of foodborne illness and food loss to the United States (US) economy is approximately $300 billion per year. This figure monetarily quantifies the almost 10 million individuals who contract a foodborne illness each year and the 141 trillion kcal of food, enough to feed half the US population, that is disposed of each year. Despite the US having one of the safest and highest quality food supplies in the world, these figures illustrate the need to improve food safety and quality within the US supply chain and globally. Increasing consumer demand for minimally processed and clean label foods contributes to the challenge of addressing this need, as proven processing methods and food additives are being substituted or otherwise avoided or removed. To improve food safety and quality, food producers are seeking novel processing methods and additives to eliminate or kill foodborne pathogens and prevent the growth of spoilage microorganisms. Additional strategies to accomplish these goals include improvements to hygiene management within food processing facilities, and optimization of fermentation conditions or cultures as determined by an improved understanding of the microbial communities involved in commercial food fermentations.
This dissertation presents examples of these strategies through the use of two tools: adenosine triphosphate (ATP) monitoring of a food processing environment and next-generation amplicon sequencing of commercial tempeh and kombucha. The following studies demonstrate that these tools are useful for improving food product quality.

BIOGRAPHICAL SKETCH

Jonathan Harold Sogin was born in Minneapolis, MN to Emily Elizabeth Duke and Daniel Aaron Sogin, and grew up just across the Mississippi River in St. Paul, MN. Jonathan received his Bachelor of Science degree in Food Science from the University of Wisconsin–Madison in 2018, where he developed a keen interest in Food Microbiology. He began his studies at Cornell in the fall of 2018 and has since worked on a variety of projects related to fermentation, food quality, and food safety. He received his Master of Science degree from Cornell University in the spring of 2021 during the completion of this Ph.D.

To my family, friends, and cats, who supported me through thick and thin, and
To all the individuals who support the operations of the University, which allowed me to learn and do work throughout my studies at Cornell.

ACKNOWLEDGMENTS

“It takes a village to raise a child Ph.D.” I can say without a doubt I would not have finished this degree without the patience, support, and encouragement of my parents, Emily Duke and Daniel Sogin; sisters, Hallie and Jennifer Sogin; partner, Caitlyn Berk; friends, Caitlin Carmody and Amanyi Richardson; and advisor, Randy Worobo. Each of these individuals, and collectively the core of my ‘village’, has celebrated with me in good times and been a refuge for me in tough times. Although it’s somewhat cliché to say that my Ph.D. progressed atypically, I feel justified in this assessment.
First came COVID, and then came ACL reconstruction surgery with a side of complications, which ultimately landed me on a continuous self-administered intravenous infusion of 12 g ampicillin daily for 30 days – for reference, that’s a boatload of ampicillin. Luckily, I have no lasting complications from either of these impositions, but in those moments and the many months that followed, I considered tapping out of this degree many, many times. Luckily, I had a good village.

In addition to the core of my village, there are many people at Cornell who helped me along the way, including Linda Cote and Peter Schweitzer at the Cornell Institute of Biotechnology; the other members of my special committee, Patrick Gibney and Daniel Buckley; and the other members of the Worobo lab, who were almost always in good spirits and created an environment that was supportive and collaborative. Each of these individuals was genuinely nice and helpful, even when I didn’t know what I was doing – and never made me feel bad about it. A special shoutout goes to Dwayne Bershaw, as I had an incredibly fun time as his teaching assistant for the beers and distilled spirits lectures and labs, and would otherwise not have spent any time in the Cornell Food Science Department’s teaching winery.

I’d also like to acknowledge the staff at Cornell, whose names I largely do not know, but whose work keeps the place running. These are the individuals who keep the lights on and the water flowing, fix things when they’re broken, salt the sidewalks in the winters, and do the many other things that keep Cornell operational, even through a pandemic. As much as I would not have reached this point without my core village, I would not have reached this point without the hard and often unappreciated work of these individuals.

Finally, I’d like to thank Randy for his unconditional academic support and appreciation for personal well-being.
There are a lot of things that I could thank him for, but I will sum it up with this: he is a nice person, and he cares about people. I am extremely grateful for his advising over the past five years and hope that, if I ever find myself in a similar position, I can pass along the same selflessness and support he has shown me.

TABLE OF CONTENTS

BIOGRAPHICAL SKETCH
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
A BRIEF CONTEXTUALIZATION OF FOOD SAFETY AND QUALITY MANAGEMENT WITHIN THE UNITED STATES ECONOMY AND CONSUMER PREFERENCES FOR MINIMALLY PROCESSED FOODS
REFERENCES
CHAPTER 2
IMPLEMENTATION OF ATP AND MICROBIAL INDICATOR TESTING FOR HYGIENE MONITORING IN A TOFU PRODUCTION FACILITY IMPROVES PRODUCT QUALITY AND HYGIENIC CONDITIONS OF FOOD CONTACT SURFACES: A CASE STUDY
REFERENCES
CHAPTER 3
NEXT-GENERATION AMPLICON SEQUENCING ANALYSIS OF COMMERCIAL TEMPEH DURING FERMENTATION
REFERENCES
APPENDIX
CHAPTER 4
MICROBIOLOGICAL AND PHYSIOCHEMICAL ANALYSIS OF COMMERCIAL KOMBUCHA PRODUCTS SOLD IN US RETAIL MARKETS REVEALS BRAND-DISTINCT PRODUCT PROFILES AND FURTHER INDICATES THE ROLE OF LACTIC ACID BACTERIA IN COMMERCIAL KOMBUCHA PRODUCTION
REFERENCES
APPENDIX
CHAPTER 5
CONCLUDING THOUGHTS AND FUTURE DIRECTIONS
REFERENCES

LIST OF FIGURES

Figure 2-1. Conceptual overview of process to implement hygiene management
Figure 2-2. ATP swab failure rate over time
Figure 2-3. ATP and microbial swab failure rates by zone
Figure 2-4. Microbial loads of products before pasteurization
Figure 2-5. Microbial loads of products after pasteurization
Figure 3-1. Bacterial Family Relative Abundance
Figure 3-2. Bacterial alpha diversity comparisons
Figure 3-3. Bacterial Bray-Curtis NMDS ordination
Figure 3-4. Bacterial species PERMANOVA coefficients for product type
Figure 3-5. Core bacterial species and differential abundance analysis
Figure 3-6. Fungal Family Relative Abundance
Figure 3-7. Fungal alpha diversity comparisons
Figure 3-8. Fungal Bray-Curtis NMDS ordination
Figure 3-9. Fungal species PERMANOVA coefficients for product type
Figure 3-10. Core fungal species and differential abundance analysis
Figure 4-1. Chemical properties of kombucha samples
Figure 4-2. Chemical Mahalanobis NMDS ordination
Figure 4-3. Microbial loads of Kombucha samples plated on various media
Figure 4-4. Bacterial genera relative abundances
Figure 4-5. Bacterial Bray-Curtis NMDS ordination
Figure 4-6. Bacterial species PERMANOVA coefficients lactic acid content
Figure 4-7. Correlation between lactic acid and Lactobacillales relative abundance
Figure 4-8. Differentially abundant bacterial species between kombucha brands
Figure 4-9. Fungal species relative abundances
Figure 4-10. Fungal Bray-Curtis NMDS ordination
Figure 4-11. Differentially abundant fungal species between kombucha brands
Figure 4-12. Bacterial and fungal species cooccurrence network in kombucha
Appendix Figure 4-13. Correlation between ethanol content and relative sugar content versus the label
Appendix Figure 4-14. Correlation between ethanol content and pH
Appendix Figure 4-15. Fungal species relative abundance when using forward and reverse reads
Appendix Figure 4-16. Results of MaAsLin2 analysis of bacterial species associations with chemical and microbial load variables
Appendix Figure 4-17. Results of MaAsLin2 analysis of fungal species associations with chemical and microbial load variables
Appendix Figure 4-18. Differentially abundant bacterial species between kombucha samples due to the inclusion of ginger

LIST OF TABLES

Table 2-1. Swab failure rate by site
Table 3-1. Significantly differentially abundance core bacterial species
Table 3-2. Significantly differentially abundance core fungal species
Appendix Table 3-3. 16S primers used for Illumina sequencing 1st step PCR
Appendix Table 3-4. ITS primers used for Illumina sequencing 1st step PCR
Appendix Table 3-5. Alpha diversity metrics significance table

CHAPTER 1

A BRIEF CONTEXTUALIZATION OF FOOD SAFETY AND QUALITY MANAGEMENT WITHIN THE UNITED STATES ECONOMY AND CONSUMER PREFERENCES FOR MINIMALLY PROCESSED FOODS

The Economics of Foodborne Illness and Food Loss

Agriculture, food, and related industries contributed $1.25 trillion and 21.1 million jobs to the United States (US) economy in 2021, representing 5.4% of gross domestic product and 11% of the US workforce (1). Despite the scale of these industries and the long distances foods travel from farm to fork, which average approximately 1020 miles (1640 km) (2), the US has one of the highest quality and safest food supplies in the world (3rd in Quality and Safety, 2022, Global Food Security Index) (3). The two countries with food supplies that ranked higher than the US for quality and safety in 2022 were Canada and Denmark, first and second, respectively (3), both of which have populations at least 85% smaller than that of the US (4).

Food safety regulations that are reinforced by strong food safety culture (5) are primary contributors to the US’ success. Federal legislation has broadly required the safe production of foods in the US since Congress enacted the Pure Food and Drug Act and Federal Meat Inspection Act in 1906 (6). Since those initial acts, many amendments and additional laws have been enacted, most recently with the passage of the Food Safety Modernization Act (FSMA), which was signed into law by President Barack Obama on January 4, 2011 (7). The two major regulatory bodies responsible for food safety enforcement in the US are the Food Safety and Inspection Service (FSIS) and the Food and Drug Administration (FDA).
The FSIS regulates products containing meat, poultry, and eggs, which constitute 10-20% of the US food supply, whereas the FDA regulates everything else, which constitutes 80-90% of the US food supply (8).

Despite the US’ relatively safe food supply, a 2011 study estimated that 9.4 million illnesses, 56 thousand hospitalizations, and 1,400 deaths occur every year due to food-acquired bacterial, viral, and parasitic microorganisms (9). Most illnesses are attributed to viruses (5.5 million vs 3.6 million bacterial), whereas most hospitalizations and deaths are attributed to bacteria (9). The most common food source of bacterial pathogens is land animals, and the most common source of viral pathogens is plants (10). Monetary loss per illness ranges from $208 to $7.0 million, and the total annual cost of all foodborne illnesses is $36-56 billion (2015 estimates) (11, 12). Furthermore, the cost to a food producer implicated in selling products that caused a foodborne illness outbreak can reach hundreds of millions of dollars (13).

Quality, in addition to safety, is critically important to food producers and consumers. If foods are deemed too poor in quality to sell, which can occur for a variety of reasons, they are disposed of by landfilling, composting, anaerobic digestion, incineration, or other means (14). The disposal (loss) of food at the retail and consumer levels was 31% in the US in 2010, which equated to 67 million tons of food worth $162 billion (15). The energy and nutrient contents of food loss are substantial, representing 2% of the annual US energy consumption (16) and 141 trillion kcal (enough to feed 194 million people for a year at 2000 kcal/day) in 2010 (15). The top three categories of foods lost in 2010 were meat, poultry, and fish (30%), vegetables (19%), and dairy (17%) (15).

The combined economic loss due to foodborne illnesses and food loss is almost $300 billion every year (based on est.
~$56 billion due to illnesses in 2015 and $162 billion due to food loss in 2010, adjusted for inflation using the consumer price index) (11, 12, 15). As the above data illustrate, individual companies, industries, and the government should be economically motivated to improve food safety and quality.

Consumer Desires for Healthy and Minimally Processed Foods

Food processing is broadly defined as the act of preparing or transforming a raw agricultural commodity for human consumption, which may improve food safety, shelf-life, availability, convenience, affordability, and nutrition (17, 18). It is not a uniquely human trait, as evidenced by the storage of nuts by grey squirrels (19) or the use of stone tools by Burmese long-tailed macaques to process marine prey (20), but cooking food specifically is unique to humans and may be biologically obligatory (21). Agriculture and food processing were fundamental to the growth of the world’s population and civilizations and will remain fundamental to sustaining adequate nutrition for a growing world population (22).

Counterproductively for efforts to improve food safety and quality, consumers are increasingly interested in purchasing ‘clean label’ or minimally processed foods (23). There is no regulatory definition of clean label, nor is there a consensus as to what clean label means (24). Clean label products are often characterized by co-occurring themes of sustainability, nutrition equity and justice, and health and wellness (23–25), but a commonality of most clean label products is an ingredients label with components that an average consumer can understand and that do not sound ‘chemical-ly’. Reducing or eliminating chemical-sounding functional ingredients can worsen sensory qualities of products, but those negative changes may be overlooked by consumers because the product is clean label (26). This same logic cannot be applied to food safety due to federal regulations and because consumers highly value food safety.
Food producers struggle to replace known and effective food antimicrobials, such as sodium benzoate and sodium nitrite, that prevent the growth of pathogenic and spoilage-causing microorganisms (27), but may attempt to do so with naturally occurring antimicrobials (28).

Related to consumer perceptions of minimally processed foods and food producers’ need to reduce or eliminate the presence of microorganisms that may cause disease or spoilage, food producers are increasingly utilizing a variety of non-thermal processing methods. Such methods include high pressure processing, sterile filtration, ultraviolet light treatment, use of protective cultures, and others (29, 30). These methods are varyingly effective depending on the food matrix and target microorganisms but can provide improved nutritional and sensory characteristics compared to thermal processing (30). Although consumers may be averse to buying products explicitly marketed as being produced using such technologies because they are unfamiliar with them (30), they may be more likely to buy such products if they are marketed as ‘non-thermally treated’, with the implication that they are more minimally processed and fresher.

Tools Used to Investigate the Microbiological Landscapes of Food and Food Processing Facilities

Maintaining the sterility of food processing environments is impossible. Microorganisms can enter food processing facilities through three broad channels: raw materials, personnel, and the surrounding facility environment. Each of these channels is a likely source of pathogenic and spoilage-causing microorganisms. As most processing facilities contain an abundance of food residues and moisture, microorganisms will grow and may form resident communities via biofilm formation in drains, hard-to-clean portions of equipment, and building crevices. While typically avoided for most food products, such biofilms may be an integral requirement for complex fermented foods such as cheese (31).
It is the job of facility personnel to manage the relative risks of product contamination and ensure that products will not contain pathogenic microorganisms or other microorganisms that may cause premature product spoilage. Facility personnel accomplish this through the implementation of the hazard analysis and critical control point (HACCP) system and a variety of supporting programs, including sanitation standard operating procedures (SSOPs), environmental monitoring, raw materials sourcing specifications, and others. Depending on what type of product is being manufactured, a microbial load reduction may be required to remove any pertinent pathogens of concern. Although food producers are not mandated by FSIS or FDA to control spoilage microorganisms, some of the methods used to reduce or remove pathogens may also reduce or remove relevant spoilage microorganisms. Other times, additional processing treatments or food additives may be required to prevent product spoilage. As previously mentioned, it is more difficult to do this for clean label or minimally processed products.

A variety of tools are used by food manufacturers to track the microorganisms that are present in food production facilities and foods. These include biomarker tools, culture-based tools, and molecular sequencing-based tools. This dissertation presents examples of managing and investigating the quality of foods, specifically tofu, tempeh, and kombucha, using a biomarker tool, adenosine triphosphate (ATP) luminescence testing, and a molecular sequencing-based tool, next-generation amplicon sequencing. The choice of tools a producer uses is dictated by a range of factors, including the nature of the facility, the product, the processing steps, and the information the test is expected to yield.
Some tests such as ATP testing are used for monitoring, whereas next-generation amplicon sequencing may be used to address specific quality attributes that cannot be resolved through cheaper and easier means such as culture-based analyses or single-strain polymerase chain reaction (PCR) followed by Sanger sequencing. The benefits of using ATP testing and next-generation amplicon sequencing are discussed throughout this dissertation.

REFERENCES

1. Kassel K, Lanigan T, Martin A, Michael-Midkiff J, Russell D, Ruth T, Sanguinett C, Smits J. 2023. Selected Charts from Ag and Food Statistics: Charting the Essentials, February 2023. Economic Research Service 111.
2. Weber CL, Matthews HS. 2008. Food-miles and the relative climate impacts of food choices in the United States. Environ Sci Technol 42:3508–3513.
3. 2022. Global Food Security Index 2022. Economist Impact.
4. 2023. Country Comparisons - Population. The CIA World Factbook. https://www.cia.gov/the-world-factbook/field/population/country-comparison. Retrieved 4 April 2023.
5. Powell DA, Jacob CJ, Chapman BJ. 2011. Enhancing food safety culture to reduce rates of foodborne illness. Food Control 22:817–822.
6. Barkan ID. 1984. Industry Invites Regulation: The Passage of the Pure Food and Drug Act of 1906. Am J Public Health 75:18–26.
7. Strauss DM. 2011. An Analysis of the FDA Food Safety Modernization Act: Protection for Consumers and Boon for Business. Drug Law Journal 66:353–376.
8. Johnson R. 2016. The Federal Food Safety System: A Primer. Congressional Research Service RS22600.
9. Scallan E, Hoekstra RM, Angulo FJ, Tauxe R V., Widdowson M-A, Roy SL, Jones JL, Griffin PM. 2011. Foodborne Illness Acquired in the United States—Major Pathogens. Emerg Infect Dis 17:7–15.
10. Painter JA, Hoekstra RM, Ayers T, Tauxe R V., Braden CR, Angulo FJ, Griffin PM. 2013. Attribution of Foodborne Illnesses, Hospitalizations, and Deaths to Food Commodities by using Outbreak Data, United States, 1998–2008. Emerg Infect Dis 19:407–415.
11. Minor T, Lasher A, Klontz K, Brown B, Nardinelli C, Zorn D. 2015. The Per Case and Total Annual Costs of Foodborne Illness in the United States. Risk Analysis 35:1125–1139.
12. Scharff RL. 2015. State Estimates for the Annual Cost of Foodborne Illness. J Food Prot 78:1064–1071.
13. Hussain M, Dawson C. 2013. Economic Impact of Food Safety Outbreaks on Food Businesses. Foods 2:585–589.
14. Badgett A, Milbrandt A. 2021. Food waste disposal and utilization in the United States: A spatial cost benefit analysis. J Clean Prod 314:128057.
15. Buzby JC, Wells HF, Hyman J. 2014. The Estimated Amount, Value, and Calories of Postharvest Food Losses at the Retail and Consumer Levels in the United States. Economic Research Service 121.
16. Cuéllar AD, Webber ME. 2010. Wasted Food, Wasted Energy: The Embedded Energy in Food Waste in the United States. Environ Sci Technol 44:6464–6469.
17. Get the Facts: Food Processing. Institute of Food Technologists. https://www.ift.org/-/media/policy-advocacy/ift-comments/efsa/ift-food-processing-toolkit.pdf. Retrieved 3 April 2023.
18. van Boekel M, Fogliano V, Pellegrini N, Stanton C, Scholz G, Lalljie S, Somoza V, Knorr D, Jasti PR, Eisenbrand G. 2010. A review on the beneficial aspects of food processing. Mol Nutr Food Res 54:1215–1247.
19. Hopewell LJ, Leaver LA. 2008. Evidence of Social Influences on Cache-Making by Grey Squirrels (Sciurus carolinensis). Ethology 114:1061–1068.
20. Gumert MD, Malaivijitnond S. 2012. Marine prey processed with stone tools by Burmese long-tailed macaques (Macaca fascicularis aurea) in intertidal habitats. Am J Phys Anthropol 149:447–457.
21. Wrangham R, Conklin-Brittain N. 2003. ‘Cooking as a biological trait.’ Comp Biochem Physiol A Mol Integr Physiol 136:35–46.
22. Floros JD, Newsome R, Fisher W, Barbosa-Cánovas G V., Chen H, Dunne CP, German JB, Hall RL, Heldman DR, Karwe M V., Knabel SJ, Labuza TP, Lund DB, Newell-McGloughlin M, Robinson JL, Sebranek JG, Shewfelt RL, Tracy WF, Weaver CM, Ziegler GR. 2010. Feeding the World Today and Tomorrow: The Importance of Food Science and Technology. Compr Rev Food Sci Food Saf 9:572–599.
23. Dornblaser L. 2020. Clean Label: Shifting consumer perceptions. Mintel.
24. Asioli D, Aschemann-Witzel J, Caputo V, Vecchio R, Annunziata A, Næs T, Varela P. 2017. Making sense of the “clean label” trends: A review of consumer food choice behavior and discussion of industry implications. Food Research International 99:58–71.
25. Bartelme MZ, Mattucci S, Zegler J, Beckett A, Koyenikan A, Hong TH, Faulkner D, Li D, Henry C, Gilsogamo AP. 2022. 2023 Global Food & Drink Trends. Mintel.
26. Maruyama S, Lim J, Streletskaya NA. 2021. Clean Label Trade-Offs: A Case Study of Plain Yogurt. Front Nutr 8:704473.
27. Erickson MC, Doyle MP. 2017. The Challenges of Eliminating or Substituting Antimicrobial Preservatives in Foods. Annu Rev Food Sci Technol 8:371–390.
28. Juneja VK, Dwivedi HP, Yan X. 2012. Novel Natural Food Antimicrobials. Annu Rev Food Sci Technol 3:381–403.
29. Jadhav HB, Annapure US, Deshmukh RR. 2021. Non-thermal Technologies for Food Processing. Front Nutr 8:657090.
30. dos Santos Rocha C, Magnani M, de Paiva Anciens Ramos GL, Bezerril FF, Freitas MQ, Cruz AG, Pimentel TC. 2022. Emerging technologies in food processing: impacts on sensory characteristics and consumer perception. Curr Opin Food Sci 47:100892.
31. Bokulich NA, Mills DA. 2013. Facility-Specific “House” Microbiome Drives Microbial Landscapes of Artisan Cheesemaking Plants. Appl Environ Microbiol 79:5214–5223.
CHAPTER 2

IMPLEMENTATION OF ATP AND MICROBIAL INDICATOR TESTING FOR HYGIENE MONITORING IN A TOFU PRODUCTION FACILITY IMPROVES PRODUCT QUALITY AND HYGIENIC CONDITIONS OF FOOD CONTACT SURFACES: A CASE STUDY

ABSTRACT

Rapid ATP testing and microbiological enumeration are two common methods to monitor the effectiveness of cleaning and sanitation in the food industry. In this study, ATP testing and microbiological enumeration were implemented at a tofu production facility with the goal of improving cleaning practices and overall plant hygiene. Results from ATP monitoring were used to target areas of the production environment needing additional cleaning. ATP results were verified by microbiological enumeration of aerobic microorganisms, lactic acid bacteria, and yeasts and molds. Products from the production line were enumerated for the same microorganisms to determine if there was an impact on product quality. After the implementation of ATP monitoring and targeted cleaning, a significantly lower proportion of swabs failed to meet established sanitary requirements for ATP, aerobic microorganisms, and lactic acid bacteria (p < 0.05), but not for yeasts and molds. ATP swabs and microbiological enumeration agreed on site hygiene 75.1% (72.3-77.7%, 95% CI) of the time. Product data indicated that unpasteurized finished products contained a significantly lower microbial load of the three groups of organisms following implementation of targeted cleaning (p < 0.05).

IMPORTANCE

Cleaning and sanitation are critical to maintaining safe and high-quality food production. Monitoring these activities is important to ensure proper execution of procedures and to assure compliance with regulatory guidelines. The results from monitoring activities can direct targeted cleaning of areas with higher risk of contamination from foodstuffs and microorganisms.
The results of this study show that ATP monitoring and microbiological enumeration are useful tools to verify and improve the efficacy of cleaning and sanitation practices, which can have a positive impact on both plant hygiene and product quality. However, testing regimes and critical parameters will vary based on the product and facility.

This study was published in Applied and Environmental Microbiology: Sogin JH, Lopez-Velasco G, Yordem B, Lingle CK, David JM, Çobo M, Worobo RW. 2021. Implementation of ATP and Microbial Indicator Testing for Hygiene Monitoring in a Tofu Production Facility Improves Product Quality and Hygienic Conditions of Food Contact Surfaces: a Case Study. Appl Environ Microbiol 87:e02278. https://doi.org/10.1128/AEM.02278-20.

INTRODUCTION

Many factors contribute to the microbial ecology of a food processing plant, including the product, processing steps, and scale of the operation. However, all food processing facilities, regardless of product or size, must maintain a sanitary processing environment (9 CFR § 416, 21 CFR § 117) (1, 2). Companies execute Sanitary Standard Operating Procedures (SSOPs) to reduce and control the infiltration, movement, and growth of microorganisms in a plant (e.g., use of personal protective equipment, water treatment, air filtering, pest exclusion), but these programs do not and cannot completely prevent microorganisms from entering the processing environment. A major component of SSOPs involves cleaning and sanitizing the processing line and surrounding areas. Both the Food & Drug Administration and United States Department of Agriculture require that food-contact and non-food-contact surfaces be cleaned and sanitized as frequently as necessary to prevent contamination of products (9 CFR § 416.4, 21 CFR § 117.35) (3, 4).
These rules exist to prevent product contamination with hazards – “any biological, chemical (including radiological), or physical agent that has the potential to cause illness or injury” (21 CFR § 117.3, see 9 CFR § 417.1 for equivalent USDA definition) (5, 6). However, cleaning and sanitation (as well as other SSOPs) also control spoilage microorganisms.

Cleaning and sanitation involve removal of material build-up from surfaces and subsequent application of a substance to reduce target microorganisms to an acceptable level. Cleaning should specifically remove proteins, carbohydrates, FOG (fats, oils, and grease), minerals, and water (7). These substances are growth substrates for various pathogenic and non-pathogenic microorganisms and quench the effectiveness of sanitizers by serving as off-target substrates for sanitizing compounds. Thus, sanitation is only effective if adequate cleaning precedes it. Cleaning and sanitation are performed before and after production runs by a qualified and trained team; efforts are prioritized based on production zones, which are distinguished by their likelihood of contaminating food. Zone 1 corresponds to food-contact surfaces (highest risk of introducing contamination), Zone 2 corresponds to non-food-contact surfaces near Zone 1, Zone 3 corresponds to more distant non-food-contact surfaces than Zone 2, and Zone 4 corresponds to surfaces outside the production room (lowest risk of introducing contamination) (8). An individual on the cleaning and sanitation team utilizes visual inspection and timers to monitor progress and determine the endpoint of cleaning and sanitation activities. However, visual inspection has limited effectiveness, as food residues and/or microorganisms may be present even on a visually clean surface (9). Visual inspection should be expected, but companies use other methodologies to verify the efficacy of cleaning and sanitation.
Hygiene monitoring is the regular, systematic, and site-specific testing of a processing plant for an attribute relevant to the processing environment to verify the efficacy of a sanitation program or overall cleanliness of the plant (10). Hygiene monitoring differs from environmental monitoring in that it does not identify specific organisms, namely environmental pathogens, but rather detects broad groups of microorganisms and/or food residues relevant to the processing environment, thus enabling the identification of sites that can harbor or support the growth of microorganisms due to the presence of food residues. Despite its non-specific nature, hygiene monitoring is an important activity that supports the effectiveness of both food safety and food quality programs.

Two common hygiene monitoring methods are culture-based quantification assays and rapid indicator testing for biological markers. Culture-based enumeration provides quantification of microorganisms at a site, but is biased by sampling procedure, processing, and growth conditions (e.g., media composition, growth temperature, growth time) (11). The use of different growth media for hygiene monitoring selects for various groups of microorganisms, but not individual taxonomic clades. Unlike environmental monitoring, hygiene monitoring seldom integrates sequencing tools (i.e., Sanger sequencing and next-generation platforms) for regular use; nonetheless, and beneficially, culture-based enumeration allows for the isolation and identification of suspected spoilage and/or residential microorganisms. Because culture-based methods rely on the growth of microorganisms, results are obtained after a minimum of 24-72 hours, depending on the group of microorganisms in question (12, 13).
Thus, enumeration-based monitoring tools lead to strictly reactive, but not real-time, solutions to deviations from cleaning and sanitation, delaying implementation of corrective actions when improper cleaning has been performed. Further, enumeration requires use of a dedicated laboratory space, trained personnel to process samples, and the purchase of several additional laboratory materials (e.g., growth media, incubators, pipettes), so only companies able to regularly pay for and manage or outsource these activities can effectively utilize culture-based monitoring tools.

Conversely, rapid tests can be used to detect the presence of biological markers resulting from microbial metabolism or food. Rapid tests are easy to conduct, fast, portable, and can direct real-time improvements to cleaning and sanitation regimes. Unlike culture-based monitoring assays, rapid tools require neither a dedicated laboratory space nor extensively trained personnel to conduct the tests. One of the most common rapid methodologies tests for the presence of adenosine triphosphate (ATP), a molecule that is produced by all living cells and is detected via an enzymatic assay utilizing a luciferin/luciferase complex for light production (14). Light production can be measured as relative light units (RLU), thus converting ATP levels at individual sites to numerical values. The amount of light produced is directly proportional to the level of ATP and is therefore used to assess the cleanliness of a site. Compared to culture-based methods, ATP detection has no specificity for particular groups of microbes (spoilage and/or pathogenic) because the structure of ATP is identical in all cells; ATP in food processing environments is derived from microorganisms, food residues, and other organic matter (15). Therefore, it can help identify niche sites where food is not efficiently removed during cleaning and sanitation (15).
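As an illustration of how RLU readings can be turned into site-level hygiene decisions, the following minimal Python sketch classifies individual swab readings against a single action limit and summarizes the proportion of failed swabs by production zone. The limit of 150 RLU, the function names, and the example readings are hypothetical and are not taken from this study; real limits depend on the luminometer, the swab chemistry, and facility-specific validation.

```python
from collections import defaultdict

# Hypothetical action limit in relative light units (RLU). This value is an
# assumption for illustration only, not a limit used in this study.
FAIL_LIMIT_RLU = 150


def classify_swab(rlu):
    """Return 'pass' or 'fail' for a single ATP swab reading."""
    if rlu < 0:
        raise ValueError("RLU readings cannot be negative")
    return "fail" if rlu > FAIL_LIMIT_RLU else "pass"


def failure_rate_by_zone(readings):
    """Return the fraction of failed swabs per zone.

    `readings` is an iterable of (zone, rlu) pairs, where zone is 1-4.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for zone, rlu in readings:
        totals[zone] += 1
        if classify_swab(rlu) == "fail":
            failures[zone] += 1
    return {zone: failures[zone] / totals[zone] for zone in totals}


# Made-up example readings as (zone, RLU) pairs.
example_readings = [(1, 40), (1, 520), (1, 90), (2, 90), (2, 310), (3, 30)]
rates = failure_rate_by_zone(example_readings)
for zone in sorted(rates):
    print(f"Zone {zone}: {rates[zone]:.0%} of swabs failed")
```

Sites or zones with persistently high failure rates in such a summary would be candidates for targeted cleaning.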
Compared to culture-based enumeration, ATP detection is more rapid (providing results in minutes) and accessible to a greater range of food production facilities for cleaning verification and hygiene monitoring. However, ATP monitoring, like culture-based methods, is susceptible to sampling biases (e.g., swabbing pattern, swabbing area, surface characteristics); it is also affected by degradation of ATP or interference with the assay due to cleaning and sanitation compounds, and it cannot effectively detect spores (16). Even though ATP testing is usually performed after every cleaning and sanitation operation, microbial quantification should be performed periodically, in addition to ATP testing, to verify that sanitation is effective. The use of both microbial and ATP monitoring can provide a robust set of data that verifies the efficacy of SSOPs (9).

To effectively verify and improve cleaning and sanitation processes, hygiene monitoring programs need to address the frequency of testing, the location of test sites, and actionable limits for tests. These considerations are product and process specific but require a systematic framework to implement. Selecting sampling sites may require mapping the complete facility and production process, dividing the facility into zones based on microbiological risk to the product, and completing an assessment of the most appropriate test sites (8). Test sites should be selected after conducting an appropriate risk analysis to understand the risks associated with sites given the processing stage, proximity to food, potential for cross-contamination, ease of cleaning, and condition of the surface being tested (17). A conceptual overview of the process to implement hygiene monitoring is presented in Figure 2-1.

Figure 2-1. Conceptual overview of process to implement hygiene management

In this study, targeted cleaning directed by site-specific ATP bioluminescence detection was implemented as a measure to improve environmental cleanliness of a tofu production facility. This process was chosen for study because tofu production applies few hurdles to control microbial growth and is therefore particularly susceptible to spoilage (18). Thus, it was hypothesized that targeted improvements to the cleanliness of the processing environment, monitored by ATP luminescence detection, would improve the microbiological quality of products and the processing environment. This study was conducted over three phases (Figure 2-1): establishment of a baseline hygiene level of the plant and products without targeted cleaning (Phase 1), implementation of targeted cleaning practices directed by ATP results from Phase 1 while maintaining extensive ATP testing to verify efficacy (Phase 2), and maintenance of cleaning and sanitation practices with reduced ATP testing (Phase 3). ATP testing was complemented by culture-based testing of environmental and product samples for three groups of target microorganisms: total aerobic microorganisms, yeasts and molds, and lactic acid bacteria, selected because they are common measures of environmental cleanliness and because they often cause food spoilage (19–21).

MATERIALS AND METHODS

Facility and Process Overview

This study was conducted in a medium-sized facility that produces only soy-based products. In brief, the tofu produced at this facility is made from coagulated soymilk extracted from hydrated soybeans; extraction occurs at approximately 88℃ (190°F). After coagulation, the curd is pressed, cut, water-cooled, and vacuum packaged for retail or institutional use (pre-pasteurization).
After packaging, products are pasteurized in package, water-cooled, and refrigerated for cold chain distribution (post-pasteurization); the shelf life of products is declared as 60 days. The facility in which this study was conducted was chosen due to a professional relationship between the authors and facility management.

Study Design and Implementation

The goal of this study was to investigate the impact of targeted cleaning on the microbial quality of the environment and finished products. Microbiological environmental and product testing occurred in two phases for analysis: pre-intervention (Phase 1) and post-intervention (Phase 3) of targeted cleaning activities. Prior to the start of the study, 30 sites (21 Zone 1, 9 Zone 2) were identified by the authors and facility management for ATP monitoring and microbiological enumeration. These sites were chosen based on phase of production and relative cleaning and sanitation difficulty (i.e., sites deemed harder to clean were favored for testing over others); the sites and zone designations are listed in Table 2-1. ATP monitoring occurred over three phases: Phase 1 (baseline assessment), verification of cleaning and sanitation procedures utilizing extensive ATP testing (30 sites targeted per day); Phase 2, post-implementation of targeted cleaning, maintaining extensive ATP testing (30 sites targeted per day); and Phase 3, post-implementation with maintenance of cleaning and sanitation practices and reduced ATP testing (18 randomized sites targeted per day). Reduction and randomization of sites for ATP testing in Phase 3 were performed utilizing the randomization function of the 3M™ Clean-Trace™ Data Management Software (v 1.3.0.0).

To establish a baseline hygiene level of the facility and products, the pre-intervention phase of the study occurred over 3 weeks following the facility’s normal cleaning and sanitation program.
After this, an adjustment period of 6 weeks was allowed for the cleaning and sanitation crew to adopt targeted cleaning practices indicated by Phase 1 data. Targeted cleaning practices incorporated results from ATP and microbiological testing, described below, into the cleaning and sanitation program. Management informed the crew which sites consistently had high levels (RLU/swab > 500) of ATP during the pre-intervention phase, and the cleaning and sanitation crew subsequently targeted those sites for enhanced cleaning. Enhanced cleaning included increased time spent cleaning portions of the line associated with the pre-identified sites, and some disassembly of equipment to access hard-to-clean areas. Sanitation proceeded as usual after cleaning. Following the adjustment period, the post-intervention phase of the study occurred over 16 weeks to determine the impact of targeted cleaning.

Biochemical and Microbiological Testing

During Phases 1, 2, and 3, ATP was quantified at the 30 predetermined sites using 3M™ Clean-Trace™ Surface ATP Swabs (3M Company, St. Paul, MN) and the 3M™ Clean-Trace™ Hygiene Monitoring System LM1 Luminometer (v 1.1.0.0) (3M Company, St. Paul, MN) immediately following cleaning and sanitation according to manufacturer instructions. During Phases 1 and 3, yeasts and molds, lactic acid bacteria, and aerobic microorganisms were quantified from a single complementary 3M™ Quick Swab (3M Company, St. Paul, MN) taken adjacent (< 15 cm or 6”) to the area swabbed for ATP. Samples were collected by swabbing an area of approximately 100 cm² in two directions, horizontally and vertically. Both the ATP and microbiological swabs were taken by a trained member of the cleaning and sanitation crew; thus, the swabbing portion of this study represents ‘real-world’ execution. Microbiological swabs were kept refrigerated (< 4℃) and processed within 1-6 days (dictated by transportation time, shift, and day of the week).
Swabs were serially diluted in Butterfield’s buffer (3M, St. Paul, MN) and plated onto 3M™ Petrifilm™ Plates according to manufacturer instructions: Rapid Yeast and Mold Count Plates (RYM; 3 days at 25℃), Lactic Acid Bacteria Count Plates (LAB; 2 days at 30℃), and Rapid Aerobic Count Plates (RAC; 1 day at 35℃).

Two packaged products from each production lot, one pre-pasteurized and the other post-pasteurized, were taken directly from the production line during Phases 1 and 3. These products were kept refrigerated (< 4℃) and microbiologically characterized within 1-6 days. In brief, one 25 g subsample of tofu was aseptically removed from each packaged product and stomached in a sterile filter bag with 225 mL 0.1% peptone water (Becton Dickinson, Franklin Lakes, NJ); stomaching occurred at 200 RPM for 90 s. Samples were serially diluted with Butterfield’s buffer and plated onto RYM, LAB, and RAC plates according to manufacturer instructions (see above).

Data and Statistical Analyses

Quantitative data collected from ATP and microbiological swabs were transformed to binary pass/fail values with the following cutoffs, i.e., the minimum sanitary requirements (specific to this study): ATP – RLU/swab (100 cm²) > 500, RYM – log10 CFU/swab > 1.30 (20 CFU), LAB – log10 CFU/swab > 2.30 (200 CFU), and RAC – log10 CFU/swab > 2.30. Measurements that exceeded these cutoffs failed the test, indicating that the site was not adequately cleaned and sanitized for processing, i.e., these swabs failed to meet the minimum sanitary requirements. These values were chosen based on manufacturer recommendations, in coordination with plant management, and based on the authors’ experience. They are in alignment with previously established levels (22, 23), but it is important to note that present-day hygiene monitoring emphasizes risk-based decision making and thus these cutoffs will vary depending on the product and process (24).
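The pass/fail transformation described above can be sketched in a few lines. The original analysis was carried out in R; the Python below is only an illustrative sketch, and the function name `classify_swab` and the example readings are hypothetical. The cutoffs are those stated in the text, and a measurement is treated as failing only when it strictly exceeds its cutoff.

```python
# Cutoffs from the text: ATP in RLU/swab, microbial groups in log10 CFU/swab.
CUTOFFS = {"ATP": 500.0, "RYM": 1.30, "LAB": 2.30, "RAC": 2.30}


def classify_swab(readings):
    """Map each measurement to 'pass' or 'fail'.

    A measurement fails when it is strictly greater than its cutoff,
    mirroring the '>' comparisons given in the text.
    """
    return {
        test: "fail" if value > CUTOFFS[test] else "pass"
        for test, value in readings.items()
    }


# Hypothetical readings for one site on one day (not data from the study).
example = {"ATP": 742.0, "RYM": 0.95, "LAB": 2.41, "RAC": 1.80}
result = classify_swab(example)
print(result)
```

Keeping the cutoffs in a single mapping makes it straightforward to substitute facility-specific limits, which the text notes will differ by product and process.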
Quantitative data obtained from microbiological analysis of finished products were not transformed, except that values less than or greater than the limits of detection (based on the chosen dilutions and countable ranges for each of the utilized media) were set to the corresponding limit of detection. Data analysis was conducted in R (v4.0.2) (25) using RStudio (v1.3.1073) (26) with the following packages: readxl (v1.3.1) (27), dplyr (v1.0.2) (28), ggpubr (v0.4.0) (29), tidyr (v1.1.1) (30), and kableExtra (v1.2.1) (31). Trending data were visualized via locally estimated scatterplot smoothing (LOESS) with the following parameters: span = 0.75, degree = 2, and confidence interval = 95%. All statistical tests were conducted as two-sided tests with α = 0.05. Binary data were analyzed using the binomial distribution to obtain 95% confidence intervals for groups; groups were compared using Fisher’s exact test. Quantitative product data were compared using the nonparametric Mann-Whitney U test due to the skewed nature of the data. Agreement between the results from ATP swabs and microbiological swabs was analyzed using the sampling distribution to obtain 95% confidence intervals for groups.

Data Availability

The data and code used to draw conclusions in this study are deposited in a Zenodo repository under doi: 10.5281/zenodo.4287499 (32).

RESULTS

Environmental Quality

ATP, yeasts and molds, lactic acid bacteria, and aerobic microorganisms were quantified from swabs at 30 predetermined sites (21 Zone 1, 9 Zone 2), then transformed to binary pass/fail results based on predetermined cutoffs specific to each measure to determine the impact of targeted cleaning on the hygiene of the processing environment. After excluding data due to excessive sample processing time (> 6 days), a total of 5,196 measurements were retained across all phases of the study.
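The comparisons reported in this section rest on 2×2 tables of fail/pass counts for Phase 1 and Phase 3. As a rough illustration of how such a comparison works, the sketch below computes the reduction in failure proportion and a two-sided Fisher's exact p-value using only the Python standard library. It is not the authors' code (their analysis was done in R and is deposited on Zenodo); the function name and the counts are hypothetical, and the two-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are phases; columns are fail and pass counts. The p-value sums the
    hypergeometric probabilities of every table with the same margins that
    is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x failures in row 1 given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lo, hi = max(0, col1 - row2), min(col1, row1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts for one site (not data from the study).
fail_p1, pass_p1 = 8, 2   # Phase 1: 8 of 10 swabs failed
fail_p3, pass_p3 = 2, 8   # Phase 3: 2 of 10 swabs failed

# Reduction in the proportion of failing swabs, Phase 1 minus Phase 3.
reduction = fail_p1 / (fail_p1 + pass_p1) - fail_p3 / (fail_p3 + pass_p3)
p_value = fisher_exact_two_sided(fail_p1, pass_p1, fail_p3, pass_p3)
print(f"reduction = {reduction:.1%}, p = {p_value:.4f}")
```

With these made-up counts the failure proportion drops by 60 percentage points and the difference is significant at α = 0.05; with the small per-site sample sizes typical of swab data, more modest reductions often will not reach significance.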
Over the course of the study, the proportion of sites failing to meet the minimum sanitary requirement day to day, based on ATP swabs, was highest during Phase 1 and then steadily decreased during Phase 2 before leveling off in Phase 3 (Figure 2-2); this indicated that targeted cleaning was improving the cleanliness of the facility.

Figure 2-2. ATP swab failure rate over time
Trend lines are locally fitted polynomial regressions computed via the LOESS method, grouped by zone. Vertical lines correspond to the separation of the three phases utilized in this study (Phase 1: pre-intervention – 30 sites targeted per day (21 Z1, 9 Z2); Phase 2: post-intervention – 30 sites targeted per day (21 Z1, 9 Z2); Phase 3: post-intervention – 18 randomized sites targeted per day (12 Z1, 6 Z2)).

When aggregated by zone, the results show that targeted cleaning significantly lowered the proportion of swabs that failed to meet the minimum sanitary requirements between Phases 1 and 3 for lactic acid bacteria and aerobic microorganisms in both Zones 1 and 2, but did not significantly change the proportion of swabs that failed to meet the minimum sanitary requirement for yeasts and molds in either Zone 1 or Zone 2 (p < 0.05, Fisher’s exact test) (Figure 2-3). The reduction between Phases 1 and 3 was larger for aerobic microorganisms (21.8% – Z1, 26.8% – Z2) compared to lactic acid bacteria (9.7% – Z1, 14.1% – Z2), and was reflected in the significantly lower proportion of ATP swabs that failed to meet the minimum sanitary requirements in both Zones 1 and 2 (p < 0.05, Fisher’s exact test) (Figure 2-3). The reduction between Phases 1 and 3 was largest for ATP swabs among all the tests (26.5% – Z1, 51.0% – Z2).
Finally, there was a significantly higher proportion of swabs in Zone 2 that failed to meet the minimum sanitary requirements compared to Zone 1 for only two groups: ATP swabs in Phase 1 and lactic acid bacteria swabs in Phase 3 (though the difference between Zones 1 and 2 for lactic acid bacteria was small – 2.4%).

Figure 2-3. ATP and microbial swab failure rates by zone
Proportion of swabs, aggregated by zone, failing to meet the minimum sanitary requirements based on the measurement of adenosine triphosphate (ATP), yeasts and molds (RYM), lactic acid bacteria (LAB), and aerobic microorganisms (RAC) during Phases 1 and 3. Error bars represent the 95% confidence interval for each group based on the binomial distribution. Asterisks correspond to a significant difference between phases for a given zone (p < 0.001, Fisher’s exact test).

When aggregated by site, as opposed to zone, targeted cleaning caused a decrease in the proportion of swabs failing to meet the minimum sanitary requirements between Phases 1 and 3 for the majority of sites (Table 2-1). Though some sites exhibited an increase in the proportion of failing swabs, these increases were not significant (p > 0.05, Fisher’s exact test). There was a significant decrease in the proportion of swabs that failed to meet the minimum sanitary requirement for two sites when quantifying lactic acid bacteria (1 – Z1, 1 – Z2) and nine sites when quantifying aerobic microorganisms (6 – Z1, 3 – Z2); targeted cleaning did not result in a significant decrease in the proportion of swabs that failed to meet the minimum sanitary requirement for any sites when measuring yeasts and molds (p < 0.05, Fisher’s exact test).
Similar to when data were aggregated by zone, the significant decrease in the proportion of swabs failing to meet the minimum sanitary requirements across specific sites for lactic acid bacteria and aerobic microorganisms was reflected in 14 sites that showed a significant decrease in the proportion of swabs failing to meet the minimum sanitary requirement when measuring ATP (9 – Z1, 5 – Z2). The reduction in the proportion of failing swabs for all metrics between the pre- and post-intervention phases of study across all sites is presented in Table 2-1.

Table 2-1. Swab failure rate by site
Reduction in the proportion of swabs failing to meet the minimum sanitary requirements across all sites and measurements between Phases 1 and 3

| Zone | Site | ATP (a) Red. (e) | ATP Sig. (f) | RYM (b) Red. | RYM Sig. | LAB (c) Red. | LAB Sig. | RAC (d) Red. | RAC Sig. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 01. Soybean Hopper Corner | 49.6% | *** | 24.9% | ns | 79.1% | *** | -15.8% | ns |
| 1 | 02. Auger shaft - flexicon | 52.9% | *** | -7.9% | ns | -5.5% | ns | 12.6% | ns |
| 1 | 03. Slurry tank inside | 0.0% | ns | 0.0% | ns | 7.7% | ns | 15.4% | ns |
| 1 | 04. Bulk (Soymilk) tank inside | 23.1% | ns | 0.0% | ns | 7.7% | ns | 23.1% | ns |
| 1 | 05. Bulk (Soymilk) tank agitator | 92.3% | *** | -7.9% | ns | 15.4% | ns | 20.4% | * |
| 1 | 06. Roller extractor shaft | 30.8% | * | 3.1% | ns | 23.1% | ns | 24.1% | ns |
| 1 | 07. Roller extractor roller | 38.5% | * | 9.1% | ns | 9.1% | ns | 24.5% | ns |
| 1 | 08. Bucket inside | 40.9% | * | 1.8% | ns | -5.9% | ns | 23.1% | ns |
| 1 | 09. Bucket agitator | 23.1% | ns | -15.9% | ns | 0.0% | ns | 38.5% | * |
| 1 | 10. Bucket turbulent stick | 56.2% | *** | -2.6% | ns | 7.7% | ns | 38.5% | *** |
| 1 | 11. Curd holding tank/ Curd transfer barrel | 47.9% | *** | -3.0% | ns | 5.1% | ns | 12.8% | ns |
| 1 | 12. Conveyor belt white mat/ Auto press belt | 20.3% | ns | -0.2% | ns | 0.0% | ns | 15.4% | ns |
| 1 | 13. Conveyor green plastic side belt | 7.7% | ns | 1.8% | ns | 23.1% | ns | 30.8% | * |
| 1 | 14. Chain conveyor/ Transfer conveyor | 49.1% | ** | 1.8% | ns | 0.0% | ns | 15.4% | ns |
| 1 | 15. Chilling tank smooth surface/ Conveyor tank | 0.0% | ns | -11.8% | ns | 0.0% | ns | 7.7% | ns |
| 1 | 16. Chilling tank inside corner | 0.0% | ns | 0.0% | ns | 0.0% | ns | 0.0% | ns |
| 1 | 17. Chilling tank conveyor | 23.1% | ns | -12.3% | ns | 0.0% | ns | 15.4% | ns |
| 1 | 18. Chilling tank roller shaft | 0.0% | ns | -7.7% | ns | 23.1% | ns | 38.5% | * |
| 1 | 19. Chilling tank roller sprocket | 16.7% | ns | 16.8% | ns | 23.1% | ns | 38.5% | * |
| 1 | 20. Overflow tofu tank inside/ Rolling tanks | 25.8% | ns | -3.4% | ns | 7.7% | ns | -3.4% | ns |
| 1 | 21. Overflow tofu tank corner | 7.7% | ns | -7.1% | ns | 0.0% | ns | 0.0% | ns |
| 2 | 50. MV4 HMI Screen | 85.6% | *** | 10.1% | ns | 7.7% | ns | 4.9% | ns |
| 2 | 51. MV4 HMI Screen control button and E-stop | -5.5% | ns | 10.0% | ns | 15.4% | ns | 10.0% | ns |
| 2 | 52. MV4 film Rollers | 89.2% | *** | -0.6% | ns | 7.7% | ns | 20.3% | ns |
| 2 | 53. Rolling rack | 40.5% | * | 3.1% | ns | 7.7% | ns | 23.1% | ns |
| 2 | 54. Rolling rack trays | -11.8% | ns | 2.1% | ns | 7.7% | ns | 10.8% | ns |
| 2 | 55. MV side rail | 23.8% | ns | 1.4% | ns | 23.1% | ns | 26.0% | ns |
| 2 | 56. Chiller tank outside/ Conveyor tank outside | 9.7% | ns | -11.1% | ns | 7.7% | ns | 30.8% | * |
| 2 | 57. Waterpack control panel buttons | 76.5% | *** | -10.8% | ns | 28.1% | * | 51.1% | *** |
| 2 | 58. Waterpack upper guide rails prior to sealer | 38.5% | ** | 8.7% | ns | 23.1% | ns | 61.5% | *** |

(a) Adenosine triphosphate, (b) yeasts and molds, (c) lactic acid bacteria, and (d) aerobic microorganisms.
(e) Reduction calculated as P(fail, Phase 1) – P(fail, Phase 3).
(f) Significance according to Fisher’s exact test: ns – not significant, * – p < 0.05, ** – p < 0.01, *** – p < 0.001.

Microbiological Product Quality

Tofu products taken from the line during production were sampled for yeasts and molds, lactic acid bacteria, and aerobic microorganisms to determine the impact of targeted cleaning on the quality of finished products. Among pre-pasteurized products, the mean rank log10 CFU/g of post-intervention products (n = 68) was significantly lower than that of pre-intervention products (n = 19) for yeasts and molds, lactic acid bacteria, and aerobic microorganisms (p < 0.05, Mann-Whitney U test) (Figure 2-4).
Among post-pasteurized products, the mean rank log10 CFU/g of post-intervention products (n = 69) was not significantly different from that of pre-intervention products (n = 19) for yeasts and molds, lactic acid bacteria, and aerobic microorganisms (p > 0.05, Mann-Whitney U test) (Figure 2-5).

Figure 2-4. Microbial loads of products before pasteurization
Microbial load of yeasts and molds (RYM), lactic acid bacteria (LAB), and aerobic microorganisms (RAC) in packaged tofu products sampled pre-pasteurization. Red lines correspond to the limits of detection for each group of interest. Numbers correspond to the p value between Phases 1 and 3.

Figure 2-5. Microbial loads of products after pasteurization
Microbial load of yeasts and molds (RYM), lactic acid bacteria (LAB), and aerobic microorganisms (RAC) in packaged tofu products sampled post-pasteurization. Red lines correspond to the limits of detection for each group of interest. Numbers correspond to the p value between Phases 1 and 3.

ATP and Microbiological Swab Agreement

Because microbiological swabs were taken immediately adjacent to ATP swabs, it was possible to analyze post hoc the agreement between ATP swabs and microbiological swabs at those sites. In total, 960 samples over the pre- and post-intervention phases of study were quantified for both ATP and viable microorganisms. Data were transformed to binary pass/fail results based on the same predetermined cutoffs as previously mentioned, but rather than analyze each microbiological group individually (i.e., yeasts and molds, lactic acid bacteria, and aerobic microorganisms), a binary transformation was made such that if any of the microbiological measurements exceeded their respective cutoffs, then the site failed to meet the minimum sanitary requirement. For a site to pass the minimum microbiological sanitary requirements, every microbiological group had to be below its individual predetermined cutoff.
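The combined microbiological pass/fail rule and the agreement tally just described can be sketched as follows. This is an illustration only, not the authors' R code: the function names and the four example swabs are hypothetical. The text attributes its intervals to "the sampling distribution" without naming a formula, so the Wilson score interval used here is an assumption and simply one common choice for a binomial proportion.

```python
from statistics import NormalDist

# Cutoffs from the text: ATP in RLU/swab, microbial groups in log10 CFU/swab.
CUTOFFS = {"ATP": 500.0, "RYM": 1.30, "LAB": 2.30, "RAC": 2.30}
MICRO_GROUPS = ("RYM", "LAB", "RAC")


def atp_fails(swab):
    return swab["ATP"] > CUTOFFS["ATP"]


def micro_fails(swab):
    # The site fails if any microbial group exceeds its own cutoff.
    return any(swab[g] > CUTOFFS[g] for g in MICRO_GROUPS)


def tally_agreement(swabs):
    """Count agreeing pairs and the two kinds of disagreement."""
    counts = {"agree": 0, "atp_fail_only": 0, "micro_fail_only": 0}
    for swab in swabs:
        atp, micro = atp_fails(swab), micro_fails(swab)
        if atp == micro:
            counts["agree"] += 1
        elif atp:
            counts["atp_fail_only"] += 1
        else:
            counts["micro_fail_only"] += 1
    return counts


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * ((p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5) / denom
    return centre - half, centre + half


# Hypothetical paired results (not data from the study).
swabs = [
    {"ATP": 120.0, "RYM": 0.5, "LAB": 1.0, "RAC": 1.2},  # both pass
    {"ATP": 900.0, "RYM": 1.5, "LAB": 2.0, "RAC": 2.5},  # both fail
    {"ATP": 650.0, "RYM": 0.0, "LAB": 0.0, "RAC": 1.0},  # ATP fails only
    {"ATP": 200.0, "RYM": 0.7, "LAB": 2.6, "RAC": 2.1},  # micro fails only
]
counts = tally_agreement(swabs)
low, high = wilson_interval(75, 100)  # e.g. 75 agreeing pairs out of 100
print(counts, round(low, 3), round(high, 3))
```

Separating the two kinds of disagreement matters for interpretation: ATP-only failures point toward non-microbial residues, whereas microbiology-only failures point toward organisms that the ATP assay does not pick up.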
ATP and microbiological swab results agreed for 75.1% (72.3-77.7%) of samples (95% C.I., sampling distribution). ATP swab results failed the minimum sanitary requirements but microbiological swabs passed for 11.7% (9.8-13.9%) of samples (95% C.I., sampling distribution), and ATP swabs passed the minimum sanitary requirements but microbiological swabs failed for 13.2% (11.2-15.5%) of samples (data not shown).

DISCUSSION

Cleaning and sanitation procedures are critical in the food industry and are the first line of defense to prevent contamination of food products from the production environment. They are required to prevent the presence and proliferation of pathogenic and spoilage microorganisms. In this study, cleaning and sanitation operations were monitored in a soy-based product manufacturing facility over a period of three weeks utilizing both ATP bioluminescence and microbial indicators to assess the cleaning efficacy. After this time, targeted cleaning of specific sites that showed the highest rate of cleaning and sanitation failures was implemented; the effect of targeted cleaning was monitored with the same verification methods.

Environmental Quality

The results from this study indicate that targeted cleaning directed by ATP monitoring may improve the environmental hygiene of food processing facilities (Figure 2-2); this was verified by microbiological tests. A significant decrease in the proportion of swabs failing to meet the minimum sanitary requirements for lactic acid bacteria and aerobic microorganisms indicates that the targeted cleaning applied after Phase 1 had a positive effect on the facility’s hygiene. In contrast, the hygiene measure for yeasts and molds remained unchanged with targeted cleaning efforts. However, it is important to consider that equipment surfaces may not be the only sources of yeasts and molds.
Other sources of yeasts and molds include air, raw materials, and packaging, which would likely be unaffected by improvements to cleaning and sanitation (19).

There was minimal difference between the microbial indicator failure rates of Zones 1 and 2 over the course of the study. However, there was a significant difference between failure rates of Zone 1 and Zone 2 sites for ATP during Phase 1 that was absent in Phase 3. Though the data do not suggest that Zone 2 surfaces were less hygienic than Zone 1 with regard to microorganisms, the lack of a significant difference for ATP between Zone 1 and Zone 2 sites during Phase 3 suggests the plant was overall cleaner. In this case, cleaner Zone 2 surfaces corresponded with improved plant hygiene, which was reflected in the improvement of hygiene measures associated with aerobic microorganisms and lactic acid bacteria. Such an effect protects against the establishment of spoilage and likely pathogenic microorganism populations (33).

This study also showed that ATP monitoring and the use of microbiological indicators may result in more effective equipment surface cleaning. When comparing individual sites (Table 2-1), targeted cleaning was most effective in specific portions of the manufacturing line. A total of 14 sites for ATP and 9 sites for aerobic microorganisms showed a significant reduction in the proportion of failing swabs between Phases 1 and 3. Some sites exhibited an increase in the proportion of failing swabs, but these increases were not significant. Taken together, these data indicate that targeted cleaning may only improve hygiene for specific sites and that other factors aside from cleaning (e.g., equipment geometry, temperature of product at the processing step) may have a greater effect on the hygienic quality of some sites.
For this plant, ATP monitoring and total aerobic count best reflected the effect of targeted cleaning (i.e., showed the greatest change); these could be selected as methods for routine verification of cleaning and sanitation operations in this facility. These two measures may not apply to all products and facilities. It is important for facilities to choose metrics that are sensitive to changes in the plant environment and can easily detect deviations from cleaning and sanitation procedures. Setting critical parameters for monitoring ATP as well as microbiological criteria on equipment surfaces is highly dependent on the manufacturing site, design and state of the equipment, product, process, and cleaning process. Relying on data that can be trended over time is often helpful to establish an appropriate baseline. Baseline testing for ATP and microbiological parameters should be periodically reviewed and reassessed to verify that cleaning and sanitation operation procedures remain effective.

Microbiological Product Quality

In this study, there was an improvement in the microbial load of yeasts and molds, lactic acid bacteria, and aerobic microorganisms for pre-pasteurized products (Figure 2-4), but not post-pasteurized products, after implementation of targeted cleaning (Figure 2-5). It is important to note that for this process, the product undergoes two thermal processing steps. Soymilk requires an extraction that occurs at 88℃, which significantly reduces the microbial load associated with the raw materials; after this process, coagulated product is pressed, cut, and packaged before it is pasteurized in-package. Because the microbial load is reduced in soymilk during extraction, the microbial load determined in this study in pre-pasteurized products is primarily associated with post-process contamination, including contact with equipment surfaces.
The microbial load of post-pasteurized products was not significantly different, which is expected as the packaged product is heat treated again, killing vegetative cells. The results from this study indicate that targeted cleaning monitored by ATP bioluminescence and microbial indicators could improve microbiological product quality for products that do not undergo an in-package pasteurization step (e.g., fresh fruits and vegetables).

Hygiene monitoring with ATP bioluminescence

ATP bioluminescence was utilized as a tool for hygiene monitoring during this study. Prior to implementation of targeted cleaning, site selection was done to identify areas in the production environment that were most likely to pose challenges during cleaning and sanitation. Site selection should comprehensively cover the production environment such that results obtained during monitoring activities reflect the cleanliness of the area (34). ATP bioluminescence monitoring accompanied by microbial evaluation demonstrated the effectiveness of targeted cleaning. Significant differences between phases were found in the proportion of swabs failing to meet the minimum sanitary requirements set for both ATP and microbiological parameters at specific sites (Table 2-1). In cases where ATP did not agree with the microbiological result, it is important to consider sampling area (which, although adjacent, could display differences) as well as the microbial load required for ATP detection (approximately 10³-10⁴ CFU) (35). Overall, the ATP data over time showed a decrease in the proportion of failing swabs as Phases 2 and 3 were implemented (Figure 2-2).

Evaluation of ATP swabs as a tool to identify microbial contamination showed that 75.1% (72.3-77.7%) of ATP swabs reflected the microbiological status of test sites. This agreement likely arises because poor cleaning leads to inadequate sanitation, whereas proper cleaning leads to adequate sanitation.
Although ATP generally does not directly correlate to the number of microorganisms on a given surface (16, 36), it can be used as a rapid tool to assess equipment cleanliness (22). ATP is not exclusively microbial in origin; food residues, which cannot be detected by microbiological tests, can also account for failing ATP swabs. In this study, this occurred in the 11.7% (9.8-13.9%) of samples for which the microbial counts passed the minimum sanitary requirements but the ATP levels did not. ATP from non-microbial sources can indicate potential niches where food residues may accumulate over time and thus enable microbial growth. The need to conduct both biochemical and microbiological tests is especially highlighted by the 13.2% (11.2-15.5%) of samples for which the microbiological results did not meet the minimum sanitary requirements but ATP did. In these cases, spores may cause the discrepancy because they do not produce ATP while in spore form (37, 38), but can still enter packages and induce spoilage after germination. ATP is considered a method to rapidly verify cleaning, while microbial tests provide results to verify the sanitation status; together, the two tests can provide a more holistic assessment of the effectiveness of cleaning and sanitation operations.

ATP swabs alone can provide rapid and robust daily verification for cleaning and sanitation operations of a facility. In this study, ATP swabs taken after cleaning and sanitation would have either correctly verified the microbiological hygiene status or elicited additional cleaning due to food residues in 86.8% (84.5-88.8%) of cases. Microbiological testing could supplement this on a less frequent basis (e.g., weekly) to ensure continued efficacy of cleaning and sanitation.

CONCLUSION

This study showed that the microbiological quality of pre-pasteurized products improved following targeted cleaning implemented to improve the hygiene of the production environment.
ATP bioluminescence and microbial indicators appear to be effective tools to monitor cleaning and sanitation operations and to direct the efforts of the cleaning and sanitation crew. Further, these results indicate that both biochemical and microbiological tests should be used to monitor hygiene, as they are complementary in efficiently assessing the cleaning and sanitary status of the manufacturing environment and processed products. This study serves as a framework for companies to implement hygiene monitoring in their own facilities, but it is important to note that different products and plants may require different tests and/or critical limits than those used in this study.

REFERENCES

1. USDA FSIS. 2020. Sanitation, 9 CFR § 416. Code of Federal Regulations Regulatory Requirements Under the Federal Meat Inspection Act and the Poultry Products Inspection Act.
2. DHHS FDA. 2020. Current Good Manufacturing Practice, Hazard Analysis, and Risk-Based Preventive Controls for Human Consumption, 21 CFR § 117. Code of Federal Regulations Food for Human Consumption.
3. USDA FSIS. 2020. Sanitation: Sanitary Operations, 9 CFR § 416.4. Code of Federal Regulations Regulatory Requirements Under the Federal Meat Inspection Act and the Poultry Products Inspection Act.
4. DHHS FDA. 2020. Current Good Manufacturing Practice, Hazard Analysis, and Risk-Based Preventive Controls for Human Consumption: Sanitary Operations, 21 CFR § 117.35. Code of Federal Regulations Food for Human Consumption.
5. DHHS FDA. 2020. Current Good Manufacturing Practice, Hazard Analysis, and Risk-Based Preventive Controls for Human Consumption: Definitions, 21 CFR § 117.3. Code of Federal Regulations Food for Human Consumption.
6. USDA FSIS. 2020. Hazard Analysis and Critical Control Point (HACCP) Systems: Definitions, 9 CFR § 417.1. Code of Federal Regulations Regulatory Requirements Under the Federal Meat Inspection Act and the Poultry Products Inspection Act.
7. Keener L. 2005. Improving cleaning-out-of-place (COP), p. 445–467. In Lelieveld, HLM, Mostert, MA, Holah, J (eds.), Handbook of Hygiene Control in the Food Industry.
8. Gombas D, Bierschwale S, Blackman S, Butts JN, Carter D, Coles C, Crawford WM, Denault-Bryce P, Eisenberg BA, Estrada Jr M, Ewell H, Foster S, Hardin M, Hau H, Kerr J, Mills B, Owens EM, Parker C, Petran RL, Prince G, Raede J, Roberson M, Shergill G, Snyder K, Stoltenberg SK, Suslow T, Zomorodi B. 2013. Guidance on Environmental Monitoring and Control of Listeria for the Fresh Produce Industry. United Fresh Produce Association.
9. Moore G, Griffith C. 2002. A comparison of traditional and recently developed methods for monitoring surface hygiene within the food industry: An industry trial. Int J Environ Health Res 12:317–329.
10. Holah JT. 1992. Industrial Monitoring: Hygiene in Food Processing, p. 645–659. In Melo, LF, Bott, TR, Fletcher, M, Capdeville, B (eds.), Biofilms - Science and Technology. Springer Netherlands, Dordrecht.
11. Nivens DE, Co BM, Franklin MJ. 2009. Sampling and quantification of biofilms in food processing and other environments, p. 539–568. In Biofilms in the Food and Beverage Industries. Elsevier.
12. Maturin L, Peeler JT. 2001. BAM Chapter 3: Aerobic Plate Count. Bacteriological Analytical Manual (BAM). https://www.fda.gov/food/laboratory-methods-food/bam-chapter-3-aerobic-plate-count.
13. Tournas V, Stack ME, Mislivec PB, Koch HA, Bandler R. 2001. BAM Chapter 18: Yeasts, Molds and Mycotoxins. FDA Bacteriological Analytical Manual. https://www.fda.gov/food/laboratory-methods-food/bam-chapter-18-yeasts-molds-and-mycotoxins.
14. Hawronskyj JM, Holah J. 1997. ATP: A universal hygiene monitor. Trends Food Sci Technol 8:79–84.
15. Mildenhall KB, Rankin SA. 2020. Implications of Adenylate Metabolism in Hygiene Assessment: A Review. J Food Prot 83:1619–1631.
16. Shama G, Malik DJ. 2013. The uses and abuses of rapid bioluminescence-based ATP assays. Int J Hyg Environ Health 216:115–125.
17. Roberts L, Lang G, Yordem B. ATP and Protein-based Hygiene Monitoring. In Environmental Monitoring Handbook for the Food and Beverage Industries, 1st ed. 3M.
18. Rahman MS. 2015. Hurdle Technology in Food Preservation, p. 17–33. In Rahman, MS, Siddiqui, MW (eds.), Minimally Processed Foods. Springer International Publishing.
19. Snyder AB, Worobo RW. 2018. Fungal Spoilage in Food Processing. J Food Prot 81:1035–1040.
20. Clavero R. 2010. Solving Microbial Spoilage Problems in Processed Foods, p. 63–78. In Kornacki, JL (ed.), Principles of Microbiological Troubleshooting in the Industrial Food Processing Environment. Springer New York, New York, NY.
21. Lorenzo JM, Munekata PE, Dominguez R, Pateiro M, Saraiva JA, Franco D. 2018. Main Groups of Microorganisms of Relevance for Food Safety and Stability, p. 53–107. In Barba, FJ, Sant’Ana, AS, Orlien, V, Koubaa, M (eds.), Innovative Technologies for Food Preservation. Elsevier.
22. Cunningham AE, Rajagopal R, Lauer J, Allwood P. 2011. Assessment of Hygienic Quality of Surfaces in Retail Food Service Establishments Based on Microbial Counts and Real-Time Detection of ATP. J Food Prot 74:686–690.
23. Griffith C. 2005. Improving surface sampling and detection of contamination, p. 588–618. In Handbook of hygiene control in the food industry. Woodhead Publishing Limited.
24. Møretrø T, Langsrud S. 2017. Residential Bacteria on Surfaces in the Food Industry and Their Implications for Food Safety and Quality. Compr Rev Food Sci Food Saf 16:1022–1041.
25. R Core Team. 2020. R: A language and environment for statistical computing. 4.0.2. R Foundation for Statistical Computing, Vienna, Austria.
26. RStudio Team. 2020. RStudio: Integrated Development Environment for R. 1.3.1073. RStudio, Inc., Boston, MA.
27. Wickham H, Bryan J. 2019. readxl: Read Excel Files. 1.3.1.
28. Wickham H, François R, Henry L, Müller K. 2020. dplyr: A Grammar of Data Manipulation. 1.0.2.
29. Kassambara A. 2020. ggpubr: “ggplot2” Based Publication Ready Plots. 0.4.0.
30. Wickham H, Henry L. 2020. tidyr: Tidy Messy Data. 1.1.1.
31. Zhu H, Travison T, Tsai T, Beasley W, Xie Y, Yu G, Laurent S, Shepard R, Sidi Y, Salzer B, Gui G, Fan Y, Murdoch D. 2020. kableExtra. 1.2.1.
32. Sogin JH, Velasco Lopez G, Yordem B, Lingle CK, David JM, Cobo M, Worobo RW. 2020. Tofu Case Study Statistical Analyses. 1. Zenodo https://doi.org/10.5281/zenodo.4287499.
33. Hammons SR, Stasiewicz MJ, Roof S, Oliver HF. 2015. Aerobic Plate Counts and ATP Levels Correlate with Listeria monocytogenes Detection in Retail Delis. J Food Prot 78:825–830.
34. Simmons CK, Wiedmann M. 2018. Identification and classification of sampling sites for pathogen environmental monitoring programs for Listeria monocytogenes: Results from an expert elicitation. Food Microbiol 75:2–17.
35. Ukuku DO, Pilizota V, Sapers GM. 2001. Bioluminescence ATP Assay for Estimating Total Plate Counts of Surface Microflora of Whole Cantaloupe and Determining Efficacy of Washing Treatments. J Food Prot 64:813–819.
36. Osimani A, Garofalo C, Clementi F, Tavoletti S, Aquilanti L. 2014. Bioluminescence ATP Monitoring for the Routine Assessment of Food Contact Surface Cleanliness in a University Canteen. Int J Environ Res Public Health 11:10824–10837.
37. Setlow P, Kornberg A. 1970. Biochemical Studies of Bacterial Sporulation and Germination. XXII. Energy metabolism in early stages of germination of Bacillus megaterium spores. The Journal of Biological Chemistry.
38. Korza G, Setlow B, Rao L, Li Q, Setlow P. 2016. Changes in Bacillus Spore Small Molecules, rRNA, Germination, and Outgrowth after Extended Sublethal Exposure to Various Temperatures: Evidence that Protein Synthesis Is Not Essential for Spore Germination. J Bacteriol 198:3254–3264.

CHAPTER 3
NEXT-GENERATION AMPLICON SEQUENCING ANALYSIS OF COMMERCIAL TEMPEH DURING FERMENTATION

ABSTRACT

Tempeh is a mold-fermented food product native to Indonesia.
Rhizopus microsporus is typically used as the inoculum for cooked soybeans, but production processes introduce other bacteria and fungi, which together form a microbial community. This study investigated the influence of product type (i.e., raw material composition; soybean versus multigrain tempeh) and time point in fermentation on the bacterial and fungal communities of commercially produced tempeh. Next-generation amplicon sequencing of the bacterial 16S rRNA gene V3-V4 regions (16S) and the fungal internal transcribed spacer region 2 (ITS2) was used to investigate community differences. For both bacterial and fungal communities, multigrain tempeh exhibited overall higher alpha diversity than soybean tempeh. The results of PERMANOVA and ZicoSeq differential abundance analyses indicated that product type was significantly (p < 0.05) associated with differences in bacterial communities, whereas time point was significantly associated with differences in fungal communities. Core microbiome analysis showed that Leuconostoc, Enterococcus, Lactococcus, and Kocuria species were the core bacterial species, and that Rhizopus microsporus and Trichosporon asteroides were the core fungal species. This study reports tempeh community information over the course of fermentation and provides a direct comparison of the associated communities due to ingredient composition. The discussion suggests further study on the influence of Acinetobacter and Trichosporon species on the sensory characteristics of tempeh.

This study has been prepared for publication and is currently under review: Sogin JH, & Worobo RW. 2023. Next-Generation Amplicon Sequencing Analysis of Commercial Tempeh During Fermentation. In review.

INTRODUCTION

Tempeh is a plant-based protein source originating from the ethnic Javanese people of Indonesia (1). It is mainly made from cooked soybeans, but other substrates, including broad beans, peas, and chickpeas, can also be used (2).
The typical steps of tempeh production are raw soybean soaking, dehulling, washing, boiling, draining, cooling, inoculation with Rhizopus starter culture, packaging, and incubation, but many variations exist due to processing scale (3, 4). The most noticeable role of fermentation in tempeh production is the physical aggregation of individual soybeans into a cohesive mass ('soybean cake'). Rhizopus molds cause this transformation through the breakdown of soy constituents and the growth of hyphae that fill the vacant space between beans, thus forming a mycelium (5, 6). Traditionally produced tempeh in Indonesia used to contain three Rhizopus species (R. arrhizus, R. delemar, and R. microsporus), but widespread commercial starter use has led to the loss of R. arrhizus and R. delemar (7).

The nature of tempeh production makes it extremely difficult to maintain single-strain fermentations outside a laboratory, as is also the case for other fermented foods such as sauerkraut and koji (8, 9). Soybeans and food production environments harbor microorganisms that can grow during soybean soaking and subsequently in the tempeh fermentation (2, 10, 11). Although soybeans are cooked during production, thereby killing many of the organisms that may have originated from the raw materials, food processing environments harbor resident microbial communities that persist even with frequent cleaning and sanitation (12), especially in a fermented food facility. Furthermore, mycelium development throughout the soybean cake during fermentation requires oxygen (among other factors). Oxygen is supplied by giving the cake a high surface-area-to-volume ratio, which leaves portions of the cake exposed to the environment, unlike other fermentations such as beer, wine, or vegetable products, which are often conducted in closed vessels.

A handful of previous studies have conducted community analyses of tempeh.
One amplicon sequencing study identified differences in bacterial communities between two tempeh producers in Indonesia, finding the two most abundant genera in the products to be Enterococcus and Lactobacillus (13). Another amplicon sequencing study investigated the bacterial and fungal communities of over-fermented tempeh; some of the predominant bacterial genera were Chryseobacterium, Lactococcus, Lactobacillus, Streptococcus, Acetobacter, and Klebsiella, and some of the predominant fungal genera were Rhizopus and Trichosporon (14). One shotgun metagenomic study of the same tempeh products compared by Radita et al. (2018) found the predominant bacterial genera to be Novosphingobium and Enterobacter, with lower relative abundances of Leuconostoc, Enterococcus, and Lactobacillus (15).

The aforementioned studies analyzed finished or past-finished tempeh. We were interested in how the bacterial and fungal communities of tempeh change over the course of fermentation. Furthermore, two of the studies (13, 15) compared products made by different manufacturers, indicating general differences between different tempeh products, but no comparisons were made between products with respect to specific processing factors. Some studies have compared the microbiology of tempeh soak water between different grains (2) or between different wraps (11), but neither used next-generation sequencing to directly compare the bacterial and fungal communities of tempeh due to grain composition. Therefore, the purpose of this study was to characterize the bacterial and fungal communities of soybean and multigrain tempeh produced in a commercial facility throughout fermentation.

METHODS

Tempeh Production & Sample Acquisition

Tempeh was acquired directly from a commercial manufacturer of various soy-based products; other products, including tofu, were produced near the tempeh. Two product types were sampled: soy and multigrain.
Soy tempeh was made using only soybeans, whereas multigrain tempeh was made from soybeans, brown rice, millet, kasha, and quinoa. Sampling was conducted by an employee of the manufacturer at predetermined time points during production. Sampling occurred over the course of several weeks to obtain samples from multiple production runs/product lots. Samples were ultimately taken from four soy production runs (n = 24) and two multigrain runs (n = 12). Samples were taken before inoculation ('pre-inoculation'), after inoculation ('post-inoculation'), during fermentation ('early' ~5 h, 'middle' 10-15 h, 'end' ~20 h), and after product packaging ('packaged'). Each sample was a single 8-ounce (227 g) product unit, except for bean samples (pre-inoculation and post-inoculation), which were an approximately 500 g portion of the bulk rehydrated, cooked, and drained beans. Samples were immediately frozen in bags, stored at -20 °C at the plant, and eventually transported to Cornell University for further processing. All samples were processed within 90 days of acquisition.

DNA Extraction

Frozen tempeh samples were defrosted under refrigeration (2 °C) overnight (~12 h) prior to DNA extraction. For each sample, the product was aseptically opened and a 10 g portion from the center of the tempeh block was placed in a stomaching bag. PBS (90 mL) was added to the stomaching bag, which was then hand massaged to break up the tempeh. The bag was then stomached at 230 rpm for 1 min. A 1.8 mL aliquot of the stomached tempeh mixture was centrifuged at 13,000 rcf for 1 min. The supernatant was decanted, and DNA was extracted from the resulting pellet using the Qiagen Powerfood DNA extraction kit (Qiagen, Germantown, MD, USA). An 'extraction negative control' was subjected to the same procedure using 100 mL PBS instead of 10 g tempeh + 90 mL PBS.
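The sampling design described under Tempeh Production & Sample Acquisition can be tallied with a short sketch. It assumes one sample per production run at each of the six time points, which is consistent with the reported n = 24 and n = 12 but is not stated explicitly; the run labels are hypothetical placeholders.

```python
from itertools import product

# Time points as named in the text.
TIME_POINTS = [
    "pre-inoculation", "post-inoculation",
    "early", "middle", "end", "packaged",
]

# Number of production runs sampled per product type (from the text).
RUNS = {"soy": 4, "multigrain": 2}


def enumerate_samples(runs=RUNS, time_points=TIME_POINTS):
    """List (product_type, run_label, time_point) for every sample,
    assuming one sample per run per time point (an assumption)."""
    samples = []
    for product_type, n_runs in runs.items():
        for run, time_point in product(range(1, n_runs + 1), time_points):
            samples.append((product_type, f"run{run}", time_point))
    return samples


samples = enumerate_samples()
counts = {p: sum(1 for s in samples if s[0] == p) for p in RUNS}
print(counts)  # {'soy': 24, 'multigrain': 12}
```

Under this assumption the design yields 36 samples in total, 12 of which (pre-inoculation and post-inoculation) are bean samples rather than product units.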
Next-Generation Amplicon Sequencing

A two-step amplicon sequencing protocol was followed to amplify and pool marker gene fragments for bacteria and fungi from the tempeh DNA samples prior to short-read next-generation sequencing.

First Step

The 16S rRNA V3-V4 region (16S) from bacteria and the internal transcribed spacer region 2 (ITS2) from fungi were amplified in separate reactions. The locus-specific portions of the 16S primers were F-5'-CCTACGGGNGGCWGCAG-3' and R-5'-GACTACHVGGGTATCTAATCC-3', colloquially known as Bakt_341F and Bakt_805R (16, 17); the locus-specific portions of the ITS primers were F-5'-AACTTTYRRCAAYGGATCWCT-3' and R-5'-AGCCTCCGCTTATTGATATGCTTAART-3', colloquially known as 5.8S-Fun and ITS4-Fun (18). Both 16S and ITS primers were designed to include the required Illumina indexing overhangs (19), a 0-4 bp heterogeneity spacer (20), a 2 bp linker (21), and the appropriate locus-specific primer region. The use of a heterogeneity spacer resulted in five variations of each primer, which were mixed in equimolar amounts prior to PCR amplification; a complete list of the primers used for 16S rRNA and ITS2 amplifications can be found in Appendix Tables 3-3 & 3-4.

Samples were amplified using Invitrogen Platinum II Taq Hot-Start Green PCR Master Mix (2X) in 25 µL reactions (Invitrogen, Waltham, MA, USA). Each reaction contained 12.5 µL 2X master mix, 2.5 µL 25 mM supplemental magnesium (as MgCl2), 1 µL 10 µM forward primer mixture, 1 µL 10 µM reverse primer mixture, 3 µL GC enhancer (included with master mix), and 5 µL DNA template. PCR conditions for both 16S and ITS amplifications were the same: 1 cycle of 94 °C for 5 min; 25 or 35 cycles of 94 °C for 15 sec, 50 °C for 15 sec, and 68 °C for 15 sec; and 1 cycle of 68 °C for 2.5 min. Successful amplification was verified via gel electrophoresis and EtBr staining.
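The locus-specific primers above contain IUPAC degenerate bases, so each written primer stands for a pool of concrete oligonucleotides. The sketch below expands the four locus-specific sequences quoted in the text and counts the pool sizes; the function names are hypothetical, and the overhang, spacer, and linker portions of the full primers are deliberately left out.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Locus-specific primer regions as given in the text.
PRIMERS = {
    "Bakt_341F": "CCTACGGGNGGCWGCAG",
    "Bakt_805R": "GACTACHVGGGTATCTAATCC",
    "5.8S-Fun": "AACTTTYRRCAAYGGATCWCT",
    "ITS4-Fun": "AGCCTCCGCTTATTGATATGCTTAART",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """Enumerate every concrete sequence a degenerate primer encodes."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


pool_sizes = {name: degeneracy(seq) for name, seq in PRIMERS.items()}
print(pool_sizes)

# A 0-4 bp heterogeneity spacer gives five length variants per primer.
spacer_lengths = list(range(0, 5))
print(len(spacer_lengths))
```

Each spacer variant carries the same degenerate locus-specific region, so the five-way primer mixture described above multiplies, rather than replaces, the degeneracy counted here.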
An attempt was made to amplify all samples with 25 amplification cycles, but several samples required 35 amplification cycles to yield a visible band of EtBr-stained PCR product following gel electrophoresis. In general, tempeh sampled at the pre-inoculation and post-inoculation time points required 35 cycles, whereas samples from later time points required only 25 cycles. 'PCR negative controls' using nuclease-free water instead of DNA template were subjected to the same procedure for every PCR condition. Additionally, the 'extraction negative control' was subjected to the same procedure for every PCR condition.

Second Step

The amplified samples were submitted to the Cornell Biotechnology Resource Center for quantification, indexing, multiplexing, cleanup, quality control, and Illumina MiSeq sequencing (Illumina, San Diego, CA, USA). Unique dual indexing was applied to the samples to mitigate the effects of index hopping (22). Sequencing proceeded according to manufacturer specifications using a MiSeq 2×300 bp V3 sequencing kit.

Data Processing & Analysis

Raw reads were processed in Qiime2 (23) using Cutadapt (24) to trim adapter sequences and DADA2 (25) to merge, denoise, and bin reads into amplicon sequence variants (ASVs). Sequences were taxonomically classified using the Qiime2 naïve-Bayes feature classifier (26) trained on amplicon-specific regions of 16S sequences from the Silva version 138 99% sequence identity OTU data release (27, 28) processed with RESCRIPt (29), and on full-length ITS sequences from the expert-curated OTU thresholds UNITE 9.0 database (30). Data analysis was conducted using R (31) within RStudio (32) using renv (33).
Processed read data were then transferred to R using Phyloseq (34), decontaminated using Decontam (35), 'denoised' (rare sequence removal) with PERFect (36), and analyzed with the vegan (37), microbiome (38), and GUniFrac (39) R packages; GUniFrac was used for differential abundance testing with ZicoSeq (40). When data were agglomerated to higher taxonomic levels, taxonomically unresolved OTUs at the desired agglomeration level were kept separate. Other packages were used for data visualization: ggpubr (41), ggtext (42), and ggVennDiagram (43). The full code used to process raw reads and analyze data is deposited and accessible at https://github.com/jsogin574/Sogin_and_Worobo_Tempeh_2023.

Data Availability

Sample metadata and raw sequencing data were deposited to the US National Center for Biotechnology Information (NCBI) under BioProject number PRJNA941282 and Sequence Read Archives SRR23815527-SRR23815527606. Analysis codes are available at https://github.com/jsogin574/Sogin_and_Worobo_Tempeh_2023.

RESULTS

Next-Generation Sequencing and Sequence Processing

Samples were sequenced on an Illumina MiSeq with a 2×300 bp V3 sequencing kit (Illumina, San Diego, CA, USA). A total of 11 M raw reads were obtained from Illumina sequencing (5.3 M 16S, 5.8 M ITS); 7.8 M reads were retained (3.2 M 16S, 4.6 M ITS) after raw sequence processing. After filtering, contaminant removal with Decontam, and exclusion of negative controls, 5.8 M reads were retained (1.6 M 16S, 4.2 M ITS). Chloroplast reads were a large fraction of those removed from the 16S data (1.0 M).

At this point, samples from the pre-inoculation and post-inoculation time points were removed from statistical analyses. Because fermentation had not begun, these samples had inherently low biomass compared to other samples, which resulted in low read counts, a larger number of contaminant sequences, and many OTUs that differed from those in samples taken during fermentation (early, middle, end, packaged).
Had they not been excluded, the differences in read depth and the large number of zeroes in the data would have drastically reduced the sensitivity and specificity of analyses. Unless explicitly noted, the pre-inoculation and post-inoculation samples were not included in any analyses. The remaining samples were 'denoised' with PERFect to remove rare and uninformative OTUs.

Bacterial Community Analyses

Bacterial Family Relative Abundance

Filtered and decontaminated data were agglomerated to the family taxonomic level, converted to proportions, and plotted in Figure 3-1. Because no statistical comparisons were being made, pre-inoculation and post-inoculation time point samples were included. Taxa with mean relative abundance less than 0.025 across all samples were collapsed into an 'other' category. For early, middle, end, and packaged samples, most of the community was represented by seven families: Leuconostocaceae, Moraxellaceae, Streptococcaceae, Staphylococcaceae, Micrococcaceae, Bacillaceae, and Enterococcaceae. The multigrain tempeh appeared to exhibit more variability and had larger 'other' portions than the soy tempeh. The pre-inoculation and post-inoculation samples were poorly represented by these seven families and exhibited very large 'other' portions compared to later time points. However, many of the families that appeared dominant in the other samples were also observed in the pre-inoculation and post-inoculation samples, notably Leuconostocaceae, Moraxellaceae, and Streptococcaceae.

Figure 3-1. Bacterial Family Relative Abundance
Relative abundances of bacterial families in individual samples grouped by product and time point. Only families present at mean relative abundance greater than 0.025 across all samples were plotted; other families were collapsed into 'other'. Plotted from family-agglomerated proportion data.
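The conversion to proportions and the collapsing of low-abundance families into 'other' can be illustrated with a minimal sketch. It is written in plain Python rather than with the R packages used in this study, the function names are hypothetical, and the read counts are invented toy values rather than data from this work.

```python
def to_proportions(counts):
    """Convert per-sample read counts (sample -> family -> count)
    into relative abundances that sum to 1 within each sample."""
    props = {}
    for sample, family_counts in counts.items():
        total = sum(family_counts.values())
        props[sample] = {fam: c / total for fam, c in family_counts.items()}
    return props


def collapse_rare(props, threshold=0.025, other_label="other"):
    """Pool families whose mean relative abundance across all samples
    is below the threshold into a single 'other' category."""
    n_samples = len(props)
    families = {fam for sample in props.values() for fam in sample}
    mean_abundance = {
        fam: sum(sample.get(fam, 0.0) for sample in props.values()) / n_samples
        for fam in families
    }
    collapsed = {}
    for name, sample in props.items():
        row = {}
        for fam, value in sample.items():
            key = fam if mean_abundance[fam] >= threshold else other_label
            row[key] = row.get(key, 0.0) + value
        collapsed[name] = row
    return collapsed


# Toy counts (hypothetical, for illustration only).
toy_counts = {
    "S1": {"Leuconostocaceae": 900, "Moraxellaceae": 90, "Rare_family": 10},
    "S2": {"Leuconostocaceae": 500, "Moraxellaceae": 480, "Rare_family": 20},
}
collapsed = collapse_rare(to_proportions(toy_counts))
print(collapsed)
```

Because the threshold is applied to the mean across all samples, a family can be pooled into 'other' even when it exceeds 0.025 in an individual sample.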
Bacterial Alpha Diversity

Several alpha diversity metrics were calculated for samples from denoised and rarefied data: Shannon-Wiener diversity (44, 45), Inverse Simpson diversity (46), 50% Coverage diversity, Observed richness, Chao1 richness (47), Pielou evenness (48), Relative dominance, Core Abundance dominance, and Low Abundance rarity. A description of these metrics can be found in Appendix Table 3-5. Samples were then grouped by product, time point, and a combined product*time point factor. Figure 3-2 shows a boxplot comparison of the various metrics with samples grouped by product. For simplicity, only the product comparisons were plotted. Multigrain tempeh exhibited generally higher diversity, richness, and evenness, whereas soy tempeh exhibited higher dominance; both exhibited similar rarity, although this measure is highly impacted by denoising with PERFect.

Figure 3-2. Bacterial alpha diversity comparisons
Boxplot comparisons between products of calculated bacterial alpha diversity metrics. Alpha diversity metrics were calculated for every sample from rarefied OTU-level count data.

A Kruskal-Wallis test was used to determine whether any of the alpha diversity metrics differed overall between group levels. Significant differences (p < 0.05) were observed between products for the Shannon-Wiener, Inverse Simpson, 50% Coverage, and Core Abundance metrics. None of the metrics differed significantly (p > 0.05) when samples were grouped by time point. Only the Shannon-Wiener and Core Abundance metrics were significantly different (p < 0.05) when samples were grouped by the product*time point interaction factor. Pairwise comparisons were not made due to the limited number of samples for each time point; the product grouping has only two levels and is therefore inherently a pairwise comparison. Multiple comparisons were controlled for by the false discovery rate method. See Appendix Table 3-5 for a complete listing of significance values from the various comparisons.
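Four of the alpha diversity metrics named above have simple closed forms, sketched below using their textbook definitions. This is an illustration only: the R packages used in this study may differ in details such as logarithm base or handling of zeros, and the count vectors are invented toy values.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson diversity 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou evenness J = H' / ln(S), with S the observed richness."""
    s = observed_richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


# Toy communities (hypothetical): one perfectly even, one dominated.
even = [25, 25, 25, 25]
dominated = [97, 1, 1, 1]

for label, community in (("even", even), ("dominated", dominated)):
    print(label,
          observed_richness(community),
          round(shannon(community), 3),
          round(inverse_simpson(community), 3),
          round(pielou_evenness(community), 3))
```

The two toy communities have the same richness but very different diversity and evenness, which is the kind of contrast the product comparison in Figure 3-2 is meant to capture.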
Bacterial Beta Diversity Analysis

A distance matrix was computed between samples using the Bray-Curtis (49) method on denoised species-agglomerated proportion data. Nonmetric multidimensional scaling (NMDS) was used to ordinate the samples in 2D, as seen in Figure 3-3. The ordination was grouped in two ways, by product and by time point levels. The product ordination shows two separate clusters for soy and multigrain tempeh. The time point ordination shows three separate clusters for the middle, end, and packaged time points, and a completely dispersed cluster for the early time point that spans the other time points. Interestingly, the end and packaged time points are more compact than the early and middle time points.

Figure 3-3. Bacterial Bray-Curtis NMDS ordination
Nonmetric multidimensional scaling ordination of bacterial Bray-Curtis dissimilarities grouped by product and time point. Bray-Curtis dissimilarities were calculated between all samples from species-agglomerated proportion data.

Statistical analysis was then conducted using the distance matrix to determine whether product, time point, and their interaction (product*time point) were significantly associated with the observed sample differences. To determine the marginal effect of each of these factors, a distance-based redundancy analysis (dbRDA) was conducted and an analysis of variance (ANOVA) was used to interpret the result. The ANOVA showed that product (p < 0.01) and time point (p < 0.05) significantly influenced the dbRDA but that their interaction did not (p = 0.072). A permutational multivariate analysis of variance (PERMANOVA) (50) was then conducted to determine which species were most associated with the observed community dissimilarities. Two PERMANOVA models were run, one analyzing product while controlling for time point, and the other vice versa. The interaction of product*time point was not included due to the result of the dbRDA.
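For reference, the Bray-Curtis dissimilarity underlying the distance matrix can be written in a few lines. The sketch below uses the standard definition on proportion data in plain Python; the function names and the toy communities are hypothetical and are not taken from the analysis code of this study.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples given as
    taxon -> abundance mappings: sum|u_i - v_i| / sum(u_i + v_i)."""
    taxa = set(u) | set(v)
    numerator = sum(abs(u.get(t, 0.0) - v.get(t, 0.0)) for t in taxa)
    denominator = sum(u.get(t, 0.0) + v.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Pairwise Bray-Curtis dissimilarities keyed by sample-name pairs."""
    names = sorted(samples)
    return {
        (a, b): bray_curtis(samples[a], samples[b])
        for a in names for b in names
    }


# Toy proportion data (hypothetical species labels and values).
toy = {
    "soy_1": {"sp_x": 0.5, "sp_y": 0.5},
    "multi_1": {"sp_x": 0.25, "sp_y": 0.25, "sp_z": 0.5},
    "multi_2": {"sp_z": 1.0},
}
dm = distance_matrix(toy)
print(dm[("soy_1", "multi_1")], dm[("soy_1", "multi_2")])
```

Values range from 0 for identical communities to 1 for communities with no shared taxa, so the ordination and the tests described here operate on a bounded measure of compositional difference.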
The two PERMANOVA results agreed with the dbRDA analysis, showing that both product (p < 0.01) and time point (p < 0.05) were significantly associated with differences in the observed community dissimilarities. The top 10 species coefficients contributing to product differences were plotted in Figure 3-4. A Leuconostoc sp. and a Weissella sp. were most strongly associated with soy tempeh compared to multigrain tempeh, whereas Enterococcus italicus and Acinetobacter tandoii were most strongly associated with multigrain tempeh compared to soy tempeh. Pairwise comparisons of species coefficients between time points were not made. PERMANOVA can produce false positives when sample dispersions differ between levels; a multivariate homogeneity of group dispersions (PERMDISP2) (51) test was therefore conducted to determine whether dispersions differed significantly between group levels. An ANOVA of the PERMDISP2 result showed that dispersions did not differ significantly (p > 0.05) between product levels, but did differ significantly between time point levels (p = 0.0325).

Figure 3-4. Bacterial species PERMANOVA coefficients for product type
Top ten (by absolute value) bacterial species coefficients contributing to community dissimilarity due to product, as determined by PERMANOVA analysis. Positive values indicate the species was observed more frequently in soy tempeh, whereas negative values indicate the species was observed more frequently in multigrain tempeh. UR preceding the species name indicates the OTU was taxonomically unresolved; the four-character hash is unique to the specific OTU. Bray-Curtis dissimilarities were calculated between all samples from species-agglomerated proportion data.

Bacterial Core Species

To further investigate community differences, core species were identified between product and time point levels using denoised species-agglomerated proportion data.
Core species were defined as species present at a relative abundance greater than 0.005 in greater than 66% of samples within group levels. To better understand core species, the mean abundance of core species within group levels was calculated. The number, identification, and mean abundance of core species were visualized in Figure 3-5; mean abundances were negative natural log transformed to emphasize differences in abundance between group levels.

Figure 3-5. Core bacterial species and differential abundance analysis
Core bacterial species grouped by product and time point. Venn diagrams indicate the number of shared core species between levels of each group. Heat maps show negative natural log transformations of mean proportion data for core species within levels of each group; blue indicates high mean abundance and yellow indicates low mean abundance. Species name coloring corresponds to the group levels to which species are core, and matches the Venn diagram region label coloring. UR preceding the species name indicates the OTU was taxonomically unresolved; the four-character hash is unique to the specific OTU. Core species were defined as those species present at a relative abundance greater than 0.005 in greater than 66% of samples within levels of each group. Species that were not identified as core to any group level were collapsed into 'other'. Analysis was done with species-agglomerated proportion data.

Seven species were identified as core to both soy and multigrain tempeh; multigrain tempeh had six unique core species, and soy tempeh had two unique core species. Eight species were identified as core to early, middle, end, and packaged time points; early and middle time points shared one unique core species, and middle and packaged time points each had one unique core species. Many of the core species identified between products were also identified as core between time points.
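The core species definition used here (relative abundance greater than 0.005 in greater than 66% of samples within a group level) amounts to a detection threshold combined with a prevalence threshold. The following sketch applies that rule in plain Python; the function name, species labels, and proportions are hypothetical, and it is not the implementation used for Figure 3-5.

```python
def core_taxa(samples, detection=0.005, prevalence=0.66):
    """Return taxa exceeding the detection threshold in more than the
    given fraction of samples. `samples` is a list of
    taxon -> relative abundance mappings for one group level."""
    n = len(samples)
    taxa = {t for sample in samples for t in sample}
    core = set()
    for taxon in taxa:
        hits = sum(1 for sample in samples if sample.get(taxon, 0.0) > detection)
        if hits / n > prevalence:
            core.add(taxon)
    return core


# Toy proportion data for two group levels (hypothetical values).
soy = [
    {"sp_a": 0.90, "sp_b": 0.09, "sp_c": 0.01},
    {"sp_a": 0.80, "sp_b": 0.20},
    {"sp_a": 0.996, "sp_b": 0.004},
]
multigrain = [
    {"sp_a": 0.5, "sp_d": 0.5},
    {"sp_a": 0.6, "sp_d": 0.4},
    {"sp_a": 0.7, "sp_b": 0.3},
]

core_soy = core_taxa(soy)
core_multigrain = core_taxa(multigrain)

# Venn diagram regions are plain set operations on the core sets.
shared = core_soy & core_multigrain
unique_soy = core_soy - core_multigrain
unique_multigrain = core_multigrain - core_soy
print(sorted(shared), sorted(unique_soy), sorted(unique_multigrain))
```

With only three samples per level in this toy example, a taxon detected in two of three samples (about 66.7%) just clears the 66% prevalence cutoff, which shows how sensitive the core set can be to group size.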
These species included three Leuconostoc spp., Enterococcus italicus, two Lactococcus spp., and Kocuria rhizophila. Species identified as uniquely core to one group level were generally more abundant in that level than in other levels. Interestingly, core species identified in different time point levels generally appeared to become less abundant at later stages of the fermentation.

Bacterial Differentially Abundant Species

Differential abundance (DA) analysis was conducted using ZicoSeq on denoised species-agglomerated count data. Species abundance was compared within product and time point groups separately while controlling for the other factor; no pairwise comparisons were made between different time points. Only species present in at least 33% of samples within a group and with mean relative abundance greater than 0.0005 were included. Multiple comparisons were controlled for by the false discovery rate method. Core species that were significantly different (p < 0.05) between soy and multigrain tempeh are listed in Table 3-1. The four most significantly differentially abundant core species between soy and multigrain tempeh were all Acinetobacter spp. (p ≤ 0.001). Non-core species were also identified as differentially abundant. No species were found to be significantly differentially abundant between time point levels.

Table 3-1. Significantly differentially abundant core bacterial species
Significantly differentially abundant (p < 0.05) core bacterial species^a,b between Product levels identified using ZicoSeq.
                                                        Soy                   Multigrain
Species                    p            padj^c       μ            σ/μ     μ            σ/μ
Acinetobacter tandoii      1.0 × 10^-4  1.7 × 10^-5  5.4 × 10^-4  2.2     5.0 × 10^-2  0.86
UR^d Acinetobacter 5b17    1.0 × 10^-4  1.7 × 10^-5  8.3 × 10^-4  1.2     3.1 × 10^-2  0.66
UR Acinetobacter 3099      1.0 × 10^-4  1.7 × 10^-5  2.9 × 10^-4  2.7     4.2 × 10^-2  0.72
Acinetobacter bereziniae   1.0 × 10^-4  2.0 × 10^-5  1.8 × 10^-4  1.4     1.5 × 10^-2  0.75
UR Weissella d395          3.0 × 10^-4  3.6 × 10^-4  9.4 × 10^-2  0.88    5.7 × 10^-3  0.97
UR Weissella ac2f          5.0 × 10^-4  8.7 × 10^-4  1.1 × 10^-2  0.94    5.3 × 10^-4  1.5
UR Kurthia cfe3            6.0 × 10^-4  1.6 × 10^-3  1.1 × 10^-3  1.8     1.2 × 10^-2  0.91
Enterococcus italicus      1.5 × 10^-3  1.8 × 10^-3  5.4 × 10^-2  0.44    1.1 × 10^-1  0.41
UR Enhydrobacter 006f      3.1 × 10^-3  5.2 × 10^-3  1.8 × 10^-3  3.8     8.2 × 10^-3  0.76
Lactococcus lactis         1.1 × 10^-2  1.9 × 10^-2  7.2 × 10^-2  0.59    3.1 × 10^-2  0.47

^a Species-agglomerated count data.
^b Core bacteri