This Reichler_JDS_2020_readme.txt file was generated on 2020-12-03 by Samuel Reichler, edited on 2021-03-24 by Sarah Wright


GENERAL INFORMATION

1. Title of Dataset: Supplemental Data for "Identification, subtyping, and tracking of dairy spoilage-associated Pseudomonas by sequencing the ileS gene"

2. Author Information
   A. Principal Investigator Contact Information
      Name: Martin Wiedmann
      Institution: Cornell University
      Address: 341 Stocking Hall, Ithaca, NY 14853
      Email: martin.wiedmann@cornell.edu

   B. Associate or Co-investigator Contact Information
      Name: Samuel Reichler
      Institution: Cornell University
      Address: 326 Stocking Hall, Ithaca, NY 14853
      Email: sjr267@cornell.edu

   C. Alternate Contact Information
      Name: Nicole Martin
      Institution: Cornell University
      Address: 326 Stocking Hall, Ithaca, NY 14853
      Email: nhw6@cornell.edu

3. Date of data collection (date range): 2015-07-16 to 2019-07-26

4. Geographic location of data collection: Ithaca, NY, USA

5. Information about funding sources that supported the collection of the data: National Dairy Council (Rosemont, IL; OSP # 73849), bioMérieux USA (Hazelwood, MO; OSP # 83959), New York State Dairy Promotion Advisory Board (Albany, NY; OSP # 83562)


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: CC0

2. Links to publications that cite or use the data: Reichler, S. J., S. I. Murphy, N. H. Martin, and M. Wiedmann. 2021. Identification, subtyping, and tracking of dairy spoilage-associated Pseudomonas by sequencing the ileS gene. J. Dairy Sci. 104(3):2668-2683. https://doi.org/10.3168/jds.2020-19283.

3. Recommended citation for this dataset: Samuel J. Reichler, Sarah I. Murphy, Nicole H. Martin, and Martin Wiedmann. (2020) Supplemental data for: Identification, subtyping, and tracking of dairy spoilage-associated Pseudomonas by sequencing the ileS gene. Cornell University eCommons Repository. https://doi.org/10.7298/rekh-4w25


DATA & FILE OVERVIEW

1. File List:
   A.
      Filename: Reichler_JDS_2020_readme.txt
      Short description: Metadata for this dataset

   B.
      Filename: Reichler_JDS_2020_FullSupplementalData.pdf
      Short description: Contains Supplemental Tables 1, 2, and 3; Supplemental Figure 1A; and Supplemental Figure 1B, as described below

   C.
      Filename: Reichler_JDS_2020_Table1_PCRmix.csv
      Short description: Reagent volumes for 7-gene multilocus sequence typing polymerase chain reaction (PCR) mix

   D.
      Filename: Reichler_JDS_2020_Table2_WGSmetadata.csv
      Short description: Whole-genome sequencing metadata and National Center for Biotechnology Information (NCBI) database accession identifiers

   E.
      Filename: Reichler_JDS_2020_Table3_ANIbDistanceMatrix.csv
      Short description: An average nucleotide identity by BLAST (ANIb) distance matrix

   F.
      Filename: Reichler_JDS_2020_Figure1A.svg
      Short description: Annotated phylogenetic tree of Pseudomonas type strains based on 120 single-copy housekeeping genes (Supplemental Figure 1A)

   G.
      Filename: Reichler_JDS_2020_Figure1B.svg
      Short description: Annotated phylogenetic tree of Pseudomonas type strains based on a partial ileS gene sequence (Supplemental Figure 1B)

   H.
      Filename: Reichler_JDS_2020_Fig1Aand1B.nex
      Short description: NEXUS file containing the phylogenetic trees featured in Supplemental Figures 1A and 1B, the sequence alignments used to construct them, and the Pseudomonas species groups and subgroups assigned to each type strain isolate in both phylogenies

2. Relationship between files, if important: Reichler_JDS_2020_FullSupplementalData.pdf contains all supplemental data for this project in PDF format; the other files are in formats intended to enable reuse of the data.

3. Additional related data collected that was not included in the current data package: n/a

4. Are there multiple versions of the dataset? No


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:
   Pseudomonas bacterial isolates were collected from pasteurized milk samples from the Northeast USA.
   Isolates were subjected to 16S rDNA PCR and Sanger sequencing for preliminary identification, then ileS PCR and Sanger sequencing for further identification and subtyping. A subset of Pseudomonas isolates was whole-genome sequenced on the Illumina platform. Type strain Pseudomonas spp. whole-genome sequences were downloaded from NCBI GenBank. Full methods are described in the associated publication (https://doi.org/10.3168/jds.2020-19283).

2. Methods for processing the data:
   The ANIb distance matrix was generated from Pseudomonas whole-genome sequences using pyani version 20171222 (https://github.com/widdowquinn/pyani). The maximum-likelihood phylogenetic trees in Supplemental Figures 1A and 1B were generated using RAxML version 8.2.12 (https://doi.org/10.1093/bioinformatics/btu033), rendered using FigTree version 1.4.4 (https://github.com/rambaut/figtree), and annotated using Inkscape version 0.92.4 (https://inkscape.org/).

3. Instrument- or software-specific information needed to interpret the data:
   The NEXUS file can be interpreted using Mesquite version 3.61 (https://github.com/MesquiteProject/MesquiteCore) or other compatible software.

4. Standards and calibration information, if appropriate: n/a

5. Environmental/experimental conditions: n/a

6. Describe any quality-assurance procedures performed on the data:
   Sanger sequences generated for Pseudomonas bacterial isolates were assigned per-base quality scores using KB Basecaller software, and sequence assemblies were further assessed for quality using Sequencher version 5.4.5 (http://www.genecodes.com/). Whole-genome sequencing raw reads were checked for contamination using Kraken2 (https://github.com/DerrickWood/kraken2/) and MetaPhlAn version 2.0 (https://github.com/biobakery/MetaPhlAn), and assemblies were checked for quality using QUAST version 4.0 (https://github.com/ablab/quast).

7.
   People involved with sample collection, processing, analysis and/or submission:
      Collection and Processing: Samuel Reichler, Alexander Alles
      Analysis: Samuel Reichler, Alexander Alles, Sarah Murphy, Renato Orsi, Nicole Martin, Martin Wiedmann
      Submission: Samuel Reichler, Renato Orsi


DATA-SPECIFIC INFORMATION FOR: Reichler_JDS_2020_Table3_ANIbDistanceMatrix.csv

Description: Supplemental Table 3. ANIb distance matrix for 27 newly sequenced Pseudomonas genomes and 8 related type strain genomes.

1. Number of variables: 35

2. Number of cases/rows: 35

3. Variable List:
   FSL A6-1183
   FSL R10-0056
   FSL R10-0071
   FSL R10-0072
   FSL R10-0399
   FSL R10-0765
   FSL R10-0802
   FSL R10-1339
   FSL R10-1350
   FSL R10-1371
   FSL R10-1594
   FSL R10-1637
   FSL R10-1876
   FSL R10-1984
   FSL R10-2107
   FSL R10-2172
   FSL R10-2189
   FSL R10-2216
   FSL R10-2245
   FSL R10-2339
   FSL R10-2342
   FSL R10-2398
   FSL R10-2932
   FSL R10-2940
   FSL R10-2964
   FSL R10-3254
   FSL R10-3257
   P. weihenstephanensis DSM 29166
   P. taetrolens DSM 21104
   P. psychrophila DSM 17535
   P. lundensis DSM 6252
   P. helleri DSM 29165
   P. fragi NBRC 3458
   P. endophytica BSTT44
   P. deceptionensis DSM 26521

4. Missing data codes: n/a

5. Specialized formats or other abbreviations used: n/a


DATA-SPECIFIC INFORMATION FOR: Reichler_JDS_2020_FullSupplementalData.pdf

Description: Contains Supplemental Tables 1, 2, and 3; Supplemental Figure 1A; and Supplemental Figure 1B, in PDF format, checked for accessibility by PAVE. This file contains all supplemental data for this project in PDF format; the additional files are in formats intended to enable reuse of the data.


DATA-SPECIFIC INFORMATION FOR: Reichler_JDS_2020_Figure1A.svg

Description: Annotated phylogenetic tree of Pseudomonas type strains (Supplemental Figure 1A) in SVG format. The file includes the following metadata fields: title, date, creator, rights, publisher, language, keywords, description, and contributors.
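
Note on reusing Reichler_JDS_2020_Table3_ANIbDistanceMatrix.csv: the sketch below shows one way a labeled 35 x 35 matrix of this kind could be read with the Python standard library. It assumes, without confirmation from this readme, that the first row holds the column labels, that the first cell of each later row holds the row label, and that smaller values mean more similar genomes. The function names and the toy data are hypothetical and are not taken from the dataset.

```python
import csv
import io


def load_square_matrix(handle):
    """Read a labeled square matrix: a header row of labels, then one labeled row per genome."""
    reader = csv.reader(handle)
    header = next(reader)
    col_labels = [name.strip() for name in header[1:]]  # skip the corner cell
    matrix = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        row_label = row[0].strip()
        values = [float(cell) for cell in row[1:]]
        if len(values) != len(col_labels):
            raise ValueError(f"row {row_label!r} has {len(values)} values, expected {len(col_labels)}")
        matrix[row_label] = dict(zip(col_labels, values))
    if set(matrix) != set(col_labels):
        raise ValueError("row and column labels differ; matrix is not square")
    return col_labels, matrix


def rank_neighbors(matrix, query):
    """Return the other genomes sorted by their value against `query`, smallest first."""
    others = [(value, name) for name, value in matrix[query].items() if name != query]
    return [name for value, name in sorted(others)]


# Toy stand-in data (made up for illustration, not values from the dataset).
toy = io.StringIO(
    ",genome_A,genome_B,genome_C\n"
    "genome_A,0.00,0.02,0.15\n"
    "genome_B,0.02,0.00,0.14\n"
    "genome_C,0.15,0.14,0.00\n"
)
labels, matrix = load_square_matrix(toy)
print(rank_neighbors(matrix, "genome_C"))
```

To try this on the real file, the toy handle could be replaced with open("Reichler_JDS_2020_Table3_ANIbDistanceMatrix.csv", newline=""); if the file layout differs from the assumptions above, the label handling would need adjusting.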


DATA-SPECIFIC INFORMATION FOR: Reichler_JDS_2020_Figure1B.svg

Description: Annotated phylogenetic tree of Pseudomonas type strains (Supplemental Figure 1B) in SVG format. The file includes the following metadata fields: title, date, creator, rights, publisher, language, keywords, description, and contributors.


DATA-SPECIFIC INFORMATION FOR: Reichler_JDS_2020_Fig1Aand1B.nex

Short description: NEXUS file containing the phylogenetic trees featured in Supplemental Figures 1A and 1B, the sequence alignments used to construct them, and the Pseudomonas species groups and subgroups assigned to each type strain isolate in both phylogenies. NEXUS files are plain text and can be read with the free software package Mesquite. The file format and file contents match those used by TreeBASE.

Long description:

Supplemental Figure 1A. Maximum likelihood phylogenetic tree of 178 Pseudomonas spp. type strains based on concatenated partial amino acid (AA) sequences of 120 single-copy housekeeping genes. Cellvibrio japonicus Ueda 107 was used as an outgroup to root this tree. This tree is drawn to scale, and the scale bar represents 0.1 AA substitutions per site. The numbers located next to the nodes represent node confidence expressed as the percentage of 100 bootstrap replicates. Pseudomonas spp. group and subgroup assignments can be found in "Supplemental Figures 1A and 1B Pseudomonas spp. type strain species and subspecies groups." The amino acid alignment used to construct this tree can be found in "Supplemental Figure 1A amino acid sequence alignment for phylogenetic tree construction."

Supplemental Figure 1B. Maximum likelihood phylogenetic tree of 178 Pseudomonas spp. type strains based on a nucleotide sequence of approximately 552 bp from within the ileS gene. Cellvibrio japonicus Ueda 107 was used as an outgroup to root this tree. This tree is drawn to scale, and the scale bar represents 0.2 nucleotide substitutions per site.
The numbers located next to the nodes represent node confidence expressed as the percentage of 100 bootstrap replicates. Pseudomonas spp. group and subgroup assignments can be found in "Supplemental Figures 1A and 1B Pseudomonas spp. type strain species and subspecies groups." The nucleotide alignment used to construct this tree can be found in "Supplemental Figure 1B ileS DNA sequence alignment for phylogenetic tree construction."
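
Note on inspecting Reichler_JDS_2020_Fig1Aand1B.nex without Mesquite: because NEXUS files are plain text organized into BEGIN ... END; blocks, a quick inventory of the blocks can be taken with a few lines of Python. The sketch below is a simple line scan, not a full NEXUS parser; it does not handle bracketed comments, and the function name and toy file content are hypothetical, not taken from the dataset.

```python
import re

# A NEXUS block opens with "BEGIN <name>;" (keywords are case-insensitive).
BLOCK_RE = re.compile(r"^\s*BEGIN\s+(\w+)\s*;", re.IGNORECASE | re.MULTILINE)


def list_nexus_blocks(text):
    """Return the block names found in NEXUS text, in file order, upper-cased."""
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("missing #NEXUS header")
    return [name.upper() for name in BLOCK_RE.findall(text)]


# Toy stand-in content (made up for illustration, not taken from the dataset).
toy = """#NEXUS
BEGIN TAXA;
    DIMENSIONS NTAX=2;
    TAXLABELS strain_one strain_two;
END;
BEGIN CHARACTERS;
    DIMENSIONS NCHAR=4;
    FORMAT DATATYPE=DNA;
    MATRIX
    strain_one ACGT
    strain_two ACGA
    ;
END;
BEGIN TREES;
    TREE toy_tree = (strain_one,strain_two);
END;
"""
blocks = list_nexus_blocks(toy)
print(blocks)
```

For the real file, the text could be obtained with open("Reichler_JDS_2020_Fig1Aand1B.nex").read(). The alignments and trees themselves are better read with Mesquite or another NEXUS-aware tool, as noted in the methodological information.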