This SNFmethodsPaper_Readme.txt file was generated on 2020-11-05 by Katherine Muller. In addition to describing the dataset, it lists each file and its relationship to other files (R inputs and outputs). Column names are described in Muller_CropScience_DataDictionary_2021.csv.

-------------------
# GENERAL INFORMATION
-------------------

1. Title of dataset: Data from: Estimating agronomically relevant symbiotic N fixation in green manure breeding programs

2. Author Information.

- Lead author:
  Katherine Muller
  Postdoctoral Scholar
  Cornell University, School of Integrated Plant Science
  kem325@cornell.edu

- Principal Investigator and corresponding author:
  Laurie Drinkwater
  Professor
  Cornell University, School of Integrated Plant Science
  led24@cornell.edu

- Additional authors:
  Joe Guinness
  Associate Professor
  Cornell University, Department of Statistics and Data Science

  Matthew Hecking
  Former technician in the Drinkwater lab.

3. Date of data collection: 2017-09 to 2018-10

4. Geographic location of data collection.

Hairy vetch samples came from the 2017-2018 Cover Crop Adaptability Trial at the Big Flats Plant Materials Center (BF-PMC). Hairy vetch was seeded on 2017-09-12. All plots are 10' x 10'. Seeds were broadcast by hand, raked, and cultipacked. Bunker Triticale (216.4 g) was used to fill empty spots. There are 7-foot aisles between blocks of plots (each block has 16 plots arranged in two rows).

Crimson clover samples came from the 2017-2018 Cover Crop Demonstration Planting Block at Big Flats PMC. Seeds were planted on 2017-08-30 and plots were 10' x 15'.

Details and results from these two cover crop trials were published as technical notes available online:
USDA-NRCS. 2020. Plant Materials Technical Note no. 2: Evaluation of cool season cover crops in the Northeast Region. Greensboro, NC. https://www.nrcs.usda.gov/wps/portal/nrcs/detail/plantmaterials/technical/publications/?cid=nrcseprd1610414.

5. Information about funding sources that supported the collection of the data.
This work was supported by USDA-NIFA grant #2015-51300-24192, with additional support from USDA-NIFA grant #2018-51106-28778. Guinness was supported by the National Science Foundation under grant No. 1916208 and the National Institutes of Health under grant No. R01ES027892.

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

1. Licenses/restrictions placed on the data: CC0

2. Links to publications that cite or use the data:
Muller, K.E., Guinness, J., Hecking, M. and Drinkwater, L.E. (2021), Estimating agronomically relevant symbiotic N fixation in green manure breeding programs. Crop Sci. Accepted Author Manuscript. https://doi.org/10.1002/csc2.20517

3. Was data derived from another source? No.

4. Recommended citation for the data:
Katherine Muller, Joseph Guinness, Matthew Hecking & Laurie Drinkwater. (2021) Data from: Estimating agronomically relevant symbiotic N fixation in green manure breeding programs [Dataset] Cornell University eCommons Repository. https://doi.org/10.7298/vrd4-4k23

Thumbnail photos taken by Sandra Wayman.

---------------------
DATA & FILE OVERVIEW
---------------------

## The file list is organized by scripts, with associated data files for each script.

All data files are cleaned-up versions combined from raw xls files from the UC Davis Stable Isotope Facility and xls files of field data created by Matthew Hecking.

### Codes used in file names:
- Plant species: HV = hairy vetch, CC = crimson clover.
- Type of data: 15N = data from the UC Davis Stable Isotope Facility, including natural abundance of the 15N isotope.
- refs = reference grasses used for estimating the proportion of N acquired from SNF via 15N natural abundance.
- LCCB2018: indicates the USDA project title (Legume Cover Crop Breeding Program) and year of data collection, respectively.

### Scripts and associated data files:

1. setUpData_CS.R
This script reads in cleaned-up datasets and combines them into R objects that are used in other scripts.
It is run at the beginning of all other scripts.

Input data files:
- combinedCCdataByPlant_LCCB2018.csv
- combinedHVdataByPlant_LCCB2018.csv
- LCCB2018CC_vigorGrowthStage.csv
- LCCB2018HV_vigorGrowthStage.csv
- longFormatCC15Ndata.csv
- longFormatHV15Ndata.csv
- refsForCC_LCCB2018.csv
- refsForHV_LCCB2018.csv

R objects used in other scripts:
- ccmonsterdf and hvmonsterdf: combined data for crimson clover and hairy vetch, respectively, with one row per plant.
- ccmonsterdf_laterplants: complete data for a batch of crimson clover plants that were sampled later than everything else (excluded from manuscript analysis).
- CC15N and HV15N: stable isotope data for crimson clover and hairy vetch, respectively. Data are in the "long" format with one row for each of the 3 measurements taken on the same plant.
- vkc and vkh: long-format data on vigor and growth stages for crimson clover and hairy vetch, respectively. There is one row for each observation taken in the field (about 5 observations per plant). Vigor ratings are averaged for ccmonsterdf and hvmonsterdf.
- ccrefs and hvrefs: stable isotope and other information for reference wheat plants used to measure the 15N signature of the field.

2. setUpForFigures_CS.R
This script sets up data and analysis results for all figures and is sourced in each figure script. It runs the script setUpData_CS.R and uses analysis results from getTheorValAndCrossVal_20200310.R.

3. patternsOfSNFVariation.R
This script provides summary statistics for patterns of trait variation in hairy vetch and crimson clover. It runs setUpData_CS.R and uses the packages nlme and lme4.

4. getTheorValAndCrossVal_20200310.R
This script runs MCMC models and makes predictions using subsets of MCMC parameters representing different combinations. Functions are loaded from the script MCMCmodel/gibbsfunctions.R.
It saves output into RData and CSV files:
- MCMCmodel/MCMCoutputForHV_20200317.RData
- MCMCmodel/MCMCoutputforCC_20200317.RData
- MCMCmodel/crossValForHVandCC20200310.RDATA
- MCMCmodel/resultsTableForNdfa20200311.csv
- MCMCmodel/resultsTableForTotalN20200311.csv
- MCMCmodel/HV.Ndfa.modeltable.csv
- MCMCmodel/HV.totalN.modeltable.csv
- MCMCmodel/CC.Ndfa.modeltable.csv
- MCMCmodel/CC.totalN.modeltable.csv

5. modelComparisonsWithCrossVal.R
This script does pairwise comparisons between models with different sets of predictors. Results are presented in Supplementary Table S2.

6. MCMCmodel/gibbsfunctions.R
This script contains code for our Gibbs sampler with a multivariate normal model, developed by Joe Guinness. It also contains functions to calculate summary statistics from MCMC models, to perform cross validation, and to calculate theoretical prediction variances.

7. is2vigorRatingsEnough.R
This script contains code used to address the question of whether the common practice of using two visual vigor ratings as a proxy for plant biomass could be improved by adding additional vigor ratings (described in the Discussion section of the manuscript).

8. Scripts for Figures and Tables:
Separate scripts create each figure in the main text and supplement. Tables were formatted in Microsoft Word.
- figure1_regressions.R
- figure2_correlationhist.R
- figure3_modelcomparison.R
- figureS1samplingDiagram.R
- figureS2_refd15N.R
- figureS3_totalNvsNdfaAndDW.R
- figureS4_varRatioTest.R
- figureS5_vigorRatings.R
- tableS1_cormat.R: saves contents of Table S1 to a CSV to be formatted in MS Word.

######################################
## Description of data files

All data files used for analysis are grouped in the subdirectory cleanedUpDataFiles/

1. combinedCCdataByPlant_LCCB2018.csv
Contains all data for crimson clover, including information on plants, SNF measurements, biomass measurements, and data collected in the field.

2. combinedHVdataByPlant_LCCB2018.csv
Contains all data for hairy vetch, including information on plants, SNF measurements, biomass measurements, and data collected in the field.

Information on plants
"plantID": unique identifier for each plant, containing the plotID.
"plotID"
"cultivarCode"
"easting.ft", "northing.ft": coordinates from the south-west corner of the experimental plot, in ft.

SNF measurements in the Meristem, Wedge, Biomass, and Seed samples. Raw d15N measurements were adjusted by internal standards for consistency between runs (.adj).
"percentNbymass_MER"
"d15N.adj_MER"
"percentNbymass_WEDGE"
"d15N.adj_WEDGE"
"percentNbymass_BIOMASS"
"d15N.adj_BIOMASS"
"percentNbymass_SEED"
"d15N.adj_SEED"

Other measurements
M = Meristem, W = Wedge, S = Seed, PB = Biomass

Sampling dates (calendar date and Days After Sowing, DAS).
"MWdate"
"PBdate"
"Sdate"
"MW.DAS"
"PB.DAS"
"S.DAS"

Dry weight of samples (in g).
"Mdwt.g"
"Wdwt.g"
"PBdwt.g": whole plant biomass at 50% flowering
"SPBdwt.g": whole plant biomass at seed harvest
"Sdwt.g": seed biomass

Stem counts and vigor ratings at sampling.
"W.StemCount"
"PB.StemCount"
"MW.Vig"
"PB.Vig"
"S.Vig"

Growth stage of plants at sample collection (using the Kalu and Fick scale).
"MW.GrowthStage"
"PB.GrowthStage"
"S.GrowthStage"

Time at which plants reached late flowering (Kalu and Fick stage 6).
"dateOfKF6": calendar date
"timeToKF6_DAS": days after sowing.

Derived values.
"meanvigor": qualitative vigor rating averaged over 6 observations.
"gN_BIOMASS": total biomass N per plant, calculated from %N and plant biomass (PBdwt.g).
"gN_SEED": total seed N per plant, calculated from %N and seed biomass (Sdwt.g).
"harvestindex_percent": seed biomass as a percentage of total biomass at seed harvest.

3. LCCB2018CC_vigorGrowthStage.csv
4. LCCB2018HV_vigorGrowthStage.csv
5. longFormatCC15Ndata.csv
6. longFormatHV15Ndata.csv
7. refsForCC_LCCB2018.csv
8. refsForHV_LCCB2018.csv
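
The derived columns described for the combined by-plant files (gN_BIOMASS, gN_SEED, harvestindex_percent) can be illustrated with a minimal sketch. It is written in plain Python for illustration only (the dataset's own scripts are in R) and is not the authors' code. It assumes that the percentNbymass_* columns are percentages on a 0-100 scale and that "total biomass at seed harvest" corresponds to SPBdwt.g. The function name derive_columns and the example values are hypothetical.

```python
def derive_columns(plant):
    """Return derived per-plant values from one record of numeric fields.

    Assumption: percentNbymass_* values are percentages (0-100) and all
    dry weights are in grams, as the column names suggest.
    """
    # Total N (g) = fraction of N by mass * dry weight of the sample (g).
    g_n_biomass = plant["percentNbymass_BIOMASS"] / 100.0 * plant["PBdwt.g"]
    g_n_seed = plant["percentNbymass_SEED"] / 100.0 * plant["Sdwt.g"]
    # Assumption: the denominator is whole plant biomass at seed harvest (SPBdwt.g).
    harvest_index = plant["Sdwt.g"] / plant["SPBdwt.g"] * 100.0
    return {
        "gN_BIOMASS": g_n_biomass,
        "gN_SEED": g_n_seed,
        "harvestindex_percent": harvest_index,
    }


# Hypothetical values for one plant; these are NOT taken from the dataset.
example_plant = {
    "percentNbymass_BIOMASS": 3.0,
    "PBdwt.g": 20.0,
    "percentNbymass_SEED": 5.0,
    "Sdwt.g": 2.0,
    "SPBdwt.g": 25.0,
}

derived = derive_columns(example_plant)
for name, value in derived.items():
    print(f"{name}: {value:.3f}")
```

Comparing values computed this way against the gN_BIOMASS, gN_SEED, and harvestindex_percent columns already present in the CSV files is a simple way to check whether these assumptions match the published data.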