Improved Near Infrared Analysis Method for Bovine Milk 

 
A Thesis 

Presented to the Faculty of the Graduate School 

of Cornell University 

In Partial Fulfillment of the Requirements for the Degree of 

Master of Science  

 
by 

Allison Spillane 

August 2025 

 
© 2025 Allison Spillane 

 
ABSTRACT 

 
This research applies rigorous analytical chemistry techniques to improve

near infrared (NIR) predictive modeling of bovine milk. This is done with orthogonal

sample set design as well as robust reference chemistry. A method for improving the

accuracy of enzymatic assays used as chemical reference methods for measurement

of lactose and milk urea nitrogen (MUN) concentration in milk through measurement 

and certification of cuvette path length was developed using a confocal displacement 

sensor. This new, nondestructive method of measuring cuvette path length

eliminates the need for potassium chromate. Partial least squares predictive

models for homogenized and unhomogenized milks were created for measurement of 

the concentration of fat, protein, lactose, and total solids using a commercial NIR 

instrument. The external validation performance of the NIR prediction models 

developed in our study exceeded that of all previously published NIR prediction models for

fat, protein, lactose, and total solids. These methods will aid the possible

implementation of NIR milk analysis and of rapid in-line milk analysis.

 

 
BIOGRAPHICAL SKETCH 

 
Allison Spillane was born and raised in Illinois and attended Cary-Grove High School. 

She attended the University of Illinois at Urbana-Champaign where she received her 

Bachelor of Science of Liberal Arts and Sciences in Chemistry with a minor in Food 

Science. In her junior year she developed a love of analytical chemistry of food while 

working in Dr. Cadwallader’s lab, and on his recommendation began to work in the

Integrated Bioprocessing Research Lab where she was trained in pilot scale 

downstream processing. Allison graduated a semester early in 2022 and began 

working full time in the processing plant while applying to graduate programs. Her 

experience granted her the opportunity to pursue her graduate degree in the Barbano 

lab at Cornell University. Outside of academics, Allison is the social media manager 

for the Big Red Brewing Club, an avid artist, and lover of video games.  

 

 
Dedicated to my parents, who have made this dream possible. 

 

 
 ACKNOWLEDGMENTS 

I would like to thank my family, who have supported me in every step of my 

journey to and through my graduate education. I also want to thank my friends who 

have kept me grounded and made this journey enjoyable as well as survivable. Thanks 

also go out to my many mentors and coworkers that have allowed my work to reach its 

true potential, including the entirety of the Barbano Research Group. 

I also want to acknowledge my committee, Dr. David M. Barbano and Dr. 

Christopher Wolf, who have graciously mentored and guided me through this stage of 

my life and enabled this research.  

 

 
TABLE OF CONTENTS 

 
ABSTRACT

BIOGRAPHICAL SKETCH

DEDICATION

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

CHAPTER I  INTRODUCTION

CHAPTER II  METHOD DEVELOPMENT FOR OPTICAL CERTIFICATION OF CUVETTE PATH LENGTH

CHAPTER III  NEAR INFRARED MILK COMPONENT ANALYSIS MODELS

CHAPTER IV  CONCLUSION AND FUTURE WORK

 

 
LIST OF FIGURES 

 
Figure 2.1  Confocal optical scanning apparatus for measuring the path length of 10 mm cuvettes.

Figure 2.2  A confocal scanning apparatus to measure path length. The large vertical lines (panel to the right) indicate where the light beam encounters the solid polystyrene side walls of the cuvette and where it encounters the air gap of the cuvette. The locations of point 2 and point 3 (i.e., the air to polystyrene interfaces) are determined, and the difference between them is the path length of the cuvette in mm.

Figure 2.3  Path length scan of a typical cuvette from cuvette suppliers 1 through 6. The horizontal line halfway between the two polystyrene walls is the center path length of the cuvette.

Figure 2.4  The impact of an unaccounted-for variation of +/- 2% in the relative path length (RPL) of cuvettes used for the determination of the lactose concentration in milk.

Figure 2.5  The impact of an unaccounted-for variation of +/- 2% in the relative path length (RPL) of cuvettes used for the determination of the milk urea nitrogen (MUN) concentration in milk.

Figure 3.1  Homogenized milk model predicted (X axis) versus reference chemistry (Y axis) graph for each of the four major components: fat, protein, lactose, and total solids.

Figure 3.2  Unhomogenized milk model predicted (X axis) versus reference chemistry (Y axis) graph for each of the four major components: fat, protein, lactose, and total solids.



 
LIST OF TABLES 

 
Table 2.1  Mean cuvette path length and cuvette path length variation within a box of 100 polystyrene cuvettes purchased from each of six different suppliers.

Table 2.2  Number of cuvettes sharing the same mold number in one box of 100 cuvettes from Supplier 4.

Table 2.3  Mean path length measurement of 8 quartz cuvettes (3 measurements per cuvette) from two different manufacturers (cuvettes 1 through 6 are from manufacturer 1 and cuvettes 7 and 8 are from manufacturer 2).

Table 2.4  The impact of an unaccounted-for variation of +/- 2% in the relative path length (RPL) of cuvettes used for the determination of the anhydrous lactose concentration (g/100 g) in milk.

Table 2.5  Impact of an unaccounted-for variation of +/- 2% in the relative path length (RPL) of cuvettes used on milk urea nitrogen (MUN) measurements (mg/100 g milk).

Table 3.1  Model structure parameters and modeling metrics (i.e., R-square, RMSECV, and RPD) for prediction of fat, anhydrous lactose, true protein, and total solids concentration of milk when an in-line homogenizer was in the NIR flow system.

Table 3.2  Model structure parameters and modeling metrics (i.e., R-square, RMSECV, and RPD) for prediction of fat, anhydrous lactose, true protein, and total solids concentration of milk when no in-line homogenizer was in the NIR flow system.

Table 3.3  External validation performance metrics [standard error of prediction (SEP) and mean difference (MD)] for NIR PLS models with (NIR H) and without (NIR NH) an in-line homogenizer and a comparison to a mid-infrared (MIR) milk analyzer calibrated with the same samples and measuring the same components on the set of validation milks from 48 individual farms.



 
CHAPTER 1 

INTRODUCTION  

Mid-infrared (MIR) transmission spectrophotometry is used for most of the 

high-speed testing of large numbers of milk samples in the dairy industry. Its speed of

up to 600 samples per hour, accuracy, and ability to measure the major components in 

milk (i.e., fat, protein, and lactose) and some minor components in milk (i.e., milk 

urea nitrogen and various groups of milk fatty acids) have made MIR the method of 

choice for milk analysis world-wide. Near-infrared (NIR) reflectance 

spectrophotometry has been applied extensively in the dairy industry to measure fat 

and protein in solid dairy products (i.e., cheese, milk, and whey powders). While 

slower than the liquid analyzers, application of NIR reflectance generally does not 

require any sample preparation other than mixing and blending and the NIR 

measurements have sufficient speed and accuracy for the purpose of product 

composition analysis for off-line process control. As the scale of dairy product

manufacturing plants has increased, interest in in-line measurement of the main milk

components has increased. Measuring composition of a flowing liquid in real-time is a

challenge for any analytical technology. MIR has high accuracy, but the sensitivity

of the MIR optical system and light transmission cuvettes to the harsh

environment of a fluid product processing line (e.g., milk or whey) and the lack of

technology for remote sensing using MIR have limited MIR’s application for process

control. The NIR optical systems are more developed for remote sensing, but NIR

prediction models for liquid milk and whey analysis (for fat, protein, and lactose) have 

not been of sufficient accuracy for this application. There is a need to improve the 


accuracy of calibration reference chemistry, particularly for lactose, and NIR milk 

component measurement accuracy. Development and performance validation of more

accurate NIR transmission methods for measurement of fat, protein, and lactose in

milk and whey are needed for routine laboratory analysis of milk and whey. If this

can be achieved, new opportunities for in-line application of NIR for more

accurate analysis of liquid milk and whey products during processing may become

feasible.

Reference Methods for Major Milk Components 

The Method Validation Process 

Application of infrared spectroscopy for milk testing requires robust reference 

chemistry of milk samples used both for modeling and for calibration adjustment of

prediction models (Lynch, 2006).  The method validation process differs slightly 

between overseeing entities, but all have defined requirements for the creation and 

acceptance of a new reference method.  

The AOAC (Association of Official Analytical Chemists) outlines its

collaborative study procedure for performance validation of analytical methods 

(AOAC International, 2023, Appendix D). Official methods groups like AOAC

validate the performance of methods; they do not designate which methods are used for

regulatory purposes. It is the responsibility of regulatory agencies to designate which 

methods they use and accept as official reference methods. In the US, the USDA 

Federal Milk Market Administration is responsible for the accuracy of payment testing

and designates which methods it uses as official reference chemistry. That

decision and selection of a method by a regulatory agency considers method


performance data evaluated by methods organizations, such as AOAC, IDF

(International Dairy Federation), and ISO (International Organization for Standardization).

Requirements for an AOAC collaborative study of a method’s performance include a 

minimum of 5 sample materials, 8 laboratories for full validation, and two blind 

replicates per material. The process starts with one initial laboratory (i.e., study 

director) that determines the purpose of the study and conducts ruggedness testing of 

the method to determine sensitivity of method results to small changes in procedure 

(AOAC International, 2023, Appendix D).  The cost of conducting a collaborative 

study and the fee for the service of the methods agency to review and publish the 

method is paid for by the study director’s laboratory, or the group of collaborating 

laboratories. A collaborative study protocol and proposal for conducting the

collaborative study must be written by the study director and submitted to the official

methods agency (e.g., AOAC, IDF, or ISO) for review by an expert panel, revision, and

approval before the collaborative study is conducted. Once the protocol is approved,

then the study director sends a letter of request for participation to qualifying labs 

followed by proper data reporting forms. It is normal to do one or more practice 

sample testing rounds with uncoded samples to ensure the method is working correctly 

in all participating laboratories. Once the participating laboratories are set, sample

materials are prepared, split, blind coded, and tested for homogeneity in the study

director’s laboratory before being sent to participating laboratories with instructions

and data collection sheets. The laboratories then test the samples and submit their

results with documentation. Once the data are collected, the study director conducts statistical analysis

using software provided by AOAC to remove statistical outliers and calculate the


required within and between laboratory performance parameters. Once all analysis is 

complete, a collaborative study report is submitted to the method agency for review by 

the expert panel. If the outcome is approved, the study director writes a collaborative

study report in the format required by the official methods agency. Once approved, the 

method becomes a performance validated method. 
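The within- and between-laboratory performance parameters mentioned above can be illustrated with a minimal Python sketch. It assumes the simplest case described in the text (one material, two blind replicates per laboratory) and uses a one-way random-effects analysis of variance. The function name `collaborative_stats` and the fat results are hypothetical, outlier removal is omitted, and this is not the software provided by AOAC.

```python
from statistics import mean

def collaborative_stats(duplicates):
    """Estimate repeatability (s_r) and reproducibility (s_R) for one material.

    duplicates: list of (rep1, rep2) blind-duplicate results, one pair per laboratory.
    Uses a one-way random-effects ANOVA with two replicates per laboratory.
    """
    n_labs = len(duplicates)
    lab_means = [(a + b) / 2 for a, b in duplicates]
    grand_mean = mean(lab_means)
    # Within-laboratory mean square, computed from the duplicate differences.
    ms_within = sum((a - b) ** 2 for a, b in duplicates) / (2 * n_labs)
    # Between-laboratory mean square (factor 2 = replicates per laboratory).
    ms_between = 2 * sum((m - grand_mean) ** 2 for m in lab_means) / (n_labs - 1)
    # Between-laboratory variance component, truncated at zero.
    var_lab = max(0.0, (ms_between - ms_within) / 2)
    s_r = ms_within ** 0.5
    s_R = (ms_within + var_lab) ** 0.5
    return s_r, s_R

# Hypothetical fat results (g/100 g) from 8 laboratories, for illustration only.
fat_results = [(3.51, 3.52), (3.49, 3.50), (3.53, 3.52), (3.50, 3.50),
               (3.48, 3.49), (3.52, 3.54), (3.51, 3.50), (3.50, 3.51)]
s_r, s_R = collaborative_stats(fat_results)
print(f"s_r = {s_r:.4f}, s_R = {s_R:.4f}")
```

Because the reproducibility variance is the repeatability variance plus a non-negative between-laboratory component, s_R can never be smaller than s_r in this sketch.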

The IDF (International Dairy Federation) and ISO (International Organization 

for Standardization) follow similar processes for standardization, with some subtle 

differences and preferences. William Horwitz outlined the few differences between the 

ISO and AOAC collaborative study protocols, including minor terminology 

differences in certain design criteria, and procedures for removal of outliers (Horwitz, 

1986). 

Validated Reference Methods for the Major Milk Components 

The validated methods used for direct dairy milk analysis of the major milk 

components for payment testing in the US are set by the USDA Federal Milk Markets. 

Fat is quantified using a modified Mojonnier ether extraction (AOAC 2023, method

989.05), true protein is quantified by Kjeldahl true protein nitrogen analysis (AOAC

2023, method 991.22), anhydrous lactose is quantified by enzymatic assay and 

spectrophotometric measurement (AOAC 2023, method 2006.06), and total solids is 

quantified by an atmospheric forced-air oven drying procedure (AOAC 2023, method 

990.20).     

A factor that influences analytical performance that is unique to reference 

method testing for lactose (and milk urea nitrogen) using enzymatic 

spectrophotometric analysis is the impact of accuracy and uniformity of the path 


length of disposable cuvettes used in these methods. The current AOAC 

spectrophotometric method for lactose measurement (AOAC 2023, method 2006.06) 

describes a method for checking and controlling for cuvette path length in the method. 

Unfortunately, the method uses chemicals (i.e., potassium chromate) that require 

safety precautions and that are difficult to dispose of. There is a need for a better

method to measure and control the path length of cuvettes used for chemical reference

testing for lactose and milk urea nitrogen. 
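Why cuvette path length matters can be illustrated with a short Python sketch. It assumes that the spectrophotometric result follows the Beer-Lambert law (absorbance proportional to path length) and that the calculation assumes a nominal 10 mm path length. The function name and the lactose value are hypothetical and are not taken from AOAC method 2006.06.

```python
NOMINAL_PATH_MM = 10.0  # assumed nominal cuvette path length

def corrected_concentration(apparent_conc, actual_path_mm,
                            nominal_path_mm=NOMINAL_PATH_MM):
    """Correct a result that was calculated assuming the nominal path length.

    Under the Beer-Lambert law, absorbance is proportional to path length, so a
    cuvette that is x% longer than nominal inflates the apparent result by x%.
    """
    return apparent_conc * nominal_path_mm / actual_path_mm

# Hypothetical milk with a true anhydrous lactose concentration of 4.60 g/100 g.
true_lactose = 4.60
for actual_path in (9.8, 10.0, 10.2):  # -2%, nominal, and +2% path length
    apparent = true_lactose * actual_path / NOMINAL_PATH_MM
    recovered = corrected_concentration(apparent, actual_path)
    print(f"path {actual_path:.1f} mm: apparent {apparent:.3f}, "
          f"corrected {recovered:.3f} g/100 g")
```

In this sketch, an uncorrected 2% path length error moves the apparent lactose result by about 0.09 g/100 g, which is why a certified path length for each cuvette is useful.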

The accuracy, within-laboratory repeatability, and between-laboratory

reproducibility of the chemical reference testing methods set the limits for

what can be achieved by secondary indirect milk testing methods (e.g., Milk-o Testers,

dye binding, and infrared milk analyzers). 

Indirect Methods of Milk Analysis 

Need for Rapid Milk Analysis 

Chemical reference methods are accurate and reliable but are time and labor 

intensive as well as costly. For efficient processing of perishable milk, quicker indirect 

methods are required for large-scale payment and quality testing. Milk analysis 

turnaround is also important for farmers to make informed decisions for herd

management, and today milk payment test results are the most frequent and most

rapidly available data for milk producers (e.g., within 36 to 48 h). The Dairy Herd

Improvement Association offers individualized cow milk analysis to measure 

component values, milk weights, and reproductive status (Dairy One, 2024). While 

this testing is on milk from individual cows, the frequency is low (i.e., once per month

or once per quarter) and it is primarily used for production record keeping and genetic


selection. More rapid tactical day-to-day decision making with in-line milk testing at

the farm is needed, and the dairy industry is currently developing versions of that

technology (e.g., AFI milk, DeLaval Herd Navigator, etc.).  

Milk-o Tester 

The first generation of rapid indirect milk analyzers were based on simple 

visible light scattering to measure fat concentration in milk. Time efficient readings 

allow for milk to be tested before being priced and sold within reasonable timeframes 

for a perishable product. Payment testing ensures fairness for both milk producers and 

consumers. The Milk-o Tester by Foss Electric Co. (Hillerod, Denmark) was the first

instrument used commercially for fat determination by light scattering measurement. The

Milk-o Tester worked on the principle that the amount of light scattering is

proportional to the amount of fat in a prepared sample. The method (AOAC

International 2023, method 960.26) for Milk-o Testers uses in-line homogenization of each

sample by the instrument homogenizer to achieve uniform fat globule size, followed

by the addition of a solution of Na4EDTA and NaOH to eliminate casein-based turbidity.

Samples are automatically passed from the sample container through a 60-degree

water bath, homogenized, combined with EDTA solution, and measured for percent

transmittance of light at 600 nm to predict the fat percentage. A collaborative

study among several laboratories concluded that the Milk-o Tester was as accurate as the

Babcock reference method when properly calibrated, with a standard deviation of

difference between the two methods of +0.077% (Shipe, 1969). With the advent of

the Milk-o Tester, 80 samples could be tested per hour, with subsequent models improving


upon that rate (Shipe 1972). The limitation of the Milk-o Tester technology using light 

scattering is that it can only test for fat content of milk, not protein and solids.   

Dye-Binding 

Protein is the other major component that has high value. One method of total

protein analysis that is faster than the Kjeldahl reference method is

acid dye-binding, in which the positively charged groups in proteins bind colored dyes and the

protein-dye complexes precipitate out of solution. The amount of dye remaining in solution, measurable by

spectrophotometry, is inversely proportional to the protein content of the sample. The 

first report of dye binding being applied to milk for routine protein determination was 

by Udy (1956), using 25 mL of the dye orange G in a buffer solution added to 1.5 mL

fresh milk and measuring absorbance at 470 mµ. The Udy method was validated in a collaborative

study (Luke, 1967) in which samples were preserved with HgCl2. The study concluded that

within-laboratory variation was small (i.e., results were highly repeatable), while

between-laboratory variation was larger (RSD(r) = 0.329, RSD(R) = 0.851). There are two

dye binding methods (AOAC International 2023, methods 967.12 and 975.17). Amido 

Black dye is more popular in Europe, while American institutions typically use 

Orange G/Acid Orange (Sherbon, 1967). Dye binding methods allowed for faster 

analysis than Kjeldahl at that time. As the world-wide production and consumption of 

cheese increased, there was an increased need for more rapid and accurate milk

protein testing because cheese yield is dependent on the protein concentration in milk.  

Mid-Infrared Milk Analysis 

The first instance of MIR for milk component analysis was described in 1964 

(Goulden, 1964) using a dual beam transmittance optical system with two cuvettes, 


one cuvette containing water and the other containing milk. The spectrum of the

reference solution (i.e., water) was subtracted from the spectrum of the sample (i.e., milk). The

original infrared transmittance analyzers used optical filters to take absorbance

readings for specific bands of wavelengths within the 2.5 to 10 μm range (4000 to 1000

cm-1). These were physical optical transmittance filters mounted in a filter wheel

(in pairs consisting of a reference and a sample waveband) that rotated each filter

into the infrared light beam. The filters selected bands of wavelengths

at which known chemical bonds of a major milk component absorb light, and readings were taken for water

and milk, giving a difference in absorbance for a specific component, such

as the fat carbonyl group at 5.723 μm (Smith et al., 1993, 1995). Progressive

improvements were made in the mechanics of the MIR milk analyzers over the years.

These improvements included enclosing the optical system and using desiccant to

keep the moisture content of the air low and consistent in the optical light path; adding an in-line

homogenizer to the milk flow system to reduce sample-to-sample variation in

light scattering by fat globules; using a split beam optical system with one cuvette

instead of a dual beam system with two cuvettes, so that the cuvette path length was the same for the

zeroing and milk solutions; adding in-line heaters to reduce temperature variation at the

cuvette; and adding the ability to adjust the angle of tilt of filters in the filter wheel to

fine-tune the range of transmitted light wavelengths from one optical filter to the next. This

selection of ranges of wavelengths from the IR spectrum before the detector is known

as dispersive spectrometry. This technology was widely used from the mid-1970s

until about 2000 for high-speed testing (100 to 600 samples per hour) for fat,

protein, lactose, and total solids, but throughput was limited by pumping and flow system timing


and the rate at which a filter wheel could rotate 8 optical filters through the infrared

light beam, stopping once at each filter to take an absorbance reading. Research level MIR

instruments were able to run in a scanning mode without optical filters by using a

rotating prism or a slow-moving diffraction grating to select wavelengths and move

them across the detector. These were too expensive and too slow (about 20 min per scan

per sample) to be of practical use for high-speed milk testing, but the concept existed.

 Several technological developments outside of dairy science and milk testing 

produced new opportunities for the evolution of MIR (and NIR) testing technology. 

The key technical advancements were the development and refinement of laser light

sources that could be used to rapidly measure the position of an object (i.e., a moving

mirror) or a location, micro-processing chip and personal computer development that

provided faster and lower cost computing power, and automation for production of miniaturized

more complex circuit boards with pre-programmed chips.  In basic physics the concept

of producing an interferogram with moving mirrors and knowing their exact position 

to systematically vary the distance traveled by light was well known, but the 

computing power required to deconvolute interferograms was cost prohibitive.   
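The relationship between an interferogram and a spectrum can be illustrated with a small Python sketch. It assumes an idealized interferometer in which each spectral line contributes a cosine to the interferogram as the optical path difference changes, and it uses a plain discrete Fourier transform from the standard library rather than any instrument software. The number of sampling points and the two spectral lines are hypothetical.

```python
import cmath
import math

def dft_magnitude(signal):
    """Return the discrete Fourier transform magnitude for bins 0 to N/2."""
    n = len(signal)
    return [abs(sum(signal[k] * cmath.exp(-2j * math.pi * b * k / n)
                    for k in range(n)))
            for b in range(n // 2 + 1)]

N_POINTS = 128  # hypothetical number of mirror positions sampled
# Two hypothetical spectral lines: {frequency bin: relative intensity}.
lines = {10: 1.0, 25: 0.5}

# Each spectral line adds a cosine to the interferogram signal.
interferogram = [sum(amp * math.cos(2 * math.pi * b * k / N_POINTS)
                     for b, amp in lines.items())
                 for k in range(N_POINTS)]

# The Fourier transform recovers the positions and relative heights of the lines.
spectrum = dft_magnitude(interferogram)
peaks = sorted(sorted(range(len(spectrum)), key=lambda b: spectrum[b],
                      reverse=True)[:2])
print("recovered line positions (bins):", peaks)
```

This direct transform needs on the order of N squared operations per spectrum, which gives a sense of why computing cost was the limiting factor before fast, inexpensive processors became available.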

In the late 1980s and 1990s interferometer-based MIR spectrophotometers

began to appear in the market that could calculate an absorbance spectrum in a few

seconds, with that speed increasing in the 1990s as faster computer chips became

available. The interferogram was converted into an entire MIR spectrum at 8 or 16 cm-1

resolution using the Fourier transform (Agnet, 1998), and the entire absorbance

spectrum could be analyzed at once (Van de Voort, 1992).  Access to the

entire range of wavelengths in the MIR spectrum provided a new opportunity for


development of partial least squares (PLS) prediction models to estimate milk

components instead of the traditional method of using the mean absorbance of light

by selected ranges of wavelengths. The hope was that using more information (i.e.,

wavelengths) from the full spectrum would improve the accuracy of measurement of fat,

protein, lactose, and solids concentration in milk. FTIR also allows for faster, more

sensitive, and driftless measurements (Fellgett’s, Jacquinot’s, and Connes’ advantages,

respectively) (Agnet, 1998). However, access to a digital recording of the full

wavelength range also offered advantages for the basic filter wavelength approach to

measuring main components: digital selection of bands of wavelengths could be

done with more precision, and the center wavelength and bandwidth of basic

filters could be adjusted and fine-tuned digitally beyond what could be done with the

physical limitations encountered in manufacturing optical filters. Every digital filter

would be the same from instrument to instrument. The process of optimizing and

standardizing virtual basic filters for measurement of fat, protein, and lactose

in milk was first reported by Kaylegian et al. (2009). When combined with

the use of modified milk calibration samples to calculate intercorrection factors, this greatly improved the

accuracy and repeatability (i.e., MD and SDD) of the basic fixed filter model approach

to measuring fat, protein, and lactose.

Both optical and virtual filters need proper setup and maintenance by way of

precalibration, calibration, and continued performance monitoring, and procedures for these have

been published (Lynch et al., 2006). Precalibration procedures are an evaluation and

control of the quality of the uncorrected MIR signal and include maintenance of the

flow system, homogenizer, water repeatability, shift, linearity, primary slope, milk


repeatability, purging efficiency, and intercorrections as described by Lynch et al.

(2006).  The linearity adjustment, primary slope, and determination of intercorrection

factors are applied to basic filter models, but not to PLS models.  All other 

precalibration procedures and tolerances that need to be maintained apply to both 

basic filter models and PLS models.  

Following the precalibration, the calibration adjustment of secondary slope and 

intercept correction is required for both basic filter models and PLS models. The set of 

calibration milk samples can be either a modified milk calibration sample set or a 

producer milk calibration set. The calibration samples have chemical reference

values and are analyzed with IR.  The instrument predicted

values (X-values) and the chemical reference values (Y-values) are analyzed by linear

regression to determine a new secondary slope and intercept for the calibration.  After

the adjustment, the mean difference between instrument values and reference

chemistry is zero and the standard deviation of the differences (SDD) is calculated.

The smaller the SDD, the better the calibration.  Optimized basic filters using the

modified milk calibration samples of the typical composition reported by Portnoy et 

al. often achieve SDD values of less than 0.01 for fat, protein, and lactose (Portnoy, 

2021).  
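The secondary slope and intercept adjustment described above can be sketched in a few lines of Python. This is a minimal illustration that assumes ordinary least-squares regression with instrument values as X and reference chemistry as Y. The function name and the fat values are hypothetical, and the sketch is not the software of any milk analyzer.

```python
from statistics import mean, stdev

def fit_slope_intercept(predicted, reference):
    """Least-squares line with instrument values as X and reference values as Y."""
    x_bar, y_bar = mean(predicted), mean(reference)
    sxx = sum((x - x_bar) ** 2 for x in predicted)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predicted, reference))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical fat values (g/100 g) for a small calibration set.
instrument = [0.95, 1.98, 3.05, 4.02, 5.10]
reference = [1.00, 2.00, 3.00, 4.00, 5.00]

slope, intercept = fit_slope_intercept(instrument, reference)
adjusted = [slope * x + intercept for x in instrument]
differences = [a - r for a, r in zip(adjusted, reference)]
md = mean(differences)    # mean difference, zero after the adjustment
sdd = stdev(differences)  # standard deviation of the differences
print(f"slope={slope:.4f} intercept={intercept:.4f} MD={md:.4f} SDD={sdd:.4f}")
```

The mean difference is zero by construction on the calibration set, so the SDD is the quantity that distinguishes a good calibration from a poor one.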

Numerous studies were done in the USDA Federal Milk Market laboratories to 

compare the long-term performance of the virtual filter models to PLS models for 

measurement of fat, protein, and lactose, with the conclusion that the virtual basic filter

models with intercorrection factors (Lynch et al., 2006) established using modified

milk samples gave more stable performance (i.e., agreement with all lab mean 


reference chemistry on external validation milks from different regions of the US) 

across time.  

 At the beginning of the full-spectrum era of MIR milk analysis, not much attention

was given to measurement of minor (i.e., low concentration) milk components using

additional spectral information with PLS models.  However, the additional

spectral information made available by FTIR provided an opportunity to expand the

scope of information about other characteristics of milk that could be measured by 

MIR.  One of the first major new pieces of milk composition information derived from 

full-spectral data and PLS prediction modeling was for the measurement of milk urea 

nitrogen (MUN) (Leifer, 1996) that could be used to monitor efficiency of conversion 

of nitrogen in the feed to protein in milk (Roseler et al., 1993, Nousiainen et al., 2004).  

The next major innovation of commercial value in measuring a characteristic of milk 

was the prediction of groups of milk fatty acids (i.e., de novo, mixed origin, and 

preformed milk fatty acids) that could be related directly to the function of the two 

metabolic pathways by which fatty acids originate and are incorporated into milk fat. 

The first models for fatty acid chain length and mean unsaturation were published by 

Wojciechowski et al.  (2016), and for de novo, mixed origin, and preformed fatty acids 

were published by Woolpert et al. (2016).           

Now, well calibrated Fourier-transform MIR analyzers are the default method 

for quick, cost-effective analysis of dairy milk (Lynch, 2006). Infrared instruments

have increased the amount of information that can be extracted from one milk sample

in a short time. PLS modeling has also made analysis of minor

components that don’t have specific bond peaks possible, including models for fatty 


acid chain length/unsaturation (Kaylegian, 2009), milk urea nitrogen (Portnoy, 2021), 

particle size of fat globules (Di Marzo, 2016), estimated blood NEFA (Aernouts, 

2020), and BHB and acetone (Grelet, 2016). These are all important metrics, many of

which have direct health implications for the cow. While MIR testing is helpful for

health monitoring, having rapid on-site data for every cow at each milking would be the

most comprehensive way to monitor health. This is not feasible with benchtop MIR

but may be possible with integrated NIR systems.

Near-Infrared Analysis 

Near Infrared (NIR) is the range of wavelengths between MIR and the visible 

spectrum (800 to 2800 nm, 12500 to 3571 cm-1). The NIR spectrum contains 

combinations and overtones of the signal bands observed using MIR, and thus these 

NIR peaks are weaker and less defined than those in the MIR (Agnet, 1998).  
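The paired wavelength and wavenumber values quoted in this chapter follow from the reciprocal relationship between the two units. The short Python sketch below shows the conversion; the function names are illustrative only.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nanometers to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm

def um_to_wavenumber(wavelength_um):
    """Convert a wavelength in micrometers to a wavenumber in cm^-1."""
    return 1e4 / wavelength_um

# NIR range limits quoted in the text.
for nm in (800, 2800):
    print(f"{nm} nm -> {nm_to_wavenumber(nm):.0f} cm^-1")

# Fat carbonyl band quoted in the MIR section.
print(f"5.723 um -> {um_to_wavenumber(5.723):.0f} cm^-1")
```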

Previous Work and Performance of NIR for Milk Analysis 

Typical NIR modeling studies report Standard Error of Prediction (SEP) or

Root Mean Square Error of Prediction (RMSEP) as an index of model performance. For

the wavelength range used in their study (1000 to 2400 nm), Laporte and Paquin

(1999) achieved an SEP for fat of 0.05 and true protein of 0.12, demonstrating that the

information to predict milk component concentrations is contained in the NIR spectra

of milk. Over time, further studies have replicated or improved

NIR performance for milk, with Aernouts et al. (2011) reporting a fat RMSECV of 0.043

and a crude protein prediction error of 0.133%. The PLS prediction models reported by

Laporte and Paquin (1999) and Aernouts et al. (2011) for fat and protein using NIR

spectra did not achieve the level of accuracy of MIR models and do not


meet the milk analysis criteria given in Standard Methods for the Examination of Dairy

Products (SMEDP) (Barbano et al., 2024). Further work is needed to develop more accurate NIR

milk analysis models for homogenized and unhomogenized milk that meet the 

performance criteria provided in SMEDP.  

Current and Future Needs for Major Milk Component Testing 

 While NIR spectroscopy for milk analysis has improved with time and

continued research, NIR milk analysis has not been shown to meet the analytical

performance standards that are routinely achieved by MIR analyzers. The Standard

Methods for the Examination of Dairy Products lists performance criteria for payment

testing of 0.02% for mean difference (MD) and 0.04% for standard error of prediction

(SEP) for fat, protein, and lactose, and 0.05% (MD) and 0.1% (SEP) for total solids.
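A minimal Python sketch of how external validation results could be checked against these criteria is given below. It assumes that SEP is calculated as the standard deviation of the differences between predicted and reference values, which is one common definition, and it also reports RMSEP for comparison. The function names and the fat data are hypothetical.

```python
from statistics import mean, stdev

# Limits quoted in the text, as (MD limit, SEP limit) in percent.
CRITERIA = {
    "fat": (0.02, 0.04),
    "protein": (0.02, 0.04),
    "lactose": (0.02, 0.04),
    "total solids": (0.05, 0.10),
}

def validation_metrics(predicted, reference):
    """Return (MD, SEP, RMSEP) for paired predicted and reference values."""
    differences = [p - r for p, r in zip(predicted, reference)]
    md = mean(differences)
    sep = stdev(differences)  # assumption: SEP taken as the SD of the differences
    rmsep = (sum(d * d for d in differences) / len(differences)) ** 0.5
    return md, sep, rmsep

def meets_criteria(component, md, sep):
    """Check the absolute MD and the SEP against the limits for one component."""
    md_limit, sep_limit = CRITERIA[component]
    return abs(md) <= md_limit and sep <= sep_limit

# Hypothetical validation data for fat (g/100 g), for illustration only.
reference = [3.45, 3.80, 4.10, 3.62, 3.95, 4.25]
predicted = [3.47, 3.78, 4.13, 3.60, 3.97, 4.24]
md, sep, rmsep = validation_metrics(predicted, reference)
print(f"MD={md:.3f} SEP={sep:.3f} RMSEP={rmsep:.3f} "
      f"pass={meets_criteria('fat', md, sep)}")
```

Reporting MD and SEP separately, rather than RMSEP alone, keeps a systematic offset distinguishable from random scatter.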

Research Objectives 

Our first objective was to develop an improved method of cuvette path length 

certification to help enhance the accuracy of reference chemistry methods for lactose 

and milk urea nitrogen that are used as the calibration reference for both MIR and NIR 

methods. Our second objective was to develop new PLS models for NIR analysis of 

homogenized and unhomogenized milks that have improved analytical accuracy 

compared with previously published results. Our hypothesis was that building a PLS
modeling sample population from orthogonally designed modified milk sample
sets combined with a population of bulk tank milks collected from different regions
of the US, all with all-laboratory mean (n = 8 laboratories) reference chemistry,
would enable development of PLS prediction models for fat, true protein, anhydrous
lactose, and total solids that achieve improved NIR milk analysis performance.

 
References  

Aernouts, B., E. Polshin, P. Saeys, and J. De Baerdemaeker. 2011. Visible and near-

infrared spectroscopic analysis of raw milk for cow health monitoring: 

Reflectance or transmittance? J. Dairy Sci. 94:5315–5329. 

https://doi.org/10.3168/jds.2011-4354. 

Aernouts, B., I. Adriaens, J. Diaz-Olivares, W. Saeys, P. Mäntysaari, T. Kokkonen, T. 

Mehtiö, S. Kajava, P. Lidauer, M. H. Lidauer, and M. Pastell. 2020. Mid-

infrared spectroscopic analysis of raw milk to predict the blood nonesterified 

fatty acid concentrations in dairy cows. J. Dairy Sci. 103:6422–6438. 

https://doi.org/10.3168/jds.2019-17952. 

Agnet, Y. 1998. Fourier transform infrared spectrometry. A new concept for milk and 

milk product analysis. Bull. Int. Dairy Fed. 332:58–68. 

AOAC International. 2023. Official Methods of Analysis 22nd ed. Assoc. Off. Anal. 

Chem., Arlington, VA. 

Barbano, D. M., and J. L. Clark. 1989. Infrared milk analysis—Challenges for the 

future. J. Dairy Sci. 72:1627–1636. https://doi.org/10.3168/jds.S0022-

0302(89)79275-4.   

Barbano, D. M., C. Melilli, M. Portnoy, and Technical Committee. 2024.  Chemical 

Methods for Major Milk Components. In Standard Methods for the Examination 

of Dairy Products. 18th ed., ed. J. L. Kornacki, E. T. Ryser, C. M. Mangione, 

and H. M. Wehr. American Public Health Association, Washington, DC. 

https://doi.org/10.2105/9780875533438ch17. 

Di Marzo, L., and D. M. Barbano. 2016. Effect of homogenizer performance on 

accuracy and repeatability of mid-infrared predicted values for major milk 



 
components. J. Dairy Sci. 99:9471–9482. https://doi.org/10.3168/jds.2016-

11618. 

Grelet, C., C. Bastin, M. Gelé, J.-B. Davière, M. Johan, A. Werner, R. Reding, J. A. 

Fernandez Pierna, F. G. Colinet, P. Dardenne, N. Gengler, H. Soyeurt, and F. 

Dehareng. 2016. Development of Fourier transform mid-infrared calibrations 

to predict acetone, β-hydroxybutyrate, and citrate contents in bovine milk 

through a European dairy network. J. Dairy Sci. 99:4816–4825. 

https://doi.org/10.3168/jds.2015-10477. 

Smith, E. B., D. M. Barbano, J. M. Lynch, and J. R. Fleming. 1993. A quantitative 

linearity evaluation method for infrared milk analyzers. J. AOAC Int. 76:1300–

1308. https://doi.org/10.1093/jaoac/76.6.1300. 

Smith, E. B., D. M. Barbano, J. M. Lynch, and J. R. Fleming. 1995. Infrared analysis 

of milk: Effect of homogenizer and optical filter selection on apparent 

homogenization efficiency and repeatability. J. AOAC Int. 78:1225–1233. 

https://doi.org/10.1093/jaoac/78.5.1225. 

Finnegan, W., and J. Goggins. 2021. Environmental impact of the dairy industry. In 

Environmental Impact of Agro-Food Industry and Food Consumption, ed. C. 

M. Galanakis, 129–146. Academic Press, London, UK. 

https://doi.org/10.1016/B978-0-12-821363-6.00004-7. 

Goulden, J. D. S. 1964. Analysis of milk by infra-red absorption. J. Dairy Res. 

31:273–284. https://doi.org/10.1017/S0022029900018203. 

Horwitz, W. 1986. Harmonization of collaborative study protocols. J. Assoc. Off. 

Anal. Chem. 69:393–395. 

Kaniyamattam, K., and A. De Vries. 2014. Agreement between milk fat, protein, and 

lactose observations collected from the Dairy Herd Improvement Association 

(DHIA) and a real-time milk analyzer. J. Dairy Sci. 97:2896–2908. 

https://doi.org/10.3168/jds.2013-7690. 

Kaylegian, K. E., G. E. Houghton, J. M. Lynch, J. R. Fleming, and D. M. Barbano. 

2006. Calibration of infrared milk analyzers: Modified milk versus producer 



 
milk. J. Dairy Sci. 89:2817–2832. https://doi.org/10.3168/jds.S0022-

0302(06)72555-3. 

Kaylegian, K. E., J. M. Lynch, J. R. Fleming, and D. M. Barbano. 2009. Influence of
fatty acid chain length and unsaturation on mid-infrared milk analysis. J. Dairy
Sci. 92:2485–2501.

Laporte, M.-F., and P. Paquin. 1999. Near-infrared analysis of fat, protein, and casein 

in cow’s milk. J. Agric. Food Chem. 47:2600–2605. 

https://doi.org/10.1021/jf980929r.

Luke, H. A. 1967. Collaborative testing of the dye binding method for milk protein. J. 

Assoc. Off. Anal. Chem. 50:560–564. https://doi.org/10.1093/jaoac/50.3.560. 

Lefier, D. 1996. International Dairy Federation Standard 315: UHT cream – 

Analytical methods for the determination of urea content in milk – Transgenic 

dairy mammals – Oxidized sterols. Int. Dairy Fed., Brussels, Belgium. 

Lynch, J. M., D. M. Barbano, M. Schweisthal, and J. R. Fleming. 2006. Precalibration
evaluation procedures for mid-infrared milk analyzers. J. Dairy Sci.
89:2761–2774.

Nousiainen, J., K. J. Shingfield, and P. Huhtanen. 2004. Evaluation of milk urea 

nitrogen as a diagnostic of protein feeding. J. Dairy Sci. 87:386–398. 

https://doi.org/10.3168/jds.S0022-0302(04)73178-1. 

Portnoy, M., C. Coon, and D. M. Barbano. 2021. Infrared milk analyzers: Milk urea 

nitrogen calibration. J. Dairy Sci. 104:11432–11442. 

https://doi.org/10.3168/jds.2020-18772. 

Roseler, D. K., J. D. Ferguson, C. J. Sniffen, and J. Herrema. 1993. Dietary protein 

degradability effects on plasma and milk urea nitrogen and milk nonprotein 

nitrogen in Holstein cows. J. Dairy Sci. 76:525–534. 

https://doi.org/10.3168/jds.S0022-0302(93)77372-5. 

Sherbon, J. W. 1967. Rapid determination of protein in milk by dye binding. J. Assoc. 

Off. Anal. Chem. 50:542–547. https://doi.org/10.1093/jaoac/50.3.542. 



 
Shipe, W. F. 1969. Collaborative Study of the Babcock and Foss Milko-Tester 

Methods for Measuring Fat in Raw Milk. J. Assoc. Off. Anal. Chem. 52:131–

138. https://doi.org/10.1093/jaoac/52.1.131. 

Shipe, W. F. 1972. Current status of the Milko-Tester. J. Dairy Sci. 55:652–655. 

https://doi.org/10.3168/jds.S0022-0302(72)85555-3. 

Udy, D. C. 1956. A rapid method for estimating total protein in milk. Nature 178:314–

315. https://doi.org/10.1038/178314a0. 

Van De Voort, F. R., J. Sedman, G. Emo, and A. A. Ismail. 1992. Assessment of 

Fourier transform infrared analysis of milk. J. AOAC Int. 75:780–785. 

https://doi.org/10.1093/jaoac/75.5.780. 

Wojciechowski, K. L., and D. M. Barbano. 2016. Prediction of fatty acid chain length 

and unsaturation of milk fat by mid-infrared milk analysis. J. Dairy Sci. 

99:8561–8570. https://doi.org/10.3168/jds.2016-11248. 

Woolpert, M. E., H. M. Dann, K. W. Cotanch, C. Melilli, L. E. Chase, R. J. Grant, and 

D. M. Barbano. 2016. Management, nutrition, and lactation performance are 

related to bulk tank milk de novo fatty acid concentration on northeastern US 

dairy farms. J. Dairy Sci. 99:8486–8497. https://doi.org/10.3168/jds.2016-

10998. 

 

 
CHAPTER 2  

METHOD DEVELOPMENT FOR OPTICAL CERTIFICATION OF CUVETTE 

PATH LENGTH 

 
ABSTRACT 

 Our objective was to develop a nondestructive optical method for certification
of cuvette path length using a chromatic confocal displacement sensor, for
use in enzymatic assay calculations. An apparatus was built for nondestructively
measuring the path length of batches of 8 cuvettes at a time. Boxes of 100 cuvettes
with a nominal path length of 10 mm from 6 different suppliers were measured with
the sensor and analyzed for supplier-to-supplier and mold-to-mold variation. ANOVA
was used to detect differences in cuvette path length within and between different
cuvette suppliers/manufacturers. The impact of the observed variation in path length
on both lactose and MUN assays was demonstrated using enzymatic test sample data
from a proficiency testing study. Differences in cuvette path length were detected
among different molds from the same supplier and among different suppliers.

 
Key Words: polystyrene cuvette, confocal, path length 

 
INTRODUCTION 

 Two important chemical reference testing methods used in the dairy industry

are the measurement of lactose and milk urea nitrogen (MUN) content of milk. These 

chemical methods need to produce highly accurate results on a small number of 

samples because those results are used to calibrate high speed electronic milk testing 

equipment. MUN concentration is an important indicator of cow health and of 



 
environmental sustainability of milk production. MUN is directly correlated to blood 

urea nitrogen and can thus be used as an alternative test to blood sampling (Broderick 

and Clayton, 1997). MUN concentration is a measure of the efficiency of the cow’s 

metabolism to convert dietary nitrogen into milk proteins, as opposed to excreting 

dietary nitrogen as urea (Lewis, 1957). For rapid herd management and efficient 

payment testing, MUN is measured using mid-infrared (MIR) spectroscopy utilizing 

partial least squares (PLS) regression models (Haaland and Thomas, 1988). MIR 

MUN prediction models require accurate reference chemistry on a series of calibration 

milks for calibration (i.e., adjustment of slope and intercept), and thus rely on the 

standard enzymatic spectrophotometric method for MUN measurement (Portnoy, 

2021). 

 Lactose is a main component of milk and is important for many aspects of 

bovine milk, as reviewed by Portnoy and Barbano (2021). Cow health impacts blood 

glucose metabolism and affects the ability of the cow to synthesize lactose. Lactose 

synthesis in grams per cow per day is directly proportional to the amount of milk 

produced by a cow. Lactose is of major interest in terms of the dairy products made
from milk because the various solid (crystal and amorphous glass) forms of lactose
impact mouthfeel and texture. From a nutritional perspective, many dairy products are
reduced in lactose by either enzymatic hydrolysis of lactose or removal of lactose by
ultrafiltration to address the needs of consumers who are lactose intolerant. As a result,
there is a need for methods to measure lactose concentration in milk and dairy products.

One method for lactose testing is a spectrophotometric enzymatic assay (Lynch et al., 

2007). 



 
 The enzymatic spectrophotometric methods for lactose and MUN utilize 

disposable cuvettes, the most common ones being polystyrene or methyl acrylate. 

These cuvettes are convenient to use, relatively inexpensive and readily available from 

many distributors. In a collaborative study of the enzymatic lactose method by Lynch et al.
(2007), the range of mean cuvette path length across laboratories was 0.17 mm when
the disposable cuvettes for the study were purchased from one supplier. In that study,
when cuvette path length was determined in each laboratory and accounted for in the
calculation of results, the between-laboratory agreement metric SR improved (i.e.,
better between-lab agreement) from 0.0349 to 0.0214. The alternative to disposable
cuvettes is reusable quartz cuvettes. However, quartz cuvettes are significantly more
expensive and are fragile, which makes them impractical for assays that require the use of
many cuvettes at one time. For these reasons, disposable cuvettes are preferred for
large-scale routine analysis.

 The current enzymatic lactose method (AOAC 2023, method number 2006.06)
also includes a procedure for determination of relative cuvette path length using
potassium chromate, a compound that has hazardous properties (ThermoFisher, 2020).
A safer and more environmentally friendly procedure for checking cuvette
path length is needed. An optical method for distance measurement might be a
candidate for this task.

 In an evaluation of several optical methods for distance/displacement 

measurements, Berkovic and Shafir (2012) outlined the confocal sensor as “generally 

applicable for accurate measurements of displacements and surface profiles at 

distances of millimeters” and this approach may be appropriate for determination of 



 
cuvette path length. The confocal sensor is efficient at measuring transparent
materials, such as glass, because the change in index of refraction at the interface of
the glass and air creates an easily distinguishable signal that allows the sensor to
detect the interface and therefore accurately calculate the distance between two
surfaces. The interface detection is based on Snell’s law of refraction between two
materials (Weng, 2017). Some studies have reported a reading accuracy of 0.25 μm and
a resolution of 0.035 nm for a measurement range of 175 μm (Yu, 2018). It is expected
that similar behavior will be observed in other transparent materials, such as
polystyrene. For the application of confocal sensing to measurement of cuvette path
length, a double-layer film approach might allow thickness to be measured accurately
and nondestructively (Choi et al., 2020). Thickness measurements, 3D topography,
surface roughness, and geometric measurements can be made, even in some fast-moving
environments (Ma, 2023). Chromatic confocal distance sensing may meet the need for a
nondestructive measurement of cuvette path length without the use of undesirable
chemicals. Our research objective was to develop an accurate, nondestructive optical
method to determine the path length of disposable cuvettes used in reference method
enzymatic assays for milk analysis.

MATERIALS AND METHODS 

Cuvettes  

 Six boxes of polystyrene cuvettes (100 cuvettes per box, with a nominal
path length of 10 mm) were purchased from six different vendors. A full box of 100
cuvettes was analyzed for each supplier, each box with a different mold ID on the
cuvettes. Five of the six vendors provided a box of cuvettes that were all from the same
forming mold number, while one of the vendors provided a box of 100 that contained
cuvettes with a mixture of mold numbers.
Our study also included eight quartz cuvettes (6 from one manufacturer and 2 from
another manufacturer).

Confocal Sensor  

 A Keyence Corporation (Itasca, IL) CL-3000 confocal displacement sensor 

controller was paired with a CL-P070 spot type optical sensor head. The CL-P070 is 

the mid-range model of optical head, with a reference distance of 70 mm, 

measurement range of ± 10 mm, and resolution of 0.000025 mm (KEYENCE 

Corporation, 2025). The sensor controller was connected to a computer with the CL-

NavigatorN software version 1.7.0.0 (Keyence, Itasca IL) installed. The CL-3000 

confocal displacement sensor controller has software settings for different materials 

(e.g., polystyrene, quartz, glass, etc.) and the frequency of measurement data points as 

a function of scan time can be adjusted. 

 The CL-P070 is a polychromatic confocal sensor that uses multiple
wavelengths of light. Chromatic dispersion produced by the optics causes different
wavelengths of light to be focused at different distances along the optical axis from the
sensor to the object. Thus, in the region of the image plane, each point along the optical
axis is the image point of a specific wavelength. This wavelength will be the dominant
wavelength in the light backscattered confocally to the detector by an object at this
point. Consequently, spectral measurement of the backscattered light can be translated
very accurately to an object position. Chromatic confocal sensors offer highly precise
distance measurements with resolution below one micrometer across a range of multiple
millimeters, which fits our application for measurement of the path length of 10
mm cuvettes (Berkovic and Shafir, 2012). In our application, the polystyrene cuvette
walls and the air space between the walls have different indexes of refraction and reflect
light differently, which means that the wall-to-air interfaces are detectable by the sensor
(Weng et al., 2013). The distance between the cuvette walls is displayed in the
CL-NavigatorN program.

Apparatus for Determination of Cuvette Path Length 

The cuvette scanning apparatus, as viewed from above and from the side, is shown
in Figure 2.1. The apparatus consisted of a dovetail X-Y optical stage (SO-8) and a
Z-bracket (SO-13) (MIRUC Optical Company LTD, Tokyo, Japan), with a custom-made
cuvette holder attached to the X-Y stage and the CL-P070 sensor head mounted
to the Z-bracket. The Z-bracket
allowed movement of the CL-P070 sensor head up and down to achieve the correct
focal distance above the cuvettes. The X-Y stage was driven by a variable speed
stepper motor to allow scanning of a batch of 8 cuvettes placed horizontally under the
moving sensor head (Figure 2.1). An electric motor (42HDB0014NY-24B), rated
for 3.5 ohms and 1.0 amps, was used. The motor speed was controlled with a variable
speed drive (ZK-SMC02) that moved the optical head left and right. The scan rate of
the head was about 3 to 4 mm per second; thus, it took about 20 to 25 seconds to scan 8
cuvettes. The scanning process is shown in Figure 2.2. Cuvettes were scanned at a
point about halfway between the top and bottom of the cuvette, where the light beam in
a spectrophotometer would pass through the cuvette. The estimated path length
changes dramatically when the light beam no longer encounters the air gap. Those
points are marked as the side walls of the cuvette, the mid-point between the two walls
is determined, and the path length at the mid-point of the cuvette is recorded. The
scanner records the path length continuously as it moves across the cuvette, so if the
distance between the two cuvette walls is not uniform, that will be visible in the trace
(Figure 2.2).
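The edge-finding and mid-point steps described above can be sketched in a few lines. This is an illustrative reconstruction only, not the processing performed by the CL-NavigatorN software: the function name, the tolerance used to decide whether a reading belongs to the air gap, and the synthetic trace are all assumptions.

```python
def path_length_at_midpoint(positions_mm, readings_mm,
                            nominal_mm=10.0, tolerance_mm=0.5):
    """Return (left_edge, right_edge, midpoint, path_length) for one cuvette.

    A reading is treated as "inside the air gap" when it lies within
    tolerance_mm of the nominal path length (assumed rule, for illustration).
    """
    in_gap = [abs(r - nominal_mm) <= tolerance_mm for r in readings_mm]
    if not any(in_gap):
        raise ValueError("no readings near the nominal path length")

    # First and last scan positions where the beam still sees the air gap.
    first = in_gap.index(True)
    last = len(in_gap) - 1 - in_gap[::-1].index(True)
    left, right = positions_mm[first], positions_mm[last]

    # Mid-point between the two side walls, and the reading closest to it.
    midpoint = (left + right) / 2.0
    nearest = min(range(first, last + 1),
                  key=lambda i: abs(positions_mm[i] - midpoint))
    return left, right, midpoint, readings_mm[nearest]


# Synthetic trace: the reading drops to 0.0 wherever the beam is over a side wall.
positions = [i * 0.5 for i in range(9)]  # scan positions in mm
readings = [0.0, 0.0, 9.985, 9.986, 9.987, 9.986, 9.985, 0.0, 0.0]

result = path_length_at_midpoint(positions, readings)
print(result)
```

A fixed tolerance around the nominal value is the simplest possible rule; a real trace may need a threshold on the change between adjacent readings instead.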

 
Figure 2.1. Confocal optical scanning apparatus for measuring the path length of 10 

mm cuvettes. 

 

 
Figure 2.2. A confocal scanning apparatus to measure path length. The large vertical
lines (panel to the right) indicate where the light beam encounters the solid polystyrene
side walls of the cuvette. Where the light beam encounters the air gap of the cuvette, the
locations of point 2 and point 3 (i.e., the air-to-polystyrene interfaces) are determined,
and the difference between them is calculated as the path length of the cuvette in mm.

 
Data collection from the confocal sensor. The supplier-to-supplier difference 

in cuvette path length was determined by scanning groups of 8 cuvettes from supplier 

1 until all 100 cuvettes from one box were scanned and the mold number of each 

cuvette was recorded, then 100 cuvettes from supplier 2 were scanned, and so on for 

100 cuvettes from each of the 6 suppliers. In addition, the path length of 8 high-quality
10 mm path length quartz cuvettes from two different manufacturers was determined.

Impact of variation in cuvette path length on enzymatic assay method results. 

 Two important enzymatic assay reference methods used in the dairy industry 

are used for measurement of the concentration of MUN and anhydrous lactose in milk. 



 
Results of these methods are used to establish reference values for milk samples used
to calibrate infrared milk analyzers. Reference chemistry data from testing a
set of modified milk calibration samples with these two enzymatic assay methods
were used as the basis for a sensitivity analysis to determine the impact of
supplier-to-supplier and mold-to-mold variation in cuvette path length on MUN and
lactose analysis results when path length correction is not done.
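The scaling behind such a sensitivity analysis can be written out under the Beer-Lambert law. This is a sketch under stated assumptions rather than the calculation specified in the reference methods: the symbols $A$ (measured absorbance change), $\varepsilon$ (absorptivity), $c$ (analyte concentration), $\ell_0$ (nominal path length of 10 mm), and $r$ (relative path length, RPL) are introduced here for illustration, and all other assay factors are assumed to be held constant.

```latex
A = \varepsilon \, c \, \ell, \qquad \ell = r \, \ell_0
\quad \Longrightarrow \quad
c(r) = \frac{A}{\varepsilon \, r \, \ell_0} = \frac{c(1)}{r},
\qquad
\frac{c(r) - c(1)}{c(1)} = \frac{1}{r} - 1 .
```

For $r = 0.98$ and $r = 1.02$ the relative change is about $+2.04\%$ and $-1.96\%$, respectively, so a result calculated from the same absorbance data shifts in proportion to its own magnitude.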

RESULTS AND DISCUSSION

Supplier-to-supplier variation in cuvette path length. 

 
The mean cuvette path length for a box of 100 cuvettes differed (P < 0.05) among
suppliers (Table 2.1). Five out of six suppliers provided a box of 100 polystyrene
cuvettes that were all produced from the same forming mold number (i.e., marked on
each cuvette). The sixth supplier (supplier 4) had cuvettes from 15 different molds
mixed together in the same box of 100 cuvettes. The mean path length of boxes of 100
commercial polystyrene cuvettes varied from a low of 9.9818 mm to a high of 10.2805
mm across the six suppliers (Table 2.1). This is a range of 0.2987 mm, or about 3% of
the nominal path length, and box-to-box variation of this size would be expected to
have a significant impact on results for any enzymatic assay that assumes for
calculation of results that all cuvettes have a path length of 10.00 mm (i.e., one
centimeter). The supplier whose box of 100 cuvettes contained a mixture of different
mold IDs had about 10-fold higher variation in cuvette path length within the box than
the cuvettes from the other 5 suppliers. A laboratory using cuvettes from this supplier
would have poorer repeatability of duplicate results on the same milks than when
using cuvettes from any of the other suppliers to test the same set of milk samples, all
other factors being equal.

 
Table 2.1. Mean cuvette path length and cuvette path length variation within a box of
100 polystyrene cuvettes purchased from each of six different suppliers.

                                              Supplier
Measurement                1          2          3          4¹         5          6
Mean path length² (mm)     9.9818ᵉ    10.0100ᶜ   9.9858ᵈ    10.2805ᵃ   10.1921ᵇ   9.9856ᵈ
Standard deviation (mm)    0.0009     0.0011     0.0011     0.0115     0.0029     0.0008
Maximum (mm)               9.9843     10.0123    9.9886     10.3045    10.1987    9.9888
Minimum (mm)               9.9797     10.0073    9.9826     10.2563    10.1857    9.9840
Range (mm)                 0.0046     0.0050     0.0060     0.0482     0.0130     0.0048
Mold ID                    B2         C7         B3         mixed      2          B5

¹ Supplier 4 had cuvettes from multiple molds within one box of 100.
² Means within the same row that do not share a common superscript are different (P < 0.05).
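The size of the supplier-to-supplier spread can be checked directly from the mean path lengths in Table 2.1. The short sketch below is illustrative only; the variable and function names are hypothetical, and the 10.00 mm nominal value is the one assumed in the assay calculations.

```python
# Mean path length (mm) for each supplier, copied from Table 2.1.
MEAN_PATH_MM = {1: 9.9818, 2: 10.0100, 3: 9.9858,
                4: 10.2805, 5: 10.1921, 6: 9.9856}
NOMINAL_MM = 10.0  # path length assumed when no correction is applied


def relative_deviation_pct(path_mm, nominal_mm=NOMINAL_MM):
    """Deviation of a measured path length from nominal, in percent."""
    return 100.0 * (path_mm - nominal_mm) / nominal_mm


# Spread of the supplier means, in mm and as a percentage of nominal.
range_mm = max(MEAN_PATH_MM.values()) - min(MEAN_PATH_MM.values())
range_pct = 100.0 * range_mm / NOMINAL_MM

deviations = {s: relative_deviation_pct(v) for s, v in MEAN_PATH_MM.items()}
for supplier, dev in sorted(deviations.items()):
    print(f"supplier {supplier}: {dev:+.3f}% of nominal")
print(f"range across suppliers: {range_mm:.4f} mm ({range_pct:.2f}% of nominal)")
```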

 

 
Figure 2.3. Path length scan of a typical cuvette from cuvette suppliers 1 through 6. 

The horizontal line halfway between the two polystyrene walls is the center path 

length of the cuvette.  

 
Cuvette forming mold-to-mold variation in cuvette path length. 

 
The box of 100 polystyrene cuvettes from supplier 4 contained a mixture of
cuvettes from 15 different forming molds, and therefore forming mold-to-mold
variation in path length was present within this box of cuvettes. Within the box of
100, there were differences (P < 0.05) in mean path length among groups of cuvettes
produced by different molds (low 10.2624 mm and high 10.2997 mm; Table 2.2).
Differences (P < 0.05) were also detected in mean path length among boxes of
100 cuvettes from different forming molds, so mold-to-mold variation between
suppliers can also be seen in Table 2.1. Cuvettes from suppliers 1, 2, 3, and 6 were
marketed by different suppliers, but it is likely that those cuvettes were produced by one
manufacturer, based on the format of the mold ID (Table 2.1) and the fact that the foam
boxes in which the cuvettes were packed had the same markings. Among these boxes,
the shortest mean cuvette path length was for mold B2
(9.9818 mm) and the longest mean path length was for mold C2 (10.0100 mm), and the mean
path length of cuvettes from those two boxes differed (P < 0.05). These
differences could impact within-lab data (repeatability of results) as well as across-lab data
(reproducibility of results).

 
Table 2.2. Number of cuvettes sharing the same mold number in one box of 100
cuvettes from Supplier 4.

Mold ID    Number of cuvettes with     Mean path length¹ (mm)
           the same mold number
1          6                           10.2792ᵈ
2          7                           10.2809ᵈ
3          7                           10.2746ᵉ
4          7                           10.2997ᵃ
5          7                           10.2669ᶠ
6          6                           10.2992ᵃ
7          5                           10.2634ᶠᵍ
8          7                           10.2624ᵍ
9          7                           10.2836ᶜᵈ
10         6                           10.2726ᵉ
11         8                           10.2756ᵉ
12         7                           10.2820ᵈ
13         6                           10.2849ᶜᵈ
14         6                           10.2947ᵇ
16         8                           10.2866ᶜ

¹ Means within the same column that do not share a common superscript are different
(P < 0.05).
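As a consistency check between the two tables, the count-weighted mean of the mold means in Table 2.2 should agree with the box mean reported for supplier 4 in Table 2.1 (10.2805 mm). The sketch below performs that check; the names are hypothetical and the values are copied from Table 2.2.

```python
# Mold ID -> (number of cuvettes, mean path length in mm), from Table 2.2.
MOLDS = {
    1: (6, 10.2792), 2: (7, 10.2809), 3: (7, 10.2746), 4: (7, 10.2997),
    5: (7, 10.2669), 6: (6, 10.2992), 7: (5, 10.2634), 8: (7, 10.2624),
    9: (7, 10.2836), 10: (6, 10.2726), 11: (8, 10.2756), 12: (7, 10.2820),
    13: (6, 10.2849), 14: (6, 10.2947), 16: (8, 10.2866),
}

total_cuvettes = sum(n for n, _ in MOLDS.values())

# Weight each mold mean by the number of cuvettes from that mold.
weighted_mean = sum(n * mean for n, mean in MOLDS.values()) / total_cuvettes

# Spread between the shortest and longest mold means within the box.
mold_means = [mean for _, mean in MOLDS.values()]
spread_mm = max(mold_means) - min(mold_means)

print(f"cuvettes counted: {total_cuvettes}")
print(f"count-weighted mean path length: {weighted_mean:.4f} mm")
print(f"spread of mold means: {spread_mm:.4f} mm")
```

The weighted mean is built from rounded mold means, so agreement with the box mean can only be expected to about the fourth decimal place.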

 

 
Cuvette path length of quartz cuvettes. Quartz cuvettes are manufactured by a
different process than polystyrene cuvettes. Quartz cuvettes are much more expensive
than disposable polystyrene cuvettes, and as a result quartz cuvettes are washed and reused.
However, quartz cuvettes can become damaged or broken during repeated use. The
mean path lengths of cuvettes from two different manufacturers differed (P < 0.05), as
shown in Table 2.3. Thus, even quartz cuvettes from different manufacturers can differ
in path length.

 
Table 2.3. Mean path length of 8 quartz cuvettes (3 measurements per
cuvette) from two different manufacturers (cuvettes 1 through 6 are from manufacturer
1 and cuvettes 7 and 8 are from manufacturer 2).

Cuvette    Path length¹ (mm)
1          9.9944ᵇ
2          9.9892ᵇ
3          9.9876ᵇ
4          9.9866ᵇ
5          9.9880ᵇ
6          9.9848ᵇ
7          10.0098ᵃ
8          10.0100ᵃ

¹ Means within the same column that do not share a common superscript are different
(P < 0.05).

 
Impact of cuvette path length variation on milk urea/MUN and Lactose results.  

  Enzymatic lactose analysis. An unaccounted-for variation of +/- 2% (relative)
in cuvette path length from an assumed nominal path length of 10.00 mm
caused a systematic variation in the mean reference value for a set of calibration
samples from 4.635 to 4.453% (g/100 g) anhydrous lactose concentration in
milk (Table 2.4), for mean cuvette path lengths from 9.8000 mm to 10.2000 mm,
respectively. For different laboratories testing the same milks using different
polystyrene cuvettes, this unaccounted-for variation in mean cuvette path length would
cause between-laboratory differences in mean test results that depend on the source of
the cuvettes used, if a relative path length determination was not done and
accounted for in the calculation of lactose results. This error in reference values would
be transferred to the calibration of infrared milk testing instruments that measure the
lactose content of milk, which impacts the calculation of energy output in milk
produced by dairy cows and estimates of feed efficiency (Tyrrell and Reid, 1965).

 
Table 2.4. The impact of an unaccounted-for variation of +/- 2% in the relative path
length (RPL) of cuvettes used for the determination of the anhydrous lactose
concentration (g/100 g) in milk.

               RPL (1.00 equals 10.0000 mm)
Sample    0.98      0.99      1.00      1.01      1.02
1         4.0701    4.0289    3.9887    3.9492    3.9104
2         4.6218    4.5751    4.5294    4.4845    4.4406
3         5.1878    5.1354    5.0840    5.0337    4.9843
4         5.0378    4.9870    4.9371    4.8882    4.8403
5         4.3502    4.3062    4.2632    4.2210    4.1796
6         4.6307    4.5840    4.5381    4.4932    4.4491
7         4.6458    4.5988    4.5529    4.5078    4.4636
8         4.5149    4.4693    4.4246    4.3808    4.3378
9         4.7681    4.7200    4.6728    4.6265    4.5811
10        4.2325    4.1897    4.1478    4.1068    4.0665
11        4.9124    4.8628    4.8142    4.7665    4.7198
12        4.0943    4.0529    4.0124    3.9727    3.9337
13        4.6258    4.5790    4.5333    4.4884    4.4444
14        5.2001    5.1476    5.0961    5.0457    4.9962
Mean      4.6352    4.5883    4.5425    4.4975    4.4534
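The pattern in Table 2.4 can be reproduced to within rounding by dividing the RPL 1.00 result by the relative path length. The sketch below assumes that simple rescaling, which is inferred from the tabulated values rather than taken from the reference method, and uses hypothetical function names. It also shows why the effect appears as both a bias and a slope: the difference from the RPL 1.00 result is proportional to the concentration itself.

```python
# Anhydrous lactose (g/100 g) at RPL 1.00 for three samples and the mean (Table 2.4).
BASE = {"sample 1": 3.9887, "sample 2": 4.5294, "sample 3": 5.0840, "mean": 4.5425}
RPL_VALUES = [0.98, 0.99, 1.00, 1.01, 1.02]


def rescale(conc_at_rpl_1, rpl):
    """Result recalculated for a relative path length (assumed 1/RPL scaling)."""
    return conc_at_rpl_1 / rpl


def difference(conc_at_rpl_1, rpl):
    """Difference from the RPL 1.00 result; equals conc * (1/rpl - 1)."""
    return rescale(conc_at_rpl_1, rpl) - conc_at_rpl_1


for label, conc in BASE.items():
    row = "  ".join(f"{rescale(conc, r):.4f}" for r in RPL_VALUES)
    print(f"{label:>8}: {row}")

# The difference grows with concentration, which gives the slope seen in a residual plot.
for r in (0.98, 1.02):
    low = difference(BASE["sample 1"], r)
    high = difference(BASE["sample 3"], r)
    print(f"RPL {r:.2f}: difference {low:+.4f} at low lactose, {high:+.4f} at high lactose")
```

Small disagreements in the fourth decimal place relative to the table are expected, because the tabulated RPL 1.00 values are themselves rounded.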



 
Figure 2.4. The impact of an unaccounted-for variation of +/- 2% in the relative path 

length (RPL) of cuvettes used for the determination of the lactose concentration in 

milk. 

 
[Figure 2.4 plot. Y-axis: lactose difference (g/100 g milk); x-axis: anhydrous lactose (g/100 g milk); series: RPL 0.98, 0.99, 1.00, 1.01, and 1.02.]

Based on the variation in cuvette path length reported in Table 2.1,
unaccounted-for variation in relative cuvette path length would have a
slope and bias impact (Figure 2.4) when comparing results of reference chemistry
analysis for lactose among multiple laboratories conducting anhydrous lactose
measurement on milks used as reference samples for calibration of mid-infrared milk
analyzers. It can be seen in Figure 2.4 that unaccounted-for variation in cuvette path
length from lab to lab can cause both bias and slope differences in the comparison of
results among laboratories. Residual plots that evaluate laboratory
performance in comparison to the all-laboratory mean have been used in USDA Federal
Milk Markets to identify sources of difference in results among
laboratories (Wojciechowski et al., 2016). The impact of running the enzymatic lactose
assay on milks by weight versus by volume, and of accounting for, or not accounting for,
lab-to-lab variation in mean cuvette path length, on within-laboratory
(repeatability) and between-laboratory (reproducibility) method performance
was reported by Lynch et al. (2007). The standard deviation of
repeatability (Sr) was 0.0128 without path length adjustment and 0.0130 with
adjustment, and the standard deviation of reproducibility (SR) was 0.0349 without path
length adjustment and 0.0250 with adjustment.
Lack of correction for variation in cuvette path length among laboratories increased
differences in results among laboratories. When the assay was run by volume instead of
by weight, both the within- and between-lab agreement were much worse (Lynch et al.,
2007). The impact of this unaccounted-for variation in cuvette path length would
depend on the source, and changes in the source, of cuvettes purchased by the laboratories.
The accuracy and stability of calibration of infrared milk analyzers will be better when
the calibration reference samples have all-laboratory mean reference chemistry than
when the reference values are the results of one laboratory
(Wojciechowski et al., 2016).

 Enzymatic urea/MUN analysis. An unaccounted-for variation of (+/-) 2% 

relative in cuvette path length from an assumption of a nominal path length of 10.00 

mm caused a systematic variation in the mean reference values for a set of calibration 

reference values from 12.82 to 13.34 mg/100 g milk in MUN concentration (Table 5) 


35 

 
for mean cuvette path lengths from 9.8000 mm to 10.2000 mm, respectively. For 

different laboratories testing the same milks using different polystyrene cuvettes, this 

unaccounted-for variation in mean cuvette path length would impact the laboratory 

difference in mean test results depending on the source of cuvettes that were used and 

if a relative path length determination was not done and accounted for in the 

calculation of milk urea/MUN results. Based on the data for variation cuvette bath 

reported in Table 1, a (+/-) variation in unaccounted for variation in relative cuvette 

path length would have a slope and bias impact (Figure 5) when comparing results of 

reference chemistry analysis for MUN among multiple laboratories conducting MUN 

measurement on milks used as reference samples for calibration of mid-infrared milk 

analyzers. This error in reference value would be transferred to an error in the 

calibration infrared milk testing instruments (as discussed above for lactose analysis) 

used measure urea/MUN content of milk which impacts the calculation of energy 

output in milk produced by dairy cows and estimates nitrogen excretion into the 

environment and the efficiency of dietary nitrogen utilization (Godden et al, 2011). 

Errors in reference values would be transferred to an error in the calibration infrared 

milk testing instruments measuring the urea/MUN content of milk. Variation in 

observed MUN concentration in milk is used by dairy nutritionists to adjust dairy 

cattle rations to minimize excess excretion of urea nitrogen into the environment by 

dairy cows in manure and urine and to improve efficiency of milk protein production. 
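
The arithmetic behind Table 2.5 can be sketched as follows. This is a minimal illustration, assuming that the enzymatic result scales inversely with path length (Beer-Lambert law), so that a result calculated with the nominal 10.00 mm path length is divided by the relative path length (RPL) of the cuvette. The function name `correct_for_rpl` is hypothetical, and the output agrees with Table 2.5 only to within rounding of the tabulated values.

```python
def correct_for_rpl(mun_nominal, rpl):
    """Rescale a MUN result that was computed with the nominal path length.

    Assumption: absorbance is proportional to path length (Beer-Lambert),
    so concentration is inversely proportional to the relative path length.
    """
    if rpl <= 0:
        raise ValueError("relative path length must be positive")
    return mun_nominal / rpl


# Mean of the 14 samples at RPL = 1.00 (Table 2.5), in mg/100 g milk.
mean_nominal = 13.07

for rpl in (0.98, 0.99, 1.00, 1.01, 1.02):
    print(f"RPL {rpl:.2f}: {correct_for_rpl(mean_nominal, rpl):.2f} mg/100 g milk")
```

Under this assumption a cuvette that is 2% shorter than nominal raises the reported MUN by about 2%, and a cuvette that is 2% longer lowers it by about 2%, which is the slope effect described above.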

 

 
Table 2.5: Impact of an unaccounted-for variation of +/- 2% in the relative path length 

(RPL) of cuvettes used on milk urea nitrogen (MUN) measurements (mg/100g milk). 

 
                    Relative path length (RPL)

Sample     0.98     0.99     1.00     1.01     1.02

1         14.34    14.20    14.06    13.92    13.78
2         17.13    16.95    16.78    16.62    16.45
3          8.12     8.04     7.92     7.88     7.80
4         11.82    11.70    11.59    11.47    11.36
5         17.53    17.36    17.18    17.01    16.85
6          9.02     8.93     8.84     8.75     8.66
7         10.29    10.19    10.09     9.99     9.89
8         20.38    20.17    19.97    19.77    19.58
9         10.13    10.03     9.93     9.83     9.73
10        13.87    13.73    13.59    13.45    13.32
11        18.74    18.55    18.37    18.19    18.01
12         9.16     9.07     8.98     8.89     8.80
13        11.63    11.51    11.40    11.28    11.17
14        14.58    14.44    14.29    14.15    14.01

Mean      13.34    13.20    13.07    12.94    12.82

 

 
Figure 2.5. The impact of an unaccounted-for variation of +/- 2% in the relative path

length (RPL) of cuvettes used for the determination of the milk urea nitrogen (MUN)

concentration in milk. [Plot: x axis, MUN (mg/100 g milk), 7.0 to 19.0; y axis, MUN

difference (mg/100 g milk), -0.50 to 0.50; one series each for RPL 0.98, 0.99, 1.00,

1.01, and 1.02.]

 
CONCLUSIONS

 An optical method for the nondestructive determination of cuvette path length

was created using a chromatic confocal displacement sensor. The range of path length

of polystyrene cuvettes was found to be 3 to 4% relative to total cuvette path length

among suppliers. Such a large range indicates practically important variation in

cuvette path length from one supplier to the next. A difference in cuvette path length

was also found among different molds from the same cuvette manufacturer. The results

of this analysis can be used to screen and assign a certified path length value to boxes

of 100 cuvettes, and that path length value can be used in the calculation of lactose

and MUN reference values. These variations in path length were shown to have a

practical impact on lactose and MUN reference values used to calibrate electronic milk

testing equipment.

 
ACKNOWLEDGMENTS 

Funding was provided in part by the Test Procedures Committee of the USDA

Federal Milk Markets (Carrollton, Texas). Manufacture of the cuvette testing apparatus

was done in the mechanical shop at North Carolina State University, Raleigh, NC. 

 
REFERENCES 

Association of Official Analytical Chemists (AOAC). 2023. 22nd Edition AOAC 

International Official Method of Analysis. AOAC INTERNATIONAL, 2275 

Research Blvd, Suite 300 Rockville, MD 20850. 

Berkovic, G. and E. Shafir. 2012. Optical methods for distance and displacement 

measurements. Adv. Opt. Photonics 4:441–471.

https://doi.org/10.1364/AOP.4.000441

Broderick, G. A., and M. K. Clayton. 1997. A statistical evaluation of animal and 

nutritional factors influencing concentrations of milk urea nitrogen. J. Dairy 

Sci. 80:2964–2971. https://doi.org/10.3168/jds.S0022-0302(97)76262-3.  

Choi, Y.-M., H. Yoo, and D. Kang. 2020. Large-area thickness measurement of 

transparent multi-layer films based on laser confocal reflection sensor. 

Measurement 153:107390. https://doi.org/10.1016/j.measurement.2019.107390 



 
European Chemicals Agency (ECHA). 2024. Substance information: Potassium 

Chromate. European Chemicals Agency. Accessed June 17, 2025. 

https://echa.europa.eu/substance-information/-/substanceinfo/100.029.218 

Godden, S. M., K. D. Lissemore, D. F. Kelton, K. E. Leslie, J. S. Walton, and J. H.

Lumsden. 2001. Relationships between milk urea concentrations and nutritional

management, production, and economic variables in Ontario dairy herds. J.

Dairy Sci. 84:1128–1139. https://doi.org/10.3168/jds.S0022-0302(01)74573-0.

KEYENCE Corporation. 2025. CL-3000 Series: Confocal Displacement Sensor. 

KEYENCE Corporation of America, Itasca, IL. 

https://www.keyence.com/products/measure/laser-1d/cl-3000/?search_sl=1. 

Accessed July 7, 2025. 

Lewis, D. 1957. Blood-urea concentration in relation to protein utilization in the 

ruminant. J. Agric. Sci. 48:438–446. 

https://doi.org/10.1017/S0021859600032962  

Lynch, J. M., D. M. Barbano, and J. R. Fleming. 2007. Determination of the lactose

content of fluid milk by spectrophotometric enzymatic analysis using weight

additions and path length adjustment: Collaborative study. J. AOAC Int.

90:196–219.

Pollott, G. E. 2004. Deconstructing milk yield and composition during lactation using 

biologically based lactation models. J. Dairy Sci. 87:2375–2387.

https://doi.org/10.3168/jds.S0022-0302(04)73359-7.  



 
Portnoy, M., C. Coon, and D. M. Barbano. 2021. Lactose: Use, measurement, and 

expression of results. J. Dairy Sci. 104:8314–8325. 

https://doi.org/10.3168/jds.2020-19967. 

Portnoy, M., C. Coon, and D. M. Barbano. 2021. Performance evaluation of an 

enzymatic spectrophotometric method for milk urea nitrogen. J. Dairy Sci. 

104:11422–11431. https://doi.org/10.3168/jds.2021-20308. 

ThermoFisher. 2021. Safety data sheet: potassium chromate. Fisher Scientific

Company, One Reagent Lane, Fair Lawn, NJ 07410.

https://www.fishersci.com/store/msds?partNumber=P220100&productDescription=POTASSIUM+CHROMATE+ACS+100GM&vendorId=VN00033897&countryCode=US&language=en.

Accessed July 7, 2025.

Tyrrell, H. F., and J. T. Reid. 1965. Prediction of the energy value of cow’s milk. J.

Dairy Sci. 48:1215–1223. https://doi.org/10.3168/jds.S0022-0302(65)88430-2.

Wang, Y., L. Qiu, J. Yang, and W. Zhao. 2013. Measurement of the refractive index 

and thickness for lens by confocal technique. Optik 124:2825–2828.

http://dx.doi.org/10.1016/j.ijleo.2012.08.053  

Weng, C.-J., B.-R. Lu, P.-Y. Cheng, C.-H. Hwang, and C.-Y. Chen. 2017. Measuring 

the thickness of transparent objects using a confocal displacement sensor. 

Proc. IEEE Int. Instrum. Meas. Technol. Conf. 

https://doi.org/10.1109/I2MTC.2017.7969804 



 
Wojciechowski, K. L., C. Melilli, and D. M. Barbano. 2016. A proficiency test system 

to improve performance of milk analysis methods and produce reference 

values for component calibration samples for infrared milk analysis. J. Dairy 

Sci. 99:6808-6827.  

Yu, Q., K. Zhang, C. Cui, R. Zhou, F. Cheng, R. Ye, and Y. Zhang. 2018. Method of 

thickness measurement for transparent specimens with chromatic confocal 

microscopy. Appl. Opt. 57:9722–9728. https://doi.org/10.1364/AO.57.009722 

 

 
CHAPTER 3 

NEAR INFRARED MILK COMPONENT ANALYSIS MODELS 

 
ABSTRACT 

Our objective was to determine if the use of near infrared (NIR) milk spectra 

from a combination of modified milks (with an orthogonal design of main component 

concentrations) and individual farm milks with all-lab mean (n=8 laboratories) 

reference chemistry would produce NIR partial least squares prediction models that 

could achieve the validation accuracy of mid infrared milk analysis. Partial least 

square prediction models were developed for a commercial near infrared milk analyzer 

to predict the fat, true protein, anhydrous lactose, and total solids content of 

homogenized and unhomogenized milk using a modeling population of milks that 

included orthogonal design modified milks and individual farm milks. A commercial 

mid-infrared milk analyzer with models for testing homogenized milk was used for a 

validation performance comparison using a common set of validation samples. A

unique aspect of the current study was the use of model development samples and

validation samples that had all-lab mean reference chemistry (n=8 laboratories) for

each milk sample. Validation performance of all 3

indirect methods of estimation of milk components was compared. Partial least

square models were developed for estimation of fat, true protein and total solids 

concentration in milk using NIR transmission spectra that had analytical accuracy 

performance on external validation that was equivalent to MIR transmittance analysis 

of the same milks. The mean difference and standard error of prediction values for fat, 



 
protein, and total solids were in compliance with the expected performance accuracy 

values indicated in standard methods for examination of dairy products. The accuracy 

of prediction of fat, true protein and total solids on a weight/weight basis was better 

than previously published NIR models and that improvement was attributed to the 

design of the population of milks used for the modeling and the quality of the 

chemical reference method values derived from all-lab mean reference chemistry using

AOAC performance validated reference chemistry methods. 

 
Key Words: Near infrared, milk components, partial least squares  

 
INTRODUCTION 

Data for milk composition is important for determination of producer payment, 

dairy herd management, and dairy food product quality assurance. The current 

industry standard for rapid milk analysis is mid-range infrared analysis, either with 

fixed virtual filter wavelengths (Kaylegian et al., 2009) or Partial Least Squares (PLS) 

regression prediction models (Haaland and Thomas, 1988). Recently, MIR PLS 

models have been developed that have sufficient accuracy to be useful for evaluation 

of health and feed efficiency/rumen function for testing milk in a laboratory. Examples 

of these models are fatty acid models for de novo, mixed origin, and preformed fatty acids

(Woolpert et al., 2016) and for milk fatty acid chain length and mean unsaturation

(Wojciechowski and Barbano, 2016). In addition, milk estimated blood non-esterified

fatty acids (NEFA) (Bach et al., 2020), and milk beta-hydroxy butyrate and acetone 

(Bach et al., 2020) are indicators of ketosis and other transition cow metabolic 

problems (Seely et al. 2022). The next goal in the development of new milk analysis 



 
methods for farm management is the implementation of in-line milk sensors to enable 

real time individualized cow data for milk quality, reproduction, and cow health 

monitoring. However, this application is a challenge for current MIR milk analyzers 

that are not well suited for integration into harsh, large-scale farm environments, such 

as robotic milking systems, for testing unhomogenized milk. Some success has been

achieved in integrating near-infrared (NIR) spectrometers into milking systems

(Kawasaki, 2008; Kaniyamattam, 2014), but these systems have yet to meet the current industry

standards for accuracy in measurement of major milk components, particularly fat, 

when compared to benchtop MIR analysis. 

Currently, the dairy industry uses MIR milk testing at central payment testing 

labs that analyze milk from each farm almost every day and sometimes every tanker 

load of milk on very large farms with data (fat, protein, lactose, solids, milk fatty 

acids, milk urea, etc.) sent back to the dairy farm management in 48 h after milk 

pickup. This testing is highly accurate and fit for purpose for high level management 

decision making on the whole herd, or large feeding groups of cows on large farms 

because the data is frequent and continuous. Central DHIA laboratories provide high-

speed testing of individual cow milks for dairy record-keeping to support genetic 

selection and improvement of dairy cow performance through breeding, but the 

frequency is low (monthly, or quarterly) and not suitable for tactical day-to-day

decision making on farm. 

There are several challenges when trying to implement a practical, cost-

effective, on-farm, in-line milk analysis system (using any analysis technology) that 

provides actionable information on individual cows for tactical farm management 



 
decision making. The challenges are: accuracy of the milk composition prediction 

models, dissolved air in the milk, variation in milk temperature, foreign material in the 

milk, fouling of the IR cell window, and the change in milk composition from the 

beginning to ending of milking of each individual cow that varies from cow to cow. 

The first step is to develop NIR models that are of sufficient accuracy to be fit for 

purpose for farm management decision making. 

 Assuming that information to predict milk component concentrations is hidden 

in the NIR spectra, improvement must be made to the accuracy of models used to 

predict milk component concentrations. To build a PLS prediction model, many milks 

must be analyzed by accurate reference chemistry methods to obtain a chemical 

reference value to pair with each milk spectrum. The accuracy of a PLS model is limited by the accuracy

of the reference chemistry of the modeling sample population used for model 

development. This principle extends to the strength of model validation, as the 

performance of a model is quantified by the accuracy to the reference chemistry of the 

external validation samples. The chemical analysis methods implemented for 

reference chemistry can make a significant difference in the reliability of the resulting 

values. The chemical reference methods used in our study for fat, true protein, 

anhydrous lactose, and total solids were all performance-validated AOAC methods

with established performance statistics for repeatability and reproducibility (Lynch,

1998). The chemical reference values for both the modeling and external

validation milks in our study were strengthened using reference chemistry from 8 or 

more laboratories for each milk component on each sample, as described in 



 
Wojciechowski et al. (2016), to calculate an all-lab mean reference value for each 

component for each milk. 

An iterative modeling process was done using Mahalanobis distance to identify 

and remove concentration and spectral outliers from the population of milks used for 

the modeling. Then, the appropriate number of eigenvectors (also known as factors or

rank) of the model was selected to avoid both underfitting and overfitting the model,

with a goal of getting a high R-squared and RPD (residual prediction deviation) and a

low RMSECV (root mean square error of cross validation) for the prediction model.

Williams (1993) reported that the RPD is particularly useful for standardizing the SEP

for comparison across PLS models measuring analytes at very different absolute

concentrations in the sample matrix, as well as serving as a base quantifier for

expected practical applications of performance when a model is externally validated.

Despite the universality of RPD in NIR, it is important to address certain limitations of

the RPD, as discussed by Ebensen (2014). Williams (1993) notes that RPD functions

on the assumption of a normal distribution of data in the modeling sample set and the 

avoidance of high leverage samples. In fact, even one high leverage sample can be 

enough to artificially inflate RPD values (Ebensen, 2014). These concerns were 

carefully considered and addressed in the design of our study.  
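
The sensitivity of the RPD to a single high-leverage sample can be illustrated with a short sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSECV. The fat concentrations and the RMSECV below are invented numbers for illustration only, not data from this study.

```python
import statistics


def rpd(reference_values, rmsecv):
    """RPD = sample standard deviation of the reference values / RMSECV."""
    return statistics.stdev(reference_values) / rmsecv


# Hypothetical fat reference values (g/100 g) and a fixed, hypothetical RMSECV.
fat = [3.2, 3.5, 3.6, 3.8, 3.9, 4.0, 4.1, 4.3]
rmsecv = 0.05

rpd_base = rpd(fat, rmsecv)
# Adding one extreme sample widens the standard deviation, so the RPD rises
# even though the prediction error is unchanged.
rpd_leveraged = rpd(fat + [9.0], rmsecv)

print(f"RPD without the extreme sample: {rpd_base:.1f}")
print(f"RPD with one extreme sample:    {rpd_leveraged:.1f}")
```

The prediction error is held constant in both cases, so any increase in RPD comes only from the wider concentration range, which is why the distribution of the modeling population matters when RPD values are compared.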

Our objective was to determine if the use of milk spectra from a combination 

of modified milks (with an orthogonal design of main component concentrations) and 

individual farm milks with all-lab mean (n=8 laboratories) reference chemistry would 

produce NIR PLS prediction models that would have smaller RMSEP for farm milks 



 
than models developed only from spectra of individual farm milks with individual lab 

chemical reference values. 

MATERIALS AND METHODS 

 Experimental Design: Partial least squares (PLS) prediction models were

developed for a commercial near infrared (NIR) milk analyzer to predict the fat, true

protein, anhydrous lactose and total solids content of homogenized and

unhomogenized milk using a modeling population of milks that included modified milks

and individual farm milks. A commercial mid-infrared milk analyzer with models

for testing homogenized milk was used for a validation performance comparison using

a common set of validation samples. A unique aspect of the current study was the use of

model development samples and validation samples that had all-lab mean reference

chemistry (n=8 laboratories) for each milk sample.

Validation performance of all 3 indirect methods of estimation of milk

components was compared.

Milks used to produce spectra for model development. For the modeling, a 

population of spectra from duplicate analysis of 60 farm milks (from different regions 

of the US over a 4-month period) and 42 modified milks (3 independent sets of 14 

samples produced over 3 months) were used to develop PLS models for prediction of 

the concentration of fat, true protein, anhydrous lactose, and total solids content of 

homogenized milks from NIR spectra. For the unhomogenized milk model

development, duplicate analysis of 56 individual farm milks and 42 modified milks 

were used for model development. The design and production method for the modified 



 
milks (14 sample orthogonal design) has been described by Kaylegian et al. (2006), as 

modified by Portnoy et al. (2020). The all-laboratory mean chemical reference values 

for the modified milks were established as described in Wojciechowski et al. (2016) by a

combination of USDA Federal Milk Market Laboratories and Cornell University. The 

same approach of running reference chemistry by the same group of laboratories was 

performed on the 60 individual farm milks included in the modeling set.

The reference chemistry methods used for analysis by all reference testing 

laboratories are as follows: fat, true protein, anhydrous lactose, and total solids 

measurements were determined in duplicate in each laboratory using the following 

validated methods (AOAC International, 2023): fat by modified Mojonnier ether 

extraction (method 989.05), true protein by Kjeldahl analysis (method 991.22), lactose 

by enzymatic analysis (method 2006.06), and total solids by atmospheric forced-air 

oven drying (method 990.20).  

 Milk used for external validation of model performance. The external 

validation performance of the homogenized milk and unhomogenized milk NIR 

prediction models for fat, true protein, anhydrous lactose and total solids was

compared to performance of MIR models. Validation was done with 48 individual

farm milks, from different regions of the US over a 3-month period, that had all-lab

mean (n=8) reference chemistry values determined as described above.

 Near infrared analyzer. A Bruker MPA II near infrared (NIR) Dairy Analyzer 

was paired with a liquid sampling module. The analyzer was equipped with Bruker 

software OPUS version 8.7.41 (Bruker Optics, 2021) and OPUS Insight version 2.0.0 

(Bruker Scientific, Billerica MA). The MPA II contains a variable flow system with 



 
the option to include, or bypass, the in-line, two-stage homogenizer. The homogenizer 

bypass enabled analysis of a homogenized or unhomogenized portion of the same milk 

and collection of an NIR spectrum using the same flow-through cuvette. The cuvette had

a path length of 0.1 cm. The MPA II spectrometer had a spectral range of 11,500 to

4,000 cm–1 with a resolution maximum of 2 cm–1. 

NIR PLS modeling. The OPUS software that is included with the NIR milk 

analyzer is PLS modeling software. A PLS model was developed for

each major milk component (i.e., fat, protein, lactose, and total solids) separately for 

homogenized and unhomogenized milks. Analysis types were saved as a quant file 

(*.q2) using the modeling milk samples described above.  

Several conditions within the PLS modeling process needed to be selected 

prior to building a PLS model. First, many of these conditions should be driven by a 

fundamental knowledge of the components to be measured in the sample material, the 

characteristics of the sample matrix with respect to characteristics not being modeled, 

the homogeneity of the sample material and need for sample preparation prior to 

spectra collection and the quality of the reference data for the component to be 

predicted by the model. Second, the population of samples that will be used for the 

modeling should have a wide range of the concentration of each of the components 

that need to be modeled and the modeling sample population should be constructed to

avoid collinearity among the major components that will be modeled. Modeling

should start from a robust sample population that is designed to address these issues,

not just a large population of samples that are not uniformly distributed and that have

major collinear relationships. This is not easy to achieve, but if no emphasis is placed

on this from the beginning, then the models will not be robust. This is why we chose to

build a modeling population that included a diverse range of individual farm milks 

from varied sources and the modified milk samples to break collinearity among the 

major components. Once a beginning modeling population is built, several conditions 

for the modeling process need to be selected. 

The first step in modeling is the selection of the wavenumber

ranges of the spectra that will be used in the model. In practice,

this means exclusion of wavenumber ranges that are problematic. In biological

samples that contain high concentration of water, the wavenumbers where water 

absorbance of infrared light is very strong should be excluded. This is the case when 

testing milk and is true for modeling both in MIR and NIR. In the current study, we 

decided to use the wavenumber ranges of 9832 to 8600, 8000 to 7328, 6632 to 5344, 

and 4832 to 4208 cm⁻¹ (within the 1000 to 2500 nm range) to minimize the negative

impact of areas of high absorption of the infrared signal by water while keeping

wavenumbers that may contain useful information for modeling. The next modeling

decision is to choose among a wide range of approaches that can be used to

preprocess the spectral data, including mean centering. OPUS, and most other

modeling software, offer several methods of spectra data preprocessing such as 

constant offset elimination, straight line subtraction, vector normalization, min-max 

normalization, multiplicative scattering correction, first derivative, second derivative, 

etc., as described by Conzen (2005). In our study we optimized to 17 smoothing 

points. 
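
As a quick check of the selected ranges, the sketch below converts each wavenumber limit to wavelength (wavelength in nm = 10^7 / wavenumber in cm⁻¹) and counts the data points in each range. The 8 cm⁻¹ point spacing is an assumption, not a value stated in the text; it is used here because it reproduces the 481 selected data points reported in Table 3.1. The function names are illustrative.

```python
# Wavenumber ranges (cm^-1) used for the homogenized milk models.
RANGES_CM1 = [(9832, 8600), (8000, 7328), (6632, 5344), (4832, 4208)]

# Assumed spacing between spectral data points (cm^-1); not stated in the text.
ASSUMED_SPACING_CM1 = 8


def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm1


def points_in_range(high, low, spacing=ASSUMED_SPACING_CM1):
    """Count evenly spaced points from high to low, both ends included."""
    return (high - low) // spacing + 1


total_points = 0
for high, low in RANGES_CM1:
    n = points_in_range(high, low)
    total_points += n
    print(f"{high}-{low} cm^-1 = {wavenumber_to_nm(high):.0f}-"
          f"{wavenumber_to_nm(low):.0f} nm, {n} points")
print("total points:", total_points)
```

All four ranges fall between roughly 1017 and 2376 nm, consistent with the stated 1000 to 2500 nm window.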



 
Optimized models based upon the selection of the conditions described above 

were then produced by OPUS, and then the models were ranked by the root mean 

squared error of cross validation (RMSECV) and were evaluated by competitive 

RMSECV and Rank (number of factors). These models were cross validated leaving 

one sample out. Outlier samples were identified and removed as either concentration 

or spectral outliers. This process was repeated for each of the major components, both 

with the homogenized and the unhomogenized modeling sample sets.  
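
Leave-one-out cross validation and the resulting RMSECV can be sketched as below. To keep the example self-contained, a one-variable least-squares line stands in for the PLS model built by OPUS, and the absorbance and reference values are invented. Only the leave-one-out bookkeeping is the point of the example.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for a PLS fit)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsecv_leave_one_out(x, y):
    """Root mean square error of leave-one-out cross validation."""
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        prediction = slope * x[i] + intercept
        squared_errors.append((prediction - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical spectral feature (x) and reference fat values (y, g/100 g).
absorbance = [0.21, 0.30, 0.38, 0.52, 0.60, 0.71]
fat_reference = [2.05, 2.98, 3.83, 5.17, 6.02, 7.08]

print(f"RMSECV = {rmsecv_leave_one_out(absorbance, fat_reference):.4f}")
```

Because each sample is predicted by a model that never saw it, a sample with an unusually large cross-validation error is a natural candidate for review as a concentration or spectral outlier.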

 Mid-infrared Analyzer Models. Mid infrared analyzers typically have an in-line

homogenizer built into the pump-in-flow system of the instrument. Therefore, all milk

that reaches the cuvette for MIR milk analysis has been homogenized by the

homogenizer in the instrument and the MIR prediction models are developed for

analysis of instrument-homogenized milks. In the MIR spectra the light absorbance by

signature chemical bonds in fat, protein, and lactose is stronger than in the NIR

spectra and this makes it possible to use narrow ranges of sample and reference (i.e., 

basic filter wavelengths) wave numbers to measure fat, protein, and lactose. MIR PLS 

models can also be used for prediction of fat, true protein and anhydrous lactose. In 

the current study, the basic filter model approach was used based on optimized basic 

fixed sample and reference MIR filter models as described by Kaylegian et al. (2009). 

The instrument was checked for precalibration performance as described by Lynch et 

al. (2006). 

 Milk fat, true protein, and anhydrous lactose content were determined using a 

Fourier transform mid-infrared (FTIR) spectrophotometer (Lactoscope model FTA, 

Delta Instruments, Drachten, The Netherlands). The prediction models used were the 



 
optimized basic model filter wavelengths and intercorrection factors described by 

Kaylegian et al. (2009). Calibration of the FTIR for measurement of fat, true protein,

anhydrous lactose, solids and milk urea nitrogen (MUN) was done using a 14-sample 

modified milk calibration set (Kaylegian et al., 2006; Portnoy et al. 2021a) produced 

monthly.  

NIR and MIR Analyzer Calibration and Validation. 

Calibration. Both the NIR and the MIR instruments used in the current study for validation

sample testing were calibrated with the same unhomogenized modified milk calibration

samples. About 400 sets of these calibration milks are produced every 4 weeks in our

laboratory. The same sets of modified milk calibration samples (that were not used in

the model development) were used to calibrate (i.e., adjust the final slope and

intercept) both the MIR basic filter models and the NIR PLS models (developed

in the current study for homogenized and unhomogenized milks) for prediction of fat,

true protein, anhydrous lactose, and total solids. The reference

values for fat, true protein, anhydrous lactose, and total solids of the calibration samples

were all-lab mean reference values as described by Wojciechowski et al. (2016).

External validation of NIR and MIR models. The same sets of validation milk 

samples that were not used in the model development were used to validate the 

performance of the MIR and NIR models (homogenized milk and unhomogenized 

milk) for prediction of fat, true protein, anhydrous lactose, and total solids. The 

validation samples were 48 individual farm milks. The validation milks

were the FMMO common control sets (10 milks per set), produced fresh by the USDA

Federal Milk Market laboratories and used to check industry laboratories every 3 weeks,

and one FMMO internal USDA validation sample set (8

samples per set), produced quarterly. The 48 farm milks used for validation in the

current study also had all-lab mean reference chemistry as described by

Wojciechowski et al. (2016). The same sets of validation milks were run on the NIR

milk analyzer with and without the homogenizer in-line and on an MIR milk analyzer 

with an in-line homogenizer. The validation testing of the 48 milks was done over a 

period of about 16 weeks and during that period both the NIR and MIR instruments 

were calibrated with a new set of modified milk samples every 4 weeks during the 16-

week period. Mean difference from all-lab mean reference chemistry and standard 

error of prediction are reported for all milk components.  
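
The two validation statistics reported here can be computed as in the following sketch. It assumes that the mean difference is the average of (instrument prediction minus all-lab mean reference) and that the standard error of prediction is the sample standard deviation of those differences. The text does not spell out the formulas, and the values below are invented for illustration.

```python
import statistics


def validation_stats(predicted, reference):
    """Return (mean difference, standard error of prediction).

    Assumed definitions: differences are predicted minus reference, and the
    SEP is the sample standard deviation of the differences (bias removed).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    differences = [p - r for p, r in zip(predicted, reference)]
    return statistics.mean(differences), statistics.stdev(differences)


# Hypothetical fat results (g/100 g) for five validation milks.
nir_predicted = [3.52, 4.11, 3.87, 4.45, 3.30]
all_lab_mean = [3.50, 4.08, 3.90, 4.41, 3.31]

md, sep = validation_stats(nir_predicted, all_lab_mean)
print(f"mean difference = {md:+.4f}, SEP = {sep:.4f}")
```

With these definitions the mean difference captures systematic bias between instrument and reference chemistry, while the SEP captures the sample-to-sample scatter that remains after the bias is removed.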

RESULTS 

 
Homogenized Milk Models 

 
 NIR Model development. We first explored different combinations of ranges 

of wavelengths and finally decided on four ranges that contained a total of 481 

wavenumber data points, that worked well in the NIR PLS modeling, and those 

wavelength ranges were kept the same for modeling all four milk components for 

homogenized milk, as shown in Table 1. A population of 204 spectra was used at the

start of the modeling process for all 4 milk components. Several different spectra data 

preprocessing methods were then tested and a first derivative transformation with a 

multiplicative scatter correction with 17 smoothing points was selected as a method 

that worked well for all four milk components for homogenized milk. Multiple PLS 

modeling runs were done in a stepwise sequence for each milk component to 

progressively identify and remove individual concentration and spectral outliers from 



 
the model for each milk component. During the modeling we observed the change in

R-square and RMSECV with an increasing rank (i.e., factor) number.

Generally, as rank increased, the R-square increased and the RMSECV decreased at a

decreasing rate with increasing rank. We selected a final rank for the model when 

these two metrics slowed in their rate of change with increasing rank.  
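
The selected preprocessing can be sketched in plain Python. This is an illustration under stated assumptions, not the OPUS implementation: the first derivative is taken with a 17-point Savitzky-Golay filter fitted with a quadratic polynomial, the derivative is applied before the multiplicative scatter correction (MSC), and MSC regresses each spectrum on the mean spectrum. The function names and the toy spectra are hypothetical.

```python
def savgol_first_derivative(spectrum, points=17):
    """Savitzky-Golay first derivative (quadratic fit, unit point spacing).

    For a quadratic fit the weights are i / sum(k^2) for offsets i = -m..m.
    Only positions with a full window are returned.
    """
    m = points // 2
    norm = sum(k * k for k in range(-m, m + 1))
    return [
        sum(i * spectrum[center + i] for i in range(-m, m + 1)) / norm
        for center in range(m, len(spectrum) - m)
    ]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_points = len(spectra[0])
    mean = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_points)]
    mean_avg = sum(mean) / n_points
    sxx = sum((v - mean_avg) ** 2 for v in mean)
    corrected = []
    for s in spectra:
        s_avg = sum(s) / n_points
        # Fit s = offset + slope * mean, then remove the offset and slope.
        slope = sum((v - mean_avg) * (x - s_avg) for v, x in zip(mean, s)) / sxx
        offset = s_avg - slope * mean_avg
        corrected.append([(x - offset) / slope for x in s])
    return corrected


# Toy spectra: the same curve with a different scale and offset (scatter).
base = [0.001 * j * j for j in range(40)]
raw = [base, [1.3 * v + 0.2 for v in base]]
processed = msc([savgol_first_derivative(s) for s in raw])
```

In this toy case the derivative removes the constant offset and the MSC removes the scale difference, so the two processed spectra coincide. Real milk spectra would not collapse completely, because they also differ in composition.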

 

 
Table 3.1. Model structure parameters and modeling metrics (i.e., R-square,

RMSECV, and RPD) for prediction of fat, anhydrous lactose, true protein, and total solids

concentration of milk when an in-line homogenizer was in the NIR flow system.

 
Modeling characteristic   Fat           Anhydrous     True          Total
                                        lactose       protein       solids

Total spectra             204           204           204           204
Spectra used              198           193           197           194
Mean concentration        3.7693        4.6036        3.2888        12.7795
Standard deviation        1.3853        0.2383        0.4870        1.5802

Frequency ranges, cm-1    9832-8600,    9832-8600,    9832-8600,    9832-8600,
(1000 to 2500 nm)         8000-7328,    8000-7328,    8000-7328,    8000-7328,
                          6632-5344,    6632-5344,    6632-5344,    6632-5344,
                          4832-4208     4832-4208     4832-4208     4832-4208

Selected data points      481           481           481           481
Preprocessing method      First         First         First         First
                          derivative    derivative    derivative    derivative
                          + MSC(1)      + MSC         + MSC         + MSC
Smoothing points          17            17            17            17
Rank                      10            9             9             7
R-square                  0.999         0.984         0.999         0.999
RMSECV(2)                 0.0269        0.0298        0.0164        0.0290
RPD(3)                    51.5          8.0           29.7          54.3
Outlier spectra removed   6             11            7             10

 
(1) MSC = Multiplicative scatter correction
(2) RMSECV = Root mean square error of cross validation
(3) RPD = Residual prediction deviation = standard deviation of the reference

chemistry divided by the root mean square error of cross validation

 

 
Figure 3.1. Homogenized milk model predicted (X axis) versus reference chemistry 

(Y axis) graph for each of the four major components: fat, protein, lactose, and total 

solids. 

 
NIR modeling performance. Two common metrics of PLS model performance 

are the RMSECV and RPD. Generally, when the absolute value of the RMSECV is 

less than 1% relative of the mean concentration of the component being modeled, then 

the relative selectivity performance of the model given the component concentration is 

very good. This was the case for all 4 models of components in homogenized milk 

shown in Table 1. The RPD is another metric that should be indicative of the potential 

for performance of a PLS model when testing external validation samples. After

finalizing outlier removal, the reference chemistry was plotted as a function of

the values predicted by the final PLS models for fat, lactose, protein, and total solids

content of homogenized milk (Figure 1). Together with the mean and

standard deviation of concentration for each component in Table 1, the data presented in

Figure 1 show that the concentrations of each component were distributed so as to avoid

high-leverage samples and minimize distortion of the RPD. The RPD values (Table 1)

differed among the 4 components. A higher RPD value determined by cross

validation during modeling is an indication that a model will perform well. Generally,

models with an RPD of > 8 are considered very good models (Conzen, 2005;

Williams and Sobering, 1993). The models for fat, protein and total solids had high

RPD values, while lactose had a lower model RPD (Table 1). However, running an

external validation of a model is the best approach for evaluation of a PLS model for

milk analysis.  
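
The two modeling metrics discussed above can be recomputed from the values in Table 3.1. The sketch below uses the footnoted definition of RPD (standard deviation of the reference chemistry divided by RMSECV) and expresses RMSECV as a percentage of the mean concentration. The dictionary layout and function names are illustrative. Because the tabulated inputs are rounded, a recomputed RPD can differ slightly from the tabulated RPD.

```python
# (mean concentration, standard deviation, RMSECV) from Table 3.1.
TABLE_3_1 = {
    "fat": (3.7693, 1.3853, 0.0269),
    "anhydrous lactose": (4.6036, 0.2383, 0.0298),
    "true protein": (3.2888, 0.4870, 0.0164),
    "total solids": (12.7795, 1.5802, 0.0290),
}


def relative_rmsecv_percent(mean, rmsecv):
    """RMSECV as a percentage of the mean component concentration."""
    return 100.0 * rmsecv / mean


def rpd(std_dev, rmsecv):
    """Residual prediction deviation = standard deviation / RMSECV."""
    return std_dev / rmsecv


for component, (mean, std_dev, rmsecv) in TABLE_3_1.items():
    print(f"{component}: RMSECV = {relative_rmsecv_percent(mean, rmsecv):.2f}% "
          f"of mean, RPD = {rpd(std_dev, rmsecv):.1f}")
```

Lactose has a relative RMSECV similar to the other components but a much smaller standard deviation in the modeling population, which is what produces its lower RPD.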

Unhomogenized Milk Models  

NIR Model Development. The approach used for development of PLS models 

to predict fat, lactose, protein, and total solids concentration in unhomogenized milks

was the same approach as we used for homogenized milks in Table 1.

Unhomogenized milks were expected to have more light scattering due to larger fat

globules than homogenized milks. In unhomogenized milk, NIR light scattering

and light absorbance by chemical bonds in the structure of milk fat both increase in a

collinear fashion. This presents a challenge for PLS modeling of fat and other milk

components because it is difficult to uncouple the relationship between increasing fat 

concentration and NIR light scattering. As a result, we started the PLS modeling with 

the same conditions that worked well for the homogenized milk. By comparison of the 

spectra of homogenized and unhomogenized milks from the same NIR instrument 

with a 1 cm flow through cuvette and comparing the beta coefficients at each data 

point for the two milk types, we were able to determine ranges of wavenumbers where

light scattering had a large influence on the unhomogenized milk spectra compared

with the homogenized milk spectra. As a result, we modified the wavenumber

ranges (Table 2) to remove ranges of wavenumbers from the spectra to be used for the 

PLS modeling for unhomogenized milks. In addition, we found that selection of 

different preprocessing methods than those used for homogenized milk improved the

metrics of PLS model performance (i.e., lower RMSECV and higher RPD) and decreased the

number of factors. The distributions of the final concentrations of fat, protein, lactose, 

and total solids in samples used to make the unhomogenized milk PLS model are 

shown in Figure 2.  
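
As an illustration of the kinds of spectral preprocessing named in Table 2, the following Python sketch gives generic textbook forms of constant offset elimination, min-max normalization, and an unsmoothed first derivative. These definitions, the function names, and the example absorbance values are assumptions made for illustration only; the commercial modeling software used in this study may define and scale these operations differently (for example, by smoothing the derivative over 17 points).

```python
def constant_offset_elimination(spectrum):
    """Shift the spectrum so that its minimum absorbance is zero."""
    lowest = min(spectrum)
    return [a - lowest for a in spectrum]


def min_max_normalization(spectrum):
    """Shift the minimum to zero, then scale so that the maximum is one."""
    shifted = constant_offset_elimination(spectrum)
    highest = max(shifted)
    if highest == 0:
        raise ValueError("a flat spectrum cannot be min-max normalized")
    return [a / highest for a in shifted]


def first_derivative(spectrum):
    """Central-difference slope per data point step (no smoothing applied)."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / 2
        for i in range(1, len(spectrum) - 1)
    ]


# Hypothetical absorbance values at five consecutive data points.
spectrum = [0.50, 0.75, 1.50, 1.00, 0.50]

offset_removed = constant_offset_elimination(spectrum)
normalized = min_max_normalization(spectrum)
derivative = first_derivative(spectrum)
```

Each function removes a different kind of baseline effect, which is one reason the best preprocessing choice can differ among components and between homogenized and unhomogenized milks.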

  

 
Table 3.2. Model structure parameters and modeling metrics (i.e., R-square, 

RMSECV, and RPD) for prediction of fat, anhydrous lactose, true protein, and total

solids concentration of milk when there was no in-line homogenizer in the NIR flow 

system. 

 
Modeling characteristic   Fat          Anhydrous lactose   True protein      Total solids
Total spectra             195          195                 195               195
Spectra used              183          176                 177               183
Mean concentration        3.7199       4.5981              3.2374            12.6967
Standard deviation        1.4228       0.2406              0.5050            1.6071
Frequency ranges, cm-1    9832-8600,   9832-8600,          9832-8600,        9832-8600,
(1000 to 2500 nm)         8000-7328,   8000-7328,          8000-7328,        8000-7328,
                          6632-5832,   6632-5832,          6632-5832,        6632-5832,
                          4752-4208    4752-4208           4752-4208         4752-4208
Selected data points      410          410                 410               410
Preprocessing method      Min-Max      NONE                First             Constant Offset
                                                           Derivative        Elimination
Smoothing points          NONE         NONE                17                NONE
Rank                      9            10                  9                 10
R-square                  0.999        0.965               0.997             0.998
RMSECV¹                   0.0441       0.0461              0.0272            0.0671
RPD²                      32.2         5.3                 18.5              23.0
Outlier spectra removed   12           19                  18                12

¹RMSECV = root mean square error of cross-validation 
²RPD = residual prediction deviation = ratio of the standard deviation of the reference 

chemistry to the root mean square error of cross-validation  
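
The count of 410 selected data points in Table 3.2 can be checked against the four frequency ranges with a short calculation. The Python sketch below assumes a data point spacing of 8 cm-1, which is not stated in the table but is consistent with the total of 410 points, and it also converts the outer range limits from wavenumber to wavelength. The constant and function names are hypothetical.

```python
# Frequency ranges (cm-1) copied from Table 3.2, as (high, low) pairs.
RANGES_CM1 = [(9832, 8600), (8000, 7328), (6632, 5832), (4752, 4208)]

# Assumed data point spacing in cm-1 (not stated in the table).
SPACING_CM1 = 8


def points_in_range(high, low, spacing):
    """Number of data points from high to low inclusive at a fixed spacing."""
    return (high - low) // spacing + 1


def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm1


counts = [points_in_range(high, low, SPACING_CM1) for high, low in RANGES_CM1]
total_points = sum(counts)

shortest_nm = wavenumber_to_nm(9832)  # about 1017 nm
longest_nm = wavenumber_to_nm(4208)   # about 2376 nm
```

Under the assumed spacing, the four ranges contribute 155, 85, 101, and 69 points, which sum to the 410 points listed in the table.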

 

 
Figure 3.2. Un-homogenized milk model predicted (X axis) versus reference 

chemistry (Y axis) graph for each of the four major components: fat, protein, lactose, 

and total solids. 

 
NIR modeling performance. The RMSECV values for fat, protein, lactose, 

and total solids were higher and the RPD values were lower for all components for 

unhomogenized milk (Table 2) than homogenized milk (Table 1), indicating we were 

not able to eliminate all the detrimental effects of light scattering on NIR model 

performance metrics. In both homogenized and unhomogenized milks, lactose was the

component for which it was most difficult to achieve low RMSECV and high RPD 

values with PLS models produced from the NIR spectra in the current study.  
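
The two metrics discussed above can be recomputed from the summary statistics in Table 3.2 using the RPD definition given in the table footnote (standard deviation of the reference chemistry divided by RMSECV). The Python sketch below is only an illustration under that definition; the function and variable names are hypothetical.

```python
# Standard deviation, RMSECV, and mean concentration copied from Table 3.2.
TABLE_3_2 = {
    "fat": {"sd": 1.4228, "rmsecv": 0.0441, "mean": 3.7199},
    "anhydrous lactose": {"sd": 0.2406, "rmsecv": 0.0461, "mean": 4.5981},
    "true protein": {"sd": 0.5050, "rmsecv": 0.0272, "mean": 3.2374},
    "total solids": {"sd": 1.6071, "rmsecv": 0.0671, "mean": 12.6967},
}


def rpd(sd_reference, rmsecv):
    """Ratio of the reference standard deviation to the RMSECV."""
    return sd_reference / rmsecv


def relative_rmsecv_percent(rmsecv, mean_concentration):
    """RMSECV expressed as a percentage of the mean concentration."""
    return 100.0 * rmsecv / mean_concentration


summary = {
    name: (
        rpd(row["sd"], row["rmsecv"]),
        relative_rmsecv_percent(row["rmsecv"], row["mean"]),
    )
    for name, row in TABLE_3_2.items()
}

for name, (ratio, relative) in summary.items():
    print(f"{name}: RPD = {ratio:.1f}, RMSECV = {relative:.2f}% of mean")
```

Ratios recomputed in this way from the rounded table entries may differ slightly from the tabulated RPD values, because the modeling software may calculate the standard deviation on a slightly different basis.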

NIR Model Validation and comparison of performance to MIR models. MIR 

milk analysis (with homogenization) has been the mainstream high-speed secondary 

method for milk analysis for the last 40 years. In the current study we used the 

performance of MIR on the same calibration and external validation samples to provide a 

performance reference point for a high-speed secondary method in comparison to the 

performance of the NIR models developed in the current study. For the MIR 

performance on external validation milks (Table 3), the MD and SEP are within the 

validation guidelines used by the USDA Federal Milk Markets to evaluate 

performance of payment testing labs using third party calibration samples, i.e., MD

within +/- 0.02 for fat, protein, and lactose and within +/- 0.05 for total solids, and SEP

< 0.04 for fat, protein, and lactose and < 0.10 for total solids. The NIR model 

performance on external validation for homogenized milks (Table 3) was sometimes 

better and sometimes worse (mostly on lactose) than the MIR, but in general the 

accuracy of the NIR PLS models developed in the current study for homogenized milk 

was good and met the USDA Federal Milk Market validation performance guidelines. 

The NIR model performance on external validation for un-homogenized milks (Table 

3) was not as good as the homogenized milk models, but in general the accuracy of the 

NIR PLS models developed in the current study was good and met the USDA Federal 

Milk Market validation performance guidelines, despite the higher RMSECV values 

and lower RPD values for the development of PLS models for prediction of fat, 

lactose, protein, and solids concentration in unhomogenized milks than for 

homogenized milks. Previous studies by Laporte and Paquin (1999) and Aernouts et 

al. (2011) reported performance statistics for NIR models for analysis of 

unhomogenized milk using NIR transmittance over the 1000 to 2400 nm wavelength 

range. Laporte and Paquin (1999) reported an SEP of 0.05 for fat and 0.12 for true 

protein on unhomogenized milk, both much higher than the values in our study (Table 

3, SEP 0.03 and 0.02, respectively). Similarly, Aernouts et al. (2011) reported an 

RMSEP of 0.043 for fat, 0.133% for crude protein, and 0.162 for lactose.  

 
Table 3.3. External validation performance metrics [standard error of prediction (SEP) 

and mean difference (MD)] for NIR PLS models with (NIR H) and without (NIR NH) 

an in-line homogenizer and a comparison to a mid-infrared (MIR) milk analyzer 

calibrated with the same samples and measuring the same components on the set of 

validation milks from 48 individual farms.  

  
External validation    Average SEP values           Average MD values
                       NIR H    NIR NH   MIR        NIR H    NIR NH   MIR
Fat                    0.013    0.034    0.016      0.009    -0.006   -0.007
Anhydrous lactose      0.028    0.027    0.007      0.020    0.020    0.002
True protein           0.013    0.022    0.016      0.010    0.021    -0.015
Total solids           0.024    0.038    0.023      0.016    -0.020   -0.020
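
The external validation statistics in Table 3.3 can be illustrated with a short Python sketch that computes MD and SEP for paired predicted and reference values and compares them with the guideline limits quoted in the text. The sketch assumes that MD is the mean of predicted minus reference values and that SEP is the standard deviation of those differences; the exact formulas used in the study may differ, and the example data and function names are hypothetical.

```python
import math

# Guideline limits quoted in the text: (MD limit, SEP limit) per component.
LIMITS = {
    "fat": (0.02, 0.04),
    "protein": (0.02, 0.04),
    "lactose": (0.02, 0.04),
    "total solids": (0.05, 0.10),
}


def mean_difference(predicted, reference):
    """Mean of (predicted - reference); the sign convention is an assumption."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    return sum(diffs) / len(diffs)


def standard_error_of_prediction(predicted, reference):
    """Standard deviation of the differences about their mean (n - 1 divisor)."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    md = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - md) ** 2 for d in diffs) / (len(diffs) - 1))


def meets_guidelines(component, md, sep):
    """True when both statistics fall inside the quoted limits."""
    md_limit, sep_limit = LIMITS[component]
    return abs(md) < md_limit and sep < sep_limit


# Hypothetical fat results for four validation milks (not data from this study).
reference = [3.50, 3.60, 3.70, 3.80]
predicted = [3.51, 3.60, 3.72, 3.81]

md = mean_difference(predicted, reference)
sep = standard_error_of_prediction(predicted, reference)
fat_ok = meets_guidelines("fat", md, sep)
```

Keeping the limits in one table makes it straightforward to apply the same check to each component and instrument configuration.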

 
CONCLUSIONS 

Partial least squares models were developed for estimation of fat, true protein (in

homogenized milk), and total solids concentration in milk using NIR transmission

spectra. On external validation, these models had analytical accuracy equivalent to MIR

transmittance analysis of the same milks. The mean difference and standard error of 

prediction values for fat, protein in homogenized milk, and total solids were in compliance 

with the expected performance accuracy values indicated in standard methods for 

examination of dairy products. The NIR predictions of anhydrous lactose and of

protein in unhomogenized milk were not as good as those by MIR,

and further work is needed to improve the NIR lactose and protein models. The 

accuracy of prediction of fat, true protein in homogenized milk, and total solids on a 

weight/weight basis was better than that of previously published NIR models, and that 

improvement was attributed to the design of the population of milks used for the 

modeling and the quality of the chemical reference values, which were derived from

all-laboratory mean reference chemistry using AOAC performance validated reference

chemistry methods. 

  
ACKNOWLEDGMENTS 

Funding was provided in part by the USDA Federal Milk Markets (Carrollton, 

Texas) and Daisy Brand (Garland, Texas). Technical assistance was provided by 

Bruker (Billerica, Massachusetts) and Perkin-Elmer instruments (Drachten, The 

Netherlands). 



 
REFERENCES 

Aernouts, B., E. Polshin, P. Saeys, and J. De Baerdemaeker. 2011. Visible and near-

infrared spectroscopic analysis of raw milk for cow health monitoring: 

Reflectance or transmittance? J. Dairy Sci. 94:5315–5329. 

https://doi.org/10.3168/jds.2011-4354.  

AOAC. 2023. Official Methods of Analysis 22nd ed. Assoc. Off. Anal. Chem., 

Arlington, VA. 

Bach, K. D., D. M. Barbano, and J. A. A. McArt. 2019. Association of mid-infrared-

predicted milk and blood constituents with early-lactation disease, removal, 

and production outcomes in Holstein cows. J. Dairy Sci. 102:10129–10139. 

https://doi.org/10.3168/jds.2019-16926 

Bach, K. D., D. M. Barbano, and J. A. A. McArt. 2020. The relationship of excessive 

energy deficit with milk somatic cell score and clinical mastitis. J. Dairy Sci. 

104:715–727. https://doi.org/10.3168/jds.2020-18432 

Conzen, J. P. 2005. Multivariate Calibration: A practical guide for developing 

methods in quantitative analytical chemistry. pp. 42, 43, 80, and 81. Bruker 

Optik, Ettlingen.  

Kawasaki, M., S. Kawamura, M. Tsukahara, S. Morita, M. Komiya, and M. Natsuga. 

2008. Near-infrared spectroscopic sensing system for on-line milk quality 

assessment in a milking robot. Comput. Electron. Agric. 63:22–27. 

https://doi.org/10.1016/j.compag.2008.01.006. 



 
Kaylegian, K. E., G. E. Houghton, J. M. Lynch, J. R. Fleming, and D. M. Barbano. 

2006. Calibration of infrared milk analyzers: modified milk versus producer 

milk. J. Dairy Sci. 89:2817-2832. 

Kaylegian, K. E., J. M. Lynch, J. R. Fleming, and D. M. Barbano. 2009. Influence of 

fatty acid chain length and unsaturation on mid-infrared milk analysis. J. Dairy 

Sci. 92:2485-2501. 

Bruker Optics. 2021. MPA II User Manual. Bruker Optics GmbH & Co. KG, 

Ettlingen, Germany. 

Esbensen, K. H., P. Geladi, and A. Larsen. 2014. The RPD myth. NIR News 

25(5):24–28. https://doi.org/10.1255/nirn.1462.  

Haaland, D. M., and E. V. Thomas. 1988. Partial least-squares methods for spectral 

analyses. 1. Relation to other quantitative calibration methods and the 

extraction of qualitative information. Anal. Chem. 60:1193–1202. 

https://doi.org/10.1021/ac00162a020. 

Kaniyamattam, K., and A. De Vries. 2014. Agreement between milk fat, protein, and 

lactose observations collected from the Dairy Herd Improvement Association 

(DHIA) and a real-time milk analyzer. J. Dairy Sci. 97:2896–2908. 

https://doi.org/10.3168/jds.2013-7690. 

Laporte, M.-F., and P. Paquin. 1999. Near-infrared analysis of fat, protein, and casein 

in cow’s milk. J. Agric. Food Chem. 47:2600–2605. 

https://doi.org/10.1021/jf980929r 



 
Lynch, J. M. 1998. Use of AOAC INTERNATIONAL method performance statistics 

in the laboratory. J. AOAC Int. 81:679–684. 

https://doi.org/10.1093/jaoac/81.3.679 

Lynch, J. M., D. M. Barbano, M. Schweisthal, and J. R. Fleming. 2006. Precalibration 

Evaluation Procedures for Mid-Infrared Milk Analyzers. J. Dairy Sci. 

89:2761–2774. 

Portnoy, M., C. Coon, and D. M. Barbano. 2020. Infrared Milk Analyzers: Milk Urea 

Nitrogen Calibration. J. Dairy Sci. 104:7426–7437.  

Portnoy, M., C. Coon, and D. M. Barbano. 2021. Performance evaluation of an 

enzymatic spectrophotometric method for milk urea nitrogen. J. Dairy Sci. 

104:11422–11431. 

Seely, C. R., K. D. Bach, D. M. Barbano, and J. A. A. McArt. 2022. Diurnal variation of 

milk fatty acids in early-lactation Holstein cows with and without 

hyperketonemia. Animal 16:100552. 

https://doi.org/10.1016/j.animal.2022.100552 

Williams, P. C., and D. C. Sobering. 1993. Comparison of commercial near infrared 

transmittance and reflectance instruments for analysis of whole grains and 

seeds. J. Near Infrared Spectrosc. 1:25–32. https://doi.org/10.1255/jnirs.3  

Wojciechowski, K. L., and D. M. Barbano. 2016. Prediction of fatty acid chain 

length and unsaturation of milk fat by mid-infrared milk analysis. J. Dairy Sci. 

99:8561–8570. http://dx.doi.org/10.3168/jds.2016-11248 

Wojciechowski, K. L., C. Melilli, and D. M. Barbano. 2016. A proficiency test system 

to improve performance of milk analysis methods and produce reference 



 
values for component calibration samples for infrared milk analysis. J. Dairy 

Sci. 99:6808-6827. 

Woolpert, M.E., H. M. Dann, K. W. Cotanch, C. Melilli, L. E. Chase, R. J. Grant, and 

D. M. Barbano. 2016. Management, nutrition, and lactation performance are 

related to bulk tank milk de novo fatty acid concentration on northeastern US 

dairy farms. J. Dairy Sci. 99:8486–8497. 

http://dx.doi.org/10.3168/jds.2016-10998 

 

 
CHAPTER 4 

CONCLUSIONS AND FUTURE WORK 

 
Chapter 2 – Cuvette Path Length Measurement Using a Confocal Displacement 

Sensor 

 A new method was developed that uses a chromatic confocal displacement sensor to

measure the path length of polystyrene and quartz cuvettes. It takes about xx seconds

to scan 8 cuvettes for path length determination. Polystyrene cuvettes (100 cuvettes of

nominal path length 10.0 mm) were found to differ among suppliers by 3 to 4% in

relative path length. Disposable cuvettes from the same manufacturer were also found 

to have differing path lengths from one mold to another. Another supplier provided 

boxes of 100 cuvettes that contained cuvettes from different molds and there were 

significant differences within the same box. Quartz cuvettes were also found to differ 

in path length between two suppliers. The path length variations we observed would 

change the calculated concentration of lactose (+/- 0.09) and milk urea nitrogen

(MUN; +/- 0.26) when the cuvettes are used for enzymatic 

assay reference chemistry. These assays are used to calibrate rapid testing analyzers 

for milk, for which the accuracy of the reference chemistry is a limiting factor for 

accuracy of the measurement.  
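
The link between path length and calculated concentration can be illustrated with a short Python sketch. It assumes that the enzymatic assays rely on the Beer-Lambert relationship (absorbance = absorptivity x path length x concentration), so that using the nominal path length in place of the true path length produces a proportional error in the calculated concentration. The absorbance and absorptivity values and the function names are hypothetical.

```python
def concentration(absorbance, absorptivity, path_length_cm):
    """Beer-Lambert concentration: c = A / (absorptivity * path length)."""
    return absorbance / (absorptivity * path_length_cm)


def relative_error_percent(nominal_path_cm, true_path_cm):
    """Percent error in concentration when the nominal path length is assumed
    for a cuvette whose true path length is different."""
    return 100.0 * (true_path_cm / nominal_path_cm - 1.0)


# Hypothetical reading in a cuvette that is 3% longer than its nominal 1 cm.
absorbance = 0.5
absorptivity = 2.0   # hypothetical, in units of 1 / (concentration * cm)
nominal_cm = 1.00
true_cm = 1.03

c_assumed = concentration(absorbance, absorptivity, nominal_cm)
c_true = concentration(absorbance, absorptivity, true_cm)
error_percent = relative_error_percent(nominal_cm, true_cm)
```

Under this assumption, a 3% difference in path length leads to a 3% overestimate of concentration, which is why a certified path length for each cuvette would remove this source of error.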

 This optical method is nondestructive and avoids hazardous chemicals, so it 

would be feasible to certify cuvette path length with this method. To use this 

method at full efficiency, a facility set up for rapid certification could

provide reference chemistry laboratories with cuvettes that have a certified 

path length for use in analytical chemistry. Additional work is needed to automate the 

analysis of the data from scanning the cuvettes. 

Chapter 3 – NIR Modeling 

 Predictive models for unhomogenized and homogenized milk were made using 

partial least squares regression for near infrared (NIR) transmission analysis of milk. 

All milks for modeling and external validation of models were analyzed by a group of 

labs for fat, protein, lactose, and total solids using performance validated reference 

chemistry methods. Homogenized and unhomogenized milks were analyzed by the 

NIR using the PLS models developed for fat, protein, lactose, and total solids for 

external validation. The performance of models to predict fat, protein in homogenized

milk, and total solids concentration was equal to mid infrared (MIR) transmittance

analysis and met the testing criteria for mean difference and standard error of

prediction. NIR predictions of lactose and of protein in unhomogenized milk did not

meet the same analytical performance standards, likely because much of the spectral

information for lactose was removed because it coincided with a major water peak

and because protein spectra include micelle-based light scattering. The NIR

predictive models developed in the current 

study outperformed those in previously published studies. 

 The PLS models developed in the current study achieved better analytical 

performance due to the combination of orthogonal sample set design and

all-laboratory mean reference chemistry as the reference data for PLS model 

development. This same approach can be used for the development of other predictive 

models in the future. One of the goals for further expansion of this study would be the 

creation of predictive models for the major components of cream. NIR technology 

may be better suited than MIR for cream because its larger cuvette path length

prevents blockages and because it can be integrated in-line into processing systems.