Utility Analysis for Decisions in Human Resource Management

Working Paper 88-21

John W. Boudreau
Associate Professor
NYSSILR--Cornell University
393 Ives Hall
Ithaca, New York 14851-0952
(607) 255-2273

Draft of a Chapter for the Handbook of Industrial-Organizational Psychology, edited by Marvin Dunnette, to be published by Consulting Psychologists Press, Palo Alto, California

December, 1988

Comments welcome; please do not quote without permission from the author.

This research was carried out with support from the U.S. Army Research Institute, contract SFRC #MDA903-87-K-0001. The views, opinions and/or findings contained in this chapter are those of the author and should not be construed as an official Department of the Army policy or decision.

This paper has not undergone formal review or approval of the faculty of the ILR School. It is intended to make the results of Center research, conferences and projects available to others interested in human resource management in preliminary form to encourage discussion and suggestions.

Utility Analysis for Human Resource Management Decisions

Introduction

The questions studied by Industrial/Organizational (I/O) psychologists are closely linked to the decisions facing managers of people in organizations. Whether they be line managers, human resource management staff, or organizational psychologists, managers of human resources must make decisions about issues affecting the employment relationship--hiring, training, compensation, performance appraisal, and so on--that draw on theories of human work behavior. Analogously, I/O psychologists, as well as other social scientists, find the organizational environment a rich source of information for advancing knowledge and testing employment-related theories.
Both scientists and managers benefit from the knowledge gained about the behaviors of individuals in the work place, and both can then search for ways to apply that knowledge to achieve individual and organizational outcomes of efficiency and equitable employment. The similarity of interests between I/O psychologists and human resource management (HRM) professionals has produced some close collaborative relationships, as seen in the many psychologists who consult for industry, conduct studies designed to support HRM decisions, or, through their work, influence the direction of employment policies.

Still, the HRM functions of organizations typically lack the influence and visibility of other management functions such as marketing, finance, and operations. The literature for HRM professionals routinely laments the slow implementation of HRM programs in organizations, even though these programs have gained wide acceptance by scientists (cf. Jain & Murray, 1984), and it admonishes and instructs these professionals to "sell" their programs by emphasizing their effects on attainment of organizational goals (Bolda, 1985; Fitz-Enz, 1984; Gow, 1985; Jain & Murray, 1984; Sheppeck & Cohen, 1985). With increased competition, and with evidence from the United States and abroad that competitive organizations are likely to manage their people differently, HRM personnel are more frequently expected to justify their contributions to the employer and to account for their existence.

One must question whether the lack of influence and slow implementation of HRM programs is a rational response by organizations. Could it be that behavioral theories and findings are relevant only to the scientific community and have so little relevance to organizational decisions and outcomes that they can be ignored by successful organizations? If the theories and findings are relevant, then how should they be communicated to decision makers?
Do decisions that consider social science evidence produce greater organizational success, and, if so, are the successes great enough to justify the resources necessary to generate and apply the evidence?

This chapter will discuss utility analysis (UA), which attempts to answer such questions by focusing on decisions about human resources. Utility analysis refers to the process that describes, predicts and/or explains what determines the usefulness or desirability of decision options, and examines how that information affects decisions. In HRM and I/O psychology, the focus lies on decisions involving employment relationships and employee behaviors. Thus, I/O psychologists use the term utility analysis to refer to a specific set of models that reflect the consequences, usually performance-related, of programs designed to enhance the value of the work force to the employing organization.

Utility analysis offers great potential for enhancing the link between the theories and findings of I/O psychological research and the human resource decisions of organizational managers. To achieve this potential, however, UA research and applications must proceed from a framework that recognizes the broad effects of such decisions on the work force and the organization. Such a framework requires an expansive view of the decision tasks facing managers of people in organizations, a view that recognizes the contributions, limits and implicit assumptions not only of psychological models, but of models from other social sciences as well. The UA framework provides both a rationale and a significant new direction for an integration between the science and practice of I/O psychology and other scientific disciplines relevant to organizational employment decisions.
This chapter is intended as a step toward such an integrative framework. Thus, it will not only review and describe UA theories and applications, but will propose new and integrative directions that have received little attention. UA research must certainly acknowledge the considerations of related disciplines such as economics, management and sociology. But as a true theory of organizational decision making, it provides a mechanism to go beyond simple acknowledgement, to achieve truly interdisciplinary approaches to employment decisions.

Chapter Outline

This chapter comprises ten sections. The first section introduces and establishes some fundamental concepts, including the nature of utility models, decision options, attributes and payoff functions. It shows where UA models fit within the broader domain of decision models. It further establishes some ground rules guiding subsequent sections. The second section outlines the historical development of concepts integral to utility analysis, the roots of which can be traced to the earliest stages of I/O psychological research. Not only does this historical outline provide some basic concepts for those not familiar with UA research, it also identifies certain fundamental concepts and assumptions essential to understanding utility analysis, which are sometimes ignored or forgotten in more recent theoretical developments. The third section summarizes findings from previous studies revealing the effect of I/O psychological interventions on work force consequences. The fourth section critically reviews the research topic commanding the greatest attention to date--measuring the dollar value of performance variability. The fifth section examines UA research from the perspective of information theory, by examining the role of risk and uncertainty in decision making. Such a perspective suggests that UA models can improve decisions even when information is severely lacking.
Methods for identifying risk and uncertainty are described, as well as a technique for identifying when additional information is valuable. The role of UA research in defining statistical and substantive significance is also discussed. The sixth section presents enhancements to the traditional selection utility models. These include incorporating financial/economic considerations, "intangible" factors such as equal employment opportunity and affirmative action, and the role of "constituencies" (Tsui, 1984, 1987; Tsui & Gomez-Mejia, 1988) in evaluating the usefulness of HRM programs. This section also shows how UA research can link I/O psychology and labor economics. It suggests that UA offers a mechanism for truly interdisciplinary approaches to employment issues, but that this demands that UA models reflect economic considerations, stocks and flows. The eighth section discusses the role of utility analysis in describing consequences of programs that affect the "stock" of existing employees by altering the characteristics of the work force or work situation. Recent research is reviewed, suggesting implications for extending utility analysis research to important new areas. The ninth section presents a unified utility model reflecting outcomes of HRM decisions that affect the composition of the work force by changing the "flows" of employees into, through, and out of organizations--an employee movement utility model. Important links are proposed between recruitment, selection, turnover, and internal staffing. Empirical simulation analyses are described that suggest that the actual consequences of HRM decisions are likely to reach far beyond those reflected in current models addressing only the consequences of selection.
It demonstrates the need for a fully integrated framework for considering the consequences of changing both the stocks and flows of employees, which can lead to greater synergy in planning and implementing employment programs. Finally, the tenth section presents a matrix to guide future UA research, emphasizing the need to move beyond selection models and measurement issues, and toward a broader understanding of HRM program decision making.

Concepts and Definitions

Utility Analysis as a Subclass of Multiattribute Utility Analysis

Multiattribute utility (MAU) models are "decision aids" (Edwards, 1977; Einhorn & McCoach, 1977; Einhorn, Kleinmuntz & Kleinmuntz, 1977; Fischer, 1976; Huber, 1980; Keeney & Raiffa, 1976) that provide tools for describing, predicting and explaining decisions. MAU models share certain characteristics and requirements. To apply such models, one must:

(1) Identify a set of decision options that represent the alternative programs or courses of action under consideration;

(2) Identify a set of attributes that reflect the characteristics of the options that are important because they represent the things that matter to the decision makers and/or the relevant constituents;

(3) Measure the level of each attribute produced by each option using a utility scale for each attribute;

(4) Combine the attribute values for each option using a payoff function reflecting the weight given each attribute and combination rules for deriving an overall total utility value for each option.

---------------------------------------
Insert Table 1 Here
---------------------------------------

Table 1 illustrates an extremely simple application of MAU analysis. Suppose productivity is below desired levels among sales people. Two decision options might be identified, involving two different training programs called Program A and Program B.
Three attributes are of interest: (a) Effects on sales levels; (b) Resources required to develop and implement the program; and (c) Effects on sales person job satisfaction. Attributes (a) and (b) use a utility scale of dollars, while Attribute (c) uses a rating scale from 1 to 7. The payoff function consists of multiplying the level of attribute (a) by 1, multiplying the level of attribute (b) by -1, multiplying the level of attribute (c) by 3,000, and adding the results to produce a total utility value. We could construct a multiattribute utility matrix like that shown in Table 1, with the cells of the matrix containing the expected level of each attribute for each option, and the total utility values below each option computed using the payoff function. Although Program B has the higher first-year dollar payoff, the high weight given to attribute (c), Job Satisfaction, combined with Program B's lower Job Satisfaction, cause it to attain a lower utility value than Program A, and thus to be less preferred.

Obviously, MAU models can encompass a variety of decision options, numerous and diverse sets of attributes reflecting many different constituents, and very complex payoff functions, but they generally share the characteristics shown in the simple example of Table 1. MAU models can assist decision makers in overcoming "limits on rationality" (March & Simon, 1958) by providing a simplified, structured framework within which to consider a number of decision options. Huber (1980, pp. 61-62) identifies five advantages of MAU models over less systematic and structured decision systems: (1) Because they make explicit a view of the decision situation, they help to identify the inadequacies of the corresponding implicit, mental model; (2) The attributes contained in such models serve as reminders of the information needed for consideration of each alternative; (3) The informational displays and models used in the mathematical model serve to organize external memories; (4) They allow the aggregation of large amounts of information in a prescribed and systematic manner; and (5) They facilitate communication and support to be gained from constituencies.

As a subclass of MAU models, UA models also serve as decision aids, and can provide the advantages listed above. Unfortunately, very little theoretical or empirical research has approached utility analysis from this decision-making perspective. Nonetheless, a keen appreciation of the role of UA models in the decision process suggests some very different research questions and directions. These will be emphasized throughout the chapter.

Unlike the generic MAU model described in Table 1, UA models focus on a particular type of decision option, a restricted set of attributes, and a defined mathematical formula for attribute weights and combination rules. The next sections examine these MAU components, and how they apply to UA models.

The Decision Options: HRM Productivity-Enhancement Programs

Any MAU model requires a focus of analysis--the decision options considered. For example, an MAU model for deciding where to build a new hospital might focus on options reflecting different types of facilities, combined with different locations, combined with different service offerings. Each combination would constitute a decision option. Utility analysis has focused on HRM programs designed to enhance work force productivity.
Such programs include selection testing, recruitment, training, and compensation--all of which affect the organizational value of the work force, whether they are explicitly chosen using decision models or evolve implicitly over time (Milkovich & Boudreau, 1988). Utility analysis involves describing, predicting and explaining the consequences of such program options, their desirability, and the decision processes leading to choices among them. Thus, while the focus of UA is more specific than generic MAU models, it covers a wide array of options relevant to organizational goals. As we shall see, the majority of UA research has focused only on selection programs, but we now have the theoretical models to apply to virtually any HRM program.

Decisions about individuals versus decisions about programs. Utility analysis models might seem to focus on decisions about individuals, rather than programs. For example, Cascio (1980, p. 128) stated, "all personnel decisions can be characterized identically. In the first place there is an individual about whom a decision is required. Based on certain information about the individual (for example performance appraisals, assessment center ratings, a disciplinary report), decision makers may elect to pursue various alternative courses of action." In MAU terms, the decision options are different courses of action for each individual. However, closer examination shows that UA models are intended to apply to decisions about the programs that guide the countless decisions about individuals made by human resource managers. The options under consideration are the procedures, rules or "strategies" (Cronbach & Gleser, 1965, p. 9) meant to be used with many individuals, and evaluated by their "total contribution when applied to a large number of decisions" (p. 23).
Decisions about whom to hire depend on what programs of recruitment and testing have been chosen to generate applicants and information about them. Decisions about how much to pay individuals depend on what compensation programs and rules have been chosen for that work force. Decisions about assigning individuals to new jobs depend on what career development and training programs have been chosen to generate skills and forecast future needs. Thus, UA models focus on the more strategic and tactical decisions about programs, rather than the operational decisions about each individual.

Because program decisions affect many individuals throughout their tenure with the organization, the impact of even a single program decision on future work force consequences can be quite large. A selection program that affects the hiring decisions for 1,000 people, each of whom stays for 5 years, will affect 5,000 person-years of organizational behavior. If a more correct program decision produces even a modest work force quality increase of $10 per person-year, its impact can be $50,000. Of course, this also suggests that the consequences of wrong decisions have large potential negative effects. Utility analysis uses information from social and behavioral sciences to attempt to improve such important decisions.

Two types of programs addressed by UA models. It is useful to group the variety of HRM programs that can be addressed by UA models according to whether they affect employee "flows" or employee "stocks". First, programs affecting employee flows change the composition or membership of the work force through "employee movement" (Boudreau, 1988; Boudreau & Berger, 1985a, 1985b; Milkovich & Boudreau, 1988).
For example, selection programs allow additions to be made to the work force; retention programs determine which employees are retained when separations take place; and internal staffing programs determine which employees move between positions within an organization (Milkovich & Boudreau, 1988, Chapters 10-13). UA models applied to such programs focus on the process used to determine which individuals are chosen to move or to remain, and the program's consequences reflect the effects of having a different set of employees in the work force. UA models are typically applied to decisions about this type of program, with external selection programs receiving the greatest attention.

Second, programs affecting the employee stock change the characteristics of the existing set of employees, in their current positions. For example, training programs operate by altering knowledge, skills, attitudes, or other employee characteristics; compensation and reward programs operate by altering the relationship between behaviors/outcomes and rewards; performance feedback and goal setting programs operate by altering employee perceptions of the consequences of their behaviors. Such programs work to the extent that they lead to different behaviors by existing employees, which lead to more valuable organizational outcomes. UA models address decisions about such programs by focusing on options representing different kinds of programs affecting the stock of existing employees.

The Attributes of Programs in UA Models

Once a set of decision options is defined, MAU models specify the set of attributes reflecting the outcomes of concern to the decision makers and relevant constituents, and the level of each attribute achieved by each decision option.
For example, the decision about where to build a hospital might include attributes as diverse as the environmental impact of the facility, speed of treatment in emergencies, and impact on local property values, reflecting the concerns of constituents as diverse as community planners, potential patients, nearby property owners and the future medical staff.

UA models focus on decisions about HRM programs, so the attribute set is more focused, but still quite broad. Cronbach & Gleser (1965, p. 22) defined the attribute domain as "all the consequences of a given decision that concern the person making the decision (or the institution he represents)." HRM program attributes may be placed in two categories--efficiency and equity (Milkovich & Boudreau, 1988). Efficiency attributes reflect the organization's ability to "maximize outputs while minimizing inputs", such as labor costs, job performance, sales volumes, revenues, profits, market share and various financial/economic indicators of organizational strength. Equity attributes reflect the "perceived fairness" of organizational procedures and outcomes, such as employee attitudes, labor relations, minority and female representation, compliance with legal requirements, and community relations.

To date, most UA research and applications have focused on a very small set of efficiency-related attributes reflecting the productivity consequences of HRM program decisions. Although UA models can become mathematically complex, all existing UA models reflect just three basic attributes (Boudreau, 1984c, 1986, 1988; Boudreau & Berger, 1985a, 1985b; Milkovich & Boudreau, 1988):

(1) Quantity: the number of employees and time periods affected by the consequences of program options;

(2) Quality: the average effect of the program options on work force value, on a per-person, per-time-period basis;

(3) Cost: the resources required to implement and maintain the program option.
The program options addressed by UA models encompass a potentially large set of attributes reflecting both efficiency and equity, but existing model development has focused on a subset of the efficiency-related attributes reflecting program costs and employee productivity. Thus, like all models, UA models simplify reality by omitting or ignoring some factors. Models, by definition, are deficient because it is impossible to accurately reflect all the potential attributes affected by decisions. As we shall see, examining the nature of the attributes that are and should be included in utility models is one of the most critical issues facing UA research. Defining the domain of appropriate attributes offers fruitful opportunities for further debate and development. We will discuss these opportunities in some detail as we review existing research.

The Utility Scale for Attributes in UA Models

With the attributes identified, an MAU model must assign a value for each attribute in each decision option. This requires establishing a utility scale for each attribute, as well as determining the particular level of each attribute associated with each decision option. For example, in deciding where to locate a new hospital, the attributes are quite diverse (e.g., environmental impact, speed of treatment, facility cost, community satisfaction, etc.), and might be measured in units as diverse as dollars, time, number of complaints, ratings or rankings.

UA models focus on HRM programs, and therefore face a more limited set of attributes. Yet, even the relatively simple example in Table 1 had attributes measured in dollars (Costs and Productivity) as well as ratings (Job Satisfaction). UA models can potentially include a variety of efficiency and equity-related attributes, requiring diverse payoff scales.
However, most UA models have focused primarily on productivity-related outcomes, striving to measure them in units relevant to managerial decisions. Attributes reflecting Quantity are usually measured in person-years, and those reflecting Cost are usually measured in dollars. The appropriate scale for the Quality attributes has been subject to some debate, as we shall see, but the majority of research has been devoted to scaling Quality in dollars per person-year.

Attaching a level of each attribute to each option often reflects a process using both subjective and objective information. When evaluating past programs, it may be possible to determine the actual levels of each attribute achieved by different options. But UA models are planning tools, used to anticipate future consequences and support current decisions, so attaching attribute levels involves predictions and forecasts. Indeed, one major motivation for UA models was to better express statistical forecasts in terms understandable to managers. The predictive nature of attribute measurement means that utility estimates possess uncertainty and risk. While uncertainty and risk take prominence in general MAU research, UA research has largely ignored them. As we shall see, mechanisms exist to promote further research in this important area.

The choice of attribute utility scales and derivation of attribute levels is important, and has received too little attention in UA research. Throughout this chapter we will highlight controversies where additional debate and research attention can be fruitful.

Combining Attributes Using a Payoff Function for UA Models

The fourth component of an MAU model is the payoff function, which specifies how the attribute levels are to be combined into an overall utility value. Deciding where to locate a new hospital might produce very diverse attributes measured on very different scales (e.g., dollars, time, and ratings/rankings).
Payoff functions for such decisions must specify both the weights attached to each attribute level, as well as the rules for combining the weighted attribute levels to produce an overall utility value. Such rules might range from a simple numerical weighting and addition of the weighted values, to more complex non-linear weighting schemes and quadratic combination rules.

Because UA models focus on decisions about HRM programs, their attributes and payoff scales are more limited, and the payoff functions are often simpler. Still, any payoff function must reflect both the importance of each attribute and its underlying scale. The example in Table 1 adopted a relatively simple combination rule that takes the difference between increased productivity and costs, and then adds the Job Satisfaction level multiplied by 3,000. Obviously, the choice of weights and combination rules can have large effects on resulting utility values, and should reflect the values of the decision makers and relevant constituencies.

UA research has usually focused on productivity-related outcomes, and thus has adopted payoff functions reflecting dollar-valued productivity and program costs. The payoff function may be considered a variant of the cost-volume-profit models used in other managerial decisions to invest resources. The utility of an HRM program option is derived by subtracting Cost from the product of Quantity times Quality, with the program exhibiting the largest positive difference being preferred.

It is typical to refer to UA models as cost-benefit analysis models, and to categorize attributes as either Costs or Benefits. Simply put, Costs represent attributes that reduce overall utility values, while Benefits represent attributes that increase overall utility values. Depending on the decision, a given attribute (e.g., reduced employee separations) may represent either a cost or a benefit.
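
The two payoff functions just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the attribute levels for Programs A and B are invented (the actual contents of Table 1 are not reproduced here), and the program cost in the last call is assumed. The quantity and quality figures (1,000 hires, 5 years, $10 per person-year) follow the selection example given earlier in the chapter.

```python
def mau_total_utility(sales_effect, program_cost, satisfaction_rating,
                      satisfaction_weight=3000):
    """Additive rule described for Table 1: weight sales effects by 1,
    program costs by -1, and the 1-to-7 satisfaction rating by 3,000."""
    return (1 * sales_effect
            + (-1) * program_cost
            + satisfaction_weight * satisfaction_rating)


def ua_utility(n_employees, years, quality_per_person_year, cost):
    """Basic UA payoff: Quantity (person-years) times Quality, minus Cost."""
    quantity = n_employees * years
    return quantity * quality_per_person_year - cost


# Hypothetical attribute levels (NOT the values in Table 1).
program_a = mau_total_utility(sales_effect=100_000, program_cost=40_000,
                              satisfaction_rating=6)
program_b = mau_total_utility(sales_effect=110_000, program_cost=45_000,
                              satisfaction_rating=4)

# Selection example from the text, first with no cost, then with an
# assumed (hypothetical) program cost of $20,000.
gross_payoff = ua_utility(1_000, 5, 10, cost=0)
net_payoff = ua_utility(1_000, 5, 10, cost=20_000)

print(program_a, program_b)      # total utilities for the two programs
print(gross_payoff, net_payoff)  # dollar-valued UA payoffs
```

With these invented levels, Program B has the larger dollar payoff (65,000 versus 60,000) but the smaller total utility once the satisfaction rating is weighted, which mirrors the pattern described for Table 1. Changing `satisfaction_weight` shows how sensitive the preferred option is to the chosen weights.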
Rather than attempt a classification, this chapter will proceed from the more general position that costs and benefits are defined by the attributes, their utility scales, and the payoff functions used to combine them. It is appropriate to question whether such a payoff function is adequate or even appropriate to UA research, and we will explore this issue at length.

Summary

UA research is a subclass of more general MAU research, and the structure of MAU models provides a useful framework for organizing and understanding UA models. As we have seen, UA models reflect a set of decision options, attributes, utility scales and payoff functions, just as any MAU model does. UA models have historically focused on a particular set of options (usually selection programs), attributes (Quality, Quantity and Cost), utility scales (Dollars) and payoff functions (Quantity times Quality, minus Cost).

Measuring the payoff in UA research has been characterized as the "Achilles' heel" of UA research (Cronbach & Gleser, 1965, p. 121). As we have seen, such measurement reflects three MAU components: the attributes included; the utility scale used to measure them and attach a value to each option; and the payoff function specifying the combination rules across attributes. These components reflect implicit and explicit assumptions about the appropriate decision makers, constituents and consequences to be considered. Throughout the chapter, we will use these MAU concepts to organize and analyze existing and needed future UA research.

We have also seen that UA models, like all models, strike a balance between simplicity and realism. All UA models are deficient by definition, and much research debate has centered on whether and how to reduce that deficiency. But we will never develop a UA model that completely reflects all relevant attributes with perfect accuracy.
Does this mean that UA research is unlikely to provide any real information about the effects of HRM program decisions on organizations? If the ultimate objective of UA is to measure the impact of program decisions on organizations, then the answer might be "yes", and we could declare a moratorium on UA research. However, like all MAU models, UA models are decision aids, not just measurement tools. A decision aid's usefulness lies in its ability to describe, predict, explain and improve decisions. Such value is assessed by asking whether the model allows the best decision to be made with the given body of information, whether it helps to determine if gathering more information would permit better decisions, and whether it helps to determine how much different decision procedures contribute to decision quality (Cronbach & Gleser, 1965, p. 21). Depending on the cost and value of the next best alternative decision aid, even a very deficient or inaccurate UA model might prove effective in improving decision processes or outcomes. Thus, this chapter will approach UA research less from a measurement perspective and more from a decision making perspective.

Historical Development of Utility Analysis Models¹

Though utility analysis is applicable to virtually every HRM program decision, present models resulted from a concern with selection (and later, placement or classification) decisions. Indeed, UA models can be characterized as responses to the inadequacies of traditional measurement and test theory in expressing the usefulness of tests.

"The traditional theory views the test as a measuring instrument intended to assign accurate numerical values to some quantitative attribute of the individual. It therefore stresses, as the prime value, precision of measurement and estimation. The roots of this theory lie in surveying and astronomy, where quantitative determinations are the chief aim. In pure science it is reasonable to regard the value of a measurement as proportional to its ability to reduce uncertainty about the true value of some quantity. The mean square error is a useful index of measuring power. There is little basis for contending that one error is more serious than another of equal magnitude when locating stars or determining melting points: measurement theory is unobjectionable when applied to such appropriate situations.

"In practical testing, however, a quantitative estimate is not the real desideratum. A choice between two or more discrete treatments must be made. The tester is to allocate each person to the proper category, and accuracy of measurement is valuable only insofar as it aids in this qualitative decision. Measurement theory appears suitable without modification when the scale is considered in the abstract, without reference to any particular application. As soon as the scale is intended for use in a restricted context, that context influences our evaluation of the scale." Cronbach and Gleser (1965, pp. 135-136).

Therefore, the history of UA will be discussed from a decision-making perspective, focusing on the contributions and implications of UA developments for describing, predicting, explaining and enhancing decision processes and outcomes. Because the vast majority of research has emphasized the selection utility model, this will be the focus of the discussion. In this model, the option set involves using a test versus random selection (or choosing between two selection tests), and the utility value reflects only the effects of selection on the first job to which one group of selectees is assigned. Later sections will describe more recent developments that extend utility analysis beyond selection.
¹This section emphasizes developments that set the stage for more recent research and future research directions. Much of this material is drawn from Boudreau (1987, in press). Other historical summaries can be found in Cronbach and Gleser (1965, chapter 4), Hunter and Schmidt (1982), and Cascio (1982, 1987).

Defining the Payoff Based on the Validity Coefficient

Description of the Model. The attribute of selection tests that has the longest history is the validity coefficient, or correlation between a predictor measure and some criterion measure of subsequent behavior, usually expressed as $r_{x,y}$. Classical measurement theory suggested this concept as a measure of the "goodness" of a test in predicting subsequent behavior. In addition to the validity coefficient itself, two translations are most commonly cited (e.g., Cronbach & Gleser, 1965, chapter 4; Hunter & Schmidt, 1982), both of which lead to the conclusion that only relatively large differences in the validity coefficient produce important differences in the value of a test. First, one can translate the validity coefficient into the index of forecasting efficiency (symbolized as E) using Equation 1 below.

$$E = 1 - \left(1 - r_{x,y}^{2}\right)^{1/2} \tag{1}$$

This index, emphasized by early statistical texts (e.g., Kelley, 1923; Hull, 1928), indicates the proportionate reduction in the standard error of criterion scores predicted by the test, compared to the standard error of criterion scores predicted using only the group mean. Second, the coefficient of determination, or the squared validity coefficient, appeared as early as 1928 in Hull's text, and reflects the proportion of variance shared by the predictor and the criterion. Obviously, very large increases in validity are required to substantially increase these indexes. As Cronbach and Gleser (1965, p.
31) noted, "the index of forecasting efficiency describes a test correlating .50 with the criterion as predicting only 13% better than chance; the coefficient of determination describes the same test as accounting for 25% of the variance in outcome." (For a validity of .50, Equation 1 gives $E = 1 - (.75)^{1/2} \approx .13$, and the squared validity is .25.) Yet correlations as high as .50 may be rare. In short, using these indexes, it appeared that very great improvements in testing would be necessary to have any substantial effect on organizational outcomes.

Evaluation From a Decision-Theory Perspective. As MAU models of a test's usefulness for decisions, such formulas are deficient. Only one attribute of the selection system is considered--the accuracy of prediction, expressed as the shared variance between two normally distributed variables. From a decision-making perspective, the usefulness of a selection system depends on its ability to provide information that will improve decisions, where decision improvements are measured in terms of valued decision outcomes. Therefore, this model omits selection system attributes such as the quality of the existing selection system, the effect of the proposed selection system information on actual decisions, and the impact of those effects on valued consequences. The utility scale for attaching attribute values to each option is a statistic that measures squared deviations from a predicted linear function. Thus, both positive and negative prediction deviations from the linear function are equally undesirable. This implies that a decision maker would consider overpredicting a qualified candidate's future performance just as costly as underpredicting it. In fact, of course, the important deviations from predictions are the ones that result in selection errors (i.e., selecting a candidate who should not have been hired, and/or failing to select a candidate who should have been hired).
These models adopt an implicit payoff function that assigns equal value (or loss) to inaccurate predictions at all points in the predictor-criterion space (Wesman, 1953). Because there is only one attribute, there is no payoff function for combining different attributes. The statistic serves as the sole utility value. These models fail to reflect most of the three basic program attributes (i.e., quantity, quality and cost). They reflect neither the quantity of time periods affected by the selection decisions, nor the quantity of employees affected in each time period. Though these models reflect one statistical quality of the predictor, this is only indirect evidence of that predictor's effect on work force quality. Finally, they fail to acknowledge the costs to develop and apply tests. Though the deficiencies inherent in these formulas are apparent when viewed from a decision-making perspective, the fundamental notion of expressing the relationship between a predictor and a criterion in terms of the correlation coefficient remains a basic building block of UA models. Later models began to explore ways to embed the correlation coefficient within a set of decision attributes that made it easier to interpret.

Defining Payoff Based on the Success Ratio

Description of the UA model. These utility models reflected a new utility concept--the success ratio, or proportion of selected employees who subsequently succeed. Taylor and Russell (1939) proposed a UA model designed to reflect the fact that the usefulness of a test depends on the situation in which it is used.
Unlike models based solely on the validity coefficient, the Taylor-Russell model reflects three attributes of the decision situation: (1) the validity coefficient; (2) the base rate, scaled as the proportion of applicants who would be successful if selection were made without the proposed predictor; and (3) the selection ratio, scaled as the proportion of applicants falling above the hiring cut-off on the predictor.

The payoff function combining these attributes assumes a linear, homoskedastic, and bivariate normal relationship between the predictor (or predictor composite) and the criterion, and uses formulas for the area under a normal curve to derive the success ratio. The Taylor-Russell model assumes fixed-treatment selection (i.e., each applicant will either be hired or rejected) and a dichotomous criterion (i.e., selectee value is classified as either successful or unsuccessful). Total utility under the Taylor-Russell approach is the success ratio predicted for a specific combination of validity, selection ratio, and base rate, minus the success ratio that would result without using the proposed predictor (i.e., the base rate). The combination producing the greatest improvement is the preferred option. Taylor and Russell derived extensive tables indicating the predicted success ratio for various combinations of base rates, validity coefficients, and selection ratios (Cascio, 1987 reprints these tables).

To apply the model, a decision maker would choose the criterion (e.g., job performance) and determine the level of criterion performance that represents the dividing line between acceptable and unacceptable (or successful and unsuccessful) selectees.
Then, s/he would estimate the current base rate implied by this criterion level in the population of individuals to which the proposed predictor would be applied (perhaps by examining the current success rate, if the predictor is to be added to those already in use). Finally, s/he would use the Taylor-Russell tables to determine the expected change in the success ratio under various assumptions about validity and selection ratios.

Detailed summaries of the Taylor-Russell model are provided elsewhere (Taylor & Russell, 1939; Cascio, 1980; Cascio, 1987, chapter 7). According to the Taylor-Russell tables, when other parameters are held constant: (1) higher validities produce more improved success ratios (because the more linear the relationship, the smaller the area of the distribution lying in the false-positive or false-negative region); (2) lower selection ratios produce more improved success ratios (because lower selection ratios mean more "choosy" selection decisions, and the predictor scores of selectees lie closer to the upper tail of the predictor distribution); (3) base rates closer to .50 produce more improved success ratios (because as one approaches a base rate of zero, none of the applicants can succeed, so selection has less value; as one approaches a base rate of 1.0, all applicants can succeed even without selection, so selection has less value).

Evaluation from a decision-making perspective. The Taylor-Russell model reflects three attributes, rather than only the validity coefficient, but it still provides a limited description of selection program utility. Like its predecessors, this model ignores both the number of employees affected and the number of time periods during which that effect will last. The model's measure of quality (proportion successful) is also troublesome because it does not reflect the natural units of value such as sales, productivity or reduced errors.
Finally, the model excludes attributes reflecting program costs (Cascio, 1980, 1987), but cost differences will occur, especially as the selection ratio is changed by screening more or fewer applicants.

Scaling the base rate as a dichotomous criterion (i.e., success/fail) will often lose information because the value of performance is not equal at all points above the satisfactory level, nor at all points below the unsatisfactory level (Cascio, 1982, p. 135; Hunter & Schmidt, 1982, p. 235; Cronbach & Gleser, 1965, pp. 123-124, 138). More typically, performance differences exist within the two groups, so a continuous criterion scale could be more appropriate. Cascio (1982, 1987) suggests the dichotomous scale may be more appropriate for truly dichotomous criteria (e.g., turnover occurrences), or where output differences above the acceptable level do not change benefits (e.g., clerical or technician's tasks), or where such differences are unmeasurable (e.g., nursing, teaching, credit counseling). Combining the attributes by assuming bivariate normality and linearity implied in the payoff function may also be unrealistic in some selection situations.

Some have proposed that the choice of the criterion cutoff is "arbitrary" (Cascio, 1982, p. 133; Hunter & Schmidt, 1982, p. 235; Schmidt, Hunter, McKenzie & Muldrow, 1979) because it is set by management consensus or because objective information on which to base such a decision is rarely available, and that changing this "arbitrary" cutoff will change the base rate, and thus substantially alter the conclusions from the model. If indeed there is no objective method of setting the performance cutoff, then the Taylor-Russell utility model is inappropriate. However, the concept of a criterion cutoff is not arbitrary, nor does the Taylor-Russell model imply that arbitrary changes in that cutoff are to be regarded as legitimate methods of enhancing the success ratio.
Rather, the criterion cutoff (and the base rate it implies) should be based on the relationship between the selection situation (i.e., the minimally acceptable criterion level) and the applicant population (i.e., the proportion of the population that would exceed that level if hired). This concept is essential to evaluating the effects of recruitment on staffing utility, and should not be abandoned by labelling it "arbitrary."

Variations on the Taylor-Russell model. The models discussed next add program costs to the model and/or redefine the attribute utility scales to include dollar-scaled consequences of different selection mistakes.

Cascio (1980, p. 35) noted that Smith (1948) provides a method of adjusting the Taylor-Russell results to reflect pre-existing selection ratios and validities. Technically, if current-employee characteristics are used as inputs to the model, this assumes that current employees are similar to the applicant population to which the new predictor system will be applied. This is appropriate if one is adding the new predictor to an existing set of predictors, and if the base rate, selection ratio and validity coefficient reflect this situation. However, if the predictor will replace a previous predictor, then one should use the table corresponding to the observed success ratio given current selection ratios and validities.

Sands (1973) proposed the CAPER model (Cost of Attaining PErsonnel Requirements). Its payoff objective is a recruiting and selection strategy that minimizes total costs of recruiting, inducting, selecting and training enough new hires to meet a set quota of satisfactory employees. This model adds the notion of the costs involved in hiring and recruiting, but it suffers from the same weaknesses in the payoff function as the Taylor-Russell model.
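
The Taylor-Russell success ratio described in this section can be approximated numerically rather than read from the published tables. The following is a minimal sketch, not Taylor and Russell's own tabling procedure: it assumes a standardized bivariate normal predictor and criterion, top-down selection, and a validity in the range 0 to just under 1. The function name `success_ratio` and the use of Simpson's rule are illustrative choices.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def success_ratio(validity, base_rate, selection_ratio, steps=4000):
    """Approximate P(successful | selected) under the Taylor-Russell assumptions.

    Assumes 0 <= validity < 1 and that base_rate and selection_ratio lie
    strictly between 0 and 1. `steps` must be even (Simpson's rule).
    """
    x_cut = _STD.inv_cdf(1.0 - selection_ratio)  # hiring cutoff on the predictor
    y_cut = _STD.inv_cdf(1.0 - base_rate)        # success cutoff on the criterion
    resid_sd = (1.0 - validity ** 2) ** 0.5      # SD of criterion given predictor

    def integrand(x):
        # density of predictor score x times P(criterion > y_cut | x)
        p_success = 1.0 - NormalDist(validity * x, resid_sd).cdf(y_cut)
        return _STD.pdf(x) * p_success

    # Simpson's rule from the cutoff to z = 8; the tail beyond 8 is negligible.
    upper = 8.0
    h = (upper - x_cut) / steps
    total = integrand(x_cut) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(x_cut + i * h)
    joint = total * h / 3.0  # P(selected and successful)
    return joint / selection_ratio


if __name__ == "__main__":
    base_rate = 0.50
    for sr in (0.80, 0.50, 0.20):
        gain = success_ratio(0.50, base_rate, sr) - base_rate
        print(f"SR={sr:.2f}  gain in success ratio={gain:.3f}")
```

With a validity of .50, a base rate of .50 and a selection ratio of .50, the sketch returns a success ratio of about .67, which agrees with the closed-form value for this symmetric case. Subtracting the base rate gives the Taylor-Russell utility value for that combination.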
Mahoney and England (1965) noted that success and failure probabilities on a new predictor are conditional on the success and failure probabilities existing in the applicant population after previous methods have been employed. They argued that previous decision rules (Stone & Kendall, 1956) and Meehl and Rosen (1955) implicitly assumed that these probabilities are .50. They defined the cost of selection mistakes to include not only false positives (hires who do not succeed) as in the Taylor-Russell model, but also false negatives (rejected applicants who would have succeeded), which could be important where high-quality rejected applicants are hired by competitors and reduce the organization's competitive advantage (Guttman and Raju, 1965). Mahoney and England simulated various values for the selection ratio on the proposed predictor, the selection ratio on previous predictors, the existing failure probability, the failure probability under the new system, the ratio of recruitment costs to selection mistakes (i.e., .05, .10, .30, and .50), and the ratio of predictor costs to selection mistakes (i.e., .05, .10, and .30). They concluded that a new predictor's value exceeds its cost only when the probability of selection mistakes is quite low (i.e., less than .30), and that "the opportunities for developing and installing predictive measures that are worth the additional cost appear relatively restricted" (p. 375). This conclusion conflicts with more recent evidence based on newer UA models. One explanation is that their ratios of costs to mistakes were really quite large.
Because selection mistakes may reduce performance for many years, and predictors can cost less than a few hundred dollars, it is difficult to imagine situations where the ratio would exceed .10, and it would probably frequently fall below .01.

Hunter and Schmidt (1982) also note a number of studies based on the notion of a dichotomous criterion (Alf & Dorfman, 1967; Curtis, 1966; Darlington & Stauffer, 1966; Schmidt, 1974). The dichotomous-criterion model is obviously deficient, but if it is easier to implement, then a more complex model (such as those discussed subsequently) must prove its value based on its ability to improve decisions over the simpler model.

Defining Payoff Based on the Standardized Criterion Level

Description of the Utility Model. The major criticism of the Taylor-Russell model was that it used a dichotomous notion of total utility (i.e., success/fail) that failed to reflect the true range of variation in selectee performance. The next version of the selection utility model attempted to remedy this by scaling total utility on a continuous scale. Brogden (1946a, 1946b) showed that the correlation coefficient is the proportion of maximum predictive value obtained using a predictor (where maximum predictive value is what would hypothetically be obtained if the criterion itself were used to select employees). Moreover, he used the principles of linear regression to demonstrate the relationship between the correlation coefficient and increases in a criterion (measured on a continuous scale). Brogden's logic serves as the basic building block for virtually all subsequent UA research.
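
Brogden's result, that the validity coefficient equals the proportion of the maximum possible gain captured by the predictor, can be illustrated with a small simulation. This is only a sketch under assumed conditions (standardized bivariate normal scores, top-down selection, a fixed random seed); the function names and parameter values are hypothetical and are not taken from Brogden's papers.

```python
import random


def mean_criterion_of_selected(pairs, key_index, selection_ratio):
    """Mean criterion score (element 1) of the top fraction ranked on key_index."""
    n_selected = int(len(pairs) * selection_ratio)
    ranked = sorted(pairs, key=lambda p: p[key_index], reverse=True)
    top = ranked[:n_selected]
    return sum(p[1] for p in top) / n_selected


def simulate_gain_ratio(validity, selection_ratio, n_applicants=100_000, seed=1):
    """Ratio of the gain from selecting on the predictor to the gain from
    selecting on the criterion itself (the hypothetical maximum)."""
    rng = random.Random(seed)
    resid_sd = (1.0 - validity ** 2) ** 0.5
    pairs = []
    for _ in range(n_applicants):
        x = rng.gauss(0.0, 1.0)                             # predictor score
        y = validity * x + resid_sd * rng.gauss(0.0, 1.0)   # criterion score
        pairs.append((x, y))
    gain_with_predictor = mean_criterion_of_selected(pairs, 0, selection_ratio)
    gain_with_criterion = mean_criterion_of_selected(pairs, 1, selection_ratio)
    return gain_with_predictor / gain_with_criterion


if __name__ == "__main__":
    for r in (0.3, 0.5):
        print(f"validity={r:.2f}  simulated gain ratio={simulate_gain_ratio(r, 0.2):.3f}")
```

Because the applicant pool is standardized to a mean of zero, each mean criterion score is itself the gain over random selection, and the simulated ratio should fall close to the validity used to generate the data.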
Assuming a linear relationship between criterion scores (y) and predictor scores (x), the best linear unbiased estimate of the criterion score associated with a predictor score is:

$$E(y) = A + B(x) \tag{2}$$

The intercept (A) and the slope (B) of this line reflect the linear relationship between x and y as well as the units in which each of them was originally scaled. Because predictor and criterion scales vary from study to study, it is difficult to compare these parameters or to use them in a general model. However, if we transform both the y and x variables into standardized (Z-score) units (i.e., $Z_x$ and $Z_y$), we can write Equation 2 as follows:

$$Z_y = (r_{x,y})(Z_x) \tag{3}$$

Therefore, if we knew the average standardized predictor score of a selected group of applicants (i.e., $\bar{Z}_x$), our best prediction of the average standardized criterion score of the selected group (i.e., $\bar{Z}_y$) would be the product of the validity coefficient and the average standardized predictor score, as shown in Equation 4.

$$\bar{Z}_y = (r_{x,y})(\bar{Z}_x) \tag{4}$$

The validity coefficient was well established. One way to estimate the average standardized test score of the selected group would be to actually observe the value after applying a selection device. However, Kelley (1923) suggested that if one assumes that the predictor scores are normally distributed and that one ranks applicants by test score and selects from the top down, then the average standardized predictor score is a function of the proportion of the applicant population falling above the predictor cutoff score (i.e., the selection ratio). If one assumes the predictor is normally distributed, though, Equation 4 holds only if one also assumes normally distributed criterion scores. Brogden (1949, Equation 6) and Cronbach and Gleser (1965, p. 309) make use of this approach to derive their models.
If we symbolize the "ordinate of the normal distribution" corresponding to the standardized predictor cutoff score as lambda (i.e., $\lambda$), and the selection ratio corresponding to the standardized predictor cutoff as SR (it has also been symbolized by the Greek letter $\phi$), then Equation 4 can be rewritten:

$$\bar{Z}_y = (r_{x,y})(\lambda / SR) \tag{5}$$

The "ordinate of the normal distribution" is an important variable, multiplicatively related to the average standardized predictor score, and a statistically sophisticated concept. It is sufficient, however, to understand that the ordinate is simply a mathematical value that is completely determined by the selection ratio, and (when divided by the selection ratio) can be used to compute the expected average standardized predictor score of those selected using that selection ratio. Computing the relationship between the selection ratio and the average standardized predictor score of the selected group was made even easier by Naylor and Shine (1965), who computed extensive tables showing, for each selection ratio, the corresponding standardized predictor cutoff score, the corresponding ordinate of the normal distribution, and the corresponding average standardized predictor score under the assumptions noted above.

Evaluation From a Decision-Making Perspective. The attributes of the Naylor-Shine utility model still include the validity coefficient and the selection ratio, but their contributions appear through a different payoff function. The validity coefficient now has a constant multiplicative effect on expected standardized criterion levels at all selection ratios. The selection ratio still reflects the "choosiness" of the selection program, but is now used to derive a new attribute--the average standardized predictor score of selectees (i.e., $\bar{Z}_x$).
The lower the selection ratio, the greater the predictor score required to meet selection standards, and the greater the resulting standardized predictor score of those meeting the selection standard. Unlike the Taylor-Russell model, the base rate no longer appears as an attribute because the standardization used to go from Equation 2 to Equation 3 defines the average value of the applicant pool as zero.

The utility model of Equations 3, 4 and 5 addresses one shortcoming of the Taylor-Russell model by using a total utility concept based on a continuous scale. Utility is defined as the difference in average standardized criterion score between those selected using a test and those selected without it. The translation from Equation 2 to Equation 3 requires that the utility concept be expressed in standardized units, which are difficult to interpret in units more natural to the decision process (e.g., performance ratings, dollars, units produced, reduced costs, etc.). Also, this utility concept reflects only the difference between the average standardized criterion score of those selected using the predictor and the average standardized criterion score that would be obtained through selection without the predictor. The absolute utility from the program is not computed, only the increment over not using the predictor. Finally, the model assumes that selection occurs as if applicants were ranked based on their predictor scores, and then hired from the top down until the desired selection ratio is reached, which may or may not describe a realistic selection approach.

Considering the three basic utility model concepts (i.e., quantity, quality and cost), the Naylor-Shine utility model reflects the effects of selection on per-person, per-time-period quality on a continuous criterion. The quantity of employees and the number of time periods affected are not explicitly reflected, nor are program costs.
However, the next section will demonstrate that they can be easily added.

Defining Payoff in Terms of Dollar-Valued Criterion Levels

Description of the utility model. The most obvious drawback of the Naylor-Shine UA model is that standardized criterion levels are difficult to interpret in "real" units. Correlation-based statistics are useful when predictor and criterion scales vary from study to study (as in selection research) because the standardized scale underlying the correlation coefficient allows direct comparison between studies. However, when one wishes to evaluate utility in units relevant to a particular situation, such standardized scales create problems.

Actual selection decision makers usually face choices among selection strategies. Each strategy carries with it a set of activities required for development and implementation, as well as the possibility of various outcomes resulting from more accurate selection. The development and implementation activities are often expressed as costs (i.e., the value of required resources), usually scaled in dollars. Therefore, the question becomes whether it is worthwhile to spend that dollar amount to produce the selection consequences. With a standardized criterion scale, one must ask questions such as: "Is it worth spending $10,000 to select 50 people per year, in order to obtain a criterion level 0.5 standard deviations greater than what we would obtain without the predictor?" Many HRM managers may not even be familiar with the concept of a standard deviation. They would find it difficult to attach a dollar value to a 0.5 standard deviation increase in the criterion, particularly because the decision makers may never actually observe the population of applicants to which the predictor would be applied. These limitations suggest modifying the UA model for selection to be expressed in dollar terms.
Both Brogden (1946a, 1946b, 1949) and Cronbach and Gleser (1965, pp. 308-309) eventually derived their utility formulas in terms of "payoff" (often expressed in dollars) rather than standardized criterion scores. Also, they both included the concept of costs. In fact, Brogden's (1949) treatment explicitly computed utility values in dollar terms, and attempted to derive guidelines for testing costs. Brogden and Taylor's (1950) formula introduced a scaling factor to translate standardized criterion levels into dollar terms. The scaling factor is the dollar value of a one-standard-deviation difference in criterion level (it has been symbolized in several ways; $SD_y$ is used here). The cost attribute is usually expressed as the cost to administer the predictor to a single applicant (usually symbolized as C). Finally, the utility value is symbolized as $\Delta\bar{U}$, to indicate that it represents the difference between the dollar payoff from selection without the predictor and the dollar payoff from selection with the predictor (this is usually called the "incremental" utility of the predictor). The resulting utility equation may be written as Equation 6.

$$\Delta\bar{U} = (SD_y)(r_{x,y})(\bar{Z}_x) - C/SR \tag{6}$$

The per-applicant cost (C) is divided by the selection ratio (SR) to reflect the total testing cost of obtaining each selectee (e.g., if the selection ratio is .50, then one must test 2 applicants to find each selectee, and the testing cost per selectee is 2 times the cost per applicant). Sometimes, the entire formula is simply written in terms of per-selectee outcomes, and the symbol C is used to denote the cost per selectee.

Equation 6 depicts the incremental dollar value ($\Delta\bar{U}$) produced by using a predictor (x) in a population of applicants where the validity coefficient is $r_{x,y}$; a one-standard-deviation difference in dollar-valued criterion levels equals $SD_y$; the average standardized predictor score of those selected is $\bar{Z}_x$; and the per-selectee cost of using the predictor equals C/SR.
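
Equation 6 can be evaluated directly once the average standardized predictor score is obtained from the selection ratio as the normal ordinate divided by SR. The sketch below assumes a normally distributed predictor and top-down selection; the function names and the dollar figures in the usage loop are hypothetical illustrations, not values from any study cited here.

```python
from statistics import NormalDist


def mean_selected_z(selection_ratio):
    """Average standardized predictor score of selectees: ordinate / SR.

    Assumes a normally distributed predictor and top-down selection,
    with 0 < selection_ratio < 1.
    """
    std = NormalDist()
    cutoff = std.inv_cdf(1.0 - selection_ratio)  # standardized cutoff score
    ordinate = std.pdf(cutoff)                   # lambda in Equation 5
    return ordinate / selection_ratio


def utility_per_selectee(sd_y, validity, selection_ratio, cost_per_applicant):
    """Equation 6: (SD_y)(r_xy)(mean Z_x) - C/SR, in dollars per selectee."""
    benefit = sd_y * validity * mean_selected_z(selection_ratio)
    return benefit - cost_per_applicant / selection_ratio


if __name__ == "__main__":
    # Hypothetical inputs: SD_y = $10,000, validity = .30, C = $20 per applicant.
    for sr in (0.80, 0.50, 0.20, 0.05):
        delta_u = utility_per_selectee(10_000, 0.30, sr, 20)
        print(f"SR={sr:.2f}  mean Z_x={mean_selected_z(sr):.3f}  dU/selectee=${delta_u:,.2f}")
```

At a selection ratio of .50 the cutoff is zero and the average standardized predictor score is about 0.80. Lowering the selection ratio raises that score but also raises the testing cost per selectee, which is the trade-off the cost term in Equation 6 captures.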
To express the total gain from using the predictor to select $N_s$ selectees, we simply multiply by the number selected, change the symbol for incremental utility from $\Delta\bar{U}$ to $\Delta U$, and multiply the per-applicant cost by the number of applicants ($N_{app}$), as shown in Equation 7.

$$\Delta U = (N_s)(SD_y)(r_{x,y})(\bar{Z}_x) - (C)(N_{app}) \tag{7}$$

This formula is stated in terms of the per-selectee incremental criterion level multiplied by the number selected (Brogden, 1949), but Cronbach and Gleser (1957, 1965) derived their formulas in terms of the per-applicant incremental criterion level, which can be derived by dividing the total utility by the number of applicants, expressed as Equation 8:

$$\Delta U/\text{applicant} = (N_s/N_{app})(SD_y)(r_{x,y})(\lambda/SR) - C \tag{8}$$

In Equation 8, the term ($\lambda/SR$) has been substituted for the average standardized test score. If we note that the term ($N_s/N_{app}$) equals the selection ratio (SR), we can cancel terms and produce the Cronbach and Gleser equation for per-applicant incremental dollar-valued utility, shown in Equation 9.

$$\Delta U/\text{applicant} = (SD_y)(r_{x,y})(\lambda) - C \tag{9}$$

Cronbach and Gleser (1965, p. 39) also developed a utility formula for comparing the usefulness of two tests (one producing lower validity and lower costs, the other producing higher validity with higher costs). They recommended computing the difference in utility between the two tests, which simply involves substituting the difference in validities for $r_{x,y}$ and the difference in costs for C in Equations 6 through 9. These equations have come to be known as the Brogden-Cronbach-Gleser (B-C-G) selection utility model. Recent embellishments of the B-C-G model have explicitly incorporated the duration of the effects of better-selecting one group by multiplying the value component (i.e., the component containing $r_{x,y}$) by the expected average tenure of the hired group (i.e., T).

Evaluation From a Decision Making Perspective.
The B-C-G selection utility model reflects the same attributes as Naylor-Shine, but adds the attribute of dollar-valued criterion standard deviation (i.e., $SD_y$). It also adds attributes reflecting the duration of selection effects (i.e., T) and the program costs (i.e., C). In terms of the overall utility concept, scaling the per-person, per-time-period incremental criterion level in dollars seems more in keeping with organizational objectives evaluated in dollars. The model continues to focus on the incremental utility added by using the predictor versus not using it. Thus, all utility values are scaled as differences from an unknown utility level that would be attained without the predictor.

Table 2 summarizes the results of the Schmidt, Hunter, McKenzie & Muldrow (1979) application of the B-C-G model for entry-level computer programmers in the U.S. Government. The application reflects the consequences of hiring one group of 618 computer programmers, assumed to stay for 9.69 years and then leave. The utility computation is organized according to the Quantity, Quality and Cost components developed earlier. Unlike earlier models, the B-C-G model incorporates all three concepts. Although modifications to this basic model have recently been proposed, the B-C-G model has been the dominant framework for studying HRM program utility.

[Insert Table 2 Here]

Assumptions of the B-C-G Selection Utility Model. The payoff function translating the attributes into utility values reflects certain assumptions (Cronbach and Gleser, 1965, p. 307):

(1) Decisions focus on an indefinitely large population of "all applicants after screening by any procedure which is presently in use and will continue to be used." Thus, the appropriate population for deriving the validity coefficient, $SD_y$, and the selection ratio depends on the decision situation.
If one is contemplating adding a new procedure to a group of previously used procedures, then it is the "incremental" validity coefficient and the pre-screened population $SD_y$ and selection ratio that count. If, however, one contemplates replacing an old procedure with a new one, then the parameters should reflect the unscreened population.

(2) Regarding any person, one can decide only to accept or reject them. Thus, no adaptive decisions can be made to reflect different predictor scores (e.g., training those who achieve a moderately high score, in order to bring them to minimally qualified levels).

(3) Predictor (or "test") scores are standardized to zero mean and unit standard deviation.

(4) The "payoff" resulting from accepting a person has a linear regression on predictor score, and the predictor is scored so that validity is positive.

(5) The payoff resulting from rejecting a person is unrelated to predictor score, and is set to zero. Thus, it is assumed that the organization is indifferent to the consequences of rejection, regardless of the qualification level of those rejected.

(6) The average cost of administering the predictor to ("testing") a person is C, and C is greater than zero. In practice, it is often easier to separate this cost into its fixed components (i.e., one-time development costs) and its variable components (i.e., ongoing per-applicant administration costs). Also, if the decision options include the possibility of testing more or fewer applicants, then the differences in recruiting costs necessary to provide different quantities of applicants should be included (Boudreau & Rynes, 1985; Hunter & Schmidt, 1982, p. 241).

(7) The strategy for selection is to set a predictor cutoff score so that the desired proportion (selection ratio) of the applicant group falls above it. All applicants scoring above that level are accepted, those below it are rejected.
This is equivalent to ranking applicants from the top down on predictor score, and then hiring by rank order until the established quota of new hires is met (assuming there are no rejected offers). When such hiring does not take place, the effective selection ratio is different.

Validity for the Dollar-Valued Criterion versus Proxy Criteria. Adopting the $SD_y$ scaling factor carries with it some assumptions about observed and implied correlations. There is no clear consensus regarding the meaning of y (we will discuss this after reviewing empirical attempts to estimate $SD_y$), but it undoubtedly reflects a wide variety of employee behaviors and attributes that affect dollar-valued organizational outcomes. If it were possible to measure such a criterion, the best utility model would simply reflect the regression equation of y on the predictor score (similar to Equation 2). In reality, however, predictors are not validated on such a dollar-valued criterion because it cannot be directly measured. Thus, UA models substitute a validity coefficient ($r_{x,y}$) that reflects the regression of one or more proxy criteria (e.g., performance ratings, tenure, sales, etc.) on the predictor, with all variables standardized to Z-scores. This substitution assumes not only that dollar-valued criterion levels are linearly related to predictor scores, but also that the proxy criterion and unobserved dollar-valued productivity are linearly related.

Hunter & Schmidt (1982) and Schmidt, Hunter, McKenzie and Muldrow (1979) argued that many mistakenly believe that utility equations are of no value unless the data exactly fit the linear homoskedastic model and all marginal distributions are normal. They state that the B-C-G model only introduces the normality assumption for "derivational convenience" (Hunter & Schmidt, p. 243) because it provides an exact relationship between the selection ratio and the average standard test score of selectees.
They further state that the only critical assumption is a linear homoskedastic relationship between predictor and criterion, and they present evidence in support of this relationship using observable proxy criteria. They argue (Schmidt, et al., 1979, p. 613) that the relationship between the proxy and employee dollar value will be linear, or that ceiling effects on proxy measures will make the correlation between the proxy and the predictor underestimate the correlation between the dollar value and the predictor. Raju, Burke & Normand (1987) note that equality between these correlations implies a correlation close to unity between the proxy and dollar value. Evidence of low correlations between typical and maximum performance (Sackett, Zedeck & Fogli, 1988) suggests that validity might differ depending on whether dollar value reflects typical or maximum performance. Evidence that test validity may be higher at higher predictor score ranges (Lee & Foley, 1986) suggests that the level of test scores in the applicant population may also moderate incremental utility values. We have no direct evidence regarding the correlation between predictors and dollar-valued utility, but small estimation errors may not seriously reduce the utility model's ability to improve decisions (compared to less sophisticated decision models).

Hunter & Schmidt (1982) and Schmidt, et al. (1979) also state that it is a mistake to believe that test validities are situationally specific, making application of utility analysis possible only when a criterion-related validity study has been performed in the particular situation.
"Validity generalization" research (Hunter, Schmidt, & Jackson, 1982), which allows data from many studies to be analyzed together, strongly suggests that much of the variability in validity coefficients observed across studies is due to artifacts of the studies (e.g., different sample sizes, different criterion reliabilities, different range restrictions, etc.), rather than real differences in the predictor-criterion relationship. Moreover, the variability that does remain after correcting for these artifacts may be so small that it does not seriously reduce the utility model's ability to enhance decisions. Indeed, it has been suggested that selection validities might usefully be estimated by experts or even less experienced judges (Schmidt, Hunter, Croll & McKenzie, 1983; Hirsh, Schmidt, & Hunter, 1986).

The role of testing costs. Both Brogden (1949) and Cronbach and Gleser (1965) portrayed testing costs as a fundamental characteristic of their UA models. The cost attribute recognizes that improvements in validity and/or reductions in selection ratios are not infinitely desirable. At some point, additional costs will offset gains from improved employee quality. At the extreme, it seems unlikely that pursuit of selection systems with validities close to unity would be cost effective. Cronbach and Gleser (1965) discussed the importance of the cost of testing in deciding between competing predictors (p. 39), in determining optimum test length (p. 323-324) and in determining the optimum predictor cutoff score (p. 308). Brogden (1949) noted that considering the cost of testing can show that higher selection ratios (i.e., testing fewer applicants and being less choosy) can be preferable to low ones if the testing cost is high. He concluded that "the ratio of cost of testing to the product of the validity coefficient and SDy (in dollar units) should not exceed .10. It would be desirable to hold it below .05" (p.
177). Below .05, lower selection ratios contribute to higher utility. Brogden presented an example for hosiery loopers, and used a one-year payoff duration. His analysis indicated that testing costs above $5.00 per person decreased utility at low selection ratios. As we shall see, in actual applications SDy (per person, per year) is usually fairly large compared to testing costs. Moreover, testing costs occur once, but benefits usually accrue over the selected group's tenure. Thus, the value of SDy when considered over the group's tenure is larger, and testing costs become less likely to detract from utility except at very low selection ratios (Hunter & Schmidt, 1982, p. 240).

However, omitting such costs from the UA model, or assuming they equal zero, removes much of the justification for dollar-valued utility estimates. Faced with a costless selection procedure, any non-negative validity coefficient must produce positive utility because Ns, SDy, and Zs must always be positive (see Equation 7), so a utility model based solely on the sign of the validity coefficient would suffice. In reality, implementing employee selection programs may require time, energy and other resources that could be used to implement other managerial programs. If so, the lost value of the foregone programs represents a legitimate cost of the selection program, so actual costs (i.e., the true investment necessary to implement the selection program) may be much higher than testing costs alone.

The Appropriate Applicant Population. Cronbach and Gleser (1965, p. 34-35) stated "we use 'validity' subsequently to refer to a correlation computed on men who have been screened on whatever a priori information is in use and will continue to be available," and that the appropriate utility calculation depends on the situation in which the selection program will be used.
They noted three possible situations: (1) all prior information will continue to be used and the new system added to it; (2) the new system will be substituted for some of the prior information; or (3) a composite of previous and new information will be used. Each has different implications for the UA model. The incremental program contribution is key. Moreover, any new program should be compared to the efficient use of information already available. Some have concluded that the B-C-G model presumes "concurrent validity" (e.g., Cascio, 1980, p. 39), but the precise assumption is that selection devices be evaluated in light of the conditions under which they will actually be applied. In fact, such conditions may indicate a population less restricted than current applicants (e.g., if the predictor is to be substituted for an existing predictor and applied to unscreened applicants) or they may imply a more restricted population (e.g., if the new predictor is not only going to be added to an existing screening system, but the existing system will be improved before adding it). Mueser and Maloney (1987) argue that validity coefficients used in utility analysis may be severely overstated if test validation data arise from situations where composite predictors are already in use, and validity estimates fail to correct for multivariate restriction in range on those composites versus test scores. Applicant population characteristics also affect the selection ratio and SDy (Boudreau & Rynes, 1985). Determining the appropriate population requires assumptions that have important implications for integrating additional staffing processes (e.g., recruitment, turnover) into the selection utility model, as discussed subsequently.

Several enhancements to the B-C-G model have been proposed and applied, but the vast majority of empirical UA research has focused on selection systems, using the B-C-G model.
Therefore, we will now review empirical research based on the B-C-G model, and discuss the enhancements and their empirical findings subsequently. Existing UA applications have produced two kinds of findings: evidence of the utility values from selection programs, and evidence of differences in SDy.

Utility Values for Selection Programs

---------------------
Insert Table 3 Here
---------------------

Table 3 summarizes the utility values reported in existing literature. Twenty-one empirical studies were located, with utility values for 48 interventions. Two of these studies reported results for non-selection activities (Florin-Thuma & Boudreau, 1987; Mathieu & Leonard, 1987), but the utility model used by these studies is sufficiently similar to include their results here (the utility model for non-selection programs will be discussed subsequently). Several studies used enhanced utility models incorporating additional attributes (Burke & Frederick, 1986; Cronshaw, et al., 1986; Florin-Thuma & Boudreau, 1987; Mathieu & Leonard, 1987; Rich & Boudreau, 1987).

The symbols at the top of the table stand for the parameters of the utility model. Ns is the number selected or treated; T is the tenure of the selectees, or F is the analysis period; SR is the selection ratio; Zs is the estimated average standardized predictor score of selectees; rxy is the validity coefficient; SDy is the dollar-valued standard deviation of performance among the applicant population (or the untreated group for non-selection programs); Cost is the total program cost; and ΔU is the total utility of the program over all treated employees and all time periods. The last two columns contain an equation expressing total utility as a function of SDy, as well as the "Break-Even" (B-E) SDy value necessary for the program's total returns to equal its costs (Boudreau, 1984, and as discussed subsequently).
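The way the Table 3 parameters combine can be sketched as follows. This is a minimal illustration that assumes the basic B-C-G form, in which total utility is the product of Ns, T, rxy, SDy and Zs, less total program cost. Equation 7 is not reproduced in this section, so the exact functional form, the function names, and all numbers in the example are assumptions for illustration and are not values from Table 3.

```python
def total_utility(n_selected: float, tenure_years: float, validity: float,
                  sd_y: float, mean_z_selected: float, total_cost: float) -> float:
    """Total dollar utility under an assumed basic B-C-G form:
    Ns * T * rxy * SDy * Zs - Cost."""
    return (n_selected * tenure_years * validity * sd_y * mean_z_selected
            - total_cost)


def break_even_sd_y(n_selected: float, tenure_years: float, validity: float,
                    mean_z_selected: float, total_cost: float) -> float:
    """SDy value at which total returns exactly equal total program cost."""
    multiplier = n_selected * tenure_years * validity * mean_z_selected
    if multiplier <= 0:
        raise ValueError("break-even SDy is undefined unless Ns*T*rxy*Zs > 0")
    return total_cost / multiplier


if __name__ == "__main__":
    # Hypothetical program: 100 hires, 5-year tenure, validity .30,
    # mean selectee z-score .80, SDy of $10,000, total cost of $20,000.
    params = dict(n_selected=100, tenure_years=5, validity=0.30,
                  mean_z_selected=0.80, total_cost=20_000)
    print("Total utility:", round(total_utility(sd_y=10_000, **params)))
    print("Break-even SDy:", round(break_even_sd_y(**params), 2))
```

With these hypothetical inputs, total utility is roughly $1.18 million and the break-even SDy is roughly $167 per person per year. The gap between a plausible SDy and its break-even value is what the last two columns of the table are meant to convey.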
The overwhelming conclusion from Table 3 is that selection programs pay off handsomely. Virtually every study has produced dollar-valued payoffs that clearly exceeded costs (although Van Naersson, 1963, did report that improved selection to reduce accidents did not pay off because accident frequency and damages were already quite low). Even the earliest studies that reported utility per person (or per person, per hour in the case of Roche, 1961) found that the payoff exceeded costs. In studies that deal with more employees and multiple-year tenure, and that occurred more recently (and so include the effects of inflation), the utility estimates are always positive, and have ranged into the millions (e.g., Schmidt, et al., 1979; Cascio & Ramos, 1986; Cronshaw, Alexander, Weisner & Barrick, 1986; Schmidt, Hunter, Outerbridge & Trattner, 1986; Rich & Boudreau, 1987). The clear positive payoff from selection programs remains evident in studies with both small and large SDy values, and with selection ratios as high as 81% (Van Naersson, 1963). The largest utility values occur where large numbers of individuals are affected by the program, and Ns is large.

Many of the studies were designed to examine whether substituting a more-valid selection method for a less-valid one (usually an interview) produced greater dollar-valued payoff (Cascio & Silbey, 1979; Schmidt, et al., 1979; Ledvinka, Simonet, Neiner & Kruse, 1983; Schmidt, et al., 1984; Cascio & Ramos, 1986; Burke & Frederick, 1986; Rich & Boudreau, 1987). In these cases, Table 3 reports a utility value for each selection method separately and for the difference between them. As shown, in every case the more valid (and usually more costly) selection procedure produced the greater estimated utility. However, even the interview produced positive utility despite its cost and low validity. This is not an argument in favor of less-valid selection, but
it does illustrate that even modestly valid selection programs may produce substantial utility values.

The utility values measured by the B-C-G model appear to be quite high. Moreover, the estimated costs of improved selection are often minuscule compared to the benefits. A $10 per-applicant testing cost might produce over five times greater validity if the PAT is substituted for the interview (Schmidt, et al., 1979). As noted earlier, testing costs are unlikely to reflect the full range of resources required to implement top-down selection based on more valid predictors, but even inflating costs by a factor of 10 or 100 often would not change the positive utility values. According to these findings, the economic impact of improved selection might well surpass many more traditional investment opportunities, such as plant, equipment, marketing, financial, etc.

Such a conclusion seems at odds with the observations reported earlier (and verified by many HR managers) that human resource management's contribution is often ignored, that HR issues are not considered in organizational planning, and that debate continues over whether HR activities are really an appropriate use for organizational resources. This suggests several important research issues regarding the decision processes of managers, and how payoff information about HRM programs is interpreted and evaluated. However, only one research issue has received substantial attention in the I/O psychology literature--the accuracy, psychometric quality, and proper measurement method for SDy.

Research Measuring SDy

The standard deviation of dollar-valued job performance in the applicant population (SDy) was characterized as the "Achilles' Heel" of utility analysis by Cronbach and Gleser (1965, p. 121). The amount of recent research aimed at estimating this elusive concept suggests that many of today's UA researchers agree.
Moreover, researchers often regard accurate SDy measurement as fundamental for useful UA research (Burke & Frederick, 1984, 1985; Weekley, et al., 1985; DeSimone, Alexander & Cronshaw, 1986; Greer & Cascio, 1988). This section reviews this research from a decision-theory perspective, focusing on its contribution toward better describing, predicting, explaining and enhancing HRM program decisions. The review will focus on four decisions that must be made in measuring SDy: (1) the definition of utility (i.e., y); (2) the focus population; (3) the setting of study; and (4) the operational measurement method used.

From a decision-making perspective, these decisions should be guided by how well the analysis will describe, explain, predict and/or enhance HRM program decisions. SDy measurement is fundamentally linked to the decision context in which the measure is applied. However, existing research seldom explores whether utility analysis and SDy measures affect decisions or reflect decision maker objectives and values. Instead, research tends to pit one measure against others, often advocating a particular measure, with quality usually defined psychometrically (e.g., consistency with other measures, reliability across estimators, consistency with distributional assumptions). Such research provides interesting tests of measurement principles. However, its value in describing, predicting, explaining and enhancing decision processes is difficult to determine, because most SDy studies don't reflect actual decisions. UA models were spawned by the limitations of measurement theory and correlational statistics to fully capture the decision processes and consequences of selection programs (see Cronbach & Gleser, 1965, p. 135-137). It is ironic that the resurgence in UA research should focus once again on measurement issues.
UA research (including SDy measurement research) should focus clearly on the ultimate purpose of UA models to describe, predict, explain and enhance decision processes. This focus is frequently absent in the rush to develop and test each new SDy measure.

---------------------
Insert Table 4 Here
---------------------

Table 4 summarizes existing SDy measurement research. The studies are arranged chronologically, with each study described in terms of its setting and sample, utility scale, estimation method and research findings. The research findings are described in terms of the mean SDy estimate derived (i.e., MEAN), the standard deviation of the SDy estimate in the sample (i.e., SD), the standard error of the mean SDy estimate (SE), the percent of average salary represented by average SDy, and the percent of the mean payoff estimate represented by average SDy. In studies estimating dollar-valued payoff (y) directly (e.g., DeSimone, et al., 1986; Day & Edwards, 1987; Greer & Cascio, 1987; Edwards, et al., 1988) Table 4 reports the actual average payoff estimate (Mean y) as well as the estimate of the standard deviation (SDy). Thirty-four studies were located, producing over 100 individual SDy estimates (the results shown in Table 4 sometimes represent averages of groups of estimates derived by the authors). The trend in research activity is clearly evident, with only five studies between 1953 and 1978 but with 29 studies between 1979 and 1988.

The Utility Scale

Viewing UA models as special cases of MAU models suggests that utility will be largely in the eye of the beholder. Generic MAU models often rely on subjectively-scaled payoff functions, measured by having decision makers indicate their preferences for different levels of certain attributes on a scale of zero to 100 (see Huber, 1980 for a number of examples).
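A subjectively scaled payoff function of this kind can be sketched as a weighted sum of attribute ratings. The additive weighting rule used here is one common MAU form and is an assumption, since this section does not specify how attributes are combined. The attribute names, weights, and ratings are hypothetical.

```python
def mau_score(ratings: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted-additive multi-attribute utility on a 0-100 scale.

    ratings: decision makers' 0-100 preference rating for each attribute.
    weights: relative importance of each attribute (normalized internally).
    """
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same attributes")
    if any(not 0.0 <= value <= 100.0 for value in ratings.values()):
        raise ValueError("ratings must lie on the 0-100 scale")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


if __name__ == "__main__":
    # Hypothetical attributes and importance weights for an HRM decision.
    weights = {"productivity": 0.5, "cost_savings": 0.3, "affirmative_action": 0.2}
    program_a = {"productivity": 80, "cost_savings": 40, "affirmative_action": 60}
    program_b = {"productivity": 60, "cost_savings": 70, "affirmative_action": 50}
    print("Program A:", round(mau_score(program_a, weights), 1))
    print("Program B:", round(mau_score(program_b, weights), 1))
```

Writing the weights down forces the decision makers' priorities into the open. Changing the weights in the example can reverse the ranking of the two hypothetical programs.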
The nature of the decision situation and the decision makers determine the payoff function, and the MAU model makes the values, assumptions and priorities explicit.

Because UA models serve (in part) to translate HRM program consequences into units that managers understand (usually dollars), UA research has used more focused utility scales. Equations 6 through 9 clearly indicate that the utility scale reflects the expected average increase in employee dollar value due to the selection program, on a per-person, per-year basis. Little consensus exists regarding the meaning of "dollar value." The variety of criteria available for evaluating HRM decisions (see Smith, 1976, Milkovich & Boudreau, 1988 or other introductory textbooks) virtually guarantees that different researchers will adopt diverse definitions of the payoff scale, as we shall see. Still, a broad concept of the utility scale must be maintained, to avoid basing decisions on a dangerously narrow perspective.

Defining the meaning and scale of the criterion is important to advancing UA research and applications (Day & Edwards, 1987; DeSimone, et al., 1986; Steffy & Maurer, 1988). While a single definition will not apply in all situations, this section will attempt to develop a framework for categorizing existing definitions and developing new ones. At the very least, such a framework will allow researchers to clearly identify the objectives and assumptions underlying various studies. Eventually, it may aid understanding of the appropriate utility scales for different situations.

The Utility Concept. A general definition of payoff for utility analysis is "all consequences of a given decision that concern the person making the decision (or the institution he represents)" (Cronbach & Gleser, 1965, p. 22; Boudreau, 1987; in press). Some of these consequences may be positively valued (often referred to as "benefits") and some may be negatively valued (often referred to as "costs"). This
definition suggests several implications:

(1) Utility may reflect different outcomes (e.g., productivity increases, labor cost reductions, affirmative action goal attainment, improved organization image, consistency with fundamental organizational beliefs, high levels of financial return, etc.) consistent with the desires and objectives of decision makers and the constituents they serve (see Cronbach & Gleser, 1965, p. 23).

(2) Utility measures should reflect the decision context. Work force quality improvements will have different value depending on how they are used by the organization. For example, improved work force quality may be used to increase the number of units produced, to increase their average quality, or to reduce costs. As we will discuss subsequently, the dollar implications of these strategies are quite different.

(3) Increased measurement precision will not always improve decision quality. For example, if a simple (and inexpensive) payoff measure implies positive program utility, but a more accurate (but also more expensive) measure leads to the same decision, then the more accurate measure does not improve decision quality.

A Framework for the Payoff Scale Defining SDy. The payoff scales in UA research usually focus on the economic consequences of programs that increase labor force quality. Yet, there are many ways an organization might employ a higher-quality work force (Cronbach & Gleser, 1965, p. 23), and the payoff from HRM programs depends on how the organization uses the quality enhancements they produce. The Quantity, Quality and Cost concepts introduced earlier provide a useful framework. Among other objectives, organizations aim to increase economic value.
They can do this through some combination of: (1) producing high quality per unit of product sold (in order to generate high prices/revenue from selling each product unit); (2) producing and selling a large quantity of units; and (3) producing units at low cost (i.e., the value of resources in their next best alternative use; Levin, 1983). This framework applies even to non-profit organizations, whose objective is to provide the maximum quantity of service at the minimum cost, with a target profit of zero. This implies three general uses for improved labor force quality: (1) increasing the quantity of production; (2) increasing the quality of production; and (3) reducing production costs. Managers may choose to use labor force quality increases in any combination of these ways. A payoff scale defined in terms of economic profit can reflect any or all of these uses. A payoff scale defined in terms of quantity will be sufficient to reflect uses affecting product quantity, but it will fail to reflect the other two, and so on. Payoff scales reflecting revenue enhancements (through higher quality or quantity) and cost reductions dominate the UA literature, though profit-based scales are emerging.

Payoff as cost reduction. Most of the earliest UA applications focused on cost reduction from improved selection. Doppelt and Bennett (1953) focused on reductions in training costs. Van Naersson (1963) focused on reductions in driving accident and training costs. Lee and Booth (1974) and Schmidt & Hoffman (1973) focused on reduced costs of replacement (e.g., recruitment, selection and hiring costs) when turnover is reduced. More recently, Eaton, Wing & Mitchell (1985) and Mitchell, Eaton and Wing (1985) measured payoff in terms of the avoided costs of additional tanks to achieve a given military objective. Boudreau (1983a, p. 555) noted that utility models including variable costs applied to situations where cost reduction is an important selection outcome.
Schmidt and Hunter (1983, p. 413) noted that increases in work force productivity might be used to reduce "payroll costs" by producing the same amount of output with a smaller number of employees. Arnold, Rauschenberger, Soubel & Guion (1982), DeSimone, et al. (1986) and Schmidt, et al. (1986) emphasized cost reduction from hiring fewer employees to do the same amount of work. These payoff functions are also consistent with the "behavioral costing" approach to HRM program analysis described by Cascio (1982, 1987), in which HRM program effects are evaluated according to their ability to reduce costs associated with undesirable employee behaviors. A few authors (Mahoney & England, 1965; Sands, 1973) have incorporated not only the costs of replacing employees, but also the costs of false negatives (i.e., costs of mistakenly rejecting applicants who would have been successful if hired).

Cost-based payoff functions reflect an important element of economic payoff, but they can be misleading in those situations where programs that reduce costs also reduce revenue. For example, improved selection may identify employees who stay longer and reduce separation expenses, but if they stay because they are mediocre performers and have few employment opportunities, the reduction in replacement costs may be offset by a reduction in productivity. Although this danger is less apparent with a payoff scale reflecting reduced training time costs (because training success is likely to positively relate to subsequent job performance), training cost reductions may understate selection utility. Where cost reduction is the dominant consideration, cost reduction alone may represent a useful payoff scale. However, its deficiencies have led researchers to explore further options.

Payoff as the "value of output as sold". Schmidt, et al.
(1979) proposed an SDy measure that asked estimators to consider the "yearly value of products and services", and the "cost of having an outside firm provide these products." This payoff scale reflects the product of price and quantity sold, or the "sales value" (Boudreau, 1983a) of productivity. Hunter and Schmidt (1982, pp. 268-269) interpreted the payoff function as the value of "output as sold," or what the employer "charges the customer." As Table 4 indicates, much research has focused on similar payoff scales (Cascio & Silbey, 1979; Bobko, et al., 1983; Ledvinka, et al., 1983; Burke & Frederick, 1984; Schmidt, et al., 1984; Wroten, 1984; Bolda, 1985; Burke, 1985; Eaton, Wing & Lau, 1985; Eaton, Wing & Mitchell, 1985; Eulberg, O'Connor & Peters, 1985; Mitchell, Eaton & Wing, 1985; Reilly & Smither, 1985; Weekley, et al., 1985; Burke & Frederick, 1986; Cascio & Ramos, 1986; Cronshaw, et al., 1986; DeSimone, et al., 1986; Schmidt, et al., 1986; Day & Edwards, 1987; Greer & Cascio, 1987; Mathieu & Leonard, 1987; Rich & Boudreau, 1987; Edwards, 1988).

The "sales value" payoff scale implies that the appropriate benefit from improved HRM programs is the increased revenue generated by higher-quality employees. Its widespread adoption reflects, in part, the strong endorsement it originally received. For example, Hunter & Schmidt (1982) characterized Roche's payoff definition (contribution to company profits) as "deficient on a logical basis", because it subtracted costs of production from the value of output as sold. This view is difficult to reconcile with the general payoff definition originally proposed by Brogden & Taylor (1950) and Cronbach & Gleser (1965), both of whom included the notion of revenue minus costs (or simply cost reduction) as part of the payoff function.
Boudreau (1983) and Reilly & Smither (1986) proposed that the practice of asking estimators to consider both the "value of products and services" and the "cost of having an outside firm provide these products" may be confusing in an economic sense because a firm will pay an outsider a maximum of the internal costs of providing a service, not their value. Day & Edwards (1987) found that when the latter instruction was dropped, SDy values were slightly higher for Account Executives, and much higher for Mechanical Foremen, though the inter-rater variability of the estimates was also higher.

As Boudreau (1983a, p. 553; 1983b; 1987; in press) noted, the value of output as sold produced by HRM employees can be a deficient payoff definition for organizations using traditional financial investment decision models. When other organizational investments are being evaluated based on profit contribution, evaluating HRM investments based on revenue contribution (without considering associated costs of production) can cause HRM program value to be relatively inflated. Hunter, Schmidt & Coggin (1988, p. 526) adopted a similar position, stating that "increase in the dollar value of output as sold is the most relevant index when the concern is with sales figures, total firm income, market share and so forth," noting that this is a different payoff definition from profits.

Payoff as Increased Profits. The initial attention to the payoff function for utility analysis proceeded from the notion that the payoff scale should be applicable to business decisions, and generalizable across business organizations. Brogden and Taylor (1950) proposed the "dollar criterion," providing a number of computations for dollar-valued criterion measures. All of them share the notion that each unit produced (e.g., square feet of flooring laid) represents some value to the organization.
That value reflects the sales revenue generated when the unit is sold, less any costs involved in producing that unit. Brogden & Taylor list a number of elements to be considered in such a criterion, including:

(1) Average value of production or service units;

(2) Quality of objects produced or service accomplished;

(3) Overhead--including rent, light, heat, cost depreciation, rental of machines and equipment, etc.;

(4) Errors, accidents, spoilage, wastage, damage to machines or equipment due to unusual wear and tear, etc.;

(5) Such factors as appearance, friendliness, poise, and general social effectiveness where public relations are involved;

(6) The cost of time of other personnel consumed.

Roche (1961) explicitly followed Brogden and Taylor (1950) in developing a dollar criterion that would convert "production units, errors, time or other personnel consumed, etc. into dollar units" (p. 255). Cronbach and Gleser (1965) provided a very general payoff concept, including all consequences important to decision makers. Thus, their payoff concept is consistent with a "profit" definition, though it can encompass even broader definitions. Cronbach's comments on Roche's dissertation (Cronbach & Gleser, 1965, p. 266) seem to suggest that the concept of profit (revenue less costs) fits their definition. Indeed, Cronbach suggested a formula for hourly profit that reflected revenue less variable and fixed costs. More recently, Cascio and Ramos (1986, p. 20) discussed the concept of "the difference between benefits and costs" as their payoff function, and Greer & Cascio (1987) used "contribution margin" to reflect a similar concept. Hunter, et al. (1988, p. 526) also endorse the profit concept, noting that "when the focus of concern is with pretax profits, that would be the most relevant index." Reilly & Smither (1985) compared several different payoff definitions for SDy, including profits.
Their results suggest that the graduate students in their simulation differed most in their SDy estimates when they were asked to consider "net revenue" rather than "new sales," or "overall worth". The results of Bobko, et al. (1983) may reflect a similar phenomenon, in that their sales counselor supervisors exhibited much greater variability in their SDy estimates when attempting to estimate "yearly value to the company" rather than "total yearly dollar sales." Greer & Cascio (1987) estimated SDy based on contribution margin, the revenue generated by better-quality workers less the costs associated with them, and found that the average value of y and SDy were both higher than SDy estimates derived by scaling average salary levels (CREPID), but were only slightly higher than those based on revenue (Schmidt, et al., 1979).

Summary. Although costs, sales and profits have enjoyed some attention as payoff functions, any payoff function's usefulness should be judged in terms of its ability to better describe, predict, explain and enhance decisions. Because UA models focus on the consequences of improving the quality of an organization's labor force, a fundamental consideration is how the organization uses quality improvements. In some organizations, improved work force quality may reduce costs (e.g., through reduced staffing levels), but maintain the same quantity, quality and price of output. In other applications, the quantity of output may be increased, while maintaining the quality and price. A revenue-based utility scale (e.g., "output as sold") will reflect the objectives in the latter situation but not in the former (because revenue doesn't change in the former), and vice versa for a cost-based utility scale.
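The contrast between the two situations just described can be made concrete with a small sketch. It takes profit as revenue less costs, in line with the profit definition discussed earlier, and compares what a revenue-based scale, a cost-based scale, and a profit-based scale each register. The class name, function name, and all figures are hypothetical, and the assumption that total cost stays constant when quantity rises is a simplification made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Yearly organizational outcome (all figures hypothetical)."""
    units_sold: float
    price_per_unit: float
    total_cost: float

    @property
    def revenue(self) -> float:
        return self.units_sold * self.price_per_unit

    @property
    def profit(self) -> float:
        return self.revenue - self.total_cost


def payoff_changes(before: Outcome, after: Outcome) -> dict[str, float]:
    """Change registered by revenue-, cost- and profit-based payoff scales."""
    return {
        "revenue_increase": after.revenue - before.revenue,
        "cost_reduction": before.total_cost - after.total_cost,
        "profit_increase": after.profit - before.profit,
    }


if __name__ == "__main__":
    baseline = Outcome(units_sold=1000, price_per_unit=50, total_cost=40_000)
    # Quality gains used to cut staffing: same output and price, lower cost.
    cost_use = Outcome(units_sold=1000, price_per_unit=50, total_cost=35_000)
    # Quality gains used to raise output: same price; cost held constant here.
    quantity_use = Outcome(units_sold=1100, price_per_unit=50, total_cost=40_000)
    print("Cost-reduction use:", payoff_changes(baseline, cost_use))
    print("Quantity use:      ", payoff_changes(baseline, quantity_use))
```

In this illustration the revenue scale registers nothing in the cost-reduction case and the cost scale registers nothing in the quantity case, while the profit scale registers the gain in both.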
A profit-based utility concept encompasses the objectives of both, because quantity, quality and cost are included (though they may not vary in every decision). One can use a revenue-based payoff function in estimating SDy and add other parameters to the UA model so that overall utility values reflect a profit focus. In Table 3, several studies have adopted a total utility function reflecting profit contribution (Burke & Frederick, 1986; Mathieu & Leonard, 1987; Rich & Boudreau, 1987), though their SDy measures reflected revenue increases. This financial/economic approach (Boudreau, 1983a) will be discussed subsequently.

Comparing the psychometric characteristics (e.g., inter-rater consistency, distribution shape, mean) of SDy estimates resulting from different payoff functions can provide interesting insights about measurement, but research addressing decision processes and outcomes should place the estimates within a decision context, and apply them to actual decisions. All existing payoff scales reflect a concern with productivity-based outcomes, virtually ignoring other factors that might be affected by selection decisions (e.g., community relations, work force attitudes, adherence to a code of ethics). Thus, every payoff function is deficient in some way. While deficiency is characteristic of all models (because models simplify reality), research should address the effects of incorporating these broader outcomes into actual decisions.

Effects of Jobs Studied

A wide variety of occupations has been examined, with the occupation usually determined by the research setting presented to the researchers (SDy studies often occur within a validation study). Different occupations should exhibit different SDy values.
Occupations in which workers exercise more discretion regarding production and/or where variation in production has large implications for organizational goals should exhibit higher SDy values than jobs without these characteristics. However, this effect may be reduced if variability in skills and motivations among applicants is negatively related to discretion and variation in productivity. Even jobs with high discretion and variability may emerge with lower SDy values when their employees/applicants have low ranges of skill and motivation. Most SDy studies examine only one job, making across-job comparisons difficult (because jobs, measurement methods, settings and time periods are confounded). Five studies employed more than one job (Wroten, 1984; Eaton, Wing & Lau, 1985; Mitchell, Eaton & Wing, 1985; Day & Edwards, 1987; Mathieu & Leonard, 1987). Wroten (1984) did not statistically test the effect of jobs on SDy estimates, but his results indicate different rankings of jobs based on SDy level for each estimation method. Similarly, although Eaton, Wing and Lau (1985) found a significant effect of MOS (job type) for their GLOBAL estimation technique, they found no such significant effect for the EQV technique. Nor did they find that military rank or the interaction between rank and MOS were significantly related to SDy. Mitchell, Eaton and Wing (1985) found very similar results for Crewman and Transport Operators. Day & Edwards (1987) found higher SDy values for Account Executives than for Mechanical Foremen, with the differences more pronounced for more subjective and global estimation methods. Mathieu & Leonard (1987) found similar SDy levels for bank Head Tellers and Operations Managers, but substantially higher values for Branch Managers.
Thus, in studies estimating SDy in different jobs, there is mixed evidence of across-job variation in SDy.

Hunter, Schmidt & Judiesch (1988) studied the effects of occupational complexity, defined using Hunter's (1980) system, on performance variability. Instead of examining SDy estimates, however, they examined the ratio of the standard deviation of output to mean output (SDp). Across many studies, they used the actual reported ratio of the standard deviation to the mean. For low or medium complexity jobs that reported only the ratio of highest-performer output to lowest-performer output, they used a formula that assumed normality (Schmidt & Hunter, 1983, p. 408). For some high complexity jobs (attorneys, physicians and dentists) they used the mean and standard deviation of income. They corrected observed distributions to reflect a constant time period. They conclude that for incumbents in routine clerical or blue-collar work, SDp is about 15%, in medium complexity jobs it is 25%, and in high-complexity jobs it is 48%. For life insurance sales it was 97%, and for other sales it was 42%. After correcting for selective hiring (assuming applicants were hired using general mental ability), they estimate that the progression from low to medium to high-complexity jobs is 20% to 30% to 50%. For life insurance sales it is 123.8% and for other sales it is 54.2%. To the extent that dollar-valued productivity is linearly related to these productivity measures, one could expect SDy to rise with job complexity as well.

Pitfalls of job descriptions as group identifiers. Every study used job titles to distinguish the group for analysis. By using job titles to identify employees holding similar job duties and tasks, existing research may be inadvertently including across-job differences in the SDy measure.
For example, although computer programmers may all hold the same job title, certain programming jobs may involve primarily transcribing flowcharts into computer code, while other programming jobs may involve designing the logic of the program (Rich & Boudreau, 1987). Clearly the latter job has more potential for both valuable positive contributions and costly mistakes. Yet, existing SDy measurement methods would include both groups in the SDy estimate. If the selection test will primarily be used to select programmers assigned as coders, this will overstate SDy (and vice versa). Still, the Hunter, et al. (1988) study suggests that even job titles may be sufficient to detect consistent differences in SDy according to job complexity.

The Focus Population

The focus population is the population of individuals over which variability occurs. Virtually all SDy measurement methods focus on job incumbents. The incumbent population is most familiar to job supervisors who provide the SDy estimates, and it is the only population on which actual output information exists. However, the incumbent population is not strictly the appropriate population of interest for most utility models. For selection utility models, the appropriate population is the applicant population to which the selection procedures will be applied. This population may differ from the incumbent population for a number of reasons. First, certain procedures may operate to make the incumbent population a restricted sample of applicant job performance (for example, promoting the best performers and dismissing the worst performers), as discussed by Hunter, et al. (1988) and Schmidt, et al. (1979). Such a situation would make SDy estimated on job incumbents a downward-biased estimate of SDy in the applicant population.
Second, the applicant population may change over time due to different recruitment procedures or labor market influences (Becker, 1988; Boudreau & Rynes, 1985). Such influences may operate either to increase or decrease performance variability among applicants, and produce applicant SDy levels either higher or lower than the variability among job incumbents. Third, SDy estimates based on job incumbents encourage estimators to consider all of the incumbents in their experience. This group includes incumbents with very different tenure levels. If performance varies with tenure, then job incumbent variability will reflect this. However, each cohort of hired applicants will have equal tenure throughout their employment, removing this source of variability within cohorts of selectees. Thus, where job tenure and performance are related, SDy estimates based on job incumbents will include variability not present in applicants and will tend to overestimate. However, Greer & Cascio (1987) examined this possibility in a sample of beverage salesmen, and found that though wide variations in tenure existed, tenure was not significantly correlated (r = 0.118) with dollar-valued output estimates. Fourth, as noted earlier, UA studies have grouped employees with similar job titles to form the focus population. If task assignments or work environments differ within the same job, the variability of performance may differ as well. This is not a problem if selected individuals are assigned to tasks and environments in the same proportion as the incumbent group. If, however, entering employees tend to be assigned to specific tasks or environments (perhaps with less chance for error), then SDy estimates based on incumbent populations may be inaccurate reflections of the actual SDy in the selection system (Bobko, et al., 1983; Boudreau & Rich, 1987).
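The first of these points (incumbents as a restricted sample of applicant performance) can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the normal distribution of applicant dollar-valued performance, the $50,000 mean, the $10,000 standard deviation, and the removal of the top and bottom 10% are all hypothetical figures chosen for illustration, not values from any study cited here.

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical applicant population: yearly dollar-valued performance,
# assumed normal with mean $50,000 and standard deviation $10,000.
applicants = [random.gauss(50_000, 10_000) for _ in range(10_000)]

# Assumed restriction: the worst 10% are dismissed and the best 10% are
# promoted out of the job, leaving the middle 80% as incumbents.
ranked = sorted(applicants)
cut = len(ranked) // 10
incumbents = ranked[cut:-cut]

sd_applicants = stdev(applicants)
sd_incumbents = stdev(incumbents)

print(f"Applicant SDy: {sd_applicants:,.0f}")
print(f"Incumbent SDy: {sd_incumbents:,.0f}")
```

Under these assumptions the incumbent standard deviation is smaller than the applicant standard deviation, which is the sense in which an incumbent-based estimate would be conservative. The sketch does not model the second, third, or fourth points, any of which could push the bias in the other direction.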
Most authors argue that incumbent-based SDy estimates are conservative due to restricted range. However, there is no evidence regarding the possible biasing effects of different recruiting approaches, or different labor market conditions.

Measurement Techniques

Without doubt, the research question most addressed by existing UA research is whether using different SDy measures produces different SDy values. Table 4 attests to this fact, indicating that the vast majority of studies compare one SDy estimation method to another. Authors customarily argue that because SDy was characterized as the Achilles' Heel of utility analysis by Cronbach and Gleser (1965) and because differences in SDy can cause such large differences in total utility estimates (because SDy is multiplied by many other factors in the selection utility formula), it is important to develop better SDy measures.

SDy measurement methods fall into four categories: (1) Cost Accounting, which refers to methods in which accounting principles are used to attach a value to units of performance or output for each individual, with the standard deviation of these individual performance values representing SDy (e.g., Roche, 1961; Van Naerssen, 1963; Schmidt & Hoffman, 1973; Lee & Booth, 1973; Greer & Cascio, 1987); (2) Global Estimation, where experts are asked to provide estimates of the total yearly dollar valued performance at two, three, or four percentiles of a hypothetical performance distribution, and average differences between these percentile estimates represent SDy (e.g., Cascio & Silbey, 1979; Schmidt, et al., 1979; Hunter & Schmidt, 1982; Bobko, et al., 1982; Burke & Frederick, 1984; Schmidt, Mack & Hunter, 1984; Wroten, 1984; Bolda, 1985; Eaton, Wing, & Lau, 1985; Eaton, Wing & Mitchell, 1985; Mitchell, Eaton & Wing, 1985; Weekley, et al., 1985; Burke, 1985; Burke & Frederick, 1985; Mathieu & Leonard,
1987; Rich & Boudreau, 1987); (3) Individualized Estimation, which refers to methods in which some measurable characteristic of each individual in the sample (e.g., pay, sales activity, performance ratings) is translated into dollars using some scaling factor such as average salary or average sales, with the standard deviation of these values representing SDy (e.g., Janz & Dunnette, 1977; Cascio, 1980; Arnold, et al., 1982; Dunnette, et al., 1982; Bobko, et al., 1983; Ledvinka, et al., 1983; Burke & Frederick, 1984; Reilly & Smither, 1985; Cascio & Ramos, 1986; Eulberg, O'Connor & Peters, 1985; Greer & Cascio, 1987); (4) Proportional Rules, which involve multiplying the value of some available productivity-related variable (e.g., average wage, average sales, average productivity value) by a proportion to arrive at an SDy estimate (e.g., Hunter & Schmidt, 1982; Schmidt & Hunter, 1983; Eaton, Wing & Lau, 1985; Weekley, et al., 1985; Cascio & Ramos, 1986; Eulberg, et al., 1985; Mathieu & Leonard, 1987; Schmidt, et al., 1986).

Cost accounting. As noted above, the initial concept of a payoff function measured in dollars (Brogden & Taylor, 1950) proposed using cost accounting to attach a dollar value to production units based on their contribution to organizational profit. Then, the number of units produced by each individual in a sample (over a constant period of time) is recorded, and each unit produced is multiplied by its profit contribution, producing a dollar-valued productivity level for each individual. The standard deviation of these values is used as SDy. One early study attempted this technique (Roche, 1961, in Cronbach & Gleser, 1965). In summarizing the method, Roche notes that "many estimates and arbitrary allocations entered into the cost accounting" (Cronbach & Gleser, 1965, p. 263), and Cronbach's comments note that it is possible the accountants did not fully understand the utility estimation problem (p. 266-267). Cascio and Ramos (1986, p.
20) also discuss the difficulties they encountered in applying a cost-accounting approach to SDy estimation for telephone company managers. Greer & Cascio (1987) applied cost accounting to estimate productivity of route salesmen in a midwestern U.S. soft drink bottling company. Their method involved estimating the "contribution margin" (revenue less variable costs) associated with selling cases of different sizes and types, multiplying that by the number of cases sold by each salesman, and then multiplying that by the percentage of sales attributable to route salesman effort on each route. This produced an estimate of the contribution margin for each route salesman, and the standard deviation of these values represented SDy. The difficulty and arbitrariness of the cost-accounting methodology have frequently been cited as arguments in favor of simpler methods (e.g., Cascio, 1980; Cascio & Ramos, 1986; Hunter & Schmidt, 1982; Schmidt, et al., 1979), because although cost accounting methods are complex, costly and time consuming, they are still prone to arbitrary estimation and subjectivity, especially in jobs for which there is no identifiable production unit, such as managerial jobs.

Global Estimation. This SDy measurement method, first proposed by Schmidt, et al. (1979), involves having experts estimate the dollar value of several points on a hypothetical distribution of performance (usually the 15th percentile, the 50th percentile, and the 85th percentile). If the average difference between the 15th and 50th percentiles is not significantly different from the difference between the 85th and 50th percentiles, the presumption of normally-distributed payoff levels is accepted, and the average of the two differences is used as the SDy estimate. This procedure has the advantage of being relatively simple and straightforward to administer (Schmidt, et al., 1979; Cascio, 1980). Schmidt, et al.
(1979) proposed that obtaining estimates from a sample of experts would cancel any individual biases.

Studies using such global SDy estimates have generally produced large SDy values and resulting large utility values, so the global estimation procedure has been the subject of substantial study by researchers interested in testing its reliability and construct validity, producing several controversies. First, subjects frequently find the task of estimating the dollar value of performance distribution percentiles somewhat difficult. Some respondents gave inconsistent percentile estimates, found the task difficult, refused to do the task, or provided percentile estimates extremely different from the others (e.g., Bobko, et al., 1983; Mitchell, Eaton & Wing, 1985; Reilly & Smither, 1985; Mathieu & Leonard, 1987; Rich & Boudreau, 1987). In these cases, and even where authors do not report subject difficulty, the inter-rater variability in SDy estimates is usually as large as or larger than the mean SDy estimate. The SDy Values column of the table indicates the within-sample standard deviation of SDy estimates (abbreviated SD), where reported. Such high inter-rater variability is disturbing, of course, because it suggests that the measures may be capturing bias or error. DeSimone, et al. (1986) explicitly examined the inter-rater and temporal stability of their SDy estimates, and found both of them to be low. Weekley and Gier (1986) also noted inconsistencies in Global estimates across a three-month period. This has led some researchers to suggest new measurement methods (discussed below); others have suggested or investigated variations on the global estimation method designed to improve consensus.
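The arithmetic of the global estimation procedure can be sketched in Python as follows. The three raters and their percentile estimates are invented for illustration, and the function name `global_sdy` is ours; only the averaging of the two percentile differences follows the procedure described above.

```python
from statistics import mean, stdev


def global_sdy(p15, p50, p85):
    """One rater's SDy: the average of (85th - 50th) and (50th - 15th)."""
    return ((p85 - p50) + (p50 - p15)) / 2.0


# Hypothetical dollar-value estimates from three raters for the
# 15th, 50th, and 85th percentile performers (illustrative numbers only).
raters = [
    (20_000, 30_000, 42_000),
    (15_000, 35_000, 50_000),
    (25_000, 28_000, 60_000),
]

per_rater = [global_sdy(*estimates) for estimates in raters]
sdy_estimate = mean(per_rater)     # pooled SDy across raters
inter_rater_sd = stdev(per_rater)  # spread of the raters' own SDy values

print(per_rater)
print(round(sdy_estimate, 2), round(inter_rater_sd, 2))
```

Comparing `inter_rater_sd` with `sdy_estimate` is one simple way to see the inter-rater variability discussed above. Note also that the third hypothetical rater gives very unequal upper and lower differences, yet receives the same SDy as the second rater; averaging the two differences hides such asymmetry.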
The most frequently used tactic is to provide an anchor for the 50th percentile (e.g., Bobko, et al., 1983; Burke & Frederick, 1984; Wroten, 1984; Eaton, Wing & Mitchell, 1985; Burke, 1985; Burke & Frederick, 1986), which is supported by evidence of a high correlation between 50th percentiles and SDy (e.g., Bobko, et al., 1983; Schmidt, Mack & Hunter, 1984; Wroten, 1984; Edwards, et al., 1988). Research comparing the anchored method to the unanchored method generally suggests that providing anchors reduces inter-rater variability (e.g., Burke & Frederick, 1984; Wroten, 1984), but that the value for the anchor is positively related to the SDy values that result. Another frequently-used tactic is to have groups of raters provide consensus judgments of different percentiles (e.g., Burke & Frederick, 1984; Wroten, 1984). Some researchers (e.g., Burke & Frederick, 1984; Mathieu & Leonard, 1987) simply drop inconsistent or outlier values on the assumption that they represent error, though there is no theory or empirical data to suggest how inconsistent or unusual an estimate must be to qualify for deletion as an outlier.

A second controversy involves the underlying assumption of normality inherent in the global estimation approach. Averaging the differences between the 15th and 50th percentiles with the differences between the 85th and 50th percentiles presumes a normally-distributed dollar-valued performance distribution. This assumption is often justified by failing to reject the hypothesis that the means of the two differences are equal, but this amounts to accepting a null hypothesis. In view of the large inter-rater variability associated with these measures, it seems possible that failure to reject this hypothesis may be due to measure unreliability rather than to an underlying normal distribution.
Some studies have suggested non-normal performance distributions or significantly different percentile estimates (e.g., Bobko, et al., 1983; Burke & Frederick, 1984; Schmidt, Mack & Hunter, 1984; Burke, 1985; Rich & Boudreau, 1987). However, other studies found no significant differences, and there is evidence that actual performance distributions follow a normal distribution (Hunter & Schmidt, 1982). Some researchers examined this issue by including an additional percentile estimate (the 97th percentile), which would be expected to differ from the 85th percentile by the same amount that the 85th and 15th percentiles differ from the 50th. Bobko, et al. (1983) and Burke & Frederick (1984) found the difference between the 97th and 85th percentiles significantly smaller than the other two, suggesting either a non-normal underlying distribution or that estimating the 97th percentile taps a different estimation process than estimating the other three percentiles.

A third controversy arises because the initial research on the global estimation method provided no information to indicate what processes are used in arriving at the SDy estimates (e.g., what anchors respondents use, what performance attributes they consider, and whether similar anchors and attributes are considered by different experts). This has prompted a few researchers to investigate the judgment processes underlying SDy. Bobko, et al. (1983) noted that sales managers reported using pay as an anchor for their estimates of "overall worth." Burke & Frederick (1984) gathered anecdotal data following their main study, and found that supervisors of sales managers reported using five dimensions: (1) management of recruiting, training and motivating personnel; (2) amount of dollar sales achieved; (3) management of sales coverage; (4) administration of performance appraisal; and (5) forecasting and analyzing sales trends.
Burke (1985) found that supervisors of clerical workers followed job evaluation dimensions in their judgments, with salary-related factors most frequently used.

How accurate is the global estimation technique? Only limited evidence exists, usually based on arguably deficient objective performance measures (sales performance). Bobko, et al. (1983) found that the actual distribution of sales revenue (number of policies sold times average policy value) for sales counselors was normally distributed, and that the SDy estimate based on the average of the 85th minus 50th and the 50th minus 15th percentiles was not significantly different from SDy based on the actual sales distribution, although the percentile estimates were quite different. DeSimone, et al. (1986) found the opposite results. However, when respondents in the Bobko, et al. (1983) study were asked to consider the "overall worth of products and services" and "what you would pay an outside organization to provide them," the values were only about one-tenth the actual sales standard deviation, and apparently anchored on pay levels rather than sales. Burke and Frederick (1984) also found SDy estimates of overall worth were lower (about one percent of the actual sales standard deviations), and anchored on various activities including sales. Reilly and Smither (1985) found that graduate students participating in a business simulation, who had been provided with data to estimate actual standard deviations, produced global SDy estimates slightly higher than the simulation information for repeat sales and new sales, and much higher than the simulation for net revenue. The SDy estimate of overall worth was 49% of actual repeat sales, 3.45 times actual new sales, and 1.92 times actual net revenue. DeSimone, et al. (1986) found that the global SDy estimate for medical claims approvers was 19% of the compensation-weighted standard deviation of actual claims approved.
Greer & Cascio (1987) found no significant differences between the global SDy measure and their cost accounting estimate. Thus, the research comparing global SDy estimates to objective performance is sparse, and the results are mixed.

Individualized estimation. This is similar to the cost-accounting method in that it attempts to attach a dollar value to the output of each individual in a sample, the standard deviation of those dollar values becoming the SDy estimate. However, more recent versions of this approach have foregone the complex and costly cost-accounting approach in favor of approaches derived from industrial psychology and HR management practices. Cascio (1982; 1987) and Cascio & Ramos (1986) have developed the CREPID (Cascio-Ramos Estimate of Performance In Dollars) method. This method breaks a job into important "principal activities." Then, each activity is rated on two dimensions--time/frequency and importance (originally, difficulty and consequence of error were also included), and the ratings multiplied to give an overall weight to the activity. The proportion of total weights becomes the final importance weight assigned to each activity. To assign a dollar value to each activity, average salary for the job is divided among the activities according to the proportional importance weights. After this "job analysis" phase, supervisors are asked to rate a sample of employees in terms of their performance on each principal activity, using a 0 to 200 scale, "with a value of 100 points indicating average performance ('This employee is better than 50% of those I've seen do this activity'). A value of 200 indicated that the employee was better than 99% of those the supervisor had seen do the activity, and a value of 50 indicated that the employee was better than 25% of those he or she had seen do the activity.
A value of 0 indicated that the employee was the worst the supervisor had seen do the activity." (Cascio & Ramos, 1986, p. 22). Then, to translate these ratings into dollars, the ratings are divided by 100 (to produce a 0 to 2.0 scale) and these are multiplied by the dollar value assigned to that activity. Finally, after each employee has been assigned a dollar value for each activity, these values are summed over activities to provide the total dollar value of yearly performance for that employee. Thus, a person performing better than 50% of the incumbents the supervisor has experienced on all dimensions will receive a dollar value equal to the average yearly salary for that job. A person performing better than 99% of all incumbents will receive a dollar value equal to twice the average salary, and the worst performer each supervisor has experienced will receive a dollar value of zero. Edwards, et al. (1988) modified the basic CREPID procedure applied to District Sales managers by substituting archival data for either performance ratings, the job analysis ratings, or both. They found that SDy levels were similar for the original procedure, and when substituting either performance or job analysis archival information, but much smaller when using archival data for both performance and job analysis (see Table 4).

Janz & Dunnette (1977) also proposed identifying critical job activities. However, rather than allocating salary to each activity based on its time/frequency and importance, the Janz and Dunnette procedure requires job experts to estimate the "relative dollar costs associated with different levels of effectiveness on each of the various job performance dimensions" (p. 120). This requires tracing the consequences of the various levels of effectiveness to determine their impact on activities to which costs and/or value can be attached.
For example, different levels of equipment maintenance effectiveness might be traced to breakdowns, which in turn can be traced to repairs, which in turn can be traced to dollar losses due to repair costs and/or lost productivity during repair. Different levels of effectiveness would produce different levels of breakdowns, repair costs and lost productivity. This method was applied to power plant operators by Dunnette, et al. (1982), producing results that supported the high SDy values derived using the Schmidt, et al. (1979) global estimation method for the same jobs (see Table 4).

Another individualized estimation approach involves having experts directly assign dollar values to individual employees. Bobko, et al. (1983) used this method to derive an SDy estimate based on sales levels (sales volume times average policy value), with each person's yearly sales representing the individual value estimate. Burke and Frederick (1984) also used individual sales levels. Wroten (1984) adopted a similar approach, but did not have sales data available. He simply asked his supervisors to provide a direct estimate of the yearly dollar value of each employee's performance. Ledvinka, et al. (1983) and DeSimone, et al. (1986) used total payroll plus benefits divided by the number of insurance claims as the value per claim, and then multiplied this value by the actual standard deviation of claims processed. Greer & Cascio (1987), as noted, multiplied the quantity of cases sold by an estimate of the contribution margin per case. Day & Edwards (1987) proposed a "return on investment" approach that calculated "average annualized investments" for a job as total compensation plus benefits plus 40% overhead.
Supervisors estimated the percentage return on this investment represented by each of the seven points on their existing performance appraisal form, with the product of this percentage and the average annualized investment representing the value of that performance rating. Each person's value was estimated according to the "ROI" value of their performance rating.

Individualized estimation has the advantage of assigning a specific value to each employee that can be explicitly examined and analyzed for its appropriateness. Such analysis might be useful in determining which individual attributes contribute to differences in payoff values. It may provide a more understandable or credible estimate to be communicated to those familiar with the job. The very limited evidence on this issue is mixed. Greer & Cascio (1987, p. 594) stated that four top managers and an accountant preferred CREPID. Day & Edwards (1987) found no significant differences in managerial confidence ratings for different estimation methods. Edwards, et al. (1988) found supervisors perceived their CREPID job analysis ratings as more accurate than their global utility value estimates, but found the CREPID less "doable"/feasible than Procedure B. These tests do not directly examine the effects of SDy estimation methods on confidence or accuracy of decisions.

Each method makes certain basic assumptions regarding the nature of payoff. CREPID is based on the assumption that the average wage equals average productivity, a position frequently questioned in economic theory (Becker, 1964; Bishop, 1987; Frank, 1984; Rynes & Milkovich, 1986) and clearly violated in organizations with tenure-based pay systems, pay systems based on rank, hourly-based pay systems, and where training may have different value to different organizations.
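To make the wage-equals-productivity assumption concrete, the CREPID arithmetic described earlier can be sketched in Python. The activity names, ratings, employee labels, and the $30,000 average salary are hypothetical, and the function name `crepid_values` is ours; the sketch follows only the steps summarized in this chapter and should not be read as Cascio and Ramos's own implementation.

```python
from statistics import stdev


def crepid_values(avg_salary, activities, ratings):
    """Return a dollar value of yearly performance for each employee.

    activities: {activity: (time_frequency, importance)} job analysis ratings
    ratings:    {employee: {activity: 0-200 performance rating}}
    """
    # Weight of each activity = time/frequency rating x importance rating.
    weights = {name: tf * imp for name, (tf, imp) in activities.items()}
    total_weight = sum(weights.values())
    # Average salary is divided among activities in proportion to the weights.
    dollars = {name: avg_salary * w / total_weight for name, w in weights.items()}
    # Ratings are rescaled to 0-2.0 and multiplied by each activity's dollars.
    return {
        employee: sum(dollars[a] * r[a] / 100.0 for a in dollars)
        for employee, r in ratings.items()
    }


# Hypothetical job with two principal activities and three employees.
activities = {"selling": (4, 5), "reporting": (2, 5)}
ratings = {
    "A": {"selling": 100, "reporting": 100},  # average on every activity
    "B": {"selling": 200, "reporting": 200},  # top-rated on every activity
    "C": {"selling": 50, "reporting": 100},
}

values = crepid_values(30_000, activities, ratings)
sdy = stdev(values.values())
print(values, round(sdy, 2))
```

In this sketch the employee rated 100 on every activity is valued at exactly the average salary, and the employee rated 200 throughout at twice the average salary, so the resulting SDy is tied directly to the salary level chosen.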
Sales-based measures are based on the assumption that sales captures sufficient performance differences to be useful (an assumption that may omit important job tasks, such as training, that reduce an individual's sales but increase the group's sales); and the Janz-Dunnette measure assumes that job behaviors' effects on costs and revenues can be accurately traced by managers. Such estimation methods are usually more complex, costly and time consuming than the direct estimation methods, which may provide perfectly adequate SDy values for many decisions (as discussed below).

Proportional rules. The final SDy measurement method emerged from observations concerning the relationship between SDy estimates and average salary levels, and from the desire to provide a straightforward SDy measurement method. The method involves multiplying average salary in a job by some proportion (e.g., between 40% and 70%) to derive the SDy estimate for the incumbent employee group. Hunter and Schmidt (1982, pp. 257-258) reviewed empirical studies for which an SDy estimate was reported or could be derived. They compared the SDy estimates to reported average salary levels (or made assumptions about average salary levels), and discovered that on average SDy was about 16% of average salary in previous studies. They observed that these values "refer to only partial measures of value to the organization" (p. 257) because they generally relied on partial job performance measures (e.g., tenure or reduced training costs). The authors also reviewed two of their own studies employing the global estimation procedure, where SDy was 60% of annual salary in one study of budget analysts and 55% of annual salary in another study of computer programmers. They estimated that "the true average for SDy falls somewhere in the range of 40 to 70% [of average salary]" (p. 258).
In a follow-up investigation, Schmidt and Hunter (1983) proposed the following logic: In the United States economy (based on National Income Accounting methods), wages and salaries make up approximately 57% of the total value of goods and services produced. Therefore, if we knew the ratio of SDy to mean salary, we could multiply that ratio by .57 to obtain a predicted ratio of SDy to mean output value. Thus, if the ratio of SDy to salary ranges between 42% and 60%, the ratio of SDy to output value should fall between 23% and 34%. To test this logic, the authors reviewed studies reporting empirical data on productivity levels measured in units of output. Their review indicated that for studies examining non-piece-rate situations the average ratio was .185 (standard deviation of .052), for studies examining piece-rate situations the average ratio was .150 (standard deviation of .044), and for studies with uncertain compensation systems the average ratio was .215 (standard deviation of .067). Though all three average ratios fell below the lower bound predicted (the values for both the non-incentive and incentive conditions were statistically significantly lower), the authors made five observations: (1) that their method was "intended to apply to jobs without incentive based compensation systems"; (2) that even the lowest mean value is "still 77% as large as the predicted lower bound value" (p. 409); (3) that the studies reviewed "reflect primarily quantity of output; quality of output is probably reflected only crudely in these figures"; (4) that quality and quantity have been found positively correlated in some studies (p. 411); and (5) that the reviewed studies were conducted on "blue collar skilled and semiskilled jobs and lower level white collar jobs", while their studies were conducted on higher level jobs where errors may be more expensive (p. 412).
These observations led them to conclude that "researchers examining the utility of personnel programs such as selection and training can estimate the standard deviation of employee output at 20% of mean output without fear of overstatement", and that "the findings of this study provide support for the practice that we have recommended of estimating SDy as 40% of mean salary" (p. 412). Schmidt, et al. (1986, p. 5) state "the standard deviation of employee output can safely (if conservatively) be estimated as 20% of mean output, or alternatively, 40% of mean salary."

Hunter, et al. (1988) extended this research by analyzing the ratio of output standard deviation to mean output (SDp) in a larger sample of jobs, including high-complexity and sales jobs as well as those analyzed earlier. They employed new corrections for unreliability that reduced observed SDp, corrected for restricted range assuming selection on general mental ability, and used variability in salary levels as a proxy for output variability in professional jobs (see the earlier section on the Focus Population). Their findings suggested that as one moves from routine to medium complexity to professional work, the SDp values progress from 20% to 30% to 50%.

The proportional rules proposed by Schmidt and Hunter are intriguing because they suggest that SDp estimation may be quite feasible in many applications where job complexity can be estimated, removing a major stumbling block to widespread utility measurement. However, knowing SDp allows one to estimate only the percentage increase in productivity likely from HRM programs. Determining whether such increases offset dollar costs, or whether to invest program resources in different jobs, requires assumptions or estimates of the dollar value of this percentage.
The assumption that average salary is equal to about half the average value of products "as sold" may be violated in tenure-based pay systems, negotiated pay systems, or due to labor market conditions such as unemployment and internal labor markets (e.g., Becker, 1975). Indeed, National Income accounting used to generate the national GNP and labor cost figures used by Schmidt and Hunter (1983) assigns the same value to both output and wages for jobs where output is not readily measurable (e.g., Government services), producing a ratio of wages to output of 1.0, not .57. Thus, the .57 figure represents an average around which specific jobs may vary.

Existing research provides limited support for the proportional rules applied to output, and less for proportional rules applied to salary. Table 4 shows that of the 44 SDy values from studies reporting mean productivity values, only two SDy values fell below 20% of mean productivity, with 13 falling in the predicted 20%-35% range, and 29 falling above 35%. However, of 66 values in studies reporting salaries, 24 fell below 40%, 18 within the predicted 40%-70% range, and 22 above 70%. The values falling above the ranges may reflect high-complexity jobs. In fact, Eaton, Wing & Lau (1985) concluded that 125% of base pay would be a conservative estimate of SDy for military personnel. Using 40% of salary may overestimate the SDy value that would result from other methods, but using 20% of mean output seems to be conservative compared to other measurement methods. Still, overly conservative SDy estimates may produce severely understated utility estimates, and possible rejection of potentially useful HRM programs. Clearly, the impact depends on the decision situation.

Evidence Directly Comparing Measurement Techniques. Wroten (1984) compared the Schmidt, et al.
method, individual subjective payoff estimates, and group consensus percentile estimates for six jobs, with either no anchor, high, low or "accurate" anchors. The means for six unanchored methods, and each of the three anchors are shown in Table 4 for each job. He found that unanchored SDy estimates had higher variance, that the mean unanchored SDy estimate was not significantly different from the actual anchored condition, but that it did differ significantly from both the high and low anchored conditions. He also found that individualized estimation usually produced less SDy variation than the global method.

Eaton, Wing & Lau (1985) compared the Schmidt et al. ("GLOBAL") technique to a variant of the proportional technique called "superior equivalents" (EQV) in which experts estimated the number of 85th and 15th percentile performers it takes to equal the work of 17 average performers (the value of an average performer anchored by either average compensation or the subjective estimate of the 50th percentile). They also used a new "system effectiveness technique" (EFF) in which the standard deviation of payoff is expressed as a proportion of mean payoff (in units of the cost of a tank). The underlying payoff scale of these latter techniques is cost savings (either in terms of payroll or tank costs). Results indicated that as a percent of the GLOBAL value, the EQV salary anchor technique was 66.7%, the EQV global anchor technique was 72%, and the EFF technique was 150%.

Eaton, Wing and Mitchell (1985) compared the GLOBAL technique (using only the 85th and 50th percentiles), the EQV technique and the 40%-70% of salary rule, producing SDy estimates for 5 military occupations (MOS). Over all 5 MOS, the average SDy of the GLOBAL technique was $9,387, and for the EQV technique was $14,990. As shown in Table 4, the EQV values were higher for every MOS.
The GLOBAL estimates always fell within the 40%-70% range of salary (though they always fell above 35% of the mean y estimate), while the EQV estimates were always higher than 70% of salary. The EQV technique produced a larger range but a lower between-subject dispersion in SDy values compared to GLOBAL. The EQV produced no significant differences by MOS, rank, or their interaction. The GLOBAL technique also produced no significant differences by rank or by the interaction of rank and MOS, but it did produce significant MOS differences (with Armor Crewmen having a lower SDy than Vehicle Mechanics, Medical Specialists, and Radio Operators).

Mitchell, Eaton & Wing (1985) explored whether job incumbents could provide usable SDy estimates, and studied the jobs of Motor Transport Operator and Cannon Crewman in the U.S. Army. They used the GLOBAL technique, the EQV technique and then the GLOBAL technique after feedback of dollar values for soldiers in other specialties. For both jobs, EQV produced the highest SDy values, GLOBAL next and feedback lowest, as Table 4 shows. The authors also reported that they had respondents delineate job tasks before making estimates and this "seemed to reduce extreme values."

Eulberg, O'Connor and Peters (1985) explicitly compared the SDy estimates provided by supervisors and job incumbents of the medical technician job in the U.S. Air Force. They used the CREPID method, applying the same performance ratings on each job dimension and the same average salary value for both groups' estimates. Each group provided its own set of importance ratings for the job dimensions. As Table 4 shows, the SDy values were quite similar (approximately $3,300 per year) for both methods and for the 40% of salary rule.
The authors find this convergence "remarkable", but it reflects only similar rankings of job task importance between both groups (because the same pay levels and performance ratings were used), and the mathematical properties of CREPID suggest it will produce values approximately 40% of pay (Raju, Burke & Normand, 1987; Reilly & Smither, 1986).

Reilly and Smither (1986) provided graduate students taking part in a management simulation with sales data on 10 employees, based on 3 job components (selling established products, selling new products and cost control). They used CREPID methods to obtain importance ratings on the 3 job dimensions and then compared these to the actual simulated data provided. They also obtained SDy estimates for each job component using Schmidt, et al. (1979) techniques. Both methods caused some confusion among subjects; there were no order effects. The Schmidt, et al. SDy values for established sales, new product sales, and cost control were significantly correlated (r > .68), but none of these were correlated with the SDy for "overall worth." The Schmidt, et al. (1979) estimates were slightly higher than actual for repeat sales, 13% higher than actual for new product sales, and 51% higher for revenue less costs. The CREPID SDy estimate was below all the Schmidt, et al. (1979) estimates, slightly higher than the actual SDy of new sales, and far lower than the actual SDy for repeat sales and net revenue. These inconsistencies are interesting because subjects had the information necessary to make exact calculations of the dollar-valued performance for each employee, but apparently failed to use it in their estimates, even under such "ideal" conditions.

Weekley, et al. (1985) compared CREPID to the Schmidt, et al. technique and to the 40% of salary rule for convenience store managers. They discovered very high variability in using the Schmidt, et al.
method, and this method produced a value almost twice as high as CREPID. The CREPID value was 36% of average salary and the Schmidt, et al. value was 66% of average salary. Cascio & Ramos (1986) also applied the CREPID technique (to telephone company managers) and found that it produced an SDy value roughly 35% of salary.

Desimone, et al. (1986) found that Global SDy estimates were much lower than compensation-weighted deviations in the number of processed claims. Similarly, Greer & Cascio (1987) found "cost-accounting" SDy estimates to be slightly higher than Global estimates for soft drink route salesmen. Greer & Cascio (1987) also found the CREPID method produced the lowest SDy estimate. Day & Edwards (1987) found that SDy values were highest for the Global and modified Global method, followed by the % ROI, and lowest for the 40%-salary and CREPID methods for Account Executives and Mechanical Foremen. Finally, Edwards (1988) found that the Global method with feedback (Burke & Frederick's, 1984 Procedure B) produced the highest SDy values, followed by various forms of the CREPID method.

CREPID estimates frequently fall near 40% of salary, and below more Global estimates, prompting the argument that because the CREPID scale is based on salary, it "considers only the contribution of labor not the combined contribution of labor, equipment, capital, overhead and profit, as does a standard based on the value of output as sold" (Greer & Cascio, 1987, p. 593). Edwards, et al. (1988, p. 533) also argue that average salary cost should be increased by the amount of benefits and "overhead" before scaling, to better reflect the "value of the total cost of services." The ROI method (Day & Edwards, 1987) proposes a similar scaling approach. It is undoubtedly true that larger scaling factors would increase CREPID estimates, but it is not clear that such adjustments are justified.
As noted above, salary (or salary plus benefits) will not necessarily reflect the average value of employees, and may overstate it. Selecting higher-quality employees often has little effect on expenditures for equipment, capital, and overhead, so including these factors as potential cost reductions seems inappropriate. Moreover, while all of these factors moderate the contribution of higher-quality labor to organizational goals, the SDy concept always reflects such contributions because it is estimated across employees within a particular mix of capital, equipment and overhead. Salary-scaled estimates may or may not reflect this quality, but the concept is no different whether scaled using salary or some other method. As noted in the discussion of payoff scales, the key question is how the high-quality labor will be used to enhance organizational goals, and this is likely to be situationally specific. SDy measures and post hoc adjustments should make their assumptions explicit. Simple proportional rules or compensation-based scaling factors may not generalize to every situation. Yet, where do such questions fit the decision-theory perspective on utility analysis?

Summary and Conclusion: The Need to Look Beyond SDy

Differences between SDy estimates using different methods are often less than 50% (and may be less than $5,000 in many cases). Still, these differences may be multiplied by factors of hundreds or thousands, depending on the number of employees selected, the validity of the device and the selection ratio (as shown in Table 3), in deriving the final total utility value. Even small SDy differences multiplied by such large values imply vast total utility differences. The tempting conclusion is that we need substantially more research on SDy measurement to whittle down such differences and provide more precise total utility estimates. This conclusion is not encouraging.
The unfortunate fact is that I/O psychology and Human Resource Management have produced no well-accepted measure of job performance differences (on any scale, let alone dollars). The task of estimating dollar-valued performance variability has proven confusing and difficult for some subjects and virtually always produces substantial disagreement among raters. When SDy estimates can be verified against an objective criterion (e.g., sales, units produced), the criterion is arguably deficient, leading to the conclusion that any observed differences in SDy estimation methods cannot serve as justification for using one measure over another (Day & Edwards, 1987; Burke & Frederick, 1984; Weekley, et al., 1985). Thus, measured against the accuracy of SDy estimates, the contribution of this research must await development of an acceptable dollar-valued performance measure. Greer & Cascio (1987, p. 594) state "researchers within the accounting profession must develop an objective, verifiable, and reliable method for estimating the standard deviation of job performance in dollars", but accounting systems are neither designed nor intended to reflect this variable. Of course, if a measure of y existed, we could derive SDy directly, making estimation unnecessary. Indeed, even the SDy concept would have little value because one could use the slope of the regression line in Equation 2 to predict selection utility. Thus, while SDy measurement research produces information on the variability across raters, methods or jobs, it is unlikely to provide information on measurement accuracy, nor is it likely to allow us to substantially reduce the uncertainty associated with total utility values.

SDy estimation research can advance measurement theory. Here, the value of the research rests not on its ability to better describe, predict, explain or enhance decisions, but rather to illuminate new aspects of measurement.
This may be a quite useful and legitimate application of the SDy concept. However, it is very different from utility analysis, and this difference should be made clear by those researchers pursuing measurement theory.

If SDy research is unlikely to produce the most accurate measure, and unlikely to alleviate uncertainty in utility estimates, then what is the role of SDy measurement research in advancing UA knowledge? The B-C-G utility model emerged from traditional psychological measurement theory, which focused only on standardized error terms but provided no context within which to evaluate them. UA models were formulated to better account for the decision context facing selection program managers. It seems ironic that after over 30 years, the major research efforts remain focused on measurement, taking little notice of the decision context in which such measures will be used. We have evolved from focusing only on the correlation coefficient, to focusing only on the SDy value. We must return to describing, predicting, explaining and improving decisions, taking into account the context within which those decisions must be made. This suggests several research issues which have been all but ignored in the rush to develop new SDy measures.

First, the effects of SDy measures on the perceived quality of the utility analysis should be examined. Though virtually every new measure is justified by proponents because it may produce more credible, understandable, or easily communicated utility values, not one study has directly addressed these issues. If decision makers find the utility values resulting from a relatively simple proportional rule just as credible as complex job-analysis-based methods (e.g., CREPID, Janz-Dunnette) or Global estimation (Schmidt, et al., 1979), they may have little motivation to pursue the latter to increase decision credibility. Of course, even decision-maker preferences are not the real issue.
Research (e.g., Kahneman & Tversky, 1972, 1973) shows that decision makers frequently prefer and use heuristics that are detrimental to decision quality. UA research must focus on the quality of decisions as well as decision maker preferences.

Second, and related to the first issue, we have little information on the relative effort and cost required to implement the different SDy measurement procedures. On their face, the proportional rules seem least complex, followed by the global estimation methods, followed by individual estimation methods, followed by the job-analysis based methods. In a sense, the burden of proof rests with those who would advocate more complex and costly measures to demonstrate that the improvement in decision quality or our ability to understand the decision process justifies the additional resources necessary to gather the information. The costs of the different SDy estimation methods have not been computed, though Cascio & Ramos (1986) noted that CREPID ratings took 15 minutes per employee and Edwards, et al. (1988, p. 532) noted that their managers felt the CREPID procedure took "too long".

Third, and most important, comparative SDy studies seldom estimate overall utility values for actual decisions, producing results that are completely devoid of any decision context. It is often impossible to tell whether the measurement differences detected would have made any difference to actual decisions. Yet, as Table 3 demonstrates, virtually every study that has accounted for the decision context (by computing a utility value) has produced extremely high utility values regardless of the SDy level. Weekley, et al.
(1985) proposed that while break-even SDy values are low when comparing implementing an HRM program to doing nothing (with zero cost and zero benefits), comparing HRM programs to other organizational investments might produce decision situations where differences in SDy estimates indeed affect the ultimate decision. Research incorporating such contextual variables could prove quite fruitful. Still, in the absence of any criterion against which to verify SDy values, one would be left with little basis for choosing one over another.

Further SDy measurement research seems unlikely to explain how apparently high HRM program payoffs can exist while the HRM function achieves low status and importance in organizational decision making. Answering that question, and developing decision models to alleviate the situation, requires that UA research explicitly recognize organizational decision contexts. The next sections discuss how utility models can better reflect such contextual factors, and link UA research to other fruitful research streams.

The Role of Uncertainty and Risk in Utility Analysis

How is it that UA research can simultaneously produce such clear evidence of HRM program payoff and such a raging debate on the proper measurement method for one utility parameter (SDy)? Although the expected utility values are quite high, if substantial uncertainty is associated with these utility estimates, and if that uncertainty results from uncertain SDy values, then reducing SDy measurement uncertainty will improve decisions. However, uncertainty affects all UA parameters, not just SDy. Just as all models are deficient, all predictions contain uncertainty. UA research cannot ignore this fact, but must instead embrace its implications for advancing understanding of decisions and decision processes.
Frameworks incorporating uncertainty (Alexander & Barrick, 1987; Boudreau, 1984a, 1987, 1988, in press; Cronshaw, Alexander, Wiesner & Barrick, 1987; Milkovich & Boudreau, 1988; Rich, 1986; Rich & Boudreau, 1987) change the focus of utility analysis from estimating the expected utility value to estimating both the expected value and the distribution of values. Measurement issues become relevant as they affect uncertainty in the decision situation. This framework emphasizes the role of utility value variability in changing decisions, rather than simply measuring the sources of that variability (e.g., SDy measurement error) in the absence of a decision context.

It is surprising that the issue of uncertainty and risk in utility analysis received little attention for so long, because decision theory has traditionally been concerned with decision making under uncertainty, and has recognized that the riskiness of alternatives plays a role in decision making. This emphasis has been especially evident in the literature on financial investment decision making (e.g., Bierman & Smidt, 1975; Hertz, 1980; Hillier, 1963; Hull, 1980; Wagle, 1967). If two alternative resource investments offer the same expected value, but offer substantially different risks of large losses (below the expected value) or large gains (above the expected value), rational decision makers should take such risks into account.

Four Alternative Approaches for Estimating Uncertainty

Rich and Boudreau (1987b) provided an initial conceptual framework for uncertainty in UA and empirically compared four alternative methods addressing uncertainty: (1) sensitivity analysis; (2) break-even analysis; (3) algebraic derivation of utility value distributions; and (4) Monte Carlo simulation analysis.

Sensitivity analysis.
Though existing utility models contain no parameters reflecting utility value variability, the notion that utility values represent estimates made under uncertainty has not been completely overlooked. Several previous utility analysis applications and demonstrations (e.g., Boudreau, 1983a, 1983b; Boudreau & Berger, 1985a; Cascio & Silbey, 1979; Cronshaw, et al., 1987; Florin & Boudreau, 1986; Schmidt, et al., 1979; Schmidt, et al., 1984) have addressed possible variability in utility parameters through sensitivity analysis. In such an analysis, each of the utility parameters is varied from its low value to its high value while holding other parameter values constant. The utility estimates resulting from each combination of parameter values are examined to determine which parameters' variability has the greatest effect on the total utility estimate. These sensitivity analyses virtually always indicate that utility parameters reflecting changes in employee quality caused by improved selection (i.e., rx,y, Zx, SDy) and the quantity of employees affected (i.e., Ns) have substantial effects on resulting utility values.

A variant of sensitivity analysis involves attempting to be as "conservative" as possible in making utility estimates. This approach has led researchers to produce clearly understated SDy values (Arnold, et al., 1982), or to estimate the 95% confidence interval surrounding the mean SDy value and use the value at the bottom of this interval in the utility computations (e.g., Cronshaw, et al., 1987; Hunter & Schmidt, 1982; Schmidt, et al., 1979, 1984). If estimated utility values remain positive despite such conservatism, it is presumed they will be positive in the actual application.
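The one-at-a-time procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the utility function below uses a common one-cohort form (number selected times tenure times validity times mean standardized predictor score times SDy, minus cost), which is assumed here rather than taken from this section, and every parameter value and name is hypothetical.

```python
def total_utility(n_selected, tenure_years, validity, mean_z, sdy, total_cost):
    """Assumed one-cohort utility: N * T * r * Z * SDy - C (illustrative form)."""
    return n_selected * tenure_years * validity * mean_z * sdy - total_cost


def one_at_a_time_sensitivity(base, ranges):
    """Vary each parameter from its low to its high value, holding the others at base."""
    results = {}
    for name, (low, high) in ranges.items():
        u_low = total_utility(**{**base, name: low})
        u_high = total_utility(**{**base, name: high})
        results[name] = {"low": u_low, "high": u_high, "swing": abs(u_high - u_low)}
    return results


# Hypothetical base case and low/high values, for illustration only.
base = {"n_selected": 100, "tenure_years": 5, "validity": 0.30,
        "mean_z": 0.80, "sdy": 10_000, "total_cost": 50_000}
ranges = {"validity": (0.20, 0.40), "mean_z": (0.40, 1.20),
          "sdy": (4_000, 16_000), "total_cost": (25_000, 100_000)}

report = one_at_a_time_sensitivity(base, ranges)
# List parameters from largest to smallest effect on total utility.
for name, r in sorted(report.items(), key=lambda kv: kv[1]["swing"], reverse=True):
    print(f"{name:12s} low={r['low']:>12,.0f} high={r['high']:>12,.0f} swing={r['swing']:>12,.0f}")
```

Because each parameter is moved alone, the sketch says nothing about what happens when several parameters change together, which is the limitation of sensitivity analysis discussed in the text.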
Though valuable in assessing the effects of individual parameter changes, sensitivity analyses provide no information about the effects of simultaneous changes in more than one utility parameter (though Boudreau & Berger, 1985a and Boudreau, 1986 expressed utility as a function of changes in several parameter levels, to present the effects of simultaneous changes in utility parameters more concisely). They also provide no information regarding the utility value distribution nor the probabilities associated with particular parameter value combinations (Hillier, 1963, p. 444). Moreover, when all parameters are estimated at their most conservative levels (a statistically unlikely event), one runs the danger of incorrectly concluding that some programs will not pay off.

Break-even analysis. Boudreau (1984a) proposed that a relatively simple and straightforward uncertainty analysis could be carried out by calculating the lowest value of any individual utility parameter (or parameter combination) that would still yield a positive total utility value. These parameter values were termed "break-even" values because they represent the values at which the HRM program's benefits are equal to ("even with") the program's costs. Any parameter values exceeding the break-even value would produce positive total utility values, and vice versa. Such logic is well-known in micro-economic theory and financial management (e.g., Bierman, Bonini & Hausman, 1981). Boudreau showed how to apply break-even analysis not only when considering one program option (i.e., where the alternative is to do nothing, incur no costs, but receive no benefits), but also when multiple alternatives are involved (with more expensive alternatives offering greater potential payoffs).
With multiple alternatives, one computes a series of decision rules specifying the range of parameter values that would justify choosing each alternative over the others (e.g., Boudreau, 1988; Milkovich & Boudreau, 1988).

The break-even approach is simple and focuses on the decision context. Boudreau proposed that break-even analysis allows decision makers to maximize the value of existing information, determine the critical values for the unknown parameters that could change the decision, and determine whether further measurement effort is warranted. Because controversy surrounded the accuracy and validity of SDy estimates, Boudreau (1984a) concentrated his analysis on that utility parameter, demonstrating that the break-even SDy values for the studies by Cascio and Silbey (1979), and Schmidt, et al. (1979) were substantially lower than the expected SDy value they derived.

Table 3 updates Boudreau's analysis to incorporate additional and more recent utility analyses. The column labeled "Payoff Function" presents incremental utility as a function of SDy. These payoff functions were derived for each study and each selection device. The coefficient multiplied by SDy in each equation was derived by dividing the total program payoff (before subtracting program costs) by the SDy value. The number subtracted from this product is the reported total cost. All equations express payoff in terms of total utility, but for studies that reported only per-person utility values, the payoff function reflects the utility of selecting one individual. The last column of Table 3 computes the break-even SDy value based on the payoff equation. This is simply the cost figure divided by the coefficient on SDy. For studies reporting no incremental cost for more valid selection, the implied break-even SDy value is also zero because any positive return justifies a costless program, so SDy becomes irrelevant.
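The payoff-function and break-even calculations described for Table 3 reduce to two divisions, sketched below in Python. The function names and all dollar figures are hypothetical and serve only to illustrate the arithmetic; they are not values from Table 3.

```python
def coefficient_from_payoff(gross_payoff, sdy):
    """Coefficient on SDy: total payoff before costs divided by the SDy used."""
    if sdy <= 0:
        raise ValueError("sdy must be positive")
    return gross_payoff / sdy


def payoff(sdy, coefficient, total_cost):
    """Payoff function: coefficient * SDy - total cost."""
    return coefficient * sdy - total_cost


def break_even_sdy(coefficient, total_cost):
    """SDy at which payoff is zero: total cost divided by the coefficient."""
    if coefficient <= 0:
        raise ValueError("coefficient must be positive")
    return total_cost / coefficient


# Hypothetical study: $1.2 million gross payoff at an estimated SDy of $10,000,
# with a total program cost of $50,000.
estimated_sdy = 10_000.0
coefficient = coefficient_from_payoff(1_200_000.0, estimated_sdy)
total_cost = 50_000.0

be = break_even_sdy(coefficient, total_cost)
print(f"break-even SDy = ${be:,.2f} ({be / estimated_sdy:.1%} of the estimate)")
```

Expressing the break-even value as a share of the estimated SDy, as in the last line, mirrors the comparison the text draws between the reported and break-even columns; a costless program yields a break-even value of zero.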
The equations and break-even values not only verify the earlier conclusion that HRM program utility is uniformly high, but also shed some light on the SDy controversy. Compare the reported SDy values (in the column labelled SDy) to the break-even values for each study. Without exception, the break-even SDy values fall at or below 60% of the estimated SDy value. In many cases, the value necessary to break even is less than 1% of the estimated value. The break-even SDy value exceeds 20% of the estimated value in only 6 of the 42 analyses. In three of these six cases (Burke & Frederick, 1986; Rich & Boudreau, 1987; Schmidt, Mack & Hunter, 1984), this reflects an interview with low validity. The break-even value determining whether to replace the interview with a more-valid predictor was much smaller in the latter two studies. In short, the vast majority of utility analysis applications conclude that the more-valid selection device is worth its extra costs. This conclusion would probably have been apparent without ever actually measuring SDy (or by measuring it in the simplest manner possible) because the break-even SDy values are so low that they often fall several standard deviations below the expected value. Rich & Boudreau (1987b) found that the break-even SDy value fell below the lowest value estimated by any of the subjects.

Boudreau's (1984a; 1987; 1988; in press) findings produced a similar conclusion, leading him to propose that future utility analysis research should use break-even analysis to put parameter measurement controversies into perspective. He speculated that many UA applications do not require costly and complex SDy measurement, but could simply present decision makers with the break-even SDy values and ensure that there is consensus that they would be exceeded.
Moreover, he proposed that such an approach may prove much less confusing and difficult for decision makers than attempting to derive an exact point estimate. In other words, the break-even approach suggests a mechanism for concisely summarizing the potential impact of uncertainty in one or more utility parameters. It shifts emphasis away from estimating a utility value, to making a decision using imperfect information. It pinpoints areas where controversy is important to decision making (i.e., where there is some doubt whether the break-even value is exceeded) versus areas where controversy has little impact (i.e., where disagreements about SDy do not indicate a serious risk of values below break-even). Thus, break-even analysis provides a simple expedient allowing utility analysis models to assist in decision making even when some utility parameters are unknown or uncertain. Measurement research (on SDy or other utility parameters) is not always unnecessary, but such research must consider the decision context, and report not only the magnitude of the uncertainty but also its likely effect.
Recent research incorporating Boudreau's break-even analysis approach has reached similar conclusions (e.g., Burke & Frederick, 1986; Cascio, 1987; Cascio & Ramos, 1986; Florin-Thuma & Boudreau, 1987; Karren, NKomo, & Ramirez, 1985; Mathieu & Leonard, 1987). Eaton, Wing and Lau (1985) also concluded that HRM program decisions in the military seldom hinge on differences of 10% or 20% of testing costs, so a rough estimate of SDy may often be sufficient for decision making.
Although relatively simple, break-even analysis is not without limitations. It is more difficult (but quite possible) to conduct break-even analysis when more than two or three utility parameters may vary.
Moreover, the distribution of utility values is not estimated, so two programs could have similar break-even values and similar expected utility values, but one might be preferable because its distribution may be more positively skewed. Neither traditional utility analysis, sensitivity analysis, break-even analysis nor algebraic derivation (discussed next) adequately reflects such situations.
Algebraic derivation of utility value variability. Goodman's (1960) equations for the variance of the product of three or more random variables under conditions of independence were adapted by Alexander and Barrick (1987) to produce a formula for the standard error of utility values associated with a one-cohort selection utility model. They demonstrated this derivation using data from the Schmidt, et al. (1979) study, as well as variance estimates for employee tenure, SDy, validity, and the number selected. Their standard deviations (estimated for various selection ratios) were about 50% of the expected utility values. By assuming a normal distribution of utility values, determining the utility value at the lower end of a 90% confidence interval, and using break-even analysis, the authors concluded that the selection program had a very high probability of producing benefits exceeding costs.
Algebraic derivation reflects simultaneous variability in several utility parameters, and can be useful in estimating the risk associated with utility values. However, it is more complicated than break-even analysis and has limitations. First, the formula can incorporate dependencies between variables, but doing so produces very complex estimation equations and requires information on covariances that is seldom available.
Variance estimates become especially difficult when programs can be expanded or abandoned during the project's life, or when variables are related in a non-linear fashion (as the selection ratio and the average standardized predictor score are related in utility formulas). Alexander and Barrick (1987) surmounted this difficulty by holding the selection ratio and average predictor score constant for each variance estimate. Second, algebraic derivation provides a variance estimate, but it requires assumptions about the distribution shape (e.g., normality) to make strong probabilistic inferences. Existing literature provides no empirical information supporting or refuting the assumption of normality, but Hull (1980) noted that non-normal distributions are likely when: (a) programs can be abandoned or expanded during their life; (b) non-normal components heavily influence the distribution; and (c) there is only a small number of variables. Each of these conditions may characterize utility analysis, as discussed below.
Monte Carlo simulation of utility value variability. Monte Carlo simulation attempts to address limitations of the three previous methods. Simulation describes each utility model parameter in terms of its expected value and distribution shape. In each simulated trial, a value for each utility parameter is "chosen" from the distribution for that parameter, and the combination of chosen parameter values is used to calculate total utility for that trial. Repeated application of this choosing and calculating procedure (using a computer) produces a sample of trials from which the distribution properties of utility values can be described. Thus, unlike the other three methods, simulations can vary many parameters at once, can reflect dependencies among the parameters, can acknowledge possible program expansion or abandonment, and can reflect non-normal distribution assumptions.
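The choosing-and-calculating procedure just described, and its relation to the algebraic approach, can be sketched as follows. This is a minimal illustration under loud assumptions, not a reconstruction of any published analysis: every parameter value and distribution below is hypothetical, the parameters are drawn independently, and the closed-form check uses the general identity for the variance of a product of independent variables (the product of second moments minus the product of squared means), which may not match the exact formula used by Alexander and Barrick (1987).

```python
import math
import random
import statistics

# All parameter values below are hypothetical, chosen only for illustration.
N_SELECTED = 100                        # number selected (held constant)
Z_BAR = 0.8                             # average standardized predictor score (constant)
COST = 100_000.0                        # total program cost (constant)
TENURE = (5.0, 1.0)                     # mean, SD of tenure in years (assumed normal)
VALIDITY = (0.40, 0.05)                 # mean, SD of validity (assumed normal)
SDY_MEDIAN, SDY_SIGMA = 10_000.0, 0.4   # lognormal SDy, i.e. positively skewed


def simulate_utilities(n_trials, seed=0):
    """Draw one-cohort utility values: N * T * r * Zx * SDy - C."""
    rng = random.Random(seed)
    utilities = []
    for _ in range(n_trials):
        # Truncation at zero guards against nonsensical negative draws.
        tenure = max(rng.gauss(*TENURE), 0.0)
        validity = max(rng.gauss(*VALIDITY), 0.0)
        sdy = rng.lognormvariate(math.log(SDY_MEDIAN), SDY_SIGMA)
        utilities.append(N_SELECTED * tenure * validity * Z_BAR * sdy - COST)
    return utilities


def independent_product_sd(means, sds):
    """SD of a product of independent variables: sqrt(prod(m^2 + s^2) - prod(m^2))."""
    second_moments = math.prod(m * m + s * s for m, s in zip(means, sds))
    squared_means = math.prod(m * m for m in means)
    return math.sqrt(second_moments - squared_means)


# Simulated distribution of utility values.
utilities = simulate_utilities(20_000, seed=1)
sim_mean = statistics.fmean(utilities)
sim_sd = statistics.stdev(utilities)
p_negative = sum(u < 0 for u in utilities) / len(utilities)

# Closed-form mean and SD under independence, for comparison.
sdy_mean = SDY_MEDIAN * math.exp(SDY_SIGMA ** 2 / 2)
sdy_sd = sdy_mean * math.sqrt(math.exp(SDY_SIGMA ** 2) - 1)
means = [TENURE[0], VALIDITY[0], sdy_mean]
sds = [TENURE[1], VALIDITY[1], sdy_sd]
analytic_mean = N_SELECTED * Z_BAR * math.prod(means) - COST
analytic_sd = N_SELECTED * Z_BAR * independent_product_sd(means, sds)

print(f"simulated mean {sim_mean:,.0f} vs closed-form {analytic_mean:,.0f}")
print(f"simulated SD   {sim_sd:,.0f} vs closed-form {analytic_sd:,.0f}")
print(f"share of trials with negative payoff: {p_negative:.4f}")
```

Because the draws here are independent, the two approaches should roughly agree. The simulation becomes more informative once dependencies, abandonment options, or other non-linearities are added to the trial loop, which the closed-form identity cannot accommodate.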
Rich and Boudreau (1987) applied Monte Carlo simulation and the other three uncertainty estimation methods to a decision to use the Programmer Aptitude Test (PAT) to select computer programmers in a mid-size computer manufacturer. They used a utility model enhanced to reflect financial/economic factors and employee flows through the work force (these enhancements will be discussed subsequently). They discovered that all of the utility parameters were subject to some degree of uncertainty or variability over time. They also discovered that SDy variability heavily influenced the utility value distribution and that the distribution of SDy values was positively skewed as in other studies (e.g., Bobko, et al., 1983; Burke & Frederick, 1984; Schmidt, Mack & Hunter, 1984; Burke, 1985; Mathieu & Leonard, 1987). The simulation suggested greater risk (variability) in utility values than the algebraic derivation because the simulation better reflected dependencies among utility parameters and parameter relationships over time. However, break-even analysis, algebraic derivation and Monte Carlo simulation all led to the same conclusion--the selection program had a very small probability of negative payoff.
Cronshaw, et al. (1987) also simulated utility values, but their analysis held validity, costs and SDy constant, and used subjective estimates of optimistic, likely and pessimistic parameter levels, rather than observed distributions. Their analysis also focused only on effects for the first cohort of selectees hired, while Rich & Boudreau (1987b) incorporated effects of subsequent program application and employee turnover (discussed subsequently). Still, Cronshaw, et al. (1987) reached a similar conclusion--the selection program had a very small probability of negative payoff.
Thus, Monte Carlo simulation better reflects factors affecting utility value variability, and indeed suggests that substantial variability existed due to both measurement error and uncertainty regarding future conditions (Rich & Boudreau, 1987). This methodology may prove very useful in describing the behavior of utility value variability in future research. However, existing research also suggests that the simpler break-even analysis procedure may describe the decision situation adequately enough to reveal the correct decision.
We should also note that all selection utility models and all of the variability estimation procedures except Monte Carlo analysis presume a linear and constant relationship between utility and the parameters reflecting employee quantity and quality (i.e., Ns, Zx, SDy, and rx,y). Economic theory suggests this assumption may be questionable in certain situations (as will be discussed subsequently). Therefore, Monte Carlo simulation may have an advantage over the other three methods when such non-linearities are important enough to alter decisions and when they can be quantified sufficiently to be incorporated into a simulation algorithm.
Statistical Hypothesis Testing and Uncertainty in Utility Analysis
The inferential statistics approach. Researchers are familiar with the classic statistical tools of confidence intervals, hypothesis tests, and probability statements. Such tools usually emphasize the probability of Type I error (accepting an alternative hypothesis that is false) by specifying the significance level of the statistical test. A statistical approach uses sample information to estimate the variability of the sampling distribution of a statistic (e.g., t, F, etc.).
Then, assuming that the null hypothesis (usually a hypothesis of zero effect) is the mean of the sampling distribution, a decision rule for the statistic is established such that the null hypothesis is rejected in favor of the alternative hypothesis only if the observed effect in the sample is large enough to fall near the tail of the assumed distribution (i.e., in the highest 5% or 1% of the distribution). Of course, this ignores the probability of Type II error (incorrectly failing to reject the null hypothesis when it is false). Although some methods for reducing Type I error (e.g., increasing sample size and measurement reliability) reduce both errors by producing a smaller variance in the sampling distribution, other mechanisms for reducing Type I error (e.g., requiring larger effects before rejecting the null hypothesis) actually increase the probability of Type II error.
Utility models are intimately connected to both statistical inference and to decision making. UA models make use of statistics (e.g., the correlation coefficient) that summarize sample information. However, they also can illuminate the limitations of statistical analysis in a decision making context, and suggest a more complete approach to using statistical evidence in decision making. Several authors have argued for increased emphasis upon substantive significance as opposed to statistical significance (e.g., Campbell, 1982; Rosenthal & Rubin 1985). With its emphasis on decision making, UA research can contribute to formalizing and quantifying this more substantive emphasis. It is beyond the scope of this Chapter to fully debate the philosophical and practical issues surrounding the question of substantive and statistical inference. However, it is important to delineate some important roles for utility analysis in such a debate.
The role of utility analysis in defining substantive significance.
Statistical inference emphasizes extreme conservatism in the interest of maximizing confidence in reported findings. Specifically, it sets very stringent standards for new research results to replace previously accepted findings. Consider validation studies, where the correlation coefficient is tested for statistical significance using the inferential model specified above. Assuming the true distribution of correlation coefficients has a mean of zero (and a variance determined by the sample size, the reliability of the measures and other factors), the observed correlation must be large enough that the probability of its occurrence in such a distribution is below 5% before rejecting the null hypothesis of zero correlation. Such an approach amounts to an extremely conservative decision rule, especially because practical sample sizes and measure reliabilities often require quite large sample correlation coefficients to reach statistical significance (Schmidt, Hunter & Urry, 1976). Meta-analysis techniques can help to place the results of many small-sample studies in perspective and provide a truer picture of the correlation coefficient mean and variability.
The inferential statistical model is usually applied outside of a decision making context. No costs or benefits are attached to the two types of error, and the implied value judgments inherent in statistical testing are accepted implicitly. But suppose the study described above were being conducted in an actual organization, where managers must decide whether to adopt a particular selection device. Costs associated with Type I error--i.e., adopting the selection device when it should have been rejected--include test development, test administration and scoring, and possible productivity reductions from using the test instead of (or in addition to) the existing selection system.
Adopting a decision rule that rejects the selection device unless the observed correlation is large enough to reach statistical significance "protects" the organization from needlessly incurring these costs. However, Type II error--failing to adopt the selection device when it should have been adopted--also brings costs such as the lost productivity enhancements or cost reductions from improved person-job matching. The B-C-G utility model suggests that productivity enhancements and cost reductions are often quite sizable even with very modest correlation coefficients and performance variability. Thus, improved selection systems may often be "worth the risk," because the costs of Type I errors are fairly small, the costs of Type II errors are relatively large, and only a modest validity level is required to produce benefits from the improved selection system. Classical statistics attempts to minimize Type I error even at substantial risk of Type II error, reflecting values that are virtually opposite from these characteristics.
Several authors have made a similar argument, though the link to utility analysis has not been as clear. Rosenthal and Rubin (1985) take issue with the notion that statistical inference is designed to establish facts, proposing that the purpose is to summarize information efficiently. They make three important points: (1) that "when the dependent variable is of some importance and where obtaining additional data is difficult, expensive or unlikely" even non-significant results can contribute to scientific understanding; (2) that by taking the ratio of the probability of Type I error to Type II error, we obtain an index of the "perceived relative seriousness" of the two errors which indicates that in most studies Type I errors were implicitly "from 5 to 95 times more serious than Type II errors" (p.
529); and (3) that the notion of value-free scientific inference is usually inaccurate because investigators use their own values in choosing what statistical tests and contrasts to investigate.
Cascio and Zedeck (1983) also suggested computing the ratio of Type I to Type II errors as a measure of the relative importance of decision consequences. Using utility analysis formulas, they demonstrated that less stringent decision rules increase power (the ability to detect non-zero effect sizes), but also increase Type I error. They suggested that researchers adjust alpha levels (i.e., acceptable Type I error levels) upward to increase statistical power.
Fowler (1985) noted Campbell's (1982) lament that statistical significance is often incorrectly taken as substantive significance, and his admonition that researchers argue for the substantive as well as the statistical significance of their findings. Using a definition of substantive significance derived from Cohen's (1977) signal to noise ratio, and Lykken's (1968) observations concerning common variance in psychological variables, Fowler reviewed Journal of Applied Psychology articles from 1975 and 1980, finding that 75% of the 1975 effect sizes and 69% of the 1980 effect sizes were "below Cohen's large effect" (p. 217), though they reached statistical significance.
Abelson (1985) described the paradox whereby baseball batting skill explains less than 1% of the variance in single-at-bat performance but is regarded with extreme importance by decision makers in selecting baseball players and assigning them to positions (batting opportunities in games). Abelson pointed out that decision makers must consider the season-long performance of a team, not the single at-bat performance.
Because any player may have 1,000 at bats in a season, and because scoring rallies are more likely when groups of skilled batters build on each other's skills, even the "modest" explanatory power of batting skill has important implications for team performance (compared to alternative selection and assignment schemes).
These observations suggest an important role for UA research in explicating the debate on substantive versus statistical significance. Rosenthal and Rubin's first observation supports the earlier conclusion that the potential effect of HRM programs on productivity is important enough that even imperfect information about utility parameters may be quite valuable, because further data gathering may be difficult or expensive. Their second observation, as well as the Cascio and Zedeck observations, suggests that HRM program decision rules should be adjusted so that the ratio of Type I to Type II error probabilities is consistent with the costs and benefits of both types of errors. Rosenthal and Rubin's third observation suggests that both the organizational implications and scientific value judgments should be considered when interpreting statistical tests, and UA models can provide valuable information describing organizational implications. Fowler's observation that a majority of research studies may produce statistically significant findings that have low substantive importance is not unlike our earlier observation that although many of the discrepancies between SDy measurement methods are quite large in absolute terms, break-even analysis reveals that the discrepancies appear to have little bearing on the quality of HRM program decisions. Fowler's findings also reinforce our conclusion that such research should report the decision context in which SDy information will be used.
Finally, Abelson's (1985) explanation for the variance explanation paradox parallels the break-even analysis of Table 3, suggesting that most organizations (like baseball teams) are more concerned with productivity outcomes reflecting multiple employees and time periods, than with the behaviors of one employee.
Incorporating the Value of Information
The issues of necessary precision, statistical versus substantive significance and uncertainty regarding UA for HRM programs are analogous to similar issues for other organizational investment decisions. Decision makers must (implicitly or explicitly) assess the value of additional information (and the cost of acting with uncertainty) in light of the particular decision context they face, and several models for quantifying these issues are available (cf. Bierman, et al., 1981). Yet, these well-known models are not usually applied to HRM decisions at least partly because of the widespread failure even to attempt to quantify the effects of HRM interventions. UA allows quantification, and thus offers one link to decision models that more explicitly incorporate the value of information. Although it is not possible to fully develop the mathematical and logical arguments inherent in such an information model, we can briefly summarize how UA models can be used for this purpose, and the implications of viewing UA as a component of the larger task of making decisions under uncertainty.
The basic information value model incorporating utility analysis. Information has value when it reduces uncertainty in a way that produces better decisions. Gathering information (such as utility model parameter measurement) is a decision in itself, subject to both desirable and undesirable consequences.
In simplest terms, the value of additional information depends upon: (1) the probability that the information's results can change decisions; (2) the consequences of the changed decisions; and (3) the cost of gathering the additional information. The value of additional information may be considered as the product of (1) and (2), less (3). Additional information has greatest value when the probability that the additional information will change the decision is very great, the consequences of changed decisions are very large, and the information can be gathered at low cost. Additional information has less value under the opposite conditions.
Evaluating information requires: (1) an explicit decision (i.e., the alternatives, their attributes, and the value of the differences in their consequences); (2) a decision rule for using the additional information to alter decisions; (3) assumptions or data regarding the likely results to be revealed by the information; and (4) the cost of the additional information. Two models for evaluating information value are commonly discussed (cf. Bierman, et al., 1981)--"perfect information" and "imperfect information." The two models differ primarily in the way they treat the third factor listed above (i.e., the probable results of the additional information).
Suppose an organization is considering two selection devices. One device is more valid, but also more costly to develop, administer and evaluate. The decision maker realizes that selection utility consequences will be quite different if future conditions produce very large applicant pools (allowing the organization to be very choosy, and achieve a high average selectee test score) as opposed to very small applicant pools (providing less choice and thus less payoff to improved validity). Suppose it has been determined that two selection ratios are possible (i.e., .30 or .70).
UA reveals that if the selection ratio is .30, then the more-expensive device offers a utility of $500,000, and the less-expensive device offers a utility of $300,000. If the selection ratio is .70, then the more-expensive device offers a utility of $50,000 and the less-expensive device offers a utility of $200,000. Should the decision maker gather additional information (e.g., labor market forecasts, strategic forecasts of labor demand, etc.) to attempt to predict the selection ratio more precisely? Without further information, the decision maker attaches a 20% probability to the .30 selection ratio, and an 80% probability to the .70 selection ratio. Thus, with no additional information the expected values are $140,000 and $220,000 for the more- and less-expensive alternatives, respectively, and the less-valid and less-expensive alternative is preferred. The expected value of this decision is $220,000.
In the "perfect information" model, one assumes that a perfect predictor would foretell the actual selection ratio in advance, and calculates the additional decision value that could be derived from that information. In the example, if the decision maker had perfect information, then there is a 20% chance that the information would foretell a low selection ratio. With this information, the decision maker would switch to the more-expensive alternative and would enjoy the $500,000 utility instead of the $300,000 utility of the less-expensive selection device. However, there is an 80% chance the perfect information will reveal a high selection ratio, in which case the original decision was correct anyway. Thus, the value of perfect information is equal to 20% times the utility difference under the low-selection-ratio condition (i.e., .20 times $200,000), or $40,000. Under these assumptions, this is the upper limit of the value of any information that improves the ability to predict the actual selection ratio.
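The arithmetic of this example can be checked with a short sketch. The payoffs and prior probabilities are the ones given in the text; the function and variable names are hypothetical, and the last function (the prior probability at which the two devices have equal expected value) is a small extension computed from the same payoffs.

```python
P_LOW = 0.20  # prior probability of the .30 (low) selection ratio, from the example

# Utility of each device under each possible selection ratio, from the example.
PAYOFFS = {
    "more_expensive": {"sr_30": 500_000, "sr_70": 50_000},
    "less_expensive": {"sr_30": 300_000, "sr_70": 200_000},
}


def expected_value(option, p_low):
    """Expected utility of one device given the probability of the .30 ratio."""
    u = PAYOFFS[option]
    return p_low * u["sr_30"] + (1 - p_low) * u["sr_70"]


def value_without_information(p_low):
    """Expected value of the best single choice made under the prior."""
    return max(expected_value(option, p_low) for option in PAYOFFS)


def value_with_perfect_information(p_low):
    """Expected value when the best device is chosen after the ratio is revealed."""
    best_if_low = max(u["sr_30"] for u in PAYOFFS.values())
    best_if_high = max(u["sr_70"] for u in PAYOFFS.values())
    return p_low * best_if_low + (1 - p_low) * best_if_high


def expected_value_of_perfect_information(p_low):
    return value_with_perfect_information(p_low) - value_without_information(p_low)


def switching_probability():
    """Probability of the .30 ratio at which both devices have equal expected value."""
    gain_if_low = PAYOFFS["more_expensive"]["sr_30"] - PAYOFFS["less_expensive"]["sr_30"]
    loss_if_high = PAYOFFS["less_expensive"]["sr_70"] - PAYOFFS["more_expensive"]["sr_70"]
    return loss_if_high / (gain_if_low + loss_if_high)


for option in PAYOFFS:
    print(f"{option}: expected value ${expected_value(option, P_LOW):,.0f}")
print(f"Value of perfect information: ${expected_value_of_perfect_information(P_LOW):,.0f}")
print(f"Switching probability for the .30 ratio: {switching_probability():.1%}")
```

Subtracting the cost of the forecast from the printed value of perfect information gives the most that such a forecast could be worth under these assumptions.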
The information value is realized only under the conditions where it changes the decision, and depends on the consequences as well as the probability of that change. Even in the absence of selection ratio information, it is possible to compute the value for the two probabilities that would change the decision. The more-expensive option is preferred if the probability of the low selection ratio exceeds 43 percent, and the probability of the high selection ratio does not exceed 57 percent.
In the "imperfect information" model, one uses Bayesian probability relationships to determine how imperfect information changes the a priori probability estimates, the decisions implied by these changes, and the expected consequences of the decisions under all future conditions and information outcomes. Frequently decision trees can represent the decision situation. However, the main objective is similar--to determine the economic value of information designed to reduce uncertainty. Moreover, the same three factors determine the economic value of additional information.
Variations on these models can be developed that reflect continuous as well as discrete distributions of future conditions, information outcomes and probabilities. Indeed if the distribution of information outcomes is assumed to be normal, it is possible to evaluate the consequences of various statistical decision rules (e.g., setting Type I error at 5%) in light of alternative future conditions (e.g., decision consequences, true values for utility parameters), and determine the economically optimum decision rule and/or the economically optimum sample size for a future study. Moreover, such methods can be applied not just to uncertainty regarding future selection ratios, but to uncertainty regarding any of the utility parameters.
Such a framework makes it possible to explicitly consider not only expected utility values, but uncertainty and risk inherent in those values as well as the implications of decision rules derived from inferential statistics or other methods.
Linking the Information Value Model and Emerging UA Research. Recall the three determinants of information value: (1) the probability that the additional information will change decisions; (2) the consequences of the changed decisions; and (3) the cost of gathering the additional information. Despite research on SDy variability, actual selection device decisions are unlikely to be altered by different SDy measures because: (1) different SDy measures have low probabilities of producing SDy values below break-even, (2) even crude SDy estimates will often lead to a decision favoring improved selection, and (3) refined SDy measures may be complex and costly. Boudreau (1984a) developed this point in detail using the "perfect information" model.
When the costs of implementing improved selection are modest compared to the potential benefits, relying on decision rules based on statistical significance may be overly conservative. Failure to adopt improved selection devices because validities do not reach statistical significance may imply a belief that the consequences of erroneous implementation are tens (or hundreds) of times as great as the consequences of erroneous failure to implement. Existing evidence suggests that implementation costs are low and potential productivity benefits are very large, so HR managers often cannot afford the risk of not trying improved selection devices. Still, the B-C-G model reflects only a portion of relevant decision factors, and we have no research to suggest what other factors decision makers may consider.
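The argument about the relative costs of the two errors can be made concrete with a simple expected-value rule. This sketch is an illustration rather than a model from the literature reviewed here: the probability that the device is truly valid, both dollar figures, and the function names are hypothetical assumptions.

```python
def expected_gain_from_adopting(p_valid, benefit_if_valid, loss_if_invalid):
    """Expected payoff of adopting, relative to keeping the existing system.

    p_valid          -- decision maker's probability that the device is truly valid
    benefit_if_valid -- payoff forgone by a Type II error (failing to adopt a valid device)
    loss_if_invalid  -- cost incurred by a Type I error (adopting a worthless device)
    """
    return p_valid * benefit_if_valid - (1 - p_valid) * loss_if_invalid


def implied_cost_ratio(p_valid):
    """Smallest ratio loss_if_invalid / benefit_if_valid that would justify rejection."""
    return p_valid / (1 - p_valid)


# Hypothetical figures: a modest implementation cost and a large potential benefit.
gain = expected_gain_from_adopting(p_valid=0.60,
                                   benefit_if_valid=500_000.0,
                                   loss_if_invalid=40_000.0)
print(f"Expected gain from adopting: ${gain:,.0f}")

# Rejecting the device at various degrees of belief implies these cost ratios.
for p in (0.50, 0.90, 0.95, 0.99):
    print(f"p_valid = {p:.2f}: rejection implies Type I cost is at least "
          f"{implied_cost_ratio(p):.0f} times the Type II cost")
```

Under this rule, rejecting a device that is probably valid is consistent only with a belief that erroneous implementation is many times as costly as erroneous failure to implement.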
Conclusion
In view of the high variability associated with utility parameter estimates (especially SDy), it seems plausible that perceived uncertainty and risk associated with utility estimates may explain why HRM programs do not enjoy widespread acceptance and why the utility values may appear larger than many researchers would have expected (Schmidt, et al., 1979; Hunter & Schmidt, 1982). However, this view may also reflect ignorance of the capacity for improved HR management to affect organizational goals. As the example in Table 2 and the break-even analysis of Table 3 vividly illustrate, the leverage or quantity of person-years affected by HRM programs can be quite large. Thus, the coefficient on SDy is often quite large, and even modest levels of performance variability offer substantial opportunities for highly valuable HRM program effects. Schmidt et al. (1986; 1984) have expressed utility values as a percentage of output and wages, and suggested that if decision makers or researchers find utility values "implausible," it may reflect the fact that they do not appreciate the magnitude of their human resource investment.
Uncertainty about SDy would not have been an important factor in any published utility analysis applications. Further research on the cognitive processes affecting SDy estimation and further efforts to develop new and more reliable SDy estimation methods may provide more information on the nature and magnitude of this uncertainty (Bobko, Karren & Kerkar, 1987). Such research should reflect the decision context so that the implications of these findings can be meaningfully interpreted. The information value model suggests that valuable future UA research will address issues likely to alter decisions, in contexts where such alterations carry large consequences.
Recognizing the decision context reveals that UA models reflect an organizational process, not merely the single application of a particular program. The next sections review enhancements to the B-C-G utility model designed to better reflect such organizational processes. Such enhancements can have at least three purposes: (1) To provide more accurate and realistic utility values; (2) To improve the usefulness of UA models in enhancing decisions; and (3) To allow UA research to encompass a broader theoretical domain that advances scientific understanding of decisions about HRM programs. The first objective must be measured against actual or presumed objective values which may not be available. The second objective can be measured against the information value principles noted above. The third objective may be the most important for research, and can be measured against the ability of enhancements to incorporate and integrate fruitful new directions for scientific inquiry.
Expanding the Domain of Attributes In Selection Utility Analysis
UA models are special cases of MAU models, representing some, but certainly not all factors affected by HRM decisions. All MAU models, including UA models, are undoubtedly deficient. This deficiency offers another possible explanation for utility values that may be higher than expected, or for the lack of widespread application of the models. The UA model may be missing important variables that are relevant to decision makers. Such deficiencies would be especially troubling if the omitted variables tend to argue against interventions, because UA models could produce positive utility values (suggesting program implementation), while a more complete UA model might reveal reasons against implementation.
Moreover, because UA models focus on decisions to invest organizational resources in HRM programs, they implicitly draw on assumptions regarding both financial decision processes and labor market phenomena that interact with such decisions. We now explore how attributes from each of these related domains affect the B-C-G selection utility model.
Financial/Economic Considerations
The dollar-valued payoff function in UA models has led to speculation that UA models can provide a link between Personnel/HRM research and more traditional management functions (e.g., marketing, finance, accounting, operations). For example, Landy, Farr & Jacobs (1982, p. 38) suggested that UA models may be capable of "providing the science of personnel research with a more traditional 'bottom line' interpretation", Cascio and Silbey (1979) called for a "closer liaison" between personnel researchers and cost accountants, and Greer & Cascio (1987) proposed that cost accounting should contribute to defining the criterion in utility analysis. Even the original treatments of Brogden & Taylor (1950) and Cronbach & Gleser (1965) reflected a concern with the profit contribution of enhanced work force quality. Recently, researchers have suggested enhancements to the B-C-G selection utility model designed to incorporate financial/economic considerations.
Variable costs, taxes and discounting. Boudreau (1983a) recognized that UA models addressed economic and financial consequences of HRM decisions, but failed to incorporate certain financial/economic considerations. He suggested that measuring utility with a payoff function reflecting sales revenue or "the value of output as sold", would probably overstate HRM program effects on discounted after-tax profit (the payoff scale used for financial investments).
He showed how the utility formulas could easily be altered to account for three basic financial/economic concepts: variable costs, taxes and discounting.

First, Boudreau noted the difference between "sales (or service) value" (i.e., the value of the increase in sales revenue or output as sold), "service cost" (i.e., the change in organizational costs associated with the increased revenue), and "net benefits" (i.e., the difference between service value and service costs) produced by an HRM intervention. He suggested that productivity enhancements through improved HRM programs may require additional support costs (e.g., increased inventories to support higher sales, increased raw materials usage to support higher output volumes, increased salaries/benefits as incentives for improved performance). Moreover, many interventions operate not by increasing sales revenue or output levels, but by reducing costs (e.g., Florin-Thuma & Boudreau, 1987; Schmidt, et al., 1986). He suggested including the effects of HRM programs on service costs in the model in either of two ways: (1) by reflecting the change in costs through a correlation coefficient (between the predictor and service costs) and the dollar-valued standard deviation of service costs among applicants; or (2) by assuming service costs are proportional to service value increases and simply multiplying the incremental service value increase by a proportion (1+V) reflecting the change in net benefits. Greer & Cascio (1987) derived variable costs more precisely using accounting conventions for soft-drink route salesmen. Boudreau (1983a) showed how incorporating such considerations could increase utility values (if costs fall when productivity increases) or decrease utility values (if costs rise when productivity increases).

Second, Boudreau noted that most organizations do not keep the full value of increases in net benefits.
Rather, they must pay taxes on increased income to Federal, State and Local governments. Thus, adjusting utility values from the B-C-G model to reflect increases in net benefits may still overstate the organizational payoff by failing to account for increased taxes. Boudreau proposed multiplying both the net benefits and the implementation costs (C) by one minus the applicable tax rate (i.e., 1-TAX) to reflect after-tax effects. He speculated that TAX levels might be as low as zero (for organizations reporting losses) and as high as .55 (for organizations subject to multiple income tax obligations).

Third, Boudreau observed that UA models typically focused upon benefits from interventions lasting into future time periods (the second column of Table 3 indicates the number of future time periods analyzed in empirical applications). UA models had treated such year-to-year effects as equivalent to each other. Returns derived in future years were simply added to the returns from initial years. He noted that such treatment ignored a fundamental fact of organizational management--money can be invested to earn interest. When interest can be earned, accelerated program returns and postponed program costs can be invested to earn interest for a longer period of time. Therefore, financial analysis "discounts" future earnings (and costs) to reflect these potential investment returns. Boudreau demonstrated how the interest rate earned on program returns (i) could be incorporated into the UA model, producing a "discount factor" (i.e., DF, the summed effects of discounting over a number of future periods). He demonstrated that discounting reduced utility value levels, with the most substantial reductions occurring when program returns occur farther into the future, and when the discount rate is high.
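The three adjustments described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not Boudreau's (1983a) published formula verbatim: it assumes equal returns arriving at the end of each year, treats V as a signed proportion (negative when costs rise with productivity, in line with the negative V values the chapter reports from applied studies), and assumes program costs are paid up front, so they are adjusted for taxes but not discounted. All function and parameter names are hypothetical.

```python
def discount_factor(i: float, periods: int) -> float:
    """Sum of 1 / (1 + i)**t for t = 1..periods (the DF described above)."""
    return sum(1.0 / (1.0 + i) ** t for t in range(1, periods + 1))


def adjusted_utility(n_selected, periods, r_xy, z_x, sd_y,
                     cost, v=0.0, tax=0.0, i=0.0):
    """After-cost, after-tax, discounted utility (illustrative sketch only)."""
    # Assumption: equal yearly returns, each discounted back to the present.
    df = discount_factor(i, periods)
    # Unadjusted dollar-valued quality gain per person-year.
    quality = r_xy * z_x * sd_y
    # Net benefits scale by (1 + v); taxes remove a share `tax` of them.
    benefit = n_selected * df * quality * (1.0 + v) * (1.0 - tax)
    # Assumption: costs occur at time zero, so they are only tax-adjusted.
    return benefit - cost * (1.0 - tax)


if __name__ == "__main__":
    # Inputs follow the chapter's computer programmer example (618 selectees,
    # 10 years, validity .76, mean selectee score .80, SDy $10,413).
    # Program cost is set to zero here to isolate the effect on benefits.
    unadjusted = adjusted_utility(618, 10, 0.76, 0.80, 10_413, cost=0.0)
    adjusted = adjusted_utility(618, 10, 0.76, 0.80, 10_413, cost=0.0,
                                v=-0.05, tax=0.45, i=0.10)
    print(f"unadjusted benefit: ${unadjusted:,.0f}")
    print(f"adjusted benefit:   ${adjusted:,.0f}")
```

Dividing DF by the number of periods gives an average per-year multiplier (roughly .614 for ten years at a 10% rate), which is the form in which discounting is applied to an undiscounted multi-year total.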
Boudreau (1983a) incorporated these factors into the selection utility formula and derived the combined effects of hypothetical levels of the parameters on reported utility values. His derivations suggested that when HRM programs face zero tax and interest rates, and variable costs are reduced with productivity increases, B-C-G utility values might understate actual discounted, after-tax net benefits by as much as 33%. However, when HRM programs face positive tax and interest rates and costs rise with productivity, reported values might be overstated by as much as 84%. Studies applying financial/economic considerations suggest that unadjusted utility values commonly exceed adjusted values. As shown in Table 3, Mathieu and Leonard (1987) found TAX=.46, i=.15, and V=-.07; Burke and Frederick (1986) found TAX=.49, i=.18, and V=-.05; Rich and Boudreau (1987b) found TAX=.39, i=.15, and V=0.

Table 5 extends the example begun in Table 2, incorporating the financial/economic considerations noted by Boudreau (1983b), assuming the variable cost proportion equals 5%, the tax rate equals 45%, and the interest rate equals 10%. Program costs are assumed to occur at the beginning of the analysis, so they are not discounted but are adjusted only for their effect on taxes. Assuming a 10-year analysis period, unadjusted quantity equals 6,180 person-years. Unadjusted quality is $6,331 per person-year. Thus, the unadjusted product of quantity and quality is $39.125 million. This is adjusted to reflect the 5% variable costs and 45% taxes. Finally, to reflect discounting at a 10% rate, the 10-year quality effect is multiplied by .614, producing an after-cost, after-tax discounted net program benefit of $12.5519 million as shown in Table 5. Subtracting the after-tax testing cost of $6,698 produces the after-cost, after-tax, discounted net utility of $12.55 million as shown. Though substantially smaller than the $37.9 million reported by Schmidt, et al.
(1979), derived in Table 2, this return remains substantial.

------------------------------
Insert Table 5 Here
------------------------------

Boudreau (1983a) stated that utility values incorporating these financial/economic considerations would better reflect the decision context of organizations that compute such investment values for programs in other management functions, and might be more credible to managers accustomed to working with financial analysis. He also noted a number of theoretical implications. First, employee wages and salaries are a different concept from their productive value, with wages and salaries reflecting resource costs, while productive value reflects the output of applying human resources to a production process. Equating compensation with productive value will usually understate value, but in some cases will overstate it because wages may exceed production value for some jobs or individuals. As we shall see, the relationship between improved labor quality and compensation is central to labor economic theory, and this model provides a framework in which to integrate them. Second, utility analysis reflects temporal effects that may not remain constant over time, contrary to what the B-C-G model assumes, which might lead to biased utility value estimates. Although Schmidt, Hunter, Outerbridge & Goff (1988) found that validities and performance differences remained constant over time, temporal instability has been incorporated into utility models for training (Mathieu & Leonard, 1987; Schmidt, et al., 1982). Third, the enhanced financial/economic utility model might partially explain the unreliability observed in managerial SDy estimates when managers are asked to use two conceptually different anchors--the value of output and the cost of contracting for that output--to derive one value (Day & Edwards, 1987; Reilly & Smither, 1985).

Applying Capital Budgeting Indices to Utility Analysis.
Cronshaw and Alexander (1985) suggested that "a major reason for the differential success of human resource and financial managers in implementing their respective evaluation models is the greater rapprochement of capital budgeting with the everyday language of line managers and with the financial planning needs of the organization" (p. 102). They speculated that by integrating UA results into the financial decision making context, personnel managers would better communicate the impact of their programs on the "value of the firm" as opposed to "increased productivity" or "operating costs." Cronshaw and Alexander (1985) separated the cost component of the selection utility model into two components: Co, the original one-time costs of developing and validating a selection instrument, and Ci, the implementation costs incurred each time the instrument is used. The "return" (i.e., R) of the program was the one-year, one-cohort productivity increase from a selection device (i.e., the product of Ns, SDy, rxy, and Zx). They explained the analogies between the selection utility model and five standard capital budgeting indices often discussed in financial investment textbooks.

First, the pay-back period (PP) or "number of years a firm requires to recover its original investment from net returns" was formulated as the sum of Ci and Co divided by R (a more consistent formulation would be Co divided by the difference between R and Ci). The authors note that this index is deficient because it ignores interest earned on returns over time, and it ignores returns that occur subsequent to the payback period. Second, they defined return on investment (ROI) as the ratio of "annual cash returns to original cost" and formulated it as the ratio of R to the sum of Ci plus Co (a more consistent formula might be the difference between R and Ci divided by Co).
They noted that this index ignores interest returns, but it also ignores any multiple-year returns because it reflects only the one-year return from selection divided by the entire implementation cost. Third, they defined "net present value" (NPV) as the difference between the discounted sum of returns over time (where the discount rate is the expected rate of return earned by the firm on contributed capital) and the original and implementation costs. This formulation is virtually identical to Boudreau's (1983a) formula, but Cronshaw and Alexander do not account for variable production costs, multi-year implementation costs, or taxes. Fourth, they defined the "profitability index" (PI) as the "ratio of the present value of net cash inflows to cash outflows", and formulated it as the discounted sum of returns (R) divided by the sum of implementation and original costs (a more consistent formulation might be the discounted sum of R minus Ci divided by the original costs). The authors note that a PI greater than 1.0 suggests a payoff exceeding costs as well as meeting the discount rate. They note that such an index fails to take into account the relative size of investments. Fifth, they defined the "internal rate of return" (IRR) as "the rate which equates the NPV of cash inflows with cash outflows" and formulated it as the value of the discount rate that equates the discounted sum of the returns with the sum of the one-year implementation costs and the original costs (a more correct formulation might equate original costs with the discounted sum of the difference between the returns and implementation costs). The authors noted that the derived rate of return is then compared to the organization's required rate of return to determine project acceptability.
An additional important limitation to this index is its assumption that each project's returns would earn interest at that project's IRR, thus incorrectly implying different interest rates on the returns from different projects.

Cronshaw and Alexander briefly discuss the issues of taxation, application to non-selection programs, multi-year benefits and the flow of employees through the work force over time, which would make their financial models more compatible with Boudreau's (1983a; 1983b) derivations, and could address some of the inconsistencies noted above. They also provide a useful distinction between viewing HRM program expenditures as "operating costs" written off in the current period (presumably implying that program returns occur only in that period), and as "capital investments" (presumably implying multi-year future returns). They speculate that the reason for the low credibility and the presumed expendability of HRM programs may be that HRM managers fail to adequately communicate the multi-year benefits accruing from such programs. This point is analogous to an earlier observation made by Boudreau (1984a), which showed that break-even values suggested high HRM payoffs, as well as suggesting that even large cost outlays were justified by HRM program returns. However, two of Cronshaw and Alexander's financial indexes (payback period, one-year return on investment) will also understate multi-year benefits. In fact, only two indices (net present value and profitability index) accurately reflect the relative discounted multi-year payoffs from competing financial investments. The profitability index is especially intriguing because it suggests considering payoff in terms of the benefits per dollar expended, rather than the benefits minus dollars expended, as is the traditional utility focus. It is straightforward to re-formulate utility equations to reflect this alternative perspective.
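To make the five indices concrete, the sketch below computes each one in the form the text attributes to Cronshaw and Alexander (1985), for a single cohort whose constant yearly return R lasts a fixed number of years. The constant-return simplification, the bisection search used to find the IRR, and every name and input value are illustrative assumptions rather than anything taken from the original paper.

```python
def present_value(r: float, rate: float, years: int) -> float:
    """Discounted sum of a constant yearly return r over `years` years."""
    return sum(r / (1.0 + rate) ** t for t in range(1, years + 1))


def internal_rate_of_return(r, outlay, years, lo=0.0, hi=10.0, tol=1e-10):
    """Rate at which discounted returns equal the outlay, found by bisection."""
    if present_value(r, lo, years) < outlay:
        raise ValueError("undiscounted returns never recover the outlay")
    if present_value(r, hi, years) > outlay:
        raise ValueError("IRR lies above the search bound `hi`")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if present_value(r, mid, years) > outlay:
            lo = mid  # present value still exceeds the outlay: raise the rate
        else:
            hi = mid
    return (lo + hi) / 2.0


def capital_budgeting_indices(r, c_o, c_i, rate, years):
    """Five indices, with (c_o + c_i) treated as the total outlay."""
    outlay = c_o + c_i
    pv = present_value(r, rate, years)
    return {
        "payback_period": outlay / r,       # years needed to recover the outlay
        "roi": r / outlay,                  # one-year return over the outlay
        "npv": pv - outlay,                 # benefits minus dollars expended
        "profitability_index": pv / outlay, # benefits per dollar expended
        "irr": internal_rate_of_return(r, outlay, years),
    }


if __name__ == "__main__":
    # Hypothetical program: $100 yearly return for 5 years, $150 original
    # cost, $50 implementation cost, 10% discount rate.
    for name, value in capital_budgeting_indices(100.0, 150.0, 50.0, 0.10, 5).items():
        print(f"{name}: {value:.4f}")
```

The last two lines of the dictionary differ only in subtracting versus dividing by the outlay, which is the contrast between the traditional utility focus and the benefits-per-dollar perspective.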
It would be interesting to learn whether such reformulations would affect decision processes.

"Pitfalls" in Using Financial/Economic Considerations. Hunter, Schmidt & Coggin (1988, p. 522) proposed that financial accounting methods are "frequently inapplicable to human resource programs and, in addition, may sometimes have negative consequences even when they are applicable on a purely logical basis." First, they noted that except for discounted present value, the financial indexes discussed by Cronshaw and Alexander (1985) require that a portion of the costs be designated as the "investment" (e.g., Co), and they speculate that many improved selection systems may actually involve no original costs and/or may actually reduce ongoing testing costs. They correctly note that under such conditions, one can compute discounted net present value (as described by Boudreau, 1983a), but not the other indexes. However, assuming costless HRM programs reduces the justification for any dollar-value

Table 3. Results of Studies Deriving Actual Program Utility Values

Reference | Setting | Ns | T or F | SR | Zx | rxy | SDy | Cost | ΔU | Utility Formula | B-E SDy
Van Naersson (1963) | Examined selection utility of a driving experience questionnaire for reducing training time for drivers in the Dutch Army. | 4,392 | 1 yr. | .81 | .334 | .66 | $77.00 | $1,627 | $73,049 | ΔU = (970 SDy) - $1,627 | $1.68
Schmidt & Hoffman (1973) | Examined weighted application blank selection utility for reducing separations among nurse's aides. | 308 | 2 yrs. | .302 | NR | .47 | $1,652 per separ. | $628 | $161,243 | NA | NA
Lee & Booth (1974) | Examined the selection utility of a weighted application blank for predicting turnover among clerical employees. | 245 | 25 mo. | .17 | 1.47 | .56 | $1,238 | $0 | $249,900 | ΔU = (202 SDy) - $0 | $0
Cascio & Silbey (1979) | Examined assessment center selection utility for food and [...] | 50 | 5 yrs. | .50 | .80 | .35 | $9,500 | $73,928 | $504,211 | ΔU = (61.04 SDy) - $40,328 | $660.70
Schmidt, et al. (1979) | Examined PAT selection utility for U.S. Govt. computer programmers. | 618 | 10 yrs. | .50 | .80 | .76 | $10,413 | $370,800 | $38,755,422 | ΔU = (3,757 SDy) - $370,800 | $98.70
Schmidt, et al. (1979) | Examined PAT selection utility minus interview selection utility for U.S. Govt. computer programmers. | 618 | 10 yrs. | .50 | .80 | .76 | $10,413 | $12,360 | $31,906,400 | ΔU = (3,065 SDy) - $12,360 | $4.03
Arnold, et al. (1982) | Examined selection utility of a strength test to select steelworkers. | 1,853 | 1 yr. | .06 | 1.97 | .84 | $3,000 | $0 | $9,199,033 | ΔU = (3,066 SDy) - $0 | $0
Dunnette, et al. (1982) | Examined selection utility of a test battery to select hydroelectric power plant operators. | | 1 yr. | .50 | .80 | .28 | $15,600 | $100 | $3,295 | ΔU = (.216 SDy) - $100 | $46.30
Dunnette, et al. (1982) | Examined selection utility of a test battery to select fossil power plant operators. | | 1 yr. | .50 | .80 | .44 | $21,400 | $100 | $7,335 | ΔU = (.347 SDy) - $100 | $288.18
Dunnette, et al. (1982) | Examined selection utility of a test battery to select fossil power plant control room operators (CRO). | | 1 yr. | .50 | .80 | .44 | $72,400 | $100 | $25,285 | ΔU = (.351 SDy) - $100 | $284.90
Dunnette, et al. (1982) | Examined selection utility of a test battery to select nuclear power plant operators. | | 1 yr. | .50 | .80 | .30 | $23,500 | $100 | $5,440 | ΔU = (.236 SDy) - $100 | $424
Dunnette, et al. (1982) | Examined selection utility of a test battery to select nuclear power plant control room operators. | | 1 yr. | .50 | .80 | .30 | $134,800 | $100 | $32,150 | ΔU = (.239 SDy) - $100 | $418
Ledvinka, et al. (1983) | Examined selection utility of the JEPS test for life insurance claim approvers. | 10 | 1 yr. | .07 | 1.918 | .36 | $5,542 | $1,104 | $37,162 | ΔU = (6.90 SDy) - $1,104 | $160
Ledvinka, et al. (1983) | Examined selection utility of the interview for life insurance claim approvers. | 10 | 1 yr. | .07 | 1.918 | .14 | $5,542 | $0 | $14,881 | ΔU = (2.68 SDy) - $0 | $0
Ledvinka, et al. (1983) | Examined selection utility of the JEPS test minus the selection utility of the interview for life insurance claim approvers. | 10 | 1 yr. | .07 | 1.918 | .22 | $5,542 | $1,104 | $22,281 | ΔU = (4.22 SDy) - $1,104 | $262
Schmidt, Mack & Hunter (1984) | Examined selection utility of using the interview to select U.S. Park Rangers. | 80 | 10 yrs. | .10 | 1.758 | .14 | $4,451 | $232,000 | $644,384 | ΔU = (197 SDy) - $232,000 | $1,178
Schmidt, Mack & Hunter (1984) | Examined selection utility of using the PACE test to select U.S. Park Rangers. | 80 | 10 yrs. | .10 | 1.758 | .51 | $4,451 | $232,000 | $2,960,542 | ΔU = (717 SDy) - $232,000 | $323.57
Schmidt, Mack & Hunter (1984) | Examined selection utility of the PACE test minus the selection utility of the interview to select U.S. Park Rangers. | 80 | 10 yrs. | .10 | 1.758 | .37 | $4,451 | $0 | $2,316,482 | ΔU = (520 SDy) - $0 | $0
Wroten (1984) | Calculated selection utility for using various selection [...] | | 10 yrs. | .15 | 1.55 | .30 | $29,472 | $1,000 | $136,045 | ΔU = (4.65 SDy) - $1,000 | $216.00
[...] | [...] Utility calculated based on training 10 Operations Mgrs. who remain a maximum of 20 yrs. TAX = .46, i = .15, V = -.0129. | | | | | | | | | |
Mathieu & Leonard (1987) | Examined training program utility for Branch Managers in the Bank of Virginia. Utility calculated based on training 36 Branch Managers who remain a maximum of 19 yrs. TAX = .46, i = .15, V = -.0281. | 36 | 19 yrs. | | | Calculated d of .3146 | $10,064 | $14,814 | $156,400 | ΔU = (17.01 SDy) - $14,814 | $811
Rich & Boudreau (1987) | Examined the selection utility of the PAT for computer programmers. | flows | 11 yrs. | .398 | .73 | .73 | $15,888 | $229,101 | $3,198,258 | ΔU = (216 SDy) - $229,101 | $1,062

Table 4. Results of Studies Estimating SDy Values

Reference | Setting | Utility Scale | Estimation Method | SDy Values
Doppelt & Bennett (1953) | Estimated savings in training costs. | | | Grocery clerks: Mean = $308/yr, % salary = 15%, % mean y = NR; Adding machine opers.: Mean = $214/year, % salary = 10%, % mean y = NR; Produce workers: Mean = $179, % salary = 10%, % mean y = NR (SE and SD entries not legible)
Roche (1961), reported in Cronbach & Gleser (1965) | Estimated the value of production units of radial drill operators (N=291) in a manufacturing organization, by attaching value to each unit. | "The dollar profit which accrues to the company as a result of an individual's work." | Used cost accounting to attach "standard" costs (materials, labor and facility usage) and prices to units. Price less cost was value per unit, and SDy was based on individual output quantities. | Mean = $1,217 ($.585/hour); SE = NA; SD = NA; % salary = 25%; % mean y = NR
Van Naersson (1963), reported in Cronbach & Gleser (1965) | Used training time data to estimate the SD of training time costs across military driver trainees in the Dutch Army. | Reduction in training time costs. | Used training time data to estimate SD training hours (12.9), and then multiplied by average training cost/hour ($6.00) to produce SD of cost equal to $77.00. | Mean = $77.00; SE = NA; SD = NA; % salary = 4%; % mean y = NR
Schmidt & Hoffman (1973) | Estimated savings from reducing turnover costs (hiring, training, [...] | Reduction in recruitment, hiring, and training costs due to longer tenure [...] | No SDy calculated, but an SDy estimate can be derived by working back- [...] | Mean = $624/year; SE = NA; SD = NA [...]
Dunnette, et al. (1982) | Power industry experts participated in a workshop in which they discussed critical incidents of effective and ineffective power plant operator performance, were [...] and then made dollar contribution [...] | Dollar value contribution of operator performance. | Schmidt, et al. method | Hydro Operator: Mean = $20,790, SD = $14,110, SE = $2,530; [...] SD = $19,120; Nuclear Operator: Mean = $74,900, SD = $107,150; Nuclear CRO: Mean = $213,730, SD = $226,600, SE = $38,860
Hunter & Schmidt (1982) | Supervisors (N = 62) of budget analysts estimated similar values to Schmidt, et al. (1979). | Value of products and services and/or cost of having an outside contractor provide them. | Global estimation of 15th, 50th, and 95th percentiles. Average difference between two endpoints equals SDy estimate. | Mean = $11,327; SE = $1,120; SD = $8,818; % salary = 60%; % mean y = NR
Bobko, Karren & Parkington (1983) | Sales counselor supervisors (N=17) estimated SDy for counselor sales and performance levels. Also, performance data was obtained for 92 actual insurance counselors. | "Total Yearly Dollar Sales" versus "Yearly value to the company of the overall products and services produced (considering the cost of having an outside contractor provide them)." | Global estimation of the 15th, 50th, 85th and 97th percentiles of both sales and value. Also gathered empirical sales data, computed by taking the number of policies sold and multiplying by the average policy value in his/her area. | Sales, 4 dist. points: Mean = $47,967, SE = $9,969, SD = $34,533, % salary = 352%, % mean y = 50%; Value, 4 dist. points: Mean = $4,967, SE = $2,089, SD = $7,533, % salary = 37%, % mean y = 31%; Sales, 3 dist. points: Mean = $56,950, SE = $15,365, SD = $55,400, % salary = 419%, % mean y = 59%; Value, 3 dist. points: Mean = $5,550, SE = $2,413, SD = $8,700, % salary = 41%, % mean y = 35%; Actual sales data: Mean = $124,882, SD = $52,308, % salary = 384%, % mean y = 42%
Ledvinka, Simonet, Neiner & Kruse (1983) | Claims processed per day for 15 insurance claims approvers were recorded for 2 months. | Dollar value to the company of claims processed per year. | Average dollar value of a processed claim was estimated by dividing total payroll plus benefits per year by the average number of processed claims per year. Assumption was that the wages and benefits paid to the average employee equals his/her value to the organization. Then, the standard deviation of claims processed per year (1679.29, as corrected for range restriction) times the average value/claim ($3.30) became the SDy estimate. | Mean = $5,542; SD = NA; SE = NA; % salary = 42.6%; % mean y = 31.4%
Burke & Frederick (1984) | Regional manufacturing [...] (surviving fragments, column uncertain: "Used actual yearly sales"; "Yearly sales volume for the 69") | [...] | [...] | [...]
Schmidt, Mack & Hunter (1984) | [...] (N=114) provided data for S, et al. SDy estimates, by considering the park rangers they supervised. Only two supervisors could not estimate the 15th percentile. | [...] follow the S, et al. method of "value of goods and services." Authors note subjects were asked to "consider what the cost would be of having an outside contracting firm provide the products or services to them." (p. 492). | [...] surveying respondents. | [label not legible]: Mean = $3,801, SE = $239, SD = $2,546, % salary = 36%, % mean y = 28%; 50th-15th: Mean = $5,101, SE = $357, SD = $3,813, % salary = 49%, % mean y = 38%
Wroten (1984) | Groups of supervisors (N=3 to 4) of petroleum workers in 7 different organizations and 16 different locations provided SDy estimates for six refinery jobs using six different methods. | Not reported, but probably similar to Schmidt, et al. (1979) because all measures were described as variants of this method. | Twelve estimation methods were used, for each of six jobs. The six jobs were: (1) Head Operator, (2) Outside Operator, (3) Pump Operator, (4) Instrument Technician, (5) Outside Mechanic, (6) Welder. The twelve estimation methods were of four types: (1) Direct Estimates (including the Schmidt, et al. method, obtaining y estimates for individuals and calculating SDy from them, and obtaining percentile estimates by group consensus); (2) Actual Anchored Estimates (which replicated the first three methods, but provided accurate 50th percentile estimates first); (3) High Anchored Estimates (which replicated the first three methods, but provided a high 50th percentile estimate first); and (4) Low Anchored Estimates (which replicated the first three methods, but provided a low 50th percentile estimate first). Accurate anchor was derived from "cost accounting" and unanchored group's 50th percentile estimate. High anchor was twice actual, and low anchor was half of actual. | Direct, Head Operator: Mean = $31,423, SD = $26,663, SE = $6,383; Direct, Outside Oper.: Mean = $20,468, SD = $16,041, SE = $3,638; Direct, Pump Operator: Mean = $13,950, SD = $9,532, SE = $2,124; Direct, Inst. Tech.: Mean = $35,037, SD = $25,004, SE = $6,266; Direct, Outside Mech.: Mean = $25,297, SD = $19,310, SE = $4,776; Direct, Welder: Mean = $19,708, SD = $13,430, SE = $3,235; Actual, Head Operator: Mean = $27,521, SD = $17,795; [...] SE = $15,248; High, Outside Mech.: Mean = $79,789, SD = $36,766, SE = $24,344; High, Welder: Mean = $60,356, SD = $29,735, SE = $11,432; Low, Head Operator: Mean = $27,294, SD = $27,163, SE = $9,617; Low, Outside Oper.: Mean = $20,571, SD = $18,353, SE = $6,263; Low, Pump Operator: Mean = $10,358, SD = $8,868, SE = $2,863; Low, Inst. Tech.: Mean = $17,501, SD = $10,307, SE = $3,574; Low, Outside Mech.: Mean = $12,752, SD = $7,287, SE = $2,514; Low, Welder: Mean = $10,718, SD = $5,761, SE = $792
Bolda (1985) | Estimated the job performance value for employees in maintenance and toolroom jobs in a manufacturing operation. | The dollars-per-hour value of the employee's performance | Gathered estimates of the 15th, 50th and 84th percentiles, and used the average difference as SDy. | Mean = $6.00/hr.; SE = NR; SD = NR; % salary = 46%; % mean y = 46%
Burke (1985) | Supervisors of clerical workers made global ratings for one of the three job classes they supervised. 132 gave estimates of the 50th percentile (mean = $22,045). This mean was fed back to two groups. | Used the Schmidt, et al. (1979) y function of the "total yearly value of services" considering how performance contributes to the "sales value of products sold." | After making the global estimate, the 50th percentile was fed back to two groups of supervisors. The first group (N=50) estimated the 15th before the 85th percentile, while the second group (N=41) did the opposite. They also provided self-reported dimensions used in the estimates. Sixteen of the original 118 surveys produced inconsistent estimates. Dimensions used tended to follow job eval. | Mean = $5,529; SE = $400; SD = $3,800; % salary = NR; % mean y = 25%
Eaton, Wing & Lau (1985) | Supervisors of soldiers in 5 military occupations (MOS) provided data estimating the value of first-term soldiers operating at different performance levels. The 5 MOS's were identified as: Infantryman (11B), Armor Crewman (19E), Vehicle Mech. (63B), Medical spec. (91B), Radio oper. (05C). Total number of supervisors was 270. Computed equivalent civilian salary levels to be approximately $16,000/yr. | Used questionnaires similar to Eaton, et al. (1985a,b). The payoff scale was the "worth to the Army" of the soldiers, considering "such factors as salary, output, responsibility and equipment" (p. 4). | Used the GLOBAL technique of S, et al. (1979), the Superior Equivalents Technique (EQV), and examined the "40-70% Rule." Only the 85th and the 50th percentiles were estimated. | 11B EQV: Mean = $12,881, SD = NR, SE = NR, % salary = 81%, % mean y = 81%; 11B GLOBAL: Mean = $9,774, SD = NR, SE = NR, % salary = 61%, % mean y = 51%; 19E EQV: Mean = $13,630, SD = NR, SE = NR, % salary = 84%, % mean y = 84%; 19E GLOBAL: Mean = $6,254, SD = NR, SE = NR, % salary = 39%, % mean y = 45%; 91B EQV: Mean = $16,720, % salary = 105%, % mean y = 105%; 91B GLOBAL: Mean = $9,132, % salary = 57%, % mean y = 51%; 63B EQV: Mean = $15,068, % salary = 94%, % mean y = 94%; 63B GLOBAL: Mean = $10,625, % salary = 66%, % mean y = 68%; 05C EQV: Mean = $16,653, % salary = 104%, % mean y = 104%; 05C GLOBAL: Mean = $11,150, % salary = 70%, % mean y = 61%
Eaton, Wing & Mitchell (1985) | Trainers/Supervisors of U.S. Army Tank commanders (N=40 and 48) [...] | The Superior Equivalents Technique (EQV) derives the number of 85th and 15th [...] | The difference between the median supervisor 85th and 50th percentile dollar-valued performance [...] | Global (SD$) method: Mean = $40,000, SD = $235,000 [...]
[...] | [...] | [...] | [...] | EQV, Crewman: Mean = $9,600, SD = NR, SE = NR, % salary = NR, % mean y = NR; GLOBAL, Transport: Mean = $7,000, SD = $7,000 (3rd-1st quartile), SE = NR, % salary = NR, % mean y = 47%; FEEDBACK, Transport: Mean = $6,000, SD = $12,000 (3rd-1st quartile), SE = NR, % salary = NR, % mean y = 40%; EQV, Transport: Mean = $10,000 [...]
Reilly & Smither (1985) | Graduate students (N=16) with prior management experience played a computerized management simulation. They were provided sales data on 10 representatives, based on 3 job components. | Job performance was measured through 3 job components: (1) selling estab. products, (2) selling new products, (3) expense control. The first and third were reported in dollars, and the second could be computed in dollars using a formula. In addition, information on variable cost levels was provided. | Used CREPID to obtain ratings of performance dimensions that could be compared to actual performance. Used the Schmidt, et al. method to obtain SDy estimates of estab. prod. sales, new prod. sales, net sales less expenses, and value of "overall products and services produced." | Sim. repeat sales: Mean = $1,093,641, SD = $170,119; Simulated new sales: Mean = $156,225, SD = $24,302; Simulated net revenue: Mean = $175,600, SD = $43,639; Schmidt, et al. repeat sales: Mean = $178,725, SE = $13,651, SD = $54,604, % salary = 357%; Schmidt, et al. new sales: Mean = $29,477, SE = $3,374, SD = $13,496, % salary = 60%; Schmidt, et al. net revenue: Mean = $119,605, SE = $57,773, SD = $231,092, % salary = 242%; Schmidt, et al. overall worth: Mean = $83,994, SE = $25,247, SD = $100,988, % salary = 170%; CREPID $ performance: Mean = $26,485, SE = $1,381, SD = $5,524, % salary = 54%, % mean y = 49%
Weekley, et al. (1985) | Supervisors of store managers (N=110) provided global SDy estimates as well as ratings for CREPID. Subjects worked for a convenience store chain. CREPID ratings were obtained for 805 store managers. | Global estimation was based on subjects' estimates of the "yearly value of the output produced" to the company. No reference to subcontracting was made. | Schmidt, et al. (1979) estimation was used for the global method. Standard CREPID method was also used, and both were compared to 40% of salary. | CREPID: Mean = $7,701, % salary = 36%; Schmidt, et al.: Mean = $13,968, % salary = 66%, % mean y = 51%; 40% of salary: Mean = $8,850
Burke & Frederick (1986) | Same as Burke & Frederick (1984) | Same as Burke & Frederick (1984) | Same as Burke & Frederick (1984) except that estimates were made that omitted the 97th percentile. | Standard (3 pt. est.): Mean = $35,192; Proced. A (3 pt. est.): Mean = $27,500; Proced. B (3 pt. est.): Mean = $28,151
Cascio & Ramos (1986) | Second-level managers provided CREPID ratings for 602 first-level [...] | Not reported, but assumed to be the standard CREPID notion of payoff to the [...] | Standard CREPID method, but the originally-derived estimate of $10,081 was adjusted for [...] | CREPID: Mean = $10,081, SE = NR [...]
DeSimone, et al. (1986) | Surveyed supervisors (N=27) of medical claim approvers in a large financial service company. | Similar to Schmidt, et al. | Similar to Schmidt, et al. (1979) | Mean = $3,871; SD = $1,765; SE = $334; % salary = 25%
DeSimone, et al. (1986) | Actual medical claims approver performance in a large financial services company. | Payroll cost reductions that could be achieved by having fewer approvers process a similar number of claims. | Used 12 monthly averages of claims processed per day for 176 approvers. Mean was 47, SD was 11.54, extrapolated to yearly mean and SD of 11,139 and 2,735. Salary and benefits per claim were $1.79. | Mean y = $19,939; SDy = $4,896; % salary = 31%
Schmidt, et al. (1986) | U.S. Government employees across levels GS-1 to GS-18. | Value of output as sold. | 40% of lowest 1984 salary level in each grade, averaged by weighting according to the number hired at each GS level. | Mean = $5,429; SD = $2,251
Day & Edwards (1987) | Estimated utility for 43 Account Executives using ratings from 34 supervisors, in a Midwestern U.S. transportation company. | Similar to Schmidt, et al. (1979) | Same as Schmidt, et al. (1979) (N=17 supervisors) | Mean = $161,471; SD = $252,639; SE = $61,274; % salary = 471%; % mean y = 80%
Day & Edwards (1987) | | Similar to Schmidt, et al. (1979) | Modified Schmidt, et al. (1979) by omitting the instruction, "In placing an overall dollar value on this output, it may help to consider what the cost would be of having an outside firm provide these products and services." (N=17 supervisors) | Mean = $180,382; SD = $248,153; SE = $60,186; % salary = 526%; % mean y = 80%
Day & Edwards (1987) | | "Worth in dollars of an employee's overall job performance". | % ROI method: (1) calculated | Mean y = $180,920
the "average annual investments" (sum SO, = $ 34,103 co of salary plus incenti ve pay plus % salary = 99% benefits, plus 40% "overhead", which ~%mean y = 19% equalled $65,280 per position. (2) had g; Had N=34 supervisors estimate the ~QQ (t> percent return on this investment (ROI) 3 correspond ing to each of the seven co performance appraisal scale points. (3) Applied these figures to each Account Executive based on their actual appraisal. Similar to Cascio (1982) CREPID method as described in Cascio Mean y = $ 45,230 (1982) SO, =$ 13,392 % salary =39% % mean y =30% Similar to Schmidt, et al. (1979) 40% of average salary SO, =$ 13,723 zco ..... ..... w Table 4. Results of Studies Estimating ffi Values (Continued) c:: §... Reference Setting Utilitv Scale Estimation Method ffi Values '< Day & Edwards Estimated utility for 107 Mechanical Similar to Schmidt, et al. (1979) Same as Schmidt, et al. (1979) Mean = $41,423 ~e:. (1987) Continued Foremen using ratings from 28 supervisors (N=13 supervisors) SD = $38,698 '< in a Midwestern U.S. transportation company. SE = $10,733 ~.V> % salary = 129% % mean y = 54% 0..,' (t:JSimilar to Schmidt, et al. (1979) Modified Schmidt et aI. (1979) Mean = $134,335 1) (N=15 supervisors) SD = $258,618 D.V> SE = $ 66,775 g' % salary = 417% V> % mean y = 65% S' "Worth in dollars of an employee's %ROI Mean y = $ 95,744 ::r:c:: overall job performance" SDy = $ 14,440 3 % salary = 45% !:::>J:> % mean y = 15% (110) Same as Cascio (1982) CREPID Mean y = $43,237 V>0 SDy = $11,988 (E); % salary = 37% (1) % mean y = 28% ~Same §as Schmidt, et al. (1979) 40% of salary SDy = $12,881 (J!:q>:> (1) Greer & Cascio Estimated the performance value Value of output as sold, similar Global Estimation Model using the Mean y = $31,979 3 (1987) of route salesmen (N=62) for a to Schmidt, et al. (1979) questionnaire-based procedure of SDy = $14,636 (1) .::JMidwestern U.S. soft drink company. Schmidt, et al. (1979), completed % salary 55% ..= by supervisors (N=29). 
% mean y = 46% "Contribution of labor". CREPID method (Cascio Mean y = $38,435 & Ramos, 1986). SDy = $8,988 % salary = 34% % mean y = 23% "Contribution Margin of "Cost-accounting" method that calculated Mean y = $44,985 salesmen" defined as the revenue less cost per unit sold, and SDy = $15,864 revenue less variable costs. multiplied by the quantity of units sold % salary = 60% &(1) by each salesman. % mean y = 35% ..... ..... .f>o. Table 4. Results of Studies Estimating ffi Values (Continued) CE:~: . Reference Setting Utility Scale Estimation Method ffi Values '< Mathieu & Supervisors of bank Similar to Schmidt, et aJ. The original distribution of SD, 5-Head Tellers. Trimmed ~Leonard (1987) employees (Head estimates was examined for non- Mean == $2,369 Tellers, Operations normality, which was assumed to ''1 to a questionnaire differed from normal for Branch % mean y == NR similar to that used Managers and Operations Managers tJ ('> by Schmidt, et al. and marginaHy for Head Tellers. ()per. Mgr.. Trimmed 0 Vi' The 85-50 distribution (SDy2) Mean == $3,123 g' differed from normality for SO NR == Operations Managers. So, the SE == NR '" authors trimmed a number of % salary 5'17% == "outliers" from each distribu- % mean y NR ::r: == ~tion,which normali7.ed them. :3 The average SD, was used. Branch Mgr.. Trimmed P;:J! Mean ==$10,064 SD NR i'ti== ('> SE ==NR 0 % salary '"==44% c,.:.., % mean y ==NR ("'> Head Teller, Untrim ~Median ~% ==$2,150 salary ==17% &; ('> :3 Oper. Ml!r.. Untrim g('> Median ==$3,250 % salary 18% == Branch Mgr. Untrim Median ==$10,000 % salary ==44% Rich & Boudreau Supervisors of computer programmers Similar to Schmidt, et aJ. Gathered estimates of the Mean ==$15,888 (1987) (N==29) in a computer manufacturing 15th, 50th and 85th percentiles, SO ::: $14,617 organization estimated the value of and used the average difference SE $ 2,761 '-rj== ~performance for computer programmers. as SDy. % salary ::: 60% (JQ(t> % mean y ::: 47% ...... ...... 
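The output-based DeSimone, et al. (1986) figures reported in Table 4 can be retraced with simple arithmetic. The sketch below is only a check on the numbers quoted in the table entry; the function name and the idea that the yearly figures come from multiplying the daily figures by a fixed number of days are assumptions made for illustration, not a description of the original study's procedure.

```python
def retrace_desimone_output_estimate():
    """Retrace the yearly dollar figures from the per-day claim counts.

    All inputs are the values quoted in the Table 4 entry; the implied
    number of days per year is inferred here, not reported in the table.
    """
    daily_mean_claims = 47.0       # mean claims processed per day
    daily_sd_claims = 11.54        # SD of claims processed per day
    yearly_mean_claims = 11_139    # extrapolated yearly mean (from the table)
    yearly_sd_claims = 2_735       # extrapolated yearly SD (from the table)
    cost_per_claim = 1.79          # salary and benefits per claim, in dollars

    # Assumption: the extrapolation multiplies daily figures by a day count.
    implied_days = yearly_mean_claims / daily_mean_claims

    return {
        "implied_days": implied_days,
        "yearly_sd_from_daily": daily_sd_claims * implied_days,
        "mean_y": yearly_mean_claims * cost_per_claim,
        "sd_y": yearly_sd_claims * cost_per_claim,
    }


if __name__ == "__main__":
    for name, value in retrace_desimone_output_estimate().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the dollar values round to the Mean y and SDy shown in the table, which suggests the dollar scale is simply claims per year times payroll cost per claim.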
Table 4. Results of Studies Estimating SDy Values (Continued)

Reference: Edwards, et al. (1988)
Setting: Directors of Sales, Regional Managers, and Field Personnel Managers (N=33) in a National manufacturing company estimated performance value for the job of District Sales Managers.
  (1) Utility Scale: See description for Burke & Frederick (1984), Procedure B.
      Estimation Method: Burke & Frederick (1984), Procedure B.
      SDy Values: Mean = $63,326; SD = $16,177; SE = $2,816; % salary = 174%; % mean y = 72%
  (2) Estimation Method: CREPID-O, followed CREPID using job components from Burke & Frederick (1984), applied to 33 District Sales Managers.
      SDy Values: Mean y = $42,002; SDy = $12,170; % salary = 33%; % mean y = 29%
  (3) Estimation Method: CREPID-AP, followed CREPID using job components from Burke & Frederick (1984), applied to 33 District Sales Managers, but used archival data on performance.
      SDy Values: Mean y = $33,475; SDy = $11,342; % salary = 31%; % mean y = 34%
  (4) Estimation Method: CREPID-AJ, followed CREPID using job components from Burke & Frederick (1984), applied to 33 District Sales Managers, but used archival job analysis data on activity frequency and importance.
      SDy Values: Mean y = $42,318; SDy = $11,160; % salary = 31%; % mean y = 26%
  (5) Estimation Method: CREPID-AA, followed CREPID using job components from Burke & Frederick (1984), applied to 33 District Sales Managers, but used archival data on performance and job analysis.
      SDy Values: Mean y = $38,293; SDy = $7,890; % salary = 22%; % mean y = 21%

As reported by Hunter & Schmidt (1982).


Table 5. Financial One-Cohort Entry-Level Selection Utility Model

Cost-Benefit Information            Entry-Level Computer Programmers

  Current Employment                4,404
  Number Separating                 618
  Number Selected (Ns)              618
  Average Tenure (T)                10 years

Test Information
  Number of Applicants (Napp)       1,236
  Testing Cost                      $10/applicant
  Total Test Cost (C)               $12,360
  Average Test Score (Zx)           .80 SD
  Test Validity (rx,y)              .76
  SDy (per person-year)             $10,413

Financial Information
  Variable Costs (V)                5%
  Tax Rate (TAX)                    45%
  Interest Rate (i)                 10%

Utility Computation

  Unadjusted Quantity = Average Tenure x Applicants Selected
                      = 10 Years x 618 applicants
                      = 6,180 person-years

  Unadjusted Quality  = Average Test Score x Test Validity x SDy
                      = .80 x .76 x $10,413
                      = $6,331 per person-year

  Adjusted Costs (After Taxes) = Test Costs - Tax Savings
                      = $12,360 - (.45 x $12,360), or .55 x $12,360
                      = $6,798

  Adjusted Utility    = Unadjusted Quantity x Unadjusted Quality
                        x Variable Cost Adjustment x Tax Cost Adjustment
                        x Discount Rate Adjustment - Adjusted Costs
                      = [6,180 x $6,331 x .95 x .55 x .614] - $6,798
                      = $12.55 million

Adapted with permission from: Boudreau (1988, Table 5).


Table 6. Entry-Level Selection Utility Decision With Financial/Economic Considerations and Employee Flows

Cost-Benefit Information            Entry-Level Computer Programmers

  Current Employment                4,404
  Number Separating (Ns)            618
  Number Acquired (Na)              618
  Average Tenure (T)                10 years

Test Information
  Number of Applicants (Napp)       1,236
  Testing Cost                      $10/applicant
  Total Test Cost (C)               $12,360/year
  Average Test Score (Zx)           .80 SD
  Test Validity (rx,y)              .76
  SDy (per person-year)             $10,413/yr.

Financial Information
  Variable Costs (V)                5%
  Tax Rate (TAX)                    45%
  Interest Rate (i)                 10%

Flow Information
  Analysis Period                   10 years
  Test Application Period           7 years
  Person-Years Affected             31,282

After-Cost, After-Tax, Discounted Utility Increase over Random Selection (Millions)
  Benefit - Cost = $54.32 - $.04 = $54.28

Adapted with permission from: Boudreau (1988), Table 6.


Table 7.
Entry-Level Recruitment/Selection Utility Decision With Financial/Economic Considerations and Employee Flows

Cost-Benefit Information            Entry-Level Computer Programmers

  Current Employment                4,404
  Number Separating (Ns)            618
  Number Acquired (Na)              618
  Average Tenure (T)                10 years

Test Information
  Number of Applicants (Napp)       1,236
  Average Test Score (Zx)           .80 SD

Financial Information
  Variable Costs (V)                5%
  Tax Rate (TAX)                    45%
  Interest Rate (i)                 10%

Flow Information
  Analysis Period                   10 years
  Test Application Period           7 years
  Person-Years Affected             31,282

Workforce Utility Results

  Staffing Variable                          Recruitment       Recruitment
                                             Advertising       Agency
  Test Validity (rx)                         .76               .60
  Testing Cost (Cs)                          $10/applicant     $10/applicant
  Recruitment Cost (Cr)                      $1,250/selectee   $2,225/selectee
  Average Applicant Service Value            $52,065           $60,000
  Average Applicant Service Cost             $36,445           $40,000
  SD of Applicant Value (SDy)                $10,413           $8,500
  Value of Random Selection (Millions)       $141.04           $180.50
  Cost of Random Selection (Millions)        -$4.55            -$8.10
  Value Added by Testing (Millions)          $54.32            $35.00
  Cost Added by Testing (Millions)           -$0.04            -$0.04
  Total After-Tax, After-Cost Discounted
    Workforce Value (Millions)               $190.76           $207.45

Adapted with permission from: Boudreau (1988), Table 7.


Table 8. Entry-Level Recruitment/Selection/Retention Utility Model With Financial/Economic Considerations

Cost-Benefit Information                     Entry-Level Computer Programmers

  Current Employment                         4,404
  Beginning Average Service Value            $52,065
  Beginning Average Service Cost             $36,445
  SD of Incumbent Service Value (SDy)        $10,413/person-year
  Number Separating (Ns)                     618
  Number Selected (Na)                       618
  Acquisition Cost                           $7,000/Hire
  Separation Cost                            $7,000/Separation
  Number of Applicants (Napp)                1,236
  Average Applicant Service Value            $52,065/year
  Average Applicant Service Cost             $36,445/year
  Average Test Score (Zx)                    .80 SD
  SD of Applicant Service Value (SDy)        $10,413/person-year
  Testing Cost                               $10/applicant
  Variable Costs (V)                         5%
  Tax Rate (TAX)                             45%
  Interest Rate (i)                          10%
  Analysis Period                            10 years

Workforce Utility Results

  Staffing Variable                  Option 1   Option 2   Option 3   Option 4
  Test Validity (rx)                 0.00       0.76       0.76       0.76
  Separation Effect                  $0         $0         $2,707     -$2,707
  After-Tax, After-Cost Discounted
    Work Force Value (Millions)      $200.31    $242.10    $351.69    $132.50

Adapted with permission from: Boudreau (1988), Table 8.


Table 9. Internal/External Recruitment/Selection/Retention Utility Decision With Financial/Economic Considerations

Cost-Benefit Information             Entry-Level Computer   Upper-Level Data System
                                     Programmers            Manager

  Current Employment                 4,404                  1,000
  Beginning Average Service Value    $52,065                $57,272
  Beginning Average Service Cost     $36,445                $40,000
  Number Separating                  618                    100
  Number Selected                    718                    0
  Number Promoted                    100                    100
  Acquisition Cost                   $7,000                 NA
  Separation Cost                    $7,000                 $8,000
  Promotion Cost                     NA                     $8,000
  Number of Applicants               1,436                  3,786
  Average Applicant Service Value    $52,065/yr.            1.10 times average Programmer value
  Average Applicant Service Cost     $36,445/yr.            1.10 times average Programmer value
  Average Test Score                 .80 SD                 2.32 SD
  SD of Applicant Value (SDy)        $10,413/yr.            $11,454/yr.
  Testing Cost                       $10/applicant          NA
  Assessment Center Cost             NA                     $1.44 million/yr.
  Variable Costs                     5%                     5%
  Corporate Tax Rate                 45%                    45%
  Corporate Interest Rate            10%                    10%
  Analysis Period                    10 years               10 years


Table 9.
Internal/External Recruitment/Selection/Retention Utility Decision With Financial/Economic Considerations (Continued)

Total Workforce Utility Results

                                           Options
  HRM Activity                       1         2         3         4         5
  Programmer Selection Validity      0.00      0.76      0.76      0.76      0.76
  Programmer Promotion Effect        $0        $0        $0        -$625     -$625
  Manager Promotion Validity         0.00      0.00      0.35      0.35      0.35
  After-Cost, After-Tax, Discounted
    Total Workforce Value (Millions) $249.86   $296.90   $302.51   $278.68   $198.38

Adapted with permission from: Boudreau (1988, Table 9).


[Figure 1. Diagram of External Movement Utility Model Concepts. Boxes A through H show the utility of the beginning workforce (t=0) as Quantity of Job Incumbents x Quality of Job Incumbents; the utility of additions in each period as (Quantity of Acquisitions x Quality of Acquisitions) - Transaction Costs of Acquisitions; the separations in each period as Quantity of Separations; and the utility of the workforce in each period as (Quantity of Acquisitions x Quality of Acquisitions) + (Quantity of Retentions x Quality of Retentions) - Transaction Costs of Acquisitions - Transaction Costs of Separations. The process continues in future time periods t=3...F.]

[Figure 2. Diagram of Internal-External Movement Utility Model Concepts. Boxes show External Acquisitions Into Job A; Job A's Beginning Workforce Utility (t=0) and Workforce Utility (t=1); External Separations From Job A; Internal Movement From Job A to Job B; Job B's Beginning Workforce Utility (t=0) and Workforce Utility (t=1); and External Separations From Job B. Workforce utility in each job combines the quantity and quality of external and internal retentions and acquisitions, less the costs of external and internal separations and acquisitions. The process continues in future time periods t=2...F.]

Figure 3.
Matrix of Research Issues in Utility Analysis

[Matrix figure. Columns, Type of Research Study (V through Z): Conceptual Extension, Empirical Simulation, Data-Based Demonstration, Inference, Theory Testing. Rows, Research Content (A through J), with the asterisks that survive for each row; their column positions are not recoverable.]

  A  Outcome Evaluation          * * *
  B  Human Resource Planning
  C  Recruiting                  * *
  D  Selection                   * * * *
  E  Training                    * * * *
  F  Compensation
  G  Internal Movement           * *
  H  Turnover and Layoffs        * * * *
  I  Performance Assessment      * * *
  J  HRM Decision Processes      * *
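The adjusted utility computation reported in Table 5 can be retraced numerically. The following is a minimal sketch, not the author's own procedure: the function names are hypothetical, and the discount rate adjustment of .614 is assumed here to be the average of the yearly discount factors 1/(1+i)^t over the 10-year tenure at i = 10%, an interpretation chosen because it reproduces the table's value. The variable cost adjustment is taken as 1 - .05 = .95 and the tax adjustment as 1 - .45 = .55, as in the table.

```python
def average_discount_factor(interest_rate, years):
    """Average of 1/(1+i)**t for t = 1..years (assumed source of .614)."""
    factors = [1.0 / (1.0 + interest_rate) ** t for t in range(1, years + 1)]
    return sum(factors) / years


def one_cohort_utility(n_selected, tenure_years, mean_test_score, validity,
                       sd_y, variable_cost, tax_rate, interest_rate,
                       total_test_cost):
    """After-tax, after-cost, discounted utility of one selected cohort."""
    quantity = tenure_years * n_selected            # person-years
    quality = mean_test_score * validity * sd_y     # dollars per person-year
    benefit = (quantity * quality
               * (1.0 - variable_cost)              # variable cost adjustment
               * (1.0 - tax_rate)                   # tax adjustment
               * average_discount_factor(interest_rate, tenure_years))
    adjusted_cost = total_test_cost * (1.0 - tax_rate)   # after-tax test cost
    return benefit - adjusted_cost


if __name__ == "__main__":
    # Inputs as listed in Table 5.
    utility = one_cohort_utility(
        n_selected=618, tenure_years=10, mean_test_score=0.80, validity=0.76,
        sd_y=10_413, variable_cost=0.05, tax_rate=0.45, interest_rate=0.10,
        total_test_cost=12_360)
    print(f"Adjusted utility: ${utility / 1e6:.2f} million")
```

Because the sketch carries unrounded intermediate values, its result differs slightly in the later digits from a hand calculation with the rounded figures in the table, but it agrees at the two-decimal precision the table reports.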