USF Libraries
USF Digital Collections

The impacts of the handoffs on software development


Material Information

Title:
The impacts of the handoffs on software development: a cost estimation model
Physical Description:
Book
Language:
English
Creator:
Douglas, Michael Jay
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla
Publication Date:
2006
Subjects

Subjects / Keywords:
COCOMO II
PSEstimate
Experiment
Design science
Dissertations, Academic -- Business Administration -- Doctoral -- USF
Genre:
bibliography (marcgt)
theses (marcgt)
non-fiction (marcgt)

Notes

Abstract:
ABSTRACT: Effective software cost estimation is one of the most challenging and important activities in software development. The software industry does not estimate projects well. Poor estimation leads to poor project planning, with resulting schedule overruns, inadequate staffing, low system quality, and many aborted projects. Research on software estimation is needed to build more accurate models of the key aspects of software development. The goals of the research in this dissertation are to investigate and improve the modeling of team size and project structures in current software estimation methods. Mathematical models for estimating the impacts of project team size and three variations of project structure are developed. These models accept the outputs of the COCOMO II software estimation tool, allow variation in both team size and project structure, and produce more detailed project estimates. This new extended model of COCOMO II is implemented in a decision support tool for software estimators called PSEstimate. Following the design science research paradigm, the artifact is evaluated in an experiment with experienced software project managers. Three treatment groups (a manual no-tool group, a COCOMO II group, and a PSEstimate group) completed two multipart software cost estimation tasks. The accuracy and consistency of the cost and schedule estimates, the participants' confidence in their estimates, and their satisfaction with and perceived usefulness of the cost estimation tool are measured. The experimental results support most of the hypotheses of the dissertation. For most tasks, individuals aided by computer-based decision support tools produce more accurate project effort estimates and are more confident in their estimates than manual estimators. There are no significant differences between the three groups on schedule estimation. A possible explanation is that experienced estimators in the manual group compensate for the inaccuracy of their effort estimates by adding time to their schedule estimates. The research contributions are new mathematical models for software estimation based on project team size and structure; a decision support tool (PSEstimate) that incorporates these models; and the experimental results, which demonstrate improvements in software estimation by experienced project managers when the new models and tool are applied in practice.
Thesis:
Dissertation (Ph.D.)--University of South Florida, 2006.
Bibliography:
Includes bibliographical references.
System Details:
System requirements: World Wide Web browser and PDF reader.
System Details:
Mode of access: World Wide Web.
Statement of Responsibility:
by Michael Jay Douglas.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains 185 pages.
General Note:
Includes vita.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001910277
oclc - 173307733
usfldc doi - E14-SFE0001692
usfldc handle - e14.1692
System ID:
SFS0026010:00001



Full Text


The Impacts of the Handoffs on Software Development: A Cost Estimation Model

by

Michael Jay Douglas

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Information Systems and Decision Sciences, College of Business Administration, University of South Florida

Co-Major Professor: Alan R. Hevner, Ph.D.
Co-Major Professor: Rosann Webb Collins, Ph.D.
Anol Bhattacherjee, Ph.D.
Kaushal Chari, Ph.D.

Date of Approval: May 8, 2006

Keywords: cocomo ii, psestimate, experiment, design science

Copyright 2006 Michael Jay Douglas

TABLE OF CONTENTS

LIST OF EQUATIONS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER 1 INTRODUCTION
    1.1 INTRODUCTION
    1.2 SOFTWARE COST ESTIMATION DIFFICULTIES
    1.3 SOFTWARE COST ESTIMATION MODELS HELP
    1.4 ATTRIBUTES OF A GOOD MODEL
    1.5 THE SOFTWARE HANDOFF
    1.6 THE SOFTWARE HANDOFF AND TEAM SIZE
    1.7 SOFTWARE HANDOFF AND PROCESS STRUCTURE
    1.8 INTER-GROUP COORDINATION
    1.9 RESEARCH QUESTIONS
    1.10 NEW SOFTWARE COST ESTIMATION MODEL
    1.11 RESEARCH PARADIGM
    1.12 CONTRIBUTIONS
    1.13 DISSERTATION FORMAT
CHAPTER 2 LITERATURE REVIEW
    2.1 INTRODUCTION
    2.2 COST ESTIMATION NEEDS
    2.3 COST ESTIMATION SOLUTIONS
    2.4 EMPIRICAL MODEL BUILDING
    2.5 COCOMO
    2.6 COCOMO II
    2.7 OTHER MODERN SOFTWARE COST ESTIMATION TOOLS
    2.8 NEW FINDINGS NOT ASSIMILATED INTO SOFTWARE COST ESTIMATION MODELS
    2.9 EMPIRICAL DATASETS
    2.10 OTHER VALIDATION APPROACHES
    2.11 CONCLUSIONS
CHAPTER 3 COST ESTIMATION IN COCOMO II
    3.1 INTRODUCTION
    3.2 PROJECT CHARACTERISTICS
    3.3 COCOMO II OUTPUTS
    3.4 MODEL TYPES
    3.5 EFFORT ESTIMATION
    3.6 SCHEDULE
    3.7 STAFFING
    3.8 COCOMO II OVERVIEW
CHAPTER 4 COMMUNICATION OVERHEAD
    4.1 INTRODUCTION
    4.2 COMMUNICATION OVERHEAD DEFINITION
    4.3 QUANTIFYING COMMUNICATION OVERHEAD
    4.4 COOPERATING PROGRAM MODEL COPMO
    4.5 COMMUNICATION OVERHEAD CONTRIBUTIONS
CHAPTER 5 EXTENDED ESTIMATION MODEL
    5.1 INTRODUCTION
    5.2 MODEL OVERVIEW
    5.3 EXTENDED EXAMPLE INFORMATION
    5.4 USING THE COCOMO II OUTPUTS
    5.5 MODELING THE WORK BREAKDOWN STRUCTURE IN PROCESS STRUCTURES
    5.6 MAPPING OF THE THREE-TIER PROCESS STRUCTURE
    5.7 MAPPING OF THE TWO-TIER PROCESS STRUCTURE
    5.8 MAPPING OF THE ONE-TIER PROCESS STRUCTURE
    5.9 POPULATING STAFFING INTO THE PROCESS STRUCTURES
    5.10 EFFORT CALCULATION
    5.11 THREE-TIER STRUCTURE
    5.12 TWO-TIER STRUCTURE
    5.13 ONE-TIER STRUCTURE
    5.14 STAFF LOADING
    5.15 OPTIMIZATION
    5.16 CONCLUSION
CHAPTER 6 DECISION SUPPORT TOOL
    6.1 EXAMPLE TEST RUN
    6.2 TOOL DISCUSSION
    6.3 TOOL CONSTRUCTION
CHAPTER 7 EXPERIMENTAL VALIDATION
    7.1 INTRODUCTION
    7.2 STUDY RATIONALE
    7.3 INSTITUTIONAL REVIEW
    7.4 RESEARCH QUESTION
    7.5 HYPOTHESES
    7.6 PRETEST
    7.7 PILOT TEST
    7.8 MAIN STUDY
    7.9 TRAINING
    7.10 EXPERIMENTAL TASKS
    7.11 EXPERIMENTAL TASK 1
    7.12 EXPERIMENTAL TASK 2
    7.13 POST EXPERIMENT QUESTIONNAIRE
CHAPTER 8 RESULTS AND DISCUSSION
    8.1 INTRODUCTION
    8.2 TREATMENT BREAKDOWN
    8.3 DATA ANALYSIS OVERVIEW
    8.4 EXPERT VALIDATION
    8.5 ACCURACY
    8.6 CONSISTENCY
    8.7 CONFIDENCE
    8.8 SATISFACTION AND PERCEIVED USEFULNESS
CHAPTER 9 CONCLUSIONS AND CONTRIBUTIONS
    9.1 INTRODUCTION
    9.2 CONTRIBUTIONS TO RESEARCH
    9.3 CONTRIBUTIONS TO PRACTICE
    9.4 LIMITATIONS AND KEY ASSUMPTIONS
    9.5 FUTURE WORK
APPENDICES
    APPENDIX A: KEMERER DATASET
    APPENDIX B: MERMAID-2 DATASET
    APPENDIX C: LINGO SCRIPT FOR TIER-THREE
    APPENDIX D: INSTITUTIONAL REVIEW BOARD APPROVAL
    APPENDIX E: EXPERIMENTAL MATERIALS
ABOUT THE AUTHOR

LIST OF EQUATIONS

EQUATION 2.1 EFFORT EQUATION
EQUATION 2.2 SIZE EQUATION
EQUATION 2.3 BASIC EFFORT EQUATION
EQUATION 2.4 BASIC COCOMO EQUATION
EQUATION 2.5 COCOMO EFFORT EQUATION
EQUATION 2.6 INTERMEDIATE COCOMO EFFORT EQUATION
EQUATION 3.1 ECONOMY OF SCALE EQUATION
EQUATION 3.2 NOMINAL EFFORT
EQUATION 3.3 SCHEDULE ESTIMATION
EQUATION 3.4 STAFFING EQUATION
EQUATION 4.1 COMMUNICATION PATHS FOR N PEOPLE
EQUATION 4.2 PREDICTION EQUATION FOR COMMUNICATION OVERHEAD
EQUATION 4.3 COPMO EQUATION
EQUATION 5.1 COMMUNICATION PATHS FOR N PEOPLE
EQUATION 5.2 PREDICTION EQUATION FOR COMMUNICATION OVERHEAD
EQUATION 5.3 EFFORT MULTIPLIERS DUE TO INTRA-GROUP COMMUNICATION
EQUATION 5.4 TIER-THREE EFFORT MAPPING EQUATIONS

LIST OF TABLES

TABLE 2-1 BASIC COCOMO CONSTANTS
TABLE 3-1 UNADJUSTED FP TO SLOC CONVERSION RATIOS
TABLE 3-2 SCALE FACTORS
TABLE 3-3 POST-ARCHITECTURE EFFORT MULTIPLIERS
TABLE 3-4 EARLY DESIGN EFFORT MULTIPLIERS
TABLE 3-5 PLANS AND REQUIREMENTS ACTIVITY DISTRIBUTION
TABLE 3-6 PRODUCT DESIGN ACTIVITY DISTRIBUTION
TABLE 3-7 PROGRAMMING ACTIVITY DISTRIBUTION
TABLE 3-8 INTEGRATION AND TEST ACTIVITY DISTRIBUTION
TABLE 3-9 WORK BREAKDOWN STRUCTURE FOR A MEDIUM SIZE PROJECT
TABLE 3-10 SOFTWARE COST ESTIMATION MODEL TYPES
TABLE 4-1 COMMUNICATION OVERHEAD PERCENTAGE AS A GIVEN TEAM SIZE
TABLE 4-2 COMMUNICATION PATHS ADDED TO COMMUNICATION OVERHEAD
TABLE 4-3 COPMO AND COMMUNICATION OVERHEAD
TABLE 5-1 PLANS AND REQUIREMENTS ACTIVITY DISTRIBUTION
TABLE 5-2 PRODUCT DESIGN ACTIVITY DISTRIBUTION
TABLE 5-3 PROGRAMMING ACTIVITY DISTRIBUTION
TABLE 5-4 INTEGRATION AND TEST ACTIVITY DISTRIBUTION
TABLE 5-5 PLANS AND REQUIREMENTS PHASE FOR A 40 KSLOC PROJECT
TABLE 5-6 COMPLETE WORK BREAKDOWN STRUCTURE FOR EXTENDED EXAMPLE
TABLE 5-7 WORK BREAKDOWN STRUCTURE MAPPING
TABLE 5-8 ADJUSTED WORK BREAKDOWN STRUCTURE
TABLE 5-9 EXAMPLE TEAM SIZES
TABLE 7-1 RANDOMIZING TO TREATMENTS
TABLE 7-2 RESEARCH MODEL
TABLE 7-3 EXPERIMENTAL TASKS OVERVIEW
TABLE 8-1 TREATMENT BREAKDOWN
TABLE 8-2 EXPERTS' RATINGS OF EFFORT AND SCHEDULE FOR TASKS
TABLE 8-3 RESULTS FOR TASK 1 FOR EFFORT
TABLE 8-4 RESULTS FOR TASK 2 FOR EFFORT
TABLE 8-5 BOOTSTRAP P-VALS FOR EFFORT
TABLE 8-6 RESULTS FOR TASK 1 FOR SCHEDULE
TABLE 8-7 RESULTS FOR TASK 2 FOR SCHEDULE
TABLE 8-8 BOOTSTRAP P-VALS FOR SCHEDULE
TABLE 8-9 WELCH'S ANOVA FOR EFFORT AND SCHEDULE
TABLE 8-10 BOOTSTRAP P-VALS FOR EFFORT
TABLE 8-11 BOOTSTRAP P-VALS FOR SCHEDULE
TABLE 8-12 ACCURACY RESULTS VS. EXPERT
TABLE 8-13 LEVENE TEST FOR EFFORT AND SCHEDULE
TABLE 8-14 PIVOT TABLE OF CONFIDENCE TYPE RESULTS
TABLE 8-15 RESULTS OF TOOL VS. NO TOOL FOR CONFIDENCE
TABLE 8-16 ITEM-TOTAL FOR SATISFACTION
TABLE 8-17 ITEM-TOTAL AND CRONBACH'S ALPHA FOR PERCEIVED USEFULNESS
TABLE 8-18 SATISFACTION AND TREATMENT MEANS
TABLE 8-19 BOOTSTRAP P-VALS FOR SATISFACTION AND PERCEIVED USEFULNESS
TABLE 9-1 COCOMO II SCHEDULE REDUCTION MULTIPLIER

LIST OF FIGURES

FIGURE 1-1 TYPICAL PROJECT RESOLUTIONS
FIGURE 1-2 THREE-TIER MODEL
FIGURE 1-3 TWO-TIER MODEL
FIGURE 1-4 ONE-TIER MODEL
FIGURE 1-5 RESEARCH MODEL
FIGURE 1-6 NEW SOFTWARE COST ESTIMATION MODEL
FIGURE 1-7 DESIGN SCIENCE RESEARCH MODEL
FIGURE 5-1 MODEL OVERVIEW
FIGURE 5-2 EFFORT BREAKDOWN FOR THREE-TIER
FIGURE 5-3 TWO-TIER EFFORT BREAKDOWN
FIGURE 5-4 ONE-TIER PROCESS STRUCTURE
FIGURE 5-5 THREE-TIER MODEL
FIGURE 5-6 TWO-TIER MODEL
FIGURE 5-7 ONE-TIER MODEL
FIGURE 5-8 EFFORT BREAKDOWN FOR THREE-TIER
FIGURE 5-9 TWO-TIER EFFORT BREAKDOWN
FIGURE 6-1 SCREENSHOT OF ESTIMATING SOFTWARE SIZE
FIGURE 6-2 SCREENSHOT OF DEVELOPED TOOL SIMULATION RESULTS
FIGURE 6-3 SCREENSHOT OF DEVELOPED TOOL AFTER OPTIMIZATION
FIGURE 7-1 DESIGN SCIENCE RESEARCH MODEL
FIGURE 8-1 EMPIRICAL RESEARCH MODEL

The Impacts of the Handoffs on Software Development: A Cost Estimation Model

Michael Jay Douglas

ABSTRACT

Effective software cost estimation is one of the most challenging and important activities in software development. The software industry does not estimate projects well. Poor estimation leads to poor project planning, with resulting schedule overruns, inadequate staffing, low system quality, and many aborted projects. Research on software estimation is needed to build more accurate models of the key aspects of software development. The goals of the research in this dissertation are to investigate and improve the modeling of team size and project structures in current software estimation methods.

Mathematical models for estimating the impacts of project team size and three variations of project structure are developed. These models accept the outputs of the COCOMO II software estimation tool, allow variation in both team size and project structure, and produce more detailed project estimates. This new extended model of COCOMO II is implemented in a decision support tool for software estimators called PSEstimate.

Following the design science research paradigm, the artifact is evaluated in an experiment with experienced software project managers. Three treatment groups (a manual no-tool group, a COCOMO II group, and a PSEstimate group) completed two multipart software cost estimation tasks. The accuracy and consistency of the cost and schedule estimates, the participants' confidence in their estimates, and their satisfaction with and perceived usefulness of the cost estimation tool are measured.

The experimental results support most of the hypotheses of the dissertation. For most tasks, individuals aided by computer-based decision support tools produce more accurate project effort estimates and are more confident in their estimates than manual estimators. There are no significant differences between the three groups on schedule estimation. A possible explanation is that experienced estimators in the manual group compensate for the inaccuracy of their effort estimates by adding time to their schedule estimates.

The research contributions are new mathematical models for software estimation based on project team size and structure; a decision support tool (PSEstimate) that incorporates these models; and the experimental results, which demonstrate improvements in software estimation by experienced project managers when the new models and tool are applied in practice.

CHAPTER 1 INTRODUCTION

When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the stage of science.
William Thomson (Lord Kelvin), 1891

1.1 Introduction

Software cost estimation remains an important unsolved practical problem in software engineering (Lewis 2001). Software cost estimation has failed, in most cases, to accurately predict the actual costs or the time needed to develop the system (Vijayakumar 1997). Project managers have the responsibility to make accurate estimations of cost and effort, but without good software cost estimation tools, the effectiveness of software project management is reduced (Agarwal, Kumar et al. 2001). A good software cost estimation model can significantly help software project managers make informed decisions on how to manage resources, control and plan a project, and deliver a project on schedule and on budget (Chen, Menzies et al. 2005). The problems in estimation are exacerbated by continued changes in software technologies; thus, software cost estimation models require constant modification to stay current (Jones 2002). Further research in software cost estimation is clearly needed.

In the United States, more than 250 billion dollars is spent each year on IT application development (The Standish Group 2003), but in 1994, only 16.2% of software development projects were completed both on-time and on-budget (Standish Group Inc. 1994). Almost ten years later, only 32% of projects are successful (The Standish Group 2003). Some companies can expect that a typical software development project will be delivered a year late at double the budget (Paulk 1995). Figure 1-1 illustrates typical project resolutions and highlights how late many projects are delivered: 22% of the projects in this data set took more than twice as long to complete as originally expected.

[Figure 1-1 Typical Project Resolutions (McConnell 2000): On-Time 26%; Less than 20% Late 6%; 21%-50% Late 8%; 51%-100% Late 9%; 101%-200% Late 16%; More than 200% Late 6%; Cancelled 29%]

Poor project planning and management results in companies taking a collective loss of $80 billion annually on new software projects that eventually get cancelled (King 1997).

Cancelled projects are especially problematic given that projects are commonly cancelled in the later stages of development, after significant resources have been expended on behalf of the project. The average cancelled project in the United States is about a year behind schedule and has consumed 200 percent of its expected budget by the time it is cancelled (Jones 1993).

A goal of project managers is to guide a software development project to completion. A successful software development project will deliver to the users a system desired by the customers. If the people who financially support the software development and the users of the software are satisfied, the system can be considered successful. The desired system can be characterized by parameters such as total system development cost, scheduled delivery date, functionality, and quality. The project manager needs to supervise the software development project so that the desired system is delivered. Reasonable estimates of cost, schedule, and staff are critical guides that help the project manager successfully control the software development activities.

1.2 Software Cost Estimation Difficulties

Estimating software development costs with accuracy is very difficult. The most common approach for improving software cost estimates is to use empirical models (Mukhopadhyay, Vicinanza et al. 1992). The predictive accuracy of software cost estimation models is not satisfactory: model-based estimates are generally within 25% of the actual cost or schedule only one-half of the time (Ferens and Christensen 2000). This means that more than one-half of estimates are off by more than 25 percent when comparing the actual versus the estimated metric.

When poor results are found using software cost estimation models, many researchers suggest calibrating the parameters of a model to a specific environment (Kemerer 1987; van Genuchten and Koolen 1991; Andolfi 1996). However, results from calibrating software cost estimation models show that the predictive accuracy does not always improve (Ferens and Christensen 1998).

The additional goal of being able to predict the costs and schedule at the beginning of the project can prove to be more challenging. "Early prediction of completion time is absolutely essential for proper advance planning and aversion of the possible ruin of a project" (Pillai and Nair 1997 p.485). Nevertheless, using the entire suite of available software cost estimation models, researchers find that there is no evidence that software models are effective at estimating projects at an early stage of system development (Vijayakumar 2002). Yet, estimation does not stop because it is inaccurate. Instead of using a model, software cost estimation continues to be most commonly conducted by experts, sometimes using a Bayesian approach to manage the uncertainty (Stamelos, Angelis et al. 2003).

McConnell suggests there is a lack of understanding as to what developing software means, and that the difficulty in creating good software cost estimation models is directly related to this lack of understanding:

The problems in developing today's and tomorrow's systems are overwhelming; they require many different types of problems to be solved. No other scientific or engineering discipline relies on a single technique for addressing problems, so why are we, so-called professional engineers (and computer scientists), stupid enough to think that our field is fundamentally different in this respect? So, what do we need to do? First, industrial management has to understand that software engineering is not an engineering discipline like so many others (yet) and that standards, methods, and tools are all likely to be wrong (once we really understand what developing software means) (McConnell 2000 p.17).

The current software cost estimation tools have not yet reached the level of accuracy required for proper advance planning. Research is needed to improve the understanding of software development and then use that knowledge to create better software cost estimation models.

1.3 Software Cost Estimation Models Help

A software cost estimation model provides a formal method for estimating software costs and schedule. Because of the lack of predictive validity, some project managers believe that using formal methods to estimate software costs is a wasted effort, and instead use intuitive judgment (Paulk 1995) or external sources such as senior project managers' desires for cost estimates (Agarwal, Kumar et al. 2001). Senior management's desired numbers are not usually based on the capabilities of the development staff, so estimates driven by them are prone to schedule and budget overruns.

Even with the predictive problems of software cost estimation models, the models prove to be better than any alternative method of estimation. For example, simple statistical models have been shown to be superior to human judgment, even though the statistical models were created by humans (Paulk 1995). One advantage of models over humans is that models give consistent answers for the same inputs. An incomplete model is better than no model at all; therefore, research is conducted to improve models rather than to fall back on other methods of estimation, such as expert opinion.

1.4 Attributes of a Good Model

There are three requirements for a software cost estimation model that will make accurate predictions of software effort and schedule. The first requirement is that the estimation model is built on a solid foundation of prior research and is empirically tested. Software cost estimation models have two problems here. First, "The domain of software effort estimation lacks a strong causal model based on deep principles and is situated within an often-changing, highly context-dependent task environment" (Mukhopadhyay, Vicinanza et al. 1992 p.156). Second, attempts at validating software cost estimation models have been largely unsuccessful (Mukhopadhyay, Vicinanza et al. 1992).

Since there is a lack of theoretical support describing the complicated process of how software development impacts software development costs, using historical data as a basis for software cost estimation is very insightful. Having an organization collect data during software development is the first step in trying to improve estimates. Boeing Information Systems used historical data and drastically increased the quality of its software estimates. Without historical data, the variance in effort ranged from -145% to +20%, whereas with historical data the variance was reduced to -20% to +20% (Vu 1997). Boeing Information Systems still encountered cost overruns, but moving from 145% to only 20% was a big improvement. By measuring and documenting the software development process, future estimates are based on empirical data rather than pure speculation.

The second requirement of a good estimation model is that development follows a repeatable process. A software development organization that follows a repeatable process is more mature, because a higher amount of discipline is instilled into its software development activities. The maturity of an organization's software process influences its ability to meet cost, quality, and schedule targets (Curtis 1992). In 1994, 75% of all software organizations did not use a disciplined approach to developing software. "The immature software organization is reactionary and managers are usually focused on fighting the fires that a more mature process might have prevented" (Curtis 1992 p.2). Having project managers react to contingencies, rather than plan and control the project, only makes project planning more difficult. Research has found that the inability to estimate software development accurately is the fault of an immature organization (Curtis 1992). The best predictor of cost in an immature organization is the capability of the staff (Paulk 1995); heroic efforts by individuals are needed for an immature organization to deliver projects within planned targets. Software cost estimation models therefore have limited use in immature organizations. However, the value of software cost estimation models increases as the organization becomes more mature. For that reason, it is not surprising that most high-maturity companies use cost models for their software cost estimation (Paulk, Goldenson et al. 2000).

The most common method available to project managers for increasing the quality of the organization's software development processes is the Capability Maturity Model. The Carnegie Mellon Software Engineering Institute's Capability Maturity Model (1995) (CMM) is a framework for improving the software development process based on the concepts of Total Quality Management and continuous improvement.

Research has shown that the predictability, control, and effectiveness of an organization's processes are significantly improved by adopting the CMM (Humphrey, Snyder et al. 1991; Lipke and Butler 1992; Dion 1993; Paulk, Weber et al. 1993). By adopting its key process areas, software development processes mature, allowing for an improvement in software development.

Another model of software process quality improvement is ISO 9001. ISO 9001 was created in 1987, at the same time as the CMM. The US Department of Defense sponsored the CMM, whereas the International Organization for Standardization in Geneva, Switzerland created the ISO 9001 model. ISO 9001, and more specifically ISO 9000-3, which governs the software development process, are commonly required of businesses that want to develop and sell software in the European Union. Both the CMM and ISO 9001 embody the philosophy, "To estimate the time and cost of next time, you must know and be able to repeat what you did last time" (Putnam and Myers 1997 p.105).

On August 11, 2000, a new process model, the CMMI-SE/SW Version 1.0, officially replaced the CMM. The Capability Maturity Model Integration (CMMI) was created to support process and product improvement, and to reduce the redundancy and eliminate the inconsistency experienced by those using multiple standalone models. The CMMI combines all relevant process models into one product suite.

The ISO 9001:2000 standard makes the preceding ISO 9001 standards obsolete. Organizations that are ISO 9001 compliant have to update their quality systems and be recertified under the new ISO 9001:2000 standard to conduct business in the European Union.

The continual improvement of process models highlights the importance of having a repeatable process. As process models continue to improve, software development estimation can advance with them.

A third method of process improvement that can be applied to software is Six Sigma. A process that has achieved Six Sigma will produce no more than 3.4 defects per million opportunities (Harry and Lawson 1992).

The third requirement for a good estimation model is that the model includes relevant factors that vary with project metrics. This dissertation argues that two relevant factors, process structure and inter-group coordination, are missing from current software cost estimation models.

1.5 The Software Handoff

To advance software cost estimation, models must include one major activity of software development: the software handoff. A software handoff can explain differences in inter-group coordination between different process models. This chapter introduces the software handoff, and this dissertation will explain how it affects software development.

A software handoff occurs when one person or group's software-development-lifecycle work-product output is given to another person or group as input to another work product. Examples of a software handoff include the analysts' requirements document being given to the designers, the designers' system design being given to the programmers, and the programmers' code being given to the testers. Unless one person comes up with the idea for a system, creates the requirements, designs the system, implements the design, tests the code, and uses the final system, a software handoff will occur.

The term handoff invokes an analogy to both football and air traffic control. When an airplane moves from one controller to another, it is "handed off" to the next responsible controller. The term handoff is also used in wireless networking when a call moves from one cell tower to another because of movement of the wireless device. With a software handoff, an artifact moves from one person or group to another.

The software handoff creates a potential communication problem in software development. A software handoff can be thought of as an information flow. "It is clear that information flow impacts productivity (because developers spend time communicating) as well as quality (because developers need information from one another in order to carry out their tasks well)" (Seaman and Basili 1997 p.550). "Communication problems occurred in the transition between phases when groups transferred intermediate work products to succeeding groups" (Curtis, Krasner et al. 1998 p.1281). The software handoff is one of the culprits of communication problems during software development.

The software handoff is a process loss and leads to inefficiency, but software handoffs can be anticipated during development and can be managed. Properly managing the handoff will increase efficiency. The effects of the software handoff are most commonly seen in integration testing, when rework is needed to fix misunderstandings caused by communication problems during development. Since the handoff is required for all large systems, proper management is required. Software handoffs also have different magnitudes: handing off 100,000 lines of code is a large handoff compared to handing off only 1,000 lines of code.

Some software development processes, such as a project that has many different specialized groups all working together, have more handoffs than other processes. The number of handoffs in a project can be controlled by the way the project team is structured. If an analyst does both requirements definition and design, this eliminates the handoff of the requirements document to the design group.

Handoffs are otherwise unavoidable during software development: any software development process that requires coordination between groups is going to have software handoffs. More interfaces mean more software handoffs, and bigger software development projects are going to have bigger software handoffs. The amount of information that must be communicated in the handoff is another aspect of the software handoff.

Different software development projects are going to need different process structures, based on the size of the project, the number of people working on development, and the amount of schedule time available to complete development. Creating an order-entry website will probably not need the same process structure that a large military project needs for system development.

1.6 The Software Handoff and Team Size

Up to a point, a larger team allows more work to be done in a given amount of time. However, as teams get larger, the complexity of the software handoff grows. At some point, creating a bigger team will no longer be efficient. There exists an equilibrium point that maximizes the efficiency of the work to be done.

The team size of a project group will affect the software handoff. Twenty people handing over an artifact to twenty other people is different from one person handing an artifact to one other person, even if the artifacts are the same. Splitting development tasks between more teams requires more handoffs, but the handoffs are smaller. For every software development project, the process of development will dictate a process structure, and the process structure will dictate the number of handoffs.

1.7 Software Handoff and Process Structure

"A software group should have between five and eight members. The overall design should be portioned into successively smaller chunks, until the development group has a chunk of software to develop that minimizes intra-group and inter-group communications. The chunks should be designed to hide difficult design decisions" (Simmons 1991 p.461).

Since Simmons suggests separating difficult design decisions, the V-Model of software development is used to partition the activities of software development into different groups. This dissertation details three different structures based on the V-Model, with each structure having a different amount of partitioning.

This dissertation will study the impact of the software handoff on three different types of software development process structures. The first structure is the Three-Tier model shown in Figure 1-2, where the boxes represent different development groups. In the three-tier structure there are requirements, design, implementation/unit test, integration test, and customer acceptance groups. Each group has a variable number of team participants, with the minimum number being one.

[Figure 1-2 Three-Tier Model: Desired System → Requirements → Design → Implementation/Unit Testing → Integration Testing → Customer Acceptance → Delivered System]

Figure 1-3 shows the Two-Tier model. In this model, the requirements and design teams are combined to form the analysis and design team. The customer acceptance and integration test teams are also combined, to form the integration/customer acceptance team. Reducing from five to three groups allows for a reduction in software handoffs.
[Figure 1-3 Two-Tier Model: requirements/design, implementation/unit testing, and integration testing/customer acceptance groups transform the desired system into the delivered system.]

Figure 1-4 shows a One-Tier model. In this model, all system development activities take place in one group. There is no formal software handoff, but there is very little process to organize complexity. Also, for large groups with the same number of staff, communication costs are higher in the One-Tier model than in the other structures.

[Figure 1-4 One-Tier Model: a single all-systems-development group transforms the desired system into the delivered system.]
1.8 Inter-group Coordination

Inter-group coordination is a CMM Level 3 key process area. According to the CMM (Paulk, Weber et al. 1993), inter-group coordination is used to establish a means for the software engineering group to participate actively with other engineering groups so the project is better able to satisfy the customer's needs effectively and efficiently. Examples of engineering groups that need to be coordinated with customers and end-users are: software engineering, software estimating, system test, software quality assurance, software configuration management, contract management, and documentation support. Communication problems during software development should be addressed by inter-group coordination. Inter-group coordination includes the technical working interfaces and interactions between groups. The software handoff is a way to understand inter-group coordination.

Inter-group coordination is planned and managed to ensure that quality and integrity exist throughout the entire software development process. To satisfy the requirements of the inter-group coordination key process area, the system under development must be measured to ensure proper inter-group coordination. Examples of measurement activities include measuring the actual effort and other resources expended by the software engineering group in support of other engineering groups, and measuring the actual effort and other resources expended by the other engineering groups in support of the software engineering group.

One example of an inter-group coordination activity is when representatives of the project engineering groups conduct periodic reviews and
interchanges. These interchanges are software handoffs. By studying software handoffs, more can be understood about software development.

1.9 Research Questions

This dissertation is based on three research questions. The research questions guide this dissertation through the nine chapters.

Research Question 1: Can a software cost estimation model be built that reflects the effect of both inter-group coordination and intra-group communication?

Research Question 2: Can a software cost estimation tool be built for project managers that implements inter-group coordination, intra-group communication, and process structure?

Research Question 3: Does an experiment demonstrate the effectiveness of the new software cost estimation model?

Figure 1-5 displays the research model used in this dissertation. The research model is derived from the research questions. Three types of model support will be studied: first, a baseline in which no support is given; second, support for project size; and third, a model that provides support for inter-group coordination and software handoffs. The experimental effectiveness of the estimation model is measured by five variables: accuracy, consistency, confidence, satisfaction, and perceived usefulness.
[Figure 1-5 Research Model: the method of estimation (no model; state-of-the-practice model (COCOMO II); state-of-the-practice model that includes the effects of inter-group coordination and intra-group communication) is related to the accuracy, consistency, and confidence of the software estimates and the satisfaction with and perceived usefulness of the estimation technique.]

1.10 New Software Cost Estimation Model

Figure 1-6 shows the new software cost estimation model that has been developed. To estimate a project, some basic information about the project is needed. These project characteristics focus on describing the size of the project. The first characteristic is system size. In addition, any factor that can make the project easier or harder to conduct also needs to be quantified. The effort multipliers and scale factors are the methods this model uses to quantify these different difficulties.
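To make the structure of these inputs concrete, the sketch below models them as a small data type. This is illustrative only; the field names are hypothetical, and PSEstimate's actual interfaces are described in Chapters 5 and 6.

    from dataclasses import dataclass, field

    @dataclass
    class ProjectCharacteristics:
        """Illustrative container for the inputs described above (names hypothetical)."""
        ksloc: float                         # system size in thousands of source lines
        scale_factors: dict = field(default_factory=dict)       # e.g. {"PREC": 3.72}
        effort_multipliers: dict = field(default_factory=dict)  # e.g. {"RELY": 1.10}

    # Example: a 40 KSLOC project with all ratings left at their defaults.
    project = ProjectCharacteristics(ksloc=40.0)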
The project characteristics are then entered into the COCOMO II algorithm. COCOMO II returns an effort estimate in man-months, a schedule estimate in months, and a detailed work breakdown structure that quantifies how much effort each particular software development activity will need. Next, different process structures with configurable team sizes are used to produce a modified effort and schedule estimate. A new measure, staff loading, was also created. This measure represents the percentage of time that the groups in the two-tier and three-tier processes are assigned to tasks.

[Figure 1-6 New Software Cost Estimation Model: project characteristics (system size, scale factors, effort multipliers) feed the COCOMO II outputs (effort, schedule, work breakdown structure); these outputs, combined with the software development process structure (one-, two-, or three-tier), the team sizes (requirements, design, implementation, integration testing, and acceptance testing teams), inter-group coordination, and intra-group communication, produce the new estimation model's outputs (effort, schedule, staff loading).]

1.11 Research Paradigm

Information Systems research can be broken down into two complementary paradigms. The first paradigm is behavioral science. A goal of the behavioral science
paradigm is to develop and verify theories of individual and organizational behavior. The behavioral science paradigm follows a natural science orientation in which researchers measure the naturally occurring or evoked behavior of individuals, groups of people, and organizations. Individuals, or groups of individuals working together to form an organizational unit, are typically the unit of analysis in the behavioral science paradigm. Managerial and organizational issues are studied in this paradigm.

Managerial and organizational issues are important; however, the technological aspects of IS are equally important. The behavioral science paradigm does not work well when applied to the technological aspects of IS. For example, an efficient way to store and retrieve data does not occur in nature. A researcher cannot just study individuals to extract methods to efficiently retrieve data. A different approach is needed.

The second paradigm in Information Systems is the design science paradigm. The design science paradigm stresses "design" as an approach to create knowledge. The late Herbert Simon's The Sciences of the Artificial (1969) explained the importance of design. Studying artificial (man-made) objects rather than natural objects or phenomena can solve many problems that a behavioral approach cannot. For example, instead of studying individuals to find a way to efficiently store and retrieve data, designing a system will produce much more knowledge. In design science research, the artifact is important. In Information Systems, modeling, building, designing, and implementing an artifact can create knowledge. Figure 1-7 shows the design science paradigm model that this dissertation is based upon.
[Figure 1-7 Design Science Research Model (Hevner, March et al. 2004)]

This research is conducted under the design science paradigm. The design science paradigm has two fundamental goals. The first goal is the construction phase, where artifacts are produced to solve a specific problem. The second goal is the evaluation phase, where the produced artifacts are evaluated. A project management tool is developed in the construction phase. The tool instantiates the research model depicted in Figure 1-6. During the construction phase, rigor is applied by using prior research and tools. Relevance is applied in the construction phase by addressing current problems that organizations have with current software cost estimation models.

The evaluation phase is conducted by testing the developed project management tool. An experimental design is used to show that the model developed improves estimation in a laboratory setting.
1.12 Contributions

Improving the process of developing software not only makes the organization more mature, but can also lead to cost savings (Fenton 1993). By introducing the concept of a software handoff and its effect on software development, better processes can be devised. Utilizing better processes will lead to more mature organizations. By building a cost estimation model that includes process structure and team size, better estimates can be produced by software cost estimators.

For software development managers, the new software cost estimation model provides a better model than any currently available. The project management tool that implements the new software cost estimation model can be used to support improved estimation. By helping a project manager efficiently manage the software handoff, project management is improved.

The contributions improve not only organizations but also the knowledge base of software development. By modeling the software handoff, the impact of inter-group coordination on software development can be described and studied in greater detail than previously possible.

1.13 Dissertation Format

The format of the remainder of the dissertation is as follows. Chapter 2 provides a detailed literature review on the field and progress of software cost estimation. The goal of this chapter is to show the progress made and the problems encountered in software cost estimation. Chapter 3 details the COCOMO II cost estimation model. COCOMO II represents the state of the practice in software cost estimation. From this chapter, an
understanding of how to estimate software projects is presented. Chapter 4 addresses the conceptual development of communication overhead. Communication is an important aspect of software cost estimation that is missing from current software estimation models. This chapter presents the theoretical and mathematical development of inter-group coordination and intra-group communication. Chapter 5 details an extended software cost estimation model. This new extended model builds on COCOMO II as presented in Chapter 3 and the communication overhead discussion presented in Chapter 4. At the end of this chapter, the new extended model is introduced in a tool for project managers called PSEstimate. Chapter 6 shows the tool PSEstimate in use and the types of problems it can solve. Chapter 7 presents the experimental validation of the new extended software cost estimation model, and in this chapter the experimental design is outlined. Hypotheses are presented based on the research questions introduced in Chapter 1. Chapter 8 presents the empirical results from the experimental validation and a discussion of these results. Chapter 9 concludes the dissertation and presents the contributions of this work.
CHAPTER 2
LITERATURE REVIEW

Only in software do people cling to the illusion that it's OK to come up with estimates of the future, even though you've never measured anything in the past.
Tom DeMarco (Brady and DeMarco 1994)

2.1 Introduction

With the invention of the electronic computer circa 1945 and the first high-level programming language, FORTRAN, circa 1955, people wanted to know the cost of developing a software project. The problem of software cost estimation became relevant around 1975 when software development methodologies emerged (Nemecek 2001). This chapter reviews the relevant literature on software cost estimation. Based on the literature, three points concerning the state of the art of software cost estimation will be made: there is a lack of a theoretical framework for estimation; very limited progress has been made in estimation; and drastic changes in modeling are needed to improve estimation.

The first point is that software cost estimation is plagued by a lack of a theoretical framework. Without a theoretical framework, the causes of cost in a software project are difficult to verify. A theoretical framework enumerates the important metrics that need to be collected for software cost estimation models. "Even today, industry surveys indicate only about 25 percent of application development organizations have a formal metrics program" (Yourdon 1994). Because a theoretical framework is lacking, construct
development is not conducted with software cost estimation measures. Instead of properly developing constructs from a theoretical framework, significant correlations from statistical methods are used in software cost estimation models. By measuring many variables and using a "shotgun" approach, where correlations are run between all variables to see if any are significant, eventually some variables will be found to have significant correlations even though the relationships might only be spurious. In addition, it is unclear whether the significant correlations found are artifacts of violating the assumptions of a particular statistical method.

The second point is that very little progress has been made on the problem of devising high-quality software metrics that model cost. Software development is a very difficult task to understand; estimating software costs is even more difficult. Software cost estimation models are not much better today than they were over 20 years ago. We are rarely able to predict accurately the cost of any software development project (Nemecek 2001). Some researchers claim that "no prediction technique has proved consistently accurate, even when we relax the accuracy criterion to merely require that a technique generates useful predictions" (Kadoda, Cartwright et al. 2000).

Software engineering has seen a shortage of competent software developers combined with an increasing amount of work to be done; this phenomenon is commonly called the "software crisis" (Amoroso and Zawacki 1992). Much of the work on software cost estimation follows the work done to solve the "software crisis" problem. Many attempts have been made to increase software developers' productivity, but theoretical frameworks to explain productivity are rare. With the lack of a theoretical framework, empirical evidence is
sometimes ignored. Solid empirical findings, such as the increase in productivity that can be realized by giving software developers an office with at least 90 square feet (DeMarco and Lister 1999), are rarely used in practice (Jones 1988). Software cost estimation depends on the subfield of software engineering concerned with software measurement and the metrics developed. Unfortunately, software measurement has a very poor empirical knowledge base because of inappropriate or inadequate use of measurement. Many empirical findings are suspect mainly "because of their poor experimental design and lack of adherence to proper measurement principles" (Fenton 1993 p.141). Even though much work has been done to improve software development, very little progress in the way of practical or theoretical contributions has actually enhanced software cost estimates.

The third point is that the field of software cost estimation will never mature unless drastic changes are applied. The field of software engineering, or software development, is different from all other fields. After over 25 years of work on software cost estimation, most estimates are, at best, guesses. The popular advice to take the best software cost estimate and double it shows the difficulty of using an atheoretical approach to software cost estimation. Nemecek (2001) tells the story of a project manager who, after winning a large software development contract, was asked how the estimate for effort was derived. The project manager summed the worst-case estimates for the project and then multiplied the effort by 400%. When the project was completed, it still ran over budget (Nemecek 2001). A drastic change in software cost estimation will force a change to using a theoretical approach to software cost estimation. Researchers
claim that a simple theoretical framework was shown to be better than the most popular software cost estimation model (Smith, Hale et al. 2001).

It takes a diverse skill set to provide solutions to the software cost estimation problem. Competency in three main fields, software engineering, management, and statistics, is needed. According to Jones, universities do not properly prepare their graduates for immediate assimilation into commercial software development. About one year of remedial training and $15,000 to $25,000 in training must be spent before an entry-level graduate software engineer can be entrusted with commercial-grade software projects in a major company. At the same time, the curriculum of software managers lags 5 years behind the state of the art (Jones 1998). With neither new graduates nor software managers having up-to-date knowledge of software cost estimation, champions' support for a strong estimation program is difficult to achieve.

Since there is such a long learning curve for both entry-level graduates and software managers, tools that provide decision support are needed. Managers will need state-of-the-art tools to help them manage their jobs. Furthermore, research has shown that tools that explain "why" or provide cognitive support for an answer are preferred over tools that just provide the outcome solution (Sengupta and Abdel-Hamid 1993). Software managers prefer the cognitive support that theoretical models can provide.

2.2 Cost Estimation Needs

Software cost estimation tools are needed to help manage all but trivial system development projects. An accurate estimate can be used by management to support estimating the cost of a proposed new system, perform design-to-cost analysis, schedule the personnel and resources needed throughout development, and monitor the
progress of the project (Adrangi 1987; Cover 1988). Since capital to invest in software development projects is scarce, companies prioritize development projects with some sort of cost/benefit analysis. A valid cost estimate will allow a company to develop the best software given a limited amount of capital. Scheduling personnel and resources is an important activity for software cost estimation tools. Knowing how many people will be needed and the amount of time required to develop the project allows management to provide the resources required to develop the software. Monitoring the progress of the project is important for knowing whether the project is on track or falling behind schedule.

Valid software cost estimates also allow other parts of a business to be more productive. Sales and marketing need estimates of when the project will be completed in order to be effective. Many times software is marketed but no product ever ships. Manufacturing delays cause an inefficient use of time for many people in an organization. Having valid software cost estimates is important to an organization. Software that is late, over budget, or of poor quality creates a major distraction for an organization that develops software. Even very successful companies have problems delivering large software projects on time and on budget. The only real solution is to have organizations use valid software cost estimates.

Even though cost estimation tools are needed, the solution is not simple. It is difficult to extract the variables that influence effort. While it may seem simple to measure the productivity of a team and assume the team will have that same
productivity on other projects, it has been shown that factors beyond the control of a software development team have a significant impact on the team's productivity (Leavitt 1977). As the same team moves from project to project, two different productivities will be seen.

In 1979, Larry Putnam considered software cost estimation an "intelligent guessing game" and warned against software pitfalls such as cost overruns, schedule slippages, and interdepartmental communication breakdowns. He said that poor project estimation is one of the major problems in software development and attributed these failures to the fact that software management and development is a science still in its infancy (Scannell 1979). In 1988 and 1989, software was still being delivered late, over budget, with poor quality and missing features; an empirical study was therefore conducted to find out why, and the major problem was underestimation of effort (van Genuchten 1991). Tasks were found to be more complex than initially estimated; therefore, frequent budget and schedule overruns were common. Another study found that software managers fail to learn from their mistakes by continuing to undersize software (Abdel-Hamid and Madnick 1989). Today, cost overruns and late software are still common. To a point, the "intelligent guessing game" continues.

2.3 Cost Estimation Solutions

Software cost estimation lacks a strong theoretical foundation. Practitioners rather than researchers are leading the work conducted in software cost estimation. When the task is to create an estimate for a particular company, theory is not considered. By reviewing the history of software cost estimation, many potential problems that are
difficult for people in the field to solve can be addressed using a theoretical foundation.

Software project cost estimation started with the understanding that the bigger the project to be developed, the more effort and the longer it would take to develop. Managers assumed that the productivity rates of programmers were constant. Software development was thought to be linear: to do twice the work, you need twice the time. Therefore, the size of the system needed to be estimated, and using the productivity rates of the programmers, the schedule and the number of people needed to develop the system could be calculated. If the schedule needed to be shortened, more programmers were added to the project. Brooks showed, though, that effort and schedule cannot be directly interchanged (Brooks 1975).

Putnam wrote that the phenomenology of the software development process is not known, but data suggests a clear time-varying pattern (1978). Norden (1970) applied Lord Rayleigh's distribution to describe the projected labor needed during the stages of hardware development. Putnam applied Norden's concepts to software development (1978). Putnam's Software Equation was the result of the Rayleigh curve applied to software development and is summarized as follows: "It has been discovered that there is a fundamental relationship in software development between the number of source statements in the system and the effort, development time, and the state of technology being applied to the project" (Putnam and Fitzsimmons 1979).

The Programmed Review of Information for Costing and Evaluation – Software (PRICE S), developed in 1977 by Martin Marietta Price Systems, was the first complex
commercially available software cost estimation tool. PRICE S is a proprietary cost estimation model developed by Lockheed Martin. To use this cost estimation model, a company has to hire a Lockheed Martin consultant to conduct the cost estimation. Government agencies such as NASA, the IRS, the U.S. Air Force, the U.S. Army, and the U.S. Navy, as well as private companies, have used PRICE S (NASA 2002 p. 35).

TRW Defense and Space Systems Group wanted a software estimation model that was developed with a well-defined set of criteria. In addition, the cost estimation model was required to be related to actual software project dynamics, with the majority of the cost model not based on poorly calibrated subjective factors (Boehm and Wolverton 1980). From this development work, COCOMO (Boehm 1981) was designed. By using a database of metrics built from software development projects, a regression was conducted to relate project size with project effort.

Cost estimation of software development and control of cost during development were cited as being difficult because of a lack of useful cost history figures; therefore, a software-cost database was developed to support cost estimation (Dekker and van den Bosch 1983). A software metric based on the sum of the number of files, flows, and processes in the system was found to be valid and reliable for a database of 20 different systems (van der Poel and Schach 1983). In this study, the researchers attempted to show that the cost of developing a system is directly proportional to its size, that the size and cost of software can be accurately estimated early in the software development process, and that the size of the software and the cost can be used to determine the efficiency of the development process.
Boehm's book on software engineering economics detailed five different software cost estimation techniques, including algorithmic cost estimation, expert judgment, cost estimation based on previous experience, price-to-win cost estimation, and top-down/bottom-up costing (Boehm 1984). The argument made was that it is important to take an economics-based perspective on software engineering. By applying an economics-based perspective to cost estimation, the technical aspects of a software project can be analyzed in relation to the resource constraints that characterize the software engineering environment. Therefore, by way of the duality principle (Musgrave and Rasche 1977), a better estimate will be found than by looking at technical aspects or resource constraints alone. Cost estimation models not built with the duality principle in mind have a weakness: spurious correlations when using regression. Research such as (van der Poel and Schach 1983) and (Dekker and van den Bosch 1983) that is not based on the duality principle provides little empirical evidence, because the promising results are probably based on spurious correlations.

Very shortly after Boehm argued for an economics-based perspective on software engineering, a study that looked at both the resources and the workload of a system was published (Italiani 1984). Italiani analyzed the performance of a software staff based on "conventional experience," "relative capacity," and a new construct, the "working environment quality coefficient." By creating a workload matrix involving development activities, Italiani created a theory, the productive capacity of a software development system, to support software cost estimation. Unfortunately, the impact of this work is limited.
Even with the new economics-based perspective on cost estimation, the backlog of software development projects was steadily increasing, with cost overruns and schedule slippages costing companies real money. There were no standardized or reliable methods for cost estimation and project control; therefore, a better understanding of process was thought to be the answer (Raja 1985). Raja explains how the Rayleigh model of software development can be effectively used for cost control and project management. By combining concepts from statistics, performance evaluation, the critical path method, and software engineering, project size can be estimated as a function of total project effort and development time. Before this, effort was always the dependent variable, with size being the independent variable, as shown in Equation 2.1. Raja made software size the dependent variable, with effort and development time being independent variables, as shown in Equation 2.2. Through his modeling, Raja asked the question: with a given amount of effort, what sized projects could be built?

$Effort = \beta_0 \cdot Size^{\beta_1}$

Equation 2.1 Effort Equation

$Size = \beta_2 \cdot Effort^{\beta_3}$

Equation 2.2 Size Equation

A study (Kitchenham and Taylor 1985) was done to determine the effectiveness of Putnam's Rayleigh curve model and COCOMO with 33 software development projects. Kitchenham found that neither the Putnam nor the COCOMO model adequately
fitted the data when looking at software size, effort expended, and the time required for development.

By 1985, several cost estimation models had been proposed, but very little external empirical validation was successfully completed on any of them. Modern systems were becoming more software intensive, with software development squarely on the critical path for system delivery. Other areas of software development were maturing, but cost estimation made little progress. Software development is not a repetitive task like building an automobile; software is developed rather than built. Therefore, traditional experimental methods, which are common in agriculture and assembly-line production, are very difficult to conduct with software development. Without experimental methods, it is hard to verify cause and effect during software development. Instead of developing and testing theories, best practices were used. Sharing best practices in software cost estimation allowed the field to progress slowly.

Many of the failures of software cost estimation have been caused by the difficulty of measuring a software development system (Verner and Tate 1987). With size being the major variable describing a software development system, and with the difficulty of measuring system size accurately, failure in software cost estimation can be understood. The usual way of measuring system size is lines of code (LOC). A popular quip summarizes the inadequacy of using lines of code as a measure of software size: "To estimate software development costs on the basis of LOC is analogous to estimating home construction costs based on the number of nails or bricks to be used" (Callisen and Colborne 1984). Moreover, lines of code is a poor measure because programmers
can easily manipulate the metric. Function Point Analysis (FPA) is one attempt to solve the sizing problem in software development. Some cost estimation models use different methods of sizing to mitigate the weaknesses of using lines of code as a size metric. Function points are an improvement over lines of code, but fundamental flaws in the construction of function points prevent them from being valid measures (Kitchenham, Pfleeger et al. 1995; Kitchenham 1997).

Another method to address the software-sizing problem is to calibrate the software estimation model. By using historical data, which may or may not reliably measure software size, effort, and development time, better estimates were thought to be possible. The PRICE S model had a formal, established methodology for calibrating productivity indexes with historical data. With this methodology, the organization was calibrated rather than the model (Park 1988). The advantages of calibrating a software cost estimation model were shown by (Cuelenaere, van Genuchten et al. 1987). Software cost estimation is important because software continues to be a large part of the cost of modern systems; based on the state of estimating, there was therefore a request for more efficient software cost models (Ferens 1988). Human, technical, environmental, and political reasons can all affect the effort and time required to develop a system, so there was a claim that software cost estimation will never be an exact science (Navlakha 1990). Through an experiment, Navlakha showed the importance of customizing a software cost estimation model to an organizational environment.

The software cost estimation field was revitalized by object-oriented development. With object-oriented development, the method of software sizing became
more accurate because of the strong link between specifications and implementation (Laranjeira 1990). The number of classes and methods in an object-oriented system provides more insight into project size than lines of code alone. With object-oriented development, software metrics became a popular avenue of research. There was interest in developing new metrics around the new programming paradigm. Many of the metrics developed were highly correlated with software size, and this provided no help with the software-sizing problem.

A major work by Abdel-Hamid (1991) provided many insights into software development by using a novel approach to researching software engineering. By using a dynamic simulation model, various inputs were allowed to change over time. The field of calculus was now being applied to software development instead of multiple regressions from statistics. By building a model of the software development environment, the variables that affected software development were very explicitly described. An integrated theoretical framework for software development was built. Many interesting findings came out of this research (Abdel-Hamid 1988; Abdel-Hamid 1988; Abdel-Hamid 1989; Abdel-Hamid 1992; Abdel-Hamid 1993; Abdel-Hamid, Sengupta et al. 1994; Abdel-Hamid, Sengupta et al. 1999), but the results of the work have yet to be integrated into modern software cost estimation models. Relevant findings such as communication overhead in software development, schedule pressure, learning curves in software development, productivity lost to training new employees, task underestimation, and the effects of turnover in system development are not modeled in most software cost estimation models, even though they are shown in the simulation to drastically affect effort and
schedule. The knowledge created from software dynamics has not yet been used to develop better software cost estimation models. Abdel-Hamid even describes how the interdependency of projects results in a fundamental deficiency in the formulation of current-generation cost estimation tools (1993). Abdel-Hamid believes that the reason software cost estimation models have low portability is that the models fail to quantify the effect of managerial decisions on cost (1987). Two identical projects can be conducted by two different organizations, but most cost estimation models will not provide different effort and schedule estimates, even though the first project might have three times as many employees as the second project. Current cost estimation models have poor linkages to the real world of software development. The lack of cost estimation models built on theoretical frameworks is the reason.

The Minimum Software Cost Model (MSCM) (Hu, Plant et al. 1998) is a software cost estimation model built from economic production theory and systems optimization theory. In particular, the MSCM was derived from the Cobb-Douglas production function. Using Kemerer's data set of 14 projects (1987), the MSCM was declared to be superior to all other software cost estimation models. Unfortunately, all but two of the projects in the database were COBOL systems, so this does not help in estimating modern object-oriented systems.

A study used four new constructs as inputs to software cost estimation. Team size, concurrency, intensity, and fragmentation were shown to have goodness of fit and quality of estimation superior to that of the COCOMO model, while being more parsimonious (Smith, Hale et al. 2001).
Team Size

Team size is an important construct because Brooks (1975) showed that managers often add people to late projects in order to rescue them. However, the additional communication and training needed make the project even later. Brooks' Law was later empirically validated (Sengupta, Abdel-Hamid et al. 1999). It has been shown that big teams cause negative effects during development (Fried 1991), and Putnam (1985) stresses using a small-team approach for the production of reliable systems. According to Smith et al., "Although no prior research has been found that directly explores the relationship between team size and development effort, these related findings support an expected negative relationship between the two" (Smith, Hale et al. 2001). No software cost estimation model specifically models the size of a team in its calculations. Hence, public datasets do not provide the number of people that worked on a team.

Intensity

Smith et al. (2001) also devised a construct called intensity, which measures the degree of schedule compression. It is thought that high developer productivity requires single-minded work time, and after each interruption, immersion time is required to restore high productivity. This is the main reason that a private office increases a developer's productivity. Having a developer working on a single task should result in higher productivity. However, Putnam warns that schedule compression increases communication noise, which introduces ambiguities into the development process and results in lower productivity as people interrupt each other to resolve the ambiguities (Putnam 1985). If given too much time to complete work, the work will scale to fill the
allotted time. Intensity is not included as a factor in software cost estimation models even though research has shown that it is an important driver of productivity, which affects costs.

Concurrency

Concurrency is the degree to which team members work together or independently on a portion of the software project. The degree to which people work together has been shown to be critical to team performance (Guinan, Cooprider et al. 1998). Yet software cost estimation models do not include a measure of concurrency. COCOMO II does include a qualitative measure of team interactions: a software development team is rated on a scale from having very difficult interactions to seamless interactions, and the more difficult the interactions, the larger the effort needed. Concurrency, in contrast, captures whether people are working together or independently.

Fragmentation

The last construct advanced by Smith is fragmentation. Fragmentation examines the degree to which a team's time is broken up over multiple modules. While it is understandable that fragmentation leads to decreased efficiency, managers argue that the cross-pollination of ideas ensures consistent approaches across multiple modules (Reinertsen 2000). A person who works 80 hours per week on a module with no fragmentation cannot easily increase the hours worked on that module, whereas someone who works only 20 hours per week can. Forcing developers to work at full-time utilization only guarantees queues and delays, as the sketch below illustrates. Nevertheless, software cost estimation models do not include a measure of fragmentation.
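The queues-and-delays claim is a standard queueing-theory result rather than something derived in this dissertation. As an illustration only, assume work arrives at a single fully-loaded developer roughly like an M/M/1 queue; then waiting time grows without bound as utilization approaches 100%:

    # Illustration (assumed M/M/1 queueing model, not from the dissertation):
    # average wait before work starts, relative to task duration, at utilization rho.
    def relative_wait(rho):
        return rho / (1.0 - rho)   # classic M/M/1 queueing-delay factor

    for rho in (0.50, 0.80, 0.90, 0.95, 0.99):
        print(f"utilization {rho:.0%}: wait = {relative_wait(rho):5.1f}x task time")
    # Going from 80% to 95% 'full' utilization multiplies queueing delay ~5x.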
2.4 Empirical Model Building

The majority of software cost estimation models that have been developed are empirically based (Cover 1988). Most models are a variation of the basic effort equation shown in Equation 2.3.

$E = c \cdot a^b$

Equation 2.3 Basic Effort Equation

In the basic effort equation, E stands for effort, and a is normally the size of the project, usually in lines of code. Both c and b are constants established through an analytical technique, usually regression.

Historical data usually consisted of effort, project size, and project duration. From this historical data, software cost estimation models first tried to relate project size and effort. It was generally agreed that bigger projects should take more effort to develop. What was not known was whether a project twice as big as another would take twice the effort to develop. This question of economies and diseconomies of scale in software development was the first empirical question software cost estimation models tried to answer. It was important to explore whether the relationship between project size and effort was linear. By collecting project data that included how much effort was required and the total size of the software project, a regression was conducted with the dependent variable being effort and the independent variable being software size.
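As an illustration of how such constants are obtained, the Python sketch below fits c and b of Equation 2.3 by ordinary least squares on log-transformed data. The project values are invented for the example; the technique, not the numbers, is the point.

    import numpy as np

    # Hypothetical (size in KLOC, effort in man-months) pairs -- illustrative only.
    size   = np.array([10, 25, 50, 100, 200], dtype=float)
    effort = np.array([26, 70, 150, 320, 700], dtype=float)

    # E = c * a^b becomes log E = log c + b * log a, a straight line in log space.
    b, log_c = np.polyfit(np.log(size), np.log(effort), 1)
    c = np.exp(log_c)
    print(f"fitted E = {c:.2f} * size^{b:.2f}")  # b > 1 suggests diseconomies of scale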
Later it was found that software size and effort do not have a linear relationship, except for very small software development projects. Boehm first showed that there are diseconomies of scale in software development (Boehm 1981). Instead of a linear relationship, a nonlinear relationship was used to model software size and effort, fitting the data better. Using a nonlinear relationship to model size against effort resulted in the first modern cost estimation tool: Barry Boehm's Constructive Cost Model (COCOMO) (Boehm 1981).

2.5 COCOMO

COCOMO includes three different types of cost models: basic, intermediate, and detailed. All three models use thousands of lines of delivered source code (KSLOC) as the measure of software size. The difference between the three models is accuracy: to be more accurate, a model requires more information.

The simplest cost model is Basic COCOMO. It provides the most unreliable results of the three, but only simple information is needed as input. The model is shown in Equation 2.4.

$Effort = a \cdot (KLOC)^b$

Equation 2.4 Basic COCOMO Effort Equation

There are only three parameters and the software size as inputs to the equation. The output, effort, is given in man-months. One man-month equals one person working for one month.
The equation for schedule is the same for Basic, Intermediate, and Detailed COCOMO. Schedule gives the number of calendar months it will take the software project to complete.

$Schedule = 2.5 \cdot (Effort)^c$

Equation 2.5 COCOMO Schedule Equation

Three constants are needed to produce numerical answers for effort and schedule. The project's development type first has to be known; then the constants can be found in Table 2-1. Three different project types are defined by COCOMO: organic, semidetached, and embedded.

In organic mode, the software development team is small. Usually only small (less than 50 KSLOC) projects are developed by an organic team. Most people developing the software have experience with, and a thorough understanding of how, the system will contribute to the organization's objectives.

Semidetached mode is a compromise between organic and embedded. Typically, projects in semidetached mode are no bigger than 300 KSLOC.

In embedded mode, the project needs to fit within tight constraints; these are the most difficult software development projects. A missile system would be a typical embedded software development project.

Development Type    Constant a    Constant b    Constant c
Organic             2.4           1.05          0.38
Semidetached        3.0           1.12          0.35
Embedded            3.6           1.20          0.32

Table 2-1 Basic COCOMO Constants
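Putting Equations 2.4 and 2.5 together with the Table 2-1 constants gives a complete Basic COCOMO calculator. The minimal Python sketch below transcribes those published formulas; the 50 KSLOC example input is invented.

    # Basic COCOMO (Boehm 1981): effort = a * KLOC^b, schedule = 2.5 * effort^c.
    # Constants (a, b, c) are taken from Table 2-1.
    MODES = {
        "organic":      (2.4, 1.05, 0.38),
        "semidetached": (3.0, 1.12, 0.35),
        "embedded":     (3.6, 1.20, 0.32),
    }

    def basic_cocomo(kloc, mode="organic"):
        a, b, c = MODES[mode]
        effort = a * kloc ** b            # man-months
        schedule = 2.5 * effort ** c      # calendar months
        return effort, schedule

    effort, schedule = basic_cocomo(50, "semidetached")
    print(f"{effort:.0f} man-months over {schedule:.1f} months "
          f"(average staff ~{effort / schedule:.1f})")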
Intermediate COCOMO is more accurate because it adds 15 more parameters. These 15 cost drivers characterize product attributes, computer attributes, personnel experience, and software tools and practices. A project with higher complexity has a higher cost driver rating; therefore, the project takes more effort and time to complete. Intermediate COCOMO has the following equation for effort:

$Effort = EAF \cdot a \cdot (KLOC)^b$

Equation 2.6 Intermediate COCOMO Effort Equation

A new variable, EAF (effort adjustment factor), represents the product of the 15 cost drivers. In order not to overestimate effort when using Intermediate COCOMO with effort multipliers, the value of the a constant changes from the Basic COCOMO model; the b and c constants, however, are the same. For organic mode the new constant is a=3.2; for semidetached, a=3.0; and for embedded, a=2.8.

Detailed COCOMO is very similar to Intermediate COCOMO but instead applies the 15 cost drivers to each phase of the software development lifecycle. This way a cost driver can apply to a specific phase rather than to the whole project. By estimating each phase individually, a project with new programmers and very experienced designers will see increased effort for the implementation phase and reduced effort for the design phase.
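Since the EAF is just a product of the selected multiplier values, Intermediate COCOMO extends the previous sketch in a few lines. The two example driver ratings below are illustrative values, not a rating table from this dissertation.

    from math import prod

    # Intermediate COCOMO: effort = EAF * a * KLOC^b, where EAF is the product
    # of the 15 cost-driver multipliers (nominal ratings contribute 1.0).
    INTERMEDIATE_A = {"organic": 3.2, "semidetached": 3.0, "embedded": 2.8}
    EXPONENT_B = {"organic": 1.05, "semidetached": 1.12, "embedded": 1.20}

    def intermediate_effort(kloc, mode, drivers):
        eaf = prod(drivers.values()) if drivers else 1.0
        return eaf * INTERMEDIATE_A[mode] * kloc ** EXPONENT_B[mode]

    # Illustrative ratings: high required reliability, above-average programmers.
    drivers = {"RELY": 1.15, "PCAP": 0.86}
    print(f"{intermediate_effort(50, 'semidetached', drivers):.0f} man-months")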
2.6 COCOMO II

Over 15 years after releasing COCOMO, Boehm developed COCOMO II (Boehm 2000). This major work updated what had become an outdated software cost estimation model. Both COCOMO and COCOMO II used a database of projects on which multiple regressions were run in order to create scale factors and effort multipliers. Just before COCOMO II came out, the projects in the COCOMO database were almost 20 years old, and the software cost estimation model did not reflect many improvements in productivity. Today, COCOMO II has over 100 commercial implementations and is the most widely used software cost estimation tool. COCOMO II is discussed in more detail in Chapter 3.

2.7 Other Modern Software Cost Estimation Tools

While COCOMO II is the most used modern cost estimation tool, several other tools exist. Using the Basic COCOMO formula, but with different values for the constants, there are three different cost estimation models: Walston-Felix (1977), Bailey-Basili (1983), and Doty (Herd, Postak et al. 1977). Using a simple regression between function points and effort resulted in the Albrecht-Gaffney (1983), Kemerer (1987), and Matson, Barret and Meltichamp (1994) cost estimation models. Putnam's SLIM model (1978; Putnam and Myers 1992) is different from all other cost estimation models in terms of equation form, but its output values are very close to COCOMO II's.

2.8 New Findings Not Assimilated Into Software Cost Estimation Models

Angelis et al. (2001) used recent data collected by the International Software Benchmarking Standards Group to create a software cost estimation model. This data set consisted of historical data from many different types of organizations. They conducted a regression with the basic effort equation as the model. The results showed that 44% of the variance was explained when predicting effort with size. A categorical regression was
conducted with many variables, and the variable maximum team size was found to be significant. With maximum team size placed into the model, the explained variance doubled to around 88%.

Dimensional analysis is common in fields like physics, chemistry, and mathematics, where units matter. "Dimensional analysis is a method of comparing the dimensions of the physical quantities occurring in a problem to find relationships between the quantities without having to solve the problem completely" (Random House 1998). Equation checking is part of dimensional analysis. In this step, the formula's theoretical derivation is checked algebraically. If the units on both sides of the equation are equal, the equation is said to be commensurable. If the units are not equal, for example, if apples are on one side of the equation and oranges are on the other, the equation is said to be incommensurable. After studying the available software cost estimation models, Nemecek concluded: "Conventional software models can not be correct because each is incommensurate" (Nemecek 2001). Predicting effort from size using regression is not a valid theoretical derivation.

Another study looked at the sensitivity of COCOMO II (Musilek, Pedrycz et al. 2002). After conducting three types of sensitivity analysis, including mathematical analysis of the effort equation, Monte Carlo simulation, and error propagation, the size variable in COCOMO II was found to be very sensitive, followed by the effort multipliers. The exponential factor has little impact on error. The authors suggest using a fuzzy set of inputs for software size, thereby giving the project manager a spectrum of effort estimates rather than a single point estimate.
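The spectrum-of-estimates idea is easy to demonstrate. The sketch below is not the Musilek et al. method; it simply assumes a roughly +/-30% triangular uncertainty on size and pushes it through a Basic COCOMO-style effort equation by Monte Carlo sampling.

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed, illustrative uncertainty: size is around 50 KLOC, +/- 30%.
    size_samples = rng.triangular(left=35, mode=50, right=65, size=100_000)
    effort_samples = 3.0 * size_samples ** 1.12      # semidetached-style equation

    low, mid, high = np.percentile(effort_samples, [10, 50, 90])
    print(f"effort estimate: {mid:.0f} man-months "
          f"(80% interval {low:.0f}-{high:.0f})")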
Neural networks have also been used to predict effort. In one study (Idri, Khoshgoftaar et al. 2002), size plus four effort multipliers were fed into a neural network. The study used the COCOMO dataset, and the researchers claimed that the "results are acceptable," although understanding and interpreting the resulting neural network was found to be very difficult.

Estimating by analogies, or case-based reasoning, is another technique used to predict effort. The use of analogies as a technique was suggested over 20 years ago (Boehm 1981). The effectiveness of case-based reasoning relies greatly on the underlying dataset used for analogies. Case-based reasoning is a type of cluster analysis and inherits the weaknesses of any cluster analysis methodology.

"Cluster analysis is the name for a group of multivariate techniques whose primary purpose is to group objects based on the characteristics they possess. Cluster analysis classifies objects (e.g., respondents, products, or other entities) so that each object is very similar to others in the cluster with respect to some predetermined selection criterion. The resulting clusters of objects should then exhibit high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity. Thus, if the classification is successful, the objects within clusters will be close together when plotted geometrically, and different clusters will be far apart" (Hair, Anderson et al. 1998 p.473).

Case-based reasoning is often used in task domains that have no strong theoretical models and where the domain rules are incomplete, ill-defined, and inconsistent (Mukhopadhyay, Vicinanza et al. 1992). The number of possible project factors is a problem for many software cost estimation models; over 74 different project factors have been identified (Wrigley and Dexter 1987). Predetermining some set of project factors and then running a cluster-type analysis on a published data set usually yields favorable results.
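As a sketch of what estimation by analogy involves, the following finds the most similar historical projects by distance over normalized features and averages their efforts. The technique outline is standard; the features and numbers are invented for illustration and are not a published dataset.

    import numpy as np

    # Invented history: columns are (KLOC, team size); efforts in man-months.
    history = np.array([[20, 4], [55, 9], [90, 14], [40, 6], [70, 12]], dtype=float)
    efforts = np.array([60.0, 190.0, 380.0, 120.0, 260.0])

    def estimate_by_analogy(target, k=2):
        # Normalize each feature so KLOC does not dominate the distance measure.
        scale = history.max(axis=0)
        dist = np.linalg.norm(history / scale - np.asarray(target) / scale, axis=1)
        nearest = np.argsort(dist)[:k]       # indices of the k closest analogies
        return efforts[nearest].mean()

    print(f"analogy estimate: {estimate_by_analogy([60, 10]):.0f} man-months")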
Consider the case-based reasoning model called Estor. "Estor did not perform quite as well as the human expert, but it did outperform existing algorithmic models on the data set" (Mukhopadhyay, Vicinanza et al. 1992 p.167). Estor's estimates averaged within 52% of actuals, while COCOMO's averaged within 618% of actuals. The goal of software cost estimation is not to predict the cost of historical data, but rather to predict the cost of new projects. The authors write, "To be fair, Estor would almost certainly fail to accurately estimate projects from different environments (e.g. embedded military systems) without additional domain knowledge" (Mukhopadhyay, Vicinanza et al. 1992 p.167). "Estimates of the accuracy of prediction obtained from a training set are always optimistic. To get a more realistic estimate of the accuracy of prediction you either have to use a new, independent data set or adopt a jack-knife approach" (Samson, Ellison et al. 1997 p.59).

An important study was conducted to show the causes of estimating error. Only one managerial practice, the use of the estimate in the performance evaluation of software managers and professionals, was shown to increase the accuracy of estimates. Software cost estimation models were shown to be no help. The authors write:

"… It is unexpected that the application of the algorithmic basis failed to predict estimating accuracy. Apparently, the use of complex statistics, software, and standards do not facilitate more accurate estimates. Such a finding does not imply that software managers and professionals should shun algorithm-based estimating techniques. However it intimates instead that they recognize their shortcomings: Specifically, the employment of algorithm-based estimating methods did not improve the accuracy of cost estimates for subjects in this research. When using such methods, software managers and professionals probably need to be very careful to avoid the impression in other managers and users that they can guarantee meeting algorithm-based estimates" (Lederer and Prasad 1998).
Holding estimators responsible for their estimates is probably the only way software cost estimation is going to improve. Once people are responsible for their estimates, substandard models will not be tolerated.

2.9 Empirical Datasets

Empirical validation of software cost estimation models using regression depends on the quality of the datasets available. COCOMO II has the best data set of projects, with "161 carefully-collected" projects (Boehm and Sullivan 2002). However, the dataset is proprietary and not published. COCOMO needed only 63 projects to achieve the same predictive accuracy as COCOMO II, which is being within 30% of the actual metric 75% of the time (Boehm 1981). The larger dataset required by COCOMO II shows the difficulty of using regression to develop cost estimation models. Empirically validating a cost estimation model using a regression approach with the following datasets is not very convincing. Two example empirical datasets are presented in Appendix A and Appendix B.
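The accuracy criterion quoted above is commonly written as PRED(0.30) >= 75%, i.e., at least 75% of estimates fall within 30% of the actuals. A small sketch of that bookkeeping, on invented numbers:

    # PRED(l): fraction of projects whose magnitude of relative error (MRE)
    # is at most l. The actual/estimated values below are invented.
    actual    = [100.0, 240.0, 60.0, 310.0, 150.0]
    estimated = [ 90.0, 300.0, 75.0, 280.0, 210.0]

    mre = [abs(a - e) / a for a, e in zip(actual, estimated)]
    pred_30 = sum(m <= 0.30 for m in mre) / len(mre)
    print(f"mean MRE = {sum(mre) / len(mre):.2f}, PRED(0.30) = {pred_30:.0%}")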
2.10 Other Validation Approaches

When experts estimate software costs without any formal algorithmic technique, they outperform software cost estimation models. The mean error rates of the experts' predictions still ranged from 32 to 1107 percent (Lederer and Prasad 1998). Experts are better at estimating software parameters than software cost estimation models are, so the knowledge experts have has yet to be transferred into a software cost estimation tool. In the absence of empirical data, professional judgment should be used. The Delphi method is a way to capture and properly document the knowledge being shared from an engineer's expert opinion (NASA 2002 p.39). Using experts to validate a software cost estimation tool with a technique such as the Delphi method solves the problem of needing large empirical datasets, but finding capable experts is a problem. Unfortunately, according to Andy Prince, "Everyone is an expert on cost. Get used to it" (NASA 2002 p. 170).

2.11 Conclusions

Software cost estimation remains a difficult problem. With current estimates still resulting in millions of dollars being spent on projects running over budget, the need for better estimates will continue. There are some new ideas that can be used to build a better software cost estimation model, mainly the work on team size and task assignment. Even though many hundreds of variables have been proposed as inputs to software cost estimation, none of the variables has shown external empirical validity. Yet there is a need to build better models, and future software cost estimation models are going to have to be manager oriented, since "software cost estimation is the process of predicting the amount of effort required to build a software system and is a fundamental managerial planning activity" (Nemecek 2001). Software cost estimation is about more than just the size of the project. Having 10 people or 100 people working on a project makes a big difference because of the mythical man-month (Brooks 1975). This dissertation will provide a drastic change to the field of software cost estimation by placing the most logical driver of cost missing from current-generation models, the configuration of the workforce and team size, into the formula.
CHAPTER 3
COST ESTIMATION IN COCOMO II

"We shall not fail or falter; we shall not weaken or tire...Give us the tools and we will finish the job."
Sir Winston Churchill, BBC radio broadcast, Feb 9, 1941

3.1 Introduction

This chapter describes the first two constructs used in this dissertation: project characteristics and COCOMO II outputs. The chapter reviews how COCOMO II models differences in project characteristics to estimate the effort, schedule, and staffing needed to conduct a particular software development project. The equations COCOMO II uses to produce these outputs are described.

3.2 Project Characteristics

It is safe to assume that no two software projects are alike. Given any two software development projects, differences can be found between them. Therefore, it is important to identify and quantify the significant differences among software development projects. Project characteristics are the independent variables in software cost estimation models, meaning differences in project characteristics create changes in the dependent variables, effort and schedule.
Software Size

The first project characteristic to be modeled was software size. Common sense leads researchers to theorize that larger software development projects will take more effort and more time to complete than small projects. COCOMO II uses a measure of software size in its algorithm to calculate effort and schedule.

COCOMO II uses thousands of delivered source lines of code (KSLOC) as its measure of software size. Measuring KSLOC is not universal: for the same source code, there are different methods of counting KSLOC. For example, the following simple code can be counted in many ways:

int x, y, z; x=3; y=4; z=2; int xyz = x+y+z;

The code above can be considered one line of code, or five, depending on the counter. The same functionality can be rewritten as seven lines of code, as shown below:

int x;              (1)
int y;              (2)
int z;              (3)
x = 3;              (4)
y = 4;              (5)
z = 2;              (6)
int xyz = x+y+z;    (7)

An alternative is to code the same functionality in one line:

int xyz = 3+4+2;    (1)
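Because the counting convention itself changes the measure, sizing tools must pick a rule. The sketch below, illustrative only, counts the same fragment two ways: physical lines versus semicolon-terminated statements.

    # Illustration: the same fragment measured under two counting conventions.
    fragment = "int x, y, z; x=3; y=4; z=2; int xyz = x+y+z;"

    physical_lines = len(fragment.splitlines())   # lines as stored on disk
    statements = fragment.count(";")              # semicolon-terminated units
    print(f"physical LOC = {physical_lines}, logical LOC = {statements}")
    # -> physical LOC = 1, logical LOC = 5: the convention, not the code, changed.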


Since programmers have control over how they implement the code, the KSLOC measurement is susceptible to large variations. Another approach was devised to get around this problem with KSLOC: measuring functionality rather than lines of code. The same logic, that bigger programs require more effort, still applies. A project with more functionality will require more effort and schedule time to complete than a project with less functionality.

Information systems are commonly sized by functionality, such as the number of graphical user interface screens or reports. Function points are used instead of lines of code to measure software size.

COCOMO II's internal algorithms use only KSLOC in estimating effort and schedule. A process called backfiring is used to convert function points into SLOC. COCOMO II can convert from unadjusted function points to lines of code based on the programming language used to implement the function points. Table 3-1 shows the conversion factors for different programming languages.


Programming Language          SLOC per Unadjusted Function Point
Access                        38
Ada 83                        71
Ada 95                        49
AI Shell                      49
APL                           32
Assembly – Basic              320
Assembly – Macro              312
Basic – ANSI                  64
Basic – Compiled              91
Basic – Visual                32
C                             128
C++                           53
Cobol (ANSI 85)               91
Database – Default            40
Fifth Generation Language     4
First Generation Language     320
Forth                         64
Fortran 77                    107
Fortran 95                    71
Fourth Generation Language    20
High Level Language           64
HTML 3.0                      15
Java                          53
Jovial                        107
Lisp                          64
Machine Code                  640
Modula 2                      80
Pascal                        91
PERL                          21
PowerBuilder                  16
Prolog                        64
Query – Default               13
Report Generator              80
Second Generation Language    107
Simulation – Default          46
Spreadsheet                   6
Third Generation Language     80
Unix Shell Scripts            107
Visual Basic 5.0              29
Visual C++                    34
Table 3-1 Unadjusted FP to SLOC Conversion Ratios (Boehm 2000, p 20)
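A minimal sketch of this backfiring conversion follows. The function and dictionary names are illustrative, not part of COCOMO II, and the dictionary copies only a few of Table 3-1's ratios:

    # Backfiring: unadjusted function points -> SLOC, using Table 3-1 ratios.
    SLOC_PER_UFP = {"C": 128, "C++": 53, "Java": 53, "Visual C++": 34}

    def backfire(unadjusted_fp, language):
        """Convert an unadjusted function point count to source lines of code."""
        return unadjusted_fp * SLOC_PER_UFP[language]

    print(backfire(100, "Java"))  # 5300 SLOC, i.e. 5.3 KSLOC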


To convert from unadjusted function points to SLOC, simply multiply the unadjusted function point estimate by the appropriate conversion ratio for the programming language in which development will occur.

There is a lower bound on the software size that COCOMO II can estimate. COCOMO II has been calibrated for projects bigger than two KSLOC; therefore, the model built in this dissertation will also not be able to estimate projects smaller than two KSLOC. Projects smaller than two KSLOC are typically completed by only one person, and the developer's skill largely determines the effort and schedule required to develop the project.

Scale Factors

Researchers use more than just software size to quantify a software development project. Differences between projects with the same software size lead researchers to add another component to the description of a project. By using the concept of a scale factor, the software size can be adjusted for circumstances that cause more or less effort for the same software size. For example, this allows two projects, both 40 KSLOC, to have different effort estimates based on scale factors.

COCOMO II has five scale factors that account for the economies and diseconomies of scale in software development projects. When there are economies of scale, doubling the software size results in less than double the original effort. When diseconomies of scale are present for a software project, doubling the project size results in more than double the original project effort being needed to complete the project.


Driver  Name                            Very Low  Low   Nominal  High  Very High  Extra High
PREC    Precedentedness                 6.20      4.96  3.72     2.48  1.24       0.00
FLEX    Development Flexibility         5.07      4.05  3.04     2.03  1.01       0.00
RESL    Architecture / Risk Resolution  7.07      5.65  4.24     2.83  1.41       0.00
TEAM    Team Cohesion                   5.48      4.38  3.29     2.19  1.10       0.00
PMAT    Process Maturity                7.80      6.24  4.68     3.12  1.56       0.00
Table 3-2 Scale Factors

COCOMO II uses Equation 3.1 to calculate whether a project has economies or diseconomies of scale.

E = B + 0.01 × Σ(j=1..5) SF_j
Equation 3.1 Economy of Scale Equation

In Equation 3.1, B is a constant; for COCOMO II.2000 the value is 0.91. If the value of E is equal to 1.0, then economies of scale and diseconomies of scale are in balance. If the value of E is less than 1.0, then the project has economies of scale. If the value of E is greater than 1.0, then the project has diseconomies of scale.

If the highest and lowest scale factors are applied to Equation 3.1, the result is that the economy of scale exponent ranges from 0.91 to 1.2262. COCOMO II's accuracy depends on correctly identifying the proper scale factors for a project.
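A minimal sketch of Equation 3.1, assuming a hypothetical helper name and the COCOMO II.2000 constant B = 0.91:

    # Size exponent E from the five scale factor ratings (Equation 3.1).
    def economy_of_scale_exponent(scale_factors, B=0.91):
        assert len(scale_factors) == 5  # PREC, FLEX, RESL, TEAM, PMAT
        return B + 0.01 * sum(scale_factors)

    # All Extra High (0.00) gives the floor; all Very Low gives the ceiling.
    print(economy_of_scale_exponent([0.00] * 5))                      # 0.91
    print(economy_of_scale_exponent([6.20, 5.07, 7.07, 5.48, 7.80]))  # 1.2262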


Effort Multipliers

In addition to scale factors, there is another set of variables thought to help quantify project characteristics. Effort multipliers are the third type of project characteristic, along with software size and scale factors. COCOMO II has two different sets of effort multipliers that should be used at different times. The first set is the Post-Architecture effort multipliers. These seventeen effort multipliers are to be used after the software architecture has been designed. The Early Design effort multipliers are an alternative to the Post-Architecture effort multipliers. The Early Design effort multipliers are best used when a high-level model is needed to explore architectural alternatives or incremental development strategies, whereas the Post-Architecture effort multipliers are best used when more detailed information about the architecture is available and a more accurate estimation is needed (Boehm 2000 p. 12). COCOMO II provides for either type of multiplier to be used.

The following tables list the quantitative values for each effort driver. The scale is divided into very low (VL), low (L), nominal (N), high (H), very high (VH), and extra high (XH).


Drivers  Description                              VL    L     N     H     VH    XH
RELY     Required Software Reliability            0.82  0.92  1.00  1.10  1.26  n/a
DATA     Database Size                            n/a   0.90  1.00  1.14  1.28  n/a
CPLX     Product Complexity                       0.73  0.87  1.00  1.17  1.34  1.74
RUSE     Developed for Reusability                n/a   0.95  1.00  1.07  1.15  1.24
DOCU     Documentation Match to Life-Cycle Needs  0.81  0.91  1.00  1.11  1.23  n/a
TIME     Execution Time Constraint                n/a   n/a   1.00  1.11  1.29  1.63
STOR     Main Storage Constraint                  n/a   n/a   1.00  1.05  1.17  1.46
PVOL     Platform Volatility                      n/a   0.87  1.00  1.15  1.30  n/a
ACAP     Analyst Capability                       1.42  1.19  1.00  0.85  0.71  n/a
PCAP     Programmer Capability                    1.34  1.15  1.00  0.88  0.76  n/a
PCON     Personnel Continuity                     1.29  1.12  1.00  0.90  0.81  n/a
APEX     Applications Experience                  1.22  1.10  1.00  0.88  0.81  n/a
PLEX     Platform Experience                      1.19  1.09  1.00  0.91  0.85  n/a
LTEX     Language and Tool Experience             1.20  1.09  1.00  0.91  0.84  n/a
TOOL     Use of Software Tools                    1.17  1.09  1.00  0.90  0.78  n/a
SITE     Multisite Development                    1.22  1.09  1.00  0.93  0.86  0.80
SCED     Required Development Schedule            1.43  1.14  1.00  1.00  1.00  n/a
Table 3-3 Post-Architecture Effort Multipliers


Drivers  Description                         XL    VL    L     N     H     VH    XH
RCPX     Product Reliability and Complexity  0.49  0.60  0.83  1.00  1.33  1.91  2.72
RUSE     Developed for Reusability           n/a   n/a   0.95  1.00  1.07  1.15  1.24
PDIF     Platform Difficulty                 n/a   n/a   0.87  1.00  1.29  1.81  2.61
PERS     Personnel Capability                2.12  1.62  1.26  1.00  0.83  0.63  0.50
PREX     Personnel Experience                1.59  1.33  1.22  1.00  0.87  0.74  0.62
FCIL     Facilities                          1.43  1.30  1.10  1.00  0.87  0.73  0.62
SCED     Required Development Schedule       n/a   1.43  1.14  1.00  1.00  1.00  n/a
Table 3-4 Early Design Effort Multipliers

The effort multipliers were designed to be independent factors, but the literature has shown the factors are often interrelated (Briand, El Emam et al. 1998). Briand also shows that even though some COCOMO factors appear to be useful and significant, they play only a minor role in explaining project effort, because their impact on the goodness of fit of different models is weak. The conclusion of Briand's research is that the effort multipliers described in this section might not be the correct variables. Nevertheless, from all the possible sets of variables to use, COCOMO II uses the variables described in this section.

3.3 COCOMO II Outputs

This section describes the outputs from COCOMO II. The outputs are the dependent variables. Effort and schedule are the most common dependent variables. However, a lesser-known output, the work breakdown structure, plays an important role too.

Development Effort

Estimating development effort is the main goal of software cost estimation. The common unit of measure of effort is man-months or the politically correct person-months.


One person-month represents one person working for a month. The more person-months required, the more effort is required to complete the project. An estimate in person-months can easily be converted into person-years, person-days, or person-hours by the appropriate multiplication factor. COCOMO II provides all estimates of effort in person-months.

Project Duration

Project duration is a very important dependent variable in software cost estimation. Along with knowing the cost, knowing how long a project will take is a practical concern of project managers. COCOMO II provides an estimate of the project duration in months. This estimate can be converted into different time units by the appropriate multiplication factor.

Work Breakdown Structure

COCOMO II provides a unique work breakdown structure based on the project size, effort estimate, and schedule estimate. By breaking the whole project into three main activities, which are product design, programming, and integration and test, the amount of time needed to conduct requirements and analysis, product design, programming, test planning, verification and validation, project office, quality assurance, and manuals for each phase can be estimated.

As the project size and scale factors change, the work breakdown structure also changes. A sample work breakdown structure for a medium project (32 KSLOC)


with a size exponent (diseconomy of scale) of 1.12 is shown in Table 3-9. COCOMO II derives the work breakdown structure from tables based on the relevant factors.

Size Exponent             E = 1.05  -------- E = 1.12 --------   -------- E = 1.20 --------
Size                      S,I,M,L   S     I     M     L     VL   S     I     M     L     VL
Overall Phase Percentage  6         7     7     7     7     7    8     8     8     8     8
Requirements Analysis     46        48    47    46    45    44   50    48    46    44    42
Product Design            20        16    16.5  17    17.5  18   12    13    14    15    16
Programming               3         2.5   3.5   4.5   5.5   6.5  2     4     6     8     10
Test Planning             3         2.5   3     3.5   4     4.5  2     3     4     5     6
V & V                     6         6     6.5   7     7.5   8    6     7     8     9     10
Project Office            15        15.5  14.5  13.5  12.5  11.5 16    14    12    10    8
CM / QA                   2         3.5   3     3     3     2.5  5     4     4     4     3
Manuals                   5         6     6     5.5   5     5    7     7     6     5     5
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 3-5 Plans and Requirements Activity Distribution

Size Exponent             E = 1.05  -------- E = 1.12 --------   -------- E = 1.20 --------
Size                      S,I,M,L   S     I     M     L     VL   S     I     M     L     VL
Overall Phase Percentage  16        17    17    17    17    17   18    18    18    18    18
Requirements Analysis     15        12.5  12.5  12.5  12.5  12.5 10    10    10    10    10
Product Design            40        41    41    41    41    41   42    42    42    42    42
Programming               14        12    12.5  13    13.5  14   10    11    12    13    14
Test Planning             5         4.5   5     5.5   6     6.5  4     5     6     7     8
V & V                     6         6     6.5   7     7.5   8    6     7     8     9     10
Project Office            11        13    12    11    10    9    15    13    11    9     7
CM / QA                   2         3     2.5   2.5   2.5   2    4     3     3     3     2
Manuals                   7         8     8     7.5   7     7    9     9     8     7     7
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 3-6 Product Design Activity Distribution


Size Exponent             ---- E = 1.05 ----   -------- E = 1.12 --------    -------- E = 1.20 --------
Size                      S     I     M     L   S     I     M     L     VL    S     I     M     L     VL
Overall Phase Percentage  68    65    62    59  64    61    58    55    52    60    57    54    51    48
Requirements Analysis     5     5     5     5   4     4     4     4     4     3     3     3     3     3
Product Design            10    10    10    10  8     8     8     8     8     6     6     6     6     6
Programming               58    58    58    58  56.5  56.5  56.5  56.5  56.5  55    55    55    55    55
Test Planning             4     4     4     4   4     4.5   5     5.5   6     4     5     6     7     8
V & V                     6     6     6     6   7     7.5   8     8.5   9     8     9     10    11    12
Project Office            6     6     6     6   7.5   7     6.5   6     5.5   9     8     7     6     5
CM / QA                   6     6     6     6   7     6.5   6.5   6.5   6     8     7     7     7     6
Manuals                   5     5     5     5   6     6     5.5   5     5     7     7     6     5     5
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 3-7 Programming Activity Distribution

Size Exponent             ---- E = 1.05 ----   -------- E = 1.12 --------    -------- E = 1.20 --------
Size                      S     I     M     L   S     I     M     L     VL    S     I     M     L     VL
Overall Phase Percentage  16    19    22    25  19    22    25    28    31    22    25    28    31    34
Requirements Analysis     3     3     3     3   2.5   2.5   2.5   2.5   2.5   2     2     2     2     2
Product Design            6     6     6     6   5     5     5     5     5     4     4     4     4     4
Programming               34    34    34    34  33    35    37    39    41    32    36    40    44    48
Test Planning             2     2     2     2   2.5   2.5   3     3     3.5   3     3     4     4     5
V & V                     34    34    34    34  32    31    29.5  28.5  27    30    28    25    23    20
Project Office            7     7     7     7   8.5   8     7.5   7     6.5   10    9     8     7     6
CM / QA                   7     7     7     7   8.5   8     8     8     7.5   10    9     9     9     8
Manuals                   7     7     7     7   8     8     7.5   7     7     9     9     8     7     7
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 3-8 Integration and Test Activity Distribution

                                  Product Design Phase  Programming Phase  Integration and Test Phase
Requirements & Analysis           12.50%                4%                 2.5%
Product Design                    41%                   8%                 5%
Programming                       13%                   56.5%              37%
Test Planning                     5.5%                  5%                 3%
V & V                             7%                    8%                 29.5%
Project Office                    11%                   6.5%               8%
QA                                2.5%                  6.5%               8%
Manuals                           7.5%                  5.5%               7.5%
Phase Percentage of Total Effort  17%                   58%                25%
Table 3-9 Work Breakdown Structure for a Medium Size Project


3.4 Model Types

With the independent and dependent variables described, the next step is to describe the relationship among the variables. A model is needed to describe how the independent variables affect the dependent variables. In the literature, there are four common models used to relate software size to effort.

In addition, research has been conducted to identify the causes of economies or diseconomies of scale. On one hand, fixed overhead costs such as project management may not increase directly with system size; therefore, larger projects can realize economies of scale. On the other hand, some overhead activities, such as documentation, increase out of proportion to project size. As projects grow, the amount of work required for documentation increases more rapidly, leading to diseconomies of scale.

From the software cost estimation literature, it is unclear whether economies or diseconomies of scale exist. Most likely, mixed economies of scale exist, but it is difficult to know at which project size economies of scale can no longer be realized.

Kitchenham found that the relationship between effort and size is rather linear, since the constant b in the log-linear model tends to be 1.0 (Kitchenham 1992). By ignoring economies and diseconomies of scale, the linear model was argued to be the best method to describe size and effort. Further research has shown that economies and diseconomies of scale do exist in software development (Banker, Chang et al. 1994). Banker concludes that the log-linear relationship is too limited to model the effect of size on effort. Hu tested the linear, quadratic, log-linear, and translog models and found that


the quadratic model provided the most plausible relationship between effort and size (Hu 1997).

Model Name        Model Specification
Linear Model      Effort = a + b(Size)
Quadratic Model   Effort = a + b(Size) + c(Size)²
Log-linear Model  Effort = e^a × Size^b
Translog Model    Effort = e^a × Size^b × Size^(c ln Size)
Table 3-10 Software Cost Estimation Model Types (Briand, El Emam et al. 1998)

Briand et al. (1998) state that the most plausible model to explain the costs of space and military projects is a log-linear model involving KLOC, team size, and three COCOMO factors: reliability requirements (RELY), storage constraints (STOR), and execution time constraints (TIME).

COCOMO II uses the log-linear model to relate software size to effort. The model allows projects to have economies and diseconomies of scale through the scale factors. It is unclear, though, whether this is the most plausible model to describe the size/effort relationship.

3.5 Effort Estimation

With the software cost estimation model type picked for COCOMO II, along with the independent and dependent variables, the next step in estimating effort is to instantiate the model. The formula to estimate effort in person-months is given in Equation 3.2.


PM = A × Size^E × Π(i=1..17) EM_i,  where E = B + 0.01 × Σ(j=1..5) SF_j
Equation 3.2 Nominal Effort

In Equation 3.2, PM stands for the total effort in person-months. A is a constant, which for COCOMO II has the value 2.94. Size represents the estimated project size in thousands of source lines of code (KSLOC). The effort multipliers, shown as EM, are all multiplied together. In addition, B is a constant; for COCOMO II the value is 0.91. Finally, the five scale factors (SF) are summed. The result is the effort in person-months.

3.6 Schedule

The amount of time needed to develop the software product is the schedule, or project duration. The equation to estimate the project duration is shown in Equation 3.3. In Equation 3.3, C and D are constants, which for COCOMO II are 3.67 and 0.28 respectively. PM is the effort in person-months calculated in the previous section. SF is the summed scale factors. TDEV is the project duration in months.

TDEV = C × (PM)^F,  where F = D + 0.2 × 0.01 × Σ(j=1..5) SF_j
Equation 3.3 Schedule Estimation
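A minimal sketch of Equations 3.2 and 3.3, assuming hypothetical helper names; the constants default to the COCOMO II.2000 values quoted above:

    # Equation 3.2: nominal effort in person-months.
    def cocomo_effort(ksloc, scale_factors, effort_multipliers, A=2.94, B=0.91):
        E = B + 0.01 * sum(scale_factors)   # Equation 3.1
        pm = A * ksloc ** E
        for em in effort_multipliers:       # the seventeen Post-Architecture EMs
            pm *= em
        return pm

    # Equation 3.3: project duration in months.
    def cocomo_schedule(pm, scale_factors, C=3.67, D=0.28):
        F = D + 0.2 * 0.01 * sum(scale_factors)
        return C * pm ** F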


3.7 Staffing

COCOMO II calculates staffing by dividing the effort estimate by the schedule estimate.

Staffing = PM / TDEV
Equation 3.4 Staffing Equation

One underlying assumption of COCOMO II is that a higher team size results in lower productivity, but the direct effects of team size are not specifically modeled by COCOMO II. In addition, there is no support in COCOMO II for increasing or decreasing the staffing estimate. Team size is thought to be indirectly captured by factors already modeled, such as project size (Conte, Dunsmore et al. 1986), but not explicitly modeling team size leaves no support for changing the staffing estimate. Briand et al. (1998) state that after product size, team size is the strongest factor influencing project cost, but COCOMO II treats it as a dependent variable rather than an independent variable. This dissertation addresses this large weakness.

3.8 COCOMO II Overview

COCOMO II provides a rich structure to characterize software projects through scale factors and effort multipliers. Using lines of code or function points as a measure of size, a software project can be parameterized in detail. COCOMO II also provides a detailed estimation of project activity through the work breakdown structure. The effort and schedule estimates, along with the work breakdown structure, will be used as inputs to a new cost estimation model that addresses the staffing issues raised above.


CHAPTER 4
COMMUNICATION OVERHEAD

I will pay more for the ability to deal with people than any other ability under the sun. John D. Rockefeller

4.1 Introduction

"Professional programmers spend considerable time communicating with others in their organization, both individually and as part of a group. Thus the analysis of communication problems–for example, groups not realizing they are even supposed to communicate, misunderstandings about a shared issue, conflicting views from different groups, or changes in project personnel–is a key element in understanding how to better support the software development process" (Rosson 1996 p.194).

Just as there are losses in productivity due to lack of motivation, there are also losses because of communication. This loss is commonly called communication overhead. This chapter details the derivation of the communication overhead measure used in this dissertation.

4.2 Communication Overhead Definition

Communication overhead is the "average team member's drop in productivity below his nominal productivity as a result of team communication, where communication includes verbal communication, documentation, and any additional work, such as that due to interfaces" (Abdel-Hamid and Madnick 1991 p.93). Such communication overhead is not needed when software is developed by a single person, but as additional people are added to a team, the communication overhead rises.


"… it is necessary that each individual spend part of his time communicating with each of the other team members. For example, the designer must confer with the coder to resolve any questions the coder may have about the design; both of these must talk to the individual testing the code to give him the benefit of their experience with the program; each of these must talk to the documentor to assure that the documentation is proper and complete; and so on" (Tausworthe 1977).

As more people are added to a software development project, the number of possible communication paths grows not linearly, but polynomially. Since communication overhead is a function of communication paths, communication overhead also grows polynomially. Brooks detailed this relationship, saying that as the team size (n) increases, communication overhead increases in proportion to n² (Brooks 1975; Abdel-Hamid and Madnick 1991). Brooks argued for the drop in productivity as team size increases, stating the following:

1. As the team size increases, there is greater need to coordinate the activities of the group, thus increasing overhead at the expense of code production.
2. As members are added to a team, the new members must acquaint themselves with the overall project design and with previously completed work before they can begin to contribute to the project. (Conte, Dunsmore et al. 1986 p.258)

The number of communication paths that exist in a team with n people is shown in Equation 4.1.

CommunicationPaths = n(n − 1) / 2
Equation 4.1 Communication Paths for n People

If a group of 30 people were in a team, there is a possibility of 30(30 − 1)/2, or 435, communication paths between all people. Abdel-Hamid found that for a team of 30


people, the communication overhead is more than 50%. Out of an 8-hour day, more than 4 hours will be spent communicating. Typical communication activities include meetings, phone calls, documentation, and artifact reviews.

During software development, if needed communication does not occur, problems will arise from misunderstandings and will eventually have to be corrected. On the other hand, communication that is not needed can also occur, providing no foreseeable benefit to the software development project.

Since communication overhead can take up such a large percentage of time during software development, some people suggest small, agile teams of no more than 10 people (Paulk 2001). With small teams, communication overhead is reduced, leading to more efficient software development. However, some software projects cannot be completed in a reasonable period with 10 people or fewer. In these projects, communication and communication overhead play an important role in project success. Instead of limiting the number of team members on a software development project, another method is to implement a process structure that limits communication paths between individuals. By breaking the project team into smaller groups and restricting the number of communication paths between team members, communication overhead can be reduced.

4.3 Quantifying Communication Overhead

Abdel-Hamid quantified the relationship between team size and communication overhead. Table 4-1 shows the communication overhead percentage for given team sizes.


Team Size   Communication Overhead
0           0 %
5           1.5 %
10          6 %
15          13.5 %
20          24 %
25          37.5 %
30          54 %
Table 4-1 Communication Overhead Percentage for a Given Team Size (Abdel-Hamid and Madnick 1991 p.94)

To find the overhead for a team size not listed, interpolation is used between the two closest points. To provide a better way of handling team sizes not listed, mapping team size to the number of communication paths using Equation 4.1 provides more detail. Table 4-2 adds communication paths to Table 4-1.

Team Size   Communication Paths   Communication Overhead
0           0                     0 %
5           10                    1.5 %
10          45                    6 %
15          105                   13.5 %
20          190                   24 %
25          300                   37.5 %
30          435                   54 %
Table 4-2 Communication Paths Added to Communication Overhead

Conducting a regression with communication overhead as the dependent variable and communication paths as the independent variable leads to Equation 4.2.

CommunicationOverhead = 0.001248269 × CommunicationPaths
Equation 4.2 Prediction Equation for Communication Overhead

The regression equation has very high explanatory power, with R² greater than 0.99 (an R² of 1.00 is the maximum possible). Therefore, the equation is very good at modeling communication overhead based on communication paths.
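A minimal sketch combining Equations 4.1 and 4.2 (the helper name is hypothetical; the 54% ceiling it applies is explained immediately below):

    # Communication overhead for a team of n people (Equations 4.1 and 4.2).
    def communication_overhead(team_size):
        paths = team_size * (team_size - 1) / 2   # Equation 4.1
        return min(0.001248269 * paths, 0.54)     # Equation 4.2, capped at 54%

    print(communication_overhead(10))  # ≈ 0.056, i.e. about 6% extra effort
    print(communication_overhead(30))  # 0.54, matching Table 4-1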


Once more than 30 people are on a team, the empirical evidence on communication overhead is sparse. In order not to apply Equation 4.2 beyond the data on which it was modeled, any team bigger than 30 people (435 paths) is assumed to have a communication overhead of 54%.

4.4 Cooperating Program Model (COPMO)

Team size was a major factor whose significance had not been fully analyzed; therefore, Thebaut (Thebaut and Shen 1984) proposed a software cost estimation model assuming additional effort is needed when there is a large number of people in teams on a project. The equation developed assumes that staff, rather than software size, produces the diseconomies of scale.

Effort = a + b·S + c·P^d
Equation 4.3 COPMO Equation

In the previous equation, a, b, c, and d are constants that need to be determined from empirical data, S is the program size in thousands of lines of code, and P is the average personnel level (staff) over the life of the project. Communication overhead is modeled by the last term of the equation.

By replacing c with 1.5 and d with 2.0, the communication overhead follows Brooks' suggestion. Calculating the last term of the equation for various team sizes produces the following table.

Team Size   Communication Overhead   Increase in % of Overhead   COPMO    Increase in % of COPMO
0           0.00                                                 0
5           0.02                                                 37.5
10          0.06                     3.00                        150      3.00
15          0.14                     1.25                        337.5    1.25
20          0.24                     0.78                        600      0.78
25          0.38                     0.56                        937.5    0.56
30          0.54                     0.44                        1350     0.44
Table 4-3 COPMO and Communication Overhead
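As a quick arithmetic check of the COPMO column in Table 4-3, assuming c = 1.5 and d = 2.0 as stated above:

    # Last term of Equation 4.3 with c = 1.5 and d = 2.0.
    for p in (5, 10, 15, 20, 25, 30):
        print(p, 1.5 * p ** 2.0)   # 37.5, 150.0, 337.5, 600.0, 937.5, 1350.0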


Table 4-3 shows that the increase in COPMO for a given team size matches the increase in communication overhead. This provides further evidence of the n² relationship between team size and communication overhead.

4.5 Communication Overhead Contributions

By including communication overhead in software cost estimation models, the work of Thebaut and Abdel-Hamid can be continued. Thebaut was interested in the average staffing level throughout the project, whereas Abdel-Hamid was interested in the instantaneous staffing level during the project. Neither researcher looked at how the structure of the project team interacts with communication. The regression equation developed here, along with the equation for the number of communication paths for a given number of people, will be used to create a new software cost estimation model.


CHAPTER 5
EXTENDED ESTIMATION MODEL

"640K ought to be enough for anybody." Bill Gates, 1981

"A review of the literature for the last ten years shows that very little in terms of new methods has been proposed in this area [software cost estimation]. In our opinion, the methods available today are more than adequate for a company to establish an estimation approach. All that is needed is management's willingness to employ the planning and control philosophy used in other functional areas in the information systems department." (Benbasat and Vessey 1980 p. 42)

5.1 Introduction

This chapter details the creation of a new software cost estimation model based on COCOMO II, software development process structure, and team size. The outputs of COCOMO II, which include the effort estimate, the schedule (project duration), and the work breakdown structure, are summarized and then extended with the three process structures (one-tier, two-tier, or three-tier), along with team size, to improve the estimates for effort and schedule. A new metric called staff loading is created that quantifies what percentage of time staff are actively working during development. Several completed software development projects are run through the new software cost estimation tool to illustrate the impact of the software handoff on software development.


5.2 Model Overview

The new software cost estimation model performs five steps to create new estimates. The five steps are summarized here and then explained in detail throughout the chapter.

The first step is to calculate the outputs from COCOMO II. COCOMO II includes many project differences in its cost estimation model. These differences allow COCOMO II to yield a scale factor and an effort multiplier for each particular project. Along with the project size, scale factor, and effort multiplier, COCOMO II produces an estimate for effort, duration, and staff size, and a work breakdown structure. Chapter 3 describes COCOMO II and the calculations performed in detail.

The second step is to prepare the work breakdown structure and effort estimate for input into the new cost estimation model. The effort estimate is adjusted to include the effects of the planning and requirements phase. In addition, the work breakdown structure is mapped into the different process structures. The work breakdown structure provides information about how long different software development activities will take. The process structure explains which group conducts each particular software development activity. Combining the process structure with the work breakdown structure tells the model which group does how much work.

The third step is to include staffing as an independent variable. With staffing moving from a dependent variable in COCOMO II to an independent variable in the new


software cost estimation model, the staffing for each process structure must be included. Populating the process structures with staffing information is thus the third step.

The fourth step is to calculate coordination and communication costs based on the staffing and the combined work breakdown structure and process structure. A new effort estimate is created.

The fifth step is to calculate a new schedule estimate based on the new effort estimate along with the staffing and process structure. Many other software cost estimation models have difficulty estimating project duration, but with this new model the estimate is rather straightforward.


[Figure 5-1 is a flowchart of the model: Step 1, calculate the outputs from COCOMO II (work breakdown structure and effort estimate); Step 2a, adjust the work breakdown structure and effort estimate to include the planning and requirements phase; Step 2b, convert the work breakdown structure into percentages of total effort rather than percentages of phases; Step 2c, map the work breakdown structure into the three different process structures; Step 3, populate the three different process structures with staffing information; Step 4, calculate a new effort estimate based on coordination and communication costs; Step 5, calculate a new schedule for the three different process structures with the staffing information.]
Figure 5-1 Model Overview

5.3 Extended Example Information

An extended example is used throughout this chapter to show the workings of the model. For the extended example, a medium-sized project consisting of 40 KSLOC is used. The default scale factor setting (a size exponent of 1.12) and nominal effort multipliers (1.00) are also used. Working through this example shows how the five steps of the model create a new estimate for both effort and schedule.


5.4 Using the COCOMO II Outputs

The first step in the new cost estimation model is to calculate the needed outputs from COCOMO II. Chapter 3 provides details on how COCOMO II estimates effort, schedule, staffing, and a work breakdown structure. This cost estimation model specifically needs the effort estimate and the work breakdown structure from COCOMO II. COCOMO II provides the effort estimate in man-months and derives the work breakdown structure from Tables 5-1 through 5-4. Table 5-1 is the planning and requirements phase of software development; this phase is where the software specification is created. Table 5-2 is the product design phase, where the requirements specification is turned into a valid software design. Table 5-3 is the programming phase, where the software design is implemented in code. Finally, Table 5-4 is the integration and test phase, where the developed software is tested. All the numbers in the work breakdown structure represent percentages.


Size Exponent             E = 1.05  -------- E = 1.12 --------   -------- E = 1.20 --------
Size                      S,I,M,L   S     I     M     L     VL   S     I     M     L     VL
Overall Phase Percentage  6         7     7     7     7     7    8     8     8     8     8
Requirements Analysis     46        48    47    46    45    44   50    48    46    44    42
Product Design            20        16    16.5  17    17.5  18   12    13    14    15    16
Programming               3         2.5   3.5   4.5   5.5   6.5  2     4     6     8     10
Test Planning             3         2.5   3     3.5   4     4.5  2     3     4     5     6
V & V                     6         6     6.5   7     7.5   8    6     7     8     9     10
Project Office            15        15.5  14.5  13.5  12.5  11.5 16    14    12    10    8
CM / QA                   2         3.5   3     3     3     2.5  5     4     4     4     3
Manuals                   5         6     6     5.5   5     5    7     7     6     5     5
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 5-1 Plans and Requirements Activity Distribution

Size Exponent             E = 1.05  -------- E = 1.12 --------   -------- E = 1.20 --------
Size                      S,I,M,L   S     I     M     L     VL   S     I     M     L     VL
Overall Phase Percentage  16        17    17    17    17    17   18    18    18    18    18
Requirements Analysis     15        12.5  12.5  12.5  12.5  12.5 10    10    10    10    10
Product Design            40        41    41    41    41    41   42    42    42    42    42
Programming               14        12    12.5  13    13.5  14   10    11    12    13    14
Test Planning             5         4.5   5     5.5   6     6.5  4     5     6     7     8
V & V                     6         6     6.5   7     7.5   8    6     7     8     9     10
Project Office            11        13    12    11    10    9    15    13    11    9     7
CM / QA                   2         3     2.5   2.5   2.5   2    4     3     3     3     2
Manuals                   7         8     8     7.5   7     7    9     9     8     7     7
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 5-2 Product Design Activity Distribution


Size Exponent             ---- E = 1.05 ----   -------- E = 1.12 --------    -------- E = 1.20 --------
Size                      S     I     M     L   S     I     M     L     VL    S     I     M     L     VL
Overall Phase Percentage  68    65    62    59  64    61    58    55    52    60    57    54    51    48
Requirements Analysis     5     5     5     5   4     4     4     4     4     3     3     3     3     3
Product Design            10    10    10    10  8     8     8     8     8     6     6     6     6     6
Programming               58    58    58    58  56.5  56.5  56.5  56.5  56.5  55    55    55    55    55
Test Planning             4     4     4     4   4     4.5   5     5.5   6     4     5     6     7     8
V & V                     6     6     6     6   7     7.5   8     8.5   9     8     9     10    11    12
Project Office            6     6     6     6   7.5   7     6.5   6     5.5   9     8     7     6     5
CM / QA                   6     6     6     6   7     6.5   6.5   6.5   6     8     7     7     7     6
Manuals                   5     5     5     5   6     6     5.5   5     5     7     7     6     5     5
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 5-3 Programming Activity Distribution

Size Exponent             ---- E = 1.05 ----   -------- E = 1.12 --------    -------- E = 1.20 --------
Size                      S     I     M     L   S     I     M     L     VL    S     I     M     L     VL
Overall Phase Percentage  16    19    22    25  19    22    25    28    31    22    25    28    31    34
Requirements Analysis     3     3     3     3   2.5   2.5   2.5   2.5   2.5   2     2     2     2     2
Product Design            6     6     6     6   5     5     5     5     5     4     4     4     4     4
Programming               34    34    34    34  33    35    37    39    41    32    36    40    44    48
Test Planning             2     2     2     2   2.5   2.5   3     3     3.5   3     3     4     4     5
V & V                     34    34    34    34  32    31    29.5  28.5  27    30    28    25    23    20
Project Office            7     7     7     7   8.5   8     7.5   7     6.5   10    9     8     7     6
CM / QA                   7     7     7     7   8.5   8     8     8     7.5   10    9     9     9     8
Manuals                   7     7     7     7   8     8     7.5   7     7     9     9     8     7     7
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC
Table 5-4 Integration and Test Activity Distribution

Using the information about the extended example, COCOMO II calculates the effort to be 169.9 man-months for the given 40 KSLOC project. With the size exponent being E = 1.12 in the extended example, and since 40 KSLOC is closer to 32 KSLOC than to 128 KSLOC, the M column under the E = 1.12 section is used.
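As a hedged check of the 169.9 figure, the sketch below assumes all five scale factors at their Nominal ratings from Table 3-2 (which gives an exponent of about 1.10, nearest the E = 1.12 column) and all effort multipliers at 1.00:

    # Reproducing the extended example's effort estimate with Equation 3.2.
    A, B = 2.94, 0.91
    nominal_sf = [3.72, 3.04, 4.24, 3.29, 4.68]   # PREC, FLEX, RESL, TEAM, PMAT
    E = B + 0.01 * sum(nominal_sf)                # ≈ 1.0997
    PM = A * 40 ** E                              # ≈ 169.9 man-months
    print(round(PM, 1))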


The correct numbers for the plans and requirements phase that are used for the extended example are highlighted in Table 5-5 (the M* column under E = 1.12).

Size Exponent             E = 1.05  -------- E = 1.12 --------   -------- E = 1.20 --------
Size                      S,I,M,L   S     I     M*    L     VL   S     I     M     L     VL
Overall Phase Percentage  6         7     7     7     7     7    8     8     8     8     8
Requirements Analysis     46        48    47    46    45    44   50    48    46    44    42
Product Design            20        16    16.5  17    17.5  18   12    13    14    15    16
Programming               3         2.5   3.5   4.5   5.5   6.5  2     4     6     8     10
Test Planning             3         2.5   3     3.5   4     4.5  2     3     4     5     6
V & V                     6         6     6.5   7     7.5   8    6     7     8     9     10
Project Office            15        15.5  14.5  13.5  12.5  11.5 16    14    12    10    8
CM / QA                   2         3.5   3     3     3     2.5  5     4     4     4     3
Manuals                   5         6     6     5.5   5     5    7     7     6     5     5
S: 2 KSLOC; I: 8 KSLOC; M: 32 KSLOC; L: 128 KSLOC; VL: 512 KSLOC (* column used for the extended example)
Table 5-5 Plans and Requirements Phase for a 40 KSLOC Project

The other three tables are also consulted to create a complete work breakdown structure for the extended example. The complete work breakdown structure is shown in Table 5-6.

Phases                   Plans and Requirements  Product Design  Programming Activity  Integration and Test
Requirements & Analysis  46                      12.5            4                     2.5
Product Design           17                      41              8                     5
Programming              4.5                     13              56.5                  37
Test Planning            3.5                     5.5             5                     3
V & V                    7                       7               8                     29.5
Project Office           13.5                    11              6.5                   7.5
Quality Assurance        3                       2.5             6.5                   8
Manuals                  5.5                     7.5             5.5                   7
Phase Totals             7                       17              58                    25
Table 5-6 Complete Work Breakdown Structure for Extended Example

A notation is needed to represent the cells in the previous table. Each phase is denoted by an abbreviation: Plans and Requirements is PR, Product Design is PD, Programming Activity is PA, and Integration and Test is IT. A subscript denotes the activity row, with Requirements & Analysis being row 1 and Manuals being row 8. The activity is added as a subscript to the phase to get a variable of the form Phase_activity. The phase total for each phase is notated by the given phase with TOTAL


as the subscript. Table 5-7 shows the complete enumeration of the work breakdown structure using the described notation.

Activities               Plans and Requirements  Product Design   Programming Activity  Integration and Test
Requirements & Analysis  PR_1 = 46               PD_1 = 12.5      PA_1 = 4              IT_1 = 2.5
Product Design           PR_2 = 17               PD_2 = 41        PA_2 = 8              IT_2 = 5
Programming              PR_3 = 4.5              PD_3 = 13        PA_3 = 56.5           IT_3 = 37
Test Planning            PR_4 = 3.5              PD_4 = 5.5       PA_4 = 5              IT_4 = 3
V & V                    PR_5 = 7                PD_5 = 7         PA_5 = 8              IT_5 = 29.5
Project Office           PR_6 = 13.5             PD_6 = 11        PA_6 = 6.5            IT_6 = 7.5
Quality Assurance        PR_7 = 3                PD_7 = 2.5       PA_7 = 6.5            IT_7 = 8
Manuals                  PR_8 = 5.5              PD_8 = 7.5       PA_8 = 5.5            IT_8 = 7
Phase Total              PR_TOTAL = 7            PD_TOTAL = 17    PA_TOTAL = 58         IT_TOTAL = 25
Table 5-7 Work Breakdown Structure Mapping

5.5 Modeling the Work Breakdown Structure in Process Structures

The first step in mapping the work breakdown structure to the different process structures (shown as Step 2a in Figure 5-1) is to adjust the effort estimate to include the plans and requirements phase. From Table 5-7, PD_TOTAL + PA_TOTAL + IT_TOTAL = 100. COCOMO II's effort output includes only the product design, programming activity, and integration and test phases. To include the plans and requirements phase, PR_TOTAL must be added to the COCOMO II effort estimate using the following equation:

TotalEffort = COCOMO II Effort Estimate × (1 + PR_TOTAL / 100)


Along with adjusting the effort to include the plans and requirements phase, the work breakdown structure must be changed so that PR_TOTAL + PD_TOTAL + PA_TOTAL + IT_TOTAL = 100. The algorithm to convert the four phase totals to sum to 100 is shown below:

X = PR_TOTAL + PD_TOTAL + PA_TOTAL + IT_TOTAL
PR_TOTAL = (PR_TOTAL / X) × 100
PD_TOTAL = (PD_TOTAL / X) × 100
PA_TOTAL = (PA_TOTAL / X) × 100
IT_TOTAL = 100 − PR_TOTAL − PD_TOTAL − PA_TOTAL

The four phase totals of the work breakdown structure now sum to 100, but adding PR_1 through PR_8 still yields 100 instead of PR_TOTAL. The activities in each phase are adjusted by the phase total to indicate the percentage of work the activity takes for the whole project rather than just its phase. Multiplying the activities in each phase by the phase total makes the conversion.


Let I be an index set of activities where |I| = 8.

PR_i = PR_i × PR_TOTAL / 100,  for all i ∈ I
PD_i = PD_i × PD_TOTAL / 100,  for all i ∈ I
PA_i = PA_i × PA_TOTAL / 100,  for all i ∈ I
IT_i = IT_i × IT_TOTAL / 100,  for all i ∈ I

Activities               Plans and Requirements  Product Design    Programming Activity  Integration and Test
Requirements & Analysis  PR_1 = 2.99             PD_1 = 1.9875     PA_1 = 2.168          IT_1 = 0.585
Product Design           PR_2 = 1.105            PD_2 = 6.519      PA_2 = 4.336          IT_2 = 1.17
Programming              PR_3 = 0.2925           PD_3 = 2.067      PA_3 = 30.623         IT_3 = 8.658
Test Planning            PR_4 = 0.2275           PD_4 = 0.8745     PA_4 = 2.71           IT_4 = 0.702
V & V                    PR_5 = 0.455            PD_5 = 1.113      PA_5 = 4.336          IT_5 = 6.903
Project Office           PR_6 = 0.8775           PD_6 = 1.749      PA_6 = 3.523          IT_6 = 1.755
Quality Assurance        PR_7 = 0.195            PD_7 = 0.3975     PA_7 = 3.523          IT_7 = 1.872
Manuals                  PR_8 = 0.3575           PD_8 = 1.1925     PA_8 = 2.981          IT_8 = 1.638
Phase Total              PR_TOTAL = 6.5          PD_TOTAL = 15.9   PA_TOTAL = 54.2       IT_TOTAL = 23.4
Table 5-8 Adjusted Work Breakdown Structure

The adjusted work breakdown structure now reflects a project with four phases of development. The next step is to map the adjusted work breakdown structure into the three different process structures. With three different process structures, there will be three different mappings.
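A minimal sketch of Steps 2a and 2b for the extended example (the variable names are hypothetical; the phase totals come from Table 5-7 and the effort from Section 5.4):

    # Step 2a: fold the plans and requirements phase into the effort estimate.
    totals = {"PR": 7.0, "PD": 17.0, "PA": 58.0, "IT": 25.0}
    total_effort = 169.9 * (1 + totals["PR"] / 100)        # ≈ 181.8 man-months

    # Step 2b: rescale the phase totals to sum to 100 (the last takes the remainder)...
    X = sum(totals.values())                               # 107
    for phase in ("PR", "PD", "PA"):
        totals[phase] = round(totals[phase] / X * 100, 1)  # 6.5, 15.9, 54.2
    totals["IT"] = 100 - totals["PR"] - totals["PD"] - totals["PA"]   # ≈ 23.4

    # ...then express each activity as a share of the whole project.
    pr_1 = 46 * totals["PR"] / 100                         # ≈ 2.99, as in Table 5-8
    print(total_effort, totals, pr_1)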


There are three types of places into which the thirty-two different cells can be mapped. A cell can be mapped into a main group box. This mapping represents the fact that only one group does the work without working with other groups. Examples include the implementation/unit testing group writing the software and the design group designing the software. Another place to map a cell is between two groups. This mapping represents a handoff. The implementation/unit testing team giving the code to the testing group is an example of a handoff. The third mapping is general overhead. Cells that do not fit the first two mappings belong in the third. Project management is a good example of a mapping that belongs in the third group.

5.6 Mapping of the Three-Tier Process Structure

The first structure to be mapped is the three-tier structure. The three-tier structure provides five main boxes: requirements, design, implementation/unit testing, integration testing, and customer acceptance. Figure 5-2 shows the mapping of the work breakdown structure into the three-tier process structure. Each individual mapping is discussed in this section.


Figure 5-2 Effort Breakdown for Three-Tier

PR_1 is the requirements and analysis activity of the plans and requirements phase. This is where the initial requirements of the system are developed with the customers. PR_1 is conducted by the requirements team and therefore is mapped to the requirements box in the three-tier process structure. The requirements team also starts to plan for quality assurance at the beginning of the project, so PR_7 is also mapped to the requirements team box. While the requirements are being collected, the initial customer acceptance test plan can be created (PR_8). Part of this test plan is the manual for the system.


After the initial requirements are created, the requirements must be handed over to the design group. Product design and programming done in the plans and requirements phase are very high level, usually consisting of initial prototypes that will eventually be discarded. The requirements team transfers to the design team the requirements document along with the initial product design (PR_2) and initial programming (PR_3).

With the requirements document from the requirements group, the design group can start designing the system (PD_2). Any questions for the requirements group or updates to the requirements occur through the requirements and design group handoff; PD_1 represents this activity. The design group also conducts the initial test planning (PR_4) and verification and validation activities (PR_5).

Once the design is created, two major activities occur. First, the integration test plan (PD_4, PD_5) is handed off from the design group to the integration testing team. Second, the detailed design (PD_3) created by the design group is handed off to the implementation/unit testing group.

If there are any questions about the requirements when creating the customer acceptance test plan, PD_7 maps this extra quality assurance activity. The quality assurance activity could cause changes in the requirements, though. At this point the plans and requirements and product design phases are complete. The programming activity phase is ready to start.


The implementation/unit testing group starts developing the code (PA_3). Changes to the requirements propagate through the requirements and design groups (PA_1) and through the design and implementation/unit testing groups (PA_2). With the detailed design already complete, the design group continues working on the integration test plan (PA_4, PA_5, PA_7).

At this point the programming phase is complete and the final phase, integration and test, starts. Any final changes to the requirements are propagated through to the design group (IT_1) and the implementation/unit testing group (IT_2). The handoff of code from the implementation/unit testing group to integration testing (IT_3) is a large task; in this activity all rework is done. Testing can commence once the code is given to the integration team. The final integration test plan (IT_4) is conducted by the integration testing team (IT_5). Once the code is tested, the integrated system is delivered to the customer acceptance team for testing and delivery (IT_7).

In the three-tier structure, there is no specific place to map the project office activities (PR_6, PD_6, PA_6, IT_6) and manuals (PD_8, PA_8, IT_8). These activities are mapped as general overhead that adds to the completion of all software development activities.

5.7 Mapping of the Two-Tier Process Structure

The mapping for the two-tier structure is next. Three main boxes are used. Starting from the mapping for the three-tier process structure, combining the requirements and design teams creates the requirements/design group, and combining the integration testing


and customer acceptance groups creates the integration/customer acceptance group; thus the two-tier process structure is formed. Figure 5-3 shows the two-tier process structure.

Figure 5-3 Two-Tier Effort Breakdown

5.8 Mapping of the One-Tier Process Structure

The final process structure is the one-tier process structure. Since there is only one place for the work to be done, the one and only box contains all the mappings. This can also be seen by combining the requirements/design, integration/customer acceptance, and implementation/unit testing groups into one box.


[Figure 5-4 shows a single box, All Systems Development, taking the desired system to the delivered system.]
Figure 5-4 One-Tier Process Structure

5.9 Populating Staffing into the Process Structures

Unlike COCOMO II, the new cost estimation model is able to include the effects of changing the staff size. Staffing is now modeled as an independent variable, rather than a dependent variable that is the result of effort divided by schedule. The team size can be adjusted in the model from a minimum of one person to however many are wanted. The project manager is no longer limited to a staffing level that must exactly match what COCOMO II suggests for the estimate to be valid. If COCOMO II requires ten people but only seven are available, simply entering the seven people adjusts the scheduled project duration.

Changing the staffing for a team changes both intra-group communication and inter-group coordination. Intra-group communication is calculated directly from the size of the team. Bigger teams need more intra-group communication: a staff meeting with five people takes more effort than a meeting with just three people. Since the amount of communication overhead for a given team size is known, intra-group communication is well understood. Inter-group coordination occurs when two different


teams need to coordinate information. Having bigger teams results in more inter-group coordination in addition to intra-group communication.

Three different process structures are presented in this dissertation. The one-tier process structure has only one team. The two-tier process structure has three teams: the requirements/design team, the implementation/unit testing team, and the integration/customer acceptance team. The three-tier process structure has five teams: the requirements, design, implementation/unit testing, integration testing, and customer acceptance teams.

A method is needed to refer to the different teams in the three process structures. Step 3 of Figure 5-1 is to populate the three different process structures with staffing information.

[Figure 5-5 shows the five three-tier teams (requirements, design, implementation/unit test, integration test, acceptance test) and the artifacts handed between them: requirements document, specification, detailed design, software code, integration plan, and developed system.]
Figure 5-5 Three-Tier Model


The three-tier process structure is enumerated first. With the three-tier model, five variables are needed to represent the number of staff in each team. From the three-tier model, the following variables are created: RequirementsTeamSize, DesignTeamSize, ImplementationTeamSize, IntegrationTestingTeamSize, and AcceptanceTestTeamSize.

[Figure 5-6 shows the three two-tier teams: requirements/design, implementation/unit testing, and integration testing/customer acceptance.]
Figure 5-6 Two-Tier Model

Next, the two-tier process structure is enumerated. The variables that represent the different teams are RequirementsDesignTeamSize, ImplementationTeamSize, and IntegrationCustomerAcceptanceTeamSize.


[Figure 5-7 shows the single one-tier team, All Systems Development.]
Figure 5-7 One-Tier Model

Lastly, the one-tier structure is enumerated. The variable that represents the single team is OneTeamTeamSize.

At this point, each team in each of the three process structures has a variable name, and these variable names are used in the next step to calculate effort. Based on the extended example described earlier in this chapter, the process structures are now populated. Table 5-9 shows one possible method of populating the process structure teams with staff.


Process Structure   Team Name                               Team Size
Three-Tier          RequirementsTeamSize                    2
Three-Tier          DesignTeamSize                          2
Three-Tier          ImplementationTeamSize                  2
Three-Tier          IntegrationTestingTeamSize              2
Three-Tier          AcceptanceTestTeamSize                  2
Two-Tier            RequirementsDesignTeamSize              3
Two-Tier            ImplementationTeamSize                  3
Two-Tier            IntegrationCustomerAcceptanceTeamSize   4
One-Tier            OneTeamTeamSize                         10
Table 5-9 Example Team Sizes

5.10 Effort Calculation

Step 4 of Figure 5-1 is to calculate the effort needed for each process structure based on the staffing information.

5.11 Three-Tier Structure

This section describes how the estimation for the three-tier structure is implemented. The three-tier structure has five different places to put staff members: requirements, design, implementation and unit test, integration test, or acceptance test. The algorithm assumes at least one staff member is assigned to each functional group, but any number of staff members can be present. The communication overhead for each team size is calculated based on Equation 4.1 and Equation 4.2.


CommunicationPaths = n(n − 1) / 2
Equation 5.1 Communication Paths for n People

CommunicationOverhead = 0.001248269 × CommunicationPaths
Equation 5.2 Prediction Equation for Communication Overhead

For example, for a team of 10 people, the communication overhead would be 0.056: a group of 10 people requires almost 6% more effort to complete the task than if a single person did it alone. The next step is to multiply the original COCOMO II effort estimate by each work breakdown cell to get a numerical estimate of effort in each cell.

The phases of the work breakdown structure must be mapped into the group that does the work. Figure 5-8 shows the mapping from the work breakdown structure into the functional groups. Arcs between groups are shared tasks and will affect the combination of the teams. The box labeled general overhead contains activities that are not done by any particular group. As more total staff are added to the project, this overhead grows.


Figure 5-8 Effort Breakdown for Three-Tier

Effort Calculation

To calculate effort, two steps are required. First, the intra-group communication is calculated. Then, the inter-group coordination is calculated. To calculate intra-group communication, the communication overhead is computed for each group based on the staff size of the group, and the total amount of work conducted in the group is multiplied by the communication overhead. The following equation takes a group staff size n and calculates the communication effort multiplier that results from the given staff size.


CEM(n) = 1 + 0.001248269 × n(n − 1) / 2

Using the Communication Effort Multiplier (CEM) equation, the effort increase due to intra-group communication is calculated:

EffortMultiplier_Requirements = CEM(RequirementsTeamSize)
EffortMultiplier_Design = CEM(DesignTeamSize)
EffortMultiplier_Implementation = CEM(ImplementationTeamSize)
EffortMultiplier_IntegrationTesting = CEM(IntegrationTestingTeamSize)
EffortMultiplier_AcceptanceTest = CEM(AcceptanceTestTeamSize)
Equation 5.3 Effort Multipliers Due to Intra-Group Communication

Inter-group coordination:

EffortMultiplier_Requirements&Design = CEM(RequirementsTeamSize + DesignTeamSize)
EffortMultiplier_Design&Implementation = CEM(DesignTeamSize + ImplementationTeamSize)
EffortMultiplier_Implementation&IntegrationTesting = CEM(ImplementationTeamSize + IntegrationTestingTeamSize)
EffortMultiplier_IntegrationTesting&AcceptanceTesting = CEM(IntegrationTestingTeamSize + AcceptanceTestTeamSize)
EffortMultiplier_Requirements&AcceptanceTesting = CEM(RequirementsTeamSize + AcceptanceTestTeamSize)
EffortMultiplier_Design&IntegrationTesting = CEM(DesignTeamSize + IntegrationTestingTeamSize)


EffortMultiplier_All = CEM(RequirementsTeamSize + DesignTeamSize + ImplementationTeamSize + IntegrationTestingTeamSize + AcceptanceTestTeamSize)

Effort_Requirements = PR_1 + PR_7
Effort_Design = PR_4 + PR_5 + PD_2
Effort_Implementation = PA_3
Effort_IntegrationTesting = IT_5
Effort_AcceptanceTest = 0
Effort_Requirements&Design = PR_2 + PR_3 + PD_1 + PA_1 + IT_1
Effort_Design&Implementation = PD_3 + PA_2 + IT_2
Effort_Implementation&IntegrationTesting = IT_3
Effort_IntegrationTesting&AcceptanceTesting = IT_7
Effort_Requirements&AcceptanceTesting = PR_8 + PD_7
Effort_Design&IntegrationTesting = PD_4 + PD_5 + PA_4 + PA_5 + PA_7 + IT_4
Effort_All = PR_6 + PD_6 + PA_6 + IT_6 + PD_8 + PA_8 + IT_8
Equation 5.4 Tier-Three Effort Mapping Equations
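A minimal sketch of the CEM function and the two kinds of multipliers, using the example team sizes from Table 5-9 (the helper name is hypothetical):

    # Communication effort multiplier for a team (or combined teams) of n people.
    def cem(n):
        return 1 + 0.001248269 * n * (n - 1) / 2

    em_requirements = cem(2)             # intra-group: RequirementsTeamSize = 2
    em_requirements_design = cem(2 + 2)  # inter-group: combined team of 4
    print(em_requirements, em_requirements_design)  # ≈ 1.0012, ≈ 1.0075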


TierThreeEffortMultiplier =
  EffortMultiplier_Requirements × Effort_Requirements
  + EffortMultiplier_Design × Effort_Design
  + EffortMultiplier_Implementation × Effort_Implementation
  + EffortMultiplier_IntegrationTesting × Effort_IntegrationTesting
  + EffortMultiplier_AcceptanceTest × Effort_AcceptanceTest
  + EffortMultiplier_Requirements&Design × Effort_Requirements&Design
  + EffortMultiplier_Design&Implementation × Effort_Design&Implementation
  + EffortMultiplier_Implementation&IntegrationTesting × Effort_Implementation&IntegrationTesting
  + EffortMultiplier_IntegrationTesting&AcceptanceTesting × Effort_IntegrationTesting&AcceptanceTesting
  + EffortMultiplier_Requirements&AcceptanceTesting × Effort_Requirements&AcceptanceTesting
  + EffortMultiplier_Design&IntegrationTesting × Effort_Design&IntegrationTesting
  + EffortMultiplier_All × Effort_All

Finally, TierThreeEffortEstimate = TierThreeEffortMultiplier × COCOMO II Effort Estimate.

Schedule Calculation

To calculate the project duration, the formula of effort divided by people is used. The TierThreeEffortEstimate from the previous section represents the effort, and the number of people in a particular group is used for the people. Development effort that is not directly related to a particular team group is added as overhead. The equations that set up the schedule calculation are shown below:


RequirementsOverhead = TierThreeEffortEstimate × PR_6 / 100
DesignOverhead = TierThreeEffortEstimate × (PD_6 + PD_8) / 100
ProgrammingOverhead = TierThreeEffortEstimate × (PA_6 + PA_8) / 100
TestingOverhead = TierThreeEffortEstimate × (IT_6 + IT_8) / 100

(In the time equations below, each cell value PR_i, PD_i, PA_i, or IT_i is read as that cell's share of TierThreeEffortEstimate in person-months, so that dividing by a team size yields months.)

Time for the Plans and Requirements phase:

Time_PR1 = (PR_1 + RequirementsOverhead) / RequirementsTeamSize
Time_PR2 = (PR_2 + RequirementsOverhead) / (RequirementsTeamSize + DesignTeamSize)
Time_PR3 = (PR_3 + RequirementsOverhead) / (RequirementsTeamSize + DesignTeamSize)
Time_PR4 = (PR_4 + RequirementsOverhead) / DesignTeamSize
Time_PR5 = (PR_5 + RequirementsOverhead) / DesignTeamSize
Time_PR7 = (PR_7 + RequirementsOverhead) / RequirementsTeamSize
Time_PR8 = (PR_8 + RequirementsOverhead) / (RequirementsTeamSize + AcceptanceTestTeamSize)

Time for the Product Design phase:

Time_PD1 = (PD_1 + DesignOverhead) / (RequirementsTeamSize + DesignTeamSize)
Time_PD2 = (PD_2 + DesignOverhead) / DesignTeamSize
Time_PD3 = (PD_3 + DesignOverhead) / (DesignTeamSize + ImplementationTeamSize)
Time_PD4 = (PD_4 + DesignOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)
Time_PD5 = (PD_5 + DesignOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)
Time_PD7 = (PD_7 + DesignOverhead) / (RequirementsTeamSize + AcceptanceTestTeamSize)


Time for the Programming Activity phase:

Time_PA1 = (PA_1 + ProgrammingOverhead) / (RequirementsTeamSize + DesignTeamSize)
Time_PA2 = (PA_2 + ProgrammingOverhead) / (DesignTeamSize + ImplementationTeamSize)
Time_PA3 = (PA_3 + ProgrammingOverhead) / ImplementationTeamSize
Time_PA4 = (PA_4 + ProgrammingOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)
Time_PA5 = (PA_5 + ProgrammingOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)
Time_PA7 = (PA_7 + ProgrammingOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)

Time for the Integration and Test phase:

Time_IT1 = (IT_1 + TestingOverhead) / (RequirementsTeamSize + DesignTeamSize)
Time_IT2 = (IT_2 + TestingOverhead) / (DesignTeamSize + ImplementationTeamSize)
Time_IT3 = (IT_3 + TestingOverhead) / (ImplementationTeamSize + IntegrationTestingTeamSize)
Time_IT4 = (IT_4 + TestingOverhead) / (DesignTeamSize + IntegrationTestingTeamSize)
Time_IT5 = (IT_5 + TestingOverhead) / IntegrationTestingTeamSize
Time_IT7 = (IT_7 + TestingOverhead) / (IntegrationTestingTeamSize + AcceptanceTestTeamSize)

To calculate the schedule, the tasks on the critical path are added together. Adding all the times would assume no parallelism, whereas taking only the longest task would assume complete parallelism. Normally, software development projects fall somewhere between the two poles. Summing only the tasks on the critical path leads to a schedule estimate.

TierThreeSchedule = Time_PR1 + Time_PR2 + Time_PD1 + Time_PD2 + Time_PD3 + Time_PA3 + Time_IT3 + Time_IT5 + Time_IT7
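A brief sketch of this critical-path sum; the times dictionary holds made-up illustrative values, not model output:

    # Partial parallelism: sum only the critical-path cells (see the equation above).
    times = {"PR1": 0.8, "PR2": 0.5, "PD1": 0.6, "PD2": 2.1, "PD3": 0.9,
             "PA3": 14.2, "IT3": 3.1, "IT5": 2.4, "IT7": 0.7}  # illustrative only
    critical_path = ["PR1", "PR2", "PD1", "PD2", "PD3", "PA3", "IT3", "IT5", "IT7"]
    tier_three_schedule = sum(times[c] for c in critical_path)  # months
    print(tier_three_schedule)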


5.12 Two-Tier Structure

This section describes how estimation for the two-tier structure is implemented. The two-tier structure has three different places to put staff members: the Analysis and Design group, the Implementation group, or the System Testing group. The algorithm assumes at least one staff member is assigned to each functional group, but any number of staff members can be present. The two-tier process structure is a simplified case of the three-tier team structure: the top two tiers are compressed into one tier, while the implementation tier stays the same. With fewer teams in which to put people, bigger team sizes are expected for the same number of people as in the three-tier structure. The communication overhead is therefore larger, but the work may finish sooner depending on the team sizes. The effort breakdown is again used for the remainder of the algorithm.


Figure 5-9 Two-Tier Effort Breakdown

Effort Calculation

Each grouping's effort multiplier applies the communication effort multiplier to the size of the team, or combined teams, that performs the work:

$$\begin{aligned}
EffortMultiplier_{RequirementsDesign} &= CommunicationEffortMultiplier(TeamSize_{RequirementsDesign}) \\
EffortMultiplier_{Implementation} &= CommunicationEffortMultiplier(TeamSize_{Implementation}) \\
EffortMultiplier_{IntegrationCustomerAcceptance} &= CommunicationEffortMultiplier(TeamSize_{IntegrationCustomerAcceptance})
\end{aligned}$$

For work shared between two groups, the combined team size is used:

$$\begin{aligned}
EffortMultiplier_{RequirementsDesign \& Implementation} &= CommunicationEffortMultiplier(TeamSize_{RequirementsDesign} + TeamSize_{Implementation}) \\
EffortMultiplier_{RequirementsDesign \& IntegrationCustomerAcceptance} &= CommunicationEffortMultiplier(TeamSize_{RequirementsDesign} + TeamSize_{IntegrationCustomerAcceptance}) \\
EffortMultiplier_{Implementation \& IntegrationCustomerAcceptance} &= CommunicationEffortMultiplier(TeamSize_{Implementation} + TeamSize_{IntegrationCustomerAcceptance})
\end{aligned}$$

The effort fractions $Effort_g$ for each grouping are assembled from the COCOMO II phase and activity effort fractions ($PR_i$, $PD_i$, $PA_i$, and $IT_i$) that map to that grouping, with the remainder assigned to the All grouping. The two-tier effort multiplier is then the effort-weighted sum over the seven groupings:

$$TierTwoEffortMultiplier = \sum_{g \in Groupings} EffortMultiplier_{g} \times Effort_{g}$$

where the groupings are RequirementsDesign; Implementation; IntegrationCustomerAcceptance; RequirementsDesign & Implementation; RequirementsDesign & IntegrationCustomerAcceptance; Implementation & IntegrationCustomerAcceptance; and All.


Finally,

$$TierTwoEffortEstimate = TierTwoEffortMultiplier \times \text{COCOMO II Effort Estimate}$$

Schedule Calculation

To calculate project duration, the formula of effort divided by people is used. The TierTwoEffortEstimate from the previous section represents the effort, and the number of people in a particular group supplies the people. Development effort that is not directly related to a particular team group is added as overhead. The equations that set up the schedule calculation are shown below:

$$\begin{aligned}
RequirementsOverhead &= (1 + PR_6) \times TierTwoEffortEstimate \\
DesignOverhead &= (1 + PD_6 + PD_8) \times TierTwoEffortEstimate \\
ProgrammingOverhead &= (1 + PA_6 + PA_8) \times TierTwoEffortEstimate \\
TestingOverhead &= (1 + IT_6 + IT_8) \times TierTwoEffortEstimate
\end{aligned}$$

Time for Plans and Requirements Phase: as in the three-tier case, the time for activity $i$ is its share of the phase effort, inflated by the overhead factor, divided by the size of the team or teams that perform it. In the two-tier structure most requirements-phase activities fall to the RequirementsDesign team, with activity 8 also involving the IntegrationCustomerAcceptance team:

$$Time^{i}_{PR} = \frac{PR_i \times RequirementsOverhead}{\sum TeamSize}, \qquad i = 1, 2, 3, 4, 5, 7, 8$$


Time for Product Design Phase:

$$Time^{i}_{PD} = \frac{PD_i \times DesignOverhead}{\sum TeamSize}, \qquad i = 1, 2, 3, 4, 5, 7$$

with denominators drawn from the RequirementsDesign, Implementation, and IntegrationCustomerAcceptance team sizes according to which teams share activity $i$.

Time for Programming Activity Phase:

$$Time^{i}_{PA} = \frac{PA_i \times ProgrammingOverhead}{\sum TeamSize}, \qquad i = 1, 2, 3, 4, 5, 7$$

again with the combined sizes of the teams that share each activity in the denominator.


Time for Integration and Test Phase:

$$Time^{i}_{IT} = \frac{IT_i \times TestingOverhead}{\sum TeamSize}, \qquad i = 1, 2, 3, 4, 5, 7$$

To calculate the schedule, the tasks that are on the critical path are added together. The schedule equation for the two-tier structure is equivalent to the three-tier schedule calculation:

$$TierTwoSchedule = Time^{1}_{PR} + Time^{2}_{PR} + Time^{1}_{PD} + Time^{2}_{PD} + Time^{3}_{PD} + Time^{3}_{PA} + Time^{3}_{IT} + Time^{5}_{IT} + Time^{7}_{IT}$$

5.13 One-Tier Structure

By adding the impact of team size on the total effort, the one-tier calculation for effort follows:

$$TierOneEffortEstimate = CommunicationEffortMultiplier(TeamSize_{OneTeam}) \times \text{COCOMO II Effort Estimate}$$


The schedule is:

$$TierOneSchedule = \frac{TierOneEffortEstimate}{TeamSize_{OneTeam}}$$

5.14 Staff Loading

A new variable called staff loading is created by this cost estimation model. This variable represents the percentage of time that groups in the two-tier and three-tier structures are assigned to a task. In the one-tier structure, people can be thought of as always working on a task, so the staff loading is 100%: each staff member in a one-tier process structure is always working on the critical path. If a staff member in a one-tier project is sick for a day, an extra day is added to the end of the schedule unless that time is made up in another way. With the two-tier and three-tier structures, work is not always conducted on the critical path, so the staff loading represents how much work effort is being planned for the critical path.

5.15 Optimization

The software cost estimation model provides an optimization routine for each structure. Based on an objective function, the model tries different team sizes in order to minimize the function. The default function is listed below:

$$OptimizationFunction = Minimize(Schedule)$$

The optimization function tries to staff the project in a way that minimizes the amount of time the project takes to complete. At some point, adding additional staff results in more overhead than the additional staff provides in productivity; right before this point is the optimal staffing point. In addition to finding the optimal staffing point, the optimization engine can apply two additional constraints. The first constraint specifies a minimum total staff: the optimization engine finds the optimal staffing point with a total staff of at least the minimum. The second constraint is a maximum total staff, a ceiling that the optimization engine must honor. Both constraints can be used simultaneously to limit the solution space to between a minimum and maximum number of total staff.

The algorithm is implemented in two different ways. The first is brute-force optimization, used for both the one-tier and two-tier process structures; all possible combinations of staff can be checked in under a second with a brute-force approach. But optimizing a three-tier process structure is inefficient with brute force: in some cases, finding an optimal result would be expected to take many years. So an external nonlinear solver is used to perform the optimization. Lingo 8.0 by LINDO Systems Inc. solves exactly the same problem that was being attempted with brute force, but much more efficiently; the pre-solver in Lingo reduces the optimization problem so that it solves in just a few seconds. The Lingo script is available in Appendix C.
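The brute-force method can be sketched as follows for the one-tier structure. This is an illustration only: CommOverhead is a hypothetical stand-in for the model's communication effort multiplier, and the constraint values are invented.

using System;

// A sketch of the brute-force staffing optimization with optional
// minimum/maximum total-staff constraints. CommOverhead is a
// hypothetical stand-in; PSEstimate's actual multiplier is not
// reproduced here.
class StaffingOptimizerSketch
{
    // Hypothetical: overhead grows with the number of communication paths.
    static double CommOverhead(int teamSize) =>
        1.0 + 0.01 * (teamSize * (teamSize - 1) / 2.0);

    static void Main()
    {
        double cocomoEffort = 181.8;   // person-months, from COCOMO II
        int minStaff = 5, maxStaff = 40;

        int bestStaff = minStaff;
        double bestSchedule = double.MaxValue;

        // Enumerate every admissible team size and keep the one that
        // minimizes the schedule (the default objective function).
        for (int staff = minStaff; staff <= maxStaff; staff++)
        {
            double effort = CommOverhead(staff) * cocomoEffort;
            double schedule = effort / staff;
            if (schedule < bestSchedule)
            {
                bestSchedule = schedule;
                bestStaff = staff;
            }
        }

        Console.WriteLine($"Optimal staff: {bestStaff}, " +
                          $"schedule: {bestSchedule:F1} months");
    }
}

The same enumeration generalizes to the two-tier structure by looping over each group's size; for the three-tier structure the search space grows too large, which is why the nonlinear solver is used instead.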


5.16 Conclusion

This chapter describes the building of the new software cost estimation model. The improvements over COCOMO II were shown in staff allocation optimization. All the equations needed for the algorithm to create the new estimates have been shown in this chapter, along with a sample test case to show the algorithm in use. Improvements to cost estimation are possible with use of the new software cost estimation model. The following chapter provides empirical support to validate the model described in this chapter.


CHAPTER 6

DECISION SUPPORT TOOL

Build a system that even a fool can use, and only a fool will want to use it. — George Bernard Shaw

6.1 Example Test Run

This chapter describes how the PSEstimate tool works by estimating a sample project through the tool. Screenshots are provided to illustrate the tool at different points in the estimation process.

COCOMO II Estimate

With the new software cost estimation model described, a sample project shows the models in use. A test case with a software size of 40 KSLOC is used with the default COCOMO II effort multipliers and scale factors. COCOMO II estimates total effort to be 169.9 people-months. However, COCOMO II by default does not include the requirements phase of development; the effort and schedule required to build the requirements have to be added to the COCOMO II estimate. In this case, the new effort from COCOMO II including requirements is 181.8 people-months. COCOMO II estimates that 9.5 people are required and that the project will take 19.2 months.

One-Tier Estimate

The estimate for the one-tier structure is 192 people-months. The difference between the COCOMO II estimate and the one-tier estimate is due to communication overhead. COCOMO II's 9.5-staff estimate is rounded to 10 people, resulting in a schedule of 19.2 months. However, if 13 people are used instead of 10, the effort increases to 199.5 people-months, but the schedule is reduced to 15.3 months. COCOMO II itself has only limited support for changing the schedule.

Two-Tier Structure

The estimate for the two-tier structure is 185 people-months, using ten people as in the one-tier structure. However, the calculated schedule is 31 months. With three people placed in the analysis and design group, three in system testing, and four in implementation, the model shows that software development will take much longer than COCOMO II estimates. If instead six people are placed in analysis and design, six in system testing, and nine in implementation, the total effort only increases to 198 people-months, but the schedule is reduced to 16 months.

Three-Tier Structure

The estimate for the three-tier structure is 184 people-months, using ten people: two in requirements, two in design, two in implementation, two in integration testing, and two in acceptance testing. The calculated schedule is 52 months. But if three people are in requirements, five in design, nine in implementation, six in integration testing, and one in acceptance testing, the total effort increases to 199 people-months, but the schedule is reduced to 16 months.


Conclusion

In all three cases, using the model described in this chapter, assigning different team sizes than COCOMO II suggests improves the schedule estimate for development. For the same data, the best process structure is a one-tier process structure with 13 people. There were no bad structures for the sample test case as long as staff were assigned to each group optimally. Without good staff allocation, the three-tier structure will deliver a software project much later than is estimated by COCOMO II.

6.2 Tool Discussion

This next section shows the developed tool in use. Four different screenshots are used to show the new cost estimation tool.

The first screenshot details the choices in project characteristics available. Both the lines-of-code and the function point methodology are available for software sizing. In the following screenshot, the function point methodology is used with backfiring to come up with an equivalent lines-of-code estimate: an estimate of 900 unadjusted function points implemented in C++ yields 47,700 lines of code. The five scale factors are also selectable. Finally, the option of using the early design or post-architecture effort multipliers is available.
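A sketch of the backfiring conversion follows. The C++ ratio of 53 source lines per function point is implied by the figures above (900 x 53 = 47,700); the code is illustrative, not PSEstimate's implementation.

using System;
using System.Collections.Generic;

// Backfiring: unadjusted function points are converted to equivalent
// source lines of code with a language-specific ratio.
class BackfiringSketch
{
    static readonly Dictionary<string, int> SlocPerFunctionPoint = new()
    {
        ["C++"] = 53,
        // Other languages would carry their own published ratios.
    };

    static int Backfire(int unadjustedFunctionPoints, string language) =>
        unadjustedFunctionPoints * SlocPerFunctionPoint[language];

    static void Main()
    {
        Console.WriteLine(Backfire(900, "C++")); // prints 47700
    }
}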


Figure 6-1 Screenshot of Estimating Software Size

The next screenshot shows the results of the estimation based on the project characteristics. COCOMO II estimates for effort, schedule, and staffing are shown, along with the derived effort multipliers, scale factors, and equivalent lines of code. The three different process structures are estimated based on a default staffing algorithm.


Figure 6-2: Screenshot of Developed Tool Simulation Results

The next screenshot shows the results of optimizing each process structure. There is a large improvement in schedule after optimization.


Figure 6-3: Screenshot of Developed Tool After Optimization

6.3 Tool Construction

PSEstimate was developed in C# using Visual Studio .NET 2002, Visual Studio .NET 2003, and finally Visual Studio 2005; as the technology changed, the software was updated as needed. The final version is compiled in Visual Studio 2005. The program runs on Microsoft's .NET Framework 2.0. The software uses Microsoft's ClickOnce deployment method to be placed on the web. ClickOnce web deployment requires users to navigate to the web server where PSEstimate is located, and any updates are automatically retrieved, guaranteeing users the latest version of the software. Approximately 10,000 lines of C# code were written to implement the tool, which took the author approximately two years of full-time work to design, implement, and test. The tool uses external code, the Lingo 8 API, for the tier-three nonlinear solver; this code is called via Dynamic Link Library calls. Since the Lingo 8 API is written for Windows machines, the software currently runs only on Windows-based machines.
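The mechanism can be sketched as follows. The entry-point name and signature below are hypothetical illustrations of a P/Invoke declaration, not the actual Lingo 8 API.

using System;
using System.Runtime.InteropServices;

// A sketch of how a native solver DLL can be reached from C# via
// P/Invoke. "SolveModel" is a hypothetical export used only to show
// the calling convention.
class NativeSolverSketch
{
    [DllImport("Lingo8.dll", EntryPoint = "SolveModel")] // hypothetical
    static extern int SolveModel(string modelScript, out double objective);

    static void Main()
    {
        // The model script would be generated from the tier-three
        // staffing problem and passed to the native solver.
        int status = SolveModel("MODEL: ... END", out double objective);
        Console.WriteLine($"status={status}, objective={objective}");
    }
}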

PAGE 127

CHAPTER 7

EXPERIMENTAL VALIDATION

7.1 Introduction

This chapter details the experimental validation used to assess the software cost estimation artifact and project management tool developed in this dissertation. Justifying and evaluating an artifact is an important step in the design science paradigm.

Figure 7-1 Design Science Research Model (Hevner, March et al. 2004)


7.2 Study Rationale

According to McGrath, there are eight different research strategies available when designing a study: laboratory experiments, experimental simulations, field experiments, field studies, computer simulations, formal theory, sample surveys, and judgment tasks. Any particular type of study has strengths and weaknesses with respect to three objectives: generalizability with respect to populations; precision in control and measurement of variables related to the behaviors of interest; and existential realism, for the participants, of the context within which those behaviors are observed (McGrath 1982). While each objective is important, it is impossible to maximize all three simultaneously in one study. This problem is commonly known as McGrath's three-horned dilemma.

A laboratory experiment can maximize control at the expense of both generalizability and realism. An experiment is the best study available to capture cause and effect: by using a control group and an experimental group, differences between the groups can be attributed to the treatment, i.e., being in the control group or the experimental group. A field experiment can maximize realism at the expense of control and generalizability; a field experiment is conducted in an organization, but it is very hard to create controlled conditions there, and the results represent a particular organization. A sample survey can maximize generalizability at the expense of both control and realism: a survey can be sent to a random sample of people, but control and realism are poor.


An experiment should strive to provide as much control as possible in order to show cause and effect. A key strength of the controlled experiment is that, since other possible effects are controlled, all variation in the dependent variable is attributed to the treatments. Participants are randomly assigned to one of three treatment groups, which makes the study a true randomized controlled experiment. Random assignment is important from the experimental and data analysis standpoint: in this study a participant is not guided or placed into any treatment group based on any factor other than random assignment, so all participants have an equal chance of being assigned to any treatment group. Two coin tosses were used to randomize participants into groups. Participants were placed in a treatment group based on the outcome, as shown in Table 7-1.

First Coin Toss | Second Coin Toss | Result
Heads | Heads | Manual
Heads | Tails | COCOMO II
Tails | Heads | PSEstimate
Tails | Tails | Repeat

Table 7-1 Randomizing to Treatments

Notice that if two tails are flipped, the whole procedure starts over, to ensure that each participant has an equal chance of being in any one of the three groups.
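A sketch of this rejection-sampling procedure in C# (illustrative; the study used physical coins, not code):

using System;

// Two-coin-toss randomization from Table 7-1: heads/heads assigns
// Manual, heads/tails COCOMO II, tails/heads PSEstimate, and
// tails/tails repeats the procedure, so each treatment keeps an
// equal one-third probability.
class RandomizationSketch
{
    static readonly Random Rng = new Random();

    static string AssignTreatment()
    {
        while (true)
        {
            bool firstHeads = Rng.Next(2) == 0;
            bool secondHeads = Rng.Next(2) == 0;
            if (firstHeads && secondHeads) return "Manual";
            if (firstHeads) return "COCOMO II";
            if (secondHeads) return "PSEstimate";
            // Tails/tails: reject and toss again.
        }
    }

    static void Main() => Console.WriteLine(AssignTreatment());
}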


The three possible treatment groups are no tool support (the manual group), COCOMO II, and PSEstimate. The no-tool-support group is the control group; it was not given any cost estimation model to help estimate software. The second group, COCOMO II, was given a computer tool that supports estimating with COCOMO II. The third group, PSEstimate, was given the tool developed in this dissertation to help estimate software costs.

7.3 Institutional Review

The University of South Florida Institutional Review Board is required to approve all studies that involve human subjects, and requires that all investigators have proper training in conducting studies with human subjects. This study was approved by the Institutional Review Board as IRB # 101906; the approval for the study is listed in Appendix D.

For this experiment, all participants were briefed on the study in general, were able to read the consent form and have any questions about the consent documents or study answered, and then signed that they acknowledged and gave consent to participate in the study. All participants were given a signed copy of the consent document for their own records.

7.4 Research Questions

The research questions are repeated from Chapter 1:

Research Question 1: Can a software cost estimation model be built that models the effect of both inter-group coordination and intra-group communication?

Research Question 2: Can a software cost estimation tool be built for project managers that implements inter-group coordination, intra-group communication, and process structure?


Research Question 3: Does an experiment demonstrate the effectiveness of the new software cost estimation model?

The empirical study focuses on the third research question: will the experiment demonstrate the effectiveness of the software cost estimation model that was built in the previous chapters?

7.5 Hypotheses

Based on the third research question and the research model, five hypotheses were developed. The research model relates the method of estimation (no model; the state-of-the-practice model, COCOMO II; and a state-of-the-practice model that includes the effects of inter-group coordination and intra-group communication) to five outcomes: accuracy of software estimates (H1), consistency of software estimates (H2), confidence in software estimates (H3), satisfaction with the estimation technique (H4), and perceived usefulness of the estimation technique (H5).

Table 7-2 Research Model


H1: Use of a software cost estimation tool for software development projects increases the accuracy of effort and schedule estimates.

H2: Use of a software cost estimation tool for software development projects reduces the variation (increases the consistency) of estimates for effort and schedule.

H3: Users of a software cost estimation tool for software development projects are more likely to have an appropriate level of confidence in their estimates than estimators without support.

H4: Use of a software cost estimation tool for software development projects increases satisfaction with the estimation technique.

H5: Use of a software cost estimation tool for software development projects increases the users' perceived usefulness of the estimation technique.

7.6 Pretest

Several pretests were conducted during the development of the cost estimation tool. One pretest used twelve master's students in MIS to estimate staff and effort while using the PSEstimate tool. It was found that more work was needed before the tool could be used in an experimental setting. The participants were timed during this initial pretest, and many needed more than 90 minutes to estimate two tasks. The main problem arose when participants tried to estimate staffing with the three-tier structure: using a brute-force algorithm to find an optimal staffing required many calculations, and a single staffing optimization scenario would not finish in less than one hour. The participants would either have to sit and wait or cancel the task. To fix this problem, a nonlinear solver was used that reduced the optimization problem to fewer than two seconds, which made the tool much more useful to estimators.

The second pretest was conducted with one MIS doctoral student to get an idea of the time needed to conduct the experiment with all the changes made after the first pretest. The pretest was successful; feedback was obtained about the software and the experimental materials, and slight changes in instructions were made to clarify what was expected in the experiment.

7.7 Pilot Test

After two pretests and many changes to the experimental task and materials, a pilot test was conducted in which all three experimental treatments were administered. A total of four people went through the experiment: one person in the manual group, one in the COCOMO II group, and two in the PSEstimate group. The pilot data were not sufficient for any kind of analysis, but one task was changed because several participants rated it as impossible to estimate with the given information. This was an error and was corrected before the main data collection started.

7.8 Main Study

After several pretests and a pilot test, the main data collection was ready to occur. The first step was to find participants. Participants in the study were selected based on their prior knowledge about software cost estimation, so they were recruited from local companies. Employees who were either project managers or team leaders were targeted. People currently working on a project management certification were also deemed sufficient, since work experience as a project manager is a prerequisite for the PMP certification.


A graduate course in project management was also targeted, since estimation knowledge would be sufficient in this type of course. Finally, various faculty members with work experience as project managers were also targeted.

In the end, 34 participants completed the experiment. The average participant was 34 years old; the oldest was 54 and the youngest was 21. Twenty-four of the participants were male, ten were female. On average, participants had 15 years of full-time work experience, 12 years of it IT-related, and five years in their current position.

7.9 Training

All participants were given a 45-minute presentation on software cost estimation before participating in the experiment. During the briefing, the experimental materials were explained to the participants. After the briefing, any remaining questions were answered, and the participants were then allowed to work on the experimental tasks.

7.10 Experimental Tasks

In this experiment, to obtain maximum control, all participants were given the same experimental materials except for one sheet of paper directing participants to a website from which to download the software for the experiment. The COCOMO II and PSEstimate groups were each given a different website; the manual group was given no additional information. The COCOMO II and PSEstimate treatment pages can be seen in the experimental materials in Appendix E.


After initial training, all participants read an instruction sheet that thanked them for their participation and outlined the tasks. The welcome sheet can be seen in Appendix E. The next page in the experimental materials was the Institutional Review Board consent form; these three pages can be seen in Appendix E. The participants were allowed to keep a copy of the consent form, which included the phone number of the investigators in case they had any additional questions or concerns.

After the participants filled out the consent form, the next step was to complete a pre-experiment questionnaire, which can be seen in Appendix E. In this questionnaire, demographic information such as age, employment history, and previous estimating background was collected.

7.11 Experimental Task 1

After all the demographic information was collected, the participants were to start estimating the first task, writing down the start time when beginning. Task 1 was broken into three parts: Task 1a, Task 1b, and Task 1c. In Task 1a, a 30K project was to be estimated. The participants were told they had 12 people to work on the task and that everyone worked in a single group. The participants were given historical data from other similar projects, but there was no project exactly comparable to this one. Participants were also given some qualitative information about the team being very experienced and working well together. They were also told that the project was not complex and that the development platform was commonly used in the organization.


With this information, the participants had to estimate the effort in man-hours required to complete the development. They had to rate their confidence in the effort estimate, give best- and worst-case values for effort, and provide a rationale for their estimate. The participants also had to do the same for schedule.

Task 1b was the same as Task 1a except that the staffing available for the project increased from 12 people to 24. The participants were asked to answer the same set of questions for effort and schedule as in Task 1a.

In Task 1c, everything was the same as Task 1a except that the project was now bigger: the size went from 30K to 180K, and the staffing was reduced from 12 to 11. The participants were asked to estimate effort and schedule. This task worked as a manipulation check because there was a very similar project in the historical data: a 183K project versus the 180K project proposed. The participants should have used this information to help them with estimation. After Task 1c was estimated, the participants were asked to write the stop time for this task in the materials, from which a total time-on-task measure can be calculated for Task 1.

7.12 Experimental Task 2

When starting Task 2, the participants were asked to record the time they started working on the task; at the end of Task 2, they were to record the stop time so that the time on task for Task 2 could be calculated.

Experimental Task 2 was again broken into three subtasks: Task 2a, Task 2b, and Task 2c. All three subtasks had exactly the same setup except for the staffing arrangement. The project was larger than the projects in Tasks 1a and 1b: the task was to make estimates about developing a financial system requiring an e-commerce application 80K in size. The task was set up as a financial system so that it would require more effort than a normal project. In Task 2a, the staffing was set to 30 people working in one project group. The historical data was explained to be not relevant, since this project had more than double the staff of any historical project. The participants had to estimate the effort and schedule required to complete this project, just as in the previous task.

In Task 2b, everything was the same as Task 2a except that the staffing structure changed: instead of one project group, the 30 people were divided among three groups. The first team was the requirements and design team, consisting of 9 people; the second was the implementation team, consisting of 13 people; and the third was the testing team, consisting of 8 people. Again, the participants were asked to estimate effort and schedule.

In Task 2c, another staffing structure change was made from Task 2a. In this case, five project groups were used to build the software system: a requirements team of 4 people, a design team of 6 people, an implementation team of 12 people, a testing team of 7 people, and a customer acceptance team in which one person would perform all the customer acceptance activities.

PAGE 138

Notice that there is still a total of 30 people working on the system development. The participants were asked to estimate the effort and schedule required to conduct this project.

Experimental Task | Project Size | Staffing | Historical Data Available? | Project Characteristics | Best Estimation Tool
1a | 30K | 12 | Yes, but not exactly similar | Team very experienced, works well together; project not too complex and development platform common | Historical Data
1b | 30K | 24 | Yes, but not exactly similar | Same | PSEstimate
1c | 180K | 11 | Yes, very similar | Same | Historical Data
2a | 80K | 30 (Tier 1) | Yes, but not relevant | Complex e-commerce application | PSEstimate
2b | 80K | 30 (Tier 2) | Yes, but not relevant | Same | PSEstimate
2c | 80K | 30 (Tier 3) | Yes, but not relevant | Same | PSEstimate

Table 7-3 Experimental Tasks Overview

7.13 Post Experiment Questionnaire

After all the tasks were estimated, a post-experiment questionnaire was completed (see Appendix E). The questionnaire measures three main constructs and includes the manipulation check: the first set of questions measures the participants' perceived usefulness of the estimation technique, the next set measures the participants' satisfaction with the experiment, and the last set serves as additional manipulation checks.

Manipulation checks were needed in order to show that a particular participant in a treatment group actually received the treatment; they add to the rigor of the experimental method. Asking certain questions after the experiment provides information showing whether the manipulation was effective. Four questions were used in the manipulation check:

In making my software cost estimates, the technique I mainly used:
a. A Calculator
b. Spreadsheet
c. Historical Data
d. Historical Data along with COCOMO II
e. Historical Data and PSEstimate
f. Other (please specify) __________________________________

During the study, circle all of the following techniques that you used to make software cost estimates:
a. A Calculator
b. Spreadsheet
c. Historical Data
d. COCOMO II
e. PSEstimate
f. Other (please specify) __________________________________

My preferred method to estimate software cost is to use ___________________ to come up with my estimates.

When conducting Task 2, how do you think the difference in structures changed the communication that occurred as the same thirty people moved to smaller groups?

From these questions, an analysis can be conducted of whether a participant who was placed in the COCOMO II group ended up not using COCOMO II for the estimation task.

PAGE 140

CHAPTER 8

RESULTS AND DISCUSSION

8.1 Introduction

This chapter reports the results of the empirical study presented in Chapter 7. The hypotheses presented in Chapters 1 and 7 are tested, with a discussion of the findings. Five main hypotheses are tested; the research model (Figure 8-1) relates the method of estimation (no model; the state-of-the-practice model, COCOMO II; and a model that adds the effects of inter-group coordination and intra-group communication) to the accuracy, consistency, and confidence of software estimates and to satisfaction with and perceived usefulness of the estimation technique (H1-H5).

Figure 8-1 Empirical Research Model


8.2 Treatment Breakdown

The random assignment of the 34 participants to three groups resulted in a desirable breakdown: twelve people were assigned to the manual group, eleven to COCOMO II, and eleven to the PSEstimate group.

Treatment | Number of Participants
Manual | 12
COCOMO II | 11
PSEstimate | 11

Table 8-1 Treatment Breakdown

8.3 Data Analysis Overview

With a sample size of 34 participants, a nonparametric data analysis is the most conservative. Several data analysis techniques were studied to find the best possible analysis that would not violate assumptions. Since there were three groups, a Kruskal-Wallis One-Way Analysis of Variance by Ranks was a possible choice. The Kruskal-Wallis test is an extension of the Mann-Whitney U test: the Mann-Whitney U test is limited to two groups, whereas the Kruskal-Wallis test expands the analysis to N groups. The Kruskal-Wallis test has four assumptions (Abell, Braselton et al. 1999):

1. Samples are independent, random samples, one for each of K populations, where the median of population i is denoted by M_i, i = 1, ..., k.

2. The sample values are at least ordinal, categorical data.

3. The populations all have the same shape. (If the populations differ, this difference is only in location.)


4. The populations each have a continuous distribution.

Assumptions one, two, and four are satisfied through the experimental design, but assumption three cannot be assumed to be satisfied; in particular, Hypothesis 2 specifically tests whether the populations have different shapes. Parametric analyses like ANOVA make even more demanding assumptions. Since the standard parametric and nonparametric tests cannot be used, a different test was required. The best analysis was found in SAS under the procedure MULTTEST, which can use bootstrapping to obtain population estimates rather than relying on distributional assumptions. The downside of this procedure is that it can take much time when bootstrapping large datasets a large number of times; with a modern computer, however, bootstrapping 20,000 times was a trivial task for this dataset.

8.4 Expert Validation

One expert in software cost estimation rated the tasks. These values are used as the "correct" answers for the analyses conducted in this chapter. The expert's ratings of effort (E) and schedule (S) for each task are shown below:

Task | 1a | 1b | 1c | 2a | 2b | 2c
Effort (E) | 60 | 40 | 1582 | 180 | 160 | 150
Schedule (S) | 13 | 12 | 40 | 20 | 19 | 19

Table 8-2 Expert's Ratings of Effort and Schedule for Tasks

8.5 Accuracy

The first hypothesis concerns the accuracy of the three treatment groups. Each treatment group used a different type of tool for estimation: the first had no support, the second used COCOMO II, and the third used PSEstimate. Hypothesis H1 is as follows:

H1: Use of a software cost estimation tool for software development projects increases the accuracy of effort and schedule estimates.

It is important to break the hypothesis into two parts, one for effort and one for schedule. This creates:

H1a: Use of a software cost estimation tool for software development projects increases the accuracy of effort estimates.

H1b: Use of a software cost estimation tool for software development projects increases the accuracy of schedule estimates.

The first step in testing this hypothesis is to see whether there is a difference among the groups in their estimates for effort and schedule.

Treatment | Task 1A Effort | Task 1B Effort | Task 1C Effort
Manual | Mean = 28, Std. Dev. = 14, Range = 5-50 | Mean = 57, Std. Dev. = 127, Range = 1.5-456 | Mean = 133, Std. Dev. = 85, Range = 2-250
COCOMO II | Mean = 53, Std. Dev. = 33, Range = 20.5-132.5 | Mean = 106, Std. Dev. = 183, Range = 8.73-633.9 | Mean = 247, Std. Dev. = 89, Range = 160-430
PSEstimate | Mean = 63, Std. Dev. = 25, Range = 30-112 | Mean = 71, Std. Dev. = 33, Range = 16-130 | Mean = 288, Std. Dev. = 126, Range = 75-550

Table 8-3 Results for Task 1 for Effort

Treatment | Task 2A Effort | Task 2B Effort | Task 2C Effort
Manual | Mean = 69, Std. Dev. = 60, Range = 3.8-198 | Mean = 75, Std. Dev. = 68, Range = 3.8-210 | Mean = 140, Std. Dev. = 161, Range = 3.8-500
COCOMO II | Mean = 375, Std. Dev. = 197, Range = 76.1-610.3 | Mean = 355, Std. Dev. = 225, Range = 79.9-720.3 | Mean = 394, Std. Dev. = 285, Range = 80-986.6
PSEstimate | Mean = 384, Std. Dev. = 203, Range = 137-712 | Mean = 353, Std. Dev. = 172, Range = 109-600 | Mean = 347, Std. Dev. = 171, Range = 93-581

Table 8-4 Results for Task 2 for Effort
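Before turning to the results, the resampling idea behind the analysis can be sketched in C#. This illustrates a bootstrap test of a difference in group means; it is not the SAS MULTTEST implementation, and the data values are invented.

using System;
using System.Linq;

// Bootstrap test for a difference in group means, in the spirit of the
// resampling used by the SAS MULTTEST procedure in this analysis.
class BootstrapSketch
{
    static void Main()
    {
        double[] groupA = { 28, 35, 50, 5, 22 };   // illustrative estimates
        double[] groupB = { 63, 30, 112, 70, 45 };
        double observed = Math.Abs(groupA.Average() - groupB.Average());

        var rng = new Random(1054);                // seed as in the study
        double[] pooled = groupA.Concat(groupB).ToArray();
        int resamples = 20000, extreme = 0;

        for (int i = 0; i < resamples; i++)
        {
            // Resample both groups with replacement from the pooled data
            // (the null hypothesis of no group difference).
            double meanA = Resample(pooled, groupA.Length, rng).Average();
            double meanB = Resample(pooled, groupB.Length, rng).Average();
            if (Math.Abs(meanA - meanB) >= observed) extreme++;
        }

        Console.WriteLine($"bootstrap p = {(double)extreme / resamples:F4}");
    }

    static double[] Resample(double[] data, int n, Random rng) =>
        Enumerable.Range(0, n).Select(_ => data[rng.Next(data.Length)]).ToArray();
}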

PAGE 144

The results of bootstrapping 20,000 times with a seed of 1054 are shown in Table 8-5, with significant differences (p < .10) marked with an asterisk. The results show significant differences in effort estimation between the PSEstimate and manual groups for Tasks 1a, 1c, 2a, and 2b, and between the COCOMO II and manual groups for Tasks 2a and 2b. No significant differences in effort estimation were found between the two groups using computer-based tools, COCOMO II and PSEstimate.

Contrast | T1A E | T1B E | T1C E | T2A E | T2B E | T2C E
Manual vs. COCOMO II | .23 | .99 | .13 | .0008* | .005* | .09*
Manual vs. PSEstimate | .02* | 1.00 | .01* | .0007* | .005* | .26
COCOMO II vs. PSEstimate | .99 | .99 | .99 | 1.00 | 1.00 | 1.00

Table 8-5 Bootstrap p-values for Effort

Treatment | Task 1A Schedule | Task 1B Schedule | Task 1C Schedule
Manual | Mean = 11, Std. Dev. = 11, Range = 3-40 | Mean = 61, Std. Dev. = 180, Range = 1.25-631.5 | Mean = 23, Std. Dev. = 16, Range = 3-71
COCOMO II | Mean = 10, Std. Dev. = 5, Range = 2.1-17.4 | Mean = 7, Std. Dev. = 4, Range = 1.5-10.8 | Mean = 20, Std. Dev. = 3, Range = 15.5-24.5
PSEstimate | Mean = 10, Std. Dev. = 7, Range = 2-23 | Mean = 7, Std. Dev. = 5, Range = 1.3-17 | Mean = 35, Std. Dev. = 33, Range = 10-127

Table 8-6 Results for Task 1 for Schedule

Treatment | Task 2A Schedule | Task 2B Schedule | Task 2C Schedule
Manual | Mean = 16, Std. Dev. = 22, Range = 1-80 | Mean = 14, Std. Dev. = 18, Range = 1-65 | Mean = 45, Std. Dev. = 85, Range = 1-300
COCOMO II | Mean = 20, Std. Dev. = 12, Range = 3-48 | Mean = 21, Std. Dev. = 12, Range = 4-42.5 | Mean = 21, Std. Dev. = 12, Range = 6-45
PSEstimate | Mean = 28, Std. Dev. = 36, Range = 4.7-133 | Mean = 33, Std. Dev. = 34, Range = 9-133 | Mean = 35, Std. Dev. = 34, Range = 9-133

Table 8-7 Results for Task 2 for Schedule

PAGE 145

The results of bootstrapping 20,000 times with a seed of 1054 are shown in Table 8-8. There are no significant differences between groups in schedule estimation.

Contrast | T1A S | T1B S | T1C S | T2A S | T2B S | T2C S
Manual vs. COCOMO II | 1.00 | .92 | 1.00 | 1.00 | .99 | .97
Manual vs. PSEstimate | 1.00 | .92 | .86 | .96 | .44 | 1.00
COCOMO II vs. PSEstimate | 1.00 | 1.00 | .66 | .99 | .94 | .99

Table 8-8 Bootstrap p-values for Schedule

An additional test was conducted to see whether there were differences between treatment groups in effort and schedule. Welch's ANOVA is used when the assumptions of a parametric ANOVA are violated, particularly the assumption of equal variance.

Task | Welch's ANOVA for Effort | Welch's ANOVA for Schedule
Task 1a | .0014 | .92
Task 1b | .7727 | .60
Task 1c | .0034 | .35
Task 2a | <.0001 | .66
Task 2b | <.0001 | .29
Task 2c | .0105 | .32

Table 8-9 Welch's ANOVA for Effort and Schedule

The results of the Welch's ANOVA test are consistent with the results obtained using the bootstrapping technique to determine whether the groups differed in their estimations. With both techniques there were significant differences between the manual and PSEstimate groups for Tasks 1a, 1c, 2a, and 2b for effort, and between the manual group and COCOMO II for Task 2b for effort. There were no significant differences in schedule between any of the groups with either bootstrapping or Welch's ANOVA.

PAGE 146

When the COCOMO II and PSEstimate treatments are combined to form a new group (tool) and compared against the manual group (no tool), the following results occur:

Contrast | T1A E | T1B E | T1C E | T2A E | T2B E | T2C E
Tool vs. No Tool | .025 | .999 | .01 | <.0001 | .0004 | .059

Table 8-10 Bootstrap p-values for Effort

Contrast | T1A S | T1B S | T1C S | T2A S | T2B S | T2C S
Tool vs. No Tool | 1.00 | .92 | 1.00 | 1.00 | .98 | .97

Table 8-11 Bootstrap p-values for Schedule

The tool did not make a significant difference for Task 1b. There was a difference for Task 1a, but when the manual group was given double the people in Task 1b, the group effectively doubled the effort; even though the manual group underestimated Task 1a, this gross correction for doubling the staff brought its average in line with the other groups.

The fact that there are no significant differences between groups in schedule estimation is rather interesting. All treatment groups approached the same answer for the amount of time it would take to complete a project: even though individuals might not have a correct answer, the averaging of estimates led to a good estimate.

With significant differences found for effort, an analysis is conducted to see which group is the most accurate. There are two methods by which accuracy can be judged. The first is to measure which group has a raw mean or bootstrap mean closest to the expert's estimates; because of the amount of bootstrapping, the raw means and bootstrap means are equal. The second method is to measure each participant's difference from the expert as a percentage score. The mean of each group's raw percentage scores, or a bootstrap mean, can be measured to find the one closest to zero, which signifies the most accurate group. The table that follows summarizes the results of all four data analysis techniques.

Task | Measure | Expert | M | C | P | M % | C % | P %
1a | Effort | 60 | 28 | 53 | 63 | 54% | 12% | 6%
1a | Schedule | 13 | 11 | 10 | 10 | 14% | 22% | 25%
1b | Effort | 40 | 57 | 106 | 71 | 43% | -166% | -76%
1b | Schedule | 12 | 61 | 7 | 7 | -411% | 44% | 45%
1c | Effort | 1582 | 133 | 247 | 288 | 92% | 84% | 81%
1c | Schedule | 40 | 23 | 20 | 35 | 43% | 49% | 14%
2a | Effort | 180 | 69 | 375 | 384 | 62% | -109% | -114%
2a | Schedule | 20 | 16 | 20 | 28 | 21% | 2% | -38%
2b | Effort | 160 | 75 | 355 | 353 | 53% | -122% | -121%
2b | Schedule | 19 | 14 | 21 | 33 | 24% | -12% | -73%
2c | Effort | 150 | 140 | 394 | 347 | -7% | -162% | -132%
2c | Schedule | 19 | 45 | 21 | 35 | -138% | -10% | -86%

Table 8-12 Accuracy Results vs. Expert (M = Manual, C = COCOMO II, P = PSEstimate; the middle columns are raw/bootstrapped means, the last three the raw/bootstrap percentage means)

The results provide mixed support for H1 for effort estimation. As already discussed, there were significant differences between the experimental groups in these estimations for some of the tasks. When using the expert's rating as the measure of accuracy, the PSEstimate group was both significantly different from and more accurate than the manual group for Task 1a. Both of the computer-tool groups were significantly different from the manual group on the more difficult Tasks 2a and 2b, but for Task 2a the tool-using groups estimated effort higher than the expert, and for Task 2b the manual group was more accurate.

PAGE 148

8.6 Consistency

Hypothesis 2 concerns the variation that exists between estimators: a consistent estimate is more desirable than an inconsistent one, assuming both have the same accuracy.

H2: Use of a software cost estimation tool for software development projects reduces the variation (increases the consistency) of estimates for effort and schedule.

H2a: Use of a software cost estimation tool for software development projects reduces the variation (increases the consistency) of estimates for effort.

H2b: Use of a software cost estimation tool for software development projects reduces the variation (increases the consistency) of estimates for schedule.

To test this hypothesis, two tests are used. The first analysis is the Levene test, a statistical test for homoscedasticity; that is, it checks whether the variance of a measure is equal across groups. The Levene test is known for its robustness to violations of normality. A significant test supports the idea that the dispersion differs among the three groups. The results of the Levene test are shown in Table 8-13.

Task | P-Value Levene Effort | P-Value Levene Schedule
Task 1a | .21 | .28
Task 1b | .45 | .34
Task 1c | .37 | .29
Task 2a | .0054 | .47
Task 2b | .0042 | .42
Task 2c | .11 | .32

Table 8-13 Levene Test for Effort and Schedule

In this experiment, Task 2 was designed to test the consistency of estimates. As the structure of the project changed, it would be expected that the manual group and the COCOMO II group would have increasing difficulty with estimation. From the Levene test, Tasks 2a and 2b show significant dispersion among the three treatment groups. The standard deviations for effort show that the manual group has much less variance than the COCOMO II or PSEstimate groups. Therefore, Hypothesis H2a is not supported: there were significant differences, but in the opposite direction from what was hypothesized. The Levene test for schedule shows no significant dispersion; therefore, hypothesis H2b is also not supported.

It is important to note that bootstrapping is a very conservative technique. In Task 1b, the manual group had a standard deviation of 180, versus 4 for COCOMO II and 5 for PSEstimate. This high standard deviation occurred because of one participant, and the bootstrapping technique reduces the impact of this influential data point to the point where it is not significant.

Another point that shows up in the data is that as the complexity of the project team structure changes across Task 2, both the manual and COCOMO II groups increase in variance, while the PSEstimate group decreases in variance as the structure grows more complex.

8.7 Confidence

Hypothesis 3 states:

H3: Users of a software cost estimation tool for software development projects are more likely to have an appropriate level of confidence in their estimates than estimators without support.

There can be four types of confidence, two inappropriate and two appropriate.

PAGE 150

Overconfidence occurs when a participant has high confidence but is not accurate. The next type is not confident inappropriately: the participant is accurate but has low confidence. Overconfidence and not confident inappropriately are the inappropriate confidence types.

The remaining two types are appropriate levels of confidence. The first is confident appropriately: the participant is accurate and has a high level of confidence. The last type is not confident appropriately: the participant is inaccurate and has low confidence.

The analysis for testing H3 is unique. The first step is to rate, for each task, whether the participant was accurate. The measure of accuracy used the expert's best and worst case as the acceptable range: if the participant's estimate, best case, or worst case fell within the acceptable range, the estimate was deemed accurate; otherwise it was deemed inaccurate. Next, the confidence on each task for each participant was analyzed. Since the expert rated all tasks with 50% confidence, this was the threshold: to be rated confident, a participant had to be above the expert's 50% confidence level; otherwise they were rated not confident. The final step was to assign each estimate to one of the four types: AP is appropriate confidence, OC is overconfident, NCA is not confident appropriately, and NCI is not confident inappropriately.
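The decision logic above can be sketched in C# (an illustration, not code from the study's analysis):

using System;

// Four-way confidence classification: accuracy is judged against the
// expert's acceptable range, confidence against the expert's 50% level.
class ConfidenceClassifierSketch
{
    static string Classify(bool accurate, bool confident) =>
        (accurate, confident) switch
        {
            (true, true)   => "AP",  // accurate and confident: appropriate
            (false, true)  => "OC",  // inaccurate but confident: overconfident
            (false, false) => "NCA", // inaccurate and not confident: appropriate
            (true, false)  => "NCI"  // accurate but not confident: inappropriate
        };

    static void Main()
    {
        // Hypothetical example: estimate within the expert's range,
        // stated confidence above 50%.
        Console.WriteLine(Classify(accurate: true, confident: true)); // AP
    }
}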

PAGE 151

After each task was rated, a pivot table was created in Excel. The results follow:

Treatment | AP | NCA | NCI | OC | Grand Total
Manual | 14 | 22 | 13 | 23 | 72
COCOMO II | 19 | 13 | 13 | 21 | 66
PSEstimate | 24 | 10 | 8 | 24 | 66
Grand Total | 57 | 45 | 34 | 68 | 204

Table 8-14 Pivot Table of Confidence Type Results (counts of confidence type)

From the table it is clear that the PSEstimate group has a more appropriate level of confidence than any other group. For Not Confident Appropriately, the manual group had neither a good estimate nor high confidence. For Not Confident Inappropriately, it is clear why the manual group is so high: even though they had good historical data, this was not enough for many participants to develop a high level of confidence in their estimates. For overconfidence, the groups were about even.

Treatment | AP | NCA | NCI | OC | Grand Total
No Tool | 14 | 22 | 13 | 23 | 72
Tool | 43 | 23 | 21 | 45 | 132
Grand Total | 57 | 45 | 34 | 68 | 204

Table 8-15 Results of Tool vs. No Tool for Confidence (counts of confidence type)

8.8 Satisfaction and Perceived Usefulness

Satisfaction and perceived usefulness make up Hypotheses 4 and 5.

H4: Use of a software cost estimation tool for software development projects increases the users' satisfaction with the estimation technique.

H4a: The PSEstimate group will have higher satisfaction than the COCOMO II group, and the COCOMO II group will have higher satisfaction than the manual group.

PAGE 152

H4b: The COCOMO II and PSEstimate groups together will have higher satisfaction than the manual group.

H5: Use of a software cost estimation tool for software development projects increases the users' perceived usefulness of the estimation technique.

H5a: The PSEstimate group will have higher perceived usefulness than the COCOMO II group, and the COCOMO II group will have higher perceived usefulness than the manual group.

H5b: The COCOMO II and PSEstimate groups together will have higher perceived usefulness of the estimation technique than the manual group.

The constructs for satisfaction and perceived usefulness are analyzed with identical techniques. The first step in checking for differences among the three treatment groups is to conduct two psychometric tests on the items that measure these two constructs. Item-total correlation along with Cronbach's alpha is a standard technique to test reliability. The results of these tests are shown in Table 8-16 and Table 8-17:

PAGE 153

Item Wording | Scale | Item-Total | Cronbach's Alpha
Very Dissatisfied---Very Satisfied | 1-7 | .81 |
Very Displeased---Very Pleased | 1-7 | .92 |
Very Frustrated---Very Contented | 1-7 | .85 |
Absolutely Terrible---Absolutely Delighted | 1-7 | .75 | .93 (all four items)

Table 8-16 Item-Total and Cronbach's Alpha for Satisfaction

Item Wording | Scale | Item-Total | Cronbach's Alpha
Using the software estimation technique in this experiment improves my performance in conducting software cost estimation. | 1-7 | .94 |
Using the software estimation technique in this experiment improves my productivity in conducting software cost estimation. | 1-7 | .94 |
Using the software estimation technique in this experiment improves my effectiveness in conducting software cost estimation. | 1-7 | .95 |
Overall, the software technique used in this experiment was useful in conducting software cost estimation. | 1-7 | .88 | .97 (all four items)

Table 8-17 Item-Total and Cronbach's Alpha for Perceived Usefulness

The results show strong item-total correlations of the items with their constructs, and Cronbach's alpha is excellent for both constructs.

With such reliable measures, a new variable called TotalSat was created as the summation of the four satisfaction items, and another variable, TotalUse, as the summation of the four perceived usefulness items. These variables are used as the dependent variables in the analysis.
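For reference, Cronbach's alpha for a $k$-item scale (here $k = 4$) is the standard reliability coefficient

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_{i}}{\sigma^2_{total}}\right)$$

where $\sigma^2_i$ is the variance of item $i$ and $\sigma^2_{total}$ is the variance of the summed scale.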

PAGE 154

A bootstrapping technique was used to test the two hypotheses with the newly created dependent variables TotalSat and TotalUse. The results follow:

Treatment | Satisfaction (TotalSat) | Perceived Usefulness (TotalUse)
Manual | Mean = 12.5, Std. Dev. = 4.14 | Mean = 12, Std. Dev. = 6.6
COCOMO II | Mean = 17.7, Std. Dev. = 4.1 | Mean = 20, Std. Dev. = 5.2
PSEstimate | Mean = 16.5, Std. Dev. = 4.3 | Mean = 16, Std. Dev. = 5.0

Table 8-18 Satisfaction and Perceived Usefulness Means by Treatment

Based on Table 8-18, hypotheses H4a and H5a are not supported: COCOMO II had higher satisfaction and perceived usefulness than both the PSEstimate and manual groups.

By combining the COCOMO II and PSEstimate groups, a tool versus no-tool analysis can be conducted to test hypotheses H4b and H5b.

Contrast | Satisfaction | Perceived Usefulness
Tool vs. No Tool | .014 | .012

Table 8-19 Bootstrap p-values for Satisfaction and Perceived Usefulness

From Table 8-19 there is support for Hypotheses H4b and H5b: the satisfaction and perceived usefulness of using an estimation tool were significantly better than conducting the estimation manually.

The results suggest two different underpinnings for satisfaction and perceived usefulness. First, individuals do not like to do estimation manually: it is frustrating for some, and many people do not think it is an effective use of their time, which can result in low ratings for satisfaction and perceived usefulness. This result was prevalent during the pilot test after debriefing participants, and the effect carried over to the main experiment. Second, people really liked using COCOMO II for estimating; perhaps because it is the state-of-the-practice tool, but whatever the reason, people reported high satisfaction and perceived usefulness with COCOMO II. The PSEstimate results were inconclusive: the group was not significantly different from the manual group or the COCOMO II group, though it was almost significantly different from the manual group. The raw p-values were .03 and .07 when comparing the manual group with the PSEstimate group, but the bootstrapping reduced them to non-significant results. Some additional work is needed to see what causes satisfaction and perceived usefulness to lag slightly behind COCOMO II.

From the results it is clear that PSEstimate needs more development, most likely in the interface. PSEstimate addresses a much more complex task than COCOMO II in its explicit modeling of team structure, and a better way of inputting team information could improve the perceived usefulness scores.

PAGE 156

144 CHAPTER 9 CONCLUSIONS AND CONTRIBUTIONS 9.1 Introduction Software cost estimation remains an important unsolved challenge. Project managers need to have tools that help them successfully manage their projects. By better understanding software development, better software cost estimation models can be created that will help project managers m eet their goals. By introducing the software handoff, a different approach to software co st estimation is undertaken. A new software cost estimation tool is created to help su pport decision making by project managers. 9.2 Contributions to Research This dissertation contributes to resear ch in many ways. First, a theoretical framework for software cost estimation is presented. By using the concept of communication overhead, a cost estimation model that includes communication is created. The theoretical framework provide s a measure that is important for cost estimation but is not always measured. The second contribution to research is the use of secondary data to perform validation of the theoretical framework presented in this dissertation. By showing rigor, the effect that the findings are only spurious correlations is minimized. In the research performed so far on software cost estimatio n, many researchers ignore the assumptions of


By properly performing the analysis on the secondary data, a documented method of analysis is presented for others to follow.

The third contribution to research comes from the experimental validation of the software cost estimation model. Designing a validation experiment for a software cost estimation model is not common practice in the field. A novel approach and methodology are presented in this dissertation, and future software cost estimation experiments can build on them.

The fourth contribution to research is the optimization formula presented. Software cost estimation has yet to model the trade-off between effort and schedule. This initial attempt at developing an optimization formula gives future researchers a starting point for understanding the trade-off project managers make between effort and schedule.

The fifth contribution is the empirical study itself. A carefully designed experiment was conducted that provides useful results to the software cost estimation research community. Accuracy, consistency, satisfaction, confidence, and perceived usefulness are measured and reported in an experimental setting.

The experimental results provided mixed support for the hypothesized relationships. An interesting finding was that even though effort estimates differed significantly among the treatment groups, the schedule estimates did not. This finding will have to be investigated further in another study.

In general, PSEstimate was received positively by estimators. People who estimate software believe there should be tools to support estimation.


The group that estimated manually thought the process was archaic, and many stated that there has to be a better way to estimate. Through the design science paradigm, the artifact, the new software cost estimation model, was created, instantiated in PSEstimate, and tested in the field. The finding, drawn from the perceived usefulness ratings, that PSEstimate needs modest improvement is a further type of contribution.

9.3 Contributions to Practice

Currently the COCOMO II schedule reduction multiplier is the most common method of estimating the impacts of reducing the delivery date of software projects. A project manager is often in charge of a project with a critical time-to-market delivery date. Based on the work presented in this dissertation, the COCOMO II schedule reduction multiplier is ineffective at helping a project manager adjust the project to achieve the desired schedule reduction. Using COCOMO II on projects of different sizes to estimate effort, schedule, and staff, and then using the schedule reduction multiplier to recalculate all three, a clear pattern emerges: for any project size, COCOMO II implies that reducing the schedule to 75% of the original requires increasing staffing by about 91%. The schedule-reduction-multiplier methodology ignores any effects of communication overhead.


                 Original Estimate               75% of Original
Lines of   Effort    Schedule   Staff    Effort     Schedule   Staff    Increase
Code                                                                    in Staff
2000       6.7       6.7        1        9.581      5.025      1.9      91%
8000       31        10.9       2.8      44.33      8.175      5.4      94%
16000      66.4      13.9       4.8      94.952     10.425     9.1      90%
32000      142.2     17.8       8        203.346    13.35      15.2     90%
64000      304.8     22.6       13.5     435.864    16.95      25.7     90%
128000     653.2     28         22.7     934.076    21         44.5     96%
512000     3000      46.8       64.1     4290       35.1       122.2    91%

Table 9-1 COCOMO II Schedule Reduction Multiplier

This dissertation provides to practice a replacement for the COCOMO II schedule reduction multiplier that project managers need. By including the effects of communication overhead, it produces a better-founded estimate of what happens when the schedule is reduced.

Another practical contribution to project management from this research is giving project managers the ability to experiment with different team sizes. In software cost estimation models such as COCOMO II, team size is not directly changeable, and there is poor linkage between the staffing estimate a model produces and the staff members actually assigned to a software project. Many software cost estimation tools give a staffing estimate but offer no support when the indicated staffing is not available. Project managers now have a tool to help with different staffing situations.

The ability to recognize that not every software development project needs the same type of structure adds to the practical contributions of this dissertation. By presenting three different process structures, each with estimates for effort, schedule, and staffing, the tool allows project managers to explore different structures for developing software.
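The constancy of the staffing increase in Table 9-1 follows directly from the model's form. The sketch below is a hypothetical reconstruction in Python, assuming the published COCOMO II.2000 nominal constants (A = 2.94, B = 0.91, C = 3.67, D = 0.28, nominal scale-factor sum 18.97) and the SCED effort multiplier of 1.43 for compression to 75% of the nominal schedule; all other effort multipliers are held at nominal.

# Staff before and after schedule compression under COCOMO II,
# using nominal constants (an assumption for this illustration).
A, B, C, D, SF = 2.94, 0.91, 3.67, 0.28, 18.97
E = B + 0.01 * SF            # effort exponent
F = D + 0.2 * 0.01 * SF      # schedule exponent
SCED_75 = 1.43               # SCED multiplier for 75% compression

for ksloc in (2, 8, 16, 32, 64, 128, 512):
    effort = A * ksloc ** E              # person-months
    schedule = C * effort ** F           # months
    staff = effort / schedule
    staff_75 = (effort * SCED_75) / (0.75 * schedule)
    print(f"{ksloc:4d} KSLOC: staff {staff:6.1f} -> {staff_75:6.1f} "
          f"(+{100 * (staff_75 / staff - 1):.0f}%)")

# The ratio is 1.43 / 0.75 = 1.907 at every size, i.e. about a 91%
# increase; the 90-96% spread in Table 9-1 appears to be rounding
# in the intermediate values.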


Finally, delivering a decision support tool that a project manager can easily run on a personal computer is a major contribution to practice. With project managers five years behind the state of the art in tools and techniques for project management, getting relevant knowledge to project managers is a challenge. An easy-to-use tool that supports decision making helps close that knowledge gap.

9.4 Limitations and Key Assumptions

There are two key assumptions and limitations of this work. First, COCOMO II is used as the basis for effort calculations, so any errors in COCOMO II are inherited by the estimates in this dissertation. Because the model is an extension of COCOMO II rather than a replacement, criticisms and limitations of COCOMO II also apply. One such limitation arises when estimating very small projects of less than eight KLOC; the cost estimation model presented in this dissertation will not be able to provide reasonable estimates of very small projects.

The second limitation concerns the empirical data on communication overhead in very large teams. Little empirical data describes the impact of communication overhead in teams of more than 30 people. At 30 people, the communication overhead is 54%; how it behaves as the group grows toward 50 people is largely unknown. This dissertation assumes that for all teams above 30 people the communication overhead remains at 54%. Based on the shape of the communication overhead curve, however, overhead is expected to keep increasing until it exceeds 100%, meaning that adding another person causes more effort to be expended on communication than on work on the project. By placing more than 30 people in a team, the model will therefore underestimate the effort needed.
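For reference, the overhead figures above follow from the pairwise-channel form used in the model; the coefficient below is taken from the LINGO script in Appendix C, and extending the same quadratic past the observed data is an extrapolation, not an empirical claim.

# Communication overhead as a function of team size, assuming
# overhead grows with the number of pairwise channels.
K = 0.001248269  # overhead per channel, from Appendix C

def comm_overhead(n):
    """Fraction of effort spent on communication in a team of n."""
    return K * n * (n - 1) / 2

for n in (5, 10, 20, 30, 41, 50):
    print(f"n={n:3d}: overhead = {comm_overhead(n):5.1%}")

# n=30 gives about 54%, matching the text; extrapolated, the same
# quadratic would exceed 100% at roughly 41 people.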


9.5 Future Work

The work presented in this dissertation is a solid contribution to software cost estimation, and it opens several avenues for future work. First, only three process structures are presented in this dissertation. In practice, a customizable process structure gives a project manager the greatest flexibility. Ensuring that the available process structures match what is used in the project manager's organization, in addition to other structures that might be used, would allow the software cost estimation to make the best possible contribution to practice.

Second, experience is an important cost driver in COCOMO II, but COCOMO II models experience at the group level rather than the individual level. There are cost drivers for analyst experience and programmer experience, yet the impacts of individual experience are not isolated. Consider a team whose programmer experience level is medium. If an expert replaces a medium programmer, the experience level of the group rises. With COCOMO II, the higher experience rating lowers the effort multiplier, which lowers the effort estimate; with the lower effort estimate, less staff is needed, and depending on which people are removed, the experience level changes again, implying yet another staffing level. This circular process never terminates, so only rough estimates of experience can be modeled. Modeling each individual's experience would allow this cost estimation model to better explain the effect of having people with different levels of experience.
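A hypothetical sketch of this circularity follows; the effort multiplier function, experience ratings, and project numbers are invented for illustration and are not COCOMO II cost-driver tables.

# Circular dependency between experience mix, effort, and staffing.
BASE_EFFORT = 100.0   # person-months at a nominal experience mix
SCHEDULE = 12.0       # months, held fixed for the illustration

def effort_multiplier(avg_experience):
    # Invented linear stand-in for an experience cost driver.
    return 1.2 - 0.1 * avg_experience   # avg_experience on a 1..3 scale

team = [3, 2, 2, 2, 1, 1, 1, 1, 1, 1]   # ratings, sorted high to low
for step in range(5):
    avg = sum(team) / len(team)
    effort = BASE_EFFORT * effort_multiplier(avg)
    staff_needed = round(effort / SCHEDULE)
    print(f"step {step}: avg exp {avg:.2f}, effort {effort:5.1f}, "
          f"staff needed {staff_needed}")
    if staff_needed >= len(team):
        break                            # staffing has stabilized
    team = team[:staff_needed]           # release the least experienced

# Each staffing change alters the experience mix, which alters the
# multiplier and the effort estimate; here the loop happens to settle
# quickly, but in general it can keep oscillating.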


In addition, experience could become a factor in the different process structures.

Third, the lessons learned in this dissertation could be used to create an optimization tool in which a project manager inputs the team information and receives an "optimal" team structure for that team. Such a decision support tool could make many more structures available to the manager.


REFERENCES

Abdel-Hamid, T. K. (1988). "The Economics of Software Quality Assurance: A Simulation-Based Case Study." MIS Quarterly 12(3): 395.
Abdel-Hamid, T. K. (1988). "Understanding the "90% Syndrome" in Software Project Management: A Simulation-Based Case Study." The Journal of Systems and Software 8(4): 319.
Abdel-Hamid, T. K. (1989). "The Dynamics of Software Projects Staffing: A System Dynamics Based Simulation Approach." IEEE Transactions on Software Engineering 15(2): 109.
Abdel-Hamid, T. K. (1992). "Investigating the impacts of managerial turnover/succession on software project performance." Journal of Management Information Systems 9(2): 127.
Abdel-Hamid, T. K. (1993). "A multiproject perspective of single-project dynamics." The Journal of Systems and Software 22(3): 151.
Abdel-Hamid, T. K. and S. E. Madnick (1987). "On the Portability of Quantitative Software Estimation Models." Information & Management 13(1): 1.
Abdel-Hamid, T. K. and S. E. Madnick (1989). "Lessons Learned from Modeling the Dynamics of Software Development." Communications of the ACM 32(12): 1426.
Abdel-Hamid, T. K. and S. E. Madnick (1991). Software Project Dynamics: An Integrated Approach. Englewood Cliffs, New Jersey, Prentice Hall.
Abdel-Hamid, T. K., K. Sengupta, et al. (1994). "The effect of reward structures on allocating shared staff resources among interdependent software projects: An experimental investigation." IEEE Transactions on Engineering Management 41(2): 115.
Abdel-Hamid, T. K., K. Sengupta, et al. (1999). "The impact of goals on software project management: An experimental investigation." MIS Quarterly 23(4): 531.
Abell, M. L., J. P. Braselton, et al. (1999). Statistics with Mathematica. San Diego, Academic Press.
Adrangi, B. (1987). "Effort Estimation in a System Development Project." Journal of Systems Management 38(8): 21.
Agarwal, R., M. Kumar, et al. (2001). "Estimating software projects." ACM SIGSOFT Software Engineering Notes 26(4): 60-67.
Albrecht, A. J. and J. E. Gaffney (1983). "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation." IEEE Transactions on Software Engineering 9(2): 639-648.
Amoroso, D. L. and R. A. Zawacki (1992). "Information Engineering: The One True Path Out of the Software Crisis?" Information Strategy 8(4): 35.
Andolfi, M. A. (1996). "A Multi-criteria Methodology for the Evaluation of Software Costs Estimation Models and Tools." CSELT Technical Reports 24: 643-659.


Angelis, L., I. Stamelos, et al. (2001). Building a software cost estimation model based on categorical data. Seventh International Software Metrics Symposium (METRICS 2001).
Bailey, J. and V. R. Basili (1983). A Meta-Model for Software Development Resource Expenditures. Fifth International Conference on Software Engineering.
Banker, R. D., H. Chang, et al. (1994). "Evidence on economies of scale in software development." Information and Software Technology 36(5): 275.
Benbasat, I. and I. Vessey (1980). "Programmer and Analyst Time/Cost Estimation." MIS Quarterly 4(2): 31-43.
Boehm, B. W. (1981). Software Engineering Economics. Englewood Cliffs, NJ, Prentice-Hall Inc.
Boehm, B. W. (1984). "Software Engineering Economics." IEEE Transactions on Software Engineering SE-10(1): 4.
Boehm, B. W. (2000). Software Cost Estimation with COCOMO II. Upper Saddle River, N.J., Prentice Hall PTR.
Boehm, B. W. and K. J. Sullivan (2002). Software economics: a roadmap. International Conference on Software Engineering, Future of Software Engineering, Limerick, Ireland.
Boehm, B. W. and R. W. Wolverton (1980). "Software Cost Modeling: Some Lessons Learned." The Journal of Systems and Software 1(3): 195.
Brady, S. and T. DeMarco (1994). "Management-Aided Software Engineering." IEEE Software 11(6): 25-32.
Briand, L. C., K. El Emam, et al. (1998). Explaining the Cost of European Space and Military Projects. Kaiserslautern, Germany, Fraunhofer Institute for Experimental Software Engineering, ISERN-98-19: 1-11.
Brooks, F. P. (1975). The Mythical Man-Month. Reading, MA, Addison-Wesley.
Callisen, H. and S. Colborne (1984). "A Proposed Method for Estimating Software Cost from Requirements." Journal of Parametrics IV(4): 33-40.
Chen, Z., T. Menzies, et al. (2005). "Finding the Right Data for Software Cost Modeling." IEEE Software 22(6): 38-46.
Conte, S. D., H. E. Dunsmore, et al. (1986). Software Engineering Metrics and Models. Menlo Park, CA, Benjamin/Cummings.
Cover, D. K. (1988). Issues affecting the reliability of software-cost estimates (trying to make chicken salad out of chicken feathers). Annual Reliability and Maintainability Symposium.
Cuelenaere, A. M. E., M. J. I. M. van Genuchten, et al. (1987). "Calibrating a Software Cost Estimation Model: Why and How." Information and Software Technology 29(10): 558.
Curtis, B. (1992). Maintaining the software process. Conference on Software Maintenance, 1992.
Curtis, B., H. Krasner, et al. (1988). "A Field Study of the Software Process for Large Systems." Communications of the ACM 31(11): 1268-1287.


Dekker, G. J. and F. J. van den Bosch (1983). "Functional Requirements for the Development and Use of a Software-Cost Database." Information & Management 6(4): 225.
DeMarco, T. and T. R. Lister (1999). Peopleware: Productive Projects and Teams. New York, NY, Dorset House Publishing.
Dion, R. (1993). "Process Improvement and the Corporate Balance Sheet." IEEE Software 10(4): 28-35.
Fenton, N. (1993). "How effective are software engineering methods?" The Journal of Systems and Software 22(2): 141.
Ferens, D. V. (1988). "Software Parametric Cost Estimation: Wave of the Future." Engineering Costs and Production Economics 14(2): 157-165.
Ferens, D. V. and D. S. Christensen (1998). Calibrating Software Cost Models to Department of Defense Databases: A Review of Ten Studies. Air Force Research Laboratory: 1-16.
Ferens, D. V. and D. S. Christensen (2000). "Does Calibration Improve Predictive Accuracy?" Crosstalk: The Journal of Defense Software Engineering 13(4): 14-17.
Fried, L. (1991). "Team Size and Productivity in Systems Development." The Journal of Information Systems Management 8(3): 27-41.
Guinan, P., J. Cooprider, et al. (1998). "Enabling Software Development Team Performance During Requirements Definition: A Behavioral Versus Technical Approach." Information Systems Research 9(2): 101-125.
Hair, J. F. J., R. E. Anderson, et al. (1998). Multivariate Data Analysis. Upper Saddle River, New Jersey, Prentice-Hall Inc.
Harry, M. J. and J. R. Lawson (1992). Six Sigma Producibility Analysis and Process Characterization. Reading, Mass., Addison-Wesley.
Herd, J. R., J. Postak, et al. (1977). Software Cost Estimation Study: Study Results. Rockville, MD, Final Technical Report RADC-TR-77-220, Volume 1, Doty Associates.
Hevner, A. R., S. T. March, et al. (2004). "Design Science in Information Systems Research." MIS Quarterly 28(1): 75-106.
Hu, Q. (1997). "Evaluating Alternative Software Production Functions." IEEE Transactions on Software Engineering 23(6): 379-387.
Hu, Q., R. T. Plant, et al. (1998). "Software cost estimation using economic production models." Journal of Management Information Systems 15(1): 143.
Humphrey, W. S., T. R. Snyder, et al. (1991). "Software Process Improvement at Hughes Aircraft." IEEE Software 8(4): 11-23.
Idri, A., T. M. Khoshgoftaar, et al. (2002). Can neural networks be easily interpreted in software cost estimation? 2002 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE'02).
Italiani, M. (1984). "Productive Capacity of a System for Software Development." Information & Management 7(5): 253.
Jones, C. (1988). "Productivity in MIS: Building a Better Metric." Computerworld: 38.
Jones, C. (1993). Assessment and Control of Software Risks. Englewood Cliffs, N.J., Yourdon Press.


Jones, C. (1998). Estimating Software Costs. New York, McGraw-Hill.
Jones, C. (2002). "Software Cost Estimation in 2002." Crosstalk: The Journal of Defense Software Engineering 15(6): 4-8.
Kadoda, G., M. Cartwright, et al. (2000). Experiences Using Case-Based Reasoning to Predict Project Effort. ESERG Technical Report No. 00-09, Bournemouth University (also published in the Proceedings of EASE 2000): 1-23.
Kemerer, C. (1987). "An Empirical Validation of Software Cost Estimation Models." Communications of the ACM 30(5): 416-429.
King, J. (1997). Poor planning kills projects, pushes costs up. ComputerWorld 31: 6.
Kitchenham, B. A. (1992). "Empirical Studies of Assumptions that Underlie Software Cost-Estimation Models." Information and Software Technology 34(4): 211-218.
Kitchenham, B. A. (1997). "Counterpoint: The Problem with Function Points." IEEE Software 14(2): 29-31.
Kitchenham, B. A. (2002). "The question of scale economies in software: why cannot researchers agree?" Information and Software Technology 44(1): 13-24.
Kitchenham, B. A., S. L. Pfleeger, et al. (1995). "Towards a Framework for Software Measurement Validation." IEEE Transactions on Software Engineering 21(12): 929-944.
Kitchenham, B. A. and N. R. Taylor (1985). "Software Project Development Cost Estimation." The Journal of Systems and Software 5(4): 267.
Laranjeira, L. A. (1990). "Software Size Estimation of Object-Oriented Systems." IEEE Transactions on Software Engineering 16(5): 510.
Leavitt, D. (1977). "Human Factors Also Vital: Project Confirms Impact of Programming Techniques." Computerworld 11(11): 23.
Lederer, A. L. and J. Prasad (1998). "A Causal Model for Software Cost Estimating Error." IEEE Transactions on Software Engineering 24(2): 137-148.
Lewis, J. P. (2001). "Large Limits to Software Estimation." ACM Software Engineering Notes 26(4): 54-59.
Lipke, W. H. and K. L. Butler (1992). "Software Process Improvement: A Success Story." Crosstalk: The Journal of Defense Software Engineering (38): 29-31.
Matson, J. E., B. E. Barrett, et al. (1994). "Software development cost estimation using function points." IEEE Transactions on Software Engineering 20(4): 275.
McConnell, S. (2000). "The Best Influences on Software Engineering." IEEE Software 17(1): 11-17.
McConnell, S. (2000). "Ten Myths of Rapid Development." Retrieved June 4, 2003.
McGrath, J. E., Ed. (1982). Dilemmatics: The study of research choices and dilemmas. Judgement Calls in Research. Beverly Hills, Sage.
Mukhopadhyay, T., S. S. Vicinanza, et al. (1992). "Examining the Feasibility of a Case-Based Reasoning Model for Software Effort Estimation." MIS Quarterly 16(2): 155.
Musgrave, G. L. and R. H. Rasche (1977). "Estimation of Cost Functions." The Engineering Economist 22(3): 175.
Musilek, P., W. Pedrycz, et al. (2002). On the sensitivity of COCOMO II software cost estimation model. Eighth IEEE Symposium on Software Metrics, 2002.


NASA (2002). NASA Cost Estimating Handbook. NASA Independent Program Assessment Office: 186.
Navlakha, J. K. (1990). "Choosing a Software Cost Estimation Model for Your Organization: A Case Study." Information & Management 18(5): 255.
Nemecek, S. (2001). Systematic Defects in Software Cost Estimation Models Harm Management. Portland International Conference on Management of Engineering and Technology (PICMET '01).
Norden, P. V., Ed. (1970). Useful tools for project management. Management of Production. Baltimore, MD, Penguin.
Park, R. E. (1988). "Parametric Software Cost Estimation with an Adaptable Model." Transactions of the American Association of Cost Engineers: G111.
Paulk, M. C. (1995). The Rational Planning of (Software) Projects. Proceedings of the First World Congress for Software Quality, San Francisco, CA, 20-22 June 1995, section 4.
Paulk, M. C. (2001). "Extreme Programming from a CMM Perspective." IEEE Software 18(6): 19-26.
Paulk, M. C., D. Goldenson, et al. (2000). The 1999 Survey of High Maturity Organizations. Pittsburgh, PA, Carnegie Mellon Software Engineering Institute, CMU/SEI-2000-SR-002.
Paulk, M. C., C. V. Weber, et al. (1993). Capability Maturity Model for Software, Version 1.1. Software Engineering Institute, CMU/SEI-93-TR-25, DTIC Number ADA263432.
Pillai, K. and V. S. S. Nair (1997). "A model for software development effort and cost estimation." IEEE Transactions on Software Engineering 23(8): 485.
Putnam, L. H. (1978). "A General Empirical Solution to the Macro Software Sizing and Estimating Problem." IEEE Transactions on Software Engineering SE-4(4): 345-361.
Putnam, L. H. (1985). The Impact of Methodologies on Software Productivity: Case Studies. National Conference Workshop on Methodologies and Tools for Real-Time Systems.
Putnam, L. H. and A. Fitzsimmons (1979). "Estimating Software Costs." Datamation 25(11): 171.
Putnam, L. H. and W. Myers (1992). Measures for Excellence. Yourdon Press.
Putnam, L. H. and W. Myers (1997). "How Solved is the Cost Estimation Problem?" IEEE Software 14(6): 105-107.
Raja, M. K. (1985). "Software Project Management and Cost Control." Journal of Systems Management 36(10): 20.
Random House (1998). Random House Webster's Unabridged Dictionary. New York, Random House.
Reinertsen, D. (2000). "Multitasking engineers isn't always a bad idea." Electronic Design 48(11): 52.
Rosson, M. B. (1996). "Human Factors in Programming and Software Development." ACM Computing Surveys 28(1): 193-195.


Samson, B., D. Ellison, et al. (1997). "Software cost estimation using an Albus perceptron (CMAC)." Information and Software Technology 39(1): 55.
Scannell, T. (1979). "Software Development Called 'Guessing Game'." Computerworld 13(15): 17.
Seaman, C. B. and V. R. Basili (1997). "Communication and organization in software development: An empirical study." IBM Systems Journal 36(4): 550-563.
Sengupta, K. and T. K. Abdel-Hamid (1993). "Alternative conceptions of feedback in dynamic decision environments: An experimental investigation." Management Science 39(4): 411.
Sengupta, K., T. K. Abdel-Hamid, et al. (1999). "Coping with Staffing Delays in Software Project Management: An Experimental Investigation." IEEE Transactions on Systems, Man, and Cybernetics 29(1): 77-91.
Simmons, D. B. (1991). "Communications: a software group productivity dominator." Software Engineering Journal 6(6): 454-462.
Simon, H. A. (1969). The Sciences of the Artificial. Cambridge, MA, M.I.T. Press.
Smith, R. K., J. E. Hale, et al. (2001). "An empirical study using task assignment patterns to improve the accuracy of software effort estimation." IEEE Transactions on Software Engineering 27(3): 264.
Software Engineering Institute (1995). The Capability Maturity Model: Guidelines for Improving the Software Process. Reading, Mass., Addison-Wesley.
Stamelos, I., L. Angelis, et al. (2003). "On the use of Bayesian belief networks for the prediction of software productivity." Information and Software Technology 45(1): 51.
Standish Group Inc. (1994). The CHAOS Report. www.standishgroup.com
Tausworthe, R. C. (1977). Standardized Development of Computer Software. Englewood Cliffs, NJ, Prentice-Hall, Inc.
The Standish Group (2003). Press release: Latest Standish Group CHAOS Report shows project success rates have improved by 50%. West Yarmouth, MA.
Thebaut, S. M. and V. Y. Shen (1984). "An analytic resource model for large-scale software development." Information Processing & Management 20(1-2): 293-315.
van der Poel, K. G. and S. R. Schach (1983). "A Software Metric for Cost Estimation and Efficiency Measurement in Data Processing System Development." The Journal of Systems and Software 3(3): 187.
van Genuchten, M. and H. Koolen (1991). "On the Use of Software Cost Models." Information & Management 21(1): 37.
van Genuchten, M. (1991). "Why Is Software Late? An Empirical Study of Reasons for Delay in Software Development." IEEE Transactions on Software Engineering 17(6): 582-591.
Verner, J. M. and G. Tate (1987). "A Model for Software Sizing." The Journal of Systems and Software 7(2): 173.
Vijayakumar, S. (1997). "Use of historical data in software cost estimation." Computing & Control Engineering Journal 8(3): 113-119.
Vijayakumar, S. (2002). Improving software cost estimation by model validation. Project Manager Today: 6.


Vu, J. D. (1997). Software Process Improvement Journey: From Level 1 to Level 5. Second Annual European Software Engineering Process Group Conference, Amsterdam, The Netherlands.
Walston, C. E. and C. P. Felix (1977). "A method of programming measurement and estimation." IBM Systems Journal 16(1): 54-73.
Wrigley, C. D. and A. S. Dexter (1987). Software Development Estimation Models: A Review and Critique. ASAC Conference, University of Toronto, Toronto, Ontario.
Yourdon, E. (1994). Software Metrics. Application Development Strategies.


APPENDICES


APPENDIX A: KEMERER DATASET

Project   Software         Hardware          Months   Effort    KSLOC   SLOC/MM
Number
1         Cobol            IBM 308X          17       287       253     884
2         Cobol            IBM 43XX          7        82.5      40.5    491
3         Cobol            DEC VAX           15       1107.31   450     406
4         Cobol            IBM 308X          18       86.9      214.4   2467
5         Cobol            IBM 43XX          13       336.3     449.9   1338
6         Cobol            DEC 20            5        84        50      595
7         Cobol            DEC 20            5        23.2      43      1853
8         Cobol            IBM 43XX          11       130.3     200     1535
9         Cobol            IBM 308X          14       116       289     2491
10        Cobol, Natural   IBM 308X          5        72        39      542
11        Cobol            IBM 308X          13       258.7     254.2   983
12        Cobol            IBM 43XX, 308X    31       230.7     128.6   557
13        Cobol            HP 300, 68        20       157       161.4   1028
14        Cobol            IBM 308X          26       246.9     164.8   667
15        Natural          IBM 308X          14       69.9      60.2    861

Kemerer (1987) Dataset
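As a quick sanity check on the transcription, the SLOC/MM column can be recomputed as productivity, KSLOC times 1000 divided by effort in person-months; the interpretation of the column is an assumption here, and small deviations (for example, project 1) appear to reflect rounding of KSLOC in the source.

# Recompute SLOC/MM for a few projects from (effort, KSLOC) pairs
# taken from the table above.
projects = {1: (287.0, 253), 2: (82.5, 40.5), 3: (1107.31, 450),
            4: (86.9, 214.4), 9: (116.0, 289)}
for pid, (effort, ksloc) in projects.items():
    print(f"project {pid:2d}: {1000 * ksloc / effort:7.0f} SLOC/MM")
# Projects 2, 3, 4, and 9 reproduce the table values (491, 406,
# 2467, 2491); project 1 gives 882 versus 884 in the table.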


APPENDIX B: MERMAID-2 DATASET

Project   Adjusted   Raw    Total Effort   Total Duration   Project Type
Number    FP         FP     (hours)        (months)         (N = new, E = enhancement)
1         23         23     238            3.45             E
2         38         42     490            6.75             E
3         36         44     616            2.9              E
4         57         51     910            2.55             E
5         36         47     1540           10               E
6         29         38     1680           10               E
7         23         34     1750           10.5             E
8         99         115    3234           9                N
9         605        550    3360           2                N
10        34         42     3850           5.5              E
11        338        371    5460           15               N
12        133        157    5110           16.25            E
13        118        107    6440           11               E\N
14        653        643    17920          35               E
15        502        528    18620          20               n/a
16        306        268    21280          27               N
17        170        179    24850          11.6             N
18        911        884    48230          29.6             E
19        221        235    3415           7.5              n/a
20        613        626    11551          7                N
21        1507       1408   4860           8.5              E
22        559        n/a    14224          26               E
23        218        291    9080           9                N
24        479        499    1635           9                E
25        26         33     296            4                E
26        125        1337   3720           5                E
27        205        n/a    4672           6                E
28        105        109    2065           8                E
29        114        107    1690           6                E
30        36         38     504            4                E

MERMAID-2 Dataset (Kitchenham 2002)


APPENDIX C: LINGO SCRIPT FOR TIER-THREE MODEL

SETS:
  GROUPS / R D I IT AT RD DI IIT ITAT RAT DIT O /: n, EM, E, emu;
ENDSETS

! Phase effort fractions assigned to each communication group;
E(1) = PR1 + PR7;
E(2) = PR4 + PR5 + PD2;
E(3) = PA3;
E(4) = IT5;
E(5) = 0;
E(6) = PR2 + PR3 + PD1 + PA1 + IT1;
E(7) = PD3 + PA2 + IT2;
E(8) = IT3;
E(9) = IT7;
E(10) = PR8 + PD7;
E(11) = PD4 + PA4 + IT4 + PD5 + PA5 + PA7;
E(12) = PR6 + PD6 + PA6 + IT6 + PD8 + PA8 + IT8;

! Team sizes by role;
Design = N(1);
Requirements = N(2);
Implementation = N(3);
Testing = N(4);
Customer = N(5);

! Communication-overhead effort multiplier per group:
! 1 plus overhead-per-channel times the number of pairwise channels;
@FOR(GROUPS(I) : EM(I) = 1 + (.001248269 * n(I) * (n(I) - 1) / 2));
@FOR(GROUPS(I) : @GIN(n(I)));   ! team sizes are integers;
@FOR(GROUPS(I) : n(I) >= 1);

! Combined groups span adjacent teams;
N(6) = N(1) + N(2);
N(7) = N(2) + N(3);
N(8) = N(3) + N(4);
N(9) = N(4) + N(5);
N(10) = N(1) + N(5);
N(11) = N(2) + N(4);
N(12) = N(1) + N(2) + N(3) + N(4) + N(5);

! Default staffing bounds when no user bounds are supplied;
MINSTAFFCALC = @IF(MINSCHECK #EQ# 0, 5, MINSTAFF);
MAXSTAFFCALC = @IF(MAXSCHECK #EQ# 0, 10000, MAXSTAFF);
N(12) >= MINSTAFFCALC;
N(12) <= MAXSTAFFCALC;

COCOMOEFFORT = COCOEFFORT;

! Effort per group and total effort scaled by the COCOMO II estimate;
@FOR(GROUPS(I) : emu(I) = E(I) * EM(I));
effort = @SUM(GROUPS : emu) * COCOMOEFFORT;

! Schedule: time through the phase sequence, each term an effort
! fraction divided by the staff available for that activity;
time = effort * ( (PR1 * (1 + PR6)) / N(1)
  + (PR2 * (1 + PR6)) / (N(1) + N(2))
  + (PD1 * (1 + PD6 + PD8)) / (N(1) + N(2))
  + (PD2 * (1 + PD6 + PD8)) / N(2)
  + (PD3 * (1 + PD6 + PD8)) / (N(2) + N(3))
  + (PA3 * (1 + PA6 + PA8)) / N(3)
  + (IT3 * (1 + IT6 + IT8)) / (N(3) + N(4))
  + (IT5 * (1 + IT6 + IT8)) / N(4)
  + (IT7 * (1 + IT6 + IT8)) / (N(4) + N(5)) );


APPENDIX C: LINGO SCRIPT FOR TIER-THREE MODEL (continued)

DATA:
@POINTER(1) = time;
@POINTER(2) = Design;
@POINTER(3) = Requirements;
@POINTER(4) = Implementation;
@POINTER(5) = Testing;
@POINTER(6) = Customer;
PR1 = @POINTER(7);   PR2 = @POINTER(8);   PR3 = @POINTER(9);
PR4 = @POINTER(10);  PR5 = @POINTER(11);  PR6 = @POINTER(12);
PR7 = @POINTER(13);  PR8 = @POINTER(14);
PD1 = @POINTER(15);  PD2 = @POINTER(16);  PD3 = @POINTER(17);
PD4 = @POINTER(18);  PD5 = @POINTER(19);  PD6 = @POINTER(20);
PD7 = @POINTER(21);  PD8 = @POINTER(22);
PA1 = @POINTER(23);  PA2 = @POINTER(24);  PA3 = @POINTER(25);
PA4 = @POINTER(26);  PA5 = @POINTER(27);  PA6 = @POINTER(28);
PA7 = @POINTER(29);  PA8 = @POINTER(30);
IT1 = @POINTER(31);  IT2 = @POINTER(32);  IT3 = @POINTER(33);
IT4 = @POINTER(34);  IT5 = @POINTER(35);  IT6 = @POINTER(36);
IT7 = @POINTER(37);  IT8 = @POINTER(38);
MINSTAFF = @POINTER(39);
MAXSTAFF = @POINTER(40);
MINSCHECK = @POINTER(41);
MAXSCHECK = @POINTER(42);
COCOEFFORT = @POINTER(43);
ENDDATA

MIN = time;
END


APPENDIX D: INSTITUTIONAL REVIEW BOARD APPROVAL


APPENDIX D: INSTITUTIONAL REVIEW BOARD APPROVAL (continued)


APPENDIX D: INSTITUTIONAL REVIEW BOARD APPROVAL (continued)


APPENDIX E: EXPERIMENTAL MATERIALS

A Software Cost Estimation Study

Thank You

Thank you for agreeing to participate in this study. Your participation will lead to a better understanding of software cost estimation.

Informed Consent

First, please read the informed consent document in the folder with this sheet. If you have any questions regarding this study, please contact the principal investigator listed on the informed consent document. After reading the consent document, if you wish to participate in the study, please sign the informed consent document. The informed consent must be signed and returned.

This study has been approved by the University of South Florida Institutional Review Board as an academic research study.

The Study

This study is designed to take no more than one hour. You are asked to report the starting and stopping time on the information sheet for each task. This study consists of:
1) An IRB form.
2) A background questionnaire covering demographics and software estimation experience.
3) The first estimation task, with three subtasks.
4) The second estimation task, with three subtasks.
5) An after-experiment questionnaire that asks about the experiment.
6)


APPENDIX E: EXPERIMENTAL MATERIALS (continued)


APPENDIX E: EXPERIMENTAL MATERIALS (continued)


APPENDIX E: EXPERIMENTAL MATERIALS (continued)


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Please answer the following background questions:

1. Age: _________________
2. Gender (circle one): Male   Female
3. Current Work Status (circle one): Full-Time   Part-Time   Unemployed
4. What is your total full-time work experience in years? ___________________________
5. What is your total length of full-time IT experience in years? ______________________
6. What is your current role in your organization? ________________________________
7. How long have you been in your current role? _________________________________
8. What organizational level describes your position?
   Executive   Middle Management   Professional   First Line Management   Technical/Clerical   Other
9. If you have estimated a project before, how many projects have you estimated? _____________________
10. What techniques have you used for estimation? (circle all that apply):
    Ad Hoc (cannot be categorized, not a technique)
    Informal analogy (rules-of-thumb)
    Formal analogy (example: a database of previous projects)
    Formal model (example: COCOMO)
    Other (specify): _______________________
11. If a formal model was used, which model(s) were used for estimation? (circle all that apply):
    PRICE-S   COCOMO   COCOMO II   SLIM   Other (specify): _______________________
12. For what proportion of projects did you use an estimation technique other than Ad Hoc? (circle one):
    None   1-25%   26-50%   51-75%   > 75%
13. Please list your educational background:
    Bachelors: _______________________ Degree __________________________ Major
               _______________________ Degree __________________________ Major
    Masters:   _______________________ Degree __________________________ Major
               _______________________ Degree __________________________ Major
    Doctorate: _______________________ Degree __________________________ Major
14. Please circle any professional certificates you have:
    a. PMI Project Management Professional Certificate (PMP)
    b. PMI Certified Associate in Project Management Certificate (CAPM)
    c. Working on certificate _____________________
    d. Other (please explain) ______________________
15. My use of estimation techniques is (circle only one):
    a. I have not used estimation techniques.
    b. I have used estimation in an initial project only.
    c. I have used it in mostly small projects, but not in large projects.
    d. I have used it in a mixture of small and large projects.
    e. I have used it in mostly large projects, but not in small projects.
    f. My use of estimation is completely routine (in all my projects).
16. Typically, in which phase do you make your first estimate of software costs (e.g., budget, effort)?
    a. Requirements specification
    b. Software analysis
    c. Software design


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

    d. Implementation
    e. Testing
    f. Maintenance
17. Typically, in which phases, if any, do you revise your initial software cost estimate? (circle all that apply):
    a. Requirements specification
    b. Software analysis
    c. Software design
    d. Implementation
    e. Testing
    f. Maintenance

The following questions measure your feelings about conducting software cost estimation. Please indicate the extent to which you agree or disagree with these statements by circling a number between 1 and 7 for each statement, where:

1 = Strongly Disagree   2 = Somewhat Disagree   3 = Slightly Disagree   4 = Neutral
5 = Slightly Agree   6 = Somewhat Agree   7 = Strongly Agree

18. I am capable of dealing with most estimation problems that come up at work. 1 2 3 4 5 6 7
19. If I can't estimate a project the first time, I keep trying until I can. 1 2 3 4 5 6 7
20. When I set important goals for myself, I rarely achieve them. 1 2 3 4 5 6 7
21. If estimation looks too complicated, I avoid it. 1 2 3 4 5 6 7
22. When trying to estimate a new project, I soon give up if I am not initially successful. 1 2 3 4 5 6 7
23. If a new estimation project seems especially difficult, I become more determined to master it. 1 2 3 4 5 6 7
24. Initial failure in estimation just makes me try harder. 1 2 3 4 5 6 7
25. I feel confident about my ability to estimate projects. 1 2 3 4 5 6 7
26. I am a self-reliant person in software cost estimation. 1 2 3 4 5 6 7
27. I can come up with good estimates for straightforward projects. 1 2 3 4 5 6 7
28. Obstacles in estimating will not frustrate me. 1 2 3 4 5 6 7
29. I can come up with estimates under any circumstances. 1 2 3 4 5 6 7
30. I can come up with good estimates if I have a tool to help me. 1 2 3 4 5 6 7
31. I can come up with good estimates if I see someone else estimating a project before I try it. 1 2 3 4 5 6 7
32. I can come up with good estimates for projects similar to projects I previously estimated. 1 2 3 4 5 6 7


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Task 1

Time Started: ___________
Time Stopped: ___________


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 1a:

In this task you are to estimate the amount of effort (in people-months) and schedule (in months) needed to develop the following software development project.

Assume the following conversions:
Effort: 1 people-month = 19 days; 1 working day = 8 hours.
Schedule: 1 month = 30 days.

Estimation Details:
You are a project manager for a small software development company. Your organization consists of 12 total employees. All the employees work in a single development team throughout system development.

Project Information:
A database application is expected to be around 30 KDSI. This kind of development project is commonly conducted in this organization. The project is not complex; in fact, it will be a simple development project. The project will need to be reused. All the people on the team will be highly skilled team members, and everyone works well together. The development platform the system will be developed on is commonly used throughout the organization.

Typically for this kind of project, the following historical data is available.

Historical Data:

Est.     Est.       Est.   Actual   Actual   Actual     Actual     Actual     Project
Effort   Schedule   Team   Size     Effort   Schedule   Schedule   Avg. Team  ID
(MM)     (months)   Size   (KDSI)   (MM)     (weeks)    (months)   Size
42.25    9          5      74.25    54.5     35         8.75       6          3004
169      13         13     183.5    252.5    89         22.25      11         5004
26.25    6          4      16       19.5     20         5          4          9076
38       12.99      3      15       38       58.43      14.6075    3          8965

Estimated and Actual Size is in KDSI (1 KDSI = 1,000 lines of code).


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Please answer the following questions regarding the project:

Effort:
1. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
2. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
3. Please give a worst case estimate of effort ___________________________________.
4. Please give a best case estimate of effort ____________________________________.
5. Please describe your rationale for your estimate of effort.

Schedule:
6. Please estimate how long it will take to conduct the project _____________________.
7. Please state how confident (from 0% to 100%) you are about your schedule estimate _____.
8. Please give a worst case estimate of schedule ________________________________.
9. Please give a best case estimate of schedule _________________________________.
10. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 1b:

In this task you are to estimate the amount of effort (in people-months) needed to develop the following software development project.

Task 1b is the same as Task 1a except for the following:
The staff of the project is doubled to 24 employees.

Please answer the following questions regarding the project:

Effort:
11. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
12. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
13. Please give a worst case estimate of effort _________________________________.
14. Please give a best case estimate of effort ___________________________________.
15. Please describe your rationale for your estimate of effort.

Schedule:
16. Please estimate how long it will take to conduct the project _______________________.
17. Please state how confident (from 0% to 100%) you are about your schedule estimate ___.
18. Please give a worst case estimate of schedule __________________________________.
19. Please give a best case estimate of schedule ___________________________________.
20. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 1c:

In this task you are to estimate the amount of effort (in people-months) and schedule (in months) needed to develop the following software development project.

Assume the following conversions:
Effort: 1 people-month = 19 days; 1 working day = 8 hours.
Schedule: 1 month = 30 days.

Estimation Details:
You are a project manager for a small software development company. Your organization consists of 13 total employees. All the employees work in a single development team throughout system development.

Project Information:
A database application is expected to be around 180 KDSI. This kind of development project is commonly conducted in this organization. The project is not complex; in fact, it will be a simple development project. The project will need to be reused. All the people on the team will be highly skilled team members, and everyone works well together. The development platform the system will be developed on is commonly used throughout the organization.

Typically for this kind of project, the following historical data is available.

Historical Data:

Est.     Est.       Est.   Actual   Actual   Actual     Actual     Actual     Project
Effort   Schedule   Team   Size     Effort   Schedule   Schedule   Avg. Team  ID
(MM)     (months)   Size   (KDSI)   (MM)     (weeks)    (months)   Size
42.25    9          5      74.25    54.5     35         8.75       6          3004
169      13         13     183.5    252.5    89         22.25      11         5004
26.25    6          4      16       19.5     20         5          4          9076
38       12.99      3      15       38       58.43      14.6075    3          8965

Estimated and Actual Size is in KDSI (1 KDSI = 1,000 lines of code).


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Please answer the following questions regarding the project:

Effort:
21. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
22. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
23. Please give a worst case estimate of effort ____________________________________.
24. Please give a best case estimate of effort ______________________________________.
25. Please describe your rationale for your estimate of effort.

Schedule:
26. Please estimate how long it will take to conduct the project _______________________.
27. Please state how confident (from 0% to 100%) you are about your schedule estimate ___.
28. Please give a worst case estimate of schedule __________________________________.
29. Please give a best case estimate of schedule ___________________________________.
30. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Task 2

Time Started: ___________
Time Stopped: ___________


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 2a:

In this task you are to estimate the amount of effort (in people-months) and schedule (in months) needed to develop the following software development project.

Assume the following conversions:
Effort: 1 people-month = 19 days; 1 working day = 8 hours.
Schedule: 1 month = 30 days.

Estimation Details:
You are a project manager for a medium-sized software development company. For this particular project you are to manage 30 staff. All the employees work in a single development team, also known as an integrated project team, throughout system development.

Project Information:
An e-commerce web application is expected to be around 80 KDSI. The web application is a business-to-business e-commerce project. Important stock transaction data will be routed through this application, allowing mutual fund companies to trade stocks directly with other mutual funds.

The following historical data is available for past projects, but the historical data is not expected to be helpful, since this project will have team sizes more than double those of past projects.

Historical Data:

Est.     Est.       Est.   Actual   Actual   Actual     Actual     Actual     Project
Effort   Schedule   Team   Size     Effort   Schedule   Schedule   Avg. Team  ID
(MM)     (months)   Size   (KDSI)   (MM)     (weeks)    (months)   Size
42.25    9          5      74.25    54.5     35         8.75       6          3004
169      13         13     183.5    252.5    89         22.25      11         5004
26.25    6          4      16       19.5     20         5          4          9076
38       12.99      3      15       38       58.43      14.6075    3          8965


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimated and Actual Size is in KDSI (1 KDSI = 1,000 lines of code).


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Please answer the following questions regarding the project:

Effort:
31. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
32. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
33. Please give a worst case estimate of effort ____________________________________.
34. Please give a best case estimate of effort ______________________________________.
35. Please describe your rationale for your estimate of effort.

Schedule:
36. Please estimate how long it will take to conduct the project _______________________.
37. Please state how confident (from 0% to 100%) you are about your schedule estimate ___.
38. Please give a worst case estimate of schedule __________________________________.
39. Please give a best case estimate of schedule ___________________________________.
40. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 2b:

In this task you are to estimate the amount of effort (in people-months) and schedule (in months) needed to develop the following software development project.

Task 2b is the same as Task 2a except for the following:
Instead of developing the system in one large integrated project team, the project will be broken into three different teams.
The first team will be the requirements and design team and will consist of 9 people.
The second team will be the implementation team and will consist of 13 people.
The third team will be the testing team and will consist of 8 people.
Notice, there is still a total of 30 people working on the system development.

Please answer the following questions regarding the project:

Effort:
41. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
42. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
43. Please give a worst case estimate of effort ____________________________________.
44. Please give a best case estimate of effort ______________________________________.
45. Please describe your rationale for your estimate of effort.

Schedule:
46. Please estimate how long it will take to conduct the project _______________________.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

47. Please state how confident (from 0% to 100%) you are about your schedule estimate ___.
48. Please give a worst case estimate of schedule __________________________________.
49. Please give a best case estimate of schedule ___________________________________.
50. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Estimation Task 2c:

In this task you are to estimate the amount of effort (in people-months) and schedule (in months) needed to develop the following software development project.

Task 2c is the same as Task 2a except for the following:
Instead of developing the system in one large integrated project team, the project will be broken into five different teams.
The first team will be the requirements team and will consist of 4 people.
The second team will be the design team and will consist of 6 people.
The third team will be the implementation team and will consist of 12 people.
The fourth team will be the testing team and will consist of 7 people.
The fifth team will be the customer acceptance team. One person will perform all the customer acceptance activities.
Notice, there is still a total of 30 people working on the system development.

Please answer the following questions regarding the project:

Effort:
51. Please estimate (in people-months) how much effort will be required to conduct the project _____________.
52. Please state how confident (from 0% to 100%) you are about your effort estimate _____.
53. Please give a worst case estimate of effort ____________________________________.
54. Please give a best case estimate of effort ______________________________________.
55. Please describe your rationale for your estimate of effort.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

Schedule:
56. Please estimate how long it will take to conduct the project _______________________.
57. Please state how confident (from 0% to 100%) you are about your schedule estimate ___.
58. Please give a worst case estimate of schedule __________________________________.
59. Please give a best case estimate of schedule ___________________________________.
60. Please describe your rationale for your estimate of schedule.


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

AFTER TASKS QUESTIONNAIRE


APPENDIX E: EXPERIMENTAL MATERIALS (continued)

The following questions measure your feelings based on your experience estimating in the experiment. Please indicate the extent to which you agree or disagree with these statements by circling a number between 1 and 7 for each statement, where:

1 = Strongly Disagree   2 = Somewhat Disagree   3 = Slightly Disagree   4 = Neutral
5 = Slightly Agree   6 = Somewhat Agree   7 = Strongly Agree

33. Using the software estimation technique in this experiment improves my performance in conducting software estimation. 1 2 3 4 5 6 7
34. Using the software estimation technique in this experiment improves my productivity in conducting software estimation. 1 2 3 4 5 6 7
35. Using the software estimation technique in this experiment improves my effectiveness in conducting software estimation. 1 2 3 4 5 6 7
36. Overall, the software estimation technique used in this experiment was useful in conducting software cost estimation. 1 2 3 4 5 6 7

How would you rate your overall experience using the software estimation technique in this experiment (4 = Neutral):

37. Very dissatisfied 1 2 3 4 5 6 7 Very satisfied
38. Very displeased 1 2 3 4 5 6 7 Very pleased
39. Very frustrated 1 2 3 4 5 6 7 Very contented
40. Absolutely terrible 1 2 3 4 5 6 7 Absolutely delighted

41. In making my software cost estimates, the technique I mainly used was:
    a. A calculator
    b. Spreadsheet
    c. Historical data
    d. Historical data along with COCOMO II
    e. Historical data and PSEstimate
    f. Other (please specify) __________________________________
42. During the study, circle all of the following techniques that you used to make software cost estimates:
    a. A calculator
    b. Spreadsheet
    c. Historical data
    d. COCOMO II
    e. PSEstimate
    f. Other (please specify) __________________________________
43. My preferred method to estimate software cost is to use ___________________ to come up with my estimates.
44. When conducting Task 2, how do you think the difference in structures changed the communication that occurred as the same thirty people moved to smaller groups?


ABOUT THE AUTHOR

Michael Douglas holds a B.S. in Computer Engineering from Kansas State University and an M.B.A. from Fontbonne University in St. Louis, MO. Michael is starting his career at the University of Arkansas at Little Rock. Michael is happily married to Kelly. Michael and Kelly have a poodle named Cody.