Analysts and engineers gain trust in a model by seeing it in action, and they can clearly demonstrate findings to management, for example by checking warehouse storage space utilization on any given date. Mining companies can significantly cut costs by optimizing asset usage and anticipating their future equipment needs. In logistics, simulation can produce a realistic picture that includes unpredictable data, such as shipment lead times.

The Use of Simulation with an Example: Simulation Modeling for Efficient Customer Service

This specific example is also applicable to the more general problem of human and technical resource management, in which companies naturally seek to lower the cost of underutilized resources, such as technical experts or equipment.
First, for the bank, the level of service was defined as the average queue size. Relevant system measures were then selected to set the parameters of the simulation model: the number and frequency of customer arrivals, the time a teller takes to serve a customer, and the natural variations that can occur in all of these, in particular lunch-hour rushes and complex requests.
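A minimal discrete-event sketch shows how such parameters might enter the model. The arrival rate, service rate, and queue-length limit below are hypothetical, and the variations noted above (lunch-hour rushes, complex requests) are left out for brevity; the staffing loop anticipates the experiment described below, in which a teller is added whenever the average queue exceeds the limit.

```python
import heapq
import random

def simulate_bank(num_tellers, arrival_rate, service_rate, horizon, seed=0):
    """Minimal multi-teller queue simulation with exponential interarrival and
    service times (hypothetical rates). Returns the time-averaged queue length."""
    rng = random.Random(seed)
    # Event list holds (time, kind) pairs; kinds are "arrival" and "departure".
    events = [(rng.expovariate(arrival_rate), "arrival")]
    busy, queue_len = 0, 0
    t_prev, area = 0.0, 0.0          # accumulates the time-average of the queue length

    while events:
        t, kind = heapq.heappop(events)
        if t > horizon:
            break
        area += queue_len * (t - t_prev)
        t_prev = t
        if kind == "arrival":
            heapq.heappush(events, (t + rng.expovariate(arrival_rate), "arrival"))
            if busy < num_tellers:
                busy += 1
                heapq.heappush(events, (t + rng.expovariate(service_rate), "departure"))
            else:
                queue_len += 1
        else:  # a teller finishes; take the next customer in line, if any
            if queue_len > 0:
                queue_len -= 1
                heapq.heappush(events, (t + rng.expovariate(service_rate), "departure"))
            else:
                busy -= 1
    return area / horizon

# Add tellers until the average queue stays below a specified service-level limit.
tellers, limit = 1, 3.0
while simulate_bank(tellers, arrival_rate=1.0, service_rate=0.4, horizon=8 * 60) > limit:
    tellers += 1
print("tellers needed:", tellers)
```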
A flowchart corresponding to the structure and processes of the department was then created. Simulation models need to consider only those factors that affect the problem being analyzed. For example, the availability of office services for corporate accounts or for the credit department has no effect on services for individual customers, because they are physically and functionally separate.
Finally, after feeding data into the model, the simulation could be run and its operation observed over time, allowing refinement and analysis of the results. If the average queue size exceeded the specified limit, the number of available staff was increased and a new experiment was run.

Beyond the usual documentation, which for complicated models can be fairly extensive, an "executive summary" of key assumptions used in the simulation model should be provided to experts to help them determine their reasonableness and therefore the utility of the simulation.
A full history of model development, especially any modification of model parameters and their justification, should also be made available to those with the responsibility for accrediting a model for use in operational testing. In model-test-model, a model is developed, a number of operational test runs are carried out, and the model is modified by adjusting parameters so that it is more in agreement with the operational test results.
Such external validation on the basis of operational use is extremely important in informing simulation models used to augment operational testing. However, there is an important difference, one we suspect is not always well understood by the test community, between comparing simulation outputs with test results and using test results to adjust a simulation.
Many complex simulations involve a large number of "free" parameters—those that can be set to different values by the analyst running the simulation. In model-test-model some of these parameters can be adjusted to improve the correspondence of simulation outputs with the particular operational test results with which they are being compared.
When the number of free parameters is large in relation to the amount of available operational test data, close correspondence between a "tuned" simulation and operational results does not necessarily imply that the simulation would be a good predictor in any scenarios differing from those used to tune it. A large literature is devoted to this problem, known as overfitting. An alternative that would have real advantage would be "model-test-model-test," in which the final test step, using scenarios outside of the "fitted" ones, would provide validation of the version of the model produced after tuning and would therefore be a guard against overfitting.
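A toy illustration of this tuning problem follows, using a hypothetical polynomial "simulation" fitted to a handful of simulated field results. Adding free parameters steadily improves the fit to the tuning scenarios while doing little for, and often degrading, predictions at held-out scenarios, which is exactly what a final test step would reveal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "field" response: roughly linear in a scenario variable x, plus noise.
def field_result(x):
    return 2.0 + 0.5 * x + rng.normal(0.0, 0.3, size=np.shape(x))

x_tune = np.linspace(0, 4, 8)        # scenarios used to tune the simulation
x_hold = np.linspace(0.5, 3.5, 6)    # held-out scenarios (the final "test" step)
y_tune, y_hold = field_result(x_tune), field_result(x_hold)

# Tune "simulations" with increasing numbers of free parameters (polynomial coefficients).
for n_free in (1, 3, 6):
    coef = np.polyfit(x_tune, y_tune, deg=n_free)
    fit_rmse = np.sqrt(np.mean((np.polyval(coef, x_tune) - y_tune) ** 2))
    new_rmse = np.sqrt(np.mean((np.polyval(coef, x_hold) - y_hold) ** 2))
    print(f"{n_free} free parameters: tuning RMSE {fit_rmse:.2f}, held-out RMSE {new_rmse:.2f}")
```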
If there were interest in finalizing the model before any operational testing was performed, this would be an additional reason for developmental testing to incorporate various operationally realistic aspects.
Overfitting is said to occur for a model and data set combination when a simple version of the model, selected from a model hierarchy by setting some parameters to fixed values, is superior in predictive performance to a more complicated version of the model formed by estimating these parameters from the data set. For some types of statistical models, there are commonly accepted measures of the degree of overfitting.
An example is the Cp statistic for multiple regression models: a model with a high value of Cp could be defined as being overfit.

The panel reviewed several documents that describe the process used to decide whether to use a simulation model to augment an operational test. There are differences across the services, but the general approach is referred to as verification, validation, and accreditation.
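As a rough illustration with toy data, Mallows' Cp for a submodel compares the submodel's error sum of squares with the error variance estimated from the full model; values of Cp close to the number of fitted parameters p suggest an adequately specified submodel, while large departures flag a poorly chosen one. The data and model hierarchy below are invented for the example.

```python
import numpy as np

def mallows_cp(X_full, y, cols):
    """Mallows' Cp for the submodel using the given columns of X_full.

    Cp = SSE_p / s2_full - n + 2p, where s2_full is the residual variance of the full model.
    """
    n = len(y)

    def sse(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    s2_full = sse(X_full) / (n - X_full.shape[1])
    X_sub = X_full[:, cols]
    p = X_sub.shape[1]
    return sse(X_sub) / s2_full - n + 2 * p

# Toy data: only the first two predictors (plus the intercept) actually matter.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 4))])
y = 1.0 + 2.0 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(0.0, 0.5, 50)
for cols in ([0, 1, 2], [0, 1, 2, 3, 4]):
    print(cols, round(mallows_cp(X, y, cols), 2))
```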
Verification is "the process of determining that model implementation accurately represents the developer's conceptual description and specifications" U. Department of Defense, a. For constructive simulations, verification means that the computer code is a proper representation of what the software developer intended; the related software testing issues are discussed in Chapter 8. Validation is "the process of determining a the manner and degree to which a model is an accurate representation of the real-world from the perspective of the intended uses of the model, and b the confidence that should be placed on this assessment" U.
Accreditation is "the official certification that a model or simulation is acceptable for use for a specific purpose" (U.S. Department of Defense). The panel supports the general goals of verification, validation, and accreditation, the emphasis on verification and validation, and the need for formal approval (that is, accreditation) of a simulation model for use in operational testing.
Given the crucial importance of model validation in deciding the utility of a simulation for use in operational test, it is surprising that the constituent parts of a comprehensive validation are not provided in the directives concerning verification, validation, and accreditation. A statistical perspective is almost entirely absent in these directives. For example, there is no discussion of what it means to demonstrate that the output from a simulation is "close" to results from an operational test.
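By way of illustration only, one simple way to operationalize "close" is to ask whether a small set of operational test results could plausibly have been drawn from the distribution of simulated outcomes, and to report the estimated discrepancy with its uncertainty. The measure, sample sizes, and distributions below are placeholders, not a prescribed criterion.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Placeholder samples of one measure of performance: many simulation replications
# versus a small number of operational test events.
sim_output = rng.normal(loc=10.0, scale=2.0, size=500)
test_results = rng.normal(loc=10.8, scale=2.2, size=12)

# Could the test results plausibly have come from the simulated distribution?
ks = stats.ks_2samp(sim_output, test_results)
print(f"KS statistic {ks.statistic:.2f}, p-value {ks.pvalue:.3f}")

# With only a dozen field events the power of such a test is limited, so the estimated
# shift between simulation and test, with its uncertainty, should also be reported.
shift = test_results.mean() - sim_output.mean()
se = test_results.std(ddof=1) / np.sqrt(len(test_results))
print(f"estimated shift {shift:.2f} +/- {1.96 * se:.2f}")
```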
It is not clear what guidelines model developers or testers use to decide how to validate their simulations for this purpose and how accrediters decide that a validation is sufficiently complete and that the results support use of the simulation. Model validation cannot be algorithmically described, which may be one reason for the lack of specific instruction in the directives.
A test manager would greatly benefit from examples, advice on what has worked in the past, what pitfalls to avoid, and, most importantly, specific requirements as to what constitutes a comprehensive validation. This situation is similar to that described in Chapter 1 regarding the statistical training of those in charge of test planning and evaluation. Model validation has an extensive literature in a variety of disciplines, including statistics and operations research, much of it quite technical, on how to demonstrate that a computer model is an acceptable representation of the system of interest for a given purpose.
Operational test managers need to become familiar with the general techniques represented in this literature, and have access to experts as needed.
We suggest, then, a set of four activities that can jointly form a comprehensive process of validation: (1) justification of model form, (2) an external validation, (3) an uncertainty analysis, including the contribution from model misspecification or alternative specifications, and (4) a thorough sensitivity analysis. All important assumptions should be explicitly communicated to those in a position to evaluate their merit.
This could be done in the "executive summary" described above. A model's outputs should be compared with operational experience. The scenarios chosen for external validation of a model must be selected so that the model is tested under extreme as well as typical conditions. The need to compare the simulation with operational experience raises a serious problem for simulations used in operational test design, but it can be overcome by using operationally relevant developmental test results.
Although external validation can be expensive, the number of replications should be decided based on a cost-benefit analysis (see the discussion in Chapter 5 on "how much testing is enough"). External validation is a uniquely valuable method for obtaining information about a simulation model's validity for use in operational testing, and is vital for accreditation. An indication of the uncertainty in model outputs as a function of uncertainty in model inputs, including uncertainty due to model form, should be produced.
This activity can be extremely complicated, and what is feasible today may be somewhat crude, but DoD experience with it will improve as it is attempted for more models. In addition, exploration of alternative model forms will have benefits in providing further understanding of the advantages and limitations of the current model and in suggesting modifications of its current form.
An analysis of which inputs importantly affect which outputs, and the direction of the effect, should be carried out and evaluated by those with knowledge of the system being developed. The literature cited above suggests a number of methods for carrying out a comprehensive sensitivity analysis. It will often be necessary to carry out these steps on the basis of a reduced set of "important" inputs: whatever process is used to focus the analysis on a smaller number of inputs should be described.
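A minimal sketch of one common approach, a one-at-a-time perturbation of inputs around a baseline, is given below; the response function, input names, and baseline values are hypothetical stand-ins for a real constructive simulation, and the tabulated directions and magnitudes are what subject-matter experts would review.

```python
import numpy as np

def simulated_mop(inputs):
    """Stand-in for a constructive simulation: maps named inputs to one output measure.
    (A purely hypothetical response surface, used only to illustrate the bookkeeping.)"""
    return (0.9 * inputs["detection_range"]
            - 0.4 * inputs["reaction_time"]
            + 0.05 * inputs["clutter_level"] ** 2)

baseline = {"detection_range": 5.0, "reaction_time": 2.0, "clutter_level": 3.0}

# One-at-a-time sensitivity: perturb each input by +/-10% and record the change
# in the output and its direction.
base_out = simulated_mop(baseline)
for name, value in baseline.items():
    effects = []
    for factor in (0.9, 1.1):
        perturbed = dict(baseline, **{name: value * factor})
        effects.append(simulated_mop(perturbed) - base_out)
    print(f"{name:16s} -10%: {effects[0]:+.3f}   +10%: {effects[1]:+.3f}")
```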
Tutorials are provided at conferences and in other settings, and there are excellent reports in the DoD community. A description of any methods used to reduce the number of inputs under analysis should be included in each of the steps. Models and simulations used for operational testing and evaluation must be archived and fully documented, including the objective of the use of the simulation and the results of the validation.
The purpose of a simulation is a crucial factor in validation. For some purposes, the simulation only needs to be weakly predictive, such as being able to rank scenarios by their stress on a system, rather than to predict actual performance. For other purposes, a simulation needs to be strongly predictive. Experience should help indicate, over time, which purposes require what degree and what type of predictive accuracy.
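The distinction can be made concrete with a toy check, using invented numbers: a simulation with a systematic optimistic bias may still rank scenarios correctly, so a rank-agreement check (weak prediction) would pass even though a check of absolute prediction error (strong prediction) would not.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical per-scenario success probabilities: the simulation is optimistically
# biased but preserves the ordering of scenarios by difficulty.
observed = np.array([0.90, 0.80, 0.70, 0.60, 0.50])          # operational experience
simulated = observed + 0.12 + rng.normal(0.0, 0.02, size=5)  # biased, order-preserving

rho, _ = stats.spearmanr(simulated, observed)                # weakly predictive check
rmse = float(np.sqrt(np.mean((simulated - observed) ** 2)))  # strongly predictive check
print(f"rank agreement {rho:.2f}; absolute error (RMSE) {rmse:.2f}")
```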
Models and simulations are often written in a general form so that they will have wide applicability for a variety of related systems. An example is a missile fly-out model, which might be used for a variety of missile systems.
A model that has been used previously is often referred to as a legacy model. In an effort to reduce the costs of simulation, legacy models are sometimes used to represent new systems, based on a complete validation for a similar system. Done to avoid costly development of a de novo simulation, this use of a legacy model presents validation challenges. In particular, new systems by definition have new features. Thus, a legacy model should not be used for a new application unless (a) a strong argument can be made about the similarity of the applications and (b) an external validation with the new system is conducted.
A working presumption should be that the simulation will not be useful for the new application unless proven otherwise.

Modeling and simulation may make their greatest contribution to operational test through improving operational test design.
Modeling and simulation were used to help plan the operational test for the Longbow Apache (see Appendix B). Constructive simulation models can play at least four key roles. First, simulation models that properly incorporate both the estimated heterogeneity of system performance as a function of various characteristics of test scenarios and the size of the remaining unexplained component of the variability of system performance can be used to help determine the error probabilities of any significance tests used in assessing system effectiveness or suitability.
To do this, simulated relationships, based on the various hypotheses of interest, between measures of performance and environmental and other scenario characteristics can be programmed, along with the description of the number and characteristics of the test scenarios, and the results tabulated as in an operational test.
Such replications can be repeated, keeping track of the percentage of tests that the system passed. This approach could be a valuable tool in computing error probabilities or operating characteristics for non-standard significance tests.
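A stylized sketch of this use follows. The scenario mix, assumed hit probabilities, pass/fail rule, and requirement are all hypothetical; the point is only the bookkeeping of replicating a designed test many times under an assumed performance hypothesis and tabulating how often the system passes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical test design: number of trials in each of three scenario types and
# an assumed hit probability for each under the hypothesis being examined.
scenarios = {"day": (8, 0.85), "night": (6, 0.75), "dust": (4, 0.65)}
requirement = 0.75          # required overall hit probability
n_replications = 10_000

passes = 0
for _ in range(n_replications):
    hits = trials = 0
    for n_trials, p_hit in scenarios.values():
        hits += rng.binomial(n_trials, p_hit)
        trials += n_trials
    # Hypothetical pass/fail rule: the observed overall rate meets the requirement.
    passes += (hits / trials) >= requirement
print("estimated probability of passing:", passes / n_replications)
```

Repeating the calculation under alternative performance hypotheses traces out the operating characteristic of the evaluation rule for the planned number and mix of trials.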
Second, simulation models can help select scenarios for testing. Simulation models can assist in understanding which factors need controlling and which can be safely ignored in deciding which scenarios to choose for testing, and they can help to identify appropriate levels of factors. They can also be used to choose scenarios that would maximally discriminate between a new system and a baseline system.
This use requires a simulation model for the baseline system, which presumably would have been archived. For tests for which the objective is to determine system performance in the most stressful scenario(s), a simulation model can help select the most stressful scenario(s). As a feedback tool, assuming that information is to be collected from other than the most stressful scenarios, the ranking of the scenarios with respect to performance from the simulation model can be compared with that from the operational test, thereby providing feedback into the model-building process, to help validate the model and to discover areas in which it is deficient.
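As a sketch of this use, with invented numbers, an archived baseline-system model and the new-system model can be exercised over candidate scenarios to identify both the scenarios that most discriminate between the systems and those most stressful for the new system.

```python
# Hypothetical simulated probability of mission success, by scenario, for the new
# system and for the archived baseline-system model.
scenarios = ["open desert", "urban", "night", "dust", "jamming"]
new_system = {"open desert": 0.92, "urban": 0.74, "night": 0.81, "dust": 0.60, "jamming": 0.55}
baseline = {"open desert": 0.90, "urban": 0.62, "night": 0.70, "dust": 0.58, "jamming": 0.54}

# Scenarios that maximally discriminate between the new and baseline systems...
by_discrimination = sorted(scenarios, key=lambda s: new_system[s] - baseline[s], reverse=True)
# ...and the scenarios most stressful for the new system.
by_stress = sorted(scenarios, key=lambda s: new_system[s])

print("most discriminating scenarios:", by_discrimination[:2])
print("most stressful scenarios:", by_stress[:2])
```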
Third, there may be an advantage in using simulation models as a living repository of information collected about a system's operational performance. This repository could be used for test planning and also to chart progress during development, since each important measure of performance or effectiveness would have a target value from the Operational Requirements Document, along with the values estimated at any given time, using either early operational assessments or, for requirements without a strong operational aspect, the results from developmental testing.
Fourth, every instance in which a simulation model is used to design an operational test, and the test is then carried out, presents an opportunity for model validation. The assumptions used in the simulation model can then be checked against test experience. Such an analysis will improve the simulation model in question, a necessary step if the simulation model is to be used in further operational tests or to assess the performance of the system as a baseline when the next innovation is introduced.
Feedback of this type will also help provide general experience to model developers as to which approaches work and which do not. Of course, this kind of feedback will not be possible without the data archive recommended in Chapter 3. As also noted in Chapters 3, 6, and 8, inclusion of field use data in such an archive provides great opportunities for validation of methods used in operational test design.
The results of such tests, in turn, should be used to calibrate and validate all relevant models and simulations. The repository would include data from all relevant sources of information, including experience with similar systems, developmental testing, early operational assessments, operational testing, training exercises, and field use.
A final note is that validation for test design, although necessary, does not need to be as comprehensive as validation for simulation that is to be used for augmenting operational test evaluation. One can design an effective test for a system without understanding precisely how a system behaves.
For example, simulation can be used to identify the most stressful environment without knowing what the precise impact of that environment will be on system performance.

The use of modeling and simulation to assist in the operational evaluation of defense systems is relatively contentious.
On one side, modeling and simulation is used in this way in industrial applications. Simulation can save money, is safer, does not have the environmental problems of operational testing, is not constrained in defense applications by the availability of enemy systems, and is always feasible in some form.
On the other side, information obtained from modeling and simulation may at times be limited in comparison with that from operational testing.
Its exclusive use may lead to unreliable or ineffective systems passing into full-rate production before major defects are discovered. An important example of a system for which the estimated levels for measures of effectiveness changed due to the type of simulation used is the M1A2 tank. In a briefing for then Secretary of Defense William Perry (see Wright), detailing work performed by the Army Operational Test and Evaluation Command, three simulation environments were compared: constructive simulation, virtual simulation, and live simulation (essentially, an operational test).
The purpose was to "respond to Joint Staff request to explore the utility of the Virtual Simulation Environment in defining and understanding requirements. The virtual simulation indicated that M1A2 was not better, which was confirmed by the field test.
The problems with the M1A2 had to do, in part, with imm ature software. The specific limitations of the constructive simulation were that the various assumptions underlying the engagements resulted in the M1A2 detecting and killing more targets. Even though the overall results agreed with the field. The primary problem was the lack of fidelity of the simulated terrain, which resulted in units not being able to use the terrain to mask movements or to emulate having dug-in defensive positions.
In addition, insufficient uncertainty was represented in the scenarios.

In this section we discuss some issues concerning how to use validated simulations to supplement operational test evaluation. The use of statistical models to assist in operational evaluation (possibly in conjunction with the use of simulation models) is touched on in Chapter 6. An area with great promise is the use of a small number of field events, modeling and simulation, and statistical modeling to jointly evaluate a defense system under development.
Unfortunately, the appropriate combination of the first two information sources with statistical modeling is extremely specific to the situation. It is, therefore, difficult to make a general statement about such an approach, except to note that it is clearly the direction of the future, and research should be conducted to help understand the techniques that work.
Modeling and simulation have been suggested as ways of extrapolating or interpolating to untested situations or scenarios.

If natural variability, for example in lengths of stay, is ignored, then the only certainty is that our conclusion about required capacity will be suboptimal. Computerised simulation models can provide visually powerful tools that can easily process many complex, interdependent decisions and so quickly provide the user with the likely consequences of a given scenario.
Often, a user will gain significant insight and a greater understanding of their problem from the very process of model design, in addition to the execution of scenarios and the analysis of outcomes. It is sometimes possible to design simulation models that find an optimal solution, with a high level of confidence, automatically. Such an approach can also build a sense of ownership from the bottom up by engaging clinicians and local leaders, and it offers a way of exploring potential scenarios for future service delivery, based on real data from the grass roots and underpinned by a robust evidence base, without changing any service or directly affecting patient care unless it is clear that outcomes would improve.
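Returning to the capacity point above, a minimal sketch (with hypothetical admission rates, lengths of stay, and a 95th-percentile planning criterion) shows how ignoring variability in lengths of stay understates the capacity actually required.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical ward: 10 admissions per day, mean length of stay 3 days.
admissions_per_day, mean_los, days = 10, 3.0, 365

# Deterministic view: average occupancy = arrivals x mean stay.
print("naive capacity estimate:", admissions_per_day * mean_los)

# Stochastic view: simulate daily occupancy with variable arrivals and lengths of stay.
occupancy = np.zeros(days + 60)
for day in range(days):
    n = rng.poisson(admissions_per_day)
    stays = np.ceil(rng.exponential(mean_los, size=n)).astype(int)
    for los in stays:
        occupancy[day:day + los] += 1

# Ignore the warm-up period and plan for the 95th percentile of daily demand.
peak_demand = np.percentile(occupancy[30:days], 95)
print("beds needed to cover 95% of days:", int(peak_demand))
```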