Cognitive Modelling Requirements for Operations Research and Human-In-The-Loop Simulation within the DSTO Architecture

Simon Goss 1, Frank E. Ritter 2 and Nigel Shadbolt 2
(blackbag@ozemail.com.au) +10 gmt

1 DSTO Air Operations Division
2 AI Group, School of Psychology, University of Nottingham

Abstract

Human behaviour is important in complex systems. Humans are an essential component, often accounting for most of the information processing done in such systems. It is important to model not only what humans in the loop do, but also when they do it in these time-sensitive systems. In this report we note the characteristics of human behaviour that would be most important to include in models of human responses in complex control systems. We propose possible methods, approaches, and existing tools for including the characteristics of human behaviour we identify.

Table of Contents

Cognitive Modelling Requirements for Operations Research and Human-In-The-Loop Simulation within the DSTO Architecture
Abstract
Introduction
What we have: The status of the existing OR model
The structure of the OR model
Example of behaviours that the model models
world events
coordination communication events
Survey of relevant models and architectures
Cognitive architectures and AI
Issues
Descriptive models with tool support
Expert Systems as Psychological Theories
Belief, Desire, and Intentions (BDI) architectures
Process models and architecture
Lessons
Table
Important psychology regularities
Modelling Attention
Vision comments could go here
Errors
Modelling team behaviour
Validation
Conclusions
Recommendations for including human-like processing in OR
Things not yet ready to include
References

Introduction

Defence analysis has traditionally paid considerable attention to modelling physical entities and processes. A great deal of work has gone into modelling physical phenomena such as fluid dynamics and airplane performance.
This work is worthy in its own right. Larger and more complex systems, however, include humans, with their ability to behave intelligently and to make mistakes. Analysts have not always paid equal attention to modelling the relevant human behaviour and how it influences systems. In many situations, the behaviour of the operator will be the most important determiner of the outcome of the simulation.

This document examines the current architecture in use for operations research (OR) models at the Australian Defence Science and Technology Organisation (DSTO) and notes the most important capabilities needed to include and represent faithfully the role of human cognition in these simulations. Particular attention is paid to decision-making, in terms of intermediate behaviour, outcome, and timing.

This report examines how to extend OR models so that they include models of the humans in the loop that behave like humans, that is, models that predict human capabilities and limitations in problem solving and acting. The report first defines cognitive models, including their implementation. Several good examples of modelling environments and approaches then provide the framework for discussing the most important features of human behaviour for OR models. We conclude with several suggestions for improving OR simulations where human behaviour is an important component.

What we have: The status of the existing OR model

The structure of the OR model

Example of behaviours that the model models

The following are behaviours that the model currently performs and that we would like to have informed and shaped by the characteristics of human information gathering, processing, and dissemination, most particularly the characteristics of human input and output.

world events

* react to bingo fuel
* react to bingo weapons
* aircraft killed
* spike
* spike lost
* contact lost
* missile launch
* change in contact range

coordination communication events

* reposition and regroup
* change ROE
* abort current task
* abort and go home
* reposition / CAP
* scramble
* intercept
* vector

Blackboard models (dMUSE)

Survey of relevant models and architectures

This section starts by defining cognitive models and architectures. It then considers some aspects of behaviour and use that are important when choosing a cognitive architecture for use in OR simulations. With these issues in mind, several architectures and approaches are surveyed.

Cognitive architectures and AI

Models of human behaviour can be represented in two parts. The first part is the cognitive architecture: those aspects of information processing that do not change across time. This would typically include working memory, attention, and the ability to store information in long term memory. The second part is knowledge: what is necessary to perform the task. This will vary somewhat across tasks, with similar tasks sharing some or all of the knowledge set. A cognitive model, then, is a set of knowledge combined with a particular architecture to produce intelligent behaviour.

There exists a scientific community interested in unfolding the architecture of cognition. Its members are interested in creating cognitive models within cognitive architectures. Some architectures are symbolic only and view cognition as information processing in which memory elements are manipulated subject to constraints of bandwidth and internal buffer size. Other architectures are hybrid in nature and have a sub-symbolic account of the activation of memory elements, which in the connectionist extreme are distributed vectors of activation. These cognitive models are validated and improved by comparing their performance to data of human performance (e.g.
timing diagrams, protocols unfolded as sequences) with predictions of models typically expressed through time and some performance variable (Ritter & Larkin, 1994). In general these architectures have been based on regularities gained through laboratory investigations of constrained tasks aimed at teasing out the fine regularities and mechanisms of cognition. Many are about cognitive development, such as the ability of 3 and 4 year olds to understand false beliefs. [FER sez: why say this?]

Artificial intelligence (AI) programs are related to cognitive architectures. AI programs are designed to behave intelligently, but without the constraint of necessarily doing so in a human-like way. Because they behave intelligently, however, they may be useful for improving simulations by providing intelligent, autonomous behaviours to augment them.

Issues

There are several issues to keep in mind when evaluating potential tools and theories for moving AI and cognitive models into OR.

Ease of use. How easy is the model to create, configure, and run? How easy is it to learn how to use the system?

Repeatability of behaviour. During development, even when modelling stochastic aspects of behaviour, it is highly desirable for the system to be repeatable for debugging. During use, however, it is often useful to explore a wide range of behaviour.

Learning by the model upon repeated task solving. Humans learn in many situations. In some domains and applications, learning is very important. In other domains, particularly the behaviour of experts, learning is less prevalent. It will sometimes be useful and appropriate for models to learn while they work.

Model of visual processing/active attention. Models are increasingly expected to interact with external tasks. As models of operators, the models considered here are particularly expected to interact with an external task.
Including a model of human vision can be an important constraint as well as serve as the link between the model and the task it interacts with.

Accounting for social activity. Models that work as parts of teams need knowledge and the capability to interact with other models.

Data for comparison. In order to validate models, there must be relevant data available or easily obtainable. Validation of these types of models, because of their complexity, will also require further definition and work.

Descriptive models with tool support

In this section we examine several cognitive architectures that provide descriptions of human behaviour. They provide a language for describing what behaviour would occur and often provide a prediction of how long it will take, but they cannot on their own duplicate this behaviour in a simulation.

* GOMS
* COGNET (CHI systems)
* SAINT
* Other intellectual rather than implemented ideas

Expert Systems as Psychological Theories

* Symbolic processing, chunks and memory elements.
* Get info without perception.
* Can be but not generally used as such.
* Concerned with knowledge level description (Newell, 1982).
* GDMs and PSMs are meta chunks (generalisation across activities).
* Goal directed rather than data driven.
* Mixed mode inference is situatedness?

There is a distinction between a computational architecture and a model of cognition. SOAR (Newell, 1990), for example, is designed to help create models of cognition. When knowledge is added to it as production rules, the goal is that the architecture, that is, how these rules are interpreted, will produce behaviour that corresponds to human behaviour. We are not quite there yet, particularly for factors such as fatigue, individual differences in cognitive capabilities, and differences in training or ROE.

Belief, Desire, and Intentions (BDI) architectures

* NASA JACK

Process models and architecture

* SOAR
* ACT-R
* Sloman's Tool Kit
* PDP/connections (+Andy Clarke)
* BAIRNE
* LI KAI (do you mean LICAI, Polson and Kitajima?)

COGENT is a system developed at Birkbeck College. It shows that good interfaces and ease of use are possible, but it is not clear that the architecture could easily represent the large knowledge structures necessary for these OR models. These two aspects might be related.

Lessons

Table

List of Models/Architectures                       Comments
-------------------------------------------------  -------------------------------------------------
ACT-R                                              Less powerful but still good for modest sized
                                                   tasks, said to be easy to use, learns primarily
                                                   subsymbolically
SOAR                                               Powerful, hard to use, learns symbolically
BAIRNE
Sloman's Agent Toolkit
BDI
PDP/connectionist (+Andy Clarke)
LICAI
COGNET (CHI systems)
COGENT                                             Lovely interface, not clear it can scale
Expert systems                                     Models knowledge level
Blackboard models (dMUSE)
NASA JACK, SAINT
Other intellectual rather than implemented ideas

Important human operator regularities

A large number of regularities have been accumulated about human operators (e.g. Boff & Lincoln, 1988; Wickens, 1992). These regularities range from how fast and how well our eyes can see, through how well memory works for different types of materials and how decisions are made, to how fast hands and feet can move to output information. We review here some of the most important characteristics of humans as operators within complex systems. Including this set yields an approximation, but a useful one: it reflects the behaviour of such human systems more closely than leaving these characteristics out. With further effort and with time, as with all simulations, the results will more closely match the real behaviour.

Modelling attention

Human problem solvers do not have complete and immediate access to all the information in their environment. Finding information is a task in itself, and it constrains human behaviour in ways that perfect-knowledge algorithms are not constrained. This lack of complete knowledge gives rise to some characteristic behaviours. Out of sight, out of mind can quite literally be true. Hard-to-find perceptual items are missed more often. Objects with perceptually salient cues are used more.
Errors in perception can give rise to errors in cognition and problem solving. ETC. If there is account of where a dial is in the cockpit, or differentiation according to layout or workload

Vision and attention

We now know a lot about how vision works (Boff & Lincoln, 1988), and we can provide a summary of the regularities most important for creating a model eye (Baxter & Ritter, 1996). The next step is to take these regularities, gathered as descriptive measures, and synthesise a mechanism that gives rise to them. This simulated eye can then also be used to constrain models of behaviour.

One way to include these characteristics is simply to include a model of vision and attention, that is, a model eye, and a model hand. A similar but slightly less complicated system can be created with a model ear and voice. The choice of systems and regularities would depend on the modalities needed to perform the tasks of interest.

These models of interaction restrict (filter) what can be seen at any one time, or provide an appropriate lag in information access. If the models are more complete, they can start to simulate the types of misunderstandings that can happen. They will also limit the number of actions that can be performed at once. For example, including interaction will allow the model to move its hand and eye at the same time, but it will not be able to say two things at once, see two different areas, or touch two switches at the same time.

Models of vision have been created for models of human cognition in a variety of domains, including air traffic control (Bass, Baxter, & Ritter, 1995), a simple blocks world (Jones & Ritter, 1998), and helicopter pilots (Hill in latest cgf proceedings). In these domains, the vision models restrict the information the cognitive model has access to, and in the case of the blocks world model, at least, vision is modelled as being imperfect: sometimes blocks are misrecognised. The model then either makes mistakes or must look more carefully.
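
The kind of restriction described above can be illustrated with a small sketch. It is not the implementation used in any of the cited models. It assumes a hypothetical display list of tagged objects, circular foveal and parafoveal regions with made-up radii, and an arbitrary misrecognition probability; all names and numbers are placeholders for illustration. A seeded random generator is passed in so that runs can be repeated while debugging.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayObject:
    """One tagged entry in a hypothetical display list."""
    name: str
    x: float
    y: float
    size: str  # "small" or "large" in this sketch


# Assumed values for illustration only; not taken from human data.
FOVEA_RADIUS = 1.0
PARAFOVEA_RADIUS = 5.0
MISREAD_PROB = 0.3


def look(display_list, fix_x, fix_y, rng, misread_prob=MISREAD_PROB):
    """Return (name, perceived_size) pairs for a fixation at (fix_x, fix_y).

    Objects in the fovea are seen perfectly, objects in the parafovea may
    have their size misread, and objects further away are not seen at all.
    """
    percepts = []
    for obj in display_list:
        distance = math.hypot(obj.x - fix_x, obj.y - fix_y)
        if distance <= FOVEA_RADIUS:
            percepts.append((obj.name, obj.size))
        elif distance <= PARAFOVEA_RADIUS:
            perceived = obj.size
            if rng.random() < misread_prob:
                # Misrecognition: the size feature is misconstrued.
                perceived = "large" if obj.size == "small" else "small"
            percepts.append((obj.name, perceived))
        # Beyond the parafovea the object is simply unavailable.
    return percepts


if __name__ == "__main__":
    display = [
        DisplayObject("block-a", 0.0, 0.0, "small"),
        DisplayObject("block-b", 3.0, 0.0, "large"),
        DisplayObject("block-c", 50.0, 0.0, "small"),
    ]
    rng = random.Random(42)  # fixed seed gives repeatable behaviour
    print(look(display, 0.0, 0.0, rng))
```

A cognitive model built on such a filter would have to move its fixation to confirm a parafoveal percept, which is one way the lag in information access and the look-more-carefully behaviour could arise.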

These systems model vision on a symbolic level based upon the graphics objects in the simulation. Vision is modelled by looking at the drawing list and finding which objects are generating what is being drawn. If the drawing routines do not know the object associated with a particular error, this would make modelling vision much harder. [Simon: write about his DIS concerns here with respect to object tagging]

Errors

[Simon to fill in what he wants]

French guy observes 96 lapses, observer sees 28, 8 are reported in the certification of an aircraft. Rasmussen addresses this as error. Deviation in performance from operational goals (crash aircraft). Modelling human performance should include these intentional lapses.

We have now seen errors based on perception occur and get corrected within a model. A model has been created that solves a 3-d blocks puzzle (Jones & Ritter, 1998). It selects blocks to put together based on the size that it sees. Objects in the fovea (the central part of human, and now model, vision) are seen perfectly. Objects in the parafovea are seen less clearly, and sometimes features are lost or misconstrued. When this model spies a block of the size it is looking for in its parafovea, it starts to move its hand to the block and to fixate its fovea on it to get ready to pick it up. Sometimes, a few times in solving each 21 block puzzle, it finds out once the block is in the fovea that it is not the right size. This is similar to seeing someone move their hand to pick up a block, which happens, but then not pick it up. We propose that this human behaviour is due to the same misrecognition error and correction.

Modelling team behaviour

[Simon to fill in what he wants]

Validation

There are a variety of methods and statistics available to test models of the operator (Ritter, 1993). These methods are based on Grant's (1962) two step approach to testing. The most important aspect to examine first is whether the model is worth taking seriously. This step is actually historically situated.
What was a serious physics theory 50 years ago is in many cases now considered inadequate because it accounts for too small a set of regularities, or because its numeric fit is too poor. The same situation will hold for models of human operators. The requirements currently appear, subjectively, to be low but rising.

There are several statistical tests that are often used to show that a model is worth taking seriously. The weakest is to test whether the model's behaviour differs from the human data. This is weak because either a good model or bad data can lead to a finding of no difference. Regression models and correlation results are better tests because they return higher values when the model is better and when there is more good data.

The second step in testing a model is to find out where it can be improved. Models that cannot be falsified should be avoided, because they do not represent real scientific theories. There are many ways for a model to be wrong, so there are many useful tests. The first tests are to look for types of behaviour that the humans can perform but the model cannot. Comparing aggregate measures of behaviour between model and subjects can be useful: where the model and subjects disagree suggests places where the model can be improved. It can also be useful to compare the model's output to human trials at the timeline level. Where the model and subject perform actions that the other does not, or perform them at different paces, suggests areas of behaviour to be improved.

For more complete agents, for example those in synthetic military simulations, a Turing-like test is often used. In such a test, people observe the model and comment (sometimes spontaneously) on whether the artificial agent seems authentic or like an arcade game.
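
The correlation and timeline comparisons described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Ritter (1993) or Grant (1962): the function names are hypothetical, the action labels and times in the usage section are invented placeholders rather than data, and the timeline comparison is reduced to listing actions that appear in only one of the two traces.

```python
import math


def pearson_r(model_values, human_values):
    """Pearson correlation between paired model predictions and human data."""
    n = len(model_values)
    if n != len(human_values) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_m = sum(model_values) / n
    mean_h = sum(human_values) / n
    cov = sum((m - mean_m) * (h - mean_h)
              for m, h in zip(model_values, human_values))
    var_m = sum((m - mean_m) ** 2 for m in model_values)
    var_h = sum((h - mean_h) ** 2 for h in human_values)
    if var_m == 0 or var_h == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_m * var_h)


def unmatched_actions(model_trace, human_trace):
    """Return (model_only, human_only): actions one trace has and the other lacks."""
    model_set, human_set = set(model_trace), set(human_trace)
    model_only = [a for a in model_trace if a not in human_set]
    human_only = [a for a in human_trace if a not in model_set]
    return model_only, human_only


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are not real data.
    model_times = [1.0, 2.0, 3.5, 5.0]
    human_times = [1.2, 2.1, 3.9, 5.4]
    r = pearson_r(model_times, human_times)
    print("r =", r, "r squared =", r * r)

    model_trace = ["look", "reach", "grasp"]
    human_trace = ["look", "grasp", "report"]
    print(unmatched_actions(model_trace, human_trace))
```

The squared correlation gives the proportion of variance in the human data that the model's predictions account for, and the unmatched actions point to behaviours that either the model or the subject does not produce.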
For example applications of such tests, see such papers as Ritter and Bibby (1997) and Nerb, Ritter, and Krems (in press), and conferences on cognitive science and cognitive modelling (Gernsbacher & Derry, 1998; Ritter & Young, 1998; Schmid, Krems, & Wysotzki, 1999).

[Simon: could insert some other regularities from other report here]

Conclusions

There are many regularities that can be included in models used for analysis. We review here some recommendations for behaviours to include. We also note some behaviours that are too distant to model well.

Recommendations for including human-like processing in OR

Human behaviour has characteristics that can be faithfully modelled. We have had for some time stable data on human behaviour in relevant circumstances, for example, Boff and Lincoln (1988) and Wickens (1992). We have also had cognitive and AI architectures for generating and summarising intelligent behaviour.

There are several

* model of perceptual modalities and characteristics
* model of output modalities and characteristics
* Hand/eye // model of attention and interaction

These can apply to multiple systems (generality) if the work is done within a user interface management system or user interface design environment. Several of these have been used to do this work, for example, Mac Common Lisp, Tcl/Tk, Garnet, and SLGMS. We believe that VAPS can support this. We have a list of requirements for choosing a user interface management system with which to develop a cognitive model interface management system (Ritter, Jones, Baxter, & Young, 1998).

Continue with BDI. The Australian community is reaching critical mass and is focused on the activity that we want. Capacity can be added within the architecture to give effects of timing and attention. This is valid as a psychological model. An additional component, perhaps ACT-R, which is good at low-level detail, can be added as an external interface if required.

Changes in audio output by agents, and how this influences the listener's ability to model the world or understand the speaker.

Things not yet ready to include

As well as behaviours that can and should be included in operational simulations, there are behaviours that are difficult to define and thus difficult to model.

* Emergent perceptual structures in the environment will be difficult to model. For example, a bunch of trees that spell out a message.
* Subtle social interactions will be difficult to model. By definition they are due to small effects and long term behaviour.
* Strategic learning, for example, over the course of a campaign. This will be difficult because data is lacking and because learning theories may not cover this type of learning.

References

Bass, E. J., Baxter, G. D., & Ritter, F. E. (1995). Using cognitive models to control simulations of complex systems. AISB Quarterly, 93, 18-25.
Baxter, G. D., & Ritter, F. E. (1996). Designing abstract visual perceptual and motor action capabilities for use by cognitive models (Tech. Report No. 36). ESRC CREDIT, Psychology, U. of Nottingham.
Boff, K. R., & Lincoln, J. E. (1988). Engineering data compendium: Human perception and performance. Wright-Patterson Air Force Base, OH: Harry G. Armstrong Aerospace Medical Research Laboratory.
Gernsbacher, M. A., & Derry, S. J. (Eds.). (1998). Proceedings of the 20th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum.
Grant, D. A. (1962). Testing the null hypothesis and the strategy and tactics of investigating theoretical models. Psychological Review, 69(1), 54-61.
Jones, G., & Ritter, F. E. (1998). Initial explorations of simulating cognitive and perceptual development by modifying architectures. In Proceedings of the 20th Annual Conference of the Cognitive Science Society (pp. 543-548). Mahwah, NJ: Lawrence Erlbaum.
Nerb, J., Ritter, F. E., & Krems, J. (in press). Knowledge level learning and the power law: A Soar model of skill acquisition in scheduling.
Kognitionswissenschaft [Journal of the German Cognitive Science Society], Special issue on cognitive modelling and cognitive architectures, D. Wallach & H. A. Simon (Eds.).
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18, 87-127.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Ritter, F. E. (1993). TBPA: A methodology and software environment for testing process models' sequential predictions with protocols (Technical Report No. CMU-CS-93-101). School of Computer Science, Carnegie-Mellon University.
Ritter, F. E., & Bibby, P. A. (1997). Modelling learning as it happens in a diagramatic reasoning task (Tech. Report No. 45). ESRC CREDIT, Dept. of Psychology, U. of Nottingham.
Ritter, F. E., Jones, G., Baxter, G. D., & Young, R. M. (1998). Lessons from using models of attention and interaction. In 5th ACT-R Workshop (pp. 117-123). Psychology Department, Carnegie-Mellon University.
Ritter, F. E., & Larkin, J. H. (1994). Using process models to summarize sequences of human actions. Human-Computer Interaction, 9(3), 345-383.
Ritter, F. E., & Young, R. M. (Eds.). (1998). Proceedings of the Second European Conference on Cognitive Modelling. Nottingham: Nottingham University Press.
Schmid, U., Krems, J., & Wysotzki, F. (Eds.). (1999). Mind modeling - A cognitive science approach to reasoning, learning and discovery. Lengerich (Germany): Pabst Scientific Publishing.
Wickens, C. D. (1992). Engineering psychology and human performance (2nd ed.). New York, NY: HarperCollins.

Bones (left over ideas)