For submission to Kognitionswissenschaft, due end of October, up to 10 pages.

Implicit rule learning and the power law: A Soar model of skill acquisition in scheduling

to do at end:

* replace figure names with figure numbers

* spell/american

* Hyphenate

Frank E. Ritter, Josef Nerb, Josef F. Krems

Dept. of Psychology, University of Regensburg D-8400 Regensburg, Germany

Dept. of Psychology, University of Chemnitz D-8400 Regensburg, Germany

School of Psychology, U. of Nottingham, Nottingham NG7 2RD UK

(email: nerb@psychologie.uni-freiburg.de, josef.krems@phil.tu-chemnitz.de, frank.ritter@nottingham.ac.uk)

 

Summary. The chunking mechanism in Soar has been used in numerous ways. The model presented here uses chunking to learn rule-like behaviour gradually while doing a job-shop scheduling task. The model learns episodic memory chunks while solving the scheduling tasks. This mechanism demonstrates how symbolic models can exhibit a gradual change in behaviour and how acquisition of general rules can be performed without resort to explicit declarative rule generation. The model fits many qualitative (e.g. learning rate) and quantitative (e.g. solution time) regularities found in previously collected data. The model's predictions were tested with data from a new study where a scheduling task was given to the model and to 14 subjects. Again, a general fit of the model was found, with the restrictions that the task is easier for the model than for subjects and that its performance improves more quickly than the subjects' does. The model provides an explanation of the noise typically found when fitting a set of data to a power law -- it is the result of learning actual pieces of knowledge that transfer more or less but rarely an average amount. Only when the data are averaged (over subjects here) does the smooth power law appear.

Introduction

Soar, a candidate unified theory of cognition (Newell, 1990), is a cognitive architecture designed to support learning. We note here several aspects of the architecture that particularly influence and support learning, and illustrate a model that learns partial episodic rules that slowly approximate rule-based behaviour.

Overview of Soar

There are extensive explanations and introductions to the architecture (Lehman, Laird, & Rosenbloom, 1996; Norman, 1991), with some available online (Baxter & Ritter, 1996; Ritter & Young, 1996), so we will only briefly review Soar, as shown in Figure soar. Soar is best described at two levels, called the problem space level and the symbol level. Behaviour on the problem space level represents problem solving as search through and in problem spaces of states using operators. Operators are implemented using the symbol level, with production rules. In routine behaviour, processing proceeds by a repeated cycle of operator applications.

When there is a lack of knowledge about how to proceed, an impasse is declared. An impasse can be caused by, among other things, a lack of operators to apply, an operator that does not know how to make changes to the state (no-change), or tied operators. Declaring an impasse creates a new state noting the impasse type and cause, and it typically allows further knowledge to apply about how to resolve the problem. States S2 and S3 in the figure are impasse states.

Soar models will typically end up with a stack of problem spaces, as problem solving on one impasse may lead to further impasses. The stack will change as the impasses get resolved through problem solving and new ones are added.

When knowledge about how to resolve an impasse becomes available from problem solving within the context of the impasse, a chunk (a newly acquired production rule) is created. As its condition, this acquired rule will contain the knowledge in the higher context that was used in problem solving in the impasse, so that it can recognise the same situation again; as its action, it will duplicate the changes that the problem solving produced.

Figure soar. The main processes in Soar.

Figure chunking. How chunking encodes a new rule. Adapted from Howes and Young, 1997.

Learning mechanisms in Soar

There have been many learning mechanisms implemented in Soar using the chunking mechanism. The type and meaning of the impasses differentiate them. Many of these have been part of AI models, or cognitive models that have not been tested against human data, although we know that humans can learn in these ways as well. These include models that learn from instructions (Huffman & Laird, 1995), models that learn through reflection (Bass, Baxter, & Ritter, 1995; John, Vera, & Newell, 1994; Nielsen & Kirsner, 1994), models that learn through analogy (Rieman, Lewis, Young, & Polson, 1994), and a model that learns through abduction (Johnson, Johnson, Smith, DeJongh, Fischer, Amra, et al., 1991). Further examples are available in the Soar Papers (Rosenbloom, Laird, & Newell, 1992), and through pointers in the Soar FAQ (Baxter & Ritter, 1996).

There have been several models implemented in Soar that learn and that have had their predictions compared with human data. Altmann's (Altmann & John, In press) model of interaction suggests that learning is pervasive when interacting with the environment. His model uses learned information to help with searching for objects in an interface. This model suggests that searching in any environment is assisted by learning. His model has been compared with verbal protocols. [check this]

Able-Soar (Ritter, Jones, & Baxter, In press) models novice-to-expert transitions, using a mechanism similar to and based on Larkin's (Larkin & Simon, 1981) mechanism. This mechanism learns how to solve sub-goals in a means-ends analysis, which with practice causes backward-chaining novice-like behaviour to become forward-chaining expert-like behaviour.

Gradual learning in Soar

While most Soar models learn fairly large chunks of knowledge, learning in Soar can also be graded and gradual. Mechanisms that learn this way tend to incorporate weak heuristics for problem solving, and are not always correct. Several such mechanisms have been created and compare favourably with human data.

Young, Howes, and Rieman have developed numerous models of computer users that learn. There are models that illustrate how task-to-action mappings are acquired (Howes & Young, 1996). The model learns the action to perform to accomplish a task through external search leading to learning in a multi-pass algorithm. There are also models that learn through external scanning and internal comprehension (Rieman, Young, & Howes, 1996), and through exploration to recognise states and information near the target (Howes, 1994). These all have been compared against empirical phenomena in exploratory learning. The processing is heavily recognition-based, so that the search of a large external space of options can be performed with a small working memory.

Aasman (1995) has created a model of driving behavior. The model drives a simulated car through a residential area, avoiding bicycles and parked and moving cars. It learns plans for activities like negotiating intersections, and it learns how to control the car more accurately. Its behaviour has been compared with human driving data, including the order of behaviours and their times for visual orientation, motor control, and car speed.

SCA (Miller & Laird, 1996) learns concepts by starting with very general classification rules and then learning more specific rules. It has been compared with aggregate data of subjects learning to classify novel stimuli.

Diag (Ritter & Bibby, 1997) is a model of diagrammatic reasoning that finds faults in a device. It learns which objects to examine and how to implement its internal problem solving more efficiently. It has been compared with aggregate and verbal protocol data while both subject and model learn.

Chong and Laird (1997) have created a series of models that become better at solving a dual task. The task they model is the Wickens task, which combines a tracking task with a choice-reaction time task. They identify many places where learning could occur in this sequence from novice to expert. They also implement a mechanism to learn the ordering and precedence of external behaviour by resolving ties for multiple behaviours when they are proposed.

We present here a model in more detail, showing how rule-like behaviour can arise out of apparently noisy behaviour, giving rise to a graded performance due to learning while problem solving. This model proposes a mechanism to match the variance in individual data that make up the power law of learning. We will state some things explicitly as well, observations which are well known to modelers in ACT-R and Soar, such as sources of matching the power law, and where learning comes from.

Skill acquisition in scheduling tasks

From a psychological point of view, planning can be considered a problem-solving activity in which an ordered sequence of executable actions has to be constructed prospectively. In a more formal sense, this means specifying a sequence of operators with well-defined conditions and consequences that transform a given state into a goal state. For interesting problems, the entire problem space cannot be searched, and heuristics must be used to guide the search.

Scheduling problems are a specific, important subset of problem solving. Here the task of the problem solver is to find an optimal schedule based on the given constraints (e.g. minimal processing time). Factory scheduling (so-called job-shop scheduling) is a further subset of scheduling tasks, namely, to find the optimal ordering of activities on machines in a factory.

Job-shop scheduling has direct, practical importance. Over the last two decades algorithms have been derived that produce optimal (or near optimal) solutions for scheduling tasks using operations research techniques (e.g. Graves, 1981) as well as AI techniques (e.g. Fox, Sadeh, & Baykan, 1989). One of the most popular systems, based on constraint-propagation, is ISIS (Smith, Ow, Potvin, Muscettola, & Matthys, 1990). Other AI-based approaches have used the general learning mechanisms in PRODIGY (Minton, 1990) or Soar (Prietula & Carley, 1994). These systems rely on the assumption that general methods for efficient problem solving can be discovered by applying a domain-independent learning mechanism. In psychology, on the other hand, little is known about how scheduling is performed as a problem solving activity and about the acquisition of scheduling skills (for a counterexample and earlier call to arms, see Sanderson, 1989).

One way to investigate how people acquire skills and knowledge in planning tasks is to use a general architecture for cognition like Soar (Newell, 1990) to construct and evaluate computational models of scheduling and the learning processes it incorporates. This approach provides methods for conceptualising the problem solving situation and for implementing considerations on the knowledge level such as learning.

Thus the main goal of a cognitive approach is not to find efficient methods for scheduling, but to find models of skill acquisition that can claim cognitive plausibility. Here, we first find and describe several empirical constraints on scheduling, and then describe a computational model that was developed based on these constraints. Use of this model to interpret further data allows us to examine performance, including learning in this domain in a principled, detailed way. The result we include here is a knowledge-based explanation of noise in the power law of learning.

The job-shop scheduling task

The task -- for the subjects as well as for the computational model -- was to schedule five actions optimally, as a scheduler or dispatcher of a small factory. Jobs had to be scheduled on two machines (A and B). Each job had to be run in a fixed order, first A and then B, requiring different processing time on each machine for each job. Sets of five jobs with randomly created processing times were given to the subjects on a computer display. Subjects tried to find the order of jobs that produced the minimal total processing time, determining which out of the five jobs should be run first, which second, and so on.

For this kind of scheduling task an algorithm to find the optimal solution is available (Johnson, 1954). The general principle requires comparing the processing times of the jobs and finding the job requiring the shortest time on one of the two machines. If this is on machine A, then the job has to be run first; if it is on B, then last. This principle is applied to the remaining jobs until all of the jobs are scheduled. Suboptimal sequences result if only parts of the general principle are used, e.g., only the demands of resources on machine A are used for ordering the jobs. This special task of modest complexity was selected because (a) it is simple enough to assess the value of each trial's solution by comparing it to the actual optimal solution (Johnson, 1954), but (b) the task is hard enough to be a genuine problem for subjects, who have to solve it deliberately, and (c) to solve the task without errors requires discovering and applying a general principle.
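The general principle described above can be sketched in a few lines of Python. This is only an illustration of Johnson's (1954) rule for the two-machine case and is not part of Sched-Soar; the function names, the dictionary layout, and the way ties are broken (equal times are treated as a machine A minimum) are assumptions made here.

```python
from typing import Dict, List, Tuple

Jobs = Dict[str, Tuple[float, float]]  # job name -> (time on A, time on B)


def johnson_order(jobs: Jobs) -> List[str]:
    """Order jobs for a two-machine shop (A, then B) using Johnson's rule."""
    remaining = dict(jobs)
    front: List[str] = []  # jobs placed from the start of the schedule
    back: List[str] = []   # jobs placed from the end, stored in reverse
    while remaining:
        # Find the job with the shortest time on either machine.
        name = min(remaining, key=lambda j: min(remaining[j]))
        time_a, time_b = remaining.pop(name)
        if time_a <= time_b:
            front.append(name)  # shortest time is on A: run as early as possible
        else:
            back.append(name)   # shortest time is on B: run as late as possible
    return front + back[::-1]


def makespan(order: List[str], jobs: Jobs) -> float:
    """Total processing time when jobs run on A and then on B in this order."""
    end_a = 0.0
    end_b = 0.0
    for name in order:
        time_a, time_b = jobs[name]
        end_a += time_a                      # A works without gaps
        end_b = max(end_b, end_a) + time_b   # B waits for A if necessary
    return end_b
```

With the hypothetical job set J1 = (3, 6), J2 = (5, 2), J3 = (1, 2), J4 = (6, 6), J5 = (7, 5), the sketch orders the jobs J3, J1, J4, J5, J2, for a total processing time of 24. Using only the machine A times, as in the partial strategy mentioned above, would correspond to sorting by the first element alone.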

What is learned in this task?

Learning to solve scheduling tasks, like learning in general, requires the acquisition as well as the storage of rules in memory. In this task, acquisition is discovering the general rule or inferring at least useful scheduling heuristic rules while performing the task. If no rule on how to schedule jobs is available and the problem solver progresses through blind search, on average no great improvement should occur. Only if the subject generates internal hypotheses about the schedule ordering rules, and if feedback about the correctness of these assumptions is available, will the subject be able to discover efficient scheduling rules. And then, only if a discovered rule is stored in memory will the improvement be applied in later trials. As in impasse-driven learning-theories (VanLehn, 1988), it is assumed that rule acquisition particularly takes place when subjects face a situation in which their background knowledge is not sufficient to solve the problem immediately.

Of course, as in other domains, learning in scheduling tasks depends on the amount of practice and it is highly situated. An essential situational factor, which facilitates or inhibits the acquisition of rules and thus the progress in learning, is the interaction of the problem solver with the environment. In this task, the interaction provides feedback about the quality of the subject's solution and therefore about the efficiency of their applied rule.

The influence of different types of feedback on the quality of the learning process has been investigated in this task (Krems & Nerb, 1992). Subjects were given either (a) quantitative information, that is, how good a single solution was compared with the optimum or with the problem solver's previous solutions, with no information about the underlying rule or how the optimal solution can be found; or (b) qualitative information, that is, an assessment of a solution in relation to the optimal scheduling rule, indicating which of the n jobs were correctly scheduled.

The subjects with less feedback, who received only quantitative information on the distance of their own solution to an optimum, more often discovered a good scheduling algorithm than those who received qualitative information about which jobs were correctly scheduled. The qualitative information subjects were much more oriented towards optimising a single solution rather than getting insight on a more abstract level. This result is surprising, however, in that more feedback usually leads to better performance.

Their behaviour can be well described within repair-theory (VanLehn, 1988). The results point to distinct types of learning based on different kinds of feedback. [Nerb: please expand a bit]

Sched-Soar

Sched-Soar is a computational model of skill acquisition in scheduling tasks. The architectural component is strictly separated from the task-specific knowledge. It provides a good example of a model that fits the data well, and explains some of the mechanisms that may give rise to learning and to the power law, and how apparent rule-based behaviour arises from a more subtle mechanism.

Previous empirical constraints

The empirical constraints are taken from previous experiments (Krems & Nerb, 1992) where 38 subjects each created 100 schedules. Although the main focus of this previous work was to investigate the effect of different kinds of feedback on learning, the data also describe some general regularities. The main empirical results used to constrain the design of our process model of scheduling skill acquisition are:

(1) Total processing time: The task takes 22.3 s, on average, for a novice (minimum 16 s, maximum 26 s).

(2) General speed-up effect: On average, the processing time for scheduling a set of jobs decreased 22% from the first ten trials to the last ten.

(3) Improvement of solutions: The difference between the subject's answers and the optimum answers decreased more than 50% over the 100 trials.

(4) Suboptimal behavior: It should be emphasised that even after 100 trials the solutions are not perfect.

(5) Algorithm not learned: None of the subjects detected the optimal scheduling rules (i.e. nobody could give a verbal description of the underlying principle when asked after the trials).

Model assumptions inherited from Soar

In addition to the empirical constraints, we include the following general constraints taken from the Soar architecture:

(1) The task is described and represented in terms of problem spaces, goals and operators, as a Problem Space Computational Model (Newell, 1990). All knowledge is implemented as a set of productions. Soar's bottom-up chunking mechanism is used for learning, which means that chunks are built only over terminal subgoals. This has been proposed as characteristic of human learning (Newell, 1990, p. 317).

(2) An initial knowledge set about scheduling-tasks is provided as part of long-term knowledge (e.g., to optimise a sequence of actions it is first necessary to analyse the resource demands of every action). Also basic algebraic knowledge is included, such as ordinal relations between numbers. Together, this amounts to 401 productions implementing eight problem-spaces.

(3) The model is feedback-driven. If the available internal knowledge is not sufficient to choose the next action to schedule, the supervisor is asked for advice. These are situations in which a human problem-solver would have to do the same, or to guess.

Processing steps

Sched-Soar begins with an initial state containing the five jobs to be scheduled and a goal-state to have them well scheduled, but without knowledge of the actual minimum total processing time. The minimal scheduling knowledge that Sched-Soar starts with leads to these main processing steps, which are applied at every single scheduling step:

(1) Sched-Soar analyses the situation and tries to find a job to attempt to schedule. Previous knowledge or learning may allow the model to proceed directly to step 3.

(2a) If no decision can be made, despite examination of all available internal knowledge, Sched-Soar requests advice from the environment. The advice specifies the job that is the optimal choice to schedule next in the current set.

(2b) After getting advice about which job to schedule next, Sched-Soar reflects on why it applies to the current situation. In doing so, Sched-Soar uses its scheduling and basic arithmetic knowledge to figure out what makes the proposed job different from all the others, using features like the relations between jobs, the resources required by single actions, and the position of an action in a sequence.

(2c) Based upon this analysis, Sched-Soar memorises explicitly those aspects of the situation that seem to be responsible for the supervisor's advice. We call this kind of chunk an episodic chunk. Episodic chunks implement search-control knowledge, specifying what has to be done and when. An example is: If two jobs are already scheduled, and three operators are proposed that each suggest a job to schedule next, and job 1 has the shortest processing time compared to the other jobs on machine A, then give a high preference value to the operator to schedule job 1. This kind of memorising is goal-driven, done by an operator, and would not arise from the ordinary chunking procedure without this deliberation. If in subsequent trials a similar situation is encountered, then Sched-Soar will bring its memorised knowledge to bear. [Josef: can you tell more about this? e.g. what other heuristics?] Because the memorised information is heuristic, positive as well as negative transfer can result. And because only explicit, specific rules are created, general declarative rule-based behaviour appears to arise slowly and erratically.

(3) The job is assigned to the machines and bookkeeping is performed.
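The processing steps above can be summarised as a recognise, ask, memorise loop. The following Python sketch is a deliberately simplified stand-in and is not the Soar implementation: the two features, the representation of an episodic chunk as a pair (number of jobs already scheduled, feature set), and the names run_trial, salient_features, and advisor are all hypothetical. It also omits operator preferences and the re-evaluation that follows negative transfer, so a misleading chunk would simply be applied.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Jobs = Dict[str, Tuple[float, float]]   # job name -> (time on A, time on B)
Chunk = Tuple[int, FrozenSet[str]]      # (jobs already scheduled, salient features)


def salient_features(name: str, remaining: Jobs) -> FrozenSet[str]:
    """Hypothetical features that set one job apart from the other open jobs."""
    time_a, time_b = remaining[name]
    feats = set()
    if time_a == min(t[0] for t in remaining.values()):
        feats.add("shortest-on-A")
    if time_b == min(t[1] for t in remaining.values()):
        feats.add("shortest-on-B")
    return frozenset(feats)


def run_trial(jobs: Jobs, chunks: Set[Chunk],
              advisor: Callable[[Jobs], str]) -> Tuple[List[str], int]:
    """Schedule one job set; `chunks` persists across trials and is updated."""
    remaining = dict(jobs)
    order: List[str] = []
    requests = 0
    while remaining:
        position = len(order)
        # Step 1: try to recognise a job using memorised episodic chunks.
        recognised = [n for n in remaining
                      if (position, salient_features(n, remaining)) in chunks]
        if recognised:
            choice = recognised[0]
        else:
            # Step 2a: no knowledge applies, so ask the environment for advice.
            choice = advisor(remaining)
            requests += 1
            # Steps 2b and 2c: note what distinguishes the advised job, memorise it.
            chunks.add((position, salient_features(choice, remaining)))
        # Step 3: assign the job and do the bookkeeping.
        order.append(choice)
        del remaining[choice]
    return order, requests
```

Calling run_trial repeatedly with the same chunks set lets knowledge from earlier job sets be recognised in later ones, so the number of advice requests can fall across trials. Because the stored features are only heuristic, a chunk learned on one job set can also match a job that is the wrong choice in another, which is the sketch's analogue of negative transfer.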

The model's behaviour

The model's behaviour can be examined like a subject's, treating each individual run on a set of problems as a simulated subject. Figure alldata shows Sched-Soar's solution time on 4 series of 16 randomly created tasks. Neither a power function (r^2 = 0.55) nor a simple linear function (r^2 = 0.53) provides a good fit to these data. However, when averaged, Figure avdata shows that these series fit a simple power function well (T = 274.0 * N^(-0.3) with r^2 = 0.95).
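A fit of this kind can be reproduced with a least-squares regression on log-transformed data. The sketch below is one common way to do it and is an assumption on our part: the fitting procedure actually used for the figures is not specified here, and r^2 computed in log-log space need not equal r^2 computed on the raw times. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def average_series(series: Sequence[Sequence[float]]) -> List[float]:
    """Average several runs trial by trial (all runs must have equal length)."""
    return [sum(column) / len(column) for column in zip(*series)]


def fit_power_law(times: Sequence[float]) -> Tuple[float, float, float]:
    """Fit T = a * N**b to times for trials N = 1..len(times).

    Returns (a, b, r2), where r2 is computed on the log-log regression.
    """
    xs = [math.log(n) for n in range(1, len(times) + 1)]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)    # intercept gives the scale factor
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r2
```

Applying fit_power_law to each run separately and then to average_series of all runs gives the comparison between individual and averaged fits discussed in this section.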

Figure alldata. Individual data: Processing time in Soar model cycles (decision cycles) for four sets of simulated data for trials 1 to 16 and a power law fit.

Figure avdata. Averaged data: Processing time in model cycles for four sets of simulated data for trials 1 to 16 and a power law fit.

A closer look at Sched-Soar's problem solving process shows that the variance in the individual trials comes from two main sources: the negative transfer of chunked knowledge and the number of requests for advice. Negative transfer results when an episodic chunk, built during solving a previous task, suggests an action in a situation that appears similar to the prior one, but which requires a different action. If this occurs, the situation has to be evaluated again to find the proper schedule-element, and, finally, if there is no suitable knowledge, the model still has to ask for advice. This explains why we found in the model's performance that additional requests for advice are often preceded by one or more instances of negative transfer. Both negative transfer and asking for advice directly lead to more deliberation efforts as measured in model cycles.

An assumption in Soar is that the learning rate (chunks/sec) should be constant. [Note: this measure should be in equivalent terms, chunks/dc, and I suspect the intercept is wrong.] The learning rate (as chunks per trial) proved constant over all 4 x 16 trials (Chunks(N) = 15.48 * N + 417.8 with r^2 = 0.98). [Nerb: need to clean this up. should we just delete it again?]

Comparison of Sched-Soar with data

The model was evaluated in two ways, by investigating how many of the preexisting, empirical constraints were met, and by comparing the model results to new empirical data in a further study. We take these up in turn.

Comparison with previous empirical constraints

Table constraints. Constraints met and not met.

Constraint                           Status
1.  Total processing time            Met
2.  Improvement of solutions         Unclear
3.  Algorithm not learned            Met
4.  General speed-up effect          Not met
5.  Suboptimal behavior persists     Met

The results of the model show that the empirical constraints shown in Table constraints are met in general. (1) Solving the task requires 151 model cycles, averaged over all trials and runs. The Soar architecture specifies the rate of model cycles within half an order of magnitude with a centre point at ten per second. This only constrains the time per cycle to be between 30 ms and 300 ms (Newell, 1990). The model performs slightly slower than the mean expected rate of 100 ms per cycle, but at 147 ms/cycle it is well within the theoretical bounds. (2) The speed-up of the model is greater than the subjects' -- 57% (from 270 cycles in the first trial to 118 cycles in trial 16) compared with 22% by the subjects. (3) The improvement in correctness cannot be decided yet, because Sched-Soar was initially programmed to use advice so that it always produces correct solutions. (4) The model did not discover or implement the general algorithm. (5) Sched-Soar's behaviour is always suboptimal after 16 trials (and negative transfer might still occur in later trials). [Nerb, please check to see that this mapping from text to table is correct, I'm starting to doubt it.]
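The time-per-cycle figure in point (1) comes from dividing the novice solution time by the mean number of model cycles. The short sketch below makes that arithmetic explicit; the function names are hypothetical, and the bounds are simply the 30 ms to 300 ms range cited above.

```python
NEWELL_BOUNDS_MS = (30.0, 300.0)  # per-cycle range cited from Newell (1990)


def ms_per_cycle(seconds: float, cycles: float) -> float:
    """Implied duration of one model cycle, in milliseconds."""
    return 1000.0 * seconds / cycles


def within_newell_bounds(ms: float) -> bool:
    """True if a per-cycle duration falls inside the cited range."""
    low, high = NEWELL_BOUNDS_MS
    return low <= ms <= high


def speedup_percent(first: float, last: float) -> float:
    """Percentage reduction from the first value to the last."""
    return 100.0 * (first - last) / first
```

For the values in point (1), ms_per_cycle(22.3, 151) is about 147.7 ms, which lies above the 100 ms centre point but inside the cited range. The same speedup_percent calculation applies to the cycle counts in point (2).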

The new data

[Nerb: just needs to be beefed up here?]

The model results can be considered theoretical predictions about subjects' behaviour in learning to solve the task. Some of these predictions were further evaluated in an additional empirical study because the task in the study forming the initial empirical constraints (Krems & Nerb, 1992) was not exactly the same as the one solved by the model (in the first study feedback was given after a complete solution, whereas Sched-Soar is advised immediately after every single scheduling decision). Therefore, a second experiment was conducted where the exact same task given to the model was given to 14 subjects. Subjects were instructed to separate their decisions based on knowledge from those based on guesses: They were requested to ask for advice when they did not know what to do. Each subject solved a total of 18 different scheduling problems, a slightly longer series than the model's.

The new comparison

These subjects' learning rate is shown in Figure empdata. We found that a power function (T = 109.6 * N^(-0.38)) accounts best for the averaged data (r^2 = 0.82, compared with 0.71 and 0.73 for linear or exponential fits). The average processing times for trials 1 to 18 vary between 99.4 and 36.3 s.

As has been noted (Kieras, Wood, & Meyer, 1997), like many cognitive models (e.g. Peck & John, 1992), Sched-Soar performs the task more efficiently than subjects do, predicting values on these tasks between 270 and 116 decision cycles. That means one has to assume 369 or 313 ms/model cycle, which is slightly above the region defined by Newell (1990). The learning rate (power law coefficient) of the subjects is approximately 26% higher than the learning rate of the model (-0.38 versus -0.3). In general that means that the task is easier for the model and its performance improves more quickly (but will also asymptote sooner). If this time constraint is taken seriously, future extensions to Sched-Soar should include more of the tasks that subjects must perform themselves but that are currently performed for Sched-Soar, such as waiting for advice to be generated, reading the jobs off the screen, and typing in the schedule. These will take time to perform, and are likely to have a slower learning rate.
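One way to look at the two fits together is to divide the subjects' fitted time by the model's fitted cycle count at each trial, which gives an implied time per model cycle. This is a sketch under the assumption that the two fitted power functions reported in this paper can be compared trial by trial; the constant and function names are hypothetical, and the fitted curves only approximate the observed trial values.

```python
MODEL_A, MODEL_B = 274.0, -0.3    # fitted model curve, in decision cycles
SUBJ_A, SUBJ_B = 109.6, -0.38     # fitted subject curve, in seconds


def model_cycles(trial: int) -> float:
    """Decision cycles predicted by the model's fitted power function."""
    return MODEL_A * trial ** MODEL_B


def subject_seconds(trial: int) -> float:
    """Solution time predicted by the subjects' fitted power function."""
    return SUBJ_A * trial ** SUBJ_B


def implied_ms_per_cycle(trial: int) -> float:
    """Milliseconds per model cycle needed to map one curve onto the other."""
    return 1000.0 * subject_seconds(trial) / model_cycles(trial)


def exponent_ratio() -> float:
    """Ratio of the subjects' exponent to the model's exponent."""
    return SUBJ_B / MODEL_B
```

Under these fits the implied cycle time is 400 ms at trial 1 and about 320 ms at trial 16, in the same region as the values computed from the observed times, and the ratio of the exponents is about 1.27.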

Another explanation for the differences between the model's and the subjects' behaviour might be based on the assumption that the learning mechanisms of Soar and human subjects are qualitatively similar but that there are quantitative differences. The model's behaviour might be more efficient because it learns on every opportunity, whereas human subjects' learning may be more specific than we propose here, may rely on even worse heuristics, or may even be probabilistic.

We also found a correlation of r = 0.46 between processing time and the subjects' requests for advice, due to lack of knowledge or wrong decisions. This corresponds with how the model exhibits negative transfer of chunked knowledge. This correspondence must be examined on a finer level before we can treat the two as equivalent.

Figure empdata. Processing time from trial 1 to 18 for 2 individuals (thin solid lines), the average solution time of all subjects (dashed line), and a power law fit to the average.

Conclusions

This model is pleasing in that it shows the advantages of not knowing everything. This model does not know enough to be or get perfect. Its representation and problem solving knowledge is too weak to do that. It does know enough, however, to get better. People exhibit this behaviour in many circumstances. This model suggests that their behavior is optimal, given the knowledge that they have. Further improvements will have to come with additional knowledge.

A major claim of this analysis is that the power law of practice (Newell, 1990) will appear when performing the kind of scheduling task used in these studies, but only when the data are averaged over subjects or trials. In this task, the cause of the variance is shown clearly not to be noise in the measurement process or variance in the processing rate of the underlying cognitive architecture (which might have been proposed for simpler, more perceptual tasks), or improved interactions with the environment (Agre & Shrager, 1990), although it is related to this. The variance in solution time is caused by variance in how much learned knowledge transfers to new situations. This regularity may be further overshadowed by more deliberation effort. For example, a subject can make simple decisions to ask for advice, or start more elaborate lookahead search to an arbitrary depth constrained only by their working memory and, of course, their motivation. Stripping away this time or replacing it with a constant factor should also yield a power law function on the empirical side, but only when averaged. A more fine-grained analysis would be required (and is possible) to look at the firing of each episodic chunk that led to negative transfer, comparing this with each subject's behaviour (Agre & Shrager, 1990; Ritter & Larkin, 1994).

Open questions

On the other hand, further empirical work will be necessary to answer some of the questions posed by the model. For example, when will a strategy change take place, and will the results be of a local or global nature?

As this model is further developed, it should prove useful for explaining other aspects of scheduling behaviour (e.g., the effect of further kinds of feedback on rule acquisition) and provide a possible new approach to constraint-based planning.

Where the power law comes from

This is one of the first problem solving models to use episodic memory based on learning through reflection to learn a task in a cognitively plausible manner. It shows how both the acquisition and storage of general rules in memory can be modelled in the Soar architecture through acquisition of specific, context-dependent rules (in contrast to VanLehn's (1991, p. 38) account). [tell us more nerb]

References

need better reference for ijcai89 and Fox et al

Aasman, J. (1995). Modelling driver behaviour in Soar. Leidschendam, The Netherlands: KPN Research.

Agre, P. E., & Shrager, J. (1990). Routine evolution as the microgenetic basis of skill acquisition. In Proceedings of the 12th Annual Conference of the Cognitive Science Society. 694-701. Hillsdale, NJ: Lawrence Erlbaum.

Altmann, E. M., & John, B. E. (In press). Episodic indexing: A model of memory for attention events. Cognitive Science.

Bass, E. J., Baxter, G. D., & Ritter, F. E. (1995). Using cognitive models to control simulations of complex systems. AISB Quarterly, 93, 18-25.

Baxter, G. D., & Ritter, F. E. (1996). The Soar FAQ, http://www.psychology.nottingham.ac.uk/users/ritter/soar-faq.html (1.0). Nottingham: Psychology Department, U. of Nottingham.

Chong, R. S., & Laird, J. E. (1997). Identifying dual-task executive process knowledge using EPIC-Soar. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society. 107-112. Mahwah, NJ: Lawrence Erlbaum Associates.

Fox, M., Sadeh, N., & Baykan, C. (1989). Constrained Heuristic Search. In Proceedings of IJCAI'89. 20-25.

Graves, S. C. (1981). A review of production scheduling. Operations Research, 29, 646-675.

Howes, A., & Young, R. M. (1996). Learning consistent, interactive, and meaningful task-action mappings: A computational model. Cognitive Science, 20(3), 301-356.

Howes, A., & Young, R. M. (1997). The role of cognitive architecture in modeling the user: Soar's learning mechanism. Human-Computer Interaction, 12, 311-343.

Huffman, S. B., & Laird, J. E. (1995). Flexibly instructable agents. J. of AI Research, 3, 271-324.

John, B. E., Vera, A. H., & Newell, A. (1994). Towards real-time GOMS: A model of expert behavior in a highly interactive task. Behavior and Information Technology, 13, 255-267.

Johnson, K. A., Johnson, T. R., Smith, J. W. J., DeJongh, M., Fischer, O., Amra, N. K., & Bayazitoglu, A. (1991). RedSoar: A system for red blood cell antibody identification. In Fifteenth Annual Symposium on Computer Applications in Medical Care. 664-668. Washington: McGraw Hill.

Johnson, S. M. (1954). Optimal two and three-stage production schedules with set up times included. Naval Research Logistics Quarterly, 1, 61-68.

Kieras, D. E., Wood, S. D., & Meyer, D. E. (1997). Predictive engineering models based on the EPIC architecture for a multimodal high-performance human-computer interaction task. Transactions on Computer-Human Interaction, 4(3), 230-275.

Krems, J., & Nerb, J. (1992). Kompetenzerwerb beim Lösen von Planungsproblemen: experimentelle Befunde und ein SOAR-Modell (Skill acquisition in solving scheduling problems: Experimental results and a Soar model) (No. FORWISS-Report FR-1992-001). FORWISS, Muenchen.

Larkin, J. H., & Simon, H. A. (1981). Learning through growth of skill in mental modeling. In H. A. Simon (Ed.), Models of thought II. 134-144. New Haven, CT: Yale University Press.

Lehman, J. F., Laird, J. E., & Rosenbloom, P. S. (1996). A gentle introduction to Soar, an architecture for human cognition. In S. Sternberg & D. Scarborough (Eds.), Invitation to cognitive science, vol. 4. Cambridge, MA: MIT Press.

Miller, C. S., & Laird, J. E. (1996). Accounting for graded performance within a discrete search framework. Cognitive Science, 20, 499-537.

Minton, S. (1990). Quantitative results concerning the utility of explanation-based learning. Artificial Intelligence, 42, 363-391.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Nielsen, T. E., & Kirsner, K. (1994). A challenge for Soar: Modeling proactive expertise in a complex dynamic environment. In Singapore International Conference on Intelligent Systems (SPICIS-94). B79-B84.

Norman, D. A. (1991). Approaches to the study of intelligence. Artificial Intelligence, 47, 327-346.

Peck, V. A., & John, B. E. (1992). Browser-Soar: A computational model of a highly interactive task. In Proceedings of the CHI '92 Conference on Human Factors in Computing Systems. 165-172. New York, NY: ACM.

Prietula, M. J., & Carley, K. M. (1994). Computational organization theory: Autonomous agents and emergent behavior. J. of Organizational Computing, 41(1), 41-83.

Rieman, J., Lewis, C., Young, R. M., & Polson, P. G. (1994). "Why is a raven like a writing desk?" Lessons in interface consistency and analogical reasoning from two cognitive architectures. In Proceedings of the CHI '94 Conference on Human Factors in Computing Systems. 438-444. New York, NY: ACM.

Rieman, J., Young, R. M., & Howes, A. (1996). A dual-space model of iteratively deepening exploratory learning. International Journal of Human-Computer Studies, 743-775.

Ritter, F. E., & Bibby, P. A. (1997). Modelling learning as it happens in a diagrammatic reasoning task (Tech. Report No. 45). ESRC CREDIT, Dept. of Psychology, U. of Nottingham.

Ritter, F. E., Jones, R. M., & Baxter, G. D. (In press). Reusable models and graphical interfaces: Realising the potential of a unified theory of cognition. In U. Schmid, J. Krems, & F. Wysotzki (Eds.), Mind modeling - A cognitive science approach to reasoning, learning and discovery. Lengerich: Pabst Scientific Publishing.

Ritter, F. E., & Larkin, J. H. (1994). Using process models to summarize sequences of human actions. Human-Computer Interaction, 9(3), 345-383.

Ritter, F. E., & Young, R. M. (1996). The Psychological Soar Tutorial, http://www.psychology.nottingham.ac.uk/staff/ritter/pst-ftp.html (12.). Nottingham: Psychology Department, U. of Nottingham.

Rosenbloom, P. S., Laird, J. E., & Newell, A. (1992). The Soar papers: Research on integrated intelligence. Cambridge, MA: MIT Press.

Sanderson, P. M. (1989). The human planning and scheduling role in advanced manufacturing systems: An emerging human factors domain. Human Factors, 31(6), 635-666.

Smith, S., Ow, P. S., Potvin, J., Muscettola, N., & Matthys, D. (1990). An integrated framework for generating and revising factory schedules. J. of the Operations Research Soc., 41, 539-552.

VanLehn, K. (1988). Toward a theory of impasse-driven learning. In H. Mandl & A. Lesgold (Eds.), Learning issues for intelligent tutoring systems. 19-41. New York, NY: Springer.

VanLehn, K. (1991). Rule acquisition events in the discovery of problem-solving strategies. Cognitive Science, 15(1), 1-47.