# Training Control Centers' Operators in Incident Diagnosis and Power Restoration Using Intelligent Tutoring Systems

Luiz Faria
António Silva
Zita Vale, IEEE
Albino Marques

Pages: pp. 135-147

Abstract—The activity of Control Center operators is important to guarantee the effective performance of Power Systems. Operators' actions are crucial to deal with incidents, especially severe faults like blackouts. In this paper, we present an Intelligent Tutoring approach for training Portuguese Control Center operators in tasks like incident analysis and diagnosis, and service restoration of Power Systems. An Intelligent Tutoring System (ITS) approach is used in the training of the operators, taking into account context awareness and unobtrusive integration into the working environment. Several Artificial Intelligence techniques were judiciously selected and combined to obtain an effective Intelligent Tutoring environment, namely Multiagent Systems, Neural Networks, Constraint-based Modeling, Intelligent Planning, Knowledge Representation, Expert Systems, User Modeling, and Intelligent User Interfaces.

Index Terms—Cooperative learning, intelligent tutoring systems, on-the-job training, operators' training, power systems control centers.

## Introduction

Current Power Systems are highly complex and require sophisticated and precise operation and control. The most important decisions concerning Power System operation are taken in Control Centers, where real-time information on the Power System state is received and the human operators are the final link of an extended chain. Although Power System reliability has been increasing, incidents with more or less severe consequences still occur. In some cases, they result in blackout situations that leave consumers without supply, with a potentially dramatic economic and social impact. Fig. 1 shows the impact of the 14th August 2003 blackout in the Northeast part of the USA.

Figure    Fig. 1. Northeast USA before and after the 14th August 2003 blackout (Source: NOAA—National Oceanic and Atmospheric Administration).

Blackouts have been a major concern in Power Systems, mainly since the Northeast Blackout of 9th November 1965 in the USA. In recent years, several blackouts have occurred, making the need to keep the lights on more important than ever. On the 4th of November 2006, a Saturday, some minutes after 10 p.m., the Union for the Coordination of Transmission of Electricity (UCTE) European Network experienced a quasi-blackout situation affecting nine European countries and North Africa and about 10 million consumers [1]. It was due to the simultaneous occurrence of several unforeseen events, made worse by the increasing unpredictability inherent to wind power production. The restoration process was hampered by limited coordination and the lack of an accurate global view. Not long ago, IEEE Power and Energy Magazine devoted a special issue titled "Shedding light on blackout—From prevention through restoration" to the subject of blackouts, their prevention, and recovery [2].

Control Center operators' performance is decisive in minimizing the consequences of an incident. The need for a good response of Control Centers to severe faults, like blackouts, is even more important nowadays, due to the generalization of liberalized Electricity Markets [3]. As Power Systems reliability has increased, the number of incidents offering an occasion for operator on-the-job training has decreased. The consequences of incorrect operator behavior are all the more severe during a serious incident [4]. Operator training and the availability of decision-support tools are vital for overcoming these problems [5].

Power System Control Centers are an interesting domain for Knowledge-Based Systems (KBSs) because these systems can provide solutions for a large set of problems for which traditional software techniques are not suitable. In fact, Power Systems are complex and dynamically changing environments, made up of many plants and pieces of equipment. These characteristics require knowledge-based applications in Control Centers to deal with nonmonotonic and temporal reasoning. Moreover, the analysis of these situations is event-driven, requiring each piece of information to be analyzed in context and not independently of the other available information.

Intelligent Tutoring Systems (ITSs) were the main approach selected to deal with the operators' training in diagnosis [6] and restoration tasks, namely because:

1. They represent domain knowledge in a structured way, allowing the inference of new knowledge (access to the essential knowledge).
2. They model the trainee, allowing action in a nonmonotonic way, adapting better to the trainee's characteristics and evolution (awareness of the needs of people).
3. With the right didactic knowledge, they allow the system to choose different pedagogical approaches in the different phases of the learning process (requirements customization).
4. They are able to constantly monitor the trainee's performance and evolution, gathering information to guide the system's adaptation (context awareness).
5. They typically require very little intervention from the training staff and can be used in the working environment without disturbing the normal working routines.

In this paper, we present an Intelligent Tutoring System used for training the Control Center operators of the Portuguese Power System's Network in fault diagnosis and power restoration tasks.

Fig. 2 illustrates the environment in which the Operators of the Portuguese Power System Control Center work.

Figure    Fig. 2. Power system control center environment.

Several Artificial Intelligence techniques are used to minimize the network experts' effort in training preparation and to enable effective on-the-job and cooperative training.

## Tutoring Environment Architecture

The tutoring environment that has been developed involves two main areas: one devoted to the training of fault diagnosis skills and another dedicated to the training of power system restoration techniques. Fig. 3 shows the tutoring environment architecture.

Figure    Fig. 3. Tutoring environment architecture.

The selection of the adequate established restoration procedure strongly depends on the correct identification of the Power System operation state. Therefore, the identification of the incident or set of incidents occurring in the transmission network is of utmost relevance in order to establish the current Power System operation state. Thus, the proposed training framework divides the operator's training into distinct stages. The first one, described in Section 3, is intended to give operators the competence needed for incident diagnosis. After that, operators are able to use CoopTutor (Section 4) to train their skills in managing the restoration procedures.

### 2.1 DiagTutor's Structure

This tutoring module is focused on Fault Diagnosis Training and can be divided into two major classes: modules and information stores. Modules are active processes that work together to create the required intelligent behavior. The tutoring system modules are the following:

1. Planning and instruction modules—the macroadaptation module defines the decisions taken before the beginning of the training session and the microadaptation module is responsible for guiding the response to the operator actions during the training session.
2. Training scenario search module—looks for the training scenario whose features are closest to the set of features defined by the macroadaptation module.
3. Specific situation generation module—generates a model describing the diagnostic process for each incident included in the training scenario.
4. Domain expert and operator reasoning matching module—compares the reasoning of the domain solver (the SPARSE expert system [4]) with the steps performed by the operator during problem resolution.
5. Errors identification module—detects operator misconceptions by comparing the operator errors with the error patterns' library.
6. User interface manager.
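As a rough illustration of what the training scenario search module might do, the following Python sketch picks the stored scenario whose features are closest to a requested set. The feature names are borrowed from the complexity characteristics listed in Section 3.3; the `ScenarioFeatures` type, the squared-difference distance, and the function names are assumptions made for this sketch, not a description of DiagTutor's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScenarioFeatures:
    """Hypothetical feature record for one training scenario."""
    incidents: int                  # number of incidents in the case
    incident_types: int             # variety of incident types
    plants: int                     # number of involved plants
    chronological_inversion: bool   # SCADA messages out of order?


def distance(a: ScenarioFeatures, b: ScenarioFeatures) -> int:
    """Squared-difference distance between two feature sets (an assumption)."""
    return (
        (a.incidents - b.incidents) ** 2
        + (a.incident_types - b.incident_types) ** 2
        + (a.plants - b.plants) ** 2
        + int(a.chronological_inversion != b.chronological_inversion)
    )


def closest_scenario(library: List[ScenarioFeatures],
                     target: ScenarioFeatures) -> ScenarioFeatures:
    """Return the library scenario nearest to the requested features."""
    return min(library, key=lambda s: distance(s, target))
```

Any other similarity measure could be substituted for `distance` without changing the structure of the search.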

### 2.2 CoopTutor's Structure

The purpose of this tutoring module is Restoration Training and it is built on a multiagent system including both agents personifying the several entities usually present in a power system control structure and the agents responsible for the simulation and the pedagogical guidance tasks.

The main agents present in the system fall into one of the following categories:

1. Supporting and Guidance agents:
2. Role playing agents—performing the roles normally assured by the human operators present in the Control Centers.
3. Interface agents—assigned to students being trained.

## Tutoring Module for Fault Diagnosis Training

During the analysis of alarm messages lists, CC operators must have in mind the group of messages that describes each type of fault. The same group of messages can show up in the reports of different types of faults. So CC operators have to analyze the arrival of additional information whose presence or absence determines the final diagnosis.

Operators have to deal with uncertain, incomplete, and inconsistent information, due to data loss or errors in the data gathering system.

Let us consider a small example: a simplified situation that may occur in a Power System that helps to understand the importance of temporal reasoning when dealing with Power System operation.

Whenever a fault occurs in a Power System, its protection system should react to it, giving automatic opening orders to one or more breakers. The opening of these breakers ensures the isolation of the fault, the protection system being designed so that the affected area is as small as possible. Protection systems are very important for the performance and security of Power Systems and can be rather complex, especially in the case of transmission networks, which involve many different protection devices. In our example, we will consider that a fault occurs in a line connecting two substations of a Power System and that, as a consequence, the protection system gives opening orders to the two breakers installed at the two ends of this line (Fig. 4).

Figure    Fig. 4. Power system line.

Fig. 5 considers one of these breakers and a possible sequence of operations after the occurrence of the fault.

Figure    Fig. 5. Sequence of breaker operations. $\rm T1<T2<T3$ .

Let us consider that the breaker opens at instant T1, closes at instant T2, and opens again at instant T3. The interpretation of this fault depends not only on the sequence of events but also on the time intervals between them. In fact, when the breaker closes at T2, after the first opening at T1, this is likely to be due to the automatic reclosure procedure of the protection equipment. Fast automatic reclosures are widely used in Power Systems in order to minimize the impact of faults. In this case, the time interval between T1 and T2 depends on the type of the fault and on the setting of the automatic reclosure in the protection. Let us consider that, for instance, for a fault involving only one of the three phases (single-phase fault), this time would be 900 milliseconds, whereas for a fault involving the three phases (three-phase fault), it would be 300 milliseconds. Apart from these times, we have to allow some tolerance in the dating and transmission of the information from the plant to the Control Center. For this reason, let us say that in the case of a three-phase fault, the time interval between T1 and T2 should not exceed 500 milliseconds. So, if T2-T1 is less than or equal to 500 milliseconds, we can interpret the first two messages as a consequence of a three-phase fault.

After this, we have to consider the third message reporting a new opening of the breaker at T3. Assuming that this is a consequence of a tripping command sent by the protection system, it is due to an incident situation. Once more, the time T3 is crucial for the interpretation of this part of the incident. If this tripping takes place within a short interval of time (let us say within 5 seconds) after the reclosure of the breaker, it is considered to be caused by the same fault that originated the first opening of the breaker. Under these circumstances, with T3-T2 equal to or less than 5 seconds, the whole incident would be seen as a three-phase fault with unsuccessful reclosure at this end of the line. If T3-T2 were greater than 5 seconds, the third message would be considered as reporting a fault independent of the one already considered.

The above example shows the complexity of the analysis of the messages that CC operators have to interpret. Note that the same sequence of messages can be interpreted in different ways, depending on the time intervals between messages. If a Knowledge-Based System is used to assist this interpretation, its inference engine must be prepared to deal with the temporal nature of the problem. For instance, after receiving the second message considered in this example, the incident could be described as a three-phase fault with successful reclosure, but the inference engine will have to wait at least 5 seconds for the possible arrival of a message reporting another opening of the breaker. If the message arrives, the incident will be described as a three-phase fault with unsuccessful reclosure.
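The temporal reasoning in this example can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the 500 ms and 5 s thresholds are the ones quoted in the example, while the event encoding, the function name `classify_breaker_sequence`, and the returned labels are hypothetical and do not reproduce the inference engine of any real system.

```python
from typing import List, Tuple

# Thresholds taken from the worked example in the text (milliseconds).
THREE_PHASE_RECLOSE_MAX_MS = 500   # upper bound on T2 - T1 for a three-phase fault
SAME_FAULT_WINDOW_MS = 5000        # upper bound on T3 - T2 for the same fault

Event = Tuple[str, int]  # (kind, timestamp in ms); kind is "open" or "close"


def classify_breaker_sequence(events: List[Event], now_ms: int) -> str:
    """Interpret an open/close/open sequence for one breaker (hypothetical helper)."""
    kinds = [kind for kind, _ in events]
    times = [t for _, t in events]
    if kinds[:2] != ["open", "close"]:
        return "no reclosure pattern"
    t1, t2 = times[0], times[1]
    if t2 - t1 > THREE_PHASE_RECLOSE_MAX_MS:
        return "not a three-phase reclosure"
    if len(events) >= 3 and kinds[2] == "open":
        t3 = times[2]
        if t3 - t2 <= SAME_FAULT_WINDOW_MS:
            return "three-phase fault with unsuccessful reclosure"
        return "three-phase fault with successful reclosure, then independent fault"
    # No third message yet: the conclusion is provisional until the window closes.
    if now_ms - t2 <= SAME_FAULT_WINDOW_MS:
        return "pending"
    return "three-phase fault with successful reclosure"
```

The `"pending"` result reflects the need to wait for a possible third message: the same two messages yield a provisional answer inside the 5-second window and a final one only after it has elapsed.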

In fact, if we consider all the messages that are generated during the period of the incident, including not only the messages originated in the plants involved in the incident but also those from other plants of the Power System, operators can be forced to consider several hundred messages in just a few minutes. It is important to note that an incident usually causes the generation of not only the messages that are relevant to the analysis of this particular incident but also many other messages that are not important in that context, increasing the total number of received messages. However, in other contexts, these messages could be important, which stresses the need for a contextual interpretation of the information.

Moreover, several incidents can take place almost at the same time and one incident can have consequences in many more than two plants, resulting in a much more complex interpretation of the situation. If we also take into account the need to consider missing information, we can have an idea of the difficulties that CC operators face and also of the complexity of a knowledge-based application for this area.

In order to illustrate how a diagnosis training session is conducted and the interaction between the operator and the tutor, this section presents a very simplified diagnosis problem containing a DmR (monophase tripping with reclosure) incident that occurred in panel 204 of the SED substation. The relevant SCADA messages related to this incident are listed in Table 1. These SCADA messages correspond to the following events: breaker tripping, breaker moving, and breaker closing [7]. In a real training scenario, the operator is faced with a very large number of messages, typically several hundred.

Table 1. Incident in Panel 204 of SED Substation

The interaction between the trainee and the tutor is performed through prediction tables (Fig. 6), where the operator selects a set of premises and the corresponding conclusion. The premises represent events (SCADA messages), temporal constraints between events, or previous conclusions [7].

Figure    Fig. 6. Prediction table.

DiagTutor does not require the operator's reasoning to follow a predefined set of steps, as in other implementations of the model tracing technique [8]. In order to evaluate this reasoning, the tutor compares the prediction tables' content with the specific situation model [9]. This model is obtained by matching the domain model with the inference undertaken by the SPARSE expert system [4]. The process is used to: identify the errors revealing the operator's misconceptions; provide assistance on each problem solving action, if needed; monitor the evolution of the trainee's knowledge; and provide learning opportunities for the trainee to reach mastery. In the area of ITSs, this goal has been achieved through the use of cognitive tutors [10], [11].

The identified errors are used as opportunities to correct the faults in the operator's reasoning. The operator's entries in prediction tables cause immediate responses from the tutor. In case of error, the operator can ask for help, which is supplied as hints. Hinting is a tactic that encourages active thinking structured within guidelines dictated by the tutor [12]. The first hints are generic, becoming more detailed if the help requests are repeated.

The situation-specific model generated by the tutoring system for the problem presented is shown in the left frame of Fig. 7. It presents high granularity since it includes all the elementary steps used to get the problem solution. The tutor uses this model to detect errors in the operator's reasoning by comparing the situation-specific model with the set of steps used by the operator. The model's granularity level is adequate for a novice trainee but not for an expert operator. The right frame of Fig. 7 represents a model used by an expert operator, including only concepts representing events, temporal constraints between events, and the final conclusion. Any reasoning model between the higher and lower granularity level models is admissible, provided that it does not include any violation of the domain model. These two levels are used as boundaries of a continuous cognitive space.

Figure    Fig. 7. Higher and lower granularity levels of the situation-specific model.

Indeed, the process used to evaluate the trainee's reasoning is based on the application of pattern matching algorithms. Similar approaches with the same purpose are used in other ITSs, such as TAO [13], an ITS designed to provide tactical action officer students in the US Navy with practice-based and individualized instruction.

### 3.2 Adapting the Curriculum to the Operator

The main goal of the Curriculum Planning module is to select, from a library, a problem fitting the trainee needs.

The preparation of the tutoring sessions' learning material is a time-consuming task. In the industrial environment, there is usually no staff exclusively dedicated to training tasks. In particular, in the electrical sector, the preparation of training sessions is done by the most experienced operators, who are often overloaded with power system operation tasks [7]. In order to overcome this difficulty, we developed two tools. The first one generates and classifies training scenarios from real cases previously stored. As these may not cover all the situations that control center operators must be prepared to face, another tool is used to create new training scenarios or to edit already existing ones [7]. This second tool, named Training Scenarios Generator, allows the user to choose the features of the training scenario, such as the possibility of chronological inversion of SCADA messages.

The process used by the Curriculum Planning module to define the problems' features involves two phases. First, the tutor must define the difficulty level of the problem, using heuristic rules. These rules relate parameters like the trainee's performance in previous problems and his overall level of knowledge. In the second phase, the tutor uses the user model's contents to choose the type of the most suitable incidents to be included in the problem, taking into account the domain concepts involved in each type of incident and the corresponding trainee's expertise.

### 3.3 Difficulty Level Selection

To evaluate the problems' difficulty level, we need to identify the cases' characteristics that increase their complexity, namely number of incidents involved in the case, variety of incident types, number of involved plants, and existence of chronological inversion in SCADA messages.

The choice of the difficulty level depends on two factors contained in the trainee's model: the trainee's global knowledge and a global acquisition factor. The first parameter is a measure of the trainee's knowledge level over the whole range of domain concepts and is calculated as the mean of his knowledge level in each domain concept. The Curriculum Planning Module needs appropriate thresholds for deciding on the next problem's difficulty level. The opinion of the trainees regarding their personal evolution as the problems' difficulty level changes can be used to tune these thresholds.

The acquisition factors record how well trainees learn new concepts. When a new concept is introduced, the tutor monitors the trainee's performance on the first few problems, namely how well and how quickly he solves them. This analysis determines the trainee's acquisition factor. The procedure used to determine the trainee's acquisition factor for each domain concept is based on the number of times the trainee's knowledge level about the concept increased, considering the first three applications of the concept.
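The two quantities just described can be illustrated with a short Python sketch. The global knowledge is the mean stated in the text; for the acquisition factor, the text only says that it is based on the number of increases over the first three applications, so dividing that count by three to obtain a value between 0 and 1 is an assumption of this sketch, as are the function names and the layout of `history`.

```python
from statistics import mean
from typing import Dict, List


def global_knowledge(levels: Dict[str, float]) -> float:
    """Mean of the trainee's knowledge level over all domain concepts."""
    return mean(levels.values())


def acquisition_factor(history: List[float]) -> float:
    """Share of the first three applications in which the level increased.

    `history` holds the knowledge level before the first application,
    followed by the level after each application of the concept.
    Normalising the count by 3 is an assumption of this sketch.
    """
    first = history[:4]  # initial level plus the first three applications
    increases = sum(1 for before, after in zip(first, first[1:]) if after > before)
    return increases / 3
```

For instance, a history of `[0.1, 0.3, 0.3, 0.5]` contains two increases in the first three applications, giving a factor of 2/3 under this normalisation.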

The mechanism used to define the difficulty level of the problems is based on the following rule:

If the global knowledge level and the global acquisition factor change in opposite directions (low-high or high-low), then the problem difficulty level does not change. Otherwise, the problem difficulty level changes in the same direction as the global knowledge level.
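Read literally, this rule amounts to a small decision function. The Python sketch below is one possible reading, assuming that both factors have already been discretised to "low" or "high" and that difficulty is an integer between 1 and 5; the discretisation, the bounds, and the name `next_difficulty` are assumptions introduced here for illustration.

```python
def next_difficulty(current: int, knowledge: str, acquisition: str,
                    lowest: int = 1, highest: int = 5) -> int:
    """Apply the base rule; `knowledge` and `acquisition` are "low" or "high"."""
    if knowledge != acquisition:
        # Opposite directions (low-high or high-low): difficulty is unchanged.
        return current
    # Same direction: follow the global knowledge level.
    step = 1 if knowledge == "high" else -1
    return max(lowest, min(highest, current + step))
```

Under this reading, a low acquisition factor can only keep or lower the difficulty level.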

Table 2 illustrates the application of the previous rule.

Table 2. Application of the Mechanism Used to Define the Difficulty Level of Problems

Table 2 shows that if the trainee possesses a weak global acquisition factor, the resulting difficulty level never increases, regardless of the global knowledge level. In order to prevent this behavior, whenever the operator accumulates three increase/decrease steps of the global acquisition factor over three consecutive problems while the global acquisition factor remains at a low/high level, the problem's difficulty level is incremented/decremented. The goal of this heuristic rule is to prevent the global acquisition factor from permanently dictating the variation of the problem's difficulty level.

### 3.4 Problem Type Adequacy to the Trainee Cognitive Status

The mechanism used to classify each kind of incident in terms of adequacy to the trainee is based on a neural network (right side of Fig. 8). The nodes belonging to the input layer correspond to the concepts included in the domain's knowledge base (to be assimilated by the trainees). Each node represents the application of a concept in a specific context. For instance, the nodes ce1/T1 and ce1/T5 represent two instances of the same concept and characterize the application of the concept of breaker tripping in the situations of first tripping and tripping after an automatic reclosure. The input vector contains an estimate of the trainee's expertise level for each concept or its application and is obtained from the user model. Therefore, this vector represents an estimate of the trainee's domain knowledge.

Figure    Fig. 8. Classification mechanism.

The output layer units represent the adequacy of an incident type to the current learner's knowledge status. The number of units corresponds to the number of incident types. The five incident types considered are DS (simple tripping), DtR (triphase tripping with successful reclosure), DmR (monophase tripping with successful reclosure), DtD (triphase tripping with unsuccessful reclosure), and DmD (monophase tripping with unsuccessful reclosure). Each output layer's node, representing a type of incident, is connected only to the input nodes corresponding to the concepts involved with that incident type. These connections are done with links of weight $w_{ij}$.

The values used as weights are $w_{ij} \in \{1, 0, -\}$, where "-" is used to indicate that there is no connection between the node i of the output layer and the input node j. This means that concept j is not involved in incident type i.

Each output neuron activation level is computed using the input vector and its weight vector. The activation is defined by the euclidean distance, given by (1):

$a_i = \sqrt{\sum_{j = 1}^{n} \left( w_{ij} - x_j \right)^2}. \qquad (1)$

We can see that a neuron with a weight vector (w) similar to the activation level vector of the input node (x) will have a low activation level and vice versa. The output layer's node with the lowest activation will be the winner.
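The winner selection can be sketched in Python as follows. The sketch assumes that a missing connection ("-") simply contributes nothing to the sum in (1), which the text does not state explicitly; `None` stands for "-", and the function names are hypothetical.

```python
import math
from typing import Dict, List, Optional

# One weight vector per incident type; None stands for "-" (no connection).
Weights = Dict[str, List[Optional[float]]]


def activation(weights: List[Optional[float]], x: List[float]) -> float:
    """Euclidean distance of (1), summed over connected input nodes only."""
    return math.sqrt(
        sum((w - xj) ** 2 for w, xj in zip(weights, x) if w is not None)
    )


def most_adequate_incident(weights: Weights, x: List[float]) -> str:
    """Return the incident type whose output node has the lowest activation."""
    return min(weights, key=lambda name: activation(weights[name], x))
```

As a usage example with made-up weights, `{"DS": [0, None, None], "DtR": [1, 0, None]}` selects DS for the input `[0.1, 0.1, 0.1]` and DtR for `[0.7, 0.1, 0.1]`, since the second input is closer to the DtR weight vector on its connected nodes.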

On the left side of Fig. 8, each line represents the evolution of the knowledge level about each domain concept, across a sequence of problems presented to the ideal operator. The vertical axis represents the knowledge level of the operator about each domain concept. The horizontal axis represents the sequence of problems obtained by the classification mechanism.

It can be observed that, after the third iteration, the concepts used in the DS incident type exceed the medium level (0.5), leading to a new type of incident (DtR) in the next iteration. After the fourth iteration, some concepts that are not used in DS but are involved in the DtR incident exceed the minimum level for the first time. In the simulation, all the model variables are initially set to their minimum value (0.1) and can reach a maximum value of 0.9. It is also assumed that the ideal operator correctly applies all the domain concepts involved in the problem and that the updating rate is constant (0.2).
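The evolution of a single concept for the ideal operator can be reproduced with a few lines of Python, assuming that the constant updating rate is applied additively and that the level is capped at the maximum. The additive form is an assumption of this sketch; the minimum, maximum, and rate are the values quoted above.

```python
MIN_LEVEL, MAX_LEVEL, RATE = 0.1, 0.9, 0.2  # values quoted in the text


def apply_correctly(level: float) -> float:
    """One correct application of a concept (additive update is an assumption)."""
    return min(MAX_LEVEL, round(level + RATE, 10))


# Knowledge level of one concept over five consecutive correct applications.
levels = [MIN_LEVEL]
for _ in range(5):
    levels.append(apply_correctly(levels[-1]))
```

Under this assumption, the level first rises above 0.5 on the third correct application and saturates at 0.9 on the fourth.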

We observed that an early introduction of new concepts can contribute to increase the instructional process efficiency. The problem selection mechanism ensures that the problem sequence is not monotonous, tending to stimulate the operator's performance with new kinds of incidents.

### 3.5 A Case Study

In this section, we present a more elaborate example that can be presented to the Control Center's Operator trainee and is based on a real incident. This incident generated a set of messages from which we have selected the following 35 messages arriving in a period of just 130 ms.

Figure

These messages correspond to an incident at the line Ferreira do Alentejo-Palmela (SFA-SPM) involving only one phase that triggered the tripping of both ends. Automatic reclosure equipment performed the reclosure of the line, successfully in Palmela substation (SPM) but unsuccessfully in Ferreira do Alentejo substation (SFA). This end of the line has been closed by the automatic operator (OPA) of Ferreira do Alentejo substation. After the occurrence of a breaker tripping, the OPA will try to reengage it. If the fault persists and another tripping immediately occurs, the OPA will stop trying.

For this incident, the correct diagnosis is the following:

Figure

In this scenario, the automatic equipment was able to close both extremes of the line and the operator did not need to perform any corrective action. However, in other situations, where the cause of tripping is not transitory, the operator must perform corrective actions in order to restore the service. The training of these corrective actions is the goal of CoopTutor, presented in Section 4.

This diagnosis is reached by the SPARSE Expert System, which is used by DiagTutor as the Domain Expert. In order to support the trainee's activity during the training session, DiagTutor receives the inference produced by the Domain Expert to get the correct diagnosis. The Expert System Knowledge Base is represented through production rules, so the inference produced includes the triggered rules, their premises, and their corresponding conclusions. This inference is used by DiagTutor to get the situation-specific model presented in Section 3.1. As presented before, this model is represented with two granularity levels, which represent the boundaries of the trainee's behavior during problem solving. The lower granularity level represents the reasoning of an expert, who is able to solve the diagnosis problem with a minimum number of steps. On the other hand, a beginner trainee will require a maximum number of steps to reach the correct diagnosis. Such a set of steps is represented by the higher granularity level of the situation-specific model.

Returning to the example, the incident involves tripping in both extremes of a line. In such a case, the strategy used by an expert is to identify the tripping in each extreme of the line and then identify the correlation between the two trippings. This correlation occurs if the interval between trippings does not exceed a predefined number of seconds. DiagTutor supports the trainee's activity with this strategy.

Fig. 9 presents the situation-specific model with the lower granularity level. The level represents the steps used by an expert to reach the correct diagnosis.

Figure    Fig. 9. Lower granularity level of the situation-specific model.

During the problem solving activity, an advanced trainee will need to use only three prediction tables (as presented in Fig. 9) to reach the correct diagnosis: two prediction tables to get conclusions about the tripping in each extreme of the line (conclusions cs11 and cs13 in Fig. 9), and a third prediction table to conclude about the correlation between the two trippings (conclusion cc1 in Fig. 9).

On the other hand, a trainee in an earlier training stage may require a larger number of steps to achieve the correct diagnosis, which means that the trainee will use more prediction tables during problem solving. Such a trainee has not yet automated the diagnosis task. Fig. 10 shows the higher granularity level of the situation-specific model for the example.

Figure    Fig. 10. Higher granularity level of the situation-specific model.

A beginner trainee will need to use 8 prediction tables: 3 prediction tables to conclude about DmR in SPM (cs11), 4 prediction tables to conclude about DmD in SFA (cs13), and another prediction table to get the correlation about the tripping in both sides of the line (cc1).

For instance, the first of the three prediction tables used to conclude about DmR in SPM leads to the concept cs6 (monophase tripping of unknown type at instant T1), based on the evidence of the events ce1 (breaker tripping at instant T1) and ce4 (breaker moving at instant T2) and on the verification of the temporal constraint ct1 ($\vert {\rm T}1 - {\rm T}2 \vert \le 300$ milliseconds).

The existence of the two granularity levels of the situation-specific model does not require the operator's reasoning to follow exactly the set of steps expressed by either granularity level. Considering the example, DiagTutor would accept as correct the conclusion about DmR in SPM (the tripping in the first extreme of the line) if a trainee with an intermediate level of knowledge about the diagnosis task used 2 prediction tables instead of the 3 from the higher granularity level. In this case, the first two prediction tables, corresponding to the higher granularity level, could be replaced by only one. This prediction table could conclude about the concept cs8 (monophase fast reclosure at instant T3) based on the evidence of the events ce1 (breaker tripping at instant T1), ce4 (breaker moving at instant T2), and ce2 (breaker closed at instant T3), and on the verification of the temporal constraints ct1 ($\vert {\rm T}1 - {\rm T}2 \vert \le 300$ milliseconds) and ct4 ($\vert {\rm T}2 - {\rm T}3 \vert \le 1$ second). This hypothetical reasoning is represented on the right side of Fig. 11.

Figure    Fig. 11. Two possible sequences of steps to conclude about monophase fast reclosure at instant T3.

The scenario illustrated in Fig. 11 shows that the trainee does not explicitly conclude the concept cs6 (see Fig. 10). However, since he concludes about cs8 based on all the premises needed to conclude it, DiagTutor will accept that reasoning as a valid one. Furthermore, DiagTutor will infer that the trainee applied concept cs6 correctly and will increase the corresponding variable in the user model.
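The acceptance of reasoning at an intermediate granularity can be illustrated by expanding each conclusion into the events and temporal constraints it ultimately rests on. In the Python sketch below, the rule for cs6 follows the text, while the rule for cs8 (cs6 together with ce2 and ct4) is an assumption about the content of Fig. 10; requiring an exact match of the expanded premises, as well as all function names, are also assumptions of this sketch.

```python
from typing import Dict, FrozenSet, Set

# Higher-granularity model: conclusion -> premises.
# The cs8 entry is an assumed decomposition, used only for illustration.
MODEL: Dict[str, FrozenSet[str]] = {
    "cs6": frozenset({"ce1", "ce4", "ct1"}),
    "cs8": frozenset({"cs6", "ce2", "ct4"}),
}


def leaves(concept: str) -> Set[str]:
    """Expand a concept into the events and temporal constraints it rests on."""
    if concept not in MODEL:
        return {concept}
    result: Set[str] = set()
    for premise in MODEL[concept]:
        result |= leaves(premise)
    return result


def check_table(premises: Set[str], conclusion: str) -> bool:
    """Accept a prediction table whose premises expand to what the conclusion needs."""
    if conclusion not in MODEL:
        return False
    expanded: Set[str] = set()
    for premise in premises:
        expanded |= leaves(premise)
    return expanded == leaves(conclusion)


def skipped_steps(conclusion: str, premises: Set[str]) -> Set[str]:
    """Intermediate conclusions the trainee applied implicitly."""
    found: Set[str] = set()
    for premise in MODEL.get(conclusion, frozenset()):
        if premise in MODEL and premise not in premises:
            found.add(premise)
            found |= skipped_steps(premise, premises)
    return found
```

With the premises `{"ce1", "ce4", "ce2", "ct1", "ct4"}` and the conclusion `"cs8"`, `check_table` accepts the table and `skipped_steps` reports `{"cs6"}`, the concept whose user-model variable would then be credited.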

To fill the fields of the prediction tables, the trainee uses a pull-down menu adjacent to each field. The set of items in the pull-down menu is dynamic and depends on the expertise level of the trainee. A trainee who is starting training has fewer options to fill the prediction table fields. As the trainee gains expertise, the set of options available for each field increases. This adaptive behavior is based on the contents of the trainee model.

During problem solving, DiagTutor presents all correct inputs in the prediction tables in green and the wrong ones in red. In case of wrong entries, the trainee can ask "What is wrong?" and DiagTutor answers with a hint to help the trainee overcome the difficulty. If the trainee asks for help again about the same error, the tutor supplies hints with increasing detail. The tutor keeps track of the sequence of hints already presented in order to avoid showing repeated hints.

DiagTutor supplies another kind of help: the trainee can ask "What to do next?". This kind of help is available only when there is no red entry in the prediction table.
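The two kinds of help can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class and function names, the error identifier, and the hint texts are hypothetical, and returning `None` once all hints for an error have been shown is a choice made for the sketch, since the text does not say what happens in that case.

```python
class HintSequencer:
    """Serves hints per error, from general to detailed, never repeating one."""

    def __init__(self, hints_by_error):
        # Maps an error id to its hints, ordered by increasing detail.
        self._hints = hints_by_error
        # Number of hints already shown for each error id.
        self._shown = {}

    def what_is_wrong(self, error_id):
        hints = self._hints[error_id]
        shown = self._shown.get(error_id, 0)
        if shown >= len(hints):
            return None  # assumption: nothing left to show for this error
        self._shown[error_id] = shown + 1
        return hints[shown]


def can_ask_what_next(entry_colors):
    """'What to do next?' is offered only when no entry is marked red."""
    return "red" not in entry_colors


# Hypothetical error id and hint texts, for illustration only.
tutor = HintSequencer({
    "wrong_constraint": [
        "Check the temporal constraint of this table.",
        "Compare the instants of the two breaker events.",
    ],
})
print(tutor.what_is_wrong("wrong_constraint"))
print(tutor.what_is_wrong("wrong_constraint"))
print(can_ask_what_next(["green", "red"]))
```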

This example is not one of the most complex presented to the trainee. In the final phase of the diagnosis training, the operator is faced with several incidents taking place during the same time interval and having consequences in more than two plants.

## Tutoring Module for Restoration Training

### 4.1 Restoration Training Issues

The management of a power system involves several distinct entities, responsible for different parts of the network. Power system restoration requires close coordination between generation, transmission, and distribution personnel, and their actions should be based on careful planning and guided by adequate strategies [14].

In the Portuguese transmission network, four main entities can be identified: the National Dispatch Center (CG), responsible for the energy management and the thermal generation; the Operational Center (CO), controlling the transmission network; the Hydroelectric Control Center (CTCH), responsible for the remote control of hydroelectric power plants; and the Distribution Dispatch (EDIS), controlling the distribution network. It is important to note that several companies are involved.

The power restoration process is conducted by these entities in such a way that the parts of the grid they are responsible for are gradually led to their normal state, by performing the actions specified in detailed operating procedures and fulfilling the requirements defined in previously established protocols. This process requires frequent negotiation between entities, agreement on common goals to be achieved, and synchronization of the separate action plans at well-defined moments.

Training programs should take this fact into account by providing an environment where these different roles can be performed and intensively trained. Traditionally, this requirement has been met by the use of training simulators. These systems are nowadays quite apt at describing accurately the power systems' behavior and representing the system's performance realistically. It is possible to turn them into the core of a training environment with great realism.

However, training programs based solely on training simulators have several drawbacks. The preparation of these training sessions typically requires several days of work from specialized training staff. For the simulation to be convincing, at least four control center operators must be moved away from their workplace for several days, so operators usually attend no more than two training sessions per year. Another facility usually absent from a simulator-based training session is the capability to accurately evaluate the trainees' knowledge level and learning evolution.

Some of these operator training simulators are built with the fragmented structure of the control hierarchy in mind, so that the training reflects it [15]. Therefore, they have basic provisions to emulate that environment. The roles of the different control centers are emulated by one or more instructors in a somewhat sketchy and cumbersome way.

The role of a simulation facility for the training of Power Systems restoration procedures and techniques is undeniable. The same can be said of several other areas addressed by ITSs. Systems like Tactical Action Officer (TAO) [13] make extensive use of simulation to provide tactical action officer students in the US Navy with practice-based and individualized instruction.

Having a full-scale simulator at hand can obviously be convenient when building a power system restoration training system, but do we really need a full-blown simulator for that? In fact, provided that the purpose is not to accurately describe the network behavior but only to lend enough realism to the training environment, limited simulation capabilities may be good enough to add a sense of realism to the tutoring process, which is consistent with the conclusions of some recent research [16].

The purpose of this tutoring system is to allow the training of the established restoration procedures and the drilling of some basic techniques. Power system utilities have built detailed plans containing the actions to execute and the procedures to follow in case of incident. In the case of the Portuguese network, there are specific plans for the system restoration following several cases of partial blackouts as well as national blackouts, with or without loss of interconnection with the Spanish network. Table 3 illustrates a service restoration plan.

Table 3. Restoration Plan Example

In this section, we describe how we developed a training environment able to deal adequately with the training of the procedures, plans, and strategies of power system restoration, using what may be called lightweight, limited-scope simulation techniques. The purpose of this environment is to make available to the trainees, in an expeditious and flexible way, all the knowledge accumulated during years of network operation and translated into detailed power system restoration plans and strategies. The embedded knowledge about procedures, plans, and strategies should be easy to revise whenever new field tests, postincident analyses, or simulations supply new data.

This training environment aims to combine the traditional strengths of Intelligent Tutors with some of the simulation capabilities of Operator Training Simulators.

### 4.2 Multiagent System

Several agents personify the four entities that are present in the power system restoration process: Operational Center (CO), National Dispatch (CG), Hydroelectric Generation (CTCH), and Distribution Dispatch (EDIS). As Fig. 12 shows, the four agents behave like virtual CC operators.

Figure    Fig. 12. CoopTutor multiagent architecture.

The multiagent approach was chosen because it is the most natural way of translating the real-life roles and the split of domain knowledge and performed functions that can be witnessed in the actual power system. Several entities responsible for separate parts of the whole task must interact in a cooperative way toward the fulfillment of the same global purpose. Agent technology has been considered well suited to domains where the data are split, physically or logically, among distinct entities that must interact with one another to pursue a common goal [17].

These agents can be seen as virtual entities that possess knowledge about the domain. Like real operators, they have tasks assigned to them, goals to be achieved, and beliefs about the network status and the other agents' activity. They work asynchronously, performing their duties simultaneously and synchronizing their activities only when the need arises. Therefore, the system needs some kind of facilitator (the simulator in Fig. 12) that supervises the process, ensuring that the simulation is coherent and convincing.

In our system, the trainee can choose to play any of the available roles, namely the CO and the CG ones, leaving to the tutor the responsibility of simulating the other participants.

The ITS architecture was designed so that future upgrades of the involved entities or the inclusion of new agents are simple tasks.

### 4.3 Trainee's Model

The method used to model the trainee's domain knowledge is a variation of the Constraint-Based Modeling (CBM) technique [18]. This student modeling technique is based on the assumption that diagnostic information is extracted not from the sequence of the student's actions but from the situation, also described as the problem state, that the student arrived at. Hence, the student model should represent not the student's actions but the effects of these actions. Because the space of false knowledge is much larger than the space of correct knowledge, the use of an abstraction mechanism based on constraints was suggested. In this representation, a state constraint is an ordered pair (Cr, Cs), where Cr stands for the relevance condition and Cs for the satisfaction condition. Cr identifies the class of problem states in which the constraint is relevant, and Cs identifies the class of relevant states in which the constraint is satisfied. Under these assumptions, domain knowledge can be represented as a set of state constraints. A correct solution to a problem cannot violate any of the constraints. A violation indicates incomplete or incorrect knowledge and constitutes the basic piece of information on which the Student Model is built.

This CBM technique does not require an expert module and is computationally undemanding because it reduces student modeling processing to a basic pattern matching mechanism [19]. One example of a state constraint, as used in our system, can be found below:

If any circuit breaker is closed in a substation in automatic mode, then that circuit breaker must have been closed by the Automatic Operator. Otherwise, Error #10 will be raised.

Each violation of a state constraint like the one above enables the tutor to intervene either immediately or at a later stage, depending on the seriousness of the error or the pedagogical approach that was chosen.
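A state constraint of this kind can be sketched in Python as a pair of predicates over the problem state. This is an illustrative sketch, not the system's code: the `StateConstraint` class, the representation of the state as a list of breaker records, and the field names `closed`, `substation_mode`, and `closed_by` are assumptions. Only the rule itself and the error number come from the example above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StateConstraint:
    error_id: int
    relevance: Callable[[list], bool]     # Cr: is the constraint relevant here?
    satisfaction: Callable[[list], bool]  # Cs: is it satisfied in this state?


def _closed_in_automatic(state):
    """Breakers that are closed and belong to a substation in automatic mode."""
    return [b for b in state if b["closed"] and b["substation_mode"] == "automatic"]


# Constraint from the text: a breaker closed in an automatic-mode substation
# must have been closed by the Automatic Operator, otherwise Error #10.
ERROR_10 = StateConstraint(
    error_id=10,
    relevance=lambda state: len(_closed_in_automatic(state)) > 0,
    satisfaction=lambda state: all(
        b["closed_by"] == "automatic_operator" for b in _closed_in_automatic(state)
    ),
)


def violated_constraints(constraints, state):
    """Pattern matching step: relevant constraints that are not satisfied."""
    return [
        c.error_id
        for c in constraints
        if c.relevance(state) and not c.satisfaction(state)
    ]


# Hypothetical problem state reached by the trainee.
state = [
    {"closed": True, "substation_mode": "automatic", "closed_by": "trainee"},
    {"closed": False, "substation_mode": "automatic", "closed_by": None},
]
print(violated_constraints([ERROR_10], state))
```

Note that only the state reached is inspected; the sequence of actions that led to it plays no role, which is the central assumption of CBM described above.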

This technique gives the tutor the flexibility needed to address trainees with a wide range of experience and knowledge, tailoring the degree and type of support in a much finer way. At the same time, it spares us the exhaustive monitoring and interpretation of the student's errors over an extended period that alternative methods would require.

Nevertheless, a metaknowledge layer proved necessary in order to adapt the CBM method to an essentially procedural, time-dependent domain like power system restoration. In fact, the validity of certain constraints may be limited to only parts of the restoration process. In addition, the violation of a constraint can, in certain cases, render irrelevant the future verification of other constraints. Finally, constraints that are equally valid in a certain state of the process can have different relative importance from the didactic point of view. This fact suggests the convenience of establishing a constraint hierarchy.

This metaknowledge layer is composed of rules that control the constraints' application, depending on several issues: the phase of the restoration process in which the trainee is; the constraints previously satisfied; and the set of constraints triggered simultaneously.

These rules establish a dependency network between constraints that can be represented by a graph (Fig. 13) [20]. The nodes 1-15 represent constraints. The relationships between constraints expressed by this graph can be of precedence, mutual exclusion, or priority.

Figure    Fig. 13. Constraint dependency graph.

For example, prior to the satisfaction of the R1 and R9 constraints (see Table 4), it does not make practical sense to verify all the other constraints. These two constraints deal with the need to assure that some preconditions are met in order to start the restoration process. Only when they are satisfied are the remaining constraints inserted in the constraint knowledge base. This relationship is expressed by the following metarule:

meta_rule(1, satisfied, [2,3,4,5,6,7,8,9,10,11,12,13,14,15], insert).

Another metarule states that when the constraint R14 is violated simultaneously with R7 and R8 (see Table 4), only R14 should be addressed, for didactic reasons. Although all the constraints are relevant, the system chooses to present only the most critical one in order to limit the student's cognitive load. This inhibition relationship is expressed by

meta_rule(14, violated, [7,8], inhibit).

Sometimes, it makes sense to let external events have an impact on the set of available constraints. This is the case of the metarule below, which states that, after the end of the automatic restoration process, R10 and R13 must be removed because they are now counterproductive:

meta_rule(restorationFinished, _ , [10,13], remove).

The constraints R10 and R13 deal with restrictions concerning substations in automatic mode. These restrictions no longer make sense when the last task assigned to the operators is precisely to check all the circuit breakers that should have been closed by automatic means but, for some reason, are still open. In other words, the end of the automatic restoration process does not rule out the need for manual adjustments, even in installations that are normally in automatic mode.
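The effect of the three metarules can be sketched with a small interpreter. The Python code below is an illustration under assumptions, not the system's rule engine: the tuple encoding, the function name, and the use of `None` for the anonymous variable `_` are choices made for the sketch, and it follows the first metarule as printed, which is triggered by constraint 1 alone.

```python
# Each entry mirrors meta_rule(Trigger, Status, Targets, Action) from the text.
# None stands for the anonymous variable "_" (any status).
META_RULES = [
    (1, "satisfied", list(range(2, 16)), "insert"),
    (14, "violated", [7, 8], "inhibit"),
    ("restorationFinished", None, [10, 13], "remove"),
]


def apply_meta_rules(active, status, events=()):
    """Return (active constraints, violated constraints to present).

    active: ids of the constraints currently in the knowledge base.
    status: maps a constraint id to "satisfied" or "violated".
    events: names of the external events that have occurred so far.
    """
    active = set(active)
    inhibited = set()
    for trigger, required, targets, action in META_RULES:
        if isinstance(trigger, str):
            fired = trigger in events  # the trigger is an external event
        else:
            fired = trigger in active and status.get(trigger) == required
        if not fired:
            continue
        if action == "insert":
            active |= set(targets)
        elif action == "remove":
            active -= set(targets)
        elif action == "inhibit":
            inhibited |= set(targets)
    to_present = sorted(
        c for c in active
        if status.get(c) == "violated" and c not in inhibited
    )
    return active, to_present


active, _ = apply_meta_rules({1}, {1: "satisfied"})
status = {1: "satisfied", 7: "violated", 8: "violated", 14: "violated"}
print(apply_meta_rules(active, status)[1])
```

With R7, R8, and R14 violated at the same time, only R14 is presented; if R14 is not violated, violations of R7 and R8 are presented normally.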

Table 4. Constraint Examples

### 4.4 The Cooperative Learning Environment

This tutor is able to train individual operators as if they were in a team, surrounded by virtual "operators," but is also capable of dealing with the interaction between several trainees engaged in a cooperative process. It provides specialized agents to fulfill the roles of the missing operators and, at the same time, monitors the cooperative work, stepping in when a serious imbalance is detected. It is not the first time that a multiagent system is used to support a cooperative training environment [21]. Our system, nevertheless, not only uses agents to support the cooperative process of interaction but also includes agents to perform vital roles in the simulation of the restoration environment. The tutor can be used as a distance learning tool, with several operators being trained at different locations.

To support the tutor monitoring activities of the cooperative discussion and decision processes, several provisions were made in order to be able to accurately model the interactions between trainees. The core data contained in the student model have been complemented with information concerning the quantity and characteristics of the interactions detected between trainees. The data are gathered by the tutor by means of a loose monitoring of the interaction patterns coupled with a surface-level analysis of the message contents.

The tutor intervenes on its own initiative only if it detects a clear imbalance in the discussion process or a continued trend of passive behavior [22]. It may also be called to step in by the trainees themselves, if they cannot agree on a course of action or if they find themselves at an impasse. In the former case, the tutor will use the knowledge contained in the CBM module to evaluate the divergent proposals. In the latter case, it will combine the constraint satisfaction data previously gathered with procedural knowledge containing the sequence of the specific restoration plan, in order to issue recommendations about the next step to fulfill. In order to be able to monitor the interaction between students, the tutor, although lacking natural language understanding capabilities, requires only a minimal degree of message formalization [23].
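One simple way to detect an imbalance in the discussion is to compare each trainee's share of the messages exchanged. The Python sketch below is only an assumption about how such loose monitoring could work: the text does not describe the actual measure, and the function names, the message format, and the 20 percent threshold are hypothetical.

```python
from collections import Counter


def participation_shares(messages, trainees):
    """Fraction of the messages sent by each trainee.

    messages: list of (sender, text) pairs; the text is not analyzed here.
    """
    counts = Counter(sender for sender, _ in messages)
    total = sum(counts[t] for t in trainees)
    if total == 0:
        return {t: 0.0 for t in trainees}
    return {t: counts[t] / total for t in trainees}


def passive_trainees(messages, trainees, min_share=0.2):
    """Trainees whose share of the discussion is below min_share (assumed threshold)."""
    shares = participation_shares(messages, trainees)
    return sorted(t for t, share in shares.items() if share < min_share)


# Hypothetical chat log: the CO trainee dominates the discussion.
log = [("CO", "proposal")] * 9 + [("CG", "ok")]
print(passive_trainees(log, ["CO", "CG"]))
```

In this sketch, a silent session marks every trainee as passive, which a real tutor would probably treat as a separate case.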

The general aspect of the ITS interface is depicted in Fig. 14. It shows three main areas: the high-voltage transmission network (bottom-left side), the substation synoptic description (bottom-right side), and a cooperative work chatroom (top).

Figure    Fig. 14. CoopTutor interface.

## Conclusions

This paper described how an Intelligent Tutoring System can be used for the training of Power Systems Control Center operators in two main tasks: Incident Analysis and Diagnosis and Service Restoration. Several Artificial Intelligence (AI) techniques were joined to obtain an effective Intelligent Tutoring environment, namely Multiagent Systems, Neural Networks, Constraint-based Modeling, Intelligent Planning, Knowledge Representation, Expert Systems, User Modeling, and Intelligent User Interfaces.

The developed system is used in the training of Electrical Engineering BSc students who are the prime candidates to become CC operators. Note that CC operator teams are frequently renewed because the job is quite demanding, especially because of the operators' timetables.

It is quite usual for an Engineering career in a Power Systems company to start at this level, in order to gain hands-on experience and an understanding of the Power System's technical needs. The Intelligent Tutoring System is now being used in two ways: to train BSc students who may be hired by the Power System company and to train the hired operators on the job.

It is also important to note that this tutorial environment has been selected as one of the most important systems combining AI techniques to be available in the "AI-50 years" Exhibition in Portugal [24], being tried by many undergraduate students, motivating them for the Electrical Engineering and Computer Science fields.

Concerning the operators' training, the most interesting features of this environment are the following:

1. The connection with SPARSE, a legacy Expert System used for Intelligent Alarm Processing [4].
2. The use of prediction tables and different granularity levels for fault diagnosis training.
3. The use of the model tracing technique to capture the operator's reasoning.
4. The development of two tools to help the adaptation of the curriculum to the operator—one that generates training scenarios from real cases and another that assists in creating new scenarios.
5. The automatic assignment of the difficulty level to the problems.
6. The identification of the operators' knowledge acquisition factors.
7. The automatic selection of the next problem to be presented, using Neural Networks.
8. The use of Multiagent Systems paradigm to model the interaction of several operators during system restoration.
9. The use of the Constraint-based Modeling technique in restoration training.
10. The availability of an Intelligent User Interface in the interaction with the operator.

## Acknowledgments

The authors would like to thank FCT (The Portuguese Foundation for Science and Technology), AdI (Innovation Agency), and FEDER, PEDIP, POSI, POSC, and PTDC European programmes for their support in several research projects leading to the development of the work described in this paper.

## References

• 1. "System Disturbance on 4 November 2006," final report, Union for the Co-Ordination of Transmission of Electricity, Brussels, Belgium, www.ucte.org, 2007.
• 2. "Shedding Light on Blackouts—From Prevention through Restoration," IEEE Power and Energy Magazine, vol. 4, no. 5, Sept./Oct. 2006.
• 3. I. Praça, C. Ramos, Z. Vale, and M. Cordeiro, "MASCEM: A Multiagent System That Simulates Competitive Electricity Markets," IEEE Intelligent Systems, special issue on agents and markets, vol. 18, no. 6, pp. 54-60, Nov./Dec. 2003.
• 4. Z. Vale, A. Moura, M. Fernandes, A. Marques, A. Rosado, and C. Ramos, "SPARSE: An Intelligent Alarm Processor and Operator Assistant," IEEE Expert, special track on AI applications in the electric power industry, vol. 12, no. 3, pp. 86-93, May 1997.
• 5. P. Kádár, "Practical Knowledge Management in a Dispatch Center," Eng. Intelligent Systems, vol. 13, no. 4, pp. 231-236, Dec. 2005.
• 6. A. Lesgold, S. Lajoie, M. Bunzo, and G. Eggan, "SHERLOCK: A Coached Practice Environment for an Electronics Troubleshooting Job," Computer Assisted Instruction and Intelligent Tutoring Systems: Shared Issues and Complementary Approaches, J. Larkin and R. Chabay, eds., pp. 201-238, Lawrence Erlbaum Assoc., 1992.
• 7. L. Faria, Z. Vale, C. Ramos, A. Silva, and A. Marques, "Training Scenarios Generation Tools for an ITS to Control Center Operators," Proc. Intelligent Tutoring Systems Conf. (ITS '00), 2000.
• 8. J. Anderson, A. Corbett, K. Koedinger, and R. Pelletier, "Cognitive Tutors: Lessons Learned," The J. Learning Sciences, vol. 4, no. 2, pp. 167-207, 1995.
• 9. L. Faria, Z. Vale, and C. Ramos, "Diagnostic Tasks Training Based on a Model Tracing Approach," Int'l J. Eng. Intelligent Systems for Electrical Eng. and Comm., vol. 13, no. 4, pp. 223-230, 2005.
• 10. K.R. Koedinger, V. Aleven, and N.T. Heffernan, "Toward a Rapid Development Environment for Cognitive Tutors," Proc. 12th Ann. Conf. Behavior Representation in Modeling and Simulation, 2003.
• 11. V. Aleven, and K.R. Koedinger, "An Effective Meta-Cognitive Strategy: Learning by Doing and Explaining with a Computer-Based Cognitive Tutor," Cognitive Science, vol. 26, no. 2, pp. 147-179, 2002.
• 12. L. Razzaq, and N.T. Heffernan, "Scaffolding vs. Hints in the Assistment System," Proc. Eighth Int'l Conf. Intelligent Tutoring Systems, pp. 635-644, 2006.
• 13. R. Stottler, and M. Vinkavich, "Tactical Action Officer Intelligent Tutoring System (TAO ITS)," Proc. Interservice/Industry, Training, Simulation & Education Conf. (I/ITSEC '00), 2000.
• 14. M. Sforna, and V. Bertanza, "Restoration Testing and Training in Italian ISO," IEEE Trans. Power Systems, vol. 17, no. 4, pp. 1258-1264, Nov. 2002.
• 15. K. Salek, U. Spanel, and G. Krost, "Flexible Support for Operators in Restoring Bulk Power Systems," Proc. CIGRE/IEEE PES Int'l Symp. Quality and Security of Electric Power Delivery Systems (CIGRE/PES '03), pp. 187-192, Oct. 2003.
• 16. S. Nurmi, "Simulations and Learning, Are Simulations Useful for Learning?" ERNIST project, www.eun.org/eun.org2/eun/en/insight_research&Development/sub_area.cfm?sa=5811, 2004.
• 17. N. Jennings, and M. Wooldridge, "Applying Agent Technology," Applied Artificial Intelligence: An Int'l J., vol. 9, no. 4, pp. 351-361, 1995.
• 18. S. Ohlsson, "Constraint-Based Student Modeling," Student Modeling: The Key to Individualized Knowledge-Based Instruction, J.E. Greer and G.I. McCalla, eds., pp. 167-189, Springer-Verlag, 1993.
• 19. T. Mitrovic, K. Koedinger, and B. Martin, "A Comparative Analysis of Cognitive Tutoring and Constraint-Based Modeling," Proc. Ninth Int'l Conf. User Modeling (UM '03), 2003.
• 20. A. Silva, Z. Vale, and C. Ramos, "Cooperative Training of Power Systems Restoration Techniques," Proc. 13th Int'l Conf. Intelligent Systems Applications to Power Systems, Nov. 2005.
• 21. E. Blanchard, and C. Frasson, "Une Architecture Multi-Agents Pour Des Sessions D'apprentissage Collaborative," Proc. Int'l Conf. New Technology of Information and Comm., Nov. 2002.
• 22. A. Vizcaíno, "A Simulated Student Can Improve Collaborative Learning," Int'l J. Artificial Intelligence in Education, vol. 15, no. 1, pp. 3-40, 2005.
• 23. M. Rosatelli, and J. Self, "A Collaborative Case Study System for Distance Learning," Int'l J. Artificial Intelligence in Education, vol. 14, no. 1, pp. 97-125, 2004.
• 24. C. Ramos, "How Portugal Celebrated AI's 50th Anniversary," IEEE Intelligent Systems, vol. 21, no. 4, pp. 86-88, 2006.