
Enrichment of Peer Assessment with Agent Negotiation

Chung Hsien Lan
Sabine Graf
K. Robert Lai

Pages: pp. 35-46

Abstract—This study presents a conceptual framework that provides intelligent support through agent negotiation and fuzzy constraints to enhance the effectiveness of peer assessment. Fuzzy constraints provide not only a flexible marking scheme that handles the imprecision and uncertainty in representing assessments but also a computational framework for incorporating students' personal characteristics into the process to reduce assessment bias. Additionally, a fuzzy constraint-based negotiation mechanism is employed to coordinate the cognitive differences between students. Through iterative agent negotiation, students can reconcile the differences and reach an agreement on the assessment results. Thus, the proposed framework allows students to provide more detailed, informed, and less biased assessments of their peers' work. To demonstrate the usefulness and effectiveness of the proposed approach, a negotiation-based peer assessment system, NePAS, has been built and used in a classroom. Experimental results suggested that students were more willing to accept the assessment results and were able to acquire more useful information to reflect upon and revise their work. Instructors can also observe students' participation and performance to appropriately adjust instructional strategies.

Index Terms—Peer assessment, assessment bias, agent negotiation, fuzzy constraints.


Peer assessment supports group learning by motivating students to engage in deep thinking, comparison, discussion, and critical judgment of peer work. A peer assessment process includes cognitive activities such as doing assignments, reviewing, summarizing, clarifying, providing feedback, diagnosing errors, identifying missing knowledge or deviations, and evaluating the quality of peers' work [1], [2], [3]. Students are involved in both assessment and learning processes. When marking peers' work, students review the ideas of their peers and recognize mistakes they have made in their own work. When receiving feedback, students can reflect upon it, attempt to acquire the missing knowledge, and then revise their own work. Thus, peer assessment has been widely recognized as a learning tool for improving students' performance in collaborative learning environments.

Numerous researchers have investigated the effectiveness of computer-based peer assessment systems in various learning scenarios [4], [5]. Davies developed a computer program to adjust the scores awarded to the coursework of others and to encourage students to review the coursework of their fellow students diligently and fairly [6]. Sitthiworachart and Joy designed a web-based peer assessment system that successfully assists students in developing their understanding of computer programming [3]. Liu et al. developed a networked knowledge management and evaluation system that emphasizes expressing ideas via oral presentations and writing, accumulating wisdom via discussion, critical thinking via peer assessment, and knowledge construction via project work. Students' achievement increased significantly as a result of the peer assessment process, and the number of students willing to take part in learning activities also increased significantly [7]. While these earlier systems have demonstrated that students benefit significantly from peer assessment, some studies have also revealed that students may lack the ability to evaluate peers' work or may not take their role seriously, allowing friendships and personal characteristics to influence the marks they give to peers' coursework [8], [9], [10]. Further, students often encounter difficulties in interpreting assessment criteria and lack the confidence to give exact marks or scores to peers' work [11], [12]. These obstacles often result in subjective and unfair assessments and limit the use of peer assessment.

This study presents a conceptual framework that provides intelligent support through agent negotiation and fuzzy constraints to enhance the effectiveness of peer assessment. In this framework, assessments are represented as fuzzy membership functions to deal with the inexactness of marking and its subjective nature. Fuzzy constraints provide not only a flexible marking scheme for representing assessments but also a computational framework for incorporating students' personal characteristics into the process to reduce assessment bias. Additionally, a fuzzy constraint-based negotiation mechanism is employed to coordinate the cognitive differences between students. Through iterative agent negotiation, students can reconcile the differences and reach an agreement on the assessment results. Thus, the proposed framework can provide more detailed, informed, and less biased assessments and make students more inclined to accept the results and to reflect upon and revise their work. Instructors can also observe students' participation and performance to appropriately adjust instructional strategies.

The remainder of this paper is organized as follows: Section 2 introduces the proposed conceptual framework for an enriched peer assessment process. Section 3 presents a Negotiation-based Peer Assessment System, NePAS, and then uses a walk-through example to illustrate the proposed peer assessment process with NePAS. Section 4 depicts the experimental results and evaluation of the questionnaire results. Finally, Section 5 draws the conclusion.

Enrichment of a Peer Assessment Process

Group learning emphasizes the importance of interaction [13]. However, it is worth reflecting on how effective the interactions in a peer assessment process can be. According to several previous studies, the advantages of peer assessment include more in-depth contact with the course material for knowledge interpretation; prolonged interaction between peers, which allows constructive feedback based on multiple observations of performance; and the opportunity to develop critical reasoning skills [14], [15], [16] and self-directed learning [17]. In other words, the more peer assessment students do, the better their own results become [18]. Through a student-involved and interactive process, students' interpretation and reflection can be enhanced, and instructors can also improve their understanding of student performance by observing students' interaction. However, there are inherent challenges in using peer assessment [8], [9], [10], [19].

  • Students may lack confidence and competence in evaluating peers' work or may not be prepared to be critical, and thus instructors must devote time to guide students on what and how to effectively assess peers' work.
  • The existence of personal bias has been confirmed and needs to be considered in peers' marking; it arises from interpersonal relationships between students and from personal characteristics.
  • Students may not have control over the whole assessment process, and thus they may disagree with the ratings given by instructors or other peers.
  • Students have difficulties in comprehending how to reflect on their work if assessment results are only given as scores without textual feedback.

In order to enhance student interpretation, reflection, and acceptability, an interactive assessment method is necessary. On the one hand, to enable students to comprehend the material in depth, assessment criteria must be carefully defined and interpreted through student interaction or instructor explanation. If students do not understand the assessment criteria, they cannot provide accurate and complete feedback. By exploring peers' ideas, students can acquire missing knowledge, look for patterns, develop a critical and investigative attitude, and understand their course materials better. It is therefore critical to define appropriate assessment criteria at the beginning of peer assessment. Hattum-Janssen and Lourenco indicated that most students can assess the performance of their peers accurately provided that the assessment criteria are extensively discussed by the students [16].

On the other hand, attitude, competence, relationship conflict, culture, preferences, experiences, abilities, social styles, and learning styles affect the peer assessment process [10], [12]. Individuals with different abilities and attitude levels contribute differently when evaluating and providing critical feedback. Personal characteristics and bias influence the accuracy of the assessment. Considering these factors and decreasing individual influence on the overall assessment therefore has high potential to provide more accurate feedback for students.

In addition to the consideration of assessment bias, it is also critical to represent and coordinate assessments in order to produce feedback that truly reflects students' achievement. In general, student feedback is gathered through a forced-choice questionnaire, where students can give a score or offer comments following specific criteria [10]. However, students often lack experience in assessment and in interpreting assessment criteria, which makes grading difficult. Moreover, conventional grading methods cannot completely represent students' assessments, and computational methods for coordinating grades, such as averaging and summing, often preclude the consideration of personal bias and preference and therefore result in subjective and unfair assessments. In order to effectively represent and coordinate students' assessments, it is critical to improve the grading methods and apply an intelligent mechanism that automatically coordinates differences in assessment.

After students' assessments have been coordinated, an important issue is how to represent the assessment results. Mostly, assessment results are described using one dimension (e.g., scores or levels). Such feedback makes it difficult for students to reflect on their own work because it carries insufficient information. Therefore, it is necessary to provide students with rich, multidimensional information that helps them reflect on and revise their work.

To alleviate the aforementioned weaknesses in peer assessment, we present a conceptual framework for the enrichment of a peer assessment process as shown in Fig. 1.


Fig. 1. A conceptual framework for the enrichment of a peer assessment.

2.1 Exploration of Assessment Criteria

In several studies [9], [16], [20], assessment criteria have been decided by instructors in order to simplify the marking process. However, identical instructional strategies and resources may still lead students to develop different abilities in learning, absorbing, and elaborating knowledge. Students sometimes do not fully understand these assessment criteria, which may foster superficial learning or incorrect evaluation [16]. In order to enable students to understand and internalize course materials and to foster deep learning, assessment criteria can be explored through students' involvement and interaction. Students are then able to critically reflect on what they have learned and use this content to think about assessment criteria.

Students may participate either by deciding the assessment criteria entirely on their own or by deciding them together with instructors. If only students are involved in defining assessment criteria, they may be unable to converge on a set of criteria. Therefore, our proposed framework calls for instructors and students to explore the assessment criteria together.

Assume that there are $k$ students who are involved in the peer assessment activity. At the beginning, according to the content of each subject and assignment, the instructor provides a set of assessment criteria for students. After exploring the detailed description of the initial assessment criteria provided by the instructor and discussing them with fellow students, each student $t$ can then assign a priority to each criterion and can also propose his/her own assessment criteria, together denoted by $\chi^{t}$. All assessment criteria $\bar{\cal X} = \bigcup_{t = 1}^{k} \chi^{t}$ and their descriptions can then be sent to the system. Based on the priorities submitted by each student, the system selects a final set of assessment criteria denoted by ${\rm X} = \{ {\rm X}_1 , \ldots, {\rm X}_n \}$, where $n$ is the number of assessment criteria and ${\rm X}$ is a subset of $\bar{\cal X}$, i.e., ${\rm X} \subseteq \bar{\cal X}$. The exploration process can enhance students' knowledge by exposing them to the ideas and suggestions of peers and the instructor. Interactive discussion also gives students a better comprehension of course materials and a better understanding of the assessment criteria, and helps them do a better job of self-assessment and of marking peers' work.
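The selection step described above can be sketched in a few lines of Python. This is only an illustration: the text does not state how the system combines the submitted priorities, so the rule used here (sum the priorities per criterion and keep the top $n$) is an assumption, as are the function name `select_criteria` and the sample rankings.

```python
from collections import defaultdict

def select_criteria(priorities_by_student, n):
    """Return the n criteria with the highest total priority.

    priorities_by_student: list of dicts, one per student, mapping a
    criterion name to that student's priority (higher = more important).
    A criterion that a student did not rank contributes 0 for that student.
    Assumed rule: sum of priorities; the source does not specify one.
    """
    totals = defaultdict(float)
    for ranking in priorities_by_student:
        for criterion, priority in ranking.items():
            totals[criterion] += priority
    # Sort by total priority (descending), then by name as a stable tie-break.
    ordered = sorted(totals, key=lambda c: (-totals[c], c))
    return ordered[:n]

# Hypothetical rankings from three students; criteria may also be
# proposed by a student (they simply appear in that student's dict).
rankings = [
    {"Content": 5, "Graphics": 4, "Readability": 3, "Hyperlinks": 1},
    {"Content": 4, "Readability": 5, "Superstructure": 3},
    {"Superstructure": 5, "Graphics": 4, "Content": 3, "Promotion": 1},
]
final_criteria = select_criteria(rankings, n=4)
print(final_criteria)
```

The union of the keys across all dictionaries plays the role of the full criteria pool, and the returned list is the selected subset.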

2.2 Assessment Representation

According to the assessment criteria, students evaluate peers' work and self-assess their own work. Generally, students' assessments are represented with numerical scores or scales [9]. However, with such a marking method it is difficult to express the incompleteness or uncertainty of students' assessments and to provide more detailed feedback. Approaches that support more dimensions, such as fuzzy sets or Bayesian probability, can present more information and address this imprecision and uncertainty.

Owing to the vague and subjective concepts involved when students assess peers' work, fuzzy constraints are well suited to representing imprecise and uncertain information [21]. Thus, the proposed framework uses fuzzy constraints that represent students' cognition and preferences to aggregate the entire assessment effectively, lower the difficulty of representation, and reduce individual subjectivity. Fuzzy constraints not only provide the ability to represent and reason about uncertain information [22] but also build a unified model that defines the assessment of each student and the relationships among all students.

In the proposed framework, students' assessments are represented as fuzzy constraints with two dimensions: scores and satisfaction degrees. Since many constraints are involved in the assessment process, a constraint network can be used to represent a collection of issues interlinked by a set of constraints that specify relationships which must be satisfied by the values assumed by these issues. However, most conventional constraint systems are unable to capture the meaning of imprecise concepts. A fuzzy constraint network (FCN) [22], as an extension of fuzzy sets [23] and constraint networks, can be used to provide a conceptual framework for representing the meaning of imprecise knowledge. A generic fuzzy constraint associated with a set of assessment criteria is also a fuzzy set, and it is interpreted as the joint possibility distribution for the issues involved in the constraint. Because real-world environments are heterogeneous and inherently distributed, a distributed fuzzy constraint network (DFCN) is more appropriate for providing a formal framework for modeling assessment representation [21]. Each FCN has its own internal fuzzy constraints between issues, and external fuzzy constraints exist between FCNs. Thus, the assessments submitted by a group of students can be regarded as a DFCN. According to the definition of a distributed fuzzy constraint network, $\aleph^p = ({\cal U}^p ,{\rm X}^p ,{\rm C}^p )$ is an FCN $p$, where $p = 1, 2, \ldots, k$; ${\cal U}^p$ is a universe of discourse for FCN $p$; ${\rm X}^p$ is a set of $n$ nonrecurring assessment criteria ${\rm X}_1 , \ldots, {\rm X}_n$; ${\rm C}^p$ is a set of fuzzy constraints, which is the union of a set of internal fuzzy constraints ${\rm C}^{p_i}$ existing among assessment criteria in ${\rm X}^p$ and a set of external fuzzy constraints ${\rm C}^{p_e}$; and $\aleph^p$ is connected to other FCNs by ${\rm C}^{p_e}$.
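To make the triple (universe, criteria, constraints) more concrete, the following Python sketch models one student's FCN as a small data structure. It is a simplification under stated assumptions: each internal constraint is reduced to a one-dimensional membership function per criterion, and external constraints are represented only by the names of the connected FCNs. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Membership = Callable[[float], float]  # score -> satisfaction degree in [0, 1]

@dataclass
class FuzzyConstraintNetwork:
    """Simplified stand-in for one FCN: universe, criteria, constraints."""
    student: str
    universe: Tuple[float, float]                  # admissible score range
    criteria: List[str]                            # assessment criteria
    internal: Dict[str, Membership] = field(default_factory=dict)
    external: List[str] = field(default_factory=list)  # connected FCNs (assumed form)

    def satisfaction(self, criterion: str, score: float) -> float:
        """Satisfaction degree of a score for one criterion."""
        low, high = self.universe
        if not low <= score <= high:
            return 0.0  # scores outside the universe of discourse are rejected
        return self.internal[criterion](score)

# Hypothetical FCN for student I with a single triangular-shaped constraint.
fcn_i = FuzzyConstraintNetwork(
    student="I",
    universe=(0.0, 100.0),
    criteria=["Content"],
    internal={"Content": lambda s: max(0.0, 1.0 - abs(s - 80.0) / 10.0)},
    external=["J", "K"],
)
print(fcn_i.satisfaction("Content", 85.0))
```

A group of such objects, one per student, linked through their `external` lists, corresponds loosely to the distributed network described above.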

Students review and then mark peers' work with fuzzy membership functions for each assessment criterion. The type of fuzzy membership functions can be triangular, trapezoidal, or Gaussian [ 24], [ 25]. This marking scheme can sufficiently reveal imprecise or uncertain cognition and allow students to mark peers' work more easily.
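The three membership function types named above have standard textbook forms, sketched here in Python. The parameter names and the example marks are illustrative assumptions; the parameters actually collected by the system are not specified in this section.

```python
import math

def triangular(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: support [a, d], plateau [b, c] (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def gaussian(x, center, sigma):
    """Gaussian membership centered at `center` with spread `sigma` > 0."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

# Hypothetical marks: "about 80" expressed three different ways.
for score in (70, 75, 80, 85, 90):
    print(score,
          round(triangular(score, 70, 80, 90), 3),
          round(trapezoidal(score, 60, 70, 80, 90), 3),
          round(gaussian(score, 80, 5), 3))
```

Each function maps a candidate score to a satisfaction degree between 0 and 1, which is the two-dimensional representation (score, satisfaction degree) used in the framework.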

2.3 Reduction of Assessment Bias

Students are independent and autonomous individuals who have different characteristics and abilities in playing their role in the assessment process. Several studies have been conducted on peers' personal bias while evaluating other peers. May and Gueldenzoph [10] studied the effect of social styles. Sherrard and Raafat [36] noted that women gave significantly better grades than men. Tziner et al. [37] found that assessors who are high on conscientiousness were more likely to discriminate among students and were less likely to give high ratings. Lin et al. [4] indicated that students with high executive thinking styles contributed substantially better feedback than their low executive counterparts. Owing to differences in personal characteristics, some students find certain assessment criteria easy to apply and can assess their peers' work accurately on them, but may lack the ability to evaluate peers' work on other criteria. Therefore, individual characteristics, preferences, experiences, culture, social styles, and learning styles may cause assessment bias. In order to reduce the assessment bias and to improve the assessment quality, individual characteristics should be considered when adjusting the assessment representation.

Considering assessment bias, the importance or representativeness of each student's assessment may be different. Thus, the proposed framework adjusts students' assessment according to their characteristics. $\mu_{{\rm C}^p_{\rm q} } ( \cdot )$ is the satisfaction degree of the constraint ${\rm C}^p_q$ of the student $p$ over assessment criterion $q$ . The adjusted satisfaction degree can be defined as $\mu_{{\rm C}^p_q } ({\rm u})^{\varpi^p_q}$ , where $\varpi^p_q$ is the contribution of the student $p$ for the assessment criterion $q$ . In order to reduce assessment bias, $\varpi^p_q$ can be determined based on students' characteristics.
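The effect of the exponent can be seen in a short Python sketch. Because a satisfaction degree lies in $[0, 1]$, an exponent above 1 lowers every value below the peak, while an exponent below 1 raises it; the peak itself (degree 1) is unchanged. The function name and the sample values are assumptions for illustration, and how $\varpi^p_q$ is derived from student characteristics is not modeled here.

```python
def adjusted_satisfaction(mu, contribution):
    """Raise a satisfaction degree in [0, 1] to the contribution exponent."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("satisfaction degree must lie in [0, 1]")
    if contribution < 0.0:
        raise ValueError("contribution must be non-negative")
    return mu ** contribution

raw = 0.64  # hypothetical satisfaction degree of one constraint
for contribution in (0.5, 1.0, 2.0):
    print(contribution, round(adjusted_satisfaction(raw, contribution), 4))
```

With these sample values, a contribution of 1 leaves the assessment as submitted, whereas other values reshape the membership function around its unchanged peak.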

2.4 Coordination of Assessment Results

Students' cognitive differences lead to various assessment results. Most research on peer assessment adopts computational models such as the sum or the average to aggregate all assessments [12]. However, these computational models cannot explicitly represent individual differences in confidence and ability, nor do they give assessors an opportunity to interact during the assessment process.

Considering the relation of competition and collaboration among students, a negotiation mechanism is suitable for yielding potential agreements and reaching a mutually satisfactory outcome [26]. A negotiation mechanism can be used to share students' ideas in the process of carrying out a joint assessment, or to resolve outright conflict arising from assessment biases that occur due to different individual perspectives. Through negotiation, individual preferences can be considered to coordinate all assessments and produce more objective feedback. To reach consensus more quickly and to facilitate the search for possible solutions, agent negotiation [27] is adopted in the proposed framework. In agent negotiation, an agent must employ iterative message passing to explore other agents' information and find a global solution without knowing precisely the state of other agents. Each agent represents an individual student and holds information about that student's assessment, which it communicates to other agents automatically in order to find acceptable agreements for the assessment.

In the proposed framework, assessment agents are implemented to represent students to propose their interests and adopt negotiation strategies to attempt to reach an agreement during the negotiation process. Fig. 2 represents a high-level framework of assessment coordination through assessment agents and negotiation. Fig. 3 displays the workflow of offer generation for a negotiating agent.

Fig. 2. The architecture of assessment coordination.

Fig. 3. The workflow of offer generation for a negotiating agent.

Initially, each assessment agent in the negotiation can be represented as a different fuzzy constraint network, and all agents' preferences can be naturally expressed by fuzzy constraints. Agents try to reach a consensus by exchanging offers and counteroffers. When an agent receives a counteroffer, the counteroffer evaluation procedure determines whether the counteroffer is consistent with the agent's current intent. To measure how satisfactory the counteroffer is, an evaluation function must first be defined. In the proposed framework, an aggregated satisfaction value quantifies the extent to which the counteroffer is satisfactory. The aggregated satisfaction value, along with a simplified version of the fuzzy constraint-based negotiation context adapted from [21], is defined below.

  • The intent of a distributed fuzzy constraint network $({\cal U}^p ,{\rm X}^p ,{\rm C}^p )$, written $\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )}$, is an n-ary possibility distribution for the assessment criteria ${\rm X}^p$ involved in the FCN $p$, which must hold for every constraint in ${\rm C}^p$. That is,
    $$\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )} = \bigcap_{C^p_j (T_j ) \in {\rm C}^p } \overline{C^p_j} (T_j ),$$
    where, for each constraint $C^p_j (T_j ) \in {\rm C}^p$, $\overline{C^p_j} (T_j )$ is its cylindrical extension in the space ${\rm X}^p = (X^p_1 , \ldots, X^p_n )$, and $p = 1, \ldots, k$, where $k$ is the number of agents.
  • Meanwhile, ${}_{\alpha}\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )}$, the $\alpha$-level cut of $\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )}$, can be viewed as a set of solutions satisfying all the internal and external constraints in FCN $p$ simultaneously to an extent that is greater than or equal to an acceptable threshold $\alpha$.
  • For bias reduction, the contribution of agent $p$ for assessment criterion $q$, denoted by $\varpi^p_q$, needs to be considered. The overall satisfaction degree of the constraints of FCN $p$ reached by a proposal ${\bf u}$, denoted by $\mu_{\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )} } ({\bf u})$, is defined as the satisfaction degree of the least satisfied constraint. For simplification, $\mu_{\Pi_{({\cal U}^p ,{\rm X}^p ,{\rm C}^p )} } ({\bf u})$ is written as $\mu_{C^p } ({\bf u})$. That is,
    $$\mu_{C^p } ({\bf u}) = \min_{q = 1, \ldots, n} \left( \mu_{C^p_q } ({\bf u}) \right)^{\varpi^p_q },$$
    where $n$ is the number of assessment criteria.
  • The aggregated satisfaction value of the proposal u to agent $p$ for the potential agreement in $\Pi_{{\rm C}^p }$ , denoted by $\psi_{c^p } ({\bf u})$ , can be defined as a function of the values of satisfaction with the assessment criteria as
  • To find an agreement that maximizes the agents' aggregated satisfaction value at the highest possible satisfaction degree of all constraints, agents have to identify the set of feasible proposals, which is defined as
  • where u is the latest proposal and $\alpha^p_i$ is an acceptable threshold of agent $p$ .
  • The task of offer generation by agent $p$ is to make an expected proposal ${\bf u}^{\ast}$ from ${}_{\alpha^p_i}{\rm P}^p_{\bf u}$. If agent $p$ finds no expected proposal ${\bf u}^{\ast}$ in ${}_{\alpha^p_i}{\rm P}^p_{\bf u}$, then agent $p$ lowers the threshold of acceptability $\alpha^p_i$ to the next threshold $\alpha^p_{i + 1}$ and creates new feasible proposals ${}_{\alpha^p_{i + 1}}{\rm P}^p_{\bf u}$ for selection. However, assuming that agent $p$ proposes an offer ${\bf u}$ to a peer agent and the peer agent subsequently proposes a counteroffer ${\bf u}^{\prime}$ to agent $p$, agent $p$ will accept the offer ${\bf u}^{\prime}$ as an agreement if
  • A rational agent will not propose a counteroffer that is worse than the offer proposed already by a peer agent. Thus, agent $p$ will also accept the offer ${\bf u}^{\prime}$ as an agreement if
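The verbal definitions in this list (overall satisfaction as the least satisfied constraint, the $\alpha$-level cut, and the lowering of the threshold when no proposal qualifies) can be illustrated with a small Python sketch. It is a simplified reading, not the full mechanism: the feasible set here is just the $\alpha$-cut over a finite list of candidate proposals, the aggregated satisfaction value and the acceptance tests are omitted, and all names, constraints, and numbers are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def overall_satisfaction(satisfactions, contributions):
    """Least satisfied constraint after applying each contribution exponent."""
    return min(mu ** w for mu, w in zip(satisfactions, contributions))

def alpha_cut(proposals, satisfaction_of, alpha):
    """Proposals whose overall satisfaction is at least alpha."""
    return [u for u in proposals if satisfaction_of(u) >= alpha]

def first_nonempty_cut(proposals, satisfaction_of, thresholds):
    """Try thresholds in the given (descending) order until a cut is non-empty."""
    for alpha in thresholds:
        cut = alpha_cut(proposals, satisfaction_of, alpha)
        if cut:
            return alpha, cut
    return None, []

# Hypothetical agent: one triangular constraint per criterion (two criteria).
constraints = [(60, 75, 90), (70, 85, 100)]
contributions = [1.0, 2.0]

def satisfaction_of(u):
    mus = [triangular(x, *abc) for x, abc in zip(u, constraints)]
    return overall_satisfaction(mus, contributions)

proposals = [(75, 85), (70, 80), (65, 95)]  # candidate score vectors
feasible = alpha_cut(proposals, satisfaction_of, alpha=0.4)
alpha_used, relaxed = first_nonempty_cut(proposals[1:], satisfaction_of,
                                         thresholds=[0.8, 0.6, 0.4])
print(feasible, alpha_used, relaxed)
```

In the second call the agent finds nothing at thresholds 0.8 and 0.6 and only obtains a candidate after relaxing to 0.4, which mirrors the step from $\alpha^p_i$ to $\alpha^p_{i+1}$ described above.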

The development of the negotiation process is determined by the negotiation strategies [28] of the involved agents. These strategies determine how agents propose and evaluate offers to reach an agreement. Typically, each agent starts a negotiation by proposing its ideal offer. Whenever an offer is not acceptable to the other agents, the agents make concessions or find new alternatives to move toward an agreement. Therefore, concession and trade-off strategies are considered.

  • In a concession strategy, assessment agents generate new proposals for achieving a mutually satisfactory assessment by lowering their demands. The set of feasible concession proposals at the threshold $\alpha^p_i$ for the assessment agent $p$ is defined as
  • where $r$ is the concession value.
  • In a trade-off strategy, assessment agents can explore options for achieving a mutually satisfactory assessment by reconciling their interests. Assessment agents can propose alternatives from a certain solution space such that the degrees of satisfaction for the constraints associated with each alternative are greater than or equal to an acceptable threshold. The set of feasible trade-off proposals at threshold $\alpha^p_i$ for the alternatives of the assessment agent $p$ is defined as
  • A normalized Euclidean distance can be applied in establishing a trade-off strategy to measure the similarity [29] between alternatives, and thus generate the best possible offer. Hence, a similarity function is defined as
  • where ${\bf u}^{\prime} = \arg_{{\rm v^\prime }} \max_{{\rm v^\prime } \in {\rm {\bf u}^{\prime} }} (\mu_{c^p_q } ({\rm v}) - \mu_{c^p_q } ({\rm v^\prime }))$ .
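The trade-off step relies on a similarity measure between alternatives. As a minimal sketch, the Python code below assumes that scores lie on a 0 to 100 scale and takes similarity to be one minus the normalized Euclidean distance; this is one common form and an assumption here, so the exact similarity function used by the framework may differ. The function names and sample offers are hypothetical.

```python
import math

def normalized_distance(u, v, scale=100.0):
    """Euclidean distance between score vectors, normalized into [0, 1].

    Each coordinate difference is divided by the score range `scale`,
    and the squared differences are averaged over the criteria.
    """
    if len(u) != len(v):
        raise ValueError("proposals must cover the same criteria")
    squared = sum(((a - b) / scale) ** 2 for a, b in zip(u, v))
    return math.sqrt(squared / len(u))

def similarity(u, v, scale=100.0):
    """Assumed similarity: 1 minus the normalized distance."""
    return 1.0 - normalized_distance(u, v, scale)

def most_similar(alternatives, counteroffer, scale=100.0):
    """Pick the alternative closest to the peer agent's counteroffer."""
    return max(alternatives, key=lambda v: similarity(v, counteroffer, scale))

# Hypothetical feasible alternatives of one agent and a peer's counteroffer.
alternatives = [(80, 70), (70, 80), (75, 75)]
counteroffer = (72, 82)
best = most_similar(alternatives, counteroffer)
print(best, round(similarity(best, counteroffer), 4))
```

Choosing the alternative nearest to the counteroffer lets an agent move toward its peer without dropping below its own acceptability threshold, provided the alternatives were drawn from its feasible set.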

If an agreement among assessment agents cannot be reached, students' assessments or negotiation strategies must be adjusted. Once an agreement is reached, the interests of all students are considered to produce the aggregated assessment. The iterative negotiation among students is useful in reaching consensus. Moreover, the involvement of self-assessment facilitates students in accepting the assessment results.

By applying constraints to express negotiation proposals, the assessment agent can perform the negotiation with increased efficiency and determine final results for overall assessments. By this process, students typically develop a serious attitude toward their coursework. Through considering different personal characteristics, the proposed methodology is able to flexibly aggregate fuzzy constraints to improve the accuracy of peer assessment. Through offer generation and evaluation, the negotiation process considers students' characteristics and coordinates assessment results, and then produces the final feedback.

2.5 Rich Feedback

Receiving accurate and complete feedback is correlated with effective learning [30]. First, in the stage of assessment representation, students can view fuzzy membership functions, which provide complete and detailed information about peers' markings, to review their own work. Second, in the stage of coordinating assessment results, students can obtain the final assessment results to see the overall evaluation. It is critical that students can receive immediate peer feedback and observe the negotiation process to reflect upon their contribution and revise their own work. Owing to the involvement of self-assessment and the reduction of assessment bias, students are more inclined to accept the assessment results and comments and then seriously reflect upon their own work. Additionally, assessment results can also be translated into a real number through a defuzzification technique [31]. The defuzzified value can be used as the final grade. On the other hand, the interactive assessment process helps instructors perceive students' performance and attitude and thus avoid subjective judgment. By reviewing log data and monitoring the process of elaborating assessment criteria and marking, instructors can gauge students' comprehension of course materials and whether students take their marking role seriously. Therefore, the peer assessment process reveals students' performance clearly.
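The defuzzification step mentioned above can be illustrated with the centroid (center of gravity) method, a widely used technique. The specific technique is not named in this section, so the choice of the centroid is an assumption, as are the function names and the sample membership function.

```python
def triangular(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def centroid(membership, lower, upper, steps=1000):
    """Approximate the centroid of a membership function on [lower, upper]."""
    width = (upper - lower) / steps
    weighted = 0.0
    area = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th slice
        mu = membership(x)
        weighted += x * mu
        area += mu
    if area == 0.0:
        raise ValueError("membership is zero everywhere on the interval")
    return weighted / area

# Hypothetical agreed assessment: "around 80, more likely below than above".
grade = centroid(lambda x: triangular(x, 60, 80, 85), 0.0, 100.0)
print(round(grade, 2))
```

For a triangular function the centroid is the mean of its three corner points, so the skewed shape above yields a grade below its peak score of 80.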

In summary, the enriched peer assessment process enables students to enhance course interpretation, frequently interact with peers, and represent their assessments. Through the interactive process and reduction of assessment bias, assessment accuracy and quality can be improved. The overall process facilitates students in fostering critical thinking skills and reflection as well as promoting meaningful learning.

System Realization and Walk-Through Illustrative Example

To support our proposed methodology, a negotiation-based peer assessment system, NePAS, has been developed. A walk-through example is then used to illustrate a peer assessment process with NePAS. Fig. 4 displays a high-level architecture of NePAS.


Fig. 4. A high-level architecture of NePAS.

First, instructors need to define assessment criteria, which are recorded in the criteria database. Each student accesses NePAS through an assessment agent, which provides intelligent support for various assessment activities, including criteria exploration and ranking, characteristics detection, self-assessment, marking peers' work, and feedback. The coordination agent adopts a fuzzy constraint-based negotiation mechanism to resolve the cognitive differences among the assessors and the learner being assessed. The student profile is a database that collects various personal characteristics of each student, which are used to adjust the assessment and reduce personal bias. The assessment database includes students' assessment logs and coordination results.

Assume that three students ($I$, $J$, and $K$) enrolled in a course “Introduction to Computer Science” are required to complete a website design project. After students have completed and submitted their projects to the system, they move on to perform peer assessment activities. In what follows, a peer assessment process in NePAS is presented in four phases: preparation, assessment, coordination, and feedback.

3.1 Preparation Phase

Students are asked to review the assessment criteria provided by the instructor and then take part in ranking them. In this example, the instructor initially suggests Superstructure, Graphics, Use of color, Content, Readability, Page layout, Hyperlinks, and Promotion as assessment criteria for a website design, based on [32]. Students then assign a priority to each criterion, as illustrated in Fig. 5. Based on the rankings submitted by the students, NePAS selects Superstructure, Graphics, Content, and Readability as the final assessment criteria.

Fig. 5. User interface for exploration of assessment criteria.

Additionally, to collect learners' personal characteristics for the reduction of assessment bias, students also have to fill out a questionnaire. As pointed out earlier, many different aspects of personal background and characteristics could affect the behavior of assessors. At this stage, however, NePAS incorporates only the learning styles of the assessors into the process for the reduction of assessment bias. To further reduce the assessment bias, other personal characteristics could also be included in the system in the future. In NePAS, we use the Felder-Silverman learning style model (FSLSM) [33], which is one of the most often used models in recent times, and some researchers even argue that it is the most appropriate model for use in adaptive web-based educational systems [34], [35]. FSLSM characterizes each student according to his/her preference on four dimensions: active/reflective, sensing/intuitive, visual/verbal, and sequential/global. The collected data about the learning styles of each student are stored in the student profile and are accessible to each assessment agent for adjusting the assessment later on. In this example, assume that the learning style of student $I$ is found to be {active, intuitive, visual, global}, and that of students $J$ and $K$ is found to be {reflective, intuitive, visual, sequential} and {active, sensing, verbal, sequential}, respectively.

3.2 Assessment Phase

Now, students $I$, $J$, and $K$ can proceed to assess their peers' work and perform self-assessment based on the agreed criteria. Fig. 6 illustrates how marking with fuzzy constraints can be done efficiently. As indicated in the figure, the assessor first selects the type of fuzzy membership function (triangular, trapezoidal, or Gaussian) for each criterion and then fills out the required parameters (e.g., supports) and the optional parameters (e.g., satisfaction degrees) accordingly. Afterward, a graphical representation of the fuzzy membership function is displayed on the right for review and can be changed if necessary.

Fig. 6. User interface of marking with fuzzy sets.
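The three membership function types named above have standard textbook forms, sketched below. The parameter names (`a`, `b`, `c`, `d`, `center`, `sigma`) and the sample marks are illustrative assumptions; the exact parameterization used by the NePAS interface is not given in the text.

```python
import math

def triangular(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: support [a, d], full satisfaction on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def gaussian(x, center, sigma):
    """Gaussian membership centered at `center`; sigma controls the spread."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical marks on a 0-100 scale
print(triangular(65, 50, 60, 75))       # right-hand slope of the triangle
print(trapezoidal(55, 50, 60, 70, 80))  # left-hand slope of the trapezoid
print(gaussian(75, 70, 3), gaussian(75, 70, 8))
```

A smaller `sigma` (or a narrower support) gives a more sharply peaked curve, which is one way an assessor could express higher confidence in a mark.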

Assume that student $I$ thinks the website layout of student $K$ is not easy to navigate from page to page, but that both the graphics and the content are attractive and the pages are easy to read. Student $I$ then marks student $K$'s work as shown in Fig. 6. Similarly, student $J$ assesses student $K$'s work, and, in NePAS, student $K$ has to perform a self-assessment of his/her own work as well. These fuzzy constraints are shown together in Fig. 7. Using fuzzy membership functions to represent assessments not only provides an effective approach for dealing with uncertainty and imprecision but also allows students to express the confidence of their assessments. If a student is confident about his/her assessment, the distribution for the assessment is leptokurtic (narrow and peaked); otherwise, it is platykurtic (broad and flat). Additionally, in order to reduce assessment bias, the assessments submitted by students $I$, $J$, and $K$ (seen in Fig. 7) need to be modified on the basis of the learning style of each individual student.

Fig. 7. Representation of peer assessment and self-assessment for student K's work.

As stated before, FSLSM [ 33] describes students' learning style preferences on four dimensions: active/reflective, sensing/intuitive, visual/verbal, and sequential/global. Active students generally prefer to try things out and to learn through communicating and collaborating with their peers. Therefore, the system places a greater weight on the assessments of students with an active learning style on all criteria. On the other hand, for reflective learners, who learn through reflecting on the material and prefer to learn alone rather than through collaborating with their peers, we assume a lesser weight for their assessments.

In the second dimension, students with a sensing learning style prefer to learn concrete material, tend to be more practical and check their work carefully, and are considered more patient with details. Therefore, sensing students are assigned a greater weight for their assessments on all criteria. In contrast, intuitive learners tend to be more innovative and creative, but not as careful with details. Thus, for intuitive learners, we assume a lesser weight for their assessments on all criteria.

The visual/verbal dimension deals with the preferred input mode. It differentiates students who remember best what they have seen, such as pictures, diagrams, and flow-charts, from students who get more out of textual representations, regardless of whether they are written or spoken. With respect to the assessment criteria in this example, learners with a visual learning style seem to be able to give more accurate ratings for the Superstructure criterion, which deals with assessing the layout of the website, and for the Graphics criterion, which deals with assessing the graphics on the website. On the other hand, for verbal learners, we assumed a lesser weight for their assessment on all criteria.

In the fourth dimension, sequential students learn in small incremental steps and therefore have a linear learning progress. They tend to follow logical stepwise paths in finding solutions. In contrast, global students use a holistic thinking process and learn in large leaps. They tend to absorb learning material almost randomly without seeing connections, but after they have learned enough material they suddenly get the whole picture. Then, they are able to solve complex problems, put things together in novel ways, and see connections and relations between different topics. Furthermore, global learners focus on getting an overview of a topic rather than on the details. Therefore, with respect to the assessment criteria, global learners seem to be able to contribute more on both the Superstructure and Graphics criteria, whereas sequential learners are more inclined to contribute highly in assessing the Content criterion, which deals with whether the content of the website can attract many visitors, and the Readability criterion, which deals with whether the page is easy for visitors to read. Table 1 displays how learners with different learning styles contribute to each assessment criterion.

Table 1. The Level of Contribution of Each Student with Different Learning Style on Each Criterion

For instance, for a student with the {active, intuitive, visual, global} learning style, the composite weight on Superstructure can be computed by combining the weights of each dimension, {high, low, high, high}, which gives {high} in the table. A five-level scale ranging from “very high” through “high,” “average,” and “low” to “very low” is employed to grade the contribution of each learning style on each criterion. Accordingly, the adjustment factors for the assessments of students $I$, $J$, and $K$ on each criterion are: $\varpi^I_1 = High$, $\varpi^I_2 = High$, $\varpi^I_3 = Low$, $\varpi^I_4 = Low$, $\varpi^J_1 = Low$, $\varpi^J_2 = Low$, $\varpi^J_3 = Low$, $\varpi^J_4 = Low$, $\varpi^K_1 = Average$, $\varpi^K_2 = Average$, $\varpi^K_3 = Very \;High$, and $\varpi^K_4 = Very \;High$, where $\varpi_1$ is for Superstructure, $\varpi_2$ is for Graphics, $\varpi_3$ is for Content, and $\varpi_4$ is for Readability. Then, based on (2), Fig. 8 illustrates the modified fuzzy sets after incorporating the effects of learning styles to reduce the assessment bias.

Fig. 8. Representation of adjusted fuzzy constraints based on learning styles.
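The adjustment factors listed for students $I$, $J$, and $K$ can be reproduced with a simple additive rule, sketched below. This rule and the per-style table `STYLE_CONTRIBUTION` are assumptions, since Table 1 and the combination formula are not reproduced here: each of the four style preferences counts as +1 (high) or -1 (low) on a criterion, and the sum is mapped onto the five-level scale. In particular, treating the visual style as low on Content and Readability, and the verbal style as high on those two criteria, is an assumption chosen so that the output matches the twelve factors stated in the text.

```python
LEVELS = ["Very Low", "Low", "Average", "High", "Very High"]
CRITERIA = ["Superstructure", "Graphics", "Content", "Readability"]

# Assumed contribution of each style per criterion: +1 = high, -1 = low.
# Values are listed in the order of CRITERIA.
STYLE_CONTRIBUTION = {
    "active":     (+1, +1, +1, +1),
    "reflective": (-1, -1, -1, -1),
    "sensing":    (+1, +1, +1, +1),
    "intuitive":  (-1, -1, -1, -1),
    "visual":     (+1, +1, -1, -1),  # low on Content/Readability is assumed
    "verbal":     (-1, -1, +1, +1),  # high on Content/Readability is assumed
    "global":     (+1, +1, -1, -1),
    "sequential": (-1, -1, +1, +1),
}

def adjustment_factors(styles):
    """Combine four style preferences into one level per criterion."""
    factors = {}
    for i, criterion in enumerate(CRITERIA):
        # The sum of four +/-1 terms is one of -4, -2, 0, 2, 4
        score = sum(STYLE_CONTRIBUTION[s][i] for s in styles)
        factors[criterion] = LEVELS[(score + 4) // 2]
    return factors

students = {
    "I": ["active", "intuitive", "visual", "global"],
    "J": ["reflective", "intuitive", "visual", "sequential"],
    "K": ["active", "sensing", "verbal", "sequential"],
}
for name, styles in students.items():
    print(name, adjustment_factors(styles))
```

For student $I$ on Superstructure, the terms are {high, low, high, high}, the sum is 2, and the resulting level is High, which agrees with the worked example in the text.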

3.3 Coordination Phase

After assessment agents $I$, $J$, and $K$ have acquired and adjusted the assessments based on their learning styles, a negotiation is automatically performed to coordinate the cognitive differences among students $I$, $J$, and $K$. Agreement is achieved when all participants agree. Therefore, during the negotiation process, agents $I$, $J$, and $K$ take turns proposing solutions until either an agreement has been reached or one of the agents withdraws. The communication protocol for agent negotiation is adapted from [ 21], and Fig. 9 displays the process of offer generation and evaluation for a negotiating agent. The curves indicate the acceptable ranges when students propose their own offers by lowering the threshold. If there is an overlap between the acceptable ranges, an agreement can be expected through negotiation. Otherwise, the negotiation fails and the agents need to revise their assessments prior to a new negotiation process.

Fig. 9. The process of offer generation and evaluation for a negotiating agent.
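The role of the threshold and of overlapping acceptable ranges can be illustrated with a deliberately simplified sketch. It assumes a single criterion, triangular assessments, and one threshold shared by all agents that is lowered in steps of 0.1; the actual protocol adapted from [ 21] lets each agent lower its own threshold and apply concession and trade-off strategies, which this sketch does not model. The triangular parameters are hypothetical.

```python
def alpha_cut(tri, alpha):
    """Interval of scores whose triangular membership is at least alpha."""
    a, b, c = tri
    return a + alpha * (b - a), c - alpha * (c - b)

def first_agreement(assessments, steps=10):
    """Lower a shared threshold from 1.0 until all acceptable ranges overlap.

    Returns (alpha, (low, high)) for the first overlap found, or None if the
    ranges never overlap, which corresponds to a failed negotiation.
    """
    for i in range(steps, -1, -1):
        alpha = i / steps
        cuts = [alpha_cut(t, alpha) for t in assessments.values()]
        low = max(lo for lo, _ in cuts)
        high = min(hi for _, hi in cuts)
        if low <= high:
            return alpha, (low, high)
    return None

# Hypothetical triangular assessments (a, b, c) for one criterion
assessments = {"I": (40, 60, 85), "J": (60, 80, 95), "K": (70, 90, 100)}
result = first_agreement(assessments)
print(result)
```

At threshold 1 each agent accepts only its own peak value, so no overlap exists; the ranges widen as the threshold drops, and the first common interval marks where an agreement becomes possible.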

In our example, agent $I$ proposes its assessment $u^I_1 = (60,70,75,70)$ with respect to the criteria Superstructure, Graphics, Content, and Readability at threshold $\alpha^I_1 = 1$. However, according to (2), $\mu_{C^J}(u^I_1) = 0$ and $\mu_{C^K}(u^I_1) = 0$, and, therefore, agents $J$ and $K$ cannot accept $u^I_1$ as an agreement. Subsequently, agents $J$ and $K$ propose their assessments $u^J_1 = (80,65,70,75)$ and $u^K_1 = (90,90,80,85)$, respectively, at threshold $\alpha_1 = 1$, but the other agents cannot accept either of these assessments as an agreement.

Assume that agent $K$ adopts the concession and trade-off strategies. Since it has no further feasible proposal at threshold $\alpha^K_1 = 1$, agent $K$ lowers its threshold to the next level, $\alpha^K_2 = 0.9$, to create a new set of feasible proposals for the next round of negotiation:

$$\eqalign{v^K_{2a} &= (82,90,80,85),\quad v^K_{2b} = (90,85,80,85), \cr v^K_{2c} &= (90,90,78,85),\quad v^K_{2d} = (90,90,80,80).}$$

Based on (8) and (9), a normalized Euclidean distance can be used to measure the similarity between alternatives to generate the best offer. Accordingly, agent $K$ selects the most likely acceptable solution, $v^K_{2c} = (90,90,78,85)$, as its proposal $u^K_2$ for agents $I$ and $J$. However, agents $I$ and $J$ cannot accept $u^K_2$ as an agreement because $\mu_{C^I}(u^K_2) = 0$ and $\mu_{C^J}(u^K_2) = 0$. Agents $I$ and $J$, in turn, adopt their negotiation strategies to propose feasible proposals. This procedure of offer generation and evaluation for agents $I$, $J$, and $K$ continues until an agreement is reached or no additional solutions are proposed. If one of these agents cannot reach an agreement with the others, the assessment agent requires the student to rethink and revise his/her assessment, as shown in Fig. 10.

Fig. 10. Readjustment of assessments caused by a failed negotiation with no agreement.

3.4 Feedback Phase

After several rounds of negotiation, agents $I$, $J$, and $K$ arrive at an agreement on the assessment results of student $K$'s website design, as shown in Fig. 11 (purple areas).

Fig. 11. Representation of assessment results and feedback.

If the range of the assessment results is narrow, student $K$ can interpret this as a high level of agreement. On the other hand, if the range of the assessment results is wide, student $K$ may have to reflect on the cause of the marking differences. At the same time, student $K$ can also examine the satisfaction value of the results; in this case, the degrees of satisfaction for (Superstructure, Graphics, Content, Readability) are (0.6, 0.6, 0.7, 0.6), respectively. The closer the satisfaction value is to 1, the higher the acceptance of the assessment results. This two-dimensional representation therefore offers rich feedback and fosters deeper reflection and thinking. Additionally, NePAS employs a defuzzification technique to render numerical scores for (Superstructure, Graphics, Content, Readability) of (73, 78, 77, 74), respectively, which can be regarded as the final scores for the student's performance.
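The text does not say which defuzzification technique NePAS uses. As an illustration only, the sketch below applies the common centroid (center of gravity) method to a hypothetical agreed fuzzy set; the function names and the triangle parameters are assumptions and are not taken from Fig. 11.

```python
def centroid(membership, low=0.0, high=100.0, samples=1001):
    """Centroid (center of gravity) of a fuzzy set sampled on [low, high]."""
    step = (high - low) / (samples - 1)
    xs = [low + i * step for i in range(samples)]
    weights = [membership(x) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("membership is zero everywhere on the interval")
    return sum(x * w for x, w in zip(xs, weights)) / total

def agreed_region(x):
    """Hypothetical agreed fuzzy set: triangle on [60, 90] peaking at 78."""
    if x <= 60 or x >= 90:
        return 0.0
    return (x - 60) / 18 if x <= 78 else (90 - x) / 12

score = centroid(agreed_region)
print(round(score, 1))
```

For a triangle the centroid is the mean of its three corner positions, here (60 + 78 + 90) / 3 = 76, so the sampled result should lie very close to 76.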

Experiment and Evaluation

The goal of this experiment is to evaluate the usability and effectiveness of the proposed methodology. A total of 54 first-year college students (15 females and 39 males) participated in the experiment for six weeks while enrolled in a mandatory course, “Introduction to Computer Science,” at a Taiwanese university. They were randomly assigned to three groups (Groups A, B, and C). The 16 students in Group A did not take part in any peer assessment activities. The 18 students in Group B used a conventional peer assessment (i.e., without intelligent techniques such as agent negotiation and fuzzy constraints), while the 20 students in Group C used a negotiation-based peer assessment (i.e., with NePAS). Students in Groups B and C were further divided into four teams each, and they did not know the identity of their teammates. The instructor then gave all students a preexamination on the course content to determine whether the learning status of the three groups differed. The tests were marked by the instructor and analyzed using one-way ANOVA ( $F{\hbox{-}}{\rm value} = 0.119$ ; $p{\hbox{-}}{\rm value} = 0.887$ ). No significant difference in learning status was noted among the three groups, as the $p$-value exceeds the level of significance $(\alpha = 0.05)$. That is, the three groups were not significantly different in their performance before participating in the experiment.
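For readers unfamiliar with the test, the $F$ statistic of a one-way ANOVA can be computed as in the sketch below. The scores are hypothetical and are not the study's data; obtaining the $p$-value additionally requires the $F$ distribution with $(k-1, n-k)$ degrees of freedom, which would be $(2, 51)$ for three groups and 54 students.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of score lists."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores for three small groups
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_value)
```

An $F$ value near zero, such as the 0.119 reported for the preexamination, means the group means differ little compared with the spread of scores inside each group.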

During the experiment, all students were assigned a project to design a website. After all students had submitted their projects, students in Groups B and C moved on to take part in peer assessment activities. They were instructed to complete the following three rounds.

4.1 Round 1

After students in Groups B and C had submitted their projects, they were also asked to fill out the learning style questionnaire and to read the description of the assessment criteria. For students in Group B, the assessment criteria (Superstructure, Graphics, and Content) were decided by the instructor. Students in Group C, however, were first asked to rank a set of assessment criteria provided by the instructor, and NePAS then selected the criteria ranked highest by the students (Superstructure, Graphics, Content, and Readability). Based on the selected assessment criteria, students proceeded to mark not only their peers' work but also their own work for self-assessment. Students in Group B then received numerical scores as the results from a conventional peer assessment system, while students in Group C received fuzzy membership functions as the feedback from NePAS. Afterward, students in both groups were allowed to revise their own projects and proceed to round 2.

4.2 Round 2

After receiving the feedback, students in Groups B and C were instructed to reflect upon and, if necessary, revise their work, and then perform peer and self-assessment again. Additionally, to evaluate the effectiveness of NePAS, the instructor gave students in all three groups two postexaminations relevant to the course content of designing a website to determine the variance among groups. The first postexamination (P1) was identical to the preexamination, and the second postexamination (P2) was a new examination. Table 2 presents the differences in each group based on the individual performance on the preexam and the first postexam P1 via the paired $t$-test.

Table 2. Performance Analysis via the Paired $t$ -Test

The paired $t$-test results indicated that the improvement in each group was significant. In particular, the performance improvement of students in Group C was more significant than that of their classmates in Groups A and B. Through one-way ANOVA, the results ( $F{\hbox{-}}{\rm value} = 10.507$ ; $p{\hbox{-}}{\rm value} =0.0001$ ) indicated a significant difference among the three groups. In order to further verify the effectiveness, one-way ANOVA and $t$-tests were used to analyze the results of the second postexam P2 between any two groups. The results ( $F{\hbox{-}}{\rm value} = 11.65$ ; $p{\hbox{-}}{\rm value} =0.00005$ ) also indicated a significant difference among the groups. Table 3 presents the $t$-test and $F$-test results, and it can also be observed that the performance improvement for students in Group C was again significantly greater than that for students in the other two groups. These results suggest that learning performance can indeed be enhanced further through a negotiation-based peer assessment process like NePAS.

Table 3. Performance Comparison via $t$ -Test and $F$ -Test
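The paired $t$-test used for the pre/post comparison works on the per-student score differences, as in the following sketch. The function name and the four pairs of scores are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical pre- and post-examination scores for four students
pre = [60, 65, 70, 75]
post = [65, 70, 78, 81]
t_value, dof = paired_t(pre, post)
print(round(t_value, 3), dof)
```

A large positive $t$ value indicates that the post-examination scores are consistently higher than the pre-examination scores for the same students.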

4.3 Round 3

To evaluate the usefulness of the proposed methodology, students in Group B were asked to use NePAS and students in Group C were switched to the conventional peer assessment. Following the experiment, students in Groups B and C provided feedback through a questionnaire consisting of seven questions. A five-point Likert scale ranging from “strongly disagree” through “disagree,” “no opinion,” and “agree” to “strongly agree” was employed to grade responses. The questionnaire results are shown in Table 4.

Table 4. The Questionnaire Results

The questionnaire results indicate that most students regarded NePAS as a helpful system, since they were able to conveniently communicate assessment criteria, easily assess their peers' work, receive acceptable feedback, and acquire helpful information through observing the assessment representation and negotiation process. Although a few students thought the fuzzy sets were difficult to define or had difficulty interpreting the results and feedback, most students believed that the system was useful. Overall, students agreed that NePAS was easy to operate and was able to provide richer feedback.


This paper has presented a framework for providing intelligent support through agent negotiation and fuzzy constraints to enhance the effectiveness of peer assessment. In this framework, assessments are represented as fuzzy constraints, assessment bias can be reduced, and assessment results are reached through a negotiation mechanism. Experimental results reveal that this framework significantly improved student performance. Students also agreed that marking with fuzzy membership functions is flexible and easy to use. Additionally, incorporating personal characteristics into the assessment process to reduce bias, together with a negotiation mechanism to coordinate cognitive differences, can improve assessment accuracy and thus make students more inclined to accept the assessment results and to reflect upon their own work.

Finally, although the proposed methodology has yielded promising results in promoting the effectiveness of peer assessment in collaborative learning environments, considerable work remains to be done, including further development of the methodology to lower the cognitive load for both students and instructors, large-scale classroom experiments at different levels and in different domains, improvement of the marking method with other membership functions, and incorporation of other personal characteristics into the assessment process to further reduce assessment bias.


This research is supported in part by the National Science Council of the Republic of China under contract number NSC 98-2221-E-155-019. The authors also acknowledge the support of NSERC, iCORE, Xerox, and the research-related gift funding by Mr. A. Markin. Furthermore, the authors would like to thank the anonymous reviewers for their helpful comments on this paper.


About the Authors

Chung Hsien Lan received the MS degree in information management and the PhD degree in computer science and engineering from Yuan Ze University, ZhongLi, Taiwan, in 2009. She is an assistant professor in information management at the Nanya Institute of Technology. Her research interests include peer assessment, collaborative learning, adaptive learning, agent negotiation, and fuzzy constraints. She has published three books and more than 15 journal and conference papers, of which one conference paper received the best paper award. Her projects include the application of LMS and the development of learning and teaching portfolios.

Sabine Graf is an assistant professor in the School of Computing and Information Systems at Athabasca University, Canada. Her research interests include adaptivity and personalization, student modeling, ubiquitous and mobile learning, artificial intelligence, and collaborative learning technologies. She has published more than 60 refereed journal papers, book chapters, and conference papers, of which three conference papers were awarded the best paper award. She is the editor of the Learning Technology Newsletter, a publication of the IEEE Computer Society's Technical Committee on Learning Technology (TCLT). She has given invited talks at universities/companies in Austria, Canada, New Zealand, Taiwan, and the United Kingdom, and she is involved in research projects dealing with mobile and ubiquitous learning, adaptivity and personalization in learning systems, student modeling, and the application of e-learning at universities.

K. Robert Lai received the BS degree from the National Taiwan University of Science and Technology in 1980, the MS degree from Ohio State University, Columbus, in 1982, and the PhD degree in computer science from North Carolina State University, Raleigh, in 1992. From 1983 to 1989, he was a senior engineer with GE Aerospace Division, Maryland. In 1994, he joined the Department of Computer Science and Engineering, Yuan Ze University, Taiwan, where he is now a professor. His current research interests are in computational intelligence, agent technologies, and mobile computing.

Kinshuk received the PhD degree from De Montfort University, United Kingdom. He is the NSERC/iCORE/Xerox/Markin Industrial Research chair for adaptivity and personalization in informatics, and full professor and director of the School of Computing and Information Systems at Athabasca University, Canada. His work has been dedicated to advancing research on the innovative paradigms, architectures, and implementations of learning systems for individualized and adaptive learning in increasingly global environments. His research interests include learning technologies, mobile and location-aware learning systems, cognitive profiling, and interactive technologies. With more than 275 research publications in refereed journals, international refereed conferences, and book chapters, he is frequently invited as a keynote or principal speaker at international conferences and is a visiting professor in many countries around the world. He is the founding chair of the IEEE Technical Committee on Learning Technologies and the founding editor of the Educational Technology & Society Journal (SSCI indexed with Impact Factor of 0.904 according to Thomson Scientific 2008 Journal Citations Report).