JANUARY-MARCH 2009 (Vol. 2, No. 1) pp. 3-9 1939-1382/09/$26.00 © 2009 IEEE Published by the IEEE Computer Society
Creating a Corpus of Targeted Learning Resources with a Web-Based Open Authoring Tool
Abstract—Personalizing learning to students' traits and interests requires diverse learning content. Previous studies have demonstrated the value of such materials in learning but a challenge remains in creating a corpus of content large enough to meet students' varied interests and abilities. We present and evaluate a prototype Web-based tool for the open authoring of learning materials. We conducted a study (an open Web experiment) to evaluate whether specific student profiles presented in the tool's interface increase the diversity of the contributions and whether authors tailor their contributions to the features in the profiles. We report on the quality of materials produced, the authors' facility in rating them, the effects of author traits, and the impact of the tailoring feature. Participants were professional teachers (math and nonmath) and amateurs. Participants were randomly assigned to the tailoring tool or a simplified version without the tailoring feature. We found that, while there were differences in teaching status, all three groups made worthy contributions. The tailoring feature leads contributors to tailor materials with greater potential to engage students. The experiment suggests that an open access Web-based tool is a feasible technology for developing a large corpus of materials for personalized learning. Introduction Personalization to students' interests and identities has been shown to improve both student engagement and test scores. Fourth grade math students have higher pretest-to-posttest gains with personalized instruction and also perform significantly better on both the pretest and posttest problems [ 1 ]. Similar effects have been found with fifth and sixth grade students [ 2 ]. Personalized instruction has also been demonstrated to increase the engagement and learning outcomes of minority groups, e.g., Hispanic [ 3 ]. 
One challenge to the growth of personalized learning environments is that the content they present is laborious to create. Intelligent tutoring systems are very adaptive to the learner's activity, yet require 100-1,000 hours of time from skilled experts for each hour of instruction [ 4 ], [ 5 ]. Newer approaches [ 6 ] lower the total resources necessary to create a tutor, but still require careful coordination of a group to create useful tutors. Tutors such as REDEEM [ 7 ] and pSAT [ 8 ] separate logic (which requires programming) from the domain material so that nonprogrammers can customize or extend the tutor. The example-tracing feature of CTAT lowers the expertise necessary to define tutoring logic and its bulk templating feature allows simple expansion of domain material within a logic [ 9 ]. Yet, each of these requires some training to use. The Assistment Builder's problem-specific authoring paradigm and Web-based interface are easy enough that novice users can develop a simple tutor for a problem in under 30 minutes [ 10 ]. Yet, all of these are limited in the dimensions by which they can personalize to the student (e.g., knowledge components learned or preference of learning style). In this paper, we describe and evaluate an open Web-based problem-specific authoring tool with a novel feature to foster personalized instruction matched to learners' interests and abilities. The power of open authoring on the World Wide Web has been demonstrated over the last decade. Encyclopedias, Web browsers, computer operating systems, and other complex artifacts have been created by loose networks of volunteers, building on each other's contributions. These openly developed products often meet and sometimes exceed the quality of more cohesive sources and, in general, lower their costs. Existing open authoring systems for education, such as Wikiversity or Wikibooks, create monolithic artifacts that are the same for all learners. 
Connexions, an open textbook authoring system, was designed to support remixing of content "modules" [ 11 ], but these are tailored to the scope of a course rather than an individual learner. The work reported here is part of a larger research program on collaborative open educational resource development around a four-phase life cycle in which system users generate, evaluate, use, and improve shared materials [ 12 ]. Here, we consider the potential for this open authoring paradigm to support individualized instruction. Rather than encyclopedia articles or textbook modules, the artifacts created in this study are worked example problems, chosen for their value and versatility. Worked examples both instruct and help to foster self-explanation [ 13 ]. They fit easily into existing practices as an enhancement to existing intelligent tutoring systems [ 14 ], [ 15 ], as an instructional material, as a fading scaffold (by omitting some of the solution steps), or as a basic assessment (by omitting the solution altogether). A corpus of worked examples tied to personal interests and learning capacities would be a practical means of introducing personalized learning into multiple modes of use. 2. The Tool To facilitate the creation and growth of corpora of materials, we have created a prototype Web-based authoring application designed to promote tailoring of content to learner characteristics. The version of the tool evaluated here is for worked-example problems, which can be repurposed into pure assessments or instruction. The tool can also be easily adapted to make these other types of resources directly. The client-side software is built in HTML and JavaScript (AJAX) and works in modern Web browsers (IE7, Safari 3, Firefox 1.5, etc.). It runs continuously on the Web and is open for anyone to contribute at education.hciresearch.org. On starting the tool, authors first see a page explaining what a worked-example problem is and what skill to target. 
This page also provides a search box to look up on the Web anything they want to learn or refresh themselves on and a table of pedagogical principles to consider in creating their worked example. When they are ready to author, they click Continue to reach the authoring interface, shown in Fig. 1 . The tailoring feature comprises the student profile shown at the top and the text guidance below it asking the author to "Please create a worked-out example to provide practice to the student above in understanding and applying the Pythagorean Theorem." Fig. 3 shows examples of other profiles. (In the control condition of the study, the profile image and the text "to the student above" are absent.) Below the guidance information is a dynamic HTML form in which they enter their worked example. They can enter a problem statement in a large textarea element to the left and can add a diagram or illustration of the problem using a Flash-based drawing widget to the right. The drawings are recorded in SVG format for future programmatic manipulation and native vector rendering in advanced Web browsers. Below the problem statement is the solution table where authors enter and annotate the solution steps, with columns for the work (i.e., the actual steps toward the solution), explanations of the work, and optional illustrations. Authors begin with the Add Step button which dynamically adds a row to the table and populates each field with starter text (e.g., "First...," "You do this because..."). Authors type out the first step of work to perform toward the solution, an explanation of why, and optionally draw an illustration. They repeat this for each step until their last, which contains the completed answer to the problem. Fig. 2 shows an example contribution authored with the tool. Fig. 1. Screenshot of authoring tool in profile condition. Fig. 2. Sample contribution authored with the tool. 
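The authoring form described above can be pictured as a small data structure: a problem statement, an optional diagram, and an ordered list of solution steps, each holding the work, an explanation, and an optional illustration. The Python sketch below is illustrative only. The class names, field names, and helper methods are assumptions made for this example and do not reflect the tool's actual implementation, which the text describes as HTML and JavaScript. The sketch also shows how such a record could be repurposed as a fading scaffold (some steps omitted) or a basic assessment (solution omitted).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One row of the solution table (hypothetical field names)."""
    work: str                               # the actual step toward the solution
    explanation: str                        # why the step is performed
    illustration_svg: Optional[str] = None  # optional drawing, stored as SVG text


@dataclass
class WorkedExample:
    """A worked-example contribution (hypothetical structure)."""
    statement: str
    steps: List[Step] = field(default_factory=list)
    diagram_svg: Optional[str] = None

    def add_step(self) -> Step:
        # Mirrors the Add Step button: append a row pre-filled with starter text.
        step = Step(work="First...", explanation="You do this because...")
        self.steps.append(step)
        return step

    def faded(self, keep: int) -> "WorkedExample":
        # Fading scaffold: keep only the first `keep` solution steps.
        return WorkedExample(self.statement, list(self.steps[:keep]), self.diagram_svg)

    def as_assessment(self) -> "WorkedExample":
        # Basic assessment: the problem statement with no solution steps.
        return self.faded(0)
```

For instance, an author's two-step example could be shown to a student as `example.faded(1)` so that the student completes the final step unaided.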
Because the tool is accessible to anyone to contribute, controlling the quality of the corpus is a critical challenge. To achieve this, we have implemented (and are experimenting with) a two-pass quality check system. In the first pass, an SQL query is run to filter out any contributions that are duplicates or are not within reasonable content parameters, described below. In the second pass, humans use a simple rating tool to select the quality level of three different components of the contribution (the problem statement, solution steps, and the explanations of the solution steps) on a four-point scale specified in Table 2 : Useless, Fixable, Worthy, or Excellent. The rater clicks on a button for each part to indicate its quality and then a submit button which automatically advances to the next contribution to evaluate. 3. Evaluation We have evaluated the system in an open Web-based experiment with hundreds of contributors. To increase statistical power for the evaluation, the study controls for skill by targeting one specific skill. The skill of understanding and applying the Pythagorean Theorem was chosen for its suitability to personalization. It affords a variety of real-world scenarios to demonstrate it, providing opportunities for the author to make the problem relevant to the student. Pythagorean Theorem problems also often have a visual component, making them more difficult to generate by any automated means and thus taking advantage of the human contribution. To explore the impact of open development and diverse levels of expertise, our study was open to all comers. Reasonably, this would lead to a volume of content without much value and this motivated our first two hypotheses: In evaluating the tailoring feature, we hypothesized To reliably assess the impact of the tailoring feature, participants were randomly assigned to one of two conditions. 
In the profile condition, participants used the tool with the tailoring feature that presents student profiles. In the generic condition, this feature was removed. No profiles were shown and the words "to the student above" were stricken from the task description. In the profile condition, the profiles were varied to assess how well the feature facilitated tailoring. Student profiles were designed to vary on six dimensions that might differentiate the learning patterns of real students. They varied on three dimensions of skill to increase the variation of the contributions on skill-level appropriateness. These were proficiency in the Pythagorean Theorem, proficiency in math generally, and verbal proficiency. They were also varied on cultural attributes to prompt creativity of the participants and increase the personal relevance of the examples to students. These were gender, hobbies/interests, and home environment. Four hobbies were crossed with four home environments to create 16 unique student profiles. Distributed evenly among them were four skill profiles and two genders. Additionally, each was assigned a favorite color to round out the description presented. Participants in the profile condition saw a new randomly selected profile for each worked-example problem they authored (e.g., one of the two in Fig. 3). Fig. 3. Sample profiles in profile condition. 3.1 Participants The URL to participate was advertised on various Web sites both related to education and not. Participants could earn up to $12 for their worked example contributions, regardless of their quality. After following the URL, they received a description of the task and a stated purpose of creating open educational materials. After consenting, they entered their e-mail, professional status, and their age. (To deter false age inputs, there was no mention of eligibility and visitors under age 18 were sent to a survey so that they would not be aware of their ineligibility.) 
Eligible participants would see a page describing the task in more detail and three principles of authoring worked examples. The next page presented the authoring tool. During the experiment, 1,427 people registered on the site to participate. After seeing the task in detail, most did not continue, but 570 participants did use the system to submit 1,130 contributions. Table 1 shows, by teacher status, the number of participants reaching each greater level of participation in the experiment. 3.2 Exit Survey After each submission of a contribution, the participant was invited to submit another or to conclude their session with an exit survey. The survey collected information on their participation, their educational experience, their perspective on worked example problems, their regard and preferences for community authoring, and their experience using the authoring tool. Of the 570 people who made qualifying contributions, 236 also completed the exit survey. Table 1. Count of Participants by Teacher Status and Degree of Participation 4. Results of Open Authoring To test To test Table 2. Quality Scale Used in Coding and Analysis In this second-pass quality check, 23 percent of whole problems (statements with solutions) were classified as Worthy, meaning that they were fit for use immediately. Fifty-seven percent were at least Fixable, meaning that they would be valuable with some additional effort. In general, the statements were of higher quality than the solutions. Of all the statements, 55 percent were Worthy and 9 percent were Excellent as is. To test Fig. 4. Mean quality score of statement and solution by teacher status. Math teachers were best at writing problem statements, compared to other participants. A comparison across teacher status showed a marginally significant effect ( Contrary to To better understand the teacher expertise effects, we examined more features of the participants' experience as educators. 
Since being a professional teacher affects quality, does being a teacher longer also? We found that while Statement quality is not correlated, Solution quality declines with years in the classroom ( 5. Discussion on Open Authoring In a short amount of time, about 1,500 people registered to contribute to a commons of educational materials. Of the raw contributions, the first-pass software filter blocked 570, leaving 550, of which 109 were judged useless by human experts. The software filter saved human raters from seeing 84 percent ( Teacher status had an important impact on the quality of the components of contributions. As predicted in Additionally, it seems that tutors of math outside the classroom have less of this blind spot, either through less domain expertise or greater pedagogical content knowledge. Interestingly, there was no observed difference in quality by the number of years spent tutoring, so if it is due to pedagogical content knowledge, it may develop quickly. If so, an explanation may be that a tutor gets direct feedback from a tutee on her explanation while a teacher in front of a classroom has that feedback only in the aggregate of many students, if at all. Overall, it is clear that, at least for worked examples of the Pythagorean Theorem, participants of all teaching statuses were likely to make contributions of value. Math teachers do a better job at some parts of the process, but even laymen do fairly well. Educational content systems can benefit from opening the channels of contribution to all comers. 6. Results of Tailoring Feature The tailoring feature of the tool was evaluated experimentally. To test Table 3. Probabilities of Contribution Matching an Attribute To test whether authors tailor their contributions to the verbal skill of the student, we compared the verbal skill level of the student profile presented to the author with the reading level of the authored contribution. 
The reading level was measured using the Flesch-Kincaid Grade Level Formula [ 20 ]. This formula assesses US school reading grade level for a given text, making it easy to match a worked example contribution to the reading level in a student's record. The text analyzed is the concatenation of the problem statement and all the explanation steps. Because readability metrics are not calibrated to math expressions, the work steps were omitted from readability analysis. Outliers were curtailed by removing the top and bottom 2.5 percent of the distribution of Flesch-Kincaid Grade Level, leaving a range Table 4. Correspondence of Verbal and Math Skill Levels with the Authoring Interface The same letters are not statistically different. (a) Matching to verbal skill. (b) Matching to math skill. Math difficulty was measured more simply because there is no established metric available. Since all problems were on the Pythagorean Theorem, we chose to measure math difficulty by whether the problem uses only the 3-4-5 triangle, the least challenging numerical solution. An The effect of the tailoring tool on author effort was also analyzed to test Effects on future effort were also analyzed using responses to the exit survey. The 10 five-point agreement Likert items from the Community section of the survey ( Table 5 with items marked (R) reversed) were combined to form a scale ( Table 5. Exit Survey Items on Community Fig. 5. Regard for community by professional status and experimental condition. Quality was analyzed by experimental condition to test 7. Discussion on Student Profiles Confirming Participants shown the student profiles also tailored their contributions to the student's skill level in both math and reading. Contributions made for students with high and low reading skill differed in terms of reading difficulty by almost a grade level. 
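The reading-level measure used in this comparison can be sketched in a few lines. The grade-level expression below is the standard published Flesch-Kincaid formula, 0.39 (words/sentences) + 11.8 (syllables/words) - 15.59. Everything else is an assumption made for illustration: the function names are hypothetical, the syllable counter is a crude vowel-group heuristic, and the sentence and word splitting is simplistic. It is not the analysis code used in the study.

```python
import re
from typing import List


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count vowel groups, discount a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level of `text`, using naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def trim_outliers(scores: List[float], fraction: float = 0.025) -> List[float]:
    """Drop the lowest and highest `fraction` of scores (2.5 percent each by default)."""
    ordered = sorted(scores)
    k = int(len(ordered) * fraction)
    return ordered[k: len(ordered) - k]
```

Following the procedure described above, the input text would be the problem statement concatenated with the explanation steps, with the work steps left out because readability metrics are not calibrated to math expressions.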
Contributions for profiles with high general math skill level were one-third less likely to make use of simple 3-4-5 triangle problems. Supporting That profiles would lead to contributions of higher quality on an absolute scale The profiles did have a curious effect on a measure of regard for community, a possible indicator of future participation. Amateurs were not affected by the profiles but teachers were. The profile feature led math teachers to value peer feedback more highly and trust in the quality of community-generated learning materials. In contrast, teachers of other subjects came to think less of peer feedback and of community-generated materials. While this may be due to different dispositions of math and other teachers, it may also be simply because math teachers saw it as a valuable tool in their work and other teachers thought it distracted from theirs. The explanation for this interaction remains an open question. 8. Limitations An important limitation of the study is that there are no measures yet of how these contributions actually aid learning. The expert ratings were taken as proxies for the utility in real learning contexts, but the true test will be using the community-authored materials to teach real students and measure their gains versus alternative materials. One potential pitfall is that the personalizing details in the tailored resources distract students from learning. Of course, the improvements to their motivation might offset this. A real-world study is necessary to answer these questions. Another key limitation of the findings here is the ecological validity of paying participants for their contributions. The problem is not that participants had an incentive to contribute. One can imagine a future system with incentives such as peer status or competitions with nonmonetary awards (e.g., [ 21 ]). Certainly, volunteers are always motivated by some incentive, external or internal. 
How, though, do contributions differ under more ecologically valid incentives? Because participants were paid for any contribution, there is good reason to believe that real-world volunteers would be more dedicated and likely to produce higher quality materials on average. It is worth noting that, since completion of the experiment, additional participants have contributed to the site without an incentive. At the close of the experiment, the Web site was disabled but, at the request of people who still wanted to participate, it was restored two months later for free contributions. In the months that have elapsed, 93 people have registered and submitted 93 contributions, of which 40 pass machine filtering. We are addressing the above limitations by creating a production system in which materials are authored, used, evaluated, and improved. We are planning an open-source open-content platform for collaborative authoring in different domains. We will manipulate and study the extrinsic (e.g., money and social credit) and intrinsic (e.g., fun) motivations of authors and may assess the learning impact of materials. 9. Conclusions We evaluated whether open authoring and profile-based tailoring might be a way of addressing a significant obstacle to highly individualized instruction, namely, the fact that a large pool of differentiated instructional materials is needed. Our first main conclusion is that the results support the feasibility of open authoring of instructional materials targeted at highly specific instructional objectives. We confirmed that quality control of the contributed materials is feasible through simple means. Automated filtering of the least valuable content was trivial, and teachers using our rating tool did not have to expend much effort to separate the wheat from the remaining chaff. Importantly, both professional educators and amateurs contributed a large portion of useful materials. 
Contrary to our expectation, contributions from math teachers were not superior to those from others. This finding bodes well for the viability of open authoring to support math learning because there are many more people who are not math teachers than who are. Math teachers did write the best problem statements but amateurs wrote the best solutions. This finding suggests a model for community authoring in which math teachers contribute the problem statements and amateurs write the solutions. In general, it suggests that users of different aptitudes and abilities be directed to different tasks within the collaborative authoring system, a solid design implication. That additional tutoring experience led to greater solution quality while classroom teaching experience led to less invites the speculation that tutoring is a better way to build pedagogical content knowledge than classroom teaching is. This is worthy of further study. A second main conclusion to follow from this work is that community authoring efforts can be directed toward producing individualized materials. The tailoring feature of our authoring tool, in which authors are shown specific student profiles, successfully led to tailored materials. The profiles led to more highly tailored materials. On every attribute, the profile increased the likelihood of targeting it, compared to authoring without profiles. The profiles also drew out slightly more effort on the part of participants. While the profiles did not measurably improve the quality of contributions, they did not impair them either. Thus, the feature provides measurable gains in individualization without measurable impairments to the quality of the contributions. The tailoring feature also perhaps increased likelihood of future efforts from math teachers by causing them to hold community authoring in higher esteem. Curiously, the tailoring feature had the opposite effect on nonmath teachers. 
This unexpected interaction with teaching domain suggests a factor to consider in designing and evaluating education technologies. This study has positively, albeit partially, demonstrated the utility of a Web-based open authoring system for personalized learning resources. Participants, regardless of professional expertise, are able to make useful contributions. A relatively simple student profile feature is successful in eliciting contributions tailored to cultural (interests and environment) and cognitive (math and verbal) attributes of different learners. Thus, open authoring, combined with student profiles, helps overcome a significant obstacle to large-scale individualization of learning materials, namely, the need for a large pool of individualized materials. ACKNOWLEDGMENTS The authors would like to acknowledge the suggestions of the reviewers. The photo shown in the student profile included in this paper came from Flickr user jenrock under a Creative Commons Attribution-Noncommercial 2.0 Generic license. This work was supported in part by a Graduate Training Grant awarded to Carnegie Mellon University by the US Department of Education (#R305B040063). The research reported here was supported by the Institute of Education Sciences, US Department of Education, through "Effective Mathematics Education Research" program grant #R305K03140 to Carnegie Mellon University. The opinions expressed are those of the authors and do not represent the views of the US Department of Education. Manuscript received 16 Sept. 2008; revised 28 Dec. 2008; accepted 7 Jan. 2009; published online 16 Jan. 2009. For information on obtaining reprints of this article, please send e-mail to: lt@computer.org, and reference IEEECS Log Number TLTSI-2008-09-0089. Digital Object Identifier no. 10.1109/TLT.2009.8. REFERENCES