2.4.1 The Google PageRank
Larry Page and Sergey Brin [12], [24] consider the popularity of a website to depend on its internal and external links. They propose an algorithm that assigns a numerical weighting, called PageRank, to each element of a hyperlinked set of documents. The formula is as follows:
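In its commonly cited form, as given in Brin and Page's original description, the PageRank of a page A is PR(A) = (1 - d) + d (PR(T_1)/C(T_1) + ... + PR(T_n)/C(T_n)), where T_1, ..., T_n are the pages linking to A, C(T) is the number of outgoing links of T, and d is a damping factor usually set to 0.85. The following is a minimal sketch of iterating that formula; the function name `pagerank`, the dictionary-based graph, and the fixed iteration count are illustrative assumptions.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to (hypothetical input format).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    ranks = {page: 1.0 for page in pages}  # common starting value
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page T contributes its rank divided by its out-degree C(T).
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks


ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

In this small graph, page C receives links from both A and B, so it ends up with a higher rank than B, which is linked only from A.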
2.4.2 Mining Methodologies for Time Series Data
The Landmark Model [30] takes a landmark, the time at which the system starts, as its reference point. In this model, data are processed from the time the system starts until the present. Because time advances at a constant rate, the processing history can be analyzed and compared easily. The disadvantage is that the workload on the system becomes extreme when processing runs over a long period. Besides, not all of the data in each timescale are useful, so the handling of garbage data must be taken into account when using this model.
The Sliding Window Model [27] addresses the disadvantages of the Landmark Model. It analyzes the data stream within a specific timescale. Each sliding window contains a fixed number of data elements. Data are loaded and processed within a specific timescale leading up to the current time, and data elements are implicitly deleted once they move out of the window's scope. However, this model depends on the choice of timescale, which must be adjusted to different conditions and circumstances.
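The sliding window behavior described above can be sketched as follows. This is a minimal illustration, assuming a count-based window; the class name `SlidingWindow` and its methods are hypothetical and not part of the cited model.

```python
from collections import deque


class SlidingWindow:
    """Keep only the most recent `width` data elements of a stream."""

    def __init__(self, width):
        # A deque with maxlen drops the oldest element automatically,
        # which mirrors the implicit deletion described in the text.
        self.items = deque(maxlen=width)

    def add(self, element):
        self.items.append(element)

    def contents(self):
        return list(self.items)


window = SlidingWindow(width=3)
for value in range(5):
    window.add(value)
```

After the five additions, only the last three elements remain in the window; the earlier ones were discarded without an explicit delete call.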
The models presented so far treat the data in every timescale as equally important. To represent the importance of data in each timescale clearly, the Time-Fading Model [7] was proposed. In this model, the distance in time from the present is also a key factor in data mining. It separates time into several blocks and gives each timescale a progressively decreasing weight, from the present to the past. This improves the relationship between data and timescale, especially for time-sensitive data. Taking Fig. 2 as an example, data in more recent timescales have higher weights than data in earlier ones.
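A minimal sketch of the fading idea is given below. It assumes an exponential decay per time block, which is one common choice; the text does not fix the weight function, so the `decay` parameter and the function name `faded_total` are illustrative assumptions.

```python
def faded_total(counts_by_block, decay=0.5):
    """Sum per-block counts with weights that shrink toward the past.

    counts_by_block[0] is the current block, counts_by_block[1] the block
    before it, and so on. The exponential weight decay ** age is an assumption.
    """
    return sum(count * decay ** age for age, count in enumerate(counts_by_block))


# 4 events in the current block, 2 in the previous one, 8 two blocks ago.
total = faded_total([4, 2, 8], decay=0.5)
```

With a decay of 0.5, the eight older events contribute only as much as two current ones, so recent activity dominates the total.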
When mining time series data with the models presented, the quantity of data must be considered because memory is limited. In the Landmark Model, it may take
to record one month of data at the smallest measurement unit of 1 minute. To overcome this storage problem, the Tilted-Time Window [9] was proposed. The timescale is divided into sections from the nearest to the farthest, and the nearer sections are recorded in more detail, as shown in Fig. 3.
With the same example, the total memory costs will be:
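As a rough illustration of this comparison, the unit counts can be worked out under stated assumptions: a 31-day month, and a tilted layout that keeps 60 one-minute slots for the latest hour, 24 one-hour slots for the latest day, and one slot per day for the month. Both the month length and the slot layout are illustrative assumptions rather than values taken from Fig. 3.

```python
DAYS_IN_MONTH = 31  # assumption; a 30-day month gives different totals


def landmark_units(days=DAYS_IN_MONTH):
    """One stored unit per minute for the whole month."""
    return 60 * 24 * days


def tilted_units(days=DAYS_IN_MONTH):
    """Assumed tilted layout: 60 minute slots + 24 hour slots + one slot per day."""
    return 60 + 24 + days


full = landmark_units()
tilted = tilted_units()
```

Under these assumptions, the tilted layout stores 115 units where the per-minute record stores 44,640.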
In this research, we emphasize the most recent LOs and estimate their degree of importance using the mining methodologies presented above, especially the Time-Fading Model and the Tilted-Time Window Model. LOs that span a long period of time are considered from a macro perspective through the integration of these methodologies.
Author reference: The system collects the citations of LOs created by the same author and sums them. We can keep track of the status of LOs, through the relationship between authors and LOs, by counting how many times a specific LO is downloaded. The citation count of a newly created LO defaults to zero.
Time reference (TR): It represents the number of citations in a specific timescale. If the citations of a specific LO increase suddenly in one timescale but the LO is used only a few times in the following days, the LO cannot be evaluated simply through its accumulated citations. We have to consider the citations that the LO has in each specific timescale. Hence, TR records how long an LO has persisted together with its corresponding citations, which improves the accuracy of the LO's weight.
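The idea of recording citations per timescale can be sketched as follows. This is a minimal illustration that assumes a monthly timescale and a list of citation dates as input; the function name `citations_per_month` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def citations_per_month(citation_dates):
    """Group citation dates into (year, month) buckets, oldest bucket first."""
    counts = Counter((d.year, d.month) for d in citation_dates)
    return dict(sorted(counts.items()))


# Hypothetical citation dates for a single LO.
history = citations_per_month(
    [date(2010, 1, 5), date(2010, 1, 20), date(2010, 3, 1)]
)
```

Keeping the counts separated by timescale makes a short burst of citations distinguishable from sustained use, which a single accumulated total would hide.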
General: Title, Language, Keyword, Coverage.
Educational: LearningResourceType, IntendedEndUserRole, TypicalAgeRange, Difficulty, TypicalLearningTime.
1. A user follows the selected search criteria to find relevant LOs in the repository. The search criteria we set can be classified into the following groups: Precise Criteria, Incremental Criteria, Precedence Criteria, Time/Duration Criteria, and Single/Multiple Choice Criteria, which have already been addressed in [36].
2. An alternative way is to allow users to input one or several keywords to start the first query. The query criteria here include the elements defined in LOM (keywords, language, difficulty, etc.). The system generates the query vector to process the queries and returns the relevant results, shown to users in a reusability tree.
3. We calculate the similarity of each search result and take the intersection to generate a representative set, called the positive union. After that, we compare the elements in the positive union with the first-query result lists to retrieve the irrelevant elements, which are called the negative union.
4. Then, we use the weighting mechanism we proposed and the diversity match function from previous work [25] to calculate the suggestion coefficient and the irrelevant degree. The system reorders the results based on the suggestion coefficient to form a new suggestion list. After that, we check the suggestion list against the negative union to see whether they have any elements in common.
5. To filter out the irrelevant query results [8], we select 10 elements in descending order from the suggestion list and check them against the negative union. Then, we add them to the suggested revised query vector.
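Steps 3 to 5 can be sketched roughly as follows. This is one possible reading of the procedure, not the authors' implementation: it assumes the suggestion list is the first-query result list reordered by suggestion coefficient, and that elements found in the negative union are dropped. The function name `refine_query` and the `suggestion_score` mapping are hypothetical; the actual weighting mechanism and diversity match function are not modeled.

```python
def refine_query(result_lists, first_query_results, suggestion_score, top_n=10):
    """Return (positive union, negative union, suggested elements)."""
    # Step 3: the positive union is the intersection of the result lists;
    # the negative union holds first-query results outside that intersection.
    positive = set(result_lists[0]).intersection(*result_lists[1:])
    negative = set(first_query_results) - positive

    # Step 4 (assumed reading): order the first-query results by a
    # precomputed suggestion coefficient, highest first.
    ranked = sorted(
        first_query_results,
        key=lambda lo: suggestion_score.get(lo, 0.0),
        reverse=True,
    )

    # Step 5: take the top elements and drop those in the negative union.
    suggested = [lo for lo in ranked[:top_n] if lo not in negative]
    return positive, negative, suggested


positive, negative, suggested = refine_query(
    result_lists=[["a", "b", "c"], ["b", "c", "d"]],
    first_query_results=["a", "b", "c", "e"],
    suggestion_score={"a": 0.9, "b": 0.8, "c": 0.7, "e": 0.1},
)
```

In this example, "a" has the highest score but appears in only one result list, so it falls into the negative union and is excluded from the suggested elements.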
Assumption 1: The resource retrieval time in the posttest is not better than that in the pretest.
Assumption 2: The resource retrieval time in the posttest is better than that in the pretest.
Quantities of LOs. Our registry system holds a total of 20,738 learning objects collected over the past years, although some of them are updated versions of existing LOs. This number can be regarded as a large amount of data. However, we still cannot reach good sampling coverage compared with some IR research domains.
External Connection. According to the definition of federated CORDRA, every subrepository has certain connections with the others. In practice, most of them are related only to the central registry system, which belongs to CNRI. Search performance across a large number of federated repositories still needs to be taken into consideration, even though searches run efficiently in our current repository.
Selected Timescale. We have to set the timescale for evaluating the representative degree of specific learning objects. However, varying the timescale may produce different results, and it is hard to define a standard evaluation timescale. Our registry system has been in development for five years. In this research, we use three years as the default timescale. Different timescales need to be considered to make our results more precise in practical situations.
N.Y. Yen is with the Department of Human Informatics and Cognitive Sciences, Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa-shi, Saitama 359-1192, Japan.
T.K. Shih is with the Department of Computer Science and Information Engineering, National Central University, No. 300, Jhongda Rd., Jhongli 32001, Taiwan. E-mail: firstname.lastname@example.org.
L.R. Chao is with the Department of Computer Science and Information Engineering, Tamkang University, No. 151, Ying-chuan Road Tamsui, Taipei County 25137, Taiwan. E-mail: email@example.com.
Q. Jin is with the Department of Human Informatics and Cognitive Sciences, Faculty of Human Sciences, Waseda University, 2-579-15 Mikajima, Tokorozawa-shi, Saitama 359-1192, Japan.
Manuscript received 15 Dec. 2009; revised 11 Mar. 2010; accepted 16 May 2010; published online 14 July 2010.
For information on obtaining reprints of this article, please send e-mail to: firstname.lastname@example.org, and reference IEEECS Log Number TLT-2009-12-0199.
Digital Object Identifier no. 10.1109/TLT.2010.15.
1. Any changes made while reusing specific LOs are recorded by the Hard SCORM authoring tool [37].
Neil Y. Yen received the master's degree from Tamkang University in 2008. He is currently doing research at Waseda University, Japan, under the supervision of both Professor Timothy K. Shih and Professor Qun Jin. He is also a research member in the Multimedia Information Networking Laboratory, Taiwan, and in the Networked Information Systems Laboratory, Japan. His research interests are in the scope of web information retrieval, distance learning technology, and social computing.
Timothy K. Shih is a professor with the National Central University, Taiwan. He was the dean of the College of Computer Science, Asia University, Taiwan, and the department chair of the Computer Science and Information Engineering Department at Tamkang University, Taiwan. His current research interests include multimedia computing and distance learning. Dr. Shih has edited many books and published more than 440 papers and book chapters, and has participated in many international academic activities, including the organization of more than 60 international conferences. He was the founder and co-editor-in-chief of the International Journal of Distance Education Technologies, published by Idea Group Publishing, United States. He is an associate editor of the ACM Transactions on Internet Technology and an associate editor of the IEEE Transactions on Learning Technologies. He was also an associate editor of the IEEE Transactions on Multimedia. Dr. Shih has received many research awards, including research awards from the National Science Council of Taiwan, the IIAS research award from Germany, the HSSS award from Greece, the Brandon Hall award from the United States, and several best paper awards from international conferences. He has been invited to give more than 30 keynote speeches and plenary talks in international conferences, as well as tutorials at IEEE ICME 2001 and 2006 and ACM Multimedia 2002 and 2007. Dr. Shih is a fellow of the Institution of Engineering and Technology (IET) and a senior member of the ACM and the IEEE. He also joined the Educational Activities Board of the IEEE Computer Society.
Louis R. Chao is a professor in the Department of Computer Science and Information Engineering at Tamkang University, Taiwan. In addition to being the founder of the Association of E-Learning, he is also the director of the Department of Computer Science, the chief of the Engineering Institute, the dean of academic affairs, and chancellor at Tamkang University. He has been involved with international conferences such as the International Conference on Computers in Education, the International Computer Symposium, and the National Computer Symposium, and established the International Journal of Information and Management during his time at the university. He is not only a pioneer of informationization and internationalization at Tamkang University, but also a pioneer of computer-assisted instruction. His research interests are in the fields of distance learning, networking and communication, information security, multimedia, neural networks, and fuzzy theory. He also has experience in enterprise organization, management, and tactics application.
Qun Jin is a tenured full professor in the Networked Information Systems Laboratory, Department of Human Informatics and Cognitive Sciences, Faculty of Human Sciences, Waseda University, Japan. He has been engaged extensively in research work in computer science, information systems, and social and human informatics. He seeks to exploit the rich interdependence between theory and practice in his work with interdisciplinary and integrated approaches. His recent research interests include user-centric ubiquitous computing, sustainable and secure information environments, user modeling, behavior informatics, information search and recommendation, human-computer interaction, e-learning support, and computing for well-being. He is a member of the IEEE.