Wikipedia and how to use it for semantic document representation
Found in: 2011 26th IEEE/ACM International Conference on Automated Software Engineering
By Ian H. Witten
Issue Date:November 2011
pp. 1
Summary form only given. With petascale systems becoming broadly available in high end computing, attention is now focused on the challenges associated with the next major performance milestone: exascale computing. Demand for computational capability grows...
   
A new framework for building digital library collections
Found in: Digital Libraries, Joint Conference on
By Katherine J. Don, Ian H. Witten, David Bainbridge, George Buchanan
Issue Date:June 2005
pp. 23-31
This paper introduces a new framework for building digital library collections and contrasts it with existing systems. It describes a significant new step in the development of a widely-used open-source digital library system, Greenstone, which has evolved...
 
Measuring inter-indexer consistency using a thesaurus
Found in: Digital Libraries, Joint Conference on
By Ian H. Witten, Olena Medelyan
Issue Date:June 2006
pp. 274-275
When professional indexers independently assign terms to a given document, the term sets generally differ between indexers. Studies of inter-indexer consistency measure the percentage of matching index terms, but none of them consider the semantic relation...
 
Teaching Agents to Learn: From User Study to Implementation
Found in: Computer
By David Maulsby, Ian H. Witten
Issue Date:November 1997
pp. 36-44
Graphical user interfaces have helped center computer use on viewing and editing, rather than on programming. Yet the need for end-user programming continues to grow. Software developers have responded to the demand with a barrage of customizable ...
 
Creating and Reading Realistic Electronic Books
Found in: Computer
By Veronica Liesaputra, Ian H. Witten, David Bainbridge
Issue Date:February 2009
pp. 72-81
A digital library project aims to combine the look and feel of physical books with the advantages of online documents such as hyperlinks and multimedia. A lightweight open source implementation enables highly responsive page turning and works within standa...
 
Grasping Society's Treasure Trove of Information
Found in: Mexican International Conference on Computer Science
By Ian H. Witten
Issue Date:September 2007
pp. xvi
Search engines --
   
Mining Domain-Specific Thesauri from Wikipedia: A Case Study
Found in: Web Intelligence, IEEE / WIC / ACM International Conference on
By David Milne, Olena Medelyan, Ian H. Witten
Issue Date:December 2006
pp. 442-448
Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia. In a comparison with a professional thesaurus for agricultu...
 
Thesaurus based automatic keyphrase indexing
Found in: Digital Libraries, Joint Conference on
By Ian H. Witten, Olena Medelyan
Issue Date:June 2006
pp. 296-297
We propose a new method that enhances automatic keyphrase extraction by using semantic information on terms and phrases gleaned from a domain-specific thesaurus. We evaluate the results against keyphrase sets assigned by a state-of-the-art keyphrase extrac...
 
Building digital library collections with greenstone
Found in: Digital Libraries, Joint Conference on
By David Bainbridge, Ian H. Witten
Issue Date:June 2005
pp. 425-425
This tutorial will demonstrate how to build a variety of different kinds of digital library collections with the Greenstone digital library software, a comprehensive, open-source system for constructing, presenting, and maintaining information collections....
 
Practical digital library interoperability standards
Found in: Digital Libraries, Joint Conference on
By Ian H. Witten, David Bainbridge
Issue Date:June 2005
pp. 426-426
As the field of digital libraries matures and new systems and standards develop, the ability to interoperate between systems becomes paramount. This tutorial gives a practical introduction to many recent standards and de facto standards for interoperabilit...
 
Realistic Books: A Bizarre Homage to an Obsolete Medium?
Found in: Digital Libraries, Joint Conference on
By Yi-Chun Chu, David Bainbridge, Matt Jones, Ian H. Witten
Issue Date:June 2004
pp. 78-86
For many readers, handling a physical book is an enjoyably exquisite part of the information seeking process. Many physical characteristics of a book (its size, heft, the patina of use on its pages and so on) communicate ambient qualities of the document it ...
 
Greenstone Digital Library Software: Current Research
Found in: Digital Libraries, Joint Conference on
By David Bainbridge, Ian H. Witten
Issue Date:June 2004
pp. 416-416
The Greenstone digital library software (www.greenstone.org) provides a flexible way of organizing information and publishing it on the Internet or removable media such as CD-ROM. Its aim is to empower users, particularly in universities, libraries and other...
   
How to Turn the Page
Found in: Digital Libraries, Joint Conference on
By Yi-Chun Chu, Ian H. Witten, Richard Lobb, David Bainbridge
Issue Date:May 2003
pp. 186
Can digital libraries provide a reading experience that more closely resembles a real book than a scrolled or paginated electronic display? This paper describes a prototype page-turning system that realistically animates full three-dimensional page-turns. ...
 
Assembling and Enriching Digital Library Collections
Found in: Digital Libraries, Joint Conference on
By David Bainbridge, John Thompson, Ian H. Witten
Issue Date:May 2003
pp. 323
People who create digital libraries need to gather together the raw material, add metadata as necessary, and design and build new collections. This paper sets out the requirements for these tasks and describes a new tool that supports them interactively, m...
 
Power to the People: End-User Building of Digital Library Collections
Found in: Digital Libraries, Joint Conference on
By Ian H. Witten, David Bainbridge, Stefan J. Boddie
Issue Date:June 2001
pp. 94-103
Naturally, digital library systems focus principally on the reader: the consumer of the material that constitutes the library. In contrast, this paper describes an interface that makes it easy for people to build their own library collections. Collections ...
 
Text Mining: A New Frontier for Lossless Compression
Found in: Data Compression Conference
By Ian H. Witten, Zane Bray, Malika Mahoui, Bill Teahan
Issue Date:March 1999
pp. 198
Data mining, a burgeoning new technology, is about looking for patterns in data. Likewise, text mining is about looking for patterns in text. It may be defined as the process of analyzing text to extract information that is useful for particular purposes. ...
 
Lexical Attraction for Text Compression
Found in: Data Compression Conference
By Joscha Bach, Ian H. Witten
Issue Date:March 1999
pp. 516
The best methods of text compression work by conditioning each symbol's probability on its predecessors. Prior symbols establish a context that governs the probability distribution for the next one, and the actual next symbol is encoded with respect to thi...
   
Protein Is Incompressible
Found in: Data Compression Conference
By Craig G. Nevill-Manning, Ian H. Witten
Issue Date:March 1999
pp. 257
Life is based on two polymers, DNA and protein, whose properties can be described in a simple text file. It is natural to expect that standard text compression techniques would work on biological sequences as they do on English text. But biological sequenc...
 
Managing Complexity in a Distributed Digital Library
Found in: Computer
By Ian H. Witten, Rodger J. McNab, Steve Jones, Mark Apperley, David Bainbridge, Sally Jo Cunningham
Issue Date:February 1999
pp. 74-79
As the capabilities of distributed digital libraries increase, managing organizational and software complexity becomes a key issue. How can collections and indexes be updated without impacting queries currently in progress? How can the system hand...
 
A Distributed Digital Library Architecture Incorporating Different Index Styles
Found in: Advances in Digital Libraries Conference, IEEE
By Rodger J. McNab, Ian H. Witten, Stefan J. Boddie
Issue Date:April 1998
pp. 36
The New Zealand Digital Library offers several collections of information over the World Wide Web. Although full-text indexing is the primary access mechanism, musical collections can also be accessed through a novel melody retrieval system. In offering th...
 
Linear-Time, Incremental Hierarchy Inference for Compression
Found in: Data Compression Conference
By Craig G. Nevill-Manning, Ian H. Witten
Issue Date:March 1997
pp. 3
Data compression and learning are, in some sense, two sides of the same coin. If we paraphrase Occam's razor by saying that a small theory is better than a larger theory with the same explanatory power, we can characterize data compression as a preoccupati...
 
The Development of Holte's 1R Classifier
Found in: Artificial Neural Networks and Expert Systems, New Zealand Conference
By Craig G. Nevill-Manning, Geoffrey Holmes, Ian H. Witten
Issue Date:November 1995
pp. 239
The 1R machine learning scheme is a very simple one that proves surprisingly effective on the standard datasets commonly used for evaluation. This paper describes the method and discusses two aspects of the algorithm that bear further analysis: the way tha...
 
Displaying 3D Images: Algorithms for Single-Image Random-Dot Stereograms
Found in: Computer
By Harold W. Thimbleby, Stuart Inglis, Ian H. Witten
Issue Date:October 1994
pp. 38-48
A new, simple, and symmetric algorithm can be implemented that results in higher levels of detail in solid objects than previously possible with autostereograms. In a stereoscope, an optical instrument similar to binoculars, each eye views a diffe...
 
The Reactive Keyboard: A Predictive Typing Aid
Found in: Computer
By John J. Darragh, Ian H. Witten, Mark L. James
Issue Date:November 1990
pp. 41-49
The Reactive Keyboard, a device that accelerates typewritten communication with a computer system by predicting what the user is going to type next, is described. The version described is designed for entering free text and runs on any Apple Macint...
 
Clustering Documents with Active Learning Using Wikipedia
Found in: Data Mining, IEEE International Conference on
By Anna Huang, David Milne, Eibe Frank, Ian H. Witten
Issue Date:December 2008
pp. 839-844
Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this paper we propose to exploit the semantic knowledge in Wikipedia for clustering, e...
 
Using Compression to Identify Acronyms in Text
Found in: Data Compression Conference
By Stuart Yeates, David Bainbridge, Ian H. Witten
Issue Date:March 2000
pp. 582
Finding acronyms and their definitions in free text is useful for many purposes. Previous acronym definition finders relied heavily on heuristic methods. In contrast, we have developed a new method that uses several PPM models to encode the acronym in term...
   
Text Categorization Using Compression Models
Found in: Data Compression Conference
By Eibe Frank, Chang Chui, Ian H. Witten
Issue Date:March 2000
pp. 555
Text categorization is the assignment of natural language texts to predefined categories based on their content. It has often been observed that compression seems to provide a very promising approach to categorization. The overall compression of an article...
   
A Bookmaker's Workbench
Found in: Proceedings of the 12th Annual Conference of the New Zealand Chapter of the ACM Special Interest Group on Computer-Human Interaction (CHINZ '11)
By Ian H. Witten, Veronica Liesaputra
Issue Date:July 2011
pp. 1-8
We have been developing electronic Realistic Books that combine the natural advantages of electronic documents---full-text search, hyperlinks, animation, multimedia---with those of conventional books---the ambient information provided by the physical objec...
     
Perambulating libraries: demonstrating how a victorian idea can help OLPC users share books
Found in: Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries (JCDL '11)
By David Bainbridge, Ian H. Witten
Issue Date:June 2011
pp. 471-472
In this extended abstract we detail how the open source digital library toolkit Greenstone [4] can help users of the XO laptop produced by the One Laptop Per Child Foundation manage and share electronic documents. The idea draws upon mobile libraries (bookm...
     
Exploring Wikipedia with HMpara
Found in: Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries (JCDL '11)
By David N. Milne, Ian H. Witten
Issue Date:June 2011
pp. 453-454
Experimental evaluation and comparison of techniques, algorithms or complete systems is a crucial requirement to assess the practical impact of research results. The quality of published experimental results is usually limited due to several reasons such a...
     
A link-based visual search engine for Wikipedia
Found in: Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries (JCDL '11)
By David N. Milne, Ian H. Witten
Issue Date:June 2011
pp. 223-226
This paper introduces HMpara, a new search engine that aims to make Wikipedia easier to explore. It works on top of the encyclopedia's existing link structure, abstracting away from document content and allowing users to navigate the resource at a higher l...
     
Subject metadata support powered by Maui
Found in: Proceedings of the 10th annual joint conference on Digital libraries (JCDL '10)
By Ian H. Witten, Olena Medelyan, Vye Perrone
Issue Date:June 2010
pp. 407-408
This extended abstract describes recent work in combining interactive map functionality with the Greenstone 3 digital library software research framework.
     
Learning to link with wikipedia
Found in: Proceeding of the 17th ACM conference on Information and knowledge mining (CIKM '08)
By David Milne, Ian H. Witten
Issue Date:October 2008
pp. 1001-1001
This paper describes how to automatically cross-reference documents with Wikipedia: the largest knowledge base ever known. It explains how machine learning can be used to identify significant terms within unstructured text, and enrich it with links to the ...
     
Running greenstone on an ipod
Found in: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries (JCDL '08)
By David Bainbridge, Ian H. Witten, Matt Jones, Sam McIntosh, Steve Jones
Issue Date:June 2008
pp. 597-617
The open source digital library software Greenstone is demonstrated running on an iPod. The standalone configuration supports browsing, searching and displaying documents in a range of media formats. Plugged in to a host computer (Mac, Linux, or Windows), ...
     
A fedora librarian interface
Found in: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries (JCDL '08)
By David Bainbridge, Ian H. Witten
Issue Date:June 2008
pp. 597-617
The Fedora content management system embodies a powerful and flexible digital object model. This paper describes a new open-source software front-end that enables end-user librarians to transfer documents and metadata in a variety of formats into a Fedora ...
     
Portable digital libraries on an ipod
Found in: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries (JCDL '08)
By David Bainbridge, Ian H. Witten, Matt Jones, Sam McIntosh, Steve Jones
Issue Date:June 2008
pp. 597-617
This paper describes the facilities we built to run a self-contained digital library on an iPod. The digital library software used was the open source package Greenstone, and the paper highlights the technical problems that were encountered and solved. It ...
     
A competitive environment for exploratory query expansion
Found in: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries (JCDL '08)
By David M. Nichols, David Milne, Ian H. Witten
Issue Date:June 2008
pp. 597-617
Most information workers query digital libraries many times a day. Yet people have little opportunity to hone their skills in a controlled environment, or compare their performance with others in an objective way. Conversely, although search engine logs re...
     
Lightweight realistic books: the greenstone connection
Found in: Proceedings of the 2007 conference on Digital libraries (JCDL '07)
By David Bainbridge, Ian H. Witten, Veronica Liesaputra
Issue Date:June 2007
pp. 502-502
The Cité de la musique in Paris has recently opened a new media Library. One of the Library's assignments is the dissemination of the Cité de la musique's collection of recorded concerts. This paper presents the concert's description model implem...
     
A retrospective look at Greenstone: lessons from the first decade
Found in: Proceedings of the 2007 conference on Digital libraries (JCDL '07)
By David Bainbridge, Ian H. Witten
Issue Date:June 2007
pp. 147-156
The Greenstone Digital Library Software has helped spread the practical impact of digital library technology throughout the world, with particular emphasis on developing countries. As Greenstone enters its second decade, this article takes a retrospective ...
     
Thesaurus based automatic keyphrase indexing
Found in: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries (JCDL '06)
By Ian H. Witten, Olena Medelyan
Issue Date:June 2006
pp. 296-297
We propose a new method that enhances automatic keyphrase extraction by using semantic information on terms and phrases gleaned from a domain-specific thesaurus. We evaluate the results against keyphrase sets assigned by a state-of-the-art keyphrase extrac...
     
Measuring inter-indexer consistency using a thesaurus
Found in: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries (JCDL '06)
By Ian H. Witten, Olena Medelyan
Issue Date:June 2006
pp. 274-275
When professional indexers independently assign terms to a given document, the term sets generally differ between indexers. Studies of inter-indexer consistency measure the percentage of matching index terms, but none of them consider the semantic relation...
     
Document level interoperability for collection creators
Found in: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries (JCDL '06)
By David Bainbridge, Ian H. Witten, Kaun Yu (Jeffrey) Ke
Issue Date:June 2006
pp. 105-106
Digital library interoperability for both documents and metadata is a critical and complex issue. Although many relevant standards have been developed, and continue to evolve, in practice things are not quite so easy as they seem. We have built a software ...
     
Practical digital library interoperability standards
Found in: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries (JCDL '05)
By David Bainbridge, Ian H. Witten
Issue Date:June 2005
pp. 426-426
As the field of digital libraries matures and new systems and standards develop, the ability to interoperate between systems becomes paramount. This tutorial gives a practical introduction to many recent standards and de facto standards for interoperabilit...
     
Building digital library collections with greenstone
Found in: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries (JCDL '05)
By David Bainbridge, Ian H. Witten
Issue Date:June 2005
pp. 425-425
This tutorial will demonstrate how to build a variety of different kinds of digital library collections with the Greenstone digital library software, a comprehensive, open-source system for constructing, presenting, and maintaining information collections....
     
A new framework for building digital library collections
Found in: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries (JCDL '05)
By David Bainbridge, George Buchanan, Ian H. Witten, Katherine J. Don
Issue Date:June 2005
pp. 23-31
This paper introduces a new framework for building digital library collections and contrasts it with existing systems. It describes a significant new step in the development of a widely-used open-source digital library system, Greenstone, which has evolved...
     
Hands-on workshop: build your own digital library collections
Found in: Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries (JCDL '02)
By David Bainbridge, Ian H. Witten
Issue Date:July 2002
pp. 420-420
This tutorial is intended for people who have a basic familiarity with the function and structure of thesauri and ontologies. It will introduce criteria for the design and evaluation of thesauri and ontologies and then deal with methods and tools for their...
     
How to build a digital library using open-source software
Found in: Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries (JCDL '02)
By Ian H. Witten
Issue Date:July 2002
pp. 416-416
This introductory tutorial is intended for anyone concerned with subject access to digital libraries. It provides a bridge by presenting methods of subject access as treated in an information studies program for those coming to digital libraries from other...
     
The Greenstone plugin architecture
Found in: Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries (JCDL '02)
By David Bainbridge, Gordon Paynter, Ian H. Witten, Stefan Boddie
Issue Date:July 2002
pp. 285-286
This note describes how the Greenstone digital library system uses "plugins" to import documents and metadata in different formats, and associate metadata with the appropriate documents. Plugins that import documents can perform their own format conversion...
     
Power to the people: end-user building of digital library collections
Found in: Proceedings of the first ACM/IEEE-CS joint conference on Digital libraries (JCDL '01)
By David Bainbridge, Ian H. Witten, Stefan J. Boddie
Issue Date:January 2001
pp. 94-103
Naturally, digital library systems focus principally on the reader: the consumer of the material that constitutes the library. In contrast, this paper describes an interface that makes it easy for people to build their own library collections. Collections...
     
Scalable browsing for large collections: a case study
Found in: Proceedings of the fifth ACM conference on Digital libraries (DL '00)
By George Buchanan, Gordon W. Paynter, Ian H. Witten, Sally Jo Cunningham
Issue Date:June 2000
pp. 215-223
Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of...
     