Displaying 1-49 out of 49 total
WebSCSA: Web Search by Constrained Spreading Activation
Found in: Advances in Digital Libraries Conference, IEEE
By Fabio Crestani, Puay Leng Lee
Issue Date:March 1999
pp. 163
We present WebSCSA, an experimental Web search system based on the Constrained Spreading Activation model. WebSCSA performs an autonomous search by navigation using an algorithm based on the Constrained Spreading Activation model to find Web pages that are...
 
Finding Participants in a Chat: Authorship Attribution for Conversational Documents
Found in: 2013 International Conference on Social Computing (SocialCom)
By Giacomo Inches, Morgan Harvey, Fabio Crestani
Issue Date:September 2013
pp. 272-279
In this work we study the problem of Authorship Attribution for a novel set of documents, namely online chats. Although the problem of Authorship Attribution has been extensively investigated for different document types, from books to letters and from ema...
 
Experimental Results on the Aggregation Methods in Blog Distillation
Found in: Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on
By Mostafa Keikha, Fabio Crestani
Issue Date:September 2009
pp. 151-154
This paper addresses the blog distillation problem, that is, given a user query find the blogs that are most related to the query topic. We model each post as evidence about relevance of a blog to the query, and use aggregation methods like Ordered Weighte...
 
Discovering Significant Patterns in Multi-stream Sequences
Found in: Data Mining, IEEE International Conference on
By Robert Gwadera, Fabio Crestani
Issue Date:December 2008
pp. 827-832
Discovering significant patterns in synchronized multi-stream sequences, also known as multi-attribute event sequences (multi-sequences), is an important problem in many domains, including monitoring systems and information retrieval. In this paper we propo...
 
Towards an Automated Approach to Offender Profiling
Found in: Computational Science and its Applications, International Conference
By Richard Bache, Fabio Crestani
Issue Date:July 2008
pp. 537-545
Offender profiling seeks to infer characteristics of an offender from the observed features of crimes he or she has committed. Traditionally such an approach has been subjective and required expert opinion. Here we propose an approach based on Language Mod...
 
Application of Language Models to Suspect Prioritisation and Suspect Likelihood in Serial Crimes
Found in: Information Assurance and Security, International Symposium on
By Richard Bache, Fabio Crestani, David Canter, Donna Youngs
Issue Date:August 2007
pp. 399-404
Language Models are successfully applied to the problem of analysing crime descriptions from a police database with the purpose of prioritising suspects for an unsolved crime, given details of solved crimes. The frequency of terms in each description relat...
 
Design and Implementation of a Cross-Media Indexing System for the Reveal-This System
Found in: Automated Production of Cross Media Content for Multi-Channel Distribution, International Conference on
By Murat Yakici, Fabio Crestani
Issue Date:December 2006
pp. 157-164
Despite the vast growth of heterogeneous, multimedia and increasingly multi-lingual digital content, there is a lack of integrated technology that facilitates its effective usage. The need is being expressed insistently by end-users, and professionals in c...
 
Effects of Word Recognition Errors in Spoken Query Processing
Found in: Advances in Digital Libraries Conference, IEEE
By Fabio Crestani
Issue Date:May 2000
pp. 39
The effects of word recognition errors (WRE) in Spoken Document Retrieval have been well studied and well reported in recent Information Retrieval (IR) literature. Much less experimental work has been devoted to studying the effects of WRE in Spoken Query...
 
The DILIGENT framework for distributed information retrieval
Found in: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '07)
By Fabio Crestani, Fabio Simeoni, Ralf Bierig
Issue Date:July 2007
pp. 781-782
It is often argued that in information extraction (IE), certain machine learning (ML) approaches save development time over others, or that certain ML methods (e.g. Active Learning) require less training data than others, thus saving development cost. Howe...
     
Generalizing diversity detection in blog feed retrieval
Found in: Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM '13)
By Bruce Croft, Fabio Crestani, Mostafa Keikha
Issue Date:October 2013
pp. 1201-1204
The goal of a blog retrieval system is to retrieve and rank blogs, as collections of documents, in response to a given query. Previous studies have shown that diversity among the top retrieved posts from a blog is a positive feature for indicating relevanc...
     
Building user profiles from topic models for personalised search
Found in: Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM '13)
By Fabio Crestani, Mark J. Carman, Morgan Harvey
Issue Date:October 2013
pp. 2309-2314
Personalisation is an important area in the field of IR that attempts to adapt ranking algorithms so that the results returned are tuned towards the searcher's interests. In this work we use query logs to build personalised ranking models in which user pro...
     
Leveraging conceptual lexicon: query disambiguation using proximity information for patent retrieval
Found in: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '13)
By Fabio Crestani, Jimmy Xiangji Huang, Parvaz Mahdabi, Shima Gerani
Issue Date:July 2013
pp. 113-122
Patent prior art search is a task in patent retrieval where the goal is to rank documents which describe prior art work related to a patent application. One of the main properties of patent retrieval is that the query topic is a full patent application and...
     
Aggregation Methods for Proximity-Based Opinion Retrieval
Found in: ACM Transactions on Information Systems (TOIS)
By Fabio Crestani, Mark Carman, Shima Gerani
Issue Date:November 2012
pp. 1-36
The enormous amount of user-generated data available on the Web provides a great opportunity to understand, analyze, and exploit people’s opinions on different topics. Traditional Information Retrieval methods consider the relevance of documents to a...
     
Diversity in blog feed retrieval
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Fabio Crestani, Mostafa Keikha, W. Bruce Croft
Issue Date:October 2012
pp. 525-534
Blog distillation (blog feed retrieval) is a task in blog retrieval where the goal is to rank blogs according to their recurrent relevance to a query topic. One of the main properties of blog feed retrieval is that the unit of retrieval is a collection of ...
     
Unsupervised linear score normalization revisited
Found in: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '12)
By Avi Arampatzis, Fabio Crestani, Ilya Markov
Issue Date:August 2012
pp. 1161-1162
We give a fresh look into score normalization for merging result-lists, isolating the problem from other components. We focus on three of the simplest, practical, and widely-used linear methods which do not require any training data, i.e. MinMax, Sum, and ...
     
Automatic refinement of patent queries using concept importance predictors
Found in: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '12)
By Fabio Crestani, Linda Andersson, Mostafa Keikha, Parvaz Mahdabi
Issue Date:August 2012
pp. 505-514
Patent prior art queries are full patent applications which are much longer than standard web search topics. Such queries are composed of hundreds of terms and do not represent a focused information need. One way to make the queries more focused is to sele...
     
On the generation of rich content metadata from social media
Found in: Proceedings of the 3rd international workshop on Search and mining user-generated contents (SMUC '11)
By Andrea Basso, Fabio Crestani, Giacomo Inches
Issue Date:October 2011
pp. 85-92
This contribution proposes a framework to generate auxiliary rich TV content metadata by processing social networks data. Based on simple criteria to identify authoritative social media sources, we have analysed Twitter short messages relative to TV progra...
     
Online conversation mining for author characterization and topic identification
Found in: Proceedings of the 4th workshop for Ph.D. students in information & knowledge management (PIKM '11)
By Fabio Crestani, Giacomo Inches
Issue Date:October 2011
pp. 19-26
The increasing popularity of online-based services (Twitter, Facebook, IRC, Myspace, blogs, to mention just a few) results in the production of a huge amount of novel documents. These documents present properties that cannot be found in standard edite...
     
Predicting document effectiveness in pseudo relevance feedback
Found in: Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM '11)
By Fabio Crestani, Jangwon Seo, Mostafa Keikha, W. Bruce Croft
Issue Date:October 2011
pp. 2061-2064
Pseudo relevance feedback (PRF) is one of the effective practices in Information Retrieval. In particular, PRF via the relevance model (RM) has been widely used due to its theoretical soundness and effectiveness. In a PRF scenario, an underlying relevance mode...
     
Bayesian latent variable models for collaborative item rating prediction
Found in: Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM '11)
By Fabio Crestani, Ian Ruthven, Mark J. Carman, Morgan Harvey
Issue Date:October 2011
pp. 699-708
Collaborative filtering systems based on ratings make it easier for users to find content of interest on the Web and as such they constitute an area of much research. In this paper we first present a Bayesian latent variable model for rating prediction tha...
     
Aggregating multiple opinion evidence in proximity-based opinion retrieval
Found in: Proceedings of the 34th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '11)
By Fabio Crestani, Mostafa Keikha, Shima Gerani
Issue Date:July 2011
pp. 1199-1200
Blog post opinion retrieval is the problem of ranking blog posts according to the likelihood that the post is relevant to the query and that the author was expressing an opinion about the topic (of the query). A recent study has proposed a method for findi...
     
Time-based relevance models
Found in: Proceedings of the 34th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '11)
By Fabio Crestani, Mostafa Keikha, Shima Gerani
Issue Date:July 2011
pp. 1087-1088
This paper addresses blog feed retrieval where the goal is to retrieve the most relevant blog feeds for a given user query. Since the retrieval unit is a blog, as a collection of posts, performing relevance feedback techniques and selecting the most approp...
     
Relevance stability in blog retrieval
Found in: Proceedings of the 2011 ACM Symposium on Applied Computing (SAC '11)
By Fabio Crestani, Mostafa Keikha, Shima Gerani
Issue Date:March 2011
pp. 1119-1123
This paper investigates blog distillation where the goal is to rank blogs according to their recurrent relevance to the topic of the query. One of the main features of blogs is their relation to time but this important feature is under-utilized in the curr...
     
Defining ontology by using users collaboration on social media
Found in: Proceedings of the ACM 2011 conference on Computer supported cooperative work (CSCW '11)
By Fabio Crestani, Saman Kamran
Issue Date:March 2011
pp. 657-660
This novel method is proposed for building a reliable ontology around specific concepts, by using the immense potential of active volunteering collaboration of detected knowledgeable users on social media.
     
Towards query log based personalization using topic models
Found in: Proceedings of the 19th ACM international conference on Information and knowledge management (CIKM '10)
By Fabio Crestani, Mark Baillie, Mark J. Carman, Morgan Harvey
Issue Date:October 2010
pp. 1849-1852
We investigate the utility of topic models for the task of personalizing search results based on information present in a large query log. We define generative models that take both the user and the clicked document into account when estimating the probabi...
     
Proximity-based opinion retrieval
Found in: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '10)
By Fabio Crestani, Mark James Carman, Shima Gerani
Issue Date:July 2010
pp. 403-410
Blog post opinion retrieval aims at finding blog posts that are relevant and opinionated about a user's query. In this paper we propose a simple probabilistic model for assigning relevant opinion scores to documents. The key problem is how to capture opini...
     
Mining and ranking streams of news stories using cross-stream sequential patterns
Found in: Proceeding of the 18th ACM conference on Information and knowledge management (CIKM '09)
By Fabio Crestani, Robert Gwadera
Issue Date:November 2009
pp. 1709-1712
We present a new method for mining and ranking streams of news stories using cross-stream sequential patterns and content similarity. In particular, we focus on stories reporting the same event across the streams within a given time window, where an event ...
     
Blog distillation using random walks
Found in: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '09)
By Fabio Crestani, Mark James Carman, Mostafa Keikha
Issue Date:July 2009
pp. 435-435
This paper addresses the blog distillation problem. That is, given a user query find the blogs most related to the query topic. We model the blogosphere as a single graph that includes extra information besides the content of the posts. By performing a ran...
     
A statistical comparison of tag and query logs
Found in: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '09)
By Fabio Crestani, Mark Baillie, Mark J. Carman, Robert Gwadera
Issue Date:July 2009
pp. 435-435
We investigate tag and query logs to see if the terms people use to annotate websites are similar to the ones they use to query for them. Over a set of URLs, we compare the distribution of tags used to annotate each URL with the distribution of query terms...
     
Tag data and personalized information retrieval
Found in: Proceeding of the 2008 ACM workshop on Search in social media (SSM '08)
By Fabio Crestani, Mark Baillie, Mark J. Carman
Issue Date:October 2008
pp. 1001-1001
Researchers investigating personalization techniques for Web Information Retrieval face a challenge; that the data required to perform evaluations, namely query logs and click-through data, is not readily available due to valid privacy concerns. One option...
     
Estimating real-valued characteristics of criminals from their recorded crimes
Found in: Proceeding of the 17th ACM conference on Information and knowledge management (CIKM '08)
By Fabio Crestani, Richard Bache
Issue Date:October 2008
pp. 1001-1001
Offender profiling concerns making inferences about a criminal from the crime(s) he has committed. Where descriptions of the crimes are recorded electronically, text mining techniques provide a means by which recorded characteristics of the offenders can be...
     
Towards personalized distributed information retrieval
Found in: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08)
By Fabio Crestani, Mark J. Carman
Issue Date:July 2008
pp. 2-2
Our aim is to investigate if and how the performance of Distributed Information Retrieval (DIR) systems can be improved through personalization. Toward this aim we are building a testbed of document collections and corresponding personalized relevance judg...
     
Language models, probability of relevance and relevance likelihood
Found in: Proceedings of the sixteenth ACM conference on information and knowledge management (CIKM '07)
By Fabio Crestani
Issue Date:November 2007
pp. 853-856
This paper proposes a measure of relevance likelihood derived specifically for language models. Such a measure may be used to guide a user on how far to browse through the list of retrieved items or for pseudo-relevance feedback. To derive this measure, it...
     
Modelling epistemic uncertainty in IR evaluation
Found in: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '07)
By Fabio Crestani, Ian Ruthven, Mark Baillie, Murat Yakici
Issue Date:July 2007
pp. 769-770
Modern information retrieval (IR) test collections violate the completeness assumption of the Cranfield paradigm. In order to maximise the available resources, only a sample of documents (i.e. the pool) are judged for relevance by a human assessor(s). The ...
     
MUIA 2006: third international workshop on mobile and ubiquitous information access
Found in: Proceedings of the 8th conference on Human-computer interaction with mobile devices and services (MobileHCI '06)
By Fabio Crestani, Matt Jones, Stefano Mizzaro
Issue Date:September 2006
pp. 299-300
The recent trend towards pervasive computing and information technology becoming omnipresent and entering all aspects of modern living, means that we are moving away from the traditional interaction paradigm between human and technology being that of the d...
     
PENG: integrated search of distributed news archives
Found in: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06)
By Fabio Crestani, Mark Baillie, Monica Landoni
Issue Date:August 2006
pp. 607-608
We consider the problem of evaluating retrieval systems using a limited number of relevance judgments. Recent work has demonstrated that one can accurately estimate average precision via a judged pool corresponding to a relatively small random sample of do...
     
Adaptive query-based sampling for distributed IR
Found in: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06)
By Fabio Crestani, Leif Azzopardi, Mark Baillie
Issue Date:August 2006
pp. 605-606
We consider the problem of evaluating retrieval systems using a limited number of relevance judgments. Recent work has demonstrated that one can accurately estimate average precision via a judged pool corresponding to a relatively small random sample of do...
     
Towards better measures: evaluation of estimated resource description quality for distributed IR
Found in: Proceedings of the 1st international conference on Scalable information systems (InfoScale '06)
By Fabio Crestani, Leif Azzopardi, Mark Baillie
Issue Date:May 2006
pp. 41-es
An open problem for Distributed Information Retrieval systems (DIR) is how to represent large document repositories, also known as resources, both accurately and efficiently. Obtaining resource description estimates is an important phase in DIR, especially...
     
An evaluation of resource description quality measures
Found in: Proceedings of the 2006 ACM symposium on Applied computing (SAC '06)
By Fabio Crestani, Leif Azzopardi, Mark Baillie
Issue Date:April 2006
pp. 1110-1111
An open problem for Distributed Information Retrieval is how to represent large document repositories (known as resources) efficiently. To facilitate resource selection, estimated descriptions of each resource are required, especially when faced with non-c...
     
Editorial message: special track on information access and retrieval
Found in: Proceedings of the 2006 ACM symposium on Applied computing (SAC '06)
By Fabio Crestani, Gabriella Pasi
Issue Date:April 2006
pp. 1018-1019
Information Retrieval (IR) aims at modelling, designing and implementing systems able to provide fast and effective content-based access to a large amount of information. Information can be of any kind: textual, visual, or auditory. The aim of such systems...
     
Editorial message: special track on information access and retrieval
Found in: Proceedings of the 2005 ACM symposium on Applied computing (SAC '05)
By Fabio Crestani, Gabriella Pasi
Issue Date:March 2005
pp. 1009-1010
Information access technologies, like for example Information Retrieval (IR) and Information Filtering (IF), aim at modelling, designing and implementing systems able to provide fast and effective content-based access to a large amount of information. Info...
     
Data fusion with estimated weights
Found in: Proceedings of the eleventh international conference on Information and knowledge management (CIKM '02)
By Fabio Crestani, Shengli Wu
Issue Date:November 2002
pp. 648-651
This paper proposes an adaptive approach for data fusion of information retrieval systems, which exploits estimated performances of all component input systems without relevance judgement or training. The estimation is conducted prior to the fusion but uses...
     
Experimenting with graphical user interfaces for structured document retrieval
Found in: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '02)
By Fabio Crestani, Jesus Vegas, Pablo de la Fuente
Issue Date:August 2002
pp. 373-374
We compare standard global IR searching with user-centric localized techniques to address the database selection problem. We conduct a series of experiments to compare the retrieval effectiveness of three separate search modes applied to a hierarchically s...
     
Mobile delivery of news using hierarchical query-biased summaries
Found in: Proceedings of the 2002 ACM symposium on Applied computing (SAC '02)
By Anastasios Tombros, Fabio Crestani, Simon O. Sweeney
Issue Date:March 2002
pp. 634-639
This paper presents the results of a study aimed at measuring the usefulness of presenting the results of an Information Retrieval search on WAP mobile phones. The experimentation focuses on presenting automatically-generated summaries of newspaper article...
     
Editorial message: special track on information access and retrieval systems
Found in: Proceedings of the 2002 ACM symposium on Applied computing (SAC '02)
By Fabio Crestani, Gabriella Pasi
Issue Date:March 2002
pp. 613-614
The cost for testing integrated circuits represents a growing percentage of the total cost for their production. The former strictly depends on the length of the test session, and its reduction has been the target of many efforts in the past. This paper pr...
     
Towards the use of prosodic information for spoken document retrieval
Found in: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '01)
By Fabio Crestani
Issue Date:September 2001
pp. 420-421
Topic segmentation is an important initial step in many text-based tasks. A hierarchical representation of a text's topics is useful in retrieval and allows judging relevancy at different levels of detail. This short paper describes research on generic algo...
     
Vocal access to a newspaper archive: design issues and preliminary investigations
Found in: Proceedings of the fourth ACM conference on Digital libraries (DL '99)
By Fabio Crestani
Issue Date:August 1999
pp. 59-66
This presentation focuses on the strategic design planning and vision creation process for the E-Quarium, the online complement to the Monterey Bay Aquarium. More than six months of informed investigation and analysis resulted in an ambitious redesign of t...
     
A methodology for the automatic construction of a hypertext for information retrieval
Found in: Proceedings of the 1993 ACM/SIGAPP symposium on Applied computing: states of the art and practice (SAC '93)
By Fabio Crestani, Maristella Agosti
Issue Date:February 1993
pp. 745-753
This paper describes the effects of program restructuring in a dataflow environment. Previous studies showed that dataflow programs can exhibit locality and that a memory hierarchy is feasible in a dataflow environment. This study shows that the order in w...
     
“Is this document relevant?…probably”: a survey of probabilistic models in information retrieval
Found in: ACM Computing Surveys (CSUR)
By Cornelis J. Van Rijsbergen, Fabio Crestani, Iain Campbell, Mounia Lalmas
Issue Date:December 1998
pp. 528-552
This article surveys probabilistic approaches to modeling information retrieval. The basic concepts of probabilistic approaches to information retrieval are outlined and the principles and assumptions upon which the approaches are based are presented. The v...
     