Displaying 1-23 out of 23 total
Corpus-Based Schema Matching
Found in: Data Engineering, International Conference on
By Jayant Madhavan, Philip A. Bernstein, AnHai Doan, Alon Halevy
Issue Date:April 2005
pp. 57-68
Schema Matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences or matches is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. Howev...
 
Efficiently Ordering Query Plans for Data Integration
Found in: Data Engineering, International Conference on
By AnHai Doan, Alon Halevy
Issue Date:March 2002
pp. 393
The goal of a data integration system is to provide a uniform interface to a multitude of data sources. Given a user query formulated in this interface, the system translates it into a set of query plans. Each plan is a query formulated over the data sourc...
 
The Unreasonable Effectiveness of Data
Found in: IEEE Intelligent Systems
By Alon Halevy, Peter Norvig, Fernando Pereira
Issue Date:March 2009
pp. 8-12
Problems that involve interacting with humans, such as natural language understanding, have not proven to be solvable by concise, neat formulas like F = ma. Instead, the best approach appears to be to embrace the complexity of the domain and address it by ...
 
Crowd-powered find algorithms
Found in: 2014 IEEE 30th International Conference on Data Engineering (ICDE)
By Anish Das Sarma, Aditya Parameswaran, Hector Garcia-Molina, Alon Halevy
Issue Date:March 2014
pp. 964-975
We consider the problem of using humans to find a bounded number of items satisfying certain properties, from a data set. For instance, we may want humans to identify a select number of travel photos from a data set of photos to display on a travel website...
   
Consistent thinning of large geographical data for map visualization
Found in: ACM Transactions on Database Systems (TODS)
By Alon Halevy, Anish Das Sarma, Hector Gonzalez, Hongrae Lee, Jayant Madhavan
Issue Date:November 2013
pp. 1-35
Large-scale map visualization systems play an increasingly important role in presenting geographic datasets to end-users. Since these datasets can be extremely large, a map rendering system often needs to select a small fraction of the data to visualize th...
     
Structured data in web search
Found in: Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM '13)
By Alon Halevy
Issue Date:October 2013
pp. 7-8
For the first time since the emergence of the Web, structured data is playing a key role in search engines and is therefore being collected via a concerted effort. Much of this data is being extracted from the Web, which contains vast quantities of structu...
     
Channeling the deluge: research challenges for big data and information systems
Found in: Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM '13)
By Jiawei Han, Alon Halevy, Jure Leskovec, Lee Giles, Marti Hearst, Paul Bennett
Issue Date:October 2013
pp. 2537-2538
With massive amounts of data being generated and stored ubiquitously in every discipline and every aspect of our daily life, how to handle such big data poses many challenging issues to researchers in data and information systems. The participants of CIKM ...
     
Data integration with dependent sources
Found in: Proceedings of the 14th International Conference on Extending Database Technology (EDBT/ICDT '11)
By Alon Halevy, Anish Das Sarma, Xin Luna Dong
Issue Date:March 2011
pp. 401-412
Data integration systems offer users a uniform interface to a set of data sources. Previous work has typically assumed that the data sources are independent of each other; however, in scenarios involving large numbers of sources, such as the Web or large e...
     
Structured data on the web
Found in: Communications of the ACM
By Alon Halevy, Jayant Madhavan, Michael J. Cafarella
Issue Date:February 2011
pp. 72-79
Google's Web Tables and Deep Web Crawler identify and deliver this otherwise inaccessible resource directly to end users.
     
Technical perspective: Schema mappings: rules for mixing data
Found in: Communications of the ACM
By Alon Halevy
Issue Date:January 2010
pp. 100-100
Exciting research in the design of automated negotiators is making great progress.
     
Exploring schema repositories with schemr
Found in: Proceedings of the 35th SIGMOD international conference on Management of data (SIGMOD '09)
By Alon Halevy, Jayant Madhavan, Kuang Chen
Issue Date:June 2009
pp. 3-4
Schemr is a schema search engine, and provides users the ability to search for and visualize schemas stored in a metadata repository. Users may search by keywords and by example -- using schema fragments as query terms. Schemr uses a novel search algorithm...
     
Bootstrapping pay-as-you-go data integration systems
Found in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data (SIGMOD '08)
By Alon Halevy, Anish Das Sarma, Xin Dong
Issue Date:June 2008
pp. 13-14
Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant upfront effort of creating a mediated schema and semantic mappings fr...
     
Indexing dataspaces
Found in: Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07)
By Alon Halevy, Xin Dong
Issue Date:June 2007
pp. 43-54
Dataspaces are collections of heterogeneous and partially unstructured data. Unlike data-integration systems that also offer uniform access to heterogeneous data sources, dataspaces do not assume that all the semantic relationships between sources are know...
     
Principles of dataspace systems
Found in: Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '06)
By Alon Halevy, David Maier, Michael Franklin
Issue Date:June 2006
pp. 1-9
The most acute information management challenges today stem from organizations relying on a large number of diverse, interrelated data sources, but having no means of managing them in a convenient, integrated, or principled fashion. These challenges arise ...
     
Why your data won't mix
Found in: Queue
By Alon Halevy
Issue Date:October 2005
pp. 50-58
New tools and techniques can help ease the pain of reconciling schemas.
     
Personal information management with SEMEX
Found in: Proceedings of the 2005 ACM SIGMOD international conference on Management of data (SIGMOD '05)
By Alon Halevy, Jayant Madhavan, Jing Michelle Liu, Xin Luna Dong, Yuhan Cai
Issue Date:June 2005
pp. 921-923
The explosion of information available in digital form has made search a hot research topic for the Information Management Community. While most of the research on search is focused on the WWW, individual computer users have developed their own vast collec...
     
Supporting executable mappings in model management
Found in: Proceedings of the 2005 ACM SIGMOD international conference on Management of data (SIGMOD '05)
By Alon Halevy, Erhard Rahm, Philip A. Bernstein, Sergey Melnik
Issue Date:June 2005
pp. 167-178
Model management is an approach to simplify the programming of metadata-intensive applications. It offers developers powerful operators, such as Compose, Diff, and Merge, that are applied to models, such as database schemas or interface specifications, and...
     
Reference reconciliation in complex information spaces
Found in: Proceedings of the 2005 ACM SIGMOD international conference on Management of data (SIGMOD '05)
By Alon Halevy, Jayant Madhavan, Xin Dong
Issue Date:June 2005
pp. 85-96
Reference reconciliation is the problem of identifying when different references (i.e., sets of attribute values) in a dataset correspond to the same real-world entity. Most previous literature assumed references to a single class that had a fair number of...
     
The Lowell database research self-assessment
Found in: Communications of the ACM
By Alon Halevy, Avi Silberschatz, Bruce Croft, David DeWitt, David Maier, Dieter Gawlick, Gerhard Weikum, Hans Schek, Hector Garcia-Molina, Jeff Naughton, Jeff Ullman, Jennifer Widom, Jim Gray, Joe Hellerstein, Laura Haas, Martin Kersten, Michael Pazzani, Mike Carey, Mike Franklin, Mike Lesk, Mike Stonebraker, Phil Bernstein, Rakesh Agrawal, Rick Snodgrass, Serge Abiteboul, Stan Zdonik, Stefano Ceri, Timos Sellis, Yannis Ioannidis
Issue Date:May 2005
pp. 111-118
Database needs are changing, driven by the Internet and increasing amounts of scientific and sensor data. In this article, the authors propose research into several important new directions for database management systems.
     
Rethinking the conference reviewing process
Found in: Proceedings of the 2004 ACM SIGMOD international conference on Management of data (SIGMOD '04)
By Alon Halevy, Anastassia Ailamaki, David DeWitt, Gerhard Weikum, Jennifer Widom, Michael J. Franklin, Philip A. Bernstein, Zachary Ives
Issue Date:June 2004
pp. 957-957
We demonstrate an XML full-text search engine that implements the TeXQuery language. TeXQuery is a powerful full-text search extension to XQuery that provides a rich set of fully composable full-text primitives, such as phrase matching, proximity distance,...
     
Efficient query reformulation in peer data management systems
Found in: Proceedings of the 2004 ACM SIGMOD international conference on Management of data (SIGMOD '04)
By Alon Halevy, Igor Tatarinov
Issue Date:June 2004
pp. 539-550
Peer data management systems (PDMS) offer a flexible architecture for decentralized data sharing. In a PDMS, every peer is associated with a schema that represents the peer's domain of interest, and semantic relationships between peers are provided locally...
     
iMAP: discovering complex semantic matches between database schemas
Found in: Proceedings of the 2004 ACM SIGMOD international conference on Management of data (SIGMOD '04)
By Alon Halevy, AnHai Doan, Pedro Domingos, Robin Dhamankar, Yoonkyong Lee
Issue Date:June 2004
pp. 383-394
Creating semantic matches between disparate data sources is fundamental to numerous data sharing efforts. Manually creating matches is extremely tedious and error-prone. Hence many recent works have focused on automating the matching process. To date, howe...
     
Learning to map between ontologies on the semantic web
Found in: Proceedings of the eleventh international conference on World Wide Web (WWW '02)
By Alon Halevy, AnHai Doan, Jayant Madhavan, Pedro Domingos
Issue Date:May 2002
pp. 662-673
Ontologies play a prominent role on the Semantic Web. They make possible the widespread publication of machine understandable data, opening myriad opportunities for automated information processing. However, because of the Semantic Web's distributed nature...
     