Crowdsourcing systems on the World-Wide Web
Found in: Communications of the ACM
By Alon Y. Halevy, Anhai Doan, Raghu Ramakrishnan
Issue Date:April 2011
pp. 86-96
The practice of crowdsourcing is transforming the Web and giving rise to a new field.
     
Declarative networking
Found in: Communications of the ACM
By Boon Thau Loo, David E. Gay, Ion Stoica, Joseph M. Hellerstein, Minos Garofalakis, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe, Tyson Condie
Issue Date:November 2009
pp. 87-95
Declarative Networking is a programming methodology that enables developers to concisely specify network protocols and services, which are directly compiled to a dataflow framework that executes the specifications. This paper provides an introduction to ba...
     
Big Data in 10 Years
Found in: 2013 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
By Raghu Ramakrishnan
Issue Date:May 2013
pp. 887
No summary available.
 
Cloud Computing - Was Thomas Watson Right After All?
Found in: Data Engineering, International Conference on
By Raghu Ramakrishnan
Issue Date:April 2008
pp. 8
No summary available.
 
Exploratory Mining in Cube Space
Found in: Data Mining, IEEE International Conference on
By Raghu Ramakrishnan
Issue Date:December 2006
pp. 6
Data Mining has evolved as a new discipline at the intersection of several existing areas, including Database Systems, Machine Learning, Optimization, and Statistics. An important question is whether the field has matured to the point where it has originat...
   
Toward a Query Language for Network Attack Data
Found in: Data Engineering Workshops, 22nd International Conference on
By Bee-Chung Chen, Vinod Yegneswaran, Paul Barford, Raghu Ramakrishnan
Issue Date:April 2006
pp. 28
The growing sophistication and diversity of malicious activity in the Internet presents a serious challenge for network security analysts. In this paper, we describe our efforts to develop a database and query language for network attack data from firewall...
 
On the Integration of Structure Indexes and Inverted Lists
Found in: Data Engineering, International Conference on
By Raghav Kaushik, Rajasekar Krishnamurthy, Jeffrey F Naughton, Raghu Ramakrishnan
Issue Date:April 2004
pp. 829
No summary available.
   
The QUIQ Engine: A Hybrid IR-DB System
Found in: Data Engineering, International Conference on
By Navin Kabra, Raghu Ramakrishnan, Vuk Ercegovac
Issue Date:March 2003
pp. 741
For applications that involve rapidly changing textual data and also require traditional DBMS capabilities, current systems are unsatisfactory. In this paper, we describe a hybrid IR-DB system that serves as the basis for the QUIQ-Connect product, a collab...
 
Dynamic Histograms: Capturing Evolving Data Sets
Found in: Data Engineering, International Conference on
By Donko Donjerkovic, Raghu Ramakrishnan, Yannis Ioannidis
Issue Date:March 2000
pp. 86
Conventional histograms are 'static' since they cannot be updated but only recalculated. In this paper, we introduce a 'dynamic' version of V-optimal histograms, which is constructed and maintained incrementally. Our experimental results indicate that a va...
   
DEMON: Mining and Monitoring Evolving Data
Found in: Data Engineering, International Conference on
By Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke
Issue Date:March 2000
pp. 439
Data mining algorithms have been the focus of much research recently. In practice, the input data to a data mining process resides in a large data warehouse whose data is kept up-to-date through periodic or occasional addition and deletion of blocks of dat...
 
Mining Very Large Databases
Found in: Computer
By Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan
Issue Date:August 1999
pp. 38-45
Established companies have had decades to accumulate masses of data about their customers, suppliers, products and services, and employees. Data mining, also known as knowledge discovery in databases, gives organizations the tools to sift through ...
 
Protecting the Quality of Service of Existing Information Systems
Found in: Cooperative Information Systems, IFCIS International Conference on
By Kevin S. Beyer, Miron Livny, Raghu Ramakrishnan
Issue Date:August 1998
pp. 74
Organizations that offer external access to their data need a mechanism that ensures a desired level of service for local users. We propose such a mechanism, called the provider agent (PA) architecture, that protects local users by ensuring a (DBA specifie...
 
Data Cube Materialization and Mining over MapReduce
Found in: IEEE Transactions on Knowledge and Data Engineering
By Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan
Issue Date:October 2012
pp. 1747-1759
Computing interesting measures for data cubes and subsequent mining of interesting cube groups over massive data sets are critical for many important analyses done in the real world. Previous studies have focused on algebraic measures such as SUM that are ...
 
CAP and Cloud Data Management
Found in: Computer
By Raghu Ramakrishnan
Issue Date:February 2012
pp. 43-49
Novel systems that scale out on demand, relying on replicated data and massively distributed architectures with clusters of thousands of machines, particularly those designed for real-time data serving and update workloads, amply illustrate the realities o...
 
PNUTS in Flight: Web-Scale Data Serving at Yahoo
Found in: IEEE Internet Computing
By Adam Silberstein, Jianjun Chen, David Lomax, Brad McMillan, Masood Mortazavi, P.P.S. Narayan, Raghu Ramakrishnan, Russell Sears
Issue Date:January 2012
pp. 13-23
Data management for stateful Web applications is extremely challenging. Applications must scale as they grow in popularity, serve their content with low latency on a global scale, and be highly available, even in the face of hardware failures. This need ha...
 
Distributed cube materialization on holistic measures
Found in: Data Engineering, International Conference on
By Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan
Issue Date:April 2011
pp. 183-194
Cube computation over massive datasets is critical for many important analyses done in the real world. Unlike commonly studied algebraic measures such as SUM that are amenable to parallel computation, efficient cube computation of holistic measures such as...
 
Data Management in the Cloud
Found in: Data Engineering, International Conference on
By Raghu Ramakrishnan
Issue Date:April 2009
pp. 5
We are in the midst of a computing revolution. As the cost of provisioning hardware and software stacks grows, and the cost of securing and administering these complex systems grows even faster, we're seeing a shift towards computing clouds. Clouds are ess...
 
Efficient Information Extraction over Evolving Text Data
Found in: Data Engineering, International Conference on
By Fei Chen, AnHai Doan, Jun Yang, Raghu Ramakrishnan
Issue Date:April 2008
pp. 943-952
Most current information extraction (IE) approaches have considered only static text corpora, over which we typically have to apply IE only once. Many real-world text corpora however are dynamic. They evolve over time, and to keep extracted information up ...
 
Parallel Evaluation of Composite Aggregate Queries
Found in: Data Engineering, International Conference on
By Lei Chen, Christopher Olston, Raghu Ramakrishnan
Issue Date:April 2008
pp. 218-227
Aggregate measures summarizing subsets of data are valuable in exploratory analysis and decision support, especially when dependent aggregations can be easily specified and computed. A novel class of queries, called composite subset measures, was previousl...
 
Toward a PeopleWeb
Found in: Computer
By Raghu Ramakrishnan, Andrew Tomkins
Issue Date:August 2007
pp. 63-72
Important properties of users and objects will move from being tied to individual Web sites to being globally available. The conjunction of a global object model with portable user context will lead to a richer content structure and introduce significant s...
 
Learning from Aggregate Views
Found in: Data Engineering, International Conference on
By Bee-Chung Chen, Lei Chen, Raghu Ramakrishnan, David R. Musicant
Issue Date:April 2006
pp. 3
In this paper, we introduce a new class of data mining problems called learning from aggregate views. In contrast to the traditional problem of learning from a single table of training examples, the new goal is to learn from multiple aggregate views of the...
 
Mondrian Multidimensional K-Anonymity
Found in: Data Engineering, International Conference on
By Kristen LeFevre, David J. DeWitt, Raghu Ramakrishnan
Issue Date:April 2006
pp. 25
K-Anonymity has been proposed as a mechanism for protecting privacy in microdata publishing, and numerous recoding...
 
DEMON: Mining and Monitoring Evolving Data
Found in: IEEE Transactions on Knowledge and Data Engineering
By Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan
Issue Date:January 2001
pp. 50-63
Data mining algorithms have been the focus of much research recently. In practice, the input data to a data mining process resides in a large data warehouse whose data is kept up-to-date through periodic or occasional a...
 
Squeezing the Most Out of Relational Database Systems
Found in: Data Engineering, International Conference on
By Jonathan Goldstein, Raghu Ramakrishnan
Issue Date:March 2000
pp. 81
We present compelling experimental evidence of the suitability of FOR compression for many database applications. While there has been some previous and concurrent work on compressing relations, no alternative solution combines the high compression ratios,...
   
Clustering Large Datasets in Arbitrary Metric Spaces
Found in: Data Engineering, International Conference on
By Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, James French
Issue Date:March 1999
pp. 502
Clustering partitions a collection of objects into groups called clusters, such that similar objects fall into the same group. Similarity between objects is defined by a distance function satisfying the triangle inequality; this distance function along wit...
 
SRQL: Sorted Relational Query Language
Found in: Scientific and Statistical Database Management, International Conference on
By Raghu Ramakrishnan, Donko Donjerkovic, Arvind Ranganathan, Kevin S. Beyer, Muralidhar Krishnaprasad
Issue Date:April 1998
pp. 84
A relation is an unordered collection of records. Often, however, there is an underlying order (e.g., a sequence of stock prices), and users want to pose queries that reflect this order (e.g., find a weekly moving average). SQL provides no support for posi...
 
The Claremont report on database research
Found in: Communications of the ACM
By Alexander S. Szalay, Alon Y. Halevy, Anastasia Ailamaki, Anhai Doan, Beng Chin Ooi, Daniela Florescu, Donald Kossmann, Eric A. Brewer, Gerhard Weikum, Hank F. Korth, Hector Garcia-Molina, Johannes Gehrke, Joseph M. Hellerstein, Laura M. Haas, Le Gruenwald, Michael J. Carey, Michael J. Franklin, Michael Stonebraker, Philip A. Bernstein, Raghu Ramakrishnan, Rakesh Agrawal, Roger Magoulas, Samuel Madden, Sunita Sarawagi, Surajit Chaudhuri, Tim O'Reilly, Yannis E. Ioannidis
Issue Date:June 2009
pp. 101-104
Database research is expanding, with major efforts in system architecture, new languages, cloud services, mobile and virtual worlds, and interplay between structure and text.
     
Scaling mining algorithms to large databases
Found in: Communications of the ACM
By Johannes Gehrke, Paul Bradley, Raghu Ramakrishnan, Ramakrishnan Srikant
Issue Date:August 2002
pp. 38-43
Which insights about data structure make it possible to analyze the very large databases collected by Internet, business, scientific, and government applications?
     
Content recommendation on web portals
Found in: Communications of the ACM
By Bee-Chung Chen, Deepak Agarwal, Pradheep Elango, Raghu Ramakrishnan
Issue Date:June 2013
pp. 92-101
How to offer recommendations to users when they have not specified what they want.
     
Mobius: unified messaging and data serving for mobile apps
Found in: Proceedings of the 10th international conference on Mobile systems, applications, and services (MobiSys '12)
By Alexander Shraer, Byung-Gon Chun, Carlo Curino, Raghu Ramakrishnan, Russell Sears, Samuel Madden
Issue Date:June 2012
pp. 141-154
Mobile application development is challenging for several reasons: intermittent and limited network connectivity, tight power constraints, server-side scalability concerns, and a number of fault-tolerance issues. Developers handcraft complex solutions that...
     
Feed following: the big data challenge in social applications
Found in: Databases and Social Networks (DBSocial '11)
By Adam Silberstein, Ashwin Machanavajjhala, Raghu Ramakrishnan
Issue Date:June 2011
pp. 1-6
Internet users spend billions of minutes per month on sites like Facebook and Twitter. These sites support feed following, where users "follow" activity streams associated with other users and entities. Followers get personalized feeds that blend streams p...
     
Optimizing complex extraction programs over evolving text data
Found in: Proceedings of the 35th SIGMOD international conference on Management of data (SIGMOD '09)
By AnHai Doan, Byron J. Gao, Fei Chen, Jun Yang, Raghu Ramakrishnan
Issue Date:June 2009
pp. 3-4
Most information extraction (IE) approaches have considered only static text corpora, over which we apply IE only once. Many real-world text corpora however are dynamic. They evolve over time, and so to keep extracted information up to date we often must a...
     
Asynchronous view maintenance for VLSD databases
Found in: Proceedings of the 35th SIGMOD international conference on Management of data (SIGMOD '09)
By Adam Silberstein, Brian F. Cooper, Parag Agrawal, Raghu Ramakrishnan, Utkarsh Srivastava
Issue Date:June 2009
pp. 3-4
The query models of the recent generation of very large scale distributed (VLSD) shared-nothing data storage systems, including our own PNUTS and others (e.g. BigTable, Dynamo, Cassandra, etc.) are intentionally simple, focusing on simple lookups and scans...
     
A web of concepts
Found in: Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '09)
By Andrew Tomkins, Bo Pang, Nilesh Dalvi, Philip Bohannon, Raghu Ramakrishnan, Ravi Kumar, Sathiya Keerthi, Srujana Merugu
Issue Date:June 2009
pp. 1-2
We make the case for developing a web of concepts by starting with the current view of web (comprised of hyperlinked pages, or documents, each seen as a bag of words), extracting concept-centric metadata, and stitching it together to create a semantically ...
     
Bellwether analysis: Searching for cost-effective query-defined predictors in large databases
Found in: ACM Transactions on Knowledge Discovery from Data (TKDD)
By Bee-Chung Chen, Jude W. Shavlik, Pradeep Tamma, Raghu Ramakrishnan
Issue Date:March 2009
pp. 1-49
How to mine massive datasets is a challenging problem with great potential value. Motivated by this challenge, much effort has concentrated on developing scalable versions of machine learning algorithms. However, the cost of mining large datasets is not ju...
     
Workload-aware anonymization techniques for large-scale datasets
Found in: ACM Transactions on Database Systems (TODS)
By David J. DeWitt, Kristen LeFevre, Raghu Ramakrishnan
Issue Date:August 2008
pp. 1-47
Protecting individual privacy is an important problem in microdata distribution and publishing. Anonymization algorithms typically aim to satisfy certain privacy definitions with minimal impact on the quality of the resulting data. While much of the previo...
     
Toward best-effort information extraction
Found in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data (SIGMOD '08)
By AnHai Doan, Pedro DeRose, Raghu Ramakrishnan, Robert McCann, Warren Shen
Issue Date:June 2008
pp. 13-14
Current approaches to develop information extraction (IE) programs have largely focused on producing precise IE results. As such, they suffer from three major limitations. First, it is often difficult to execute partially specified IE programs and obtain m...
     
Efficient bulk insertion into a distributed ordered table
Found in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data (SIGMOD '08)
By Adam Silberstein, Brian F. Cooper, Erik Vee, Raghu Ramakrishnan, Ramana Yerneni, Utkarsh Srivastava
Issue Date:June 2008
pp. 13-14
We study the problem of bulk-inserting records into tables in a system that horizontally range-partitions data over a large cluster of shared-nothing machines. Each table partition contains a contiguous portion of the table's key range, and must accept all...
     
Data challenges at Yahoo!
Found in: Proceedings of the 11th international conference on Extending database technology: Advances in database technology (EDBT '08)
By Raghu Ramakrishnan, Ricardo Baeza-Yates
Issue Date:March 2008
pp. 1-3
In this short paper we describe the data that Yahoo! handles, the current trends in Web applications, and the many challenges that this poses for Yahoo! Research. These challenges have led to the development of new data systems and novel data mining techni...
     
Databases on the web
Found in: Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07)
By Raghu Ramakrishnan
Issue Date:June 2007
pp. 874-874
What role will database management play in the next generation of the web? I believe that a number of trends signal a growing and central role for the ideas and techniques that have emerged over the past three decades of database research. However, we will...
     
Theory of nearest neighbors indexability
Found in: ACM Transactions on Database Systems (TODS)
By Raghu Ramakrishnan, Uri Shaft
Issue Date:September 2006
pp. 814-838
In this article, we consider whether traditional index structures are effective in processing unstable nearest neighbors workloads. It is known that under broad conditions, nearest neighbors workloads become unstable---distances between data points become ...
     
Workload-aware anonymization
Found in: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '06)
By David J. DeWitt, Kristen LeFevre, Raghu Ramakrishnan
Issue Date:August 2006
pp. 277-286
Protecting data privacy is an important problem in microdata distribution. Anonymization algorithms typically aim to protect individual privacy, with minimal impact on the quality of the resulting data. While the bulk of previous work has measured quality ...
     
Managing information extraction: state of the art and research directions
Found in: Proceedings of the 2006 ACM SIGMOD international conference on Management of data (SIGMOD '06)
By AnHai Doan, Raghu Ramakrishnan, Shivakumar Vaithyanathan
Issue Date:June 2006
pp. 799-800
This tutorial makes the case for developing a unified framework that manages information extraction from unstructured data (focusing in particular on text). We first survey research on information extraction in the database, AI, NLP, IR, and Web communitie...
     
Relaxed-currency serializability for middle-tier caching and replication
Found in: Proceedings of the 2006 ACM SIGMOD international conference on Management of data (SIGMOD '06)
By Alan Fekete, Hongfei Guo, Philip A. Bernstein, Pradeep Tamma, Raghu Ramakrishnan
Issue Date:June 2006
pp. 599-610
Many applications, such as e-commerce, routinely use copies of data that are not in sync with the database due to heuristic caching strategies used to enhance performance. We study concurrency control for a transactional model that allows update transactio...
     
Declarative networking: language, execution and optimization
Found in: Proceedings of the 2006 ACM SIGMOD international conference on Management of data (SIGMOD '06)
By Boon Thau Loo, David E. Gay, Ion Stoica, Joseph M. Hellerstein, Minos Garofalakis, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe, Tyson Condie
Issue Date:June 2006
pp. 97-108
The networking and distributed systems communities have recently explored a variety of new network architectures, both for application-level overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investiga...
     
Synopses for query optimization: A space-complexity perspective
Found in: ACM Transactions on Database Systems (TODS)
By Jeffrey F. Naughton, Raghav Kaushik, Raghu Ramakrishnan, Venkatesan T. Chakravarthy
Issue Date:December 2005
pp. 1102-1127
Database systems use precomputed synopses of data to estimate the cost of alternative plans during query optimization. A number of alternative synopsis structures have been proposed, but histograms are by far the most commonly used. While histograms have p...
     
Declarative routing: extensible routing with declarative queries
Found in: Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications (SIGCOMM '05)
By Boon Thau Loo, Ion Stoica, Joseph M. Hellerstein, Raghu Ramakrishnan
Issue Date:August 2005
pp. 289-300
The Internet's core routing infrastructure, while arguably robust and efficient, has proven to be difficult to evolve to accommodate the needs of new applications. Prior research on this problem has included new hard-coded routing protocols on the one hand...
     
Incognito: efficient full-domain K-anonymity
Found in: Proceedings of the 2005 ACM SIGMOD international conference on Management of data (SIGMOD '05)
By David J. DeWitt, Kristen LeFevre, Raghu Ramakrishnan
Issue Date:June 2005
pp. 49-60
A number of organizations publish microdata for purposes such as public health and demographic research. Although attributes that clearly identify individuals, such as Name and Social Security Number, are generally removed, these databases can sometimes be...
     
Similarity search in high-dimensional datasets
Found in: Proceedings of the 2nd international workshop on Computer vision meets databases (CVDB '05)
By Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft
Issue Date:June 2005
pp. 1-2
The problem of finding "similar" multimedia objects is a central one, and a popular approach is to represent objects as vectors in a high-dimensional space, and to build a spatial index over a collection of such vectors in order to retrieve the "nearest ne...
     
The EDAM project: mining mass spectra and more
Found in: Proceedings of the Thirteenth ACM conference on Information and knowledge management (CIKM '04)
By Raghu Ramakrishnan
Issue Date:November 2004
pp. 1-1
The EDAM project is a collaborative effort between computer scientists and environmental chemists at Carleton College and UW-Madison. The goal is to develop data mining techniques for advancing the state of the art in analyzing atmospheric aerosol datasets...
     