Managing the Forecast Factory
Found in: Data Engineering Workshops, 22nd International Conference on
By Laura Bright, David Maier, Bill Howe
Issue Date:April 2006
pp. 64
The CORIE forecast factory consists of a set of data product generation runs that are executed daily on dedicated local resources. The goal is to maximize productivity and resource utilization while still ensuring timely completion of all forecasts. Many e...
 
Querying and Visualizing Gridded Datasets for e-Science
Found in: Data Engineering, International Conference on
By Bill Howe, David Maier
Issue Date:April 2005
pp. 1106-1107
No summary available.
   
Scalable Flow-Based Community Detection for Large-Scale Network Analysis
Found in: 2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW)
By Seung-Hee Bae, Daniel Halperin, Jevin West, Martin Rosvall, Bill Howe
Issue Date:December 2013
pp. 303-310
Community detection is a powerful approach to uncover important structures in large networks. Since networks often describe flow of some entity, flow-based community-detection methods are particularly interesting. One such algorithm is called Infomap, whi...
 
Collaborative Science Workflows in SQL
Found in: Computing in Science & Engineering
By Bill Howe, Daniel Halperin, Francois Ribalet, Sagar Chitnis, E. Virginia Armbrust
Issue Date:May 2013
pp. 22-31
SQLShare is a Web-based application that emphasizes a simple upload-query-share protocol over conventional database design and uses ad hoc interactive query over general-purpose programming. Here, a case study examines the use of SQLShare as an alternative...
 
Poster: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters
Found in: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
By Kai Ren, Garth Gibson, YongChul Kwon, Magdalena Balazinska, Bill Howe
Issue Date:November 2012
pp. 1453
We analyze Hadoop workloads from three different research clusters from an application-level perspective, with two goals: (1) explore new issues in application patterns and user behavior and (2) understand key performance challenges related to IO and loa...
 
Abstract: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters
Found in: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
By Kai Ren, Garth Gibson, YongChul Kwon, Magdalena Balazinska, Bill Howe
Issue Date:November 2012
pp. 1452
We analyze Hadoop workloads from three different research clusters from an application-level perspective, with two goals: (1) explore new issues in application patterns and user behavior and (2) understand key performance challenges related to IO and loa...
 
Virtual Appliances, Cloud Computing, and Reproducible Research
Found in: Computing in Science and Engineering
By Bill Howe
Issue Date:July 2012
pp. 36-41
As science becomes increasingly computational, reproducibility has become increasingly difficult, perhaps surprisingly. In many contexts, virtualization and cloud computing can mitigate the issues involved without significant overhead to the researcher, en...
 
COVE: A Visual Environment for Multidisciplinary Ocean Science Collaboration
Found in: eScience, IEEE International Conference on
By Keith Grochow, Mark Stoermer, James Fogarty, Charlotte Lee, Bill Howe, Ed Lazowska
Issue Date:December 2010
pp. 269-276
Advances in cyberinfrastructure for virtual observatories are poised to allow scientists from disparate fields to conduct experiments together, monitor large collections of instruments, and explore extensive archives of observed and simulated data. Such s...
 
End-to-End eScience: Integrating Workflow, Query, Visualization, and Provenance at an Ocean Observatory
Found in: eScience, IEEE International Conference on
By Bill Howe, Peter Lawson, Renee Bellinger, Erik Anderson, Emanuele Santos, Juliana Freire, Carlos Scheidegger, António Baptista, Cláudio Silva
Issue Date:December 2008
pp. 127-134
Data analysis tasks at an Ocean Observatory require integrative and domain-specialized use of database, workflow, and visualization systems.
 
Scientific Mashups: Runtime-Configurable Data Product Ensembles
Found in: eScience, IEEE International Conference on
By Harrison Green-Fishback, Bill Howe
Issue Date:December 2008
pp. 442-443
The concept of a mashup is gaining popularity as a rapid-development, reuse-oriented programming model to replace monolithic, bottom-up application development, a programming style well-suited to scientific data management applications. A variety of mashu...
 
Scientific Exploration in the Era of Ocean Observatories
Found in: Computing in Science and Engineering
By António Baptista, Bill Howe, Juliana Freire, David Maier, Cláudio T. Silva
Issue Date:May 2008
pp. 53-58
The authors introduce an ocean observatory, offer a vision of observatory-enabled scientific exploration, and discuss the requirements and approaches for generating provenance-aware products in such environments.
 
Quarrying dataspaces: Schemaless profiling of unfamiliar information sources
Found in: Data Engineering Workshops, 22nd International Conference on
By Bill Howe, David Maier, Nicolas Rayner, James Rucker
Issue Date:April 2008
pp. 270-277
Traditional data integration and analysis approaches tend to assume intimate familiarity with the structure, semantics, and capabilities of the available information sources before applicable tools can be used effectively. This assumption often does not ho...
 
Education and career paths for data scientists
Found in: Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM)
By Alexandros Labrinidis, Bill Howe, Magdalena Balazinska, Susan B. Davidson
Issue Date:July 2013
pp. 1-2
MOTIVATION: As industry and science are increasingly data-driven, the need for skilled data scientists is exceeding what our universities are producing. According to a McKinsey report: "By 2018, the United States alone could face a shortage of 140,000 to 1...
     
Real-time collaborative analysis with (almost) pure SQL: a case study in biogeochemical oceanography
Found in: Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM)
By Bill Howe, Daniel Halperin, E. Virginia Armbrust, Francois Ribalet, Konstantin Weitz, Mak A. Saito
Issue Date:July 2013
pp. 1-12
We consider a case study using SQL-as-a-Service to support "instant analysis" of weakly structured relational data at a multi-investigator science retreat. Here, "weakly structured" means tabular, rows-and-columns datasets that share some common context, b...
     
Automatic example queries for ad hoc databases
Found in: Proceedings of the 2011 international conference on Management of data (SIGMOD '11)
By Bill Howe, Garret Cole, Leilani Battle, Nodira Khoussainova
Issue Date:June 2011
pp. 1319-1322
Motivated by eScience applications, we explore automatic generation of example "starter" queries over unstructured collections of tables without relying on a schema, a query log, or prior input from users. Such example queries are demonstrably sufficient t...
     