2013 IEEE International Conference on Big Data (2013)
Silicon Valley, CA, USA
Oct. 6, 2013 to Oct. 9, 2013
ISBN: 978-1-4799-1293-3
TABLE OF CONTENTS

Cover page (PDF)

pp. 1

Organization (PDF)

pp. 1-2

Security — A big question for big data (PDF)

Roger Schell , University of Southern California, USA
pp. 5

Communication efficient algorithms for fundamental big data problems (Abstract)

Peter Sanders , Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Sebastian Schlag , Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
Ingo Muller , Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
pp. 15-23

Map-based graph analysis on MapReduce (Abstract)

Upa Gupta , University of Texas at Arlington, CSE, Arlington, TX 76019
Leonidas Fegaras , University of Texas at Arlington, CSE, Arlington, TX 76019
pp. 24-30

P-DOT: A model of computation for big data (Abstract)

Tao Luo , School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Yin Liao , School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Guoliang Chen , School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Yunquan Zhang , Key Lab. of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
pp. 31-37

Transparent composite model for large scale image/video processing (Abstract)

En-hui Yang , Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada
Xiang Yu , Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada
pp. 38-44

Elastic algorithms for guaranteeing quality monotonicity in big data mining (Abstract)

Rui Han , Imperial College London, London, UK
Lei Nie , Imperial College London, London, UK
Moustafa M. Ghanem , Imperial College London, London, UK
Yike Guo , Imperial College London, London, UK
pp. 45-50

HFSP: Size-based scheduling for Hadoop (Abstract)

Mario Pastorelli , EURECOM - Campus SophiaTech, France
Antonio Barbuzzi , EURECOM - Campus SophiaTech, France
Damiano Carra , University of Verona, Italy
Matteo Dell'Amico , EURECOM - Campus SophiaTech, France
Pietro Michiardi , EURECOM - Campus SophiaTech, France
pp. 51-59

An evaluation study of BigData frameworks for graph processing (Abstract)

Benedikt Elser , Università degli Studi di Trento, Italy
Alberto Montresor , Università degli Studi di Trento, Italy
pp. 60-67

Storing and manipulating environmental big data with JASMIN (Abstract)

B. N. Lawrence , Department of Meteorology, University of Reading, Reading, UK
V. L. Bennett , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
J. Churchill , Scientific Computing Department, STFC Rutherford Appleton Laboratory, Didcot, UK
M. Juckes , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
P. Kershaw , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
S. Pascoe , National Centre for Atmospheric Science, UK
S. Pepler , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
M. Pritchard , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
A. Stephens , Centre for Environmental Data Archival, STFC Rutherford Appleton Laboratory, Didcot, UK
pp. 68-75

Efficient gear-shifting for a power-proportional distributed data-placement method (Abstract)

Hieu Hanh Le , Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
Satoshi Hikida , Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
Haruo Yokota , Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
pp. 76-84

Agrios: A hybrid approach to big array analytics (Abstract)

Patrick Leyshock , Computer Science, Portland State University, Portland, Oregon, U.S.A.
David Maier , Computer Science, Portland State University, Portland, Oregon, U.S.A.
Kristin Tufte , Computer Science, Portland State University, Portland, Oregon, U.S.A.
pp. 85-93

Building a generic platform for big sensor data application (Abstract)

Chun-Hsiang Lee , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
David Birch , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Chao Wu , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Dilshan Silva , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Orestis Tsinalis , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Yang Li , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Shulin Yan , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Moustafa Ghanem , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
Yike Guo , Dept. of Computing, Imperial College London, London SW7 2AZ, United Kingdom
pp. 94-102

Locality-driven high-level I/O aggregation for processing scientific datasets (Abstract)

Jialin Liu , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Bradly Crysler , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Yin Lu , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
pp. 103-111

clusiVAT: A mixed visual/numerical clustering algorithm for big data (Abstract)

Dheeraj Kumar , EEE, U. of Melbourne, Victoria 3010, Australia
Marimuthu Palaniswami , EEE, U. of Melbourne, Victoria 3010, Australia
Sutharshan Rajasegarar , EEE, U. of Melbourne, Victoria 3010, Australia
Christopher Leckie , CIS, U. of Melbourne, Victoria 3010, Australia
James C. Bezdek , CIS, U. of Melbourne, Victoria 3010, Australia
Timothy C. Havens , ECE/CS, Michigan Tech U., Houghton, MI 49931, USA
pp. 112-117

Hardware acceleration of Hadoop MapReduce (Abstract)

Toshimori Honjo , NTT Software Innovation Center, NTT Corporation, Tokyo, Japan
Kazuki Oikawa , NTT Software Innovation Center, NTT Corporation, Tokyo, Japan
pp. 118-124

Optimizing the MapReduce framework on Intel Xeon Phi coprocessor (Abstract)

Mian Lu , Institute of High Performance Computing, A*STAR
Lei Zhang , Peking University
Huynh Phung Huynh , Institute of High Performance Computing, A*STAR
Zhongliang Ong , Institute of High Performance Computing, A*STAR
Yun Liang , Peking University
Bingsheng He , Nanyang Technological University
Rick Siow Mong Goh , Institute of High Performance Computing, A*STAR
Richard Huynh , Nanyang Technological University
pp. 125-130

On the performance and energy efficiency of Hadoop deployment models (Abstract)

Eugen Feller , Inria Centre Rennes - Bretagne Atlantique, Campus universitaire de Beaulieu, 35042 Rennes Cedex, France
Lavanya Ramakrishnan , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
Christine Morin , Inria Centre Rennes - Bretagne Atlantique, Campus universitaire de Beaulieu, 35042 Rennes Cedex, France
pp. 131-136

Optimizing throughput on guaranteed-bandwidth WAN networks for the Large Synoptic Survey Telescope (LSST) (Abstract)

D. Michael Freemon , National Center for Supercomputing Applications, University of Illinois, Urbana, IL, USA
pp. 137-142

Feliss: Flexible distributed computing framework with light-weight checkpointing (Abstract)

Takuya Araki , Cloud System Research Laboratories, NEC Corporation, Japan
Kazuyo Narita , Cloud System Research Laboratories, NEC Corporation, Japan
Hiroshi Tamano , Knowledge Discovery Research Laboratories, NEC Corporation, Japan
pp. 143-149

Algebraic dataflows for big data analysis (Abstract)

Jonas Dias , Federal University of Rio de Janeiro - COPPE/UFRJ
Eduardo Ogasawara , Federal University of Rio de Janeiro - COPPE/UFRJ
Daniel de Oliveira , Fluminense Federal University - UFF
Fabio Porto , LNCC National Laboratory for Scientific Computing, Brazil
Patrick Valduriez , INRIA and LIRMM, France
Marta Mattoso , Federal University of Rio de Janeiro - COPPE/UFRJ
pp. 150-155

Scalable and robust key group size estimation for reducer load balancing in MapReduce (Abstract)

Wei Yan , Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA
Yuan Xue , Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA
Bradley Malin , Dept. of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
pp. 156-162

Robot: An efficient model for big data storage systems based on erasure coding (Abstract)

Chao Yin , School of Computer Science and Technology, Huazhong University of Science and Technology, China
Jianzong Wang , NetEase Inc., Guangzhou, China
Changsheng Xie , School of Computer Science and Technology, Huazhong University of Science and Technology, China
Jiguang Wan , School of Computer Science and Technology, Huazhong University of Science and Technology, China
Changlin Long , School of Computer Science and Technology, Huazhong University of Science and Technology, China
Wenjuan Bi , School of Computer Science and Technology, Huazhong University of Science and Technology, China
pp. 163-168

Multilevel Active Storage for big data applications in high performance computing (Abstract)

Chao Chen , Department of Computer Science, Texas Tech University, Lubbock, TX, 79409
Michael Lang , Los Alamos National Laboratory, Los Alamos, NM, 87544
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, TX, 79409
pp. 169-174

GPU accelerated item-based collaborative filtering for big-data applications (Abstract)

Chandima Hewa Nadungodage , Department of Computer & Information Science, Purdue School of Science, IUPUI, Indianapolis, USA
Yuni Xia , Department of Computer & Information Science, Purdue School of Science, IUPUI, Indianapolis, USA
John Jaehwan Lee , Department of Electrical & Computer Engineering, Purdue School of Engineering & Technology, IUPUI, Indianapolis, USA
Myungcheol Lee , Big-Data Software Platform Research Department, Software Research Laboratory, Electronics & Telecommunications Research Institute, Korea
Choon Seo Park , Big-Data Software Platform Research Department, Software Research Laboratory, Electronics & Telecommunications Research Institute, Korea
pp. 175-180

GPU-accelerated adaptive compression framework for genomics data (Abstract)

GuiXin Guo , BGI-Shenzhen, Shenzhen, P.R. China
Shuang Qiu , BGI-Shenzhen, Shenzhen, P.R. China
ZhiQiang Ye , BGI-Shenzhen, Shenzhen, P.R. China
BingQiang Wang , BGI-Shenzhen, Shenzhen, P.R. China
Lin Fang , BGI-Shenzhen, Shenzhen, P.R. China
Mian Lu , Institute of HPC, A*STAR, Singapore
Simon See , BGI-NVIDIA Joint Innovation Lab, Shenzhen, P.R. China
Rui Mao , Guangdong Province Key Laboratory of Popular High Performance Computers, Shenzhen University, Shenzhen, P.R. China
pp. 181-186

An infrastructure for automating large-scale performance studies and data processing (Abstract)

Deepal Jayasinghe , Center for Experimental Research in Computer Systems, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
Josh Kimball , Center for Experimental Research in Computer Systems, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
Tao Zhu , Center for Experimental Research in Computer Systems, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
Siddharth Choudhary , Center for Experimental Research in Computer Systems, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
Calton Pu , Center for Experimental Research in Computer Systems, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332-0765, USA
pp. 187-192

Kylin: An efficient and scalable graph data processing system (Abstract)

Li-Yung Ho , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Tsung-Han Li , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Jan-Jan Wu , Institute of Information Science, Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
Pangfeng Liu , Department of Computer Science and Information Engineering, Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan
pp. 193-198

Towards hybrid online on-demand querying of realtime data with stateful complex event processing (Abstract)

Qunzhi Zhou , Department of Computer Science, University of Southern California, Los Angeles, USA
Yogesh Simmhan , Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, USA
Viktor Prasanna , Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, USA
pp. 199-205

DDSN: Duplicate detection to reduce both storage and bandwidth consumption (Abstract)

Jiaran Zhang , School of Computer Science & Technology, Shandong University, Jinan, China
Xiaohui Yu , School of Computer Science & Technology, Shandong University, Jinan, China
Yang Liu , School of Computer Science & Technology, Shandong University, Jinan, China
Liwei Lin , School of Computer Science & Technology, Shandong University, Jinan, China
pp. 206-211

A reconfigurable computing architecture for semantic information filtering (Abstract)

Aalap Tripathy , Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA
Ka Chon Ieong , Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA
Atish Patra , Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA
Rabi Mahapatra , Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA
pp. 212-218

Iteration aware prefetching for unstructured grids (Abstract)

Oyindamola O. Akande , Department of Computer Science, University of Mississippi, Oxford, MS 38677
Philip J. Rhodes , Department of Computer Science, University of Mississippi, Oxford, MS 38677
pp. 219-227

Measuring inter-site engagement (Abstract)

Elad Yom-Tov , Microsoft Research
Mounia Lalmas , Yahoo! Labs
Ricardo Baeza-Yates , Yahoo! Labs
Georges Dupret , Yahoo! Labs
Janette Lehmann , Yahoo! Labs
Pinar Donmez , Banjo Inc.
pp. 228-236

A selective checkpointing mechanism for query plans in a parallel database system (Abstract)

Ting Chen , The University of Tokyo
Kenjiro Taura , The University of Tokyo
pp. 237-245

CORE: Cross-object redundancy for efficient data repair in storage systems (Abstract)

Kyumars Sheykh Esmaili , Nanyang Technological University, Singapore 639798
Lluis Pamies-Juarez , Nanyang Technological University, Singapore 639798
Anwitaman Datta , Nanyang Technological University, Singapore 639798
pp. 246-254

H2RDF+: High-performance distributed joins over large-scale RDF graphs (Abstract)

Nikolaos Papailiou , Computing Systems Laboratory, National Technical University of Athens
Ioannis Konstantinou , Computing Systems Laboratory, National Technical University of Athens
Dimitrios Tsoumakos , Department of Informatics, Ionian University
Panagiotis Karras , Management Science and Information Systems, Rutgers University
Nectarios Koziris , Computing Systems Laboratory, National Technical University of Athens
pp. 255-263

Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures (Abstract)

Austin R. Benson , Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA
David F. Gleich , Department of Computer Science, Purdue University, West Lafayette, IN
James Demmel , Computer Sciences Division and Department of Mathematics, University of California, Berkeley, Berkeley, CA
pp. 264-272

Adaptive file management for scientific workflows on the Azure cloud (Abstract)

Radu Tudoran , Microsoft Research-Inria Joint Centre, Palaiseau, France
Alexandra Costan , Inria Rennes-Bretagne Atlantique, Rennes, France
Ramin Rezai Rad , Microsoft Research ATLE, Aachen, Germany
Goetz Brasche , Microsoft Research ATLE, Aachen, Germany
Gabriel Antoniu , Inria Rennes-Bretagne Atlantique, Rennes, France
pp. 273-281

Model-view sensor data management in the cloud (Abstract)

Tian Guo , School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland 1015
Thanasis G. Papaioannou , School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland 1015
Karl Aberer , School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland 1015
pp. 282-290

Spatio-temporal indexing in non-relational distributed databases (Abstract)

Anthony Fox , Commonwealth Computer Research, Inc
Chris Eichelberger , Commonwealth Computer Research, Inc
James Hughes , Commonwealth Computer Research, Inc
Skylar Lyon , Commonwealth Computer Research, Inc
pp. 291-299

Scientific discovery through weighted sampling (Abstract)

Lefteris Sidirourgos , Database Architectures, CWI, Amsterdam, The Netherlands
Martin Kersten , Database Architectures, CWI, Amsterdam, The Netherlands
Peter Boncz , Database Architectures, CWI, Amsterdam, The Netherlands
pp. 300-306

Scalable data citation in dynamic, large databases: Model and reference implementation (Abstract)

Stefan Proll , SBA Research, Vienna, Austria
Andreas Rauber , Technical University of Vienna, Vienna, Austria
pp. 307-312

On the use of shared storage in shared-nothing environments (Abstract)

K. R. Krish , Dept. of Computer Science, Virginia Tech
Aleksandr Khasymski , Dept. of Computer Science, Virginia Tech
Guanying Wang , Dept. of Computer Science, Virginia Tech
Ali R. Butt , Dept. of Computer Science, Virginia Tech
Gaurav Makkar , NetApp Inc.
pp. 313-318

Self-adaptive event recognition for intelligent transport management (Abstract)

Alexander Artikis , Institute of Informatics & Telecommunications, NCSR Demokritos, Athens, Greece
Matthias Weidlich , Technion - Israel Institute of Technology, Haifa, Israel
Avigdor Gal , Technion - Israel Institute of Technology, Haifa, Israel
Vana Kalogeraki , Department of Informatics, Athens University of Economics and Business, Greece
Dimitrios Gunopulos , Department of Informatics and Telecommunications, University of Athens, Greece
pp. 319-325

Improving floating point compression through binary masks (Abstract)

Leonardo A. Bautista Gomez , Argonne National Laboratory
Franck Cappello , Argonne National Laboratory
pp. 326-331

Using pattern-models to guide SSD deployment for Big Data applications in HPC systems (Abstract)

Junjie Chen , Department of Computer Science, Texas Tech University, Lubbock, TX, USA
Philip C. Roth , Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, TX, USA
pp. 332-337

Robust crowdsourced learning (Abstract)

Zhiquan Liu , Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
Luo Luo , Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
Wu-Jun Li , Shanghai Key Laboratory of Scalable Computing and Systems, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
pp. 338-343

Segmented analysis for reducing data movement (Abstract)

Jialin Liu , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Surendra Byna , Computational Research Division, Lawrence Berkeley Laboratory, Berkeley, California, USA
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
pp. 344-349

Continuous hyperparameter optimization for large-scale recommender systems (Abstract)

Simon Chan , Department of Computer Science, University College London, London, United Kingdom
Philip Treleaven , Department of Computer Science, University College London, London, United Kingdom
Licia Capra , Department of Computer Science, University College London, London, United Kingdom
pp. 350-358

4S: Scalable subspace search scheme overcoming traditional Apriori processing (Abstract)

Hoang Vu Nguyen , Karlsruhe Institute of Technology (KIT), Germany
Emmanuel Muller , Karlsruhe Institute of Technology (KIT), Germany
Klemens Bohm , Karlsruhe Institute of Technology (KIT), Germany
pp. 359-367

Computing betweenness centrality in external memory (Abstract)

Lars Arge , MADALGO, Dept. of Computer Science, Aarhus University
Michael T. Goodrich , Dept. of Computer Science, School of Info. & Comp. Sci., University of California, Irvine
Freek van Walderveen , MADALGO, Dept. of Computer Science, Aarhus University
pp. 368-375

A parallel computing platform for training large scale neural networks (Abstract)

Rong Gu , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
Furao Shen , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
Yihua Huang , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
pp. 376-384

Self-tuned kernel spectral clustering for large scale networks (Abstract)

Raghvendra Mall , Department of Electrical Engineering, ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Rocco Langone , Department of Electrical Engineering, ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Johan A.K. Suykens , Department of Electrical Engineering, ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
pp. 385-393

NUMA-optimized parallel breadth-first search on multicore single-node system (Abstract)

Yuichiro Yasui , Chuo University, Tokyo, Japan
Katsuki Fujisawa , Chuo University, Tokyo, Japan
Kazushige Goto , Intel Corporation, Hillsboro, OR, USA
pp. 394-402

A distributed vertex-centric approach for pattern matching in massive graphs (Abstract)

Arash Fard , Computer Science Department, The University of Georgia, Athens, USA
M. Usman Nisar , Computer Science Department, The University of Georgia, Athens, USA
Lakshmish Ramaswamy , Computer Science Department, The University of Georgia, Athens, USA
John A. Miller , Computer Science Department, The University of Georgia, Athens, USA
Matthew Saltz , Computer Science Department, The University of Georgia, Athens, USA
pp. 403-411

Fast scalable selection algorithms for large scale data (Abstract)

Lee Parnell Thompson , Department of Computer Science, University of Texas at Austin, Austin, TX
Weijia Xu , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX
Daniel P. Miranker , Department of Computer Science, University of Texas at Austin, Austin, TX
pp. 412-420

An NML-based model selection criterion for general relational data modeling (Abstract)

Yoshiki Sakai , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
Kenji Yamanishi , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
pp. 421-429

Parallel matrix factorization for binary response (Abstract)

Rajiv Khanna , University of Texas at Austin
Liang Zhang , LinkedIn
Deepak Agarwal , LinkedIn
Bee-chung Chen , LinkedIn
pp. 430-438

CallCab: A unified recommendation system for carpooling and regular taxicab services (Abstract)

Desheng Zhang , Department of Computer Science and Engineering, University of Minnesota, USA
Tian He , Department of Computer Science and Engineering, University of Minnesota, USA
Yunhuai Liu , Third Research Institute, Ministry of Public Security, China
John A. Stankovic , Department of Computer Science, University of Virginia, USA
pp. 439-447

Top-K aggregation over a large graph using shared-nothing systems (Abstract)

Abhirup Chakraborty , School of Informatics and Computer Science, Indiana University, Bloomington, Indiana 47408
pp. 448-457

Distributed confidence-weighted classification on MapReduce (Abstract)

Nemanja Djuric , Temple University, Philadelphia, PA, USA
Mihajlo Grbovic , Yahoo! Labs, Sunnyvale, CA, USA
Slobodan Vucetic , Temple University, Philadelphia, PA, USA
pp. 458-466

Scalable context-aware role mining with MapReduce (Abstract)

Zhiwei Yu , School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
Raymond K. Wong , School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
Chi-Hung Chi , ISSL, CSIRO, Tasmania, Australia
pp. 467-474

Elver: Recommending Facebook pages in cold start situation without content features (Abstract)

Yusheng Xie , Northwestern University, Evanston, Illinois 60201, USA
Zhengzhang Chen , Northwestern University, Evanston, Illinois 60201, USA
Kunpeng Zhang , Northwestern University, Evanston, Illinois 60201, USA
Chen Jin , Northwestern University, Evanston, Illinois 60201, USA
Yu Cheng , Northwestern University, Evanston, Illinois 60201, USA
Ankit Agrawal , Northwestern University, Evanston, Illinois 60201, USA
Alok Choudhary , Northwestern University, Evanston, Illinois 60201, USA
pp. 475-479

Massively scalable near duplicate detection in streams of documents using MDSH (Abstract)

Paul Logasa Bogen , Computational Data Analytics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Christopher T. Symons , Computational Data Analytics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Amber McKenzie , Computational Data Analytics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Robert M. Patton , Computational Data Analytics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Robert E. Gillen , Computational Data Analytics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831
pp. 480-486

Classification of big velocity data via cross-domain Canonical Correlation Analysis (Abstract)

Bo Zhang , The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Zhong-Zhi Shi , The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
pp. 493-498

A distributed tree data structure for real-time OLAP on cloud architectures (Abstract)

F. Dehne , School of Computer Science, Carleton University, Ottawa, Canada
Q. Kong , Faculty of Computer Science, Dalhousie University, Halifax, Canada
A. Rau-Chaplin , Faculty of Computer Science, Dalhousie University, Halifax, Canada
H. Zaboli , School of Computer Science, Carleton University, Ottawa, Canada
R. Zhou , School of Computer Science, Carleton University, Ottawa, Canada
pp. 499-505

DL-MPI: Enabling data locality computation for MPI-based data-intensive applications (Abstract)

Jiangling Yin , Department of Electrical Engineering & Computer Science, University of Central Florida, Orlando, Florida 32826
Andrew Foran , Department of Electrical Engineering & Computer Science, University of Central Florida, Orlando, Florida 32826
Jun Wang , Department of Electrical Engineering & Computer Science, University of Central Florida, Orlando, Florida 32826
pp. 506-511

Sparse Poisson coding for high dimensional document clustering (Abstract)

Chenxia Wu , College of Computer Science, Zhejiang University, Hangzhou, China
Haiqin Yang , Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
Jianke Zhu , College of Computer Science, Zhejiang University, Hangzhou, China
Jiemi Zhang , College of Computer Science, Zhejiang University, Hangzhou, China
Irwin King , Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
Michael R. Lyu , Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
pp. 512-517

Fast OLAP query execution in main memory on large data in a cluster (Abstract)

Martin Weidner , SAP AG, 69190 Walldorf, Germany
Jonathan Dees , SAP AG, 69190 Walldorf, Germany
Peter Sanders , Karlsruhe Institute of Technology, 76128 Karlsruhe, Germany
pp. 518-524

Group-Scheme: SIMD-based compression algorithms for web text data (Abstract)

Xudong Zhang , Department of Computer Science and Technology, Peking University, Beijing, China
Wayne Xin Zhao , Department of Computer Science and Technology, Peking University, Beijing, China
Dongdong Shan , Department of Computer Science and Technology, Peking University, Beijing, China
Hongfei Yan , Department of Computer Science and Technology, Peking University, Beijing, China
pp. 525-530

Efficient large graph pattern mining for big data in the cloud (Abstract)

Chun-Chieh Chen , Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan, R.O.C.
Kuan-Wei Lee , Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.
Chih-Chieh Chang , Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, R.O.C.
De-Nian Yang , Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.
Ming-Syan Chen , Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, R.O.C.
pp. 531-536

A stream partitioning approach to processing large scale distributed graph datasets (Abstract)

Rui Wang , Department of Computer Science, State University of New York at Binghamton
Kenneth Chiu , Department of Computer Science, State University of New York at Binghamton
pp. 537-542

Scalable distributed event detection for Twitter (Abstract)

Richard McCreadie , School of Computing Science, University of Glasgow
Craig Macdonald , School of Computing Science, University of Glasgow
Iadh Ounis , School of Computing Science, University of Glasgow
Miles Osborne , School of Informatics, University of Edinburgh
Sasa Petrovic , School of Informatics, University of Edinburgh
pp. 543-549

Analysis of GSM calls data for understanding user mobility behavior (Abstract)

Barbara Furletti , KDDLab, ISTI-CNR, Pisa, Italy
Lorenzo Gabrielli , KDDLab, ISTI-CNR, Pisa, Italy
Chiara Renso , KDDLab, ISTI-CNR, Pisa, Italy
Salvatore Rinzivillo , KDDLab, ISTI-CNR, Pisa, Italy
pp. 550-555

Scaling concurrency of personalized semantic search over large RDF data (Abstract)

Haizhou Fu , North Carolina State University, Raleigh, NC, USA
HyeongSik Kim , North Carolina State University, Raleigh, NC, USA
Kemafor Anyanwu , North Carolina State University, Raleigh, NC, USA
pp. 556-562

A hypergraph-partitioned vertex programming approach for large-scale consensus optimization (Abstract)

Hui Miao , Dept. of Computer Science, University of Maryland, College Park, USA; Dept. of Electrical & Computer Engineering, University of Maryland, College Park, USA
Xiangyang Liu , Dept. of Computer Science, University of Maryland, College Park, USA; Dept. of Electrical & Computer Engineering, University of Maryland, College Park, USA
Bert Huang , Dept. of Computer Science, University of Maryland, College Park, USA; Dept. of Electrical & Computer Engineering, University of Maryland, College Park, USA
Lise Getoor , Dept. of Computer Science, University of Maryland, College Park, USA; Dept. of Electrical & Computer Engineering, University of Maryland, College Park, USA
pp. 563-568

A higher-order data flow model for heterogeneous Big Data (Abstract)

Simon Price , Intelligent Systems Laboratory, University of Bristol, Bristol BS8 1UB, UK
Peter A. Flach , IT Services R&D / ILRT, University of Bristol, Bristol BS8 1HH, UK
pp. 569-574

Parallel subgroup discovery on computing clusters — First results (Abstract)

Daniel Trabold , Fraunhofer IAIS, Schloss Birlinghoven, 53754 Sankt Augustin
Henrik Grosskreutz , Fraunhofer IAIS, Schloss Birlinghoven, 53754 Sankt Augustin
pp. 575-579

DP-WHERE: Differentially private modeling of human mobility (Abstract)

Darakhshan J. Mir , Rutgers University
Sibren Isaacman , Loyola University Maryland
Ramon Caceres , AT&T Labs
Margaret Martonosi , Princeton University
Rebecca N. Wright , Rutgers University
pp. 580-588

Malicious URL filtering — A big data application (Abstract)

Min-Sheng Lin , Dept. of Computer Science and Information Engineering, National Taiwan Univ. of Science and Technology, Taipei, 10607 Taiwan
Chien-Yi Chiu , Dept. of Computer Science and Information Engineering, National Taiwan Univ. of Science and Technology, Taipei, 10607 Taiwan
Yuh-Jye Lee , Dept. of Computer Science and Information Engineering, National Taiwan Univ. of Science and Technology, Taipei, 10607 Taiwan
Hsing-Kuo Pao , Dept. of Computer Science and Information Engineering, National Taiwan Univ. of Science and Technology, Taipei, 10607 Taiwan
pp. 589-596

Zero-knowledge private graph summarization (Abstract)

Maryam Shoaran , University of Victoria, BC, Canada
Alex Thomo , University of Victoria, BC, Canada
Jens H. Weber-Jahnke , University of Victoria, BC, Canada
pp. 597-605

Scalable network traffic visualization using compressed graphs (Abstract)

Lei Shi , State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences
Qi Liao , Department of Computer Science, Central Michigan University
Xiaohua Sun , College of Design and Innovation, Tongji University
Yarui Chen , Department of Computer Science and Technology, Tsinghua University
Chuang Lin , Department of Computer Science and Technology, Tsinghua University
pp. 606-612

Breaking the Arc: Risk control for Big Data (Abstract)

Duncan Hodges , Department of Computer Science, University of Oxford, Oxford, UK
Sadie Creese , Department of Computer Science, University of Oxford, Oxford, UK
pp. 613-621

The BTWorld use case for big data analytics: Description, MapReduce logical workflow, and empirical evaluation (Abstract)

Tim Hegeman , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
Bogdan Ghit , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
Mihai Capota , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
Jan Hidders , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
Dick Epema , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
Alexandru Iosup , Parallel and Distributed Systems Group, Delft University of Technology, the Netherlands
pp. 622-630

Modeling heterogeneous time series dynamics to profile big sensor data in complex physical systems (Abstract)

Bin Liu , Rutgers University, NJ, USA
Haifeng Chen , NEC Laboratories America, Princeton, NJ, USA
Abhishek Sharma , NEC Laboratories America, Princeton, NJ, USA
Guofei Jiang , NEC Laboratories America, Princeton, NJ, USA
Hui Xiong , Rutgers University, NJ, USA
pp. 631-638

Efficiently extracting frequent subgraphs using MapReduce (Abstract)

Wei Lu , National University of Singapore
Gang Chen , Zhejiang University
Anthony K. H. Tung , National University of Singapore
Feng Zhao , National University of Singapore
pp. 639-647

Explaining the product range effect in purchase data (Abstract)

Diego Pennacchioli , IMT Institute for Advanced Studies, Piazza San Ponziano 6, Lucca, Italy
Michele Coscia , KDDLab ISTI-CNR, Via G. Moruzzi 1, Pisa, Italy
Salvatore Rinzivillo , KDDLab ISTI-CNR, Via G. Moruzzi 1, Pisa, Italy
Dino Pedreschi , KDDLab University of Pisa, Largo B. Pontecorvo 3, Pisa, Italy
Fosca Giannotti , KDDLab ISTI-CNR, Via G. Moruzzi 1, Pisa, Italy
pp. 648-656

Large Scale predictive analytics for real-time energy management (Abstract)

Natasha Balac , University of California, San Diego, La Jolla, CA, USA
Tamara Sipes , University of California, San Diego, La Jolla, CA, USA
Nicole Wolter , University of California, San Diego, La Jolla, CA, USA
Kenneth Nunes , University of California, San Diego, La Jolla, CA, USA
Bob Sinkovits , University of California, San Diego, La Jolla, CA, USA
Homa Karimabadi , University of California, San Diego, La Jolla, CA, USA
pp. 657-664

Parallel deterministic annealing clustering and its application to LC-MS data analysis (Abstract)

Geoffrey Fox , School of Informatics and Computing, Indiana University, Bloomington IN USA
D. R. Mani , Proteomics and Biomarker Discovery, The Broad Institute of MIT and Harvard, Cambridge, MA USA
Saumyadipta Pyne , CR Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad, India
pp. 665-673

Terabyte-scale image similarity search: Experience and best practice (Abstract)

Diana Moise , INRIA Rennes, France
Denis Shestakov , Aalto University, Finland, INRIA Rennes, France
Gylfi Gudmundsson , INRIA Rennes, France
Laurent Amsaleg , IRISA-CNRS, France
pp. 674-682

HIG — An in-memory database platform enabling real-time analyses of genome data (Abstract)

Matthieu-P. Schapranow , Hasso Plattner Institute, Enterprise Platform and Integration Concepts, August-Bebel-Str. 88, 14482 Potsdam, Germany
Hasso Plattner , Hasso Plattner Institute, Enterprise Platform and Integration Concepts, August-Bebel-Str. 88, 14482 Potsdam, Germany
pp. 691-696

Real-time streaming mobility analytics (Abstract)

Andras Garzo , Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI), Faculty of Informatics, University of Debrecen and Eötvös University, Budapest
Andras A. Benczur , Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI), Faculty of Informatics, University of Debrecen and Eötvös University, Budapest
Csaba Istvan Sidlo , Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI), Faculty of Informatics, University of Debrecen and Eötvös University, Budapest
Daniel Tahara , Yale University
Erik Francis Wyatt , St. Olaf College
pp. 697-702

QuPARA: Query-driven large-scale portfolio aggregate risk analysis on MapReduce (Abstract)

A. Rau-Chaplin , Risk Analytics Lab, Dalhousie University, Halifax, Nova Scotia, Canada
B. Varghese , Risk Analytics Lab, Dalhousie University, Halifax, Nova Scotia, Canada
D. Wilson , Risk Analytics Lab, Dalhousie University, Halifax, Nova Scotia, Canada
Z. Yao , Risk Analytics Lab, Dalhousie University, Halifax, Nova Scotia, Canada
N. Zeh , Risk Analytics Lab, Dalhousie University, Halifax, Nova Scotia, Canada
pp. 703-709

Constructing consumer profiles from social media data (Abstract)

Mauricio Hernandez , IBM Research-Almaden, San Jose, CA, USA
Kirsten Hildrum , IBM TJ Watson Research Center, Yorktown Heights, NY USA
Prateek Jain , IBM TJ Watson Research Center, Yorktown Heights, NY USA
Rohit Wagle , IBM TJ Watson Research Center, Yorktown Heights, NY USA
Bogdan Alexe , IBM Research-Almaden, San Jose, CA, USA
Rajasekar Krishnamurthy , IBM Research-Almaden, San Jose, CA, USA
Ioana Roxana Stanoi , IBM Research-Almaden, San Jose, CA, USA
Chitra Venkatramani , IBM TJ Watson Research Center, Yorktown Heights, NY USA
pp. 710-716

CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework (Abstract)

Chien-Chih Chen , Institute of Information Science; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Yu-Jung Chang , Institute of Information Science; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Wei-Chun Chung , Institute of Information Science; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Der-Tsai Lee , Institute of Information Science; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Jan-Ming Ho , Institute of Information Science; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
pp. 717-722

Demand response targeting using big data analytics (Abstract)

Jungsuk Kwac , Stanford Sustainable Systems Lab, Department of Electrical Engineering, Stanford University, Stanford, CA
Ram Rajagopal , Stanford Sustainable Systems Lab, Department of Civil and Environmental Engineering, Stanford University, Stanford, CA
pp. 683-690

Building dynamic thermal profiles of energy consumption for individuals and neighborhoods (Abstract)

Adrian Albert , Electrical Engineering Department, Stanford University
Ram Rajagopal , Civil Engineering Department, Stanford University
pp. 723-728

Terabyte-sized image computations on Hadoop cluster platforms (Abstract)

Peter Bajcsy , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Antoine Vandecreme , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Julien Amelot , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Phuong Nguyen , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Joe Chalfoun , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Mary Brady , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
pp. 729-737

A fast and scalable method for threat detection in large-scale DNS logs (Abstract)

Ron Begleiter , Fortscale Inc., Raoul Wallenberg 22a, Tel-Aviv, Israel
Yuval Elovici , Ben-Gurion University of the Negev
Yona Hollander , Fortscale Inc., Raoul Wallenberg 22a, Tel-Aviv, Israel
Ori Mendelson , Fortscale Inc., Raoul Wallenberg 22a, Tel-Aviv, Israel
Lior Rokach , Ben-Gurion University of the Negev
Roi Saltzman , Fortscale Inc., Raoul Wallenberg 22a, Tel-Aviv, Israel
pp. 738-741

Correlation-based performance analysis for full-system MapReduce optimization (Abstract)

Qi Guo , IBM Research - China, Beijing, China
Yan Li , IBM Research - China, Beijing, China
Tao Liu , IBM Research - China, Beijing, China
Kun Wang , IBM Research - China, Beijing, China
Guancheng Chen , IBM Research - China, Beijing, China
Xiaoming Bao , IBM Systems & Technology Group, Beijing, China
Wentao Tang , IBM Systems & Technology Group, Beijing, China
pp. 753-761

Large scale ad latency analysis (Abstract)

Mihajlo Grbovic , Yahoo! Labs, Sunnyvale, CA, USA
Jon Malkin , Yahoo! Labs, Sunnyvale, CA, USA
Hirakendu Das , Yahoo! Labs, Sunnyvale, CA, USA
pp. 762-767

Accelerating semantic graph databases on commodity clusters (Abstract)

Alessandro Morari , Pacific Northwest National Laboratory, Richland, WA, USA
Vito Giovanni Castellana , Pacific Northwest National Laboratory, Richland, WA, USA
David Haglin , Pacific Northwest National Laboratory, Richland, WA, USA
John Feo , Pacific Northwest National Laboratory, Richland, WA, USA
Jesse Weaver , Pacific Northwest National Laboratory, Richland, WA, USA
Antonino Tumeo , Pacific Northwest National Laboratory, Richland, WA, USA
Oreste Villa , NVIDIA, Santa Clara, CA, USA
pp. 768-772

Scaling deep social feeds at Pinterest (Abstract)

Varun Sharma , Pinterest Inc.
Jeremy Carroll , Pinterest Inc.
Abhi Khune , Pinterest Inc.
pp. 777-783

Big data analytics on high Velocity streams: A case study (Abstract)

Thibaud Chardonnens , eXascale Infolab, University of Fribourg, Switzerland
Philippe Cudre-Mauroux , eXascale Infolab, University of Fribourg, Switzerland
Martin Grund , eXascale Infolab, University of Fribourg, Switzerland
Benoit Perroud , VeriSign Inc, Fribourg, Switzerland
pp. 784-787

The Code rebalancing problem for a storage-flexible Data Center Network (Abstract)

Iryna Andriyanova , ETIS group, ENSEA/UCP/CNRS-UMR8051, Cergy-Pontoise, France
Alan Jule , ETIS group, ENSEA/UCP/CNRS-UMR8051, Cergy-Pontoise, France
Emina Soljanin , Math of Netw. & Comm. Research, Bell Labs, Alcatel-Lucent, Murray Hill, NJ 07974
pp. 1-6

suvfs: A virtual file system in userspace that supports large files (Abstract)

Wasim Ahmad Bhat , Department of Computer Sciences, University of Kashmir
S. M. K. Quadri , Department of Computer Sciences, University of Kashmir
pp. 7-11

Distributed storage evaluation on a three-wide inter-data center deployment (Abstract)

Yih-Farn Chen , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
Scott Daniels , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
Marios Hadjieleftheriou , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
Pingkai Liu , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
Chao Tian , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
Vinay Vaishampayan , AT&T Labs-Research, Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932
pp. 17-22

Efficient updates in cross-object erasure-coded storage systems (Abstract)

Kyumars Sheykh Esmaili , School of Computer Engineering, Nanyang Technological University, Singapore
Aatish Chiniah , Computer Science and Engineering Department, University of Mauritius, Mauritius
Anwitaman Datta , School of Computer Engineering, Nanyang Technological University, Singapore
pp. 28-32

Construction of exact-BASIC codes for distributed storage systems at the MSR point (Abstract)

Hanxu Hou , Shenzhen Eng. Lab of Converged Networks Technology, Shenzhen Key Lab of Cloud Computing Tech. and App., Peking University Shenzhen Graduate School
Kenneth W. Shum , Institute of Network Coding, the Chinese University of Hong Kong
Hui Li , Shenzhen Eng. Lab of Converged Networks Technology, Shenzhen Key Lab of Cloud Computing Tech. and App., Peking University Shenzhen Graduate School
pp. 33-38

Minimum storage BASIC codes: A system perspective (Abstract)

Xianxia Huang , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Hui Li , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Tai Zhou , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Yumeng Zhang , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Han Guo , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Hanxu Hou , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Huayu Zhang , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
Kai Lei , Shenzhen Engineering Lab of Converged Networks Technology, SPCCTA, Shenzhen Graduate School, Peking University, ShenZhen, China
pp. 39-43

Layout-aware I/O Scheduling for terabits data movement (Abstract)

Youngjae Kim , Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Scott Atchley , Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Geoffroy R. Vallee , Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Galen M. Shipman , Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
pp. 44-51

Reliability of erasure coded storage systems: A geometric approach (Abstract)

Antonio Campello , Institute of Mathematics, Statistics, and Computer Science, University of Campinas, São Paulo, 13083-859, Brazil
Vinay A. Vaishampayan , AT&T Labs-Research, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932, USA
pp. 12-16

Robustness of emotion extraction from 20th century English books (Abstract)

Alberto Acerbi , Department of Archaeology and Anthropology, University of Bristol, Bristol, United Kingdom
Vasileios Lampos , Department of Computer Science, University of Sheffield, Sheffield, United Kingdom
R. Alexander Bentley , Department of Archaeology and Anthropology, University of Bristol, Bristol, United Kingdom
pp. 1-8

Back to our data — Experiments with NoSQL technologies in the Humanities (Abstract)

Tobias Blanke , Centre for e-Research, Department of Digital Humanities, King's College, London
Michael Bryant , Centre for e-Research, Department of Digital Humanities, King's College, London
Mark Hedges , Centre for e-Research, Department of Digital Humanities, King's College, London
pp. 17-20

VisualPage: Towards large scale analysis of nineteenth-century print culture (Abstract)

Neal Audenaert , Digital Archives, Research and Technology Services, College Station, TX
Natalie M. Houston , Dept. of English, University of Houston, Houston, TX
pp. 9-16

Humanities ‘big data’: Myths, challenges, and lessons (Abstract)

Amalia S. Levi , College of Information Studies, University of Maryland College Park, MD 20742, USA
pp. 33-36

Digging into human rights violations: Data modelling and collective memory (Abstract)

Ben Miller , Georgia State University
Ayush Shrestha , Georgia State University
Jason Derby , Georgia State University
Jennifer Olive , Georgia State University
Karthikeyan Umapathy , University of North Florida
Fuxin Li , Georgia Institute of Technology
Yanjun Zhao , Georgia State University
pp. 37-45

The royal birth of 2013: Analysing and visualising public sentiment in the UK using Twitter (Abstract)

Vu Dung Nguyen , Big Data Laboratory, School of Computer Science, University of St Andrews, UK
Blesson Varghese , Big Data Laboratory, School of Computer Science, University of St Andrews, UK
Adam Barker , Big Data Laboratory, School of Computer Science, University of St Andrews, UK
pp. 46-54

Bibliographic records as humanities big data (Abstract)

Andrew Prescott , Dept of Digital Humanities, King's College London, United Kingdom
pp. 55-58

Customising geoparsing and georeferencing for historical texts (Abstract)

C.J. Rupp , Lancaster University, Lancaster, UK
Paul Rayson , Lancaster University, Lancaster, UK
Alistair Baron , Lancaster University, Lancaster, UK
Christopher Donaldson , Lancaster University, Lancaster, UK
Ian Gregory , Lancaster University, Lancaster, UK
Andrew Hardie , Lancaster University, Lancaster, UK
Patricia Murrieta-Flores , Lancaster University, Lancaster, UK
pp. 59-62

A concept of Generic Workspace for Big Data Processing in Humanities (Abstract)

Jedrzej Rybicki , Forschungszentrum Juelich GmbH, JSC, Juelich, Germany
Benedikt von St. Vieth , Forschungszentrum Juelich GmbH, JSC, Juelich, Germany
Daniel Mallmann , Forschungszentrum Juelich GmbH, JSC, Juelich, Germany
pp. 63-70

From assets to stories via the Google Cultural Institute Platform (Abstract)

W. Brent Seales , Computer Science, University of Kentucky, Lexington, Kentucky, USA
Steve Crossan , Cultural Institute, Google Paris, France
Mark Yoshitake , Cultural Institute, Google Paris, France
Sertan Girgin , Google Paris, France
pp. 71-76

The curious identity of Michael Field and its implications for humanities research with the semantic web (Abstract)

Susan Brown , University of Alberta / University of Guelph, Implementing the New Knowledge Environment/Text Mining & Visualization for Literary History, Edmonton, Canada
John Simpson , University of Alberta, Implementing the New Knowledge Environment/Text Mining & Visualization for Literary History, Edmonton, Canada
pp. 77-85

Infectious texts: Modeling text reuse in nineteenth-century newspapers (Abstract)

David A. Smith , College of Computer and Information Science
Ryan Cordell , Department of English, Northeastern University, Boston, MA, U.S.A.
Elizabeth Maddock Dillon , Department of English, Northeastern University, Boston, MA, U.S.A.
pp. 86-94

Mapping mutable genres in structurally complex volumes (Abstract)

Ted Underwood , Department of English, University of Illinois, Urbana-Champaign, Urbana, IL, USA
Michael L. Black , Department of English, University of Illinois, Urbana-Champaign, Urbana, IL, USA
Loretta Auvil , Illinois Informatics Institute, University of Illinois, Urbana-Champaign, Urbana, IL, USA
Boris Capitanu , Illinois Informatics Institute, University of Illinois, Urbana-Champaign, Urbana, IL, USA
pp. 95-103

CKM: A shared visual analytical tool for large-scale analysis of audio-video interviews (Abstract)

Lu Xiao , The University of Western Ontario, Canada
Yan Luo , The University of Western Ontario, Canada
Steven High , Concordia University, Canada
pp. 104-112

A case study on entity Resolution for Distant Processing of big Humanities data (Abstract)

Weijia Xu , Texas Advanced Computing Center, University of Texas at Austin
Maria Esteva , Texas Advanced Computing Center, University of Texas at Austin
Jessica Trelogan , Institute of Classical Archaeology, University of Texas at Austin
Todd Swinson , Department of Computer Sciences, University of Texas at Austin
pp. 113-120

The human face of crowdsourcing: A citizen-led crowdsourcing case study (Abstract)

Sheryl Grant , School of Information and Library Science (SILS), University of North Carolina Chapel Hill, Chapel Hill, NC, USA
Richard Marciano , School of Information and Library Science (SILS), University of North Carolina Chapel Hill, Chapel Hill, NC, USA
Priscilla Ndiaye , Southside Advisory Community Board, Asheville, NC, USA
Kristan E. Shawgo , HASTAC, Duke University, Durham, NC, USA
Jeff Heard , Renaissance Computing Institute (RENCI), University of North Carolina, Chapel Hill, Chapel Hill, NC, USA
pp. 21-24

A cloud service for the evaluation of company's financial health using XBRL-based financial statements (Abstract)

Wen-Chiao Hsu , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
Jyun-Yao Huang , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
Chi-Hao Chen , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
Chien-Yu Su , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
Hsiao-Chen Shih , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
Tzu-Ya Liao , Department of Business Administration, National Cheng-Kung University, Tainan, Taiwan
I-En Liao , Department of Computer Science and Engineering, National Chung Hsing University, Taichung, Taiwan
pp. 10-14

Real-time data analysis in ClowdFlows (Abstract)

Janez Kranjc , Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Vid Podpecan , Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Nada Lavrac , Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
pp. 15-22

Ma3tch: Privacy and knowledge: ‘Dynamic networked collective intelligence’ (Abstract)

Udo Kroon , FIU.NET, Ministry of Security and Justice, The Hague, Netherlands
pp. 23-31

Business model canvas perspective on big data applications (Abstract)

F. Canari Pembe Muhtaroglu , TÜBİTAK-BİLGEM Gebze, Kocaeli, Turkey
Seniz Demir , TÜBİTAK-BİLGEM Gebze, Kocaeli, Turkey
Murat Obali , TÜBİTAK-BİLGEM Gebze, Kocaeli, Turkey
Canan Girgin , TÜBİTAK-BİLGEM Gebze, Kocaeli, Turkey
pp. 32-37

Advancing value creation and value capture in data-intensive contexts (Abstract)

Roman Ferrando-Llopis , Research Unit R-Knowing
David Lopez-Berzosa , Business School, University of Exeter, UK
Catherine Mulligan , Innovation and Entrepreneurship group, Imperial College, London, UK
pp. 5-9

OpenFridge: A platform for data economy for energy efficiency data (Abstract)

Slobodanka Dana Kathrin Tomic , The Telecommunications Research Center Vienna (FTW), Vienna, Austria
Anna Fensel , Semantic Technology Institute (STI) Innsbruck, University of Innsbruck, Innsbruck, Austria
pp. 43-47

A study of innovation network database Construction by using big data and an enterprise strategy model (Abstract)

Zhou Wen , School of Computer Engineering and Science, Shanghai University, Shanghai, China
Ye Shu-Tao , School of Computer Engineering and Science, Shanghai University, Shanghai, China
Lu Xiao-Long , School of Computer Engineering and Science, Shanghai University, Shanghai, China
pp. 48-52

Enhanced user data privacy with pay-by-data model (Abstract)

Chao Wu , Department of Computing, Imperial College London, London, UK
Yike Guo , Department of Computing, Imperial College London, London, UK
pp. 53-57

Query optimization over a heterogeneously distributed scientific database (Abstract)

Helen X. Xiang , Computer Science, University of Hertfordshire, UK
pp. 58-64

Enterprise data economy: A hadoop-driven model and strategy (Abstract)

Wuheng Luo , Sears Holdings | MetaScale Hoffman Estates, IL, USA
pp. 65-70

Understanding the value of (big) data (Abstract)

Pantelis Koutroumpis , Imperial College Business School, Imperial College London, London, UK
Aija Leiponen , Imperial College Business School, Imperial College London, London, UK
pp. 38-42

Hash in a flash: Hash tables for flash devices (Abstract)

Tyler Clemons , The Ohio State University, 2015 Neil Ave, Columbus, OH, USA
S M Faisal , The Ohio State University, 2015 Neil Ave, Columbus, OH, USA
Shirish Tatikonda , IBM Almaden Research Center, 650 Harry Rd, San Jose, CA 95123
Charu Aggarwal , IBM T. J. Watson Center, 1101 Kitchawan Road, Yorktown Heights, NY, USA
Srinivasan Parthasarathy , The Ohio State University, 2015 Neil Ave, Columbus, OH, USA
pp. 7-14

Memory system characterization of big data workloads (Abstract)

Martin Dimitrov , Software and Services Group, Intel Corporation
Karthik Kumar , Software and Services Group, Intel Corporation
Patrick Lu , Datacenter and Connected Systems Group, Intel Corporation
Vish Viswanathan , Software and Services Group, Intel Corporation
Thomas Willhalm , Software and Services Group, Intel Corporation
pp. 15-22

Optimizing a MapReduce module of preprocessing high-throughput DNA sequencing data (Abstract)

Wei-Chun Chung , Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Yu-Jung Chang , Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
Chien-Chih Chen , Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC
Der-Tsai Lee , Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC
Jan-Ming Ho , Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, ROC
pp. 1-6

Performance evaluation of R with Intel Xeon Phi coprocessor (Abstract)

Yaakoub El-Khamra , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas USA
Niall Gaffney , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas USA
David Walling , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas USA
Eric Wernert , Pervasive Technology Institute, Indiana University, Bloomington, Indiana USA
Weijia Xu , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas USA
Hui Zhang , Pervasive Technology Institute, Indiana University, Bloomington, Indiana USA
pp. 23-30

A performance evaluation of Hive for scientific data management (Abstract)

Taoying Liu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jing Liu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Hong Liu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Wei Li , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
pp. 39-46

Evaluating task scheduling in hadoop-based cloud systems (Abstract)

Shengyuan Liu , College of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China
Jungang Xu , College of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing, China
Zongzhen Liu , Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Xu Liu , Department of Computer Science, Rice University, Houston, TX, USA
pp. 47-53

Efficient near-duplicate document detection using FPGAs (Abstract)

Xi Luo , Computer Science and Engineering, UC Riverside, Riverside, CA, USA
Walid Najjar , Computer Science and Engineering, UC Riverside, Riverside, CA, USA
Vagelis Hristidis , Computer Science and Engineering, UC Riverside, Riverside, CA, USA
pp. 54-61

Workload-aware aggregate maintenance in columnar in-memory databases (Abstract)

Stephan Muller , Hasso Plattner Institute, University of Potsdam, Germany
Lars Butzmann , Hasso Plattner Institute, University of Potsdam, Germany
Stefan Klauck , Hasso Plattner Institute, University of Potsdam, Germany
Hasso Plattner , Hasso Plattner Institute, University of Potsdam, Germany
pp. 62-69

Virtualization I/O optimization based on shared memory (Abstract)

Fengfeng Ning , Department of Computer Science and Engineering, Shanghai Jiao Tong University
Chuliang Weng , Department of Computer Science and Engineering, Shanghai Jiao Tong University
Yuan Luo , Department of Computer Science and Engineering, Shanghai Jiao Tong University
pp. 70-77

An ensemble MIC-based approach for performance diagnosis in big data platform (Abstract)

Pengfei Chen , School of Electronic and Information Engineering, Xi'an Jiaotong University
Yong Qi , School of Electronic and Information Engineering, Xi'an Jiaotong University
Xinyi Li , School of Electronic and Information Engineering, Xi'an Jiaotong University
Li Su , School of Electronic and Information Engineering, Xi'an Jiaotong University
pp. 78-85

A reconfigurable stream compression hardware based on static symbol-lookup table (Abstract)

Shinichi Yamagiwa , Faculty of Engineering, Information and Systems, University of Tsukuba / JST PRESTO, 1-1-1 Tennodai, Tsukuba, Ibaraki, Japan
Hiroshi Sakamoto , Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu Iizuka-shi Fukuoka, Japan
pp. 86-93

NativeTask: A Hadoop compatible framework for high performance (Abstract)

Dong Yang , Intel Corporation, Beijing, China
Xiang Zhong , Intel Corporation, Beijing, China
Dong Yan , Intel Corporation, Beijing, China
Fangqin Dai , Intel Corporation, Beijing, China
Xusen Yin , Intel Corporation, Beijing, China
Cheng Lian , Intel Corporation, Beijing, China
Zhongliang Zhu , Intel Corporation, Beijing, China
Weihua Jiang , Intel Corporation, Beijing, China
Gansha Wu , Intel Corporation, Beijing, China
pp. 94-101

On mixing high-speed updates and in-memory queries: A big-data architecture for real-time analytics (Abstract)

Tao Zhong , Software and Services Group, Intel
Kshitij A Doshi , Software and Services Group, Intel
Xi Tang , Software and Services Group, Intel
Ting Lou , Software and Services Group, Intel
Zhongyan Lu , Software and Services Group, Intel
Hong Li , Software and Services Group, Intel
pp. 102-109

AxPUE: Application level metrics for power usage effectiveness in data centers (Abstract)

Runlin Zhou , National Computer Network Emergency Response Technical Team, Coordination Center of China, Beijing, China
Yingjie Shi , State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Chunge Zhu , National Computer Network Emergency Response Technical Team, Coordination Center of China, Beijing, China
pp. 110-117

The implications from benchmarking three big data systems (Abstract)

Jing Quan , School of Software Engineering, University of Science and Technology of China, Hefei, China
Yingjie Shi , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Ming Zhao , School of Computing and Information Sciences, Florida International University, Florida, USA
Wei Yang , School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
pp. 31-38

A characterization of big data benchmarks (Abstract)

Wen Xiong , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Zhibin Yu , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Zhendong Bei , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Juanjuan Zhao , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Fan Zhang , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Yubin Zou , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Xue Bai , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Ye Li , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
Chengzhong Xu , Center for Cloud Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 518055 ShenZhen, China
pp. 118-125

Dynamic reduction of query result sets for interactive visualization (Abstract)

Leilani Battle , Electrical Engineering and Computer Science Department, MIT
Michael Stonebraker , Electrical Engineering and Computer Science Department, MIT
Remco Chang , Department of Computer Science, Tufts University
pp. 1-8

Typograph: Multiscale spatial exploration of text documents (Abstract)

Alex Endert , Pacific Northwest National Laboratory, Richland, WA, USA
Russ Burtner , Pacific Northwest National Laboratory, Richland, WA, USA
Nick Cramer , Pacific Northwest National Laboratory, Richland, WA, USA
Ralph Perko , Pacific Northwest National Laboratory, Richland, WA, USA
Shawn Hampton , Pacific Northwest National Laboratory, Richland, WA, USA
Kristin Cook , Pacific Northwest National Laboratory, Richland, WA, USA
pp. 17-24

VisReduce: Fast and responsive incremental information visualization of large datasets (Abstract)

Jean-Francois Im , Ecole de Technol. Super., Montréal, QC, Canada
Felix Giguere Villegas , Matel.com, Montréal, QC, Canada
Michael J. McGuffin , Ecole de Technol. Super., Montréal, QC, Canada
pp. 25-32

A system for large-scale visualization of streaming Doppler data (Abstract)

Peter Kristof , Microsoft
Bedrich Benes , Purdue University
Carol X. Song , Purdue University
Lan Zhao , Purdue University
pp. 33-40

Overplotting: Unified solutions under Abstract Rendering (Abstract)

Joseph Cottam , Indiana University, Center for Research in Extreme Scale Technologies (CREST), Bloomington, IN, USA
Andrew Lumsdaine , Indiana University, Center for Research in Extreme Scale Technologies (CREST), Bloomington, IN, USA
Peter Wang , Continuum Analytics, Austin, TX, USA
pp. 9-16

Visualization of streaming data: Observing change and context in information visualization techniques (Abstract)

Milos Krstajic , University of Konstanz, Konstanz, Germany
Daniel A. Keim , University of Konstanz, Konstanz, Germany
pp. 41-47

CompactMap: A mental map preserving visual interface for streaming text data (Abstract)

Xiaotong Liu , The Ohio State University, Columbus, USA
Yifan Hu , AT&T Labs Research, Florham Park, USA
Stephen North , AT&T Labs Research, Florham Park, USA
Han-Wei Shen , The Ohio State University, Columbus, USA
pp. 48-55

Egocentric storylines for visual analysis of large dynamic graphs (Abstract)

Chris W. Muelder , University of California at Davis, Davis, CA, USA
Tarik Crnovrsanin , University of California at Davis, Davis, CA, USA
Arnaud Sallaberry , LIRMM, Université Paul Valéry Montpellier 3, Montpellier, France
Kwan-Liu Ma , University of California at Davis, Davis, CA, USA
pp. 56-62

GPU-accelerated incremental correlation clustering of large data with visual feedback (Abstract)

Eric Papenhausen , Visual Analytics and Imaging Lab, Center for Visual Computing, Computer Science Department, Stony Brook University, Stony Brook, NY, USA
Bing Wang , Visual Analytics and Imaging Lab, Center for Visual Computing, Computer Science Department, Stony Brook University, Stony Brook, NY, USA
Sungsoo Ha , Visual Analytics and Imaging Lab, Center for Visual Computing, Computer Science Department, SUNY Korea, Songdo, Korea
Alla Zelenyuk , Chemical and Material Sciences Division, Pacific Northwest National Lab, Richland, WA, USA
Dan Imre , Imre Consulting, Richland, WA, USA
Klaus Mueller , Visual Analytics and Imaging Lab, Center for Visual Computing, Computer Science Department, Stony Brook University, Stony Brook, NY, USA
pp. 63-70

Visualization of big SPH simulations via compressed octree grids (Abstract)

Florian Reichl , Computer Graphics & Visualization Group, Technische Universität München, Munich, Germany
Marc Treib , Computer Graphics & Visualization Group, Technische Universität München, Munich, Germany
Rudiger Westermann , Computer Graphics & Visualization Group, Technische Universität München, Munich, Germany
pp. 71-78

A novel visual analytics approach for clustering large-scale social data (Abstract)

Zhangye Wang , State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
Chang Chen , College of Software Engineering, University of Science and Technology of China, Hefei, China
Juanxia Zhou , State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
Jiyuan Liao , State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
Wei Chen , State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
Ross Maciejewski , School of CIDSE, Arizona State University, USA
pp. 79-86

DriveSense: Contextual handling of large-scale route map data for the automobile (Abstract)

Frederik Wiehr , Saarland University, Germany
Vidya Setlur , Nokia Research Center and Tableau Software, USA
Alark Joshi , University of San Francisco, USA
pp. 87-94

A big data analytics framework for scientific data management (Abstract)

Sandro Fiore , Centro Euro-Mediterraneo sui Cambiamenti Climatici, Italy
Cosimo Palazzo , Centro Euro-Mediterraneo sui Cambiamenti Climatici, Italy
Alessandro D'Anca , Centro Euro-Mediterraneo sui Cambiamenti Climatici, Italy
Ian Foster , Computation Institute, Argonne National Lab and University of Chicago, Chicago, IL 60637, USA
Dean N. Williams , Lawrence Livermore National Laboratory, Livermore, CA, USA
Giovanni Aloisio , Centro Euro-Mediterraneo sui Cambiamenti Climatici, Italy
pp. 1-8

Searching inter-disciplinary scientific big data based on latent correlation analysis (Abstract)

Eloy Gonzales , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
Bun Theang Ong , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
Koji Zettsu , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
pp. 9-12

Complete storm identification algorithms from big raw rainfall data using MapReduce framework (Abstract)

Kulsawasd Jitkajornwanich , Computer Science and Engineering Department, The University of Texas at Arlington
Upa Gupta , Computer Science and Engineering Department, The University of Texas at Arlington
Sakthi Kumaran Shanmuganathan , Computer Science and Engineering Department, The University of Texas at Arlington
Ramez Elmasri , Computer Science and Engineering Department, The University of Texas at Arlington
Leonidas Fegaras , Computer Science and Engineering Department, The University of Texas at Arlington
John McEnery , Computer Science and Engineering Department, The University of Texas at Arlington
pp. 13-20

A scalable data analysis platform for metagenomics (Abstract)

Wei Tang , Argonne National Laboratory, Argonne, IL, USA
Jared Wilkening , Argonne National Laboratory, Argonne, IL, USA
Narayan Desai , Argonne National Laboratory, Argonne, IL, USA
Wolfgang Gerlach , University of Chicago, Chicago, IL, USA
Andreas Wilke , Argonne National Laboratory, Argonne, IL, USA
Folker Meyer , Argonne National Laboratory, Argonne, IL, USA
pp. 21-26

Rethinking data management for big data scientific workflows (Abstract)

Karan Vahi , Information Sciences Institute - University of Southern California, Marina Del Rey, USA
Mats Rynge , Information Sciences Institute - University of Southern California, Marina Del Rey, USA
Gideon Juve , Information Sciences Institute - University of Southern California, Marina Del Rey, USA
Rajiv Mayani , Information Sciences Institute - University of Southern California, Marina Del Rey, USA
Ewa Deelman , Information Sciences Institute - University of Southern California, Marina Del Rey, USA
pp. 27-35

SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop (Abstract)

Pengfei Xuan , School of Computing, Clemson University, Clemson, SC, USA
Yueli Zheng , School of Computing, Clemson University, Clemson, SC, USA
Sapna Sarupria , Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
Amy Apon , School of Computing, Clemson University, Clemson, SC, USA
pp. 36-44

Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization (Abstract)

Mohammadreza Babaee , Institute for Human-Machine Communication, Technische Universität München & Munich Aerospace Faculty, Munich, Germany
Mihai Datcu , Munich Aerospace Faculty, German Aerospace Center (DLR), Wessling, Germany
Gerhard Rigoll , Institute for Human-Machine Communication, Technische Universität München & Munich Aerospace Faculty, Munich, Germany
pp. 1-6

Hierarchical feature learning from sensorial data by spherical clustering (Abstract)

Bonny Banerjee , Institute for Intelligent Systems, and Dept. of Electrical & Computer Engineering, The University of Memphis, Memphis, TN 38152, USA
Jayanta K. Dutta , Institute for Intelligent Systems, and Dept. of Electrical & Computer Engineering, The University of Memphis, Memphis, TN 38152, USA
pp. 7-13

Efficient learning from explanation of prediction errors in streaming data (Abstract)

Bonny Banerjee , Institute for Intelligent Systems, and Dept. of Electrical & Computer Engineering, The University of Memphis, Memphis, TN 38152, USA
Jayanta K. Dutta , Institute for Intelligent Systems, and Dept. of Electrical & Computer Engineering, The University of Memphis, Memphis, TN 38152, USA
pp. 14-20

Distributed Pivot Clustering with arbitrary distance functions (Abstract)

L. Karl Branting , 7525 Colshire Drive, McLean, Virginia, USA
pp. 21-27

Feature selection strategies for classifying high dimensional astronomical data sets (Abstract)

Ciro Donalek , California Institute of Technology, Pasadena (CA), USA
S. G. Djorgovski , California Institute of Technology, Pasadena (CA), USA
Ashish A. Mahabal , California Institute of Technology, Pasadena (CA), USA
Matthew J. Graham , California Institute of Technology, Pasadena (CA), USA
Andrew J. Drake , California Institute of Technology, Pasadena (CA), USA
A. Arun Kumar , St. Thomas College, Kerala, India
N. Sajeeth Philip , St. Thomas College, Kerala, India
Thomas J. Fuchs , Jet Propulsion Laboratory, California Institute of Technology, Pasadena (CA), USA
Michael J. Turmon , Jet Propulsion Laboratory, California Institute of Technology, Pasadena (CA), USA
Michael Ting-Chang Yang , Graduate Institute of Astronomy, NCU, Taiwan
Giuseppe Longo , Università degli Studi Federico II, Napoli, Italy
pp. 35-41

How data partitioning strategies and subset size influence the performance of an ensemble? (Abstract)

Majed Farrash , School of Computing Sciences, University of East Anglia
Wenjia Wang , School of Computing Sciences, University of East Anglia
pp. 42-49

Fast Change Point Detection for electricity market analysis (Abstract)

William Gu , Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Jaesik Choi , Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Ming Gu , University of California Berkeley, Berkeley, CA, USA
Horst Simon , Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Kesheng Wu , Lawrence Berkeley National Laboratory, Berkeley, CA, USA
pp. 50-57

A novel integrated method for human multiplex protein subcellular localization prediction (Abstract)

Hong Gu , School of Control Science and Engineering, Dalian University of Technology, Dalian, China
Junzhe Cao , School of Control Science and Engineering, Dalian University of Technology, Dalian, China
pp. 58-62

Learning from multiple data sets with different missing attributes and privacy policies: Parallel distributed fuzzy genetics-based machine learning approach (Abstract)

Hisao Ishibuchi , Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University, Sakai, Osaka 599-8531, Japan
Masakazu Yamane , Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University, Sakai, Osaka 599-8531, Japan
Yusuke Nojima , Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University, Sakai, Osaka 599-8531, Japan
pp. 63-70

Data chaos: An entropy based MapReduce framework for scalable learning (Abstract)

Jiaoyan Chen , College of Computer Science, Zhejiang University
Huajun Chen , College of Computer Science, Zhejiang University
Xi Chen , College of Computer Science, Zhejiang University
Guozhou Zheng , College of Computer Science, Zhejiang University
Zhaohui Wu , College of Computer Science, Zhejiang University
pp. 71-78

Exploring sketches for probability estimation with sublinear memory (Abstract)

Anthony Kleerekoper , School of Computer Science, The University of Manchester, UK
Mikel Lujan , School of Computer Science, The University of Manchester, UK
Gavin Brown , School of Computer Science, The University of Manchester, UK
pp. 79-86

Agglomerative co-clustering for synonymous phrases based on common effects and influences (Abstract)

Koji Kumanami , Graduate School of System Informatics, Kobe University, Kobe, Japan
Kazuhiro Seki , Graduate School of System Informatics, Kobe University, Kobe, Japan
Kuniaki Uehara , Graduate School of System Informatics, Kobe University, Kobe, Japan
pp. 87-94

Leveraging memory mapping for fast and scalable graph computation on a PC (Abstract)

Zhiyuan Lin , College of Computing, Georgia Tech, Atlanta, GA, USA
Duen Horng Polo Chau , College of Computing, Georgia Tech, Atlanta, GA, USA
U Kang , Computer Science Department, KAIST, Republic of Korea
pp. 95-98

Scalable sentiment classification for Big Data analysis using Naïve Bayes Classifier (Abstract)

Bingwei Liu , Intelligent Fusion Technology, Inc., Germantown, Maryland, USA
Erik Blasch , Air Force Research Laboratory, Rome, New York, USA
Yu Chen , Binghamton University, Binghamton, New York, USA
Dan Shen , Intelligent Fusion Technology, Inc., Germantown, Maryland, USA
Genshe Chen , Intelligent Fusion Technology, Inc., Germantown, Maryland, USA
pp. 99-104

Meta-learning for large scale machine learning with MapReduce (Abstract)

Xuan Liu , School of EECS, University of Ottawa, Ottawa, Canada
Xiaoguang Wang , School of EECS, University of Ottawa, Ottawa, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Halifax, Canada
Nathalie Japkowicz , School of EECS, University of Ottawa, Ottawa, Canada
pp. 105-110

Frequent Itemset Mining for Big Data (Abstract)

Sandy Moens , Universiteit Antwerpen, Belgium
Emin Aksehirli , Universiteit Antwerpen, Belgium
Bart Goethals , Universiteit Antwerpen, Belgium
pp. 111-118

Evaluating parallel logistic regression models (Abstract)

Haoruo Peng , HTC Research Center, Beijing, China
Ding Liang , HTC Research Center, Beijing, China
Cyrus Choi , HTC Research Center, Beijing, China
pp. 119-126

Approximate triangle counting algorithms on multi-cores (Abstract)

Mahmudur Rahman , Dept. of Computer and Information Science, Indiana University-Purdue University, Indianapolis
Mohammad Al Hasan , Dept. of Computer and Information Science, Indiana University-Purdue University, Indianapolis
pp. 127-133

Tree Labeled LDA: A Hierarchical model for web summaries (Abstract)

Anton Slutsky , College of Information Science and Technology, Drexel University, Philadelphia, PA, USA
Xiaohua Hu , College of Information Science and Technology, Drexel University, Philadelphia, PA, USA
Yuan An , College of Information Science and Technology, Drexel University, Philadelphia, PA, USA
pp. 134-140

Nearest neighbour regression outperforms model-based prediction of specific star formation rate (Abstract)

Kristoffer Stensbo-Smidt , Department of Computer Science, University of Copenhagen
Christian Igel , Department of Computer Science, University of Copenhagen
Andrew Zirm , Dark Cosmology Centre, Niels Bohr Institute, University of Copenhagen
Kim Steenstrup Pedersen , Department of Computer Science, University of Copenhagen
pp. 141-144

MapReduce implementation of Variational Bayesian Probabilistic Matrix Factorization algorithm (Abstract)

Naveen C. Tewari , Center for Knowledge Driven Intelligent Systems, Infosys Labs, Infosys Limited, Electronics City, Hosur Road, Bangalore - 560 100, India
Hari M. Koduvely , Center for Knowledge Driven Intelligent Systems, Infosys Labs, Infosys Limited, Electronics City, Hosur Road, Bangalore - 560 100, India
Sarbendu Guha , Center for Knowledge Driven Intelligent Systems, Infosys Labs, Infosys Limited, Electronics City, Hosur Road, Bangalore - 560 100, India
Arun Yadav , Center for Knowledge Driven Intelligent Systems, Infosys Labs, Infosys Limited, Electronics City, Hosur Road, Bangalore - 560 100, India
Gladbin David , Center for Knowledge Driven Intelligent Systems, Infosys Labs, Infosys Limited, Electronics City, Hosur Road, Bangalore - 560 100, India
pp. 145-152

A unified framework for predicting attributes and links in social networks (Abstract)

Xusen Yin , School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
Bin Wu , School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
Xiuqin Lin , School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
pp. 153-160

Scalable approximation of kernel fuzzy c-means (Abstract)

Zijian Zhang , Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI 49931 USA
Timothy C. Havens , Department of Electrical and Computer Engineering, Department of Computer Science, Michigan Technological University, Houghton, MI, 49931 USA
pp. 161-168

Large-scale restricted Boltzmann machines on single GPU (Abstract)

Yun Zhu , Computer Science Department, Georgia State University, Atlanta, Georgia 30303
Yanqing Zhang , Computer Science Department, Georgia State University, Atlanta, Georgia 30303
Yi Pan , Computer Science Department, Georgia State University, Atlanta, Georgia 30303
pp. 169-174

Lung transplant outcome prediction using UNOS data (Abstract)

Ankit Agrawal , Dept. of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL USA
Reda Al-Bahrani , Dept. of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL USA
Mark J. Russo , Department of Cardiothoracic Surgery, Barnabas Health Heart Centers, Livingston, NJ USA
Jaishankar Raman , Department of Cardiothoracic & Vascular Surgery, Rush University Medical Center, Chicago, IL USA
Alok Choudhary , Dept. of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL USA
pp. 1-8

Colon cancer survival prediction using ensemble data mining on SEER data (Abstract)

Reda Al-Bahrani , Dept. of Electrical Engg. and Computer Science, Northwestern University, Evanston, IL 60208, US
Ankit Agrawal , Dept. of Electrical Engg. and Computer Science, Northwestern University, Evanston, IL 60208, US
Alok Choudhary , Dept. of Electrical Engg. and Computer Science, Northwestern University, Evanston, IL 60208, US
pp. 9-16

A look at challenges and opportunities of Big Data analytics in healthcare (Abstract)

Raghunath Nambiar , Cisco Systems, Inc., San Jose, CA 95134, USA
Ruchie Bhardwaj , Cisco Systems, Inc./University of Southern California, San Jose, CA 95134, USA
Adhiraaj Sethi , Cisco Systems, Inc., Herndon, VA 20171, USA
Rajesh Vargheese , Cisco Systems, Inc., Austin, TX 78759, USA
pp. 17-22

Multidimensional analysis of fetal growth curves (Abstract)

Mario A. Bochicchio , Set-Lab, Department of Engineering for Innovation, University of Salento, Lecce, Italy
Antonella Longo , Set-Lab, Department of Engineering for Innovation, University of Salento, Lecce, Italy
Lucia Vaira , Set-Lab, Department of Engineering for Innovation, University of Salento, Lecce, Italy
Antonio Malvasi , Obstetric & Gynecology Department, Santa Maria Hospital, Bari, Italy
Andrea Tinelli , Obstetric & Gynecology Department, Vito Fazzi Hospital, Lecce, Italy
pp. 23-28

OWL reasoning over big biomedical data (Abstract)

Xi Chen , College of Computer Science, Zhejiang University
Huajun Chen , College of Computer Science, Zhejiang University
Ningyu Zhang , College of Computer Science, Zhejiang University
Jiaoyan Chen , College of Computer Science, Zhejiang University
Zhaohui Wu , College of Computer Science, Zhejiang University
pp. 29-36

KUChemBio: A repository of computational chemical biology data sets (Abstract)

Aaron Smalter Hall , Molecular Graphics and Modeling Laboratory, University of Kansas, Lawrence, Kansas 66045, United States
Jun Huan , Dept. of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas 66045, United States
pp. 37-42

Parallel and memory-efficient Burrows-Wheeler transform (Abstract)

Shinya Hayashi , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
Kenjiro Taura , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
pp. 43-50

Content-based assessment of the credibility of online healthcare information (Abstract)

Meeyoung Park , Electrical Engineering and Computer Science, University of Kansas, Lawrence, U.S.A.
Hariprasad Sampathkumar , Electrical Engineering and Computer Science, University of Kansas, Lawrence, U.S.A.
Bo Luo , Electrical Engineering and Computer Science, University of Kansas, Lawrence, U.S.A.
Xue-wen Chen , Computer Science, Wayne State University, Detroit, U.S.A.
pp. 51-58

BIG DATA infrastructures for pharmaceutical research (Abstract)

Christian Seebode , ORTEC medical GmbH, Berlin, Germany
Matthias Ort , ORTEC medical GmbH, Berlin, Germany
Christian Regenbrecht , Charité - Universitätsmedizin Berlin, Berlin, Germany
Martin Peuker , Charité - Universitätsmedizin Berlin, Berlin, Germany
pp. 59-63

Big data solutions for predicting risk-of-readmission for congestive heart failure patients (Abstract)

Kiyana Zolfaghar , Institute of Technology, CWDS, UW Tacoma
Naren Meadem , Institute of Technology, CWDS, UW Tacoma
Ankur Teredesai , Institute of Technology, CWDS, UW Tacoma
Senjuti Basu Roy , Institute of Technology, CWDS, UW Tacoma
Si-Chi Chin , Institute of Technology, CWDS, UW Tacoma
Brian Muckian , Multicare Health System, Tacoma, Washington
pp. 64-71

The Microsoft Academic Search challenges at KDD Cup 2013 (Abstract)

Martine De Cock , Dept. of Appl. Math., CS and Statistics, Ghent University, 9000 Gent, Belgium
Senjuti Basu Roy , Institute of Technology, University of Washington, Tacoma, WA 98402, USA
Swapna Savvana , Institute of Technology, University of Washington, Tacoma, WA 98402, USA
Vani Mandava , Microsoft Research, Microsoft, Redmond, WA 98052, USA
Brian Dalessandro , Media6Degrees, New York, NY 10003, USA
Claudia Perlich , Media6Degrees, New York, NY 10003, USA
William Cukierski , Kaggle, Millington NJ 07946, USA
Ben Hamner , Kaggle, Millington NJ 07946, USA
pp. 1-4

Bibliometric-enhanced retrieval models for big scholarly information systems (Abstract)

Philipp Mayr , Knowledge Technologies for the Social Sciences, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
Peter Mutschke , Knowledge Technologies for the Social Sciences, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
pp. 5-8

Academic publishing as a social media paradigm (Abstract)

Michael E. Payne , School of Computing, Clemson University, Clemson, South Carolina
Linh B. Ngo , School of Computing, Clemson University, Clemson, South Carolina
Amy W. Apon , School of Computing, Clemson University, Clemson, South Carolina
pp. 9-12

Big spatial data mining (Abstract)

Wang Shuliang , School of Software, Beijing Institute of Technology, Beijing, 100081, China
Ding Gangyi , School of Software, Beijing Institute of Technology, Beijing, 100081, China
Zhong Ming , School of Software, Beijing Institute of Technology, Beijing, 100081, China
pp. 13-21

Modeling and querying data in NoSQL databases (Abstract)

Karamjit Kaur , Computer Sci. and Engg. Deptt., Thapar University, Patiala, Punjab, India 147004
Rinkle Rani , Computer Sci. and Engg. Deptt., Thapar University, Patiala, Punjab, India 147004
pp. 1-7

Elastic data partitioning for cloud-based SQL processing systems (Abstract)

Lipyeow Lim , University of Hawai'i at Mānoa, Honolulu, HI 96822, USA
pp. 8-16

Parallel SECONDO: Practical and efficient mobility data processing in the cloud (Abstract)

Jiamin Lu , Faculty of Mathematics and Computer Science, FernUniversität Hagen, Hagen, Germany
Ralf Hartmut Guting , Faculty of Mathematics and Computer Science, FernUniversität Hagen, Hagen, Germany
pp. 17-25

Index-based join operations in Hive (Abstract)

Mahsa Mofidpoor , Computer Science and Software Engineering, Concordia University, Montreal, Canada
Nematollaah Shiri , Computer Science and Software Engineering, Concordia University, Montreal, Canada
T. Radhakrishnan , Computer Science and Software Engineering, Concordia University, Montreal, Canada
pp. 26-33

SLA data management criteria (Abstract)

Katerina Stamou , Institute of Services Science, University of Geneva, Switzerland
Verena Kantere , Institute of Services Science, University of Geneva, Switzerland
Jean-Henry Morin , Institute of Services Science, University of Geneva, Switzerland
pp. 34-42

Fast solution of load shedding problems via a sequence of linear programs (Abstract)

Harish S. Bhat , Applied Mathematics Unit, University of California, Merced, Merced, CA USA
Garnet J. Vaz , Applied Mathematics Unit, University of California, Merced, Merced, CA USA
Juan C. Meza , Applied Mathematics Unit, University of California, Merced, Merced, CA USA
pp. 1-6

Alarm prediction in large-scale sensor networks — A case study in railroad (Abstract)

Hongfei Li , IBM T. J. Watson Research, Yorktown Heights, NY 10598
Buyue Qian , IBM T. J. Watson Research, Yorktown Heights, NY 10598
Dhaivat Parikh , IBM Global Business Services, Dallas, TX 75019
Arun Hampapur , IBM T. J. Watson Research, Yorktown Heights, NY 10598
pp. 7-14

MiSTRAL: An architecture for low-latency analytics on MasSive time series (Abstract)

Alice Marascu , IBM Research - Ireland, Smarter Cities Technology Centre
Pascal Pompey , IBM Research - Ireland, Smarter Cities Technology Centre
Eric Bouillet , IBM Research - Ireland, Smarter Cities Technology Centre
Olivier Verscheure , IBM Research - Ireland, Smarter Cities Technology Centre
Michael Wurst , IBM R&D Information Management, Boeblingen, Germany
Martin Grund , eXascale Infolab, University of Fribourg, Fribourg, Switzerland
Philippe Cudre-Mauroux , eXascale Infolab, University of Fribourg, Fribourg, Switzerland
pp. 15-21

Yellow cabs as red corpuscles (Abstract)

Timothy H. Savage , Berkeley Research Group, Visiting Scientist, Center for Urban Science & Progress
Huy T. Vo , Center for Urban Science & Progress, New York University
pp. 22-28

Scalable prediction of energy consumption using incremental time series clustering (Abstract)

Yogesh Simmhan , University of Southern California, Los Angeles, CA 90089
Muhammad Usman Noor , University of Southern California, Los Angeles, CA 90089
pp. 29-36

A big data driven model for taxi drivers' airport pick-up decisions in New York City (Abstract)

M. Anil Yazici , University Transportation Research Center, The City College of New York, 160 Convent Avenue, Marshak Building, Suite J-910, New York, NY, 10031, USA
Camille Kamga , University Transportation Research Center, The City College of New York, 160 Convent Avenue, Marshak Building, Suite J-910, New York, NY, 10031, USA
Abhishek Singhal , University Transportation Research Center, The City College of New York, 160 Convent Avenue, Marshak Building, Suite J-910, New York, NY, 10031, USA
pp. 37-44

Managing massive graphs in relational DBMS (Abstract)

Ruiwen Chen , Simon Fraser University, Burnaby BC, Canada
pp. 1-8

A distributed approach for graph-oriented multidimensional analysis (Abstract)

Benoit Denis , Université Catholique de Louvain, Louvain-la-Neuve, Belgium
Amine Ghrab , EURA NOVA R&D, Mont-Saint-Guibert, Belgium
Sabri Skhiri , EURA NOVA R&D, Mont-Saint-Guibert, Belgium
pp. 9-16

Constructing E-Tourism platform based on service value broker: A knowledge management perspective (Abstract)

Yucong Duan , Hainan University, Haikou, P.R. China
Yongzhi Wang , Florida International University, Miami, USA
Jinpeng Wei , Florida International University, Miami, USA
Ajay Kattepur , INRIA Paris-Rocquencourt, Paris, France
Wencai Du , INRIA Paris-Rocquencourt, Paris, France
pp. 17-24

ADraw: A novel social network visualization tool with attribute-based layout and coloring (Abstract)

Zhenwen Wang , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
Weidong Xiao , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
Bin Ge , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
Hao Xu , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China
pp. 25-32

IntegrityMR: Integrity assurance framework for big data analytics and management applications (Abstract)

Yongzhi Wang , Florida International University, Miami, USA
Jinpeng Wei , Florida International University, Miami, USA
Mudhakar Srivatsa , IBM T.J. Watson Research Center, Yorktown Heights, USA
Yucong Duan , Hainan University, Haikou, China
Wencai Du , Hainan University, Haikou, China
pp. 33-40

Local join optimization over a heterogeneously distributed scientific database (Abstract)

Helen X. Xiang , Computer Science, University of Hertfordshire, UK
pp. 41-45

Core-based community evolution in mobile social networks (Abstract)

Hao Xu , Key Laboratory for Information System Technology, National University of Defense Technology, Changsha, China
Weidong Xiao , Key Laboratory for Information System Technology, National University of Defense Technology, Changsha, China
Daquan Tang , Key Laboratory for Information System Technology, National University of Defense Technology, Changsha, China
Jiuyang Tang , Key Laboratory for Information System Technology, National University of Defense Technology, Changsha, China
Zhenwen Wang , Key Laboratory for Information System Technology, National University of Defense Technology, Changsha, China
pp. 46-51

Super-sequence frequent pattern mining on sequential dataset (Abstract)

Xinran Yu , Computer Science Department, University of Texas at San Antonio, San Antonio, TX 78249
Turgay Korkmaz , Computer Science Department, University of Texas at San Antonio, San Antonio, TX 78249
pp. 52-59

Exploring big data in small forms: A multi-layered knowledge extraction of social networks (Abstract)

Yun Wei Zhao , School of Software, Tsinghua University, Beijing, P. R. China; School of Economics and Management, Tilburg University, Tilburg, the Netherlands
Willem-Jan van den Heuvel , School of Economics and Management, Tilburg University, Tilburg, the Netherlands
Xiaojun Ye , School of Software, Tsinghua University, Beijing, P. R. China
pp. 60-67

Provenance comparison for large-scale knowledge discovery (Abstract)

Xiang Zhao , National University of Defense Technology, China
Bin Ge , National University of Defense Technology, China
Jiuyang Tang , National University of Defense Technology, China
Weidong Xiao , National University of Defense Technology, China
Haichuan Shang , The University of Tokyo, Japan
pp. 68-75

Re-projection of terabyte-sized images (Abstract)

Peter Bajcsy , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology (NIST), Gaithersburg, MD
Antoine Vandecreme , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology (NIST), Gaithersburg, MD
Mary Brady , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology (NIST), Gaithersburg, MD
pp. 1

Tile based visual analytics for Twitter big data exploratory analysis (Abstract)

Daniel Cheng , Oculus Info Inc. Toronto, Canada
Peter Schretlen , Oculus Info Inc. Toronto, Canada
Nathan Kronenfeld , Oculus Info Inc. Toronto, Canada
Neil Bozowsky , Oculus Info Inc. Toronto, Canada
William Wright , Oculus Info Inc. Toronto, Canada
pp. 2-4

Optimizing queries over semantically integrated datasets on MapReduce platforms (Abstract)

HyeongSik Kim , Department of Computer Science, North Carolina State University, Raleigh, NC, USA
Kemafor Anyanwu , Department of Computer Science, North Carolina State University, Raleigh, NC, USA
pp. 5-6

Secure Decoupled Linkage (SDLink) system for building a social genome (Abstract)

Hye-Chung Kum , Department of Health Policy and Management, Texas A&M University Health Science Center
Ashok Krishnamurthy , Department of Computer Science, The University of North Carolina at Chapel Hill
Darshana Pathak , Department of Computer Science, The University of North Carolina at Chapel Hill
Michael K. Reiter , Department of Computer Science, The University of North Carolina at Chapel Hill
Stanley Ahalt , Department of Computer Science, The University of North Carolina at Chapel Hill
pp. 7-11

Risk adjustment of patient expenditures: A big data analytics approach (Abstract)

Lin Li , Philips Research North America, Briarcliff Manor, US
Saeed Bagheri , Philips Research North America, Briarcliff Manor, US
Helena Goote , Philips Research North America, Briarcliff Manor, US
Asif Hasan , Philips Research North America, Briarcliff Manor, US
Gregg Hazard , Philips Research North America, Briarcliff Manor, US
pp. 12-14

Parallel auto-encoder for efficient outlier detection (Abstract)

Yunlong Ma , Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Peng Zhang , Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Yanan Cao , Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Li Guo , Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
pp. 15-17

New factors for identifying influential bloggers (Abstract)

Teng-Sheng Moh , Department of Computer Science, San Jose State University, San Jose, CA 95192-0249, U.S.A.
SivaNaga Prasad Shola , Department of Computer Science, San Jose State University, San Jose, CA 95192-0249, U.S.A.
pp. 18-27

A scalable infrastructure of interactive evolutionary computation to evolve services online with data (Abstract)

Masaharu Munetomo , Information Initiative Center, Hokkaido University, Sapporo, Japan
Shintaro Bando , Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
pp. 28

Big data for business managers — Bridging the gap between potential and value (Abstract)

Anmol Rajpurohit , Department of Computer Science, The LNM Institute of Information Technology, Jaipur, India
pp. 29-31

Granularity-based temporal data mining in hospital information system (Abstract)

Shusaku Tsumoto , Department of Medical Informatics, School of Medicine, Shimane University, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
Shoji Hirano , Department of Medical Informatics, School of Medicine, Shimane University, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
Haruko Iwata , Division of Nursing, Shimane University Hospital, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
pp. 32-40

Observation of Matthew Effects in Sina Weibo microblogger (Abstract)

Mengmeng Yang , Shanghai Jiao Tong University
Yi Zhou , Shanghai Jiao Tong University
Qu Zhou , Shanghai Jiao Tong University
Kai Chen , Shanghai Jiao Tong University
Jianhua He , Aston University
Xiaokang Yang , Shanghai Jiao Tong University
pp. 41-43

A framework of spatial co-location mining on MapReduce (Abstract)

Jin Soung Yoo , Department of Computer Science, Indiana University-Purdue University Fort Wayne, Fort Wayne, Indiana, USA
Douglas Boulware , Air Force Research Laboratory, Rome, New York, USA
pp. 44

Access control for big data using data content (Abstract)

Wenrong Zeng , Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, 66045, USA
Yuhao Yang , Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, 66045, USA
Bo Luo , Department of Electrical Engineering and Computer Science, The University of Kansas, Lawrence, KS, 66045, USA
pp. 45-47

Author index (PDF)

pp. 1-5

Knowledge cubes — A proposal for scalable and semantically-guided management of Big Data (Abstract)

Amgad Madkour , Purdue University, West Lafayette, USA
Walid G. Aref , Purdue University, West Lafayette, USA
Saleh Basalamah , Umm Al-Qura University, Makkah, KSA
pp. 1-7