2014 IEEE International Conference on Big Data (Big Data) (2014)
Washington, DC, USA
Oct. 27, 2014 to Oct. 30, 2014
ISBN: 978-1-4799-5666-1
TABLE OF CONTENTS

[Front cover] (PDF)

pp. 1

Organization (PDF)

pp. 1-2

Never-ending language learning (PDF)

Tom Mitchell , E. Fredkin University Professor, Machine Learning Department, Carnegie Mellon University
pp. 1

BASIC: An alternative to BASE for large-scale data management system (Abstract)

Lengdong Wu , Department of Computing Science, University of Alberta, Edmonton, Canada
Li-Yan Yuan , Department of Computing Science, University of Alberta, Edmonton, Canada
Jia-Huai You , Department of Computing Science, University of Alberta, Edmonton, Canada
pp. 5-14

BayesWipe: A multimodal system for data cleaning and consistent query answering on structured bigdata (Abstract)

Sushovan De , Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85281, USA
Yuheng Hu , Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85281, USA
Yi Chen , School of Management, New Jersey Institute of Technology, Newark, NJ 07102, USA
Subbarao Kambhampati , Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85281, USA
pp. 15-24

Scaling up M-estimation via sampling designs: The Horvitz-Thompson stochastic gradient descent (Abstract)

Stephan Clemencon , LTCI UMR 5141, Télécom ParisTech & CNRS
Patrice Bertail , Université Paris-Ouest MODAL'X & CREST - INSEE
Emilie Chautru , Université de Cergy-Pontoise Laboratoire AGM - UMR CNRS 8088
pp. 25-30

Metadata capital: Simulating the predictive value of Self-Generated Health Information (SGHI) (Abstract)

Jane Greenberg , Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
Adrian Ogletree , Metadata Research Center, College of Computing and Informatics, Drexel University, Philadelphia, PA, USA
Angela P. Murillo , School of Information and Library Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Thomas P. Caruso , School of Information and Library Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Herbie Huang , Department of Economics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
pp. 31-36

Representative subsets for big data learning using k-NN graphs (Abstract)

Raghvendra Mall , KU Leuven, ESAT/STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Vilen Jumutc , KU Leuven, ESAT/STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Rocco Langone , KU Leuven, ESAT/STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Johan A.K. Suykens , KU Leuven, ESAT/STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
pp. 37-42

Towards building and evaluating a personalized location-based recommender system (Abstract)

Rubing Duan , Institute of High Performance Computing, Singapore
Rick Siow Mong Goh , Institute of High Performance Computing, Singapore
Feng Yang , Institute of High Performance Computing, Singapore
Yong Kiam Tan , Institute of High Performance Computing, Singapore
Jesus F.B. Valenzuela , Institute of High Performance Computing, Singapore
pp. 43-48

On the performance of MapReduce: A stochastic approach (Abstract)

Sarker Tanzir Ahmed , Texas A&M University, College Station, TX 77843, USA
Dmitri Loguinov , Texas A&M University, College Station, TX 77843, USA
pp. 49-54

PGMHD: A scalable probabilistic graphical model for massive hierarchical data problems (Abstract)

Khalifeh AlJadda , Department of Computer Science, University of Georgia, Athens, Georgia
Mohammed Korayem , School of Informatics and Computing, Indiana University, Bloomington, IN
Camilo Ortiz , CareerBuilder, Norcross, GA
Trey Grainger , CareerBuilder, Norcross, GA
John A. Miller , Department of Computer Science, University of Georgia, Athens, Georgia
William S. York , Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia
pp. 55-60

FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems (Abstract)

Dongfang Zhao , Illinois Institute of Technology
Zhao Zhang , UC Berkeley
Xiaobing Zhou , Hortonworks Inc.
Tonglin Li , Illinois Institute of Technology
Ke Wang , Illinois Institute of Technology
Dries Kimpe , Argonne National Laboratory
Philip Carns , Argonne National Laboratory
Robert Ross , Argonne National Laboratory
Ioan Raicu , Illinois Institute of Technology
pp. 61-70

BurstMem: A high-performance burst buffer system for scientific applications (Abstract)

Teng Wang , Auburn University, Auburn University, AL 36849
Sarp Oral , Oak Ridge National Laboratory, Oak Ridge, TN 37831
Yandong Wang , Auburn University, Auburn University, AL 36849
Brad Settlemyer , Oak Ridge National Laboratory, Oak Ridge, TN 37831
Scott Atchley , Oak Ridge National Laboratory, Oak Ridge, TN 37831
Weikuan Yu , Auburn University, Auburn University, AL 36849
pp. 71-79

Partial rollback-based scheduling on in-memory transactional data grids (Abstract)

Junwhan Kim , Dept. of Computer Science and Information Technology, University of the District of Columbia, Washington, DC 20008
pp. 80-89

Detecting and identifying system changes in the cloud via discovery by example (Abstract)

Hao Chen , Department of Electrical and Computer Engineering, Boston University, Boston, MA, 02215
Sastry S. Duri , IBM T J Watson Research Center, 1101 Kitchawan Rd., Yorktown Heights, NY, 10598
Vasanth Bala , IBM T J Watson Research Center, 1101 Kitchawan Rd., Yorktown Heights, NY, 10598
Nilton T. Bila , IBM T J Watson Research Center, 1101 Kitchawan Rd., Yorktown Heights, NY, 10598
Canturk Isci , IBM T J Watson Research Center, 1101 Kitchawan Rd., Yorktown Heights, NY, 10598
Ayse K. Coskun , Department of Electrical and Computer Engineering, Boston University, Boston, MA, 02215
pp. 90-99

PigOut: Making multiple Hadoop clusters work together (Abstract)

Kyungho Jeon , University at Buffalo, the State University of New York
Sharath Chandrashekhara , University at Buffalo, the State University of New York
Feng Shen , University at Buffalo, the State University of New York
Shikhar Mehra , University at Buffalo, the State University of New York
Oliver Kennedy , University at Buffalo, the State University of New York
Steven Y. Ko , University at Buffalo, the State University of New York
pp. 100-109

Parallel Breadth First Search on GPU clusters (Abstract)

Zhisong Fu , SYSTAP, LLC
Harish Kumar Dasari , University of Utah
Bradley Bebee , SYSTAP, LLC
Martin Berzins , University of Utah
Bryan Thompson , SYSTAP, LLC
pp. 110-118

Optimizing load balancing and data-locality with data-aware scheduling (Abstract)

Ke Wang , Illinois Institute of Technology
Xiaobing Zhou , Hortonworks Inc.
Tonglin Li , Illinois Institute of Technology
Dongfang Zhao , Illinois Institute of Technology
Michael Lang , Los Alamos National Laboratory
Ioan Raicu , Illinois Institute of Technology
pp. 119-128

Online temporal-spatial analysis for detection of critical events in Cyber-Physical Systems (Abstract)

Zhang Fu , Chalmers University of Technology
Magnus Almgren , Chalmers University of Technology
Olaf Landsiedel , Chalmers University of Technology
Marina Papatriantafilou , Chalmers University of Technology
pp. 129-134

A cross-job framework for MapReduce scheduling (Abstract)

Xuejie Xiao , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse
Jian Tang , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse
Zhenhua Chen , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse
Jielong Xu , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse
Chonggang Wang , InterDigital Communications Inc.
pp. 135-140

Scheduling MapReduce tasks on virtual MapReduce clusters from a tenant's perspective (Abstract)

Jia-Chun Lin , Department of Computer Science, National Chiao Tung University, Taiwan
Ming-Chang Lee , Department of Computer Science, National Chiao Tung University, Taiwan
Ramin Yahyapour , GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen, Göttingen, Lower Saxony, Germany
pp. 141-146

FlexDAS: A flexible direct attached storage for I/O intensive applications (Abstract)

Takatsugu Ono , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
Yotaro Konishi , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
Teruo Tanimoto , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
Noboru Iwamatsu , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
Takashi Miyoshi , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
Jun Tanaka , Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
pp. 147-152

A two-sided market mechanism for trading big data computing commodities (Abstract)

Lena Mashayekhy , Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
Mahyar Movahed Nejad , Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
Daniel Grosu , Department of Computer Science, Wayne State University, Detroit, MI 48202, USA
pp. 153-158

MMap: Fast billion-scale graph computation on a PC via memory mapping (Abstract)

Zhiyuan Lin , Georgia Tech, Atlanta, Georgia
Minsuk Kahng , Georgia Tech, Atlanta, Georgia
Kaeser Md. Sabrin , Georgia Tech, Atlanta, Georgia
Duen Horng Polo Chau , Georgia Tech, Atlanta, Georgia
Ho Lee , KAIST, Daejeon, Republic of Korea
U Kang , KAIST, Daejeon, Republic of Korea
pp. 159-164

Large-scale network traffic monitoring with DBStream, a system for rolling big data analysis (Abstract)

Arian Bar , FTW Vienna, Austria
Alessandro Finamore , Politecnico di Torino, Italy
Pedro Casas , FTW Vienna, Austria
Lukasz Golab , University of Waterloo, Canada
Marco Mellia , Politecnico di Torino, Italy
pp. 165-170

Synthetic data generation for the internet of things (Abstract)

Jason W. Anderson , School of Computing, Clemson University, Clemson, SC
K. E. Kennedy , School of Computing, Clemson University, Clemson, SC
Linh B. Ngo , School of Computing, Clemson University, Clemson, SC
Andre Luckow , School of Computing, Clemson University, Clemson, SC
Amy W. Apon , School of Computing, Clemson University, Clemson, SC
pp. 171-176

Evaluating the performance and scalability of the Ceph distributed storage system (Abstract)

Diana Gudu , Karlsruhe Institute of Technology, Karlsruhe, Germany
Marcus Hardt , Karlsruhe Institute of Technology, Karlsruhe, Germany
Achim Streit , Karlsruhe Institute of Technology, Karlsruhe, Germany
pp. 177-182

Incremental window aggregates over array database (Abstract)

Li Jiang , Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
Hideyuki Kawashima , Faculty of Information, Systems and Engineering, University of Tsukuba, Tsukuba, Japan
Osamu Tatebe , Faculty of Information, Systems and Engineering, University of Tsukuba, Tsukuba, Japan
pp. 183-188

BigCache for big-data systems (Abstract)

Michel Angelo Roger , Florida International University, Miami, Florida
Yiqi Xu , Florida International University, Miami, Florida
Ming Zhao , Florida International University, Miami, Florida
pp. 189-194

Automated workload-aware elasticity of NoSQL clusters in the cloud (Abstract)

Evie Kassela , CSLAB, National Technical University of Athens
Christina Boumpouka , CSLAB, National Technical University of Athens
Ioannis Konstantinou , CSLAB, National Technical University of Athens
Nectarios Koziris , CSLAB, National Technical University of Athens
pp. 195-200

Distributed class dependent feature analysis — A big data approach (Abstract)

Khoa Luu , Cylab Biometrics Center and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, USA
Chenchen Zhu , Cylab Biometrics Center and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, USA
Marios Savvides , Cylab Biometrics Center and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, USA
pp. 201-206

VENU: Orchestrating SSDs in Hadoop storage (Abstract)

K. R. Krish , Department of Computer Science, Virginia Tech
M. Safdar Iqbal , Department of Computer Science, Virginia Tech
Ali R. Butt , Department of Computer Science, Virginia Tech
pp. 207-212

In-memory I/O and replication for HDFS with Memcached: Early experiences (Abstract)

Nusrat Sharmin Islam , Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Xiaoyi Lu , Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Md Wasi-ur-Rahman , Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Raghunath Rajachandrasekar , Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Dhabaleswar K. (DK) Panda , Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
pp. 213-218

Enabling composite applications through an asynchronous shared memory interface (Abstract)

Douglas Otstott , School of Computing and Information Sciences, Florida International University, Miami, FL, USA
Noah Evans , Ultrascale Systems Research Center, Los Alamos National Laboratory, Los Alamos, NM, USA
Latchesar Ionkov , Ultrascale Systems Research Center, Los Alamos National Laboratory, Los Alamos, NM, USA
Ming Zhao , School of Computing and Information Sciences, Florida International University, Miami, FL, USA
Michael Lang , Ultrascale Systems Research Center, Los Alamos National Laboratory, Los Alamos, NM, USA
pp. 219-224

k-Balanced sorting and skew join in MPI and MapReduce (Abstract)

Silu Huang , Department of Computer Science and Engineering, Chinese University of Hong Kong
Ada Wai-Chee Fu , Department of Computer Science and Engineering, Chinese University of Hong Kong
pp. 225-230

Virtual chunks: On supporting random accesses to scientific data in compressible storage systems (Abstract)

Dongfang Zhao , Illinois Institute of Technology
Jian Yin , Pacific Northwest National Lab
Kan Qiao , Illinois Institute of Technology
Ioan Raicu , Illinois Institute of Technology
pp. 231-240

Examination of data, rule generation and detection of phishing URLs using online logistic regression (Abstract)

Mohammed Nazim Feroz , Texas Tech University, Computer Science, Lubbock, USA
Susan Mengel , Texas Tech University, Computer Science, Lubbock, USA
pp. 241-250

Main memory evaluation of recursive queries on multicore machines (Abstract)

Mohan Yang , Department of Computer Science, University of California, Los Angeles
Carlo Zaniolo , Department of Computer Science, University of California, Los Angeles
pp. 251-260

Predicting glaucoma progression using multi-task learning with heterogeneous features (Abstract)

Shigeru Maya , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan
Kai Morino , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan
Kenji Yamanishi , Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan
pp. 261-270

Provenance-based object storage prediction scheme for scientific big data applications (Abstract)

Dong Dai , Computer Science Department, Texas Tech University
Yong Chen , Computer Science Department, Texas Tech University
Dries Kimpe , Mathematics and Computer Science Division, Argonne National Laboratory
Rob Ross , Mathematics and Computer Science Division, Argonne National Laboratory
pp. 271-280

Synergistic partitioning in multiple large scale social networks (Abstract)

Songchang Jin , College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
Jiawei Zhang , Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607
Philip S. Yu , Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607
Shuqiang Yang , College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
Aiping Li , College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
pp. 281-290

TRISTAN: Real-time analytics on massive time series using sparse dictionary compression (Abstract)

Alice Marascu , IBM Research
Pascal Pompey , IBM Research
Eric Bouillet , IBM Research
Michael Wurst , IBM Research
Olivier Verscheure , IBM Research
Martin Grund , eXascale Infolab, University of Fribourg-Switzerland
Philippe Cudre-Mauroux , eXascale Infolab, University of Fribourg-Switzerland
pp. 291-300

Performance modeling in CUDA streams — A means for high-throughput data processing (Abstract)

Hao Li , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB118, Tampa, FL 33620, U.S.A.
Di Yu , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB118, Tampa, FL 33620, U.S.A.
Anand Kumar , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB118, Tampa, FL 33620, U.S.A.
Yi-Cheng Tu , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB118, Tampa, FL 33620, U.S.A.
pp. 301-310

Minimizing data movement through query transformation (Abstract)

Patrick Leyshock , Department of Computer Science, Portland State University, Portland, OR, U.S.A.
David Maier , Department of Computer Science, Portland State University, Portland, OR, U.S.A.
Kristin Tufte , Department of Computer Science, Portland State University, Portland, OR, U.S.A.
pp. 311-316

Multilevel partitioning of large unstructured grids (Abstract)

Oyindamola O. Akande , Intel Corporation, 5000 W Chandler Blvd, Chandler, AZ 85226
Philip J. Rhodes , Department of Computer Science, University of Mississippi, University, MS 38677
pp. 317-322

Low complexity sensing for big spatio-temporal data (Abstract)

Dongeun Lee , School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Korea
Jaesik Choi , School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology, Ulsan 689-798, Korea
pp. 323-328

In-advance data analytics for reducing time to discovery (Abstract)

Jialin Liu , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Yin Lu , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, Texas, USA
pp. 329-334

Estimating pairwise distances in large graphs (Abstract)

Maria Christoforaki , NYU Polytechnic School of Engineering, Brooklyn, NY
Torsten Suel , NYU Polytechnic School of Engineering, Brooklyn, NY
pp. 335-344

Distributed Adaptive Model Rules for mining big data streams (Abstract)

Anh Thu Vu , Royal Institute of Technology
Gianmarco De Francisci Morales , Yahoo Labs, Barcelona
Joao Gama , University of Porto
Albert Bifet , HUAWEI Noah's Ark Lab
pp. 345-353

Sparse computation for large-scale data mining (Abstract)

Dorit S. Hochbaum , University of California, Berkeley, Etcheverry Hall, Berkeley, CA 94720, USA
Philipp Baumann , University of California, Berkeley, Etcheverry Hall, Berkeley, CA 94720, USA
pp. 354-363

Topic similarity networks: Visual analytics for large document sets (Abstract)

Arun S. Maiya , Institute for Defense Analyses, Alexandria, VA 22311
Robert M. Rolfe , Institute for Defense Analyses, Alexandria, VA 22311
pp. 364-372

Efficient breadth-first search on a heterogeneous processor (Abstract)

Mayank Daga , AMD Research, Advanced Micro Devices, Inc., USA
Mark Nutter , AMD Research, Advanced Micro Devices, Inc., USA
Mitesh Meswani , AMD Research, Advanced Micro Devices, Inc., USA
pp. 373-382

Web-based visual analytics for extreme scale climate science (Abstract)

Chad A. Steed , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Katherine J. Evans , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
John F. Harney , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Brian C. Jewell , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Galen Shipman , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Brian E. Smith , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Peter E. Thornton , Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Dean N. Williams , Lawrence Livermore National Laboratory, Livermore, California 94550
pp. 383-392

Geotagging one hundred million Twitter accounts with total variation minimization (Abstract)

Ryan Compton , Information and System Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Rd, Malibu, CA 90265
David Jurgens , Information and System Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Rd, Malibu, CA 90265
David Allen , Information and System Sciences Laboratory, HRL Laboratories, 3011 Malibu Canyon Rd, Malibu, CA 90265
pp. 393-401

Meeting predictable buffer limits in the parallel execution of event processing operators (Abstract)

Ruben Mayer , Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
Boris Koldehofe , Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
Kurt Rothermel , Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
pp. 402-411

Metadata extraction and correction for large-scale traffic surveillance videos (Abstract)

Xiaomeng Zhao , Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China
Huadong Ma , Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China
Haitao Zhang , Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China
Yi Tang , Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China
Guangping Fu , Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, China
pp. 412-420

Facilitating Twitter data analytics: Platform, language and functionality (Abstract)

Ke Tao , TU Delft, Web Information Systems, PO Box 5031, 2600 GA, Delft, the Netherlands
Claudia Hauff , TU Delft, Web Information Systems, PO Box 5031, 2600 GA, Delft, the Netherlands
Geert-Jan Houben , TU Delft, Web Information Systems, PO Box 5031, 2600 GA, Delft, the Netherlands
Fabian Abel , XING AG, Gänsemarkt 43, 20354, Hamburg, Germany
Guido Wachsmuth , TU Delft, Software Engineering, PO Box 5031, 2600 GA, Delft, the Netherlands
pp. 421-430

Visual fusion of mega-city big data: An application to traffic and tweets data analysis of Metro passengers (Abstract)

Masahiko Itoh , The University of Tokyo
Daisaku Yokoyama , The University of Tokyo
Masashi Toyoda , The University of Tokyo
Yoshimitsu Tomita , Tokyo Metro Co., Ltd
Satoshi Kawamura , Tokyo Metro Co., Ltd., the University of Tokyo
Masaru Kitsuregawa , National Institute of Informatics, the University of Tokyo
pp. 431-440

Evaluating density-based motion for big data visual analytics (Abstract)

Ronak Etemadpour , School of Information, University of Arizona
Paul Murray , Dept. of Computer Science, University of Illinois at Chicago
Angus Graeme Forbes , Dept. of Computer Science, University of Illinois at Chicago
pp. 451-460

Regression trees for streaming data with local performance guarantees (Abstract)

Ulf Johansson , School of Business and IT, University of Borås, Sweden
Cecilia Sonstrod , School of Business and IT, University of Borås, Sweden
Henrik Linusson , School of Business and IT, University of Borås, Sweden
Henrik Bostrom , Dept. of Computer and Systems Sciences, Stockholm University, Sweden
pp. 461-470

Distributed algorithms for k-truss decomposition (Abstract)

Pei-Ling Chen , Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Chung-Kuang Chou , Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Ming-Syan Chen , Research Center of Information Technology Innovation, Academia Sinica, Taipei, Taiwan
pp. 471-480

PuLP: Scalable multi-objective multi-constraint partitioning for small-world networks (Abstract)

George M. Slota , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
Kamesh Madduri , Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
Sivasankaran Rajamanickam , Scalable Algorithms Department, Sandia National Laboratories, Albuquerque, NM
pp. 481-490

Effective caching techniques for accelerating pattern matching queries (Abstract)

Arash Fard , Computer Science Department, The University of Georgia, Athens, GA, USA
Satya Manda , Computer Science Department, The University of Georgia, Athens, GA, USA
Lakshmish Ramaswamy , Computer Science Department, The University of Georgia, Athens, GA, USA
John A. Miller , Computer Science Department, The University of Georgia, Athens, GA, USA
pp. 491-499

Clique guided community detection (Abstract)

Diana Palsetia , Northwestern University, Evanston, IL
Md. Mostofa Ali Patwary , Parallel Computing Lab, Intel, Santa Clara, CA
William Hendrix , Northwestern University, Evanston, IL
Ankit Agrawal , Northwestern University, Evanston, IL
Alok Choudhary , Northwestern University, Evanston, IL
pp. 500-509

Large-scale distributed sorting for GPU-based heterogeneous supercomputers (Abstract)

Hideyuki Shamoto , Tokyo Institute of Technology, Tokyo, Japan
Koichi Shirahata , Tokyo Institute of Technology, Tokyo, Japan
Aleksandr Drozd , Tokyo Institute of Technology, Tokyo, Japan
Hitoshi Sato , Tokyo Institute of Technology, Tokyo, Japan
Satoshi Matsuoka , Tokyo Institute of Technology, Tokyo, Japan
pp. 510-518

Large-scale logistic regression and linear support vector machines using Spark (Abstract)

Chieh-Yen Lin , Dept. of Computer Science, National Taiwan Univ., Taiwan
Cheng-Hao Tsai , Dept. of Computer Science, National Taiwan Univ., Taiwan
Ching-Pei Lee , Dept. of Computer Science, Univ. of Illinois, USA
Chih-Jen Lin , Dept. of Computer Science, National Taiwan Univ., Taiwan
pp. 519-528

NVM-based Hybrid BFS with memory efficient data structure (Abstract)

Keita Iwabuchi , Tokyo Institute of Technology, Tokyo, Japan
Hitoshi Sato , Tokyo Institute of Technology, Tokyo, Japan
Yuichiro Yasui , Kyushu University, Fukuoka, Japan
Katsuki Fujisawa , Kyushu University, Fukuoka, Japan
Satoshi Matsuoka , Tokyo Institute of Technology, Tokyo, Japan
pp. 529-538

Identification of SNP interactions using data-parallel primitives on GPUs (Abstract)

Can Altinigneli , University of Munich, Munich, Germany
Bettina Konten , University of Halle, Halle, Germany
Dan Rujescu , University of Halle, Halle, Germany
Christian Bohm , University of Munich, Munich, Germany
Claudia Plant , Helmholtz Zentrum München, Technische Universität München, Munich, Germany
pp. 539-548

Random walks on adjacency graphs for mining lexical relations from big text data (Abstract)

Shan Jiang , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801 USA
ChengXiang Zhai , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801 USA
pp. 549-554

Entity resolution using inferred relationships and behavior (Abstract)

Jonathan Mugan , 21CT, Inc., Austin, Texas, USA
Ranga Chari , 21CT, Inc., Austin, Texas, USA
Laura Hitt , 21CT, Inc., Austin, Texas, USA
Eric McDermid , 21CT, Inc., Austin, Texas, USA
Marsha Sowell , 21CT, Inc., Austin, Texas, USA
Yuan Qu , 21CT, Inc., Austin, Texas, USA
Thayne Coffman , RGM Advisors, LLC, Austin, Texas USA
pp. 555-560

Rainbow: A distributed and hierarchical RDF triple store with dynamic scalability (Abstract)

Rong Gu , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
Wei Hu , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
Yihua Huang , National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China 210093
pp. 561-566

In-situ visualization and computational steering for large-scale simulation of turbulent flows in complex geometries (Abstract)

Hong Yi , Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC 27517
Michel Rasquin , Argonne Leadership Computing Facility, Argonne National Laboratory, Argonne, IL 60439
Jun Fang , Department of Nuclear Engineering, North Carolina State University, Raleigh, NC 27695
Igor A. Bolotnov , Department of Nuclear Engineering, North Carolina State University, Raleigh, NC 27695
pp. 567-572

Building k-nn graphs from large text data (Abstract)

Thibault Debatty , Royal Military Academy, Brussels, Belgium
Pietro Michiardi , EURECOM, Campus SophiaTech, France
Olivier Thonnard , Symantec Research Labs, Sophia Antipolis, France
Wim Mees , Royal Military Academy, Brussels, Belgium
pp. 573-578

Learning to predict subject-line opens for large-scale email marketing (Abstract)

Raju Balakrishnan , Data Sciences, Groupon Inc., Palo Alto CA USA 94306
Rajesh Parekh , Data Sciences, Groupon Inc., Palo Alto CA USA 94306
pp. 579-584

MAGE: Matching approximate patterns in richly-attributed graphs (Abstract)

Robert Pienta , College of Computing, Georgia Institute of Technology, Atlanta, GA
Acar Tamersoy , College of Computing, Georgia Institute of Technology, Atlanta, GA
Hanghang Tong , Department of Computer Science, Arizona State University, Phoenix, AZ
Duen Horng Chau , College of Computing, Georgia Institute of Technology, Atlanta, GA
pp. 585-590

Bootstrapping K-means for big data analysis (Abstract)

Jungkyu Han , Software Innovation Center, Nippon Telegraph and Telephone, Tokyo, Japan
Min Luo , Software Innovation Center, Nippon Telegraph and Telephone, Tokyo, Japan
pp. 591-596

Distributed Adaptive Importance Sampling on graphical models using MapReduce (Abstract)

Ahsanul Haque , The University of Texas at Dallas, Richardson TX, USA
Swarup Chandra , The University of Texas at Dallas, Richardson TX, USA
Latifur Khan , The University of Texas at Dallas, Richardson TX, USA
Charu Aggarwal , IBM T. J. Watson Research Center, Yorktown NY, USA
pp. 597-602

Knowledge-based clustering of ship trajectories using density-based approach (Abstract)

Bo Liu , Faculty of Computer Science, Dalhousie University, Canada
Erico N. de Souza , Faculty of Computer Science, Dalhousie University, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Canada
Marcin Sydow , Polish-Japanese Institute of Information Technology, Warsaw, Poland
pp. 603-608

Immersive and collaborative data visualization using virtual reality platforms (Abstract)

Ciro Donalek , California Institute of Technology, Pasadena, CA 91125, USA
S. G. Djorgovski , California Institute of Technology, Pasadena, CA 91125, USA
Alex Cioc , California Institute of Technology, Pasadena, CA 91125, USA
Anwell Wang , California Institute of Technology, Pasadena, CA 91125, USA
Jerry Zhang , California Institute of Technology, Pasadena, CA 91125, USA
Elizabeth Lawler , California Institute of Technology, Pasadena, CA 91125, USA
Stacy Yeh , California Institute of Technology, Pasadena, CA 91125, USA
Ashish Mahabal , California Institute of Technology, Pasadena, CA 91125, USA
Matthew Graham , California Institute of Technology, Pasadena, CA 91125, USA
Andrew Drake , California Institute of Technology, Pasadena, CA 91125, USA
Scott Davidoff , Jet Propulsion Laboratory, Pasadena, CA 91109, USA
Jeffrey S. Norris , Jet Propulsion Laboratory, Pasadena, CA 91109, USA
Giuseppe Longo , University Federico II, Napoli, Italy
pp. 609-614

The Adaptive Projection Forest: Using adjustable exclusion and parallelism in metric space indexes (Abstract)

Lee Parnell Thompson , University of Texas at Austin
Weijia Xu , Texas Advanced Computing Center
Daniel P. Miranker , University of Texas at Austin
pp. 615-620

Scaling up Prioritized Grammar Enumeration for scientific discovery in the cloud (Abstract)

Tony Worm , Binghamton University
Kenneth Chiu , Binghamton University
pp. 621-626

MR-TRIAGE: Scalable multi-criteria clustering for big data security intelligence applications (Abstract)

Yun Shen , Symantec Research Labs, Dublin, Republic of Ireland
Olivier Thonnard , Symantec Research Labs, Sophia Antipolis, France
pp. 627-635

Increasing the veracity of event detection on social media networks through user trust modeling (Abstract)

Todd Bodnar , Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, Pennsylvania 16802
Conrad Tucker , Engineering Design, Industrial Engineering, Computer Science and Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
Kenneth Hopkinson , Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson Air Force Base, Fairborn, Ohio 45433
Sven G. Bilen , Engineering Design, Electrical Engineering, Pennsylvania State University, University Park, Pennsylvania 16802
pp. 636-643

Empowering users of social networks to assess their privacy risks (Abstract)

Vladimir Estivill-Castro , Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Roc Boronat, 138, Barcelona 08018 Spain
Peter Hough , Center for Research in Complex Systems, School of Computing and Mathematics, Charles Sturt University, Panorama Avenue, Bathurst NSW 2795 Australia
Md Zahidul Islam , Center for Research in Complex Systems, School of Computing and Mathematics, Charles Sturt University, Panorama Avenue, Bathurst NSW 2795 Australia
pp. 644-649

A unified approach to network anomaly detection (Abstract)

Tahereh Babaie , School of IT, University of Sydney, Sydney, NSW, Australia
Sanjay Chawla , School of IT, University of Sydney, Sydney, NSW, Australia
Sebastien Ardon , ATP Research Laboratory, NICTA, Alexandria, NSW, Australia
Yue Yu , School of IT, University of Sydney, Sydney, NSW, Australia
pp. 650-655

E-Sketch: Gathering large-scale energy consumption data based on consumption patterns (Abstract)

Zhichuan Huang , University of Maryland, Baltimore County
Hongyao Luo , Binghamton University, State University of New York
David Skoda , Binghamton University, State University of New York
Ting Zhu , University of Maryland, Baltimore County
Yu Gu , IBM Research-Austin
pp. 656-665

Hierarchical management of large-scale malware data (Abstract)

Lee Kellogg , Charles River Analytics 625 Mt. Auburn St, Cambridge, MA, 02138
Brian Ruttenberg , Charles River Analytics 625 Mt. Auburn St, Cambridge, MA, 02138
Alison O'Connor , Charles River Analytics 625 Mt. Auburn St, Cambridge, MA, 02138
Michael Howard , Charles River Analytics 625 Mt. Auburn St, Cambridge, MA, 02138
Avi Pfeffer , Charles River Analytics 625 Mt. Auburn St, Cambridge, MA, 02138
pp. 666-674

Random projection based clustering for population genomics (Abstract)

Sotiris Tasoulis , Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Finland
Lu Cheng , Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Finland
Niko Valimaki , Department of Computer Science, University of Helsinki, Finland
Nicholas J. Croucher , Department of Infectious Disease Epidemiology, Imperial College, London UK
Simon R. Harris , Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
William P. Hanage , Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
Teemu Roos , Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Finland
Jukka Corander , Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki, Finland
pp. 675-682

Structure recognition from high resolution images of ceramic composites (Abstract)

Daniela Ushizima , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA
Talita Perciano , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA
Harinarayan Krishnan , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA
Burlen Loring , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA
Hrishikesh Bale , Material Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA
Dilworth Parkinson , Advanced Light Source Division, Lawrence Berkeley National Laboratory, Berkeley, CA
James Sethian , Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA
pp. 683-691

Combining Hadoop and GPU to preprocess large Affymetrix microarray data (Abstract)

Sufeng Niu , Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634, USA
Guangyu Yang , School of Computing, Clemson University, Clemson, SC 29634 USA
Nilim Sarma , Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634, USA
Pengfei Xuan , School of Computing, Clemson University, Clemson, SC 29634 USA
Melissa C. Smith , Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634, USA
Pradip Srimani , School of Computing, Clemson University, Clemson, SC 29634 USA
Feng Luo , School of Computing, Clemson University, Clemson, SC 29634 USA
pp. 692-700

Content-Based Access Control: Use data content to assist access control for large-scale content-centric databases (Abstract)

Wenrong Zeng , Department of Electrical Engineering and Computer Science, Information and Telecommunication Technology Center, The University of Kansas, Lawrence, KS 66045, USA
Yuhao Yang , Department of Electrical Engineering and Computer Science, Information and Telecommunication Technology Center, The University of Kansas, Lawrence, KS 66045, USA
Bo Luo , Department of Electrical Engineering and Computer Science, Information and Telecommunication Technology Center, The University of Kansas, Lawrence, KS 66045, USA
pp. 701-710

Locating visual storm signatures from satellite images (Abstract)

Yu Zhang , College of Information Sciences and Technology, The Pennsylvania State University, USA
Stephen Wistar , Accuweather Inc., USA
Jose A. Piedra-Fernandez , Department of Information Technology, University of Almería, Spain
Jia Li , College of Information Sciences and Technology, The Pennsylvania State University, USA
Michael A. Steinberg , Accuweather Inc., USA
James Z. Wang , College of Information Sciences and Technology, The Pennsylvania State University, USA
pp. 711-720

Accurate and efficient selection of the best consumption prediction method in smart grids (Abstract)

Marc Frincu , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Charalampos Chelmis , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Muhammad Usman Noor , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Viktor Prasanna , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
pp. 721-729

Visualizations for sense-making in financial market regulation (Abstract)

Andrew Todd , University of Virginia, Charlottesville, VA, USA
William Scherer , University of Virginia, Charlottesville, VA, USA
Peter Beling , University of Virginia, Charlottesville, VA, USA
Mark Paddrik , Office of Financial Research, US Treasury, Washington, DC, USA
Richard Haynes , Office of Financial Research, US Treasury, Washington, DC, USA
pp. 730-735

Big Automotive Data: Leveraging large volumes of data for knowledge-driven product development (Abstract)

Mathias Johanson , Alkit Communications AB, Mölndal, Sweden
Stanislav Belenki , Alkit Communications AB, Mölndal, Sweden
Jonas Jalminger , Alkit Communications AB, Mölndal, Sweden
Magnus Fant , Alkit Communications AB, Mölndal, Sweden
Mats Gjertz , Volvo Car Corporation, Gothenburg, Sweden
pp. 736-741

Toward personalized and scalable voice-enabled services powered by big data (Abstract)

Jong Hoon Ahnn , Cloud Research Lab, Samsung Research America - Silicon Valley 75 W Plumeria Drive, San Jose, CA 95134, USA
pp. 748-753

MaPLE: A MapReduce Pipeline for Lattice-based Evaluation and its application to SNOMED CT (Abstract)

Guo-Qiang Zhang , Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106
Wei Zhu , Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106
Mengmeng Sun , Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106
Shiqiang Tao , Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106
Olivier Bodenreider , National Library of Medicine, Bethesda, MD 20892, USA
Licong Cui , Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106
pp. 754-759

Dynamic pre-training of Deep Recurrent Neural Networks for predicting environmental monitoring data (Abstract)

Bun Theang Ong , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
Komei Sugiura , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
Koji Zettsu , Information Services Platform Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
pp. 760-765

Perldoop: Efficient execution of Perl scripts on Hadoop clusters (Abstract)

Jose M. Abuin , Centro de Investigación en Tecnoloxías da Información (CiTIUS) Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Juan C. Pichel , Centro de Investigación en Tecnoloxías da Información (CiTIUS) Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Tomas F. Pena , Centro de Investigación en Tecnoloxías da Información (CiTIUS) Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Pablo Gamallo , Centro de Investigación en Tecnoloxías da Información (CiTIUS) Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Marcos Garcia , Centro de Investigación en Tecnoloxías da Información (CiTIUS) Universidade de Santiago de Compostela, Santiago de Compostela, Spain
pp. 766-771

Department of Energy strategic roadmap for Earth system science data integration (Abstract)

Dean N. Williams , Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, CA 94550 USA
Giri Palanisamy , Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831 USA
Galen Shipman , Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831 USA
Thomas A. Boden , Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831 USA
Jimmy W. Voyles , Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99354 USA
pp. 772-777

Analyzing the language of food on social media (Abstract)

Daniel Fried , University of Arizona, Tucson, AZ, USA
Mihai Surdeanu , University of Arizona, Tucson, AZ, USA
Stephen Kobourov , University of Arizona, Tucson, AZ, USA
Melanie Hingle , University of Arizona, Tucson, AZ, USA
Dane Bell , University of Arizona, Tucson, AZ, USA
pp. 778-783

Using geometric structures to improve the error correction algorithm of high-throughput sequencing data on MapReduce framework (Abstract)

Wei-Chun Chung , Institute of Information Science, Academia Sinica, Taiwan
Yu-Jung Chang , Institute of Information Science, Academia Sinica, Taiwan
D. T. Lee , Institute of Information Science, Academia Sinica, Taiwan
Jan-Ming Ho , Institute of Information Science, Academia Sinica, Taiwan
pp. 784-789

Empowering personalized medicine with big data and semantic web technology: Promises, challenges, and use cases (Abstract)

Maryam Panahiazar , Center for Science and Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
Vahid Taslimitehrani , Center for Science and Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
Ashutosh Jadhav , Center for Science and Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
Jyotishman Pathak , Center for Science and Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
pp. 790-795

On scaling time dependent shortest path computations for Dynamic Traffic Assignment (Abstract)

Amit Gupta , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX, USA
Weijia Xu , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX, USA
Kenneth Perrine , Network Modeling Center, University of Texas at Austin, Austin, TX, USA
Dennis Bell , Network Modeling Center, University of Texas at Austin, Austin, TX, USA
Natalia Ruiz-Juri , Network Modeling Center, University of Texas at Austin, Austin, TX, USA
pp. 796-801

High volume geospatial mapping for internet-of-vehicle solutions with in-memory map-reduce processing (Abstract)

Tao Zhong , Software and Services Group, Intel
Kshitij Doshi , Software and Services Group, Intel
Gang Deng , Software and Services Group, Intel
Xiaoming Yang , Research Institute, TransWiseWay
Hegao Zhang , Research Institute, TransWiseWay
pp. 802-807

Crowdsourced query augmentation through semantic discovery of domain-specific jargon (Abstract)

Khalifeh AlJadda , Department of Computer Science, University of Georgia, Athens, Georgia
Mohammed Korayem , School of Informatics & Computing, Indiana University, Bloomington, Indiana
Trey Grainger , Director of Engineering, CareerBuilder Search Group, Norcross, Georgia
Chris Russell , Engineering Lead, Relevancy CareerBuilder Search Group Norcross, Georgia
pp. 808-815

Spatial computations over terabyte-sized images on Hadoop platforms (Abstract)

Peter Bajcsy , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Phuong Nguyen , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Antoine Vandecreme , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
Mary Brady , Software and Systems Division, Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD
pp. 816-824

Recall estimation for rare topic retrieval from large corpuses (Abstract)

Praveen Bommannavar , Twitter, Inc.
Alek Kolcz , Twitter, Inc.
Anand Rajaraman , Stanford University
pp. 825-834

Lightweight approximate top-k for distributed settings (Abstract)

Vinay Deolalikar , Hewlett Packard Research Sunnyvale, CA 94089, USA
Kave Eshghi , Google Mountain View, CA 94043, USA
pp. 835-844

Query revision during cluster based search on large unstructured corpora (Abstract)

Vinay Deolalikar , Hewlett-Packard Research, Sunnyvale, CA 94089
pp. 845-853

Astro: A predictive model for anomaly detection and feedback-based scheduling on Hadoop (Abstract)

Chaitali Gupta , eBay Inc., San Jose, California
Mayank Bansal , eBay Inc., San Jose, California
Tzu-Cheng Chuang , eBay Inc., San Jose, California
Ranjan Sinha , eBay Inc., San Jose, California
Sami Ben-romdhane , eBay Inc., San Jose, California
pp. 854-862

Automating data integration with HiperFuse (Abstract)

Eric Huang , Interaction and Analytics Laboratory, Palo Alto Research Center, Palo Alto, CA USA
Andres Quiroz , Interaction and Analytics Laboratory, Palo Alto Research Center, Palo Alto, CA USA
Luca Ceriani , Interaction and Analytics Laboratory, Palo Alto Research Center, Palo Alto, CA USA
pp. 863-867

Recommending similar items in large-scale online marketplaces (Abstract)

Jayasimha Katukuri , University of Louisiana, Lafayette, LA, USA
Tolga Konik , eBay Inc., San Jose, USA
Rajyashree Mukherjee , eBay Inc., San Jose, USA
Santanu Kolay , Turn Inc., San Jose, USA
pp. 868-876

SE-CDA: A scalable and efficient community detection algorithm (Abstract)

Dhaval C. Lunagariya , Dept. of Computer Science and Engineering, National Institute of Technology, Warangal, India
D.V.L.N. Somayajulu , Dept. of Computer Science and Engineering, National Institute of Technology, Warangal, India
P. Radha Krishna , Infosys Labs, Infosys Limited, Hyderabad, India
pp. 877-882

Increasing the accessibility to Big Data systems via a common services API (Abstract)

Rohan Malcolm , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Cherrelle Morrison , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Tyrone Grandison , Proficiency Labs International Ashland, Oregon
Sean Thorpe , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Kimron Christie , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Akim Wallace , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Damian Green , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Julian Jarrett , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
Arnett Campbell , Computational Science Research Group, School of Computing and Information Technology, University of Technology, St. Andrew, Jamaica
pp. 883-892

Big data predictive analytics for proactive semiconductor equipment maintenance (Abstract)

Sathyan Munirathinam , Business Intelligence Engineer, Micron Technology, Inc., Boise, USA
B. Ramadoss , Department of Computer Applications, National Institute of Technology, Trichy, India
pp. 893-902

Future directions of humans in Big Data Research: Summary of the 1st workshop on Human-Centered Big Data Research (Abstract)

Celeste Lyn Paul , Department of Defense Fort Meade, MD, United States
Chris Argenta , Applied Research Associates, Raleigh, NC, United States
William Elm , Resilient Cognitive Solutions Pittsburgh, PA, United States
Alex Endert , Georgia Institute of Technology, Atlanta, GA, United States
pp. 903-904

ALOJA: A systematic study of Hadoop deployment variables to enable automated characterization of cost-effectiveness (Abstract)

Nicolas Poggi , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
David Carrera , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Aaron Call , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Sergio Mendoza , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Yolanda Becerra , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Jordi Torres , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Eduard Ayguade , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Fabrizio Gagliardi , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Jesus Labarta , Barcelona Supercomputing Center (BSC) Universitat Politècnica de Catalunya (BarcelonaTech) Barcelona, Spain
Rob Reinauer , Microsoft Corporation, Microsoft Research (MSR) Redmond, USA
Nikola Vujic , Microsoft Development Center Serbia (MDCS) Belgrade, Serbia
Daron Green , Microsoft Corporation, Microsoft Research (MSR) Redmond, USA
Jose Blakeley , Microsoft Corporation, Microsoft Research (MSR) Redmond, USA
pp. 905-913

Heterogeneous stream processing for disaster detection and alarming (Abstract)

Francois Schnizler , Technion, Haifa, Israel
Thomas Liebig , Technical University Dortmund, Artificial Intelligence Group, Dortmund, Germany
Shie Marmor , Technion, Haifa, Israel
Gustavo Souto , Technical University Dortmund, Artificial Intelligence Group, Dortmund, Germany
Sebastian Bothe , Fraunhofer IAIS, Knowledge Discovery, Sankt Augustin, Germany
Hendrik Stange , Fraunhofer IAIS, Knowledge Discovery, Sankt Augustin, Germany
pp. 914-923

Identifying top Chinese network buzzwords from social media big data set based on time-distribution features (Abstract)

Yongli Tang , School of Computer, Central China Normal University, Wuhan, China
Tingting He , School of Computer, Central China Normal University, Wuhan, China
Bo Li , School of Computer, Central China Normal University, Wuhan, China
Xiaohua Hu , College of Information Science and Technology, Drexel University, Philadelphia, PA, USA
pp. 924-931

Bridging high velocity and high volume industrial big data through distributed in-memory storage & analytics (Abstract)

Jenny Weisenberg Williams , Knowledge Discovery Lab, GE Global Research Niskayuna, NY 12309 USA
Kareem S. Aggour , Knowledge Discovery Lab, GE Global Research Niskayuna, NY 12309 USA
John Interrante , Knowledge Discovery Lab, GE Global Research Niskayuna, NY 12309 USA
Justin McHugh , Knowledge Discovery Lab, GE Global Research Niskayuna, NY 12309 USA
Eric Pool , Life Cycle Engineering, GE Power & Water Atlanta, GA 30339 USA
pp. 932-941

Graph analytics and storage (Abstract)

Yinglong Xia , IBM Research, Yorktown Heights, NY 10598, USA
Ilie Gabriel Tanase , IBM Research, Yorktown Heights, NY 10598, USA
Lifeng Nai , Georgia Institute of Technology, Atlanta, GA 30332, USA
Wei Tan , IBM Research, Yorktown Heights, NY 10598, USA
Yanbin Liu , IBM Research, Yorktown Heights, NY 10598, USA
Jason Crawford , IBM Research, Yorktown Heights, NY 10598, USA
Ching-Yung Lin , IBM Research, Yorktown Heights, NY 10598, USA
pp. 942-951

An initial study of predictive machine learning analytics on large volumes of historical data for power system applications (Abstract)

Jiang Zheng , ABB US Corporate Research Center, 940 Main Campus Dr. Raleigh, NC USA, 27606
Aldo Dagnino , ABB US Corporate Research Center, 940 Main Campus Dr. Raleigh, NC USA, 27606
pp. 952-959

Author index (PDF)

pp. 1-7

Toward smart manufacturing using decision analytics (Abstract)

Alexander Brodsky , Department of Computer Science, George Mason University, Fairfax VA, USA
Mohan Krishnamoorthy , Department of Computer Science, George Mason University, Fairfax VA, USA
Daniel A. Menasce , Department of Computer Science, George Mason University, Fairfax VA, USA
Guodong Shao , Systems Integration Division, National Institute of Standards and Technology, Gaithersburg MD, USA
Sudarsan Rachuri , Systems Integration Division, National Institute of Standards and Technology, Gaithersburg MD, USA
pp. 967-977

An intelligent machine monitoring system for energy prediction using a Gaussian Process regression (Abstract)

Raunak Bhinge , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
Nishant Biswas , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
David Dornfeld , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
Jinkyoo Park , Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
Kincho H. Law , Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
Moneer Helu , Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
Sudarsan Rachuri , Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
pp. 978-986

Towards a domain-specific framework for predictive analytics in manufacturing (Abstract)

David Lechevalier , National Institute of Standards and Technology, Gaithersburg, MD, USA
Anantha Narayanan , National Institute of Standards and Technology, Gaithersburg, MD, USA
Sudarsan Rachuri , National Institute of Standards and Technology, Gaithersburg, MD, USA
pp. 987-995

Uncertainty quantification in performance evaluation of manufacturing processes (Abstract)

Saideep Nannapaneni , Department of Civil & Environmental Engineering, Vanderbilt University, Nashville, TN 37212, USA
Sankaran Mahadevan , Department of Civil & Environmental Engineering, Vanderbilt University, Nashville, TN 37212, USA
pp. 996-1005

CloudMan: A platform for portable cloud manufacturing services (Abstract)

Soheil Qanbari , Distributed Systems Group, Vienna University of Technology, Vienna, Austria
Samira Mahdi Zadeh , Distributed Systems Group, Vienna University of Technology, Vienna, Austria
Soroush Vedaei , Baha'i Institute for Higher Education (BIHE), Iran
Schahram Dustdar , Distributed Systems Group, Vienna University of Technology, Vienna, Austria
pp. 1006-1014

Building a rigorous foundation for performance assurance assessment techniques for “smart” manufacturing systems (Abstract)

Utpal Roy , Department of Mechanical and Aerospace Engineering Syracuse University, Syracuse, NY 13244, USA
Yunpeng Li , Department of Mechanical and Aerospace Engineering Syracuse University, Syracuse, NY 13244, USA
Bicheng Zhu , Department of Mechanical and Aerospace Engineering Syracuse University, Syracuse, NY 13244, USA
pp. 1015-1023

A system architecture for manufacturing process analysis based on big data and process mining techniques (Abstract)

Hanna Yang , School of Business Administration, Ulsan National Institute of Science and Technology, Ulsan, South Korea
Minjeong Park , School of Business Administration, Ulsan National Institute of Science and Technology, Ulsan, South Korea
Minsu Cho , School of Business Administration, Ulsan National Institute of Science and Technology, Ulsan, South Korea
Minseok Song , School of Business Administration, Ulsan National Institute of Science and Technology, Ulsan, South Korea
Seongjoo Kim , Cyberdigm Co., Seoul, South Korea
pp. 1024-1029

Researching persons & organizations: AWAKE: From text to an entity-centric knowledge base (Abstract)

Elizabeth Boschee , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
Marjorie Freedman , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
Saurabh Khanwalkar , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
Anoop Kumar , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
Amit Srivastava , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
Ralph Weischedel , Raytheon BBN Technologies Corp., 10 Moulton St., Cambridge, MA, USA
pp. 1030-1039

Integrating existing large scale medical laboratory data into the semantic web framework (Abstract)

Newres Al Haider , NICHE Research Group, Faculty of Computer Science, Dalhousie University, Halifax, Canada
Samina Abidi , NICHE Research Group, Faculty of Computer Science, Dalhousie University, Halifax, Canada
William van Woensel , NICHE Research Group, Faculty of Computer Science, Dalhousie University, Halifax, Canada
Syed S. R. Abidi , NICHE Research Group, Faculty of Computer Science, Dalhousie University, Halifax, Canada
pp. 1040-1048

Path knowledge discovery: Association mining based on multi-category lexicons (Abstract)

Chen Liu , Computer Science Department, University of California, Los Angeles, USA
Wesley W. Chu , Computer Science Department, University of California, Los Angeles, USA
Fred Sabb , Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, USA
D. Stott Parker , Computer Science Department, University of California, Los Angeles, USA
Joseph Korpela , Computer Science Department, University of California, Los Angeles, USA
pp. 1049-1059

Stochastic Finite Automata for the translation of DNA to protein (Abstract)

Tsau-Young Lin , San Jose State University (SJSU), USA
Asmi H. Shah , San Jose State University (SJSU), USA
pp. 1060-1067

Data mining and sharing tool for high content screening large scale biological image data (Abstract)

Asmi H. Shah , MGH, Harvard Medical School, Massachusetts, USA
Gayathri Gopalakrishnan , Karlsruhe Institute of Technology, Karlsruhe, Germany
Adithya Rajendran , Karlsruhe Institute of Technology, Karlsruhe, Germany
Urban Liebel , Karlsruhe Institute of Technology, Karlsruhe, Germany
pp. 1068-1076

A building performance evaluation & visualization system (Abstract)

Georgios Stavropoulos , Department of Electrical and Computer Engineering, University of Patras, Patras, Greece
Stelios Krinidis , Information Technologies Institute, Centre for Research and Technology Hellas, Thermi-Thessaloniki, Greece
Dimosthenis Ioannidis , Information Technologies Institute, Centre for Research and Technology Hellas, Thermi-Thessaloniki, Greece
Konstantinos Moustakas , Department of Electrical and Computer Engineering, University of Patras, Patras, Greece
Dimitrios Tzovaras , Information Technologies Institute, Centre for Research and Technology Hellas, Thermi-Thessaloniki, Greece
pp. 1077-1085

Statistical technique for online anomaly detection using Spark over heterogeneous data from multi-source VMware performance data (Abstract)

Mohiuddin Solaimani , Department of Computer Science, The University of Texas at Dallas, Richardson, TX
Mohammed Iftekhar , Department of Computer Science, The University of Texas at Dallas, Richardson, TX
Latifur Khan , Department of Computer Science, The University of Texas at Dallas, Richardson, TX
Bhavani Thuraisingham , Department of Computer Science, The University of Texas at Dallas, Richardson, TX
pp. 1086-1094

Extracting discriminative shapelets from heterogeneous sensor data (Abstract)

Om P. Patri , University of Southern California, Los Angeles, CA 90089
Abhishek B. Sharma , NEC Laboratories America, Princeton, NJ 08540
Haifeng Chen , NEC Laboratories America, Princeton, NJ 08540
Guofei Jiang , NEC Laboratories America, Princeton, NJ 08540
Anand V. Panangadan , University of Southern California, Los Angeles, CA 90089
Viktor K. Prasanna , University of Southern California, Los Angeles, CA 90089
pp. 1095-1104

Why name ambiguity resolution matters for scholarly big data research (Abstract)

Jinseok Kim , Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Urbana, USA
Jana Diesner , Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, Urbana, USA
Heejun Kim , School of Information and Library Science, University of North Carolina at Chapel Hill, Chapel Hill, USA
Amirhossein Aleyasen , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA
Hwan-Min Kim , Department of Overseas Information, Korea Institute of Science and Technology Information, Daejeon, Korea
pp. 1-6

Evolution of scientific collaboration networks (Abstract)

Gaurav Madaan , Computer Science Department, Thapar University, Patiala, India
Shivakumar Jolad , Department of Physics, IIT Gandhinagar, Ahmedabad, India
pp. 7-13

The OceanLink project (Abstract)

Tom Narock , Department of Information Technology and Management Science, Marymount University, Arlington, VA, USA
Robert Arko , Lamont-Doherty Earth Observatory, Columbia University, New York, NY, USA
Suzanne Carbotte , Lamont-Doherty Earth Observatory, Columbia University, New York, NY, USA
Adila Krisnadhi , Department of Computer Science, Wright State University, Dayton, OH, USA
Pascal Hitzler , Department of Computer Science, Wright State University, Dayton, OH, USA
Michelle Cheatham , Department of Computer Science, Wright State University, Dayton, OH, USA
Adam Shepherd , Woods Hole Oceanographic Institution, Woods Hole, MA, USA
Cynthia Chandler , Woods Hole Oceanographic Institution, Woods Hole, MA, USA
Lisa Raymond , Woods Hole Oceanographic Institution, Woods Hole, MA, USA
Peter Wiebe , Woods Hole Oceanographic Institution, Woods Hole, MA, USA
Timothy Finin , Department of Computer Science, University of Maryland, Baltimore County, Baltimore, MD, USA
pp. 14-21

Managing the academic data lifecycle: A case study of HPCC (Abstract)

Michael E. Payne , School of Computing, Clemson University, Clemson, SC
Linh B. Ngo , School of Computing, Clemson University, Clemson, SC
Flavio Villanustre , LexisNexis Risk Solutions, Atlanta, GA
Amy W. Apon , School of Computing, Clemson University, Clemson, SC
pp. 22-30

Computing fuzzy rough approximations in large scale information systems (Abstract)

Hasan Asfoor , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
Rajagopalan Srinivasan , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
Gayathri Vasudevan , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
Nele Verbiest , Dept. of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
Chris Cornelis , Dept. of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
Matthew Tolentino , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
Ankur Teredesai , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
Martine De Cock , Center for Data Science, Institute of Technology, University of Washington Tacoma, USA
pp. 9-16

Fast learning for big data applications using parameterized multilayer perceptron (Abstract)

B. Chandra , Department of Mathematics, IIT New Delhi 110016
Rajesh Kumar Sharma , Department of Mathematics, IIT New Delhi 110016
pp. 17-22

Calculating feature importance in data streams with concept drift using Online Random Forest (Abstract)

Andrew Phelps Cassidy , Commonwealth Computer Research Inc. (CCRi), Charlottesville, USA
Frank A. Deviney , Commonwealth Computer Research Inc. (CCRi), Charlottesville, USA
pp. 23-28

Towards scalable graph computation on mobile devices (Abstract)

Yiqi Chen , College of Computing, Georgia Tech, Atlanta, GA, USA
Zhiyuan Lin , College of Computing, Georgia Tech, Atlanta, GA, USA
Robert Pienta , College of Computing, Georgia Tech, Atlanta, GA, USA
Minsuk Kahng , College of Computing, Georgia Tech, Atlanta, GA, USA
Duen Horng Chau , College of Computing, Georgia Tech, Atlanta, GA, USA
pp. 29-35

Boosting Stochastic Newton Descent for Bigdata large scale classification (Abstract)

Roberto D'Ambrosio , Université catholique de Louvain - ICTEAM, Place Sainte Barbe 2, 1348 Louvain-la-Neuve
Wafa Belhajali , I3S Laboratory, 2000 Route des Lucioles, 06903 Sophia Antipolis
Michel Barlaud , I3S Laboratory, 2000 Route des Lucioles, 06903 Sophia Antipolis
pp. 36-41

WS2F: A weakly supervised framework for data stream filtering (Abstract)

Cailing Dong , Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD, 21250
Arvind Agarwal , Palo Alto Research Center, 800 Phillips Rd, Bldg 128, Webster, NY, 14580
pp. 50-57

An improved memory management scheme for large scale graph computing engine GraphChi (Abstract)

Yifang Jiang , School of Information Security and Engineering, Shanghai Jiao Tong University, Shanghai, China
Diao Zhang , School of Software, Shanghai Jiao Tong University, Shanghai, China
Kai Chen , School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Qu Zhou , School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Yi Zhou , School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
Jianhua He , School of Engineering and Applied Science, Aston University, UK
pp. 58-63

Fast algorithm for computing weighted projection quantiles and data depth for high-dimensional large data clouds (Abstract)

Ujjal Kumar Mukherjee , Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455, USA
Snigdhansu Chatterjee , School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455, USA
pp. 64-71

FS3: A sampling based method for top-k frequent subgraph mining (Abstract)

Tanay Kumar Saha , Dept. of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indiana, IN-46202, USA
Mohammad Al Hasan , Dept. of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indiana, IN-46202, USA
pp. 72-79

A clustering based scalable hybrid approach for web page recommendation (Abstract)

Mohammad Amir Sharif , The Center for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, Louisiana, USA
Vijay V. Raghavan , The Center for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, Louisiana, USA
pp. 80-87

Multiresolution analysis of incomplete rankings with applications to prediction (Abstract)

Eric Sibony , LTCI UMR No. 5141, Telecom ParisTech/CNRS, Institut Mines-Telecom, Paris, 75013, France
Stephan Clemencon , LTCI UMR No. 5141, Telecom ParisTech/CNRS, Institut Mines-Telecom, Paris, 75013, France
Jeremie Jakubowicz , SAMOVAR UMR No. 5157, Telecom SudParis/CNRS, Institut Mines-Telecom, Evry, 91000, France
pp. 88-95

Pairwise Topic Model via relation extraction (Abstract)

Xiaoli Song , College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania 19104
Yue Shang , College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania 19104
Yuan Ling , College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania 19104
Mengwen Liu , College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania 19104
Xiaohua Hu , College of Computing and Informatics, Drexel University, Philadelphia, Pennsylvania 19104
pp. 96-103

A multi-view two-level classification method for generalized multi-instance problems (Abstract)

Xiaoguang Wang , Faculty of Computer Science, Dalhousie University, Canada
Xuan Liu , Faculty of Computer Science, Dalhousie University, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Canada
Nathalie Japkowicz , School of Electrical Engineering and Computer Science, University of Ottawa, Canada
Hongyu Guo , National Research Council of Canada, 1200 Montreal Road, Ottawa, ON., Canada
pp. 104-111

Applying instance-weighted support vector machines to class imbalanced datasets (Abstract)

Xiaoguang Wang , Faculty of Computer Science, Dalhousie University, Canada
Xuan Liu , Faculty of Computer Science, Dalhousie University, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Canada; Institute of Computer Science
Nathalie Japkowicz , School of Electrical Engineering and Computer Science, University of Ottawa, Canada
pp. 112-118

Connecting the dots: Triangle completion and related problems on large data sets using GPUs (Abstract)

Amlan Chatterjee , School of Computer Science, University of Oklahoma, Norman, USA
Sridhar Radhakrishnan , School of Computer Science, University of Oklahoma, Norman, USA
Chandra N. Sekharan , Department of Computer Science, Loyola University Chicago, Chicago, IL, USA
pp. 1-8

ParK: An efficient algorithm for k-core decomposition on multicore processors (Abstract)

Naga Shailaja Dasari , Department of Computer Science, Old Dominion University, Norfolk, USA
Ranjan Desh , Department of Computer Science, Old Dominion University, Norfolk, USA
M. Zubair , Department of Computer Science, Old Dominion University, Norfolk, USA
pp. 9-16

A partitioning approach to scaling anomaly detection in graph streams (Abstract)

William Eberle , Department of Computer Science, Tennessee Technological University, Box 5101, Cookeville, TN, 38505
Lawrence Holder , School of Electrical Engineering and Computer Science, Washington State University, Box 642752, Pullman, WA 99164
pp. 17-24

Global graphs: A middleware for large scale graph processing (Abstract)

S. M. Faisal , Dept. of CSE, The Ohio State University, 2015 Neil Ave, Columbus, Ohio 43210
Srinivasan Parthasarathy , Dept. of CSE, The Ohio State University, 2015 Neil Ave, Columbus, Ohio 43210
P. Sadayappan , Dept. of CSE, The Ohio State University, 2015 Neil Ave, Columbus, Ohio 43210
pp. 33-40

Toward an efficient, highly scalable maximum clique solver for massive graphs (Abstract)

Ronald D. Hagan , Dept. of Electr. Eng. & Comput. Sci., Univ. of Tennessee, Knoxville, TN, USA
Charles A. Phillips , Dept. of Electr. Eng. & Comput. Sci., Univ. of Tennessee, Knoxville, TN, USA
Kai Wang , Dept. of Electr. Eng. & Comput. Sci., Univ. of Tennessee, Knoxville, TN, USA
Gary L. Rogers , Nat. Inst. for Comput. Sci., Univ. of Tennessee, Oak Ridge, TN, USA
Michael A. Langston , Dept. of Electr. Eng. & Comput. Sci., Univ. of Tennessee, Knoxville, TN, USA
pp. 41-45

Extending SPARQL with graph functions (Abstract)

David Mizell , YarcData / Cray, Inc.
Kristyn J. Maschhoff , YarcData / Cray, Inc.
Steven P. Reinhardt , YarcData / Cray, Inc.
pp. 46-53

Change detection in temporally evolving computer networks: A big data framework (Abstract)

Josephine M. Namayanja , Department of Information Systems, University of Maryland, Baltimore County, Baltimore, United States
Vandana P. Janeja , Department of Information Systems, University of Maryland, Baltimore County, Baltimore, United States
pp. 54-61

Detecting communities around seed nodes in complex networks (Abstract)

Christian L. Staudt , Department of Informatics, Karlsruhe Institute of Technology (KIT)
Yassine Marrakchi , Department of Informatics, Karlsruhe Institute of Technology (KIT)
Henning Meyerhenke , Department of Informatics, Karlsruhe Institute of Technology (KIT)
pp. 62-69

Access-averse framework for computing low-rank matrix approximations (Abstract)

Ichitaro Yamazaki , Department of Computer Science, University of Tennessee, Knoxville, Tennessee, U.S.A.
Theo Mary , Université de Toulouse, INPT(ENSEEIHT)-IRIT, France
Jakub Kurzak , Department of Computer Science, University of Tennessee, Knoxville, Tennessee, U.S.A.
Stanimire Tomov , Department of Computer Science, University of Tennessee, Knoxville, Tennessee, U.S.A.
Jack Dongarra , Department of Computer Science, University of Tennessee, Knoxville, Tennessee, U.S.A.
pp. 70-77

Architecture-aware graph repartitioning for data-intensive scientific computing (Abstract)

Angen Zheng , Department of Computer Science, University of Pittsburgh
Alexandros Labrinidis , Department of Computer Science, University of Pittsburgh
Panos K. Chrysanthis , Department of Computer Science, University of Pittsburgh
pp. 78-85

A new Zigzag MDS code with optimal encoding and efficient decoding (Abstract)

Jun Chen , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Hui Li , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Hanxu Hou , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Bing Zhu , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Tai Zhou , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Lijia Lu , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Yumeng Zhang , Shenzhen Eng. Lab. of Converged Networks Technology, Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
pp. 1-6

An efficient scheme to ensure data availability for a cloud service provider (Abstract)

Seungmin Kang , Department of Electrical & Computer Engineering, National University of Singapore, Singapore
Bharadwaj Veeravalli , Department of Electrical & Computer Engineering, National University of Singapore, Singapore
Khin Mi Mi Aung , Data Storage Institute, A*STAR, Singapore
Chao Jin , Data Storage Institute, A*STAR, Singapore
pp. 15-20

A C library of repair-efficient erasure codes for distributed data storage systems (Abstract)

Chao Tian , Department of Electrical Engineering and Computer Science, University of Tennessee at Knoxville
pp. 21-26

ReCT: Improving MapReduce performance under failures with resilient checkpointing tactics (Abstract)

Hao Wang , School of Software, Shanghai Jiao Tong University, Shanghai, P.R. China
Haopeng Chen , School of Software, Shanghai Jiao Tong University, Shanghai, P.R. China
Fei Hu , School of Software, Shanghai Jiao Tong University, Shanghai, P.R. China
pp. 27-32

STORE: Data recovery with approximate minimum network bandwidth and disk I/O in distributed storage systems (Abstract)

Tai Zhou , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Hui Li , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Bing Zhu , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Yumeng Zhang , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Hanxu Hou , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
Jun Chen , Shenzhen Eng. Lab. of Converged Networks Technology, Shenzhen Key Lab. of Cloud Computing Tech. and App., Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
pp. 33-38

Privacy-aware filter-based feature selection (Abstract)

Yasser Jafer , School of Electrical Engineering and Computer Science, University of Ottawa, Canada
Stan Matwin , School of Electrical Engineering and Computer Science, University of Ottawa, Canada
Marina Sokolova , School of Electrical Engineering and Computer Science, University of Ottawa, Canada
pp. 1-5

Secure data storage in distributed cloud environments (Abstract)

Renata Jordao , Electrical Engineering Department, University of Brasilia, Campus Universitário Darcy Ribeiro, Asa Norte, 70910-900 Brasilia, DF, Brazil
Valerio Aymore Martins , Electrical Engineering Department, University of Brasilia, Campus Universitário Darcy Ribeiro, Asa Norte, 70910-900 Brasilia, DF, Brazil
Fabio Buiati , Electrical Engineering Department, University of Brasilia, Campus Universitário Darcy Ribeiro, Asa Norte, 70910-900 Brasilia, DF, Brazil
Rafael Timoteo de Sousa , Electrical Engineering Department, University of Brasilia, Campus Universitário Darcy Ribeiro, Asa Norte, 70910-900 Brasilia, DF, Brazil
Flavio Elias de Deus , Electrical Engineering Department, University of Brasilia, Campus Universitário Darcy Ribeiro, Asa Norte, 70910-900 Brasilia, DF, Brazil
pp. 6-12

Location prediction attacks using tensor factorization and optimal defenses (Abstract)

Takao Murakami , Research Institute for Secure Systems (RISEC), National Institute of Advanced Industrial Science and Technology (AIST) Tsukuba Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan
Hajime Watanabe , Research Institute for Secure Systems (RISEC), National Institute of Advanced Industrial Science and Technology (AIST) Tsukuba Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan
pp. 13-21

Information gateway for integrated pharmacogenomics data- IGIPD (Abstract)

Pavan Kumar , Centre for Development of Advanced Computing (C-DAC) No.1 Old Madras Road, Byappanahalli, Bangalore
Janaki Chintalapati , Centre for Development of Advanced Computing (C-DAC) No.1 Old Madras Road, Byappanahalli, Bangalore
N Neeharika , Amritha College of Engineering, Bangalore
Payal Saluja , Centre for Development of Advanced Computing (C-DAC) No.1 Old Madras Road, Byappanahalli, Bangalore
N. Mangala , Centre for Development of Advanced Computing (C-DAC) No.1 Old Madras Road, Byappanahalli, Bangalore
B B Prahlada Rao , Centre for Development of Advanced Computing (C-DAC) No.1 Old Madras Road, Byappanahalli, Bangalore
pp. 1-9

Predicting a biological response of molecules from their chemical properties using diverse and optimized ensembles of stochastic gradient boosting machine (Abstract)

Tarek Abdunabi , University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1
Otman Basir , University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1
pp. 10-17

Understanding the effects of concussion using big data (Abstract)

Jesus J. Caban , National Intrepid Center of Excellence (NICoE), Walter Reed National Military Medical Center, Bethesda, MD
Gerard Riedy , National Intrepid Center of Excellence (NICoE), Walter Reed National Military Medical Center, Bethesda, MD
Terrence R. Oakes , National Intrepid Center of Excellence (NICoE), Walter Reed National Military Medical Center, Bethesda, MD
Geoff Grammer , National Intrepid Center of Excellence (NICoE), Walter Reed National Military Medical Center, Bethesda, MD
Thomas DeGraba , National Intrepid Center of Excellence (NICoE), Walter Reed National Military Medical Center, Bethesda, MD
pp. 18-23

Duplicate drug discovery using Hadoop (Abstract)

Shao Hua Cheng , Advanced Research Institute, Institute for Information Industry, Taipei, Taiwan, ROC
Yu Shian Chiu , Advanced Research Institute, Institute for Information Industry, Taipei, Taiwan, ROC
Shih Yao Dai , Advanced Research Institute, Institute for Information Industry, Taipei, Taiwan, ROC
Hui-I Hsiao , Advanced Research Institute, Institute for Information Industry, Taipei, Taiwan, ROC
pp. 24-26

Towards integrating the detection of genetic variants into an in-memory database (Abstract)

Cindy Fahnrich , Hasso Plattner Institute, Enterprise Platform and Integration Concepts, August-Bebel-Str. 88, 14482 Potsdam, Germany
Matthieu-P. Schapranow , Hasso Plattner Institute, Enterprise Platform and Integration Concepts, August-Bebel-Str. 88, 14482 Potsdam, Germany
Hasso Plattner , Hasso Plattner Institute, Enterprise Platform and Integration Concepts, August-Bebel-Str. 88, 14482 Potsdam, Germany
pp. 27-32

A general supervised approach to segmentation of clinical texts (Abstract)

Kavita Ganesan , 3M Health Information Systems, 575 West Murray Blvd, Salt Lake City, UT
Michael Subotin , 3M Health Information Systems, 12215 Plum Orchard Drive, Silver Spring, MD 20904
pp. 33-40

A fast and memory-efficient algorithm for learning and retrieval of phenotypic dynamics in multivariate cohort time series (Abstract)

Shamim Nemati , Harvard School of Engineering and Applied Sciences, 33 Oxford Street, Cambridge, MA 02138, USA
Mohammad M. Ghassemi , Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
pp. 41-44

Big data in genomics: An overview (Abstract)

Ruchie Bhardwaj , Cisco Systems, Inc./University of Southern California, San Jose, CA 95134, USA
Adhiraaj Sethi , Cisco Systems, Inc., Herndon, VA 20171, USA
Raghunath Nambiar , Cisco Systems, Inc., San Jose, CA 95134, USA
pp. 45-49

Protective effects of rheumatoid arthritis in septic ICU patients (Abstract)

Mallory Bounds Sheth , Massachusetts Institute of Technology, Operations Research Center, Cambridge, Massachusetts, USA
Abdullah Chahin , Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
Roger Mark , Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, Massachusetts, USA
Natasha Markuzon , The Charles Stark Draper Laboratory, Inc., Cambridge, Massachusetts, USA
pp. 50-55

Workload characterization for MG-RAST metagenomic data analytics service in the cloud (Abstract)

Wei Tang , Argonne National Laboratory, Argonne, IL, USA
Jared Bischof , University of Chicago, Chicago, IL, USA
Narayan Desai , Ericsson, San Jose, CA, USA
Kanak Mahadik , Purdue University, West Lafayette, IN, USA
Wolfgang Gerlach , University of Chicago, Chicago, IL, USA
Travis Harrison , University of Chicago, Chicago, IL, USA
Andreas Wilke , Argonne National Laboratory, Argonne, IL, USA
Folker Meyer , Argonne National Laboratory, Argonne, IL, USA
pp. 56-63

TIDE: Inter-chromosomal translocation and insertion detection using embeddings (Abstract)

Rosanne Vetro , Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
Roshanak Farhoodi , Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
Rohith Kotla , Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
Nurit Haspel , Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
David Weisman , Department of Biology, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
Jennifer Rosen , MedSTAR Washington Hospital Center, 110 Irving St. NW, Washington, DC 20010-2975 USA
Dan Simovici , Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125 USA
pp. 64-70

Low redundancy feature selection with grouped variables and its application to healthcare data (Abstract)

Hang Wu , Department of Automation, Tsinghua University, Beijing, China
Ji-jiang Yang , Research Institute of Information Technology, Tsinghua University, Beijing, China; School of Software Engineering, Beijing University of Technology, Beijing, China
Jianqiang Li , Research Institute of Information Technology, Tsinghua University, Beijing, China; School of Software Engineering, Beijing University of Technology, Beijing, China
pp. 71-76

Pharmacological class data representation in the Web Ontology Language (OWL) (Abstract)

Qian Zhu , Department of Information Systems, University of Maryland Baltimore County, Baltimore, Maryland, USA
Cui Tao , School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
pp. 77-84

Spatiotemporal indexing techniques for efficiently mining spatiotemporal co-occurrence patterns (Abstract)

Berkay Aydin , Department of Computer Science, Georgia State University, Atlanta, GA 30302
Dustin Kempton , Department of Computer Science, Georgia State University, Atlanta, GA 30302
Vijay Akkineni , Department of Computer Science, Georgia State University, Atlanta, GA 30302
Shaktidhar Reddy Gopavaram , Department of Computer Science, Georgia State University, Atlanta, GA 30302
Karthik Ganesan Pillai , Department of Computer Science, Montana State University, Bozeman, MT 59717
Rafal Angryk , Department of Computer Science, Georgia State University, Atlanta, GA 30302
pp. 1-10

Scalable solar image retrieval with Lucene (Abstract)

Juan M. Banda , Montana State University, Bozeman, Montana
Rafal A. Angryk , Georgia State University, Atlanta, Georgia
pp. 11-17

Stream mining for solar physics: Applications and implications for big solar data (Abstract)

Karl Battams , Space Science Division, U.S. Naval Research Laboratory, Washington, D.C.
pp. 18-26

A computer vision approach to mining big solar data (Abstract)

Simon Felix , University of Applied Sciences Northwestern Switzerland, Windisch, Switzerland
Andre Csillaghy , University of Applied Sciences Northwestern Switzerland, Windisch, Switzerland
pp. 27-35

Iterative refinement of multiple targets tracking of solar events (Abstract)

Dustin Kempton , Department of Computer Science, Georgia State University, P.O. Box 5060, Atlanta GA 30302-5060, USA
Karthik Ganesan Pillai , Department of Computer Science, Georgia State University, P.O. Box 5060, Atlanta GA 30302-5060, USA
Rafal Angryk , Department of Computer Science, Georgia State University, P.O. Box 5060, Atlanta GA 30302-5060, USA
pp. 36-44

Improved data exploitation for DKIST high-resolution solar observations (Abstract)

Kevin P. Reardon , National Solar Observatory, 3665 Discovery Drive, Boulder, CO 80303
Steve Berukoff , National Solar Observatory, 3665 Discovery Drive, Boulder, CO 80303
pp. 45-52

Massive labeled solar image data benchmarks for automated feature recognition (Abstract)

Michael A. Schuh , Dept of Computer Science, Montana State University, Bozeman, MT, 59717 USA
Rafal A. Angryk , Dept of Computer Science, Georgia State University, Atlanta, GA, 30302 USA
pp. 53-60

Spatial data analysis of complex urban systems (Abstract)

Farideddin Peiravian , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
Amirhassan Kermanshah , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
Sybil Derrible , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
pp. 1-6

Human activity recognition in big data smart home context (Abstract)

Sabrina Azzi , Department of Mathematics and Computer Science, University of Quebec at Chicoutimi, Chicoutimi, Canada
Abdenour Bouzouane , Department of Mathematics and Computer Science, University of Quebec at Chicoutimi, Chicoutimi, Canada
Sylvain Giroux , Department of Mathematics and Computer Science, University of Sherbrooke, Sherbrooke, Canada
Cindy Dallaire , Department of Mathematics and Computer Science, University of Quebec at Chicoutimi, Chicoutimi, Canada
Bruno Bouchard , Department of Mathematics and Computer Science, University of Quebec at Chicoutimi, Chicoutimi, Canada
pp. 1-8

A visualized data analysis for bogus business entity detection (Abstract)

Hilary Cheng , College of Management, Yuan Ze University, Chung-Li, Taiwan
Yi-Chuan Lu , Department of Information Management, College of Informatics, Yuan Ze University, Chung-Li, Taiwan
Chih-Cheng Hsu , College of Management, Yuan Ze University, Chung-Li, Taiwan
pp. 9-15

Advanced planning and control of manufacturing processes in steel industry through big data analytics: Case study and architecture proposal (Abstract)

Julian Krumeich , German Research Center for Artificial Intelligence (DFKI GmbH), Saarbrücken, Germany
Dirk Werth , German Research Center for Artificial Intelligence (DFKI GmbH), Saarbrücken, Germany
Peter Loos , German Research Center for Artificial Intelligence (DFKI GmbH), Saarbrücken, Germany
Jens Schimmelpfennig , Software AG, Saarbrücken, Germany
Sven Jacobi , Saarstahl AG, Völklingen, Germany
pp. 16-24

An open schema for XML data in Hive (Abstract)

Wuheng Luo , Enterprise Data Architecture, Sears Holdings, Hoffman Estates, IL, USA
Bo Liu , Advertising and Data Platforms, Yahoo!, Champaign, IL, USA
Allie K. Watfa , Advertising and Data Platforms, Yahoo!, Champaign, IL, USA
pp. 25-31

Parallel and quantitative sequential pattern mining for large-scale interval-based temporal data (Abstract)

Guangchen Ruan , Data to Insight Center, School of Informatics and Computing, Indiana University
Hui Zhang , Visualization and Analytics, Pervasive Technology Institute, Indiana University
Beth Plale , Data to Insight Center, School of Informatics and Computing, Indiana University
pp. 32-39

A big data aggregation, analysis and exploitation integrated platform for increasing social management intelligence (Abstract)

Greg Sand , Molloy College
Leonidas Tsitouras , International Business President, GlobaliFusion
George Dimitrakopoulos , Electrical & Computer Engineer, Lecturer, Harokopio University of Athens, Greece
Vassilis Chatzigiannakis , Electrical & Computer Engineer, CTO, GlobaliFusion
pp. 40-47

A CCG virtual system for big data application communication costs analysis (Abstract)

Yongen Yu , Department of Computer Science, Illinois Institute of Technology, Chicago, IL
Wei Tang , Argonne National Laboratory, Argonne, IL
Hongbo Zou , Queensland University of Technology, Brisbane, Qld 4001, Australia
Liwei Liu , Northwestern University, Evanston, IL
pp. 54-60

High-frequency financial statistics with parallel R and Intel Xeon Phi coprocessor (Abstract)

Jian Zou , Department of Mathematical Sciences, Worcester Polytechnic Institute
Hui Zhang , Pervasive Technology Institute, Indiana University, Bloomington
pp. 61-69

Situation aware computing for big data (Abstract)

Eric S. Chan , Oracle Corporation, Redwood Shores, California USA
Dieter Gawlick , Oracle Corporation, Redwood Shores, California USA
Adel Ghoneimy , Oracle Corporation, Redwood Shores, California USA
Zhen Hua Liu , Oracle Corporation, Redwood Shores, California USA
pp. 1-6

Topic-specific post identification in microblog streams (Abstract)

Shanika Karunasekera , Dept. of Computing and Information Systems, The University of Melbourne, Australia
Aaron Harwood , Dept. of Computing and Information Systems, The University of Melbourne, Australia
Sameendra Samarawickrama , Dept. of Computing and Information Systems, The University of Melbourne, Australia
Kotagiri Ramamohanarao , Dept. of Computing and Information Systems, The University of Melbourne, Australia
Garry Robins , Melbourne School of Psychological Sciences, The University of Melbourne, Australia
pp. 7-13

Handling smart environment devices, data and services at the semantic level with the FI-WARE core platform (Abstract)

Fano Ramparany , Orange Labs, 28 chemin du Vieux Chene, 38243 Meylan, France
Fermin Galan Marquez , Telefónica I+D, Ronda de la Comunicación s/n, 28050 Madrid, Spain
Javier Soriano , Universidad Politécnica de Madrid, 28660 Boadilla del Monte, Madrid, Spain
Tarek Elsaleh , University of Surrey, Guildford GU2 7XH, Surrey, UK
pp. 14-20

Learning machines for computational epidemiology (Abstract)

Magnus Boman , SICS Swedish ICT and KTH/ICT/SCS, Electrum, SE-16429 Kista, Sweden
Daniel Gillblad , SICS Swedish ICT, Electrum, SE-16429 Kista, Sweden
pp. 1-5

Epidemiological modeling of bovine brucellosis in India (Abstract)

Gloria J. Kang , Department of Population Health Sciences, Virginia Tech, Blacksburg, USA
L. Gunaseelan , Department of Veterinary Public Health and Epidemiology, Madras Veterinary College, Tamil Nadu Veterinary and Animal Sciences University, Chennai, India
Kaja M. Abbas , Department of Population Health Sciences, Virginia Tech, Blacksburg, USA
pp. 6-10

Big data problems on discovering and analyzing causal relationships in epidemiological data (Abstract)

Yiheng Liang , Department of Computer Science and Engineering, University of North Texas, Denton, Texas 76207
Armin R. Mikler , Department of Computer Science and Engineering, University of North Texas, Denton, Texas 76207
pp. 11-18

Spatial big data analytics of influenza epidemic in Vellore, India (Abstract)

Daphne Lopez , School of Information Technology and Engineering, VIT University, Vellore, Tamil Nadu, India
M. Gunasekaran , School of Information Technology and Engineering, VIT University, Vellore, Tamil Nadu, India
B. Senthil Murugan , School of Information Technology and Engineering, VIT University, Vellore, Tamil Nadu, India
Harpreet Kaur , Indian Council of Medical Research, Government of India, New Delhi, India
Kaja M. Abbas , Department of Population Health Sciences, Virginia Tech, Blacksburg, USA
pp. 19-24

Multiway Analysis of bridge structural types in the National Bridge Inventory (NBI): A tensor decomposition approach (Abstract)

Offei Adarkwa , Dept. of Civil & Environmental Engineering, University of Delaware, Newark, DE, U.S.
Thomas Schumacher , Dept. of Civil & Environmental Engineering, University of Delaware, Newark, DE, U.S.
Nii Attoh-Okine , Dept. of Civil & Environmental Engineering, University of Delaware, Newark, DE, U.S.
pp. 1-6

Big data challenges in railway engineering (Abstract)

Nii Attoh-Okine , Department of Civil and Environmental Engineering, University of Delaware, Newark, Delaware
pp. 7-9

Efficient traffic speed forecasting based on massive heterogenous historical data (Abstract)

Xing-Yu Chen , Department of Computer Science, National Taiwan University of Science and Technology, Taipei, Taiwan
Hsing-Kuo Pao , Department of Computer Science, National Taiwan University of Science and Technology, Taipei, Taiwan
Yuh-Jye Lee , Department of Computer Science, National Taiwan University of Science and Technology, Taipei, Taiwan
pp. 10-17

A dynamic programming approach for 4D flight route optimization (Abstract)

Christian Kiss-Toth , Department of Mathematics and Computer Science, Széchenyi István University, Győr, Hungary
Gabor Takacs , Department of Mathematics and Computer Science, Széchenyi István University, Győr, Hungary
pp. 24-28

Impact analysis of extreme events on flows in spatial networks (Abstract)

Amirhassan Kermanshah , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A.
Alireza Karduni , Department of Urban Planning and Policy, University of Illinois at Chicago, Chicago, Illinois, U.S.A.
Farideddin Peiravian , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A.
Sybil Derrible , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A.
pp. 29-34

Applications of linked data in the rail domain (Abstract)

Christopher Morris , Birmingham Centre for Railway Research and Education, University of Birmingham, Birmingham, United Kingdom
John Easton , Birmingham Centre for Railway Research and Education, University of Birmingham, Birmingham, United Kingdom
Clive Roberts , Birmingham Centre for Railway Research and Education, University of Birmingham, Birmingham, United Kingdom
pp. 35-41

Metaheuristics in big data: An approach to railway engineering (Abstract)

Silvia Galvan Nunez , Department of Civil and Environmental Engineering, University of Delaware, Newark, DE, USA
Nii Attoh-Okine , Department of Civil and Environmental Engineering, University of Delaware, Newark, DE, USA
pp. 42-47

Facilitating maintenance decisions on the Dutch railways using big data: The ABA case study (Abstract)

Alfredo Nunez , Section of Railway Engineering, (∗) Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands
Jurjen Hendriks , Section of Railway Engineering, (∗) Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands
Zili Li , Section of Railway Engineering, (∗) Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands
Bart De Schutter , Section of Railway Engineering, (∗) Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands
Rolf Dollevoet , Section of Railway Engineering, (∗) Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands
pp. 48-53

Spatial data analysis of complex urban systems (Abstract)

Farideddin Peiravian , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
Amirhassan Kermanshah , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
Sybil Derrible , Complex and Sustainable Urban Networks (CSUN) Lab, Department of Civil and Materials Engineering, University of Illinois at Chicago, Chicago, Illinois, U.S.A. 60607
pp. 54-59

Evaluating structural engineering finite element analysis data using multiway analysis (Abstract)

Matija Radovic , Civil Engineering Department, University of Delaware, Newark, DE
Jennifer McConnell , Civil Engineering Department, University of Delaware, Newark, DE
pp. 60-67

Multi-objective optimization for resilient airline networks using socioeconomic-environmental data (Abstract)

Hidefumi Sawai , International Affairs Department, National Institute of Information and Communications Technology, 4-2-1, Nukui-Kitamachi, Koganei, Tokyo 184-8795 Japan
Aki-Hiro Sato , Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Honmachi, Yoshida, Sakyo-ku, Kyoto 606-8501 Japan
pp. 68-77

Predicting flight arrival times with a multistage model (Abstract)

Gabor Takacs , Department of Mathematics and Computer Science, Széchenyi István University, Győr, Hungary
pp. 78-84

Ontology-driven data integration for railway asset monitoring applications (Abstract)

Jonathan Tutcher , Centre for Railway Research & Education, University of Birmingham, Edgbaston, UK
pp. 85-95

Some examples of big data in railroad engineering (Abstract)

Allan M. Zarembski , Department of Civil and Environmental Engineering, University of Delaware, Newark, DE, USA
pp. 96-102

Taking an electronic ticketing system to the cloud: Design and discussion (Abstract)

Filipe Araujo , CISUC, Dept. of Informatics Engineering, University of Coimbra, Portugal
Marilia Curado , CISUC, Dept. of Informatics Engineering, University of Coimbra, Portugal
Pedro Furtado , CISUC, Dept. of Informatics Engineering, University of Coimbra, Portugal
Raul Barbosa , CISUC, Dept. of Informatics Engineering, University of Coimbra, Portugal
pp. 1-10

A contention aware hybrid evaluator for schedulers of big data applications in computer clusters (Abstract)

Shouvik Bardhan , Department of Computer Science, George Mason University, Fairfax, VA 22030
Daniel A. Menasce , Department of Computer Science, George Mason University, Fairfax, VA 22030
pp. 11-19

RuleMR: Classification rule discovery with MapReduce (Abstract)

Vasilis Kolias , National Technical University of Athens, Athens, Greece
Constantinos Kolias , George Mason University, Fairfax VA, U.S.A.
Ioannis Anagnostopoulos , University of Thessaly, Lamia, Greece
Eleftherios Kayafas , National Technical University of Athens, Athens, Greece
pp. 20-28

Community structure analysis in big climate data (Abstract)

Michael P. McGuire , Department of Computer and Information Sciences, Towson University, Towson, Maryland, USA
Nam P. Nguyen , Department of Computer and Information Sciences, Towson University, Towson, Maryland, USA
pp. 38-46

The best of two worlds: Integrating IBM InfoSphere Streams with Apache YARN (Abstract)

Zubair Nabi , IBM Research - Ireland
Rohit Wagle , IBM Research - T. J. Watson Research Center
Eric Bouillet , IBM Research - Ireland
pp. 47-51

Temporal bipartite projection and link prediction for online social networks (Abstract)

Tsunghan Wu , Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.
Sheau-Harn Yu , Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.
Wanjiun Liao , Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.
Cheng-Shang Chang , Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300, Taiwan, R.O.C.
pp. 52-59

The Deep Film Access Project: Ontology and metadata design for digital film production assets (Abstract)

Sarah Atkinson , Art, Design and Media, University of Brighton, Brighton, UK
Jos Lehmann , Art, Design and Media, University of Brighton, Brighton, UK
Roger Evans , Computing, Engineering and Mathematics, University of Brighton, Brighton, UK
pp. 1-4

Mining microdata: Economic opportunity and spatial mobility in Britain and the United States, 1850–1881 (Abstract)

Peter Baskerville , Department of History, University of Alberta, Edmonton, Canada
Lisa Dillon , Department of Demography, Université de Montréal, Montréal, Canada
Kris Inwood , Departments of Economics and History, University of Guelph, Guelph, Canada
Evan Roberts , Department of History & Minnesota Population Center, University of Minnesota, Minneapolis, United States of America
Steven Ruggles , Department of History & Minnesota Population Center, University of Minnesota, Minneapolis, United States of America
Kevin Schurer , Department of History, University of Leicester, Leicester, United Kingdom
John Robert Warren , Department of History & Minnesota Population Center, University of Minnesota, Minneapolis, United States of America
pp. 5-13

Mining mobile youth cultures (Abstract)

Tobias Blanke , Department of Digital Humanities, King's College London, United Kingdom
Giles Greenway , Department of Digital Humanities, King's College London, United Kingdom
Jennifer Pybus , Department of Digital Humanities, King's College London, United Kingdom
Mark Cote , Department of Digital Humanities, King's College London, United Kingdom
pp. 14-17

Scientific findings as big data for research synthesis: The metaBUS project (Abstract)

Frank Bosco , School of Business, Virginia Commonwealth University, Richmond, VA
Krista Uggerslev , JR Shaw School of Business, Northern Alberta Institute of Technology, Edmonton, AB
Piers Steel , Haskayne School of Business, University of Calgary, Calgary, AB
pp. 18-22

Scaling historical text re-use (Abstract)

Marco Buchler , Göttingen Centre for Digital Humanities, Georg-August-University Göttingen, Göttingen, Germany
Greta Franzini , Computer Science Department, University of Leipzig, Leipzig, Germany
Emily Franzini , Computer Science Department, University of Leipzig, Leipzig, Germany
Maria Moritz , Computer Science Department, University of Leipzig, Leipzig, Germany
pp. 23-31

Revolutionary entities: Turning data into knowledge to drive personalized exploration of the Irish Rising of 1916 (Abstract)

Owen Conlan , CNGL - Centre for Global Intelligent Content, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Alexander O'Connor , CNGL - Centre for Global Intelligent Content, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Orla Ni Loinsigh , CNGL - Centre for Global Intelligent Content, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Gary Munnelly , CNGL - Centre for Global Intelligent Content, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Seamus Lawless , CNGL - Centre for Global Intelligent Content, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Rachel Murphy , Digital Humanities / School of History, University College Cork, Cork, Ireland
pp. 32-38

Understanding the role of medical experts during a public health crisis: Digital tools and library resources for research on the 1918 Spanish influenza (Abstract)

E. Thomas Ewing , Department of History, Virginia Tech, Blacksburg, VA 24061
Samah Gad , Discovery Analytics Center, Department of Computer Science, Virginia Tech, Arlington, VA 22203
Naren Ramakrishnan , Discovery Analytics Center, Department of Computer Science, Virginia Tech, Arlington, VA 22203
Jeffrey S. Reznick , History of Medicine Division, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894
pp. 39-46

A metadata infrastructure for the analysis of parliamentary proceedings (Abstract)

Richard Gartner , Centre for e-Research, Department of Digital Humanities, King's College London, London, United Kingdom
pp. 47-50

Scaled Entity Search: A method for media historiography and response to critiques of big humanities data research (Abstract)

Eric Hoyt , Department of Communication Arts, University of Wisconsin-Madison, Madison, WI, USA
Kit Hughes , Department of Communication Arts, University of Wisconsin-Madison, Madison, WI, USA
Derek Long , Department of Communication Arts, University of Wisconsin-Madison, Madison, WI, USA
Anthony Tran , Department of Communication Arts, University of Wisconsin-Madison, Madison, WI, USA
Kevin Ponto , Department of Design Studies, University of Wisconsin-Madison, Madison, WI, USA
pp. 51-59

On the coverage of science in the media: A big data study on the impact of the Fukushima disaster (Abstract)

Thomas Lansdall-Welfare , Department of Computer Science, University of Bristol, Bristol, United Kingdom
Saatviga Sudhahar , Department of Computer Science, University of Bristol, Bristol, United Kingdom
Giuseppe A. Veltri , Department of Media and Communication, University of Leicester, Leicester, United Kingdom
Nello Cristianini , Department of Computer Science, University of Bristol, Bristol, United Kingdom
pp. 60-66

Integrating Data Mining and Data Management Technologies for Scholarly Inquiry (Abstract)

Ray R. Larson , School of Information, University of California, Berkeley, Berkeley, California, USA
Richard Marciano , College of Information Studies - Maryland's iSchool, University of Maryland, College Park, Maryland, USA
Chien-Yi Hou , School of Information and Library Science (SILS) University of North Carolina Chapel Hill, Chapel Hill, NC, USA
Shreyas , School of Information, University of California, Berkeley, Berkeley, California, USA
Paul Watry , Psychological Sciences, University of Liverpool, Liverpool, UK
John Harrison , Psychological Sciences, University of Liverpool, Liverpool, UK
Luis Aguilar , School of Information, University of California, Berkeley, Berkeley, California, USA
Jerome Fuselier , Psychological Sciences, University of Liverpool, Liverpool, UK
pp. 67-71

The exceptional and the everyday: 144 Hours in Kiev (Abstract)

Lev Manovich , The Graduate Center, City University of New York, New York, NY, U.S.A.
Alise Tifentale , The Graduate Center, City University of New York, New York, NY, U.S.A.
Mehrdad Yazdani , California Institute for Telecommunication and Information, La Jolla, CA, U.S.A.
Jay Chow , Web Developer, Katana, San Diego, CA, U.S.A.
pp. 72-79

Dealing with heterogeneous big data when geoparsing historical corpora (Abstract)

C.J. Rupp , Lancaster University, Lancaster, UK
Paul Rayson , Lancaster University, Lancaster, UK
Ian Gregory , Lancaster University, Lancaster, UK
Andrew Hardie , Lancaster University, Lancaster, UK
Amelia Joulain , Lancaster University, Lancaster, UK
Daniel Hartmann , Lancaster University, Lancaster, UK
pp. 80-83

BigExcel: A web-based framework for exploring big data in social sciences (Abstract)

Muhammed Asif Saleem , School of Computer Science, University of St Andrews, St Andrews, Fife, UK KY16 9SX
Blesson Varghese , School of Computer Science, University of St Andrews, St Andrews, Fife, UK KY16 9SX
Adam Barker , School of Computer Science, University of St Andrews, St Andrews, Fife, UK KY16 9SX
pp. 84-91

Probabilistic estimates of attribute statistics and match likelihood for people entity resolution (Abstract)

Xin Wang , Data Research, Intelius Inc, Bellevue, WA
Ang Sun , Data Research, Intelius Inc, Bellevue, WA
Hakan Kardes , Data Research, Intelius Inc, Bellevue, WA
Siddharth Agrawal , Data Research, Intelius Inc, Bellevue, WA
Lin Chen , Data Research, Intelius Inc, Bellevue, WA
Andrew Borthwick , Data Research, Intelius Inc, Bellevue, WA
pp. 92-99

A computational pipeline for crowdsourced transcriptions of Ancient Greek papyrus fragments (Abstract)

Alex C. Williams , Middle Tennessee State University, Murfreesboro, Tennessee, USA
John F. Wallin , Middle Tennessee State University, Murfreesboro, Tennessee, USA
Haoyu Yu , University of Minnesota, Minneapolis, Minnesota, USA
Marco Perale , University of Liverpool, Liverpool, Merseyshire, UK
Hyrum D. Carroll , Middle Tennessee State University, Murfreesboro, Tennessee, USA
Anne-Francoise Lamblin , University of Minnesota, Minneapolis, Minnesota, USA
Lucy Fortson , University of Minnesota, Minneapolis, Minnesota, USA
Dirk Obbink , University of Oxford, Oxford, Oxfordshire, UK
Chris J. Lintott , University of Oxford, Oxford, Oxfordshire, UK
James H. Brusuelas , University of Oxford, Oxford, Oxfordshire, UK
pp. 100-105

A model architecture for Big Data applications using relational databases (Abstract)

Erin-Elizabeth A. Durham , Department of Computer Science, Georgia State University, Atlanta, USA
Andrew Rosen , Department of Computer Science, Georgia State University, Atlanta, USA
Robert W. Harrison , Department of Computer Science, Georgia State University, Atlanta, USA
pp. 9-16

Optimizing graph queries with graph joins and Sprinkle SPARQL (Abstract)

Eric L. Goodman , Sandia National Laboratories, Albuquerque, NM, USA
Edward Jimenez , Sandia National Laboratories, Albuquerque, NM, USA
Cliff Joslyn , Pacific Northwest National Laboratory, Richland, WA, USA
David Haglin , Pacific Northwest National Laboratory, Richland, WA, USA
Sinan Al-Saffar , Semantic Scale LLC, Tampa, FL, USA
Dirk Grunwald , University of Colorado, Boulder, USA
pp. 17-24

Vessel route anomaly detection with Hadoop MapReduce (Abstract)

Xiaoguang Wang , Faculty of Computer Science, Dalhousie University, Canada
Xuan Liu , Faculty of Computer Science, Dalhousie University, Canada
Bo Liu , Faculty of Computer Science, Dalhousie University, Canada
Erico N. de Souza , Faculty of Computer Science, Dalhousie University, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Canada, Institute of Computer Science, Polish Academy of Sciences, Poland
pp. 25-30

HBGSim: A structural similarity measurement over heterogeneous big graphs (Abstract)

Jiazhen Nian , Department of Machine Intelligence, Peking University, Beijing 100871, China
Shan Jiang , Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801
Yan Zhang , Department of Machine Intelligence, Peking University, Beijing 100871, China
pp. 31-38

Knowledge based dimensionality reduction for technical text mining (Abstract)

Walid Shalaby , Computer Science Department, University of North Carolina at Charlotte, Charlotte, USA
Wlodek Zadrozny , Computer Science Department, University of North Carolina at Charlotte, Charlotte, USA
Sean Gallagher , Computer Science Department, University of North Carolina at Charlotte, Charlotte, USA
pp. 39-44

A distributed instance-weighted SVM algorithm on large-scale imbalanced datasets (Abstract)

Xiaoguang Wang , Faculty of Computer Science, Dalhousie University, Canada
Xuan Liu , Faculty of Computer Science, Dalhousie University, Canada
Stan Matwin , Faculty of Computer Science, Dalhousie University, Canada, Institute of Computer Science, Polish Academy of Sciences, Poland
pp. 45-51

A layer based architecture for provenance in big data (Abstract)

Rajeev Agrawal , Department of Computer Systems Technology, North Carolina A&T State University, Greensboro, USA
Ashiq Imran , Department of Computer Science, North Carolina A&T State University, Greensboro, USA
Cameron Seay , Department of Computer Systems Technology, North Carolina A&T State University, Greensboro, USA
Jessie Walker , Department of Mathematics and Computer Science, University of Arkansas at Pine Bluff, Pine Bluff, USA
pp. 1-7

Sharing best practices for the implementation of Big Data applications in government and science communities (Abstract)

Joan L. Aron , Independent Consultant, Columbia, Maryland, U.S.A.
Brand Niemann , Semantic Community, Fairfax, Virginia, U.S.A.
pp. 8-10

Big Data: Challenges, practices and technologies: NIST Big Data Public Working Group workshop at IEEE Big Data 2014 (Abstract)

Nancy W. Grady , Science Applications International Corporation
Mark Underwood , Krypton Brothers, LLC
Arnab Roy , Fujitsu Laboratories of America
Wo L. Chang , National Institute of Standards and Technology
pp. 11-15

Big data machine learning and graph analytics: Current state and future challenges (Abstract)

H. Howie Huang , Department of Electrical and Computer Engineering, George Washington University
Hang Liu , Department of Electrical and Computer Engineering, George Washington University
pp. 16-17

A standard for benchmarking big data systems (Abstract)

Raghunath Nambiar , Cisco Systems, Inc, 275 E Tasman Drive, San Jose, CA 94134, USA
pp. 18-20

Addressing data veracity in big data applications (Abstract)

Saima Aman , Department of Computer Science, University of Southern California, Los Angeles, CA
Charalampos Chelmis , Department of Electrical Engineering, University of Southern California, Los Angeles, CA
Viktor Prasanna , Department of Electrical Engineering, University of Southern California, Los Angeles, CA
pp. 1-3

Machine learning and interactive visualization applied to TB-sized images of stem cells (Abstract)

Julien Amelot , Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, USA
Peter Bajcsy , Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, USA
Anne Plant , Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, USA
Mary Brady , Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, USA
pp. 4

Big data: A new challenge for tourism (Abstract)

Gael Chareyron , De Vinci Technology Lab, ESILV, Pôle Universitaire Léonard de Vinci, Paris La Défense, France
Jerome Da-Rugna , De Vinci Technology Lab, ESILV, Pôle Universitaire Léonard de Vinci, Paris La Défense, France
Thomas Raimbault , De Vinci Technology Lab, ESILV, Pôle Universitaire Léonard de Vinci, Paris La Défense, France
pp. 5-7

OME: Tool for generating and managing metadata to handle BigData (Abstract)

Ranjeet Devarakonda , Oak Ridge National Laboratory
Biva Shrestha , Oak Ridge National Laboratory
Giriprakash Palanisamy , Oak Ridge National Laboratory
Les Hook , Oak Ridge National Laboratory
Terri Killeffer , Oak Ridge National Laboratory
Misha Krassovski , Oak Ridge National Laboratory
Tom Boden , Oak Ridge National Laboratory
Robert Cook , Oak Ridge National Laboratory
Lisa Zolly , United States Geological Survey
Viv Hutchison , United States Geological Survey
Mike Frame , United States Geological Survey
Alice Cialella , Brookhaven National Laboratory
Kathy Lazer , Brookhaven National Laboratory
pp. 8-10

The EMBERS architecture for streaming predictive analytics (Abstract)

Andy Doyle , CACI Inc., Lanham, MD 20706
Graham Katz , CACI Inc., Lanham, MD 20706
Kristen Summers , CACI Inc., Lanham, MD 20706
Chris Ackermann , CACI Inc., Lanham, MD 20706
Ilya Zavorin , CACI Inc., Lanham, MD 20706
Zunsik Lim , CACI Inc., Lanham, MD 20706
Sathappan Muthiah , Virginia Tech, Blacksburg, VA 24061
Liang Zhao , Virginia Tech, Blacksburg, VA 24061
Chang-Tien Lu , Virginia Tech, Blacksburg, VA 24061
Patrick Butler , Virginia Tech, Blacksburg, VA 24061
Rupinder Paul Khandpur , Virginia Tech, Blacksburg, VA 24061
Youssef Fayed , BASIS Technology, Herndon, VA 20171
Naren Ramakrishnan , Virginia Tech, Blacksburg, VA 24061
pp. 11-13

A novel approach to determine docking locations using fuzzy logic and shape determination (Abstract)

Chinua Umoja , Computer Science Department, Georgia State University, Atlanta, Georgia, USA
J.T. Torrance , Computer Science Department, Georgia State University, Atlanta, Georgia, USA
Erin-Elizabeth A. Durham , Computer Science Department, Georgia State University, Atlanta, Georgia, USA
Andrew Rosen , Computer Science Department, Georgia State University, Atlanta, Georgia, USA
Robert W. Harrison , Computer Science Department, Georgia State University, Atlanta, Georgia, USA
pp. 14-16

Linked Open Data mining for democratization of big data (Abstract)

Roberto Espinosa , WaKe Research, Universidad de Matanzas “Camilo Cienfuegos”, Cuba
Larisa Garriga , WaKe Research, Universidad de Matanzas “Camilo Cienfuegos”, Cuba
Jose Jacobo Zubcoff , WaKe Research, Universidad de Alicante, Spain
Jose-Norberto Mazon , WaKe Research, Universidad de Alicante, Spain
pp. 17-19

Building Wrangler: A transformational data intensive resource for the open science community (Abstract)

Niall Gaffney , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX
Christopher Jordan , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX
Tommy Minyard , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX
Dan Stanzione , Texas Advanced Computing Center, University of Texas at Austin, Austin, TX
pp. 20-22

CELAR: Automated application elasticity platform (Abstract)

Ioannis Giannakopoulos , National Technical University of Athens, School of ECE
Nikolaos Papailiou , National Technical University of Athens, School of ECE
Christos Mantas , National Technical University of Athens, School of ECE
Ioannis Konstantinou , National Technical University of Athens, School of ECE
Dimitrios Tsoumakos , Department of Informatics, Ionian University, Corfu, Greece
Nectarios Koziris , National Technical University of Athens, School of ECE
pp. 23-25

Semantic HMC for big data analysis (Abstract)

Thomas Hassan , Université de Bourgogne, Dijon, France
Rafael Peixoto , Polytechnic of Porto, Porto, Portugal
Christophe Cruz , Université de Bourgogne, Dijon, France
Aurélie Bertaux , Université de Bourgogne, Dijon, France
Nuno Silva , Polytechnic of Porto, Porto, Portugal
pp. 26-28

A layer based architecture for provenance in big data (Abstract)

Ashiq Imran , Department of Computer Science, North Carolina A&T State University, Greensboro, USA
Rajeev Agrawal , Department of Computer Systems Technology, North Carolina A&T State University, Greensboro, USA
Jessie Walker , University of Arkansas at Pine Bluff, Department of Mathematics & Computer Sciences, Pine Bluff, AR 71601
Anthony Gomes , University of Arkansas at Pine Bluff, Department of Mathematics & Computer Sciences, Pine Bluff, AR 71601
pp. 29-31

B-dids: Mining anomalies in a Big-distributed Intrusion Detection System (Abstract)

Vandana P. Janeja , University of Maryland, Baltimore County
Ali Azari , University of Maryland, Baltimore County
Josephine M. Namayanja , University of Maryland, Baltimore County
Brian Heilig , HAMR Analytic Technologies, E.T. International
pp. 32-34

Enabling genomic analysis on federated clouds (Abstract)

Fan Jiang , Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, USA
Michael Shoffner , Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, USA
Claris Castillo , Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, USA
Charles Schmitt , Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, USA
pp. 35-37

Challenges of data integration and interoperability in big data (Abstract)

Anirudh Kadadi , Department of Computer Systems Technology, North Carolina A & T State University, Greensboro, NC, USA
Rajeev Agrawal , Department of Computer Systems Technology, North Carolina A & T State University, Greensboro, NC, USA
Christopher Nyamful , Department of Computer Systems Technology, North Carolina A & T State University, Greensboro, NC, USA
Rahman Atiq , University of Arkansas at Pine Bluff, Pine Bluff, AR, USA
pp. 38-40

Large scale author name disambiguation in digital libraries (Abstract)

Madian Khabsa , Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Pucktada Treeratpituk , Science Park Promotion Agency, Ministry of Science and Technology, Bangkok, Thailand
C. Lee Giles , Information Sciences and Technology, Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
pp. 41-42

Real-time traffic incident detection using probe-car data on the Tokyo Metropolitan Expressway (Abstract)

Akira Kinoshita , The University of Tokyo, Tokyo, Japan
Atsuhiro Takasu , National Institute of Informatics, Tokyo, Japan
Jun Adachi , National Institute of Informatics, Tokyo, Japan
pp. 43-45

Differentially private models of tollgate usage: The Milan tollgate data set (Abstract)

Nick Manfredi , Wellesley College
Darakhshan J. Mir , Wellesley College
Shannon Lu , Wellesley College
Dominick Sanchez , Bowdoin College
pp. 46-48

MoDisSENSE: A distributed platform for social networking services over mobile devices (Abstract)

Ioannis Mytilinis , Computing Systems Laboratory, National Technical University of Athens
Ioannis Giannakopoulos , Computing Systems Laboratory, National Technical University of Athens
Ioannis Konstantinou , Computing Systems Laboratory, National Technical University of Athens
Katerina Doka , Computing Systems Laboratory, National Technical University of Athens
Nectarios Koziris , Computing Systems Laboratory, National Technical University of Athens
pp. 49-51

A challenge of authorship identification for ten-thousand-scale microblog users (Abstract)

Syunya Okuno , Dept. of Computer Sci. and Eng., Waseda Univ., Tokyo, Japan
Hiroki Asai , Dept. of Computer Sci. and Eng., Waseda Univ., Tokyo, Japan
Hayato Yamana , Faculty of Sci. and Eng., Waseda Univ., / National Inst. of Informatics, Tokyo, Japan
pp. 52-54

Cognitive map of tourist behavior based on Tripadvisor (Abstract)

Thomas Raimbault , De Vinci Technology Lab - ESILV - University of Léonard de Vinci, Paris La Défense - France
Gael Chareyron , De Vinci Technology Lab - ESILV - University of Léonard de Vinci, Paris La Défense - France
Corinne Krzyzanowski-Guillot , De Vinci Technology Lab - ESILV - University of Léonard de Vinci, Paris La Défense - France
pp. 55-57

Biclustering using Spark-MapReduce (Abstract)

Tugdual Sarazin , LIPN-UMR 7030, University of Paris 13 - CNRS, 99, av. J-B Clément F-93430 Villetaneuse, France
Mustapha Lebbah , LIPN-UMR 7030, University of Paris 13 - CNRS, 99, av. J-B Clément F-93430 Villetaneuse, France
Hanane Azzag , LIPN-UMR 7030, University of Paris 13 - CNRS, 99, av. J-B Clément F-93430 Villetaneuse, France
pp. 58-60

A summarization paradigm for big data (Abstract)

Zubair Shah , University of New South Wales, Canberra, Australia
Abdun Naser Mahmood , University of New South Wales, Canberra, Australia
pp. 61-63

An open source framework to add spatial extent and geospatial visibility to Big Data (Abstract)

Biva Shrestha , Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, USA
Ranjeet Devarakonda , Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, USA
Giriprakash Palanisamy , Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, USA
pp. 64-66

Developing a cloud computing platform for Big Data: The OpenStack Nova case (Abstract)

Jose Teixeira , Information Systems Science Unit, University of Turku, Finland
pp. 67-69

Language, cultural influences and intelligence in historical gazetteers of the Great War (Abstract)

Robert Warren , Big Data Institute, Dalhousie University, Halifax, Canada
Bo Liu , Big Data Institute, Dalhousie University, Halifax, Canada
pp. 70-72

The Bot will serve you now: Automating access to archival materials (Abstract)

Joshua A. Westgard , Digital Systems and Stewardship Division, University of Maryland Libraries, College Park, MD, USA
pp. 73-74

Incremental and parallel spatial association mining (Abstract)

Jin Soung Yoo , Department of Computer Science, Indiana University-Purdue University Fort Wayne, Fort Wayne, Indiana, USA
Douglas Boulware , Rome Research Site/RIEA, Air Force Research Laboratory, Rome, New York, USA
pp. 75-76

Sharding for literature search via cutting citation graphs (Abstract)

Haozhen Zhao , College of Computing & Informatics, Drexel University, Philadelphia, USA
pp. 77-79

Repair efficient storage codes via combinatorial configurations (Abstract)

Bing Zhu , Institute of Big Data Technologies, Shenzhen Eng. Lab of Converged Networks Technology, Peking University Shenzhen Graduate School
Hui Li , Institute of Big Data Technologies, Shenzhen Eng. Lab of Converged Networks Technology, Peking University Shenzhen Graduate School
Kenneth W. Shum , Institute of Network Coding, The Chinese University of Hong Kong, Shatin, Hong Kong
pp. 80-81

Search space preprocessing in solving complex optimization problems (Abstract)

Ruoqian Liu , EECS Department, Northwestern University, Evanston, IL USA
Ankit Agrawal , EECS Department, Northwestern University, Evanston, IL USA
Wei-keng Liao , EECS Department, Northwestern University, Evanston, IL USA
Alok Choudhary , EECS Department, Northwestern University, Evanston, IL USA
pp. 1-5