2015 IEEE International Conference on Big Data (Big Data) (2015)
Santa Clara, CA, USA
Oct. 29, 2015 to Nov. 1, 2015
ISBN: 978-1-4799-9925-5
TABLE OF CONTENTS

[Front cover] (PDF)

pp. 1

Organization (PDF)

pp. 1-2

Moving past the "Wild West" era for Big Data (PDF)

H. V. Jagadish , Bernard A Galler Collegiate Electrical Engineering and Computer Science, University of Michigan
pp. 2

Conquering Big Data with Spark (PDF)

Ion Stoica , UC Berkeley, USA
pp. 3

Online and on-demand partitioning of streaming graphs (Abstract)

Ioanna Filippidou , Athens University of Economics and Business, 76 Patission Street, Athens, Greece
Yannis Kotidis , Athens University of Economics and Business, 76 Patission Street, Athens, Greece
pp. 4-13

Learning to accurately COUNT with query-driven predictive analytics (Abstract)

Christos Anagnostopoulos , School of Computing Science, University of Glasgow, UK, G12 8QQ
Peter Triantafillou , School of Computing Science, University of Glasgow, UK, G12 8QQ
pp. 14-23

Rewriting complex SPARQL analytical queries for efficient cloud-based processing (Abstract)

Padmashree Ravindra , Microsoft Corporation, Redmond, USA
HyeongSik Kim , North Carolina State University, Raleigh, USA
Kemafor Anyanwu , North Carolina State University, Raleigh, USA
pp. 32-37

Concept hierarchies and human navigation (Abstract)

Salvador Aguinaga , University of Notre Dame
Aditya Nambiar , IIT Bombay
Zuozhu Liu , Zhejiang University
Tim Weninger , University of Notre Dame
pp. 38-45

Iteratively refining SVMs using priors (Abstract)

Enric Junque de Fortuny , INSEAD, Boulevard de Constance, 77305 Fontainebleau, France
Theodoros Evgeniou , INSEAD, Boulevard de Constance, 77305 Fontainebleau, France
David Martens , Faculty of Applied Economics, University of Antwerp, Belgium
Foster Provost , Information, Operations & Management Sciences, Stern School of Business, New York University, New York
pp. 46-52

Towards scalable quantile regression trees (Abstract)

Harish S. Bhat , Applied Mathematics Unit, UC Merced, Merced, USA
Nitesh Kumar , Skytree, Inc., San Jose, USA
Garnet J. Vaz , Microsoft, Bellevue, USA
pp. 53-60

Super-CWC and super-LCC: Super fast feature selection algorithms (Abstract)

Kilho Shin , University of Hyogo, Kobe, Japan
Tetsuji Kuboyama , Gakushuin University Tokyo, Japan
Takako Hashimoto , Chiba University of Commerce, Chiba, Japan
Dave Shepard , University of California, Los Angeles, USA
pp. 1-7

Considerations and recommendations for data availability for data analytics for manufacturing (Abstract)

Don Libes , Engineering Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, 20899 USA
Seungjun Shin , Engineering Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, 20899 USA
Jungyub Woo , Engineering Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland, 20899 USA
pp. 68-75

ScaleGraph: A high-performance library for billion-scale graph analytics (Abstract)

Toyotaro Suzumura , IBM T.J. Watson Research Center
Koji Ueno , Tokyo Institute of Technology
pp. 76-84

System and architecture level characterization of big data applications on big and little core server architectures (Abstract)

Maria Malik , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
Setareh Rafatirah , Department of Information Sciences and Technology, George Mason University, Fairfax, VA, USA
Avesta Sasan , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
Houman Homayoun , Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
pp. 85-94

Data streaming algorithms for the Kolmogorov-Smirnov test (Abstract)

Ashwin Lall , Department of Mathematics and Computer Science, Denison University, Granville, OH, USA
pp. 95-104

Techniques for fast and scalable time series traffic generation (Abstract)

Jilong Kuang , Computing Science Innovation Center, Samsung Research America
Daniel G. Waddington , Computing Science Innovation Center, Samsung Research America
Changhui Lin , Computing Science Innovation Center, Samsung Research America
pp. 105-114

Energy-efficient acceleration of big data analytics applications using FPGAs (Abstract)

Katayoun Neshatpour , Department of Electrical and Computer Engineering, George Mason University
Maria Malik , Department of Electrical and Computer Engineering, George Mason University
Mohammad Ali Ghodrat , Computer Science Department, University of California Los Angeles
Avesta Sasan , Department of Electrical and Computer Engineering, George Mason University
Houman Homayoun , Department of Electrical and Computer Engineering, George Mason University
pp. 115-123

Workload scheduling in distributed stream processors using graph partitioning (Abstract)

Lorenz Fischer , Department of Informatics, University of Zurich, Switzerland
Abraham Bernstein , Department of Informatics, University of Zurich, Switzerland
pp. 124-133

Evaluating different distributed-cyber-infrastructure for data and compute intensive scientific application (Abstract)

Arghya Kusum Das , School of Electrical Engineering and Computer Science, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803
Seung-Jong Park , School of Electrical Engineering and Computer Science, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803
Jaeki Hong , Samsung Electronics Co., Ltd. 95, Samsung, 2-ro Giheung-gu, Yongin-si, Gyeonggi-do, 446711
Wooseok Chang , Samsung Electronics Co., Ltd. 95, Samsung, 2-ro Giheung-gu, Yongin-si, Gyeonggi-do, 446711
pp. 134-143

Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join (Abstract)

Vincenzo Gulisano , Chalmers University of Technology, Gothenburg, Sweden
Yiannis Nikolakopoulos , Chalmers University of Technology, Gothenburg, Sweden
Marina Papatriantafilou , Chalmers University of Technology, Gothenburg, Sweden
Philippas Tsigas , Chalmers University of Technology, Gothenburg, Sweden
pp. 144-153

When computing meets heterogeneous cluster: Workload assignment in graph computation (Abstract)

Jilong Xue , Computer Science Department, Peking University, Beijing, China
Zhi Yang , Computer Science Department, Peking University, Beijing, China
Shian Hou , Computer Science Department, Peking University, Beijing, China
Yafei Dai , Computer Science Department, Peking University, Beijing, China
pp. 154-163

A scalable parallel XQuery processor (Abstract)

E. Preston Carman , University of California, Riverside
Till Westmann , Couchbase
Vinayak R. Borkar , X15 Software, Inc.
Michael J. Carey , University of California, Irvine
Vassilis J. Tsotras , University of California, Riverside
pp. 164-173

Computing load aware and long-view load balancing for cluster storage systems (Abstract)

Guoxin Liu , Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29631, USA
Haiying Shen , Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29631, USA
Haoyu Wang , Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29631, USA
pp. 174-183

Towards green cloud computing: Demand allocation and pricing policies for cloud service brokerage (Abstract)

Chenxi Qiu , Department of Electrical and Computer Engineering, Clemson University, Clemson, USA
Haiying Shen , Department of Electrical and Computer Engineering, Clemson University, Clemson, USA
Liuhua Chen , Department of Electrical and Computer Engineering, Clemson University, Clemson, USA
pp. 203-212

Elastic complex event processing exploiting prediction (Abstract)

Nikos Zacheilas , Department of Informatics, Athens University of Economics and Business, Athens, Greece
Vana Kalogeraki , Department of Informatics, Athens University of Economics and Business, Athens, Greece
Nikolas Zygouras , Department of Informatics, University of Athens, Athens, Greece
Nikolaos Panagiotou , Department of Informatics, University of Athens, Athens, Greece
Dimitrios Gunopulos , Department of Informatics, University of Athens, Athens, Greece
pp. 213-222

PortHadoop: Support direct HPC data processing in Hadoop (Abstract)

Xi Yang , Department of Computer Science Illinois Institute of Technology, Chicago, IL
Ning Liu , Department of Computer Science Illinois Institute of Technology, Chicago, IL
Bo Feng , Department of Computer Science Illinois Institute of Technology, Chicago, IL
Xian-He Sun , Department of Computer Science Illinois Institute of Technology, Chicago, IL
Shujia Zhou , Northrop Grumman Corporation, USA
pp. 223-232

Machine learning at the limit (Abstract)

John Canny , UC Berkeley, Berkeley, CA 94720, USA
Huasha Zhao , UC Berkeley, Berkeley, CA 94720, USA
Bobby Jaros , Yahoo Research, 701 First Ave Sunnyvale, CA, 94089, USA
Ye Chen , Changning District Anhua Rd 492 C-2, Shanghai, China 200050
Jianchang Mao , Microsoft 1020 Enterprise Way Sunnyvale, CA 94089, USA
pp. 233-242

Performance characterization and acceleration of in-memory file systems for Hadoop and Spark applications on HPC clusters (Abstract)

Nusrat Sharmin Islam , Department of Computer Science and Engineering, The Ohio State University
Md. Wasi-ur-Rahman , Department of Computer Science and Engineering, The Ohio State University
Xiaoyi Lu , Department of Computer Science and Engineering, The Ohio State University
Dipti Shankar , Department of Computer Science and Engineering, The Ohio State University
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University
pp. 243-252

Panopticon: A lock broker architecture for scalable transactions in the datacenter (Abstract)

Serafettin Tasci , Computer Science & Engineering Department, University at Buffalo, SUNY
Murat Demirbas , Computer Science & Engineering Department, University at Buffalo, SUNY
pp. 253-262

Toward locality-aware scheduling for containerized cloud services (Abstract)

Dongfang Zhao , Cloud Management Services Department, IBM Almaden Research Center, San Jose, CA 95120, United States
Nagapramod Mandagere , Cloud Management Services Department, IBM Almaden Research Center, San Jose, CA 95120, United States
Gabriel Alatorre , Cloud Management Services Department, IBM Almaden Research Center, San Jose, CA 95120, United States
Mohamed Mohamed , Cloud Management Services Department, IBM Almaden Research Center, San Jose, CA 95120, United States
Heiko Ludwig , Cloud Management Services Department, IBM Almaden Research Center, San Jose, CA 95120, United States
pp. 263-270

ATOM: Automated tracking, orchestration and monitoring of resource usage in infrastructure as a service systems (Abstract)

Min Du , School of Computing, University of Utah
Feifei Li , School of Computing, University of Utah
pp. 271-278

Composable and efficient functional big data processing framework (Abstract)

Dongyao Wu , Software Systems Research Group, NICTA, Sydney, Australia
Sherif Sakr , Software Systems Research Group, NICTA, Sydney, Australia
Liming Zhu , Software Systems Research Group, NICTA, Sydney, Australia
Qinghua Lu , Software Systems Research Group, NICTA, Sydney, Australia
pp. 279-286

Hybrid active learning for non-stationary streaming data with asynchronous labeling (Abstract)

Hyunjoo Kim , Palo Alto Research Center, 800 Phillips Road, Webster, New York, USA
Sriganesh Madhvanath , Palo Alto Research Center, 800 Phillips Road, Webster, New York, USA
Tong Sun , Palo Alto Research Center, 800 Phillips Road, Webster, New York, USA
pp. 287-292

Octopus: A multi-job scheduler for Graphlab (Abstract)

Srikant Padala , Dept. of Computer Science & Engineering, IIT Madras Chennai, India
Dinesh Kumar , Dept. of Computer Science & Engineering, IIT Madras Chennai, India
Arun Raj , Dept. of Computer Science & Engineering, IIT Madras Chennai, India
Janakiram Dharanipragada , Dept. of Computer Science & Engineering, IIT Madras Chennai, India
pp. 293-298

G-Storm: GPU-enabled high-throughput online data processing in Storm (Abstract)

Zhenhua Chen , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, 13244
Jielong Xu , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, 13244
Jian Tang , Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, 13244
Kevin Kwiat , US Air Force Research Lab (AFRL), Rome, NY
Charles Kamhoua , US Air Force Research Lab (AFRL), Rome, NY
pp. 307-312

Chronos: Failure-aware scheduling in shared Hadoop clusters (Abstract)

Orcun Yildiz , Inria Rennes - Bretagne Atlantique Research Center, Rennes, France
Shadi Ibrahim , Inria Rennes - Bretagne Atlantique Research Center, Rennes, France
Tran Anh Phuong , Inria Rennes - Bretagne Atlantique Research Center, Rennes, France
Gabriel Antoniu , Inria Rennes - Bretagne Atlantique Research Center, Rennes, France
pp. 313-318

An architecture for stream OLAP exploiting SPE and OLAP engine (Abstract)

Kosuke Nakabasami , Graduate School of Systems and Information Engineering, University of Tsukuba
Toshiyuki Amagasa , Faculty of Engineering, Information and Systems, University of Tsukuba
Salman Ahmed Shaikh , Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8573, Japan
Franck Gass , Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba 305-8573, Japan
Hiroyuki Kitagawa , Faculty of Engineering, Information and Systems, University of Tsukuba
pp. 319-326

Two-mode data distribution scheme for heterogeneous storage in data centers (Abstract)

Wei Xie , Department of Computer Science, Texas Tech University, Lubbock, TX 79413
Jiang Zhou , Department of Computer Science, Texas Tech University, Lubbock, TX 79413
Mark Reyes , Department of Computer Science, Texas Tech University, Lubbock, TX 79413
Jason Noble , Nimboxx, Inc., B #150, 1825 Kramer Ln, Austin, TX 78758
Yong Chen , Department of Computer Science, Texas Tech University, Lubbock, TX 79413
pp. 327-332

A predictive scheduling framework for fast and distributed stream data processing (Abstract)

Teng Li , Department of Electrical Engineering and Computer Science at Syracuse University
Jian Tang , Department of Electrical Engineering and Computer Science at Syracuse University
Jielong Xu , Department of Electrical Engineering and Computer Science at Syracuse University
pp. 333-338

A scalable implementation of information theoretic feature selection for high dimensional data (Abstract)

Anthony Kleerekoper , School of Computing, Mathematics and Digital Technologies, Manchester Metropolitan University, UK
Michael Pappas , School of Computer Science, The University of Manchester, UK
Adam Pocock , Oracle Labs, Burlington, MA
Gavin Brown , School of Computer Science, The University of Manchester, UK
Mikel Lujan , School of Computer Science, The University of Manchester, UK
pp. 339-346

Edge importance identification for energy efficient graph processing (Abstract)

S M Faisal , Dept. of CSE, The Ohio State University, Columbus, OH, USA
G. Tziantzioulis , Dept of EECS, Northwestern University, Evanston, IL, USA
A. M. Gok , Dept of EECS, Northwestern University, Evanston, IL, USA
N. Hardavellas , Dept of EECS, Northwestern University, Evanston, IL, USA
S. Ogrenci-Memik , Dept of EECS, Northwestern University, Evanston, IL, USA
S. Parthasarathy , Dept. of CSE, The Ohio State University, Columbus, OH, USA
pp. 347-354

Regular expression acceleration on the micron automata processor: Brill tagging as a case study (Abstract)

Keira Zhou , University of Virginia Charlottesville, VA 22904 USA
Jack Wadden , University of Virginia Charlottesville, VA 22904 USA
Jeffrey J. Fox , University of Virginia Charlottesville, VA 22904 USA
Ke Wang , University of Virginia Charlottesville, VA 22904 USA
Donald E. Brown , University of Virginia Charlottesville, VA 22904 USA
Kevin Skadron , University of Virginia Charlottesville, VA 22904 USA
pp. 355-360

Parallel in-memory trajectory-based spatiotemporal topological join (Abstract)

Suprio Ray , Department of Computer Science, University of Toronto
Angela Demke Brown , Department of Computer Science, University of Toronto
Nick Koudas , Department of Computer Science, University of Toronto
Rolando Blanco , SAP Canada, Waterloo
Anil K. Goel , SAP Canada, Waterloo
pp. 361-370

Spatially clustered join on heterogeneous scientific data sets (Abstract)

Bin Dong , Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720
Surendra Byna , Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720
Kesheng Wu , Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720
pp. 371-380

Recommending missing sensor values (Abstract)

Chung-Yi Li , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Wei-Lun Su , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Todd G. McKenzie , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Fu-Chun Hsu , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Shou-De Lin , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Jane Yung-jen Hsu , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Phillip B. Gibbons , Carnegie Mellon University, Pittsburgh, USA
pp. 381-390

The roles of network communities in social information diffusion (Abstract)

Cheng-Te Li , Research Center for IT Innovation, Academia Sinica, Taiwan
Yu-Jen Lin , Institute of Information Science Academia Sinica, Taiwan
Mi-Yen Yeh , Institute of Information Science Academia Sinica, Taiwan
pp. 391-400

Big data entity resolution: From highly to somehow similar entity descriptions in the Web (Abstract)

Vasilis Efthymiou , University of Crete, Greece
Kostas Stefanidis , ICS-FORTH, Greece
Vassilis Christophides , University of Crete, Greece
pp. 401-410

Parallel meta-blocking: Realizing scalable entity resolution over large, heterogeneous data (Abstract)

Vasilis Efthymiou , ICS-FORTH, Greece
George Papadakis , University of Athens, Greece
George Papastefanatos , Athena Research Center, Greece
Kostas Stefanidis , ICS-FORTH, Greece
Themis Palpanas , Paris Descartes University, France
pp. 411-420

Slingshot: A modular framework for designing data processing systems (Abstract)

Bogdan Simion , Department of Computer Science, University of Toronto
Daniel N. Ilha , Department of Computer Science, University of Toronto
Suprio Ray , Department of Computer Science, University of Toronto
Leslie Barron , Department of Computer Science, University of Toronto
Angela Demke Brown , Department of Computer Science, University of Toronto
Ryan Johnson , Department of Computer Science, University of Toronto
pp. 421-430

TrustMR: Computation integrity assurance system for MapReduce (Abstract)

Huseyin Ulusoy , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
Murat Kantarcioglu , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
Erman Pattuk , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
pp. 441-450

AccountableMR: Toward accountable MapReduce systems (Abstract)

Huseyin Ulusoy , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
Murat Kantarcioglu , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
Erman Pattuk , The University of Texas at Dallas, 800 W. Campbell Rd, Richardson TX, 75080
Lalana Kagal , Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139
pp. 451-460

TKSimGPU: A parallel top-K trajectory similarity query processing algorithm for GPGPUs (Abstract)

Eleazar Leal , School of Computer Science, University of Oklahoma, Norman, OK 73019, USA
Le Gruenwald , School of Computer Science, University of Oklahoma, Norman, OK 73019, USA
Jianting Zhang , Dept. of Computer Science, City College of New York, New York City, NY 10031, USA
Simin You , Dept. of Computer Science, CUNY Graduate Center, New York City, NY 10016, USA
pp. 461-469

A transaction model for management of replicated data with multiple consistency levels (Abstract)

Anand Tripathi , Department of Computer Science University of Minnesota, Minneapolis, Minnesota, 55455 USA
Bhagavathi Dhass Thirunavukarasu , Department of Computer Science University of Minnesota, Minneapolis, Minnesota, 55455 USA
pp. 470-477

Quadtree-based lightweight data compression for large-scale geospatial rasters on multi-core CPUs (Abstract)

Jianting Zhang , Dept. of Computer Science, The City College of New York, New York, NY, USA
Simin You , Dept. of Computer Science, CUNY Graduate Center, New York, NY, USA
Le Gruenwald , Dept. of Computer Science, The University of Oklahoma, Norman, OK, USA
pp. 478-484

DSDQuery DSI - Querying scientific data repositories with structured operators (Abstract)

Roee Ebenstein , Department of Computer Science and Engineering, The Ohio State University
Gagan Agrawal , Department of Computer Science and Engineering, The Ohio State University
pp. 485-492

Brown Dog: Leveraging everything towards autocuration (Abstract)

Smruti Padhy , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Greg Jansen , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Jay Alameda , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Edgar Black , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Liana Diesendruck , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Mike Dietze , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Praveen Kumar , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Rob Kooper , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Jong Lee , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Rui Liu , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Richard Marciano , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Luigi Marini , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Dave Mattson , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Barbara Minsker , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Chris Navarro , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Marcus Slavenas , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
William Sullivan , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Jason Votava , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Inna Zharnitsky , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
Kenton McHenry , National Center for Supercomputing Applications University of Illinois at Urbana-Champaign
pp. 493-500

Cost-efficient partitioning of spatial data on cloud (Abstract)

Afsin Akdogan , Computer Science Dept. University of Southern California Los Angeles, CA, USA
Saratchandra Indrakanti , eBay Inc., San Jose, CA, USA
Ugur Demiryurek , Computer Science Dept. University of Southern California Los Angeles, CA, USA
Cyrus Shahabi , Computer Science Dept. University of Southern California Los Angeles, CA, USA
pp. 501-506

BigFUN: A performance study of big data management system functionality (Abstract)

Pouria Pirzadeh , University of California, Irvine, Irvine, USA
Michael J. Carey , University of California, Irvine, Irvine, USA
Till Westmann , Couchbase, Mountain View, USA
pp. 507-514

A flexible QoS fortified distributed key-value storage system for the cloud (Abstract)

Tonglin Li , Illinois Institute of Technology
Ke Wang , Intel
Dongfang Zhao , Illinois Institute of Technology
Kan Qiao , Google
Iman Sadooghi , Illinois Institute of Technology
Xiaobing Zhou , Hortonworks
Ioan Raicu , Illinois Institute of Technology
pp. 515-522

TPS: A task placement strategy for big data workflows (Abstract)

Mahdi Ebrahimi , Wayne State University Detroit, U.S.A.
Aravind Mohan , Wayne State University Detroit, U.S.A.
Shiyong Lu , Wayne State University Detroit, U.S.A.
Robert Reynolds , Wayne State University Detroit, U.S.A.
pp. 523-530

Improving transaction processing performance by consensus reduction (Abstract)

Yuqing Zhu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Yilei Wang , Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
pp. 531-538

Benchmarking key-value stores on high-performance storage and interconnects for web-scale workloads (Abstract)

Dipti Shankar , Department of Computer Science and Engineering, The Ohio State University
Xiaoyi Lu , Department of Computer Science and Engineering, The Ohio State University
Md. Wasi-ur-Rahman , Department of Computer Science and Engineering, The Ohio State University
Nusrat Islam , Department of Computer Science and Engineering, The Ohio State University
Dhabaleswar K. Panda , Department of Computer Science and Engineering, The Ohio State University
pp. 539-544

Bandwidth-efficient distributed k-nearest-neighbor search with dynamic time warping (Abstract)

Chin-Chi Hsu , Institute of Information Science, Academia Sinica, Taiwan
Perng-Hwa Kung , The Media Lab, Massachusetts Institute of Technology
Mi-Yen Yeh , Institute of Information Science, Academia Sinica, Taiwan
Shou-De Lin , Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
Phillip B. Gibbons , Computer Science Department and Electrical & Computer Engineering Department, Carnegie Mellon University
pp. 551-560

Dynamic theme tracking in Twitter (Abstract)

Liang Zhao , Virginia Tech
Feng Chen , University at Albany, SUNY
Chang-Tien Lu , Virginia Tech
Naren Ramakrishnan , Virginia Tech
pp. 561-570

SyntacticDiff: Operator-based transformation for comparative text mining (Abstract)

Sean Massung , Department of Computer Science, College of Engineering University of Illinois at Urbana-Champaign
ChengXiang Zhai , Department of Computer Science, College of Engineering University of Illinois at Urbana-Champaign
pp. 571-580

Visual analysis of bi-directional movement behavior (Abstract)

Yixian Zheng , Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Wenchao Wu , Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Huamin Qu , Department of Computer Science and Engineering, Hong Kong University of Science and Technology
Chunyan Ma , Department of Software Engineering, Northwestern Polytechnical University
Lionel M. Ni , Department of Computer and Information Science, University of Macau
pp. 581-590

User-curated image collections: Modeling and recommendation (Abstract)

Yuncheng Li , University of Rochester, Department of Computer Science, Rochester, New York 14627, USA
Tao Mei , Microsoft Research, Building 2, No. 5 Dan Ling Street, Haidian District, Beijing 100080, China
Yang Cong , Chinese Academy of Sciences, Shenyang Institute of Automation, State Key Laboratory of Robotics
Jiebo Luo , University of Rochester, Department of Computer Science, Rochester, New York 14627, USA
pp. 591-600

Angular quantization based affinity propagation clustering and its application to astronomical big spectra data (Abstract)

Ke Wang , School of Computer Science and Technology Beijing Institute of Technology, Beijing 100081, P. R. China
Ping Guo , School of Computer Science and Technology Beijing Institute of Technology, Beijing 100081, P. R. China
A-Li Luo , Key Laboratory of Optical Astronomy National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, P. R. China
pp. 601-608

Scalable classification for large dynamic networks (Abstract)

Yibo Yao , School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
Lawrence B. Holder , School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164
pp. 609-618

CINTIA: A distributed, low-latency index for big interval data (Abstract)

Ruslan Mavlyutov , eXascale Infolab, U. of Fribourg-Switzerland
Philippe Cudre-Mauroux , eXascale Infolab, U. of Fribourg-Switzerland
pp. 619-628

Revealing the fog-of-war: A visualization-directed, uncertainty-aware approach for exploring high-dimensional data (Abstract)

Yang Wang , Computer Science Department, University of California, Davis, Davis, USA
Kwan-Liu Ma , Computer Science Department, University of California, Davis, Davis, USA
pp. 629-638

Inferring crowd-sourced venues for tweets (Abstract)

Bokai Cao , Department of Computer Science, University of Illinois at Chicago, IL, USA
Francine Chen , FX Palo Alto Laboratory, Palo Alto, CA, USA
Dhiraj Joshi , FX Palo Alto Laboratory, Palo Alto, CA, USA
Philip S. Yu , Department of Computer Science, University of Illinois at Chicago, IL, USA
pp. 639-648

Core decomposition in large temporal graphs (Abstract)

Huanhuan Wu , Department of Computer Science and Engineering, The Chinese University of Hong Kong
James Cheng , Department of Computer Science and Engineering, The Chinese University of Hong Kong
Yi Lu , Department of Computer Science and Engineering, The Chinese University of Hong Kong
Yiping Ke , School of Computer Engineering, Nanyang Technological University
Yuzhen Huang , Department of Computer Science, Sun Yat-sen University
Da Yan , Department of Computer Science and Engineering, The Chinese University of Hong Kong
Hejun Wu , Department of Computer Science and Engineering, The Chinese University of Hong Kong
pp. 649-658

Recommending forum posts to designated experts (Abstract)

Jason H.D. Cho , University of Illinois at Urbana-Champaign
Yanen Li , LinkedIn Inc.
Roxana Girju , University of Illinois at Urbana-Champaign
ChengXiang Zhai , University of Illinois at Urbana-Champaign
pp. 659-666

Accelerating collaborative filtering using concepts from high performance computing (Abstract)

Mark Gates , Innovative Computing Lab University of Tennessee Knoxville, USA
Hartwig Anzt , Innovative Computing Lab University of Tennessee Knoxville, USA
Jakub Kurzak , Innovative Computing Lab University of Tennessee Knoxville, USA
Jack Dongarra , Innovative Computing Lab University of Tennessee Knoxville, USA
pp. 667-676

Modelling cascades over time in microblogs (Abstract)

Wei Xie , Living Analytics Research Centre, Singapore Management University
Feida Zhu , Living Analytics Research Centre, Singapore Management University
Siyuan Liu , Smeal College of Business, Penn State University
Ke Wang , Simon Fraser University, Singapore Management University
pp. 677-686

CSFinder: A cold-start friend finder in large-scale social networks (Abstract)

Yasser Salem , School of Electronics, Electrical Engineering and Computer Science Queen's University Belfast, Belfast BT7 1NN, UK
Jun Hong , School of Electronics, Electrical Engineering and Computer Science Queen's University Belfast, Belfast BT7 1NN, UK
Weiru Liu , School of Electronics, Electrical Engineering and Computer Science Queen's University Belfast, Belfast BT7 1NN, UK
pp. 687-696

Effectively crowdsourcing the acquisition and analysis of visual data for disaster response (Abstract)

Hien To , Integrated Media Systems Center, University of Southern California, Los Angeles, CA, U.S.A
Seon Ho Kim , Integrated Media Systems Center, University of Southern California, Los Angeles, CA, U.S.A
Cyrus Shahabi , Integrated Media Systems Center, University of Southern California, Los Angeles, CA, U.S.A
pp. 697-706

Full diffusion history reconstruction in networks (Abstract)

Zhen Chen , School of Electrical, Computer and Energy Engineering, Arizona State University Tempe, Arizona, 85281
Hanghang Tong , School of Computing, Informatics and Decision Systems Engineering, Arizona State University Tempe, Arizona, 85281
Lei Ying , School of Electrical, Computer and Energy Engineering, Arizona State University Tempe, Arizona, 85281
pp. 707-716

AdaM: An adaptive monitoring framework for sampling and filtering on IoT devices (Abstract)

Demetris Trihinas , Department of Computer Science, University of Cyprus
George Pallis , Department of Computer Science, University of Cyprus
Marios D. Dikaiakos , Department of Computer Science, University of Cyprus
pp. 717-726

Modeling graphs using a mixture of Kronecker models (Abstract)

Suchismit Mahapatra , Computer Science and Engineering, State University of New York at Buffalo
Varun Chandola , Computer Science and Engineering, State University of New York at Buffalo
pp. 727-736

Data quality assessment and anomaly detection via map/reduce and linked data: A case study in the medical domain (Abstract)

Stephen Bonner , School of Engineering and Computing Sciences, Durham University, Durham, UK
Andrew Stephen McGough , School of Engineering and Computing Sciences, Durham University, Durham, UK
Ibad Kureshi , School of Engineering and Computing Sciences, Durham University, Durham, UK
John Brennan , School of Engineering and Computing Sciences, Durham University, Durham, UK
Georgios Theodoropoulos , School of Engineering and Computing Sciences, Durham University, Durham, UK
Laura Moss , University of Glasgow, Glasgow, UK
David Corsar , University of Aberdeen, Aberdeen, UK
Grigoris Antoniou , University of Huddersfield, Huddersfield UK
pp. 737-746

SigCO: Mining significant correlations via a distributed real-time computation engine (Abstract)

Tian Guo , EPFL, Lausanne, Switzerland
Jean-Paul Calbimonte , EPFL, Lausanne, Switzerland
Hao Zhuang , EPFL, Lausanne, Switzerland
Karl Aberer , EPFL, Lausanne, Switzerland
pp. 747-756

Identifying smallest unique subgraphs in a heterogeneous social network (Abstract)

Yen-Kai Wang , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Wei-Ming Chen , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Cheng-Te Li , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Shou-De Lin , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
pp. 757-766

Toward precise user-topic alignment in online social media (Abstract)

Jiejun Xu , HRL Laboratories, LLC, Malibu, USA
Tsai-Ching Lu , HRL Laboratories, LLC, Malibu, USA
pp. 767-775

Visual interface for exploring caution spots from vehicle recorder big data (Abstract)

Masahiko Itoh , The University of Tokyo, National Institute of Information and Communications Technology
Daisaku Yokoyama , The University of Tokyo
Masashi Toyoda , The University of Tokyo
Masaru Kitsuregawa , National Institute of Informatics, The University of Tokyo
pp. 776-784

Learning relevance from click data via neural network based similarity models (Abstract)

Xugang Ye , Microsoft, Bellevue, WA, USA
Zijie Qi , Microsoft, Bellevue, WA, USA
Dan Massey , Microsoft, Redmond, WA, USA
pp. 801-806

Matisse: A visual analytics system for exploring emotion trends in social media text streams (Abstract)

Chad A. Steed , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Margaret Drouhard , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Justin Beaver , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Joshua Pyle , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Paul L. Bogen , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
pp. 807-814

Robust crowd bias correction via dual knowledge transfer from multiple overlapping sources (Abstract)

Sihong Xie , Department of Computer Science University of Illinois at Chicago, Chicago, IL, USA
Qingbo Hu , Department of Computer Science University of Illinois at Chicago, Chicago, IL, USA
Jingyuan Zhang , Department of Computer Science University of Illinois at Chicago, Chicago, IL, USA
Jing Gao , Department of Computer Science, University at Buffalo, Buffalo, NY, USA
Wei Fan , Baidu Research Big Data Lab, Sunnyvale, CA, USA
Philip S. Yu , Department of Computer Science University of Illinois at Chicago, Chicago, IL, USA
pp. 815-820

A community driven social recommendation system (Abstract)

Deepika Lalwani , Dept. of Computer Science and Engineering, National Institute of Technology, Warangal, India
D. V. L. N. Somayajulu , Dept. of Computer Science and Engineering, National Institute of Technology, Warangal, India
P. Radha Krishna , Infosys Limited, Hyderabad, India
pp. 821-826

Task-based recommendation on a web-scale (Abstract)

Yongfeng Zhang , Department of Computer Science & Technology, Tsinghua University, Beijing, 100084, China
Min Zhang , Department of Computer Science & Technology, Tsinghua University, Beijing, 100084, China
Yiqun Liu , Department of Computer Science & Technology, Tsinghua University, Beijing, 100084, China
Chua Tat-Seng , School of Computing, National University of Singapore (NUS), 117417, Singapore
Yi Zhang , School of Engineering, University of California, Santa Cruz, CA 95060, USA
Shaoping Ma , Department of Computer Science & Technology, Tsinghua University, Beijing, 100084, China
pp. 827-836

Multi-modal learning for video recommendation based on mobile application usage (Abstract)

Xiaowei Jia , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
Aosen Wang , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
Xiaoyi Li , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
Guangxu Xun , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
Wenyao Xu , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
Aidong Zhang , School of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, USA, 14260-1660
pp. 837-842

Improving EEG feature learning via synchronized facial video (Abstract)

Xiaoyi Li , Dept. of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, U.S.A.
Xiaowei Jia , Dept. of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, U.S.A.
Guangxu Xun , Dept. of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, U.S.A.
Aidong Zhang , Dept. of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY, U.S.A.
pp. 843-848

MMC-margin: Identification of maximum frequent subgraphs by metropolis Monte Carlo sampling (Abstract)

Muyi Liu , Purdue University, Dept. of Biological Sciences, 240 S. Martin Jischke Drive, West Lafayette, IN, USA 47907-1971
Michael Gribskov , Purdue University, Depts. of Biological Sciences and Computer Science (by courtesy), 240 S. Martin Jischke Drive, West Lafayette, IN, USA 47907-1971
pp. 849-856

KeyLabel algorithms for keyword search in large graphs (Abstract)

Yue Wang , School of Computing Science, Simon Fraser University
Ke Wang , School of Computing Science, Simon Fraser University
Ada Wai-Chee Fu , Department of Computer Science and Engineering, The Chinese University of Hong Kong
Raymond Chi-Wing Wong , Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
pp. 857-864

Spatio-temporal asynchronous co-occurrence pattern for big climate data towards long-lead flood prediction (Abstract)

Chung-Hsien Yu , Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125
Dong Luo , Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125
Wei Ding , Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125
Joseph Cohen , Department of Computer Science, University of Massachusetts Boston, Boston, MA 02125
David Small , Department of Civil and Environmental Engineering, Tufts University, Medford, MA 02155
Shafiqul Islam , Department of Civil and Environmental Engineering, Tufts University, Medford, MA 02155
pp. 865-870

Using big data to study the link between human mobility and socio-economic development (Abstract)

Luca Pappalardo , Department of Computer Science, University of Pisa, Italy
Dino Pedreschi , Department of Computer Science, University of Pisa, Italy
Zbigniew Smoreda , SENSE Orange Lab, France
Fosca Giannotti , Institute of Information Science and Technologies, National Research Council (CNR), Italy
pp. 871-878

Cluster-based aggregate forecasting for residential electricity demand using smart meter data (Abstract)

Tri Kurniawan Wijaya , School of Computer and Communication Sciences, EPFL, Switzerland
Matteo Vasirani , School of Computer and Communication Sciences, EPFL, Switzerland
Samuel Humeau , School of Computer and Communication Sciences, EPFL, Switzerland
Karl Aberer , School of Computer and Communication Sciences, EPFL, Switzerland
pp. 879-887

A scalable approach for data-driven taxi ride-sharing simulation (Abstract)

Masayo Ota , Center for Urban Science and Progress, New York University
Huy Vo , Center for Urban Science and Progress, New York University
Claudio Silva , Center for Urban Science and Progress, New York University
Juliana Freire , Center for Urban Science and Progress, New York University
pp. 888-897

EveryoneCounts: Data-driven digital advertising with uncertain demand model in metro networks (Abstract)

Desheng Zhang , University of Minnesota, USA
Ruobing Jiang , Shanghai JiaoTong University, China
Shuai Wang , University of Minnesota, USA
Yanmin Zhu , Shanghai JiaoTong University, China
Bo Yang , Shanghai JiaoTong University, China
Jian Cao , Shanghai JiaoTong University, China
Fan Zhang , SIAT, China
Tian He , Shanghai JiaoTong University, China
pp. 898-907

Fast decentralized gradient descent method and applications to in-situ seismic tomography (Abstract)

Liang Zhao , Department of Computer Science, Georgia State University, GA USA
Wen-Zhan Song , Department of Computer Science, Georgia State University, GA USA
Xiaojing Ye , Department of Mathematics & Statistics, Georgia State University, GA USA
pp. 908-917

Scientific computing meets big data technology: An astronomy use case (Abstract)

Zhao Zhang , AMPLab, University of California, Berkeley
Kyle Barbary , Berkeley Institute for Data Science, University of California, Berkeley
Frank Austin Nothaft , AMPLab, University of California, Berkeley
Evan Sparks , AMPLab, University of California, Berkeley
Oliver Zahn , Berkeley Center for Cosmological Physics, University of California, Berkeley
Michael J. Franklin , AMPLab, University of California, Berkeley
David A. Patterson , AMPLab, University of California, Berkeley
Saul Perlmutter , Berkeley Institute for Data Science, University of California, Berkeley
pp. 918-927

An interactive learning framework for scalable classification of pathology images (Abstract)

Michael Nalisnik , Department of Computer Science and Mathematics, Emory University, Emory University School of Medicine, Atlanta, GA 30322
David A Gutman , Department of Neurology, Emory University, Emory University School of Medicine, Atlanta, GA 30322
Jun Kong , Departments of Biomedical Informatics Emory University School of Medicine / Georgia Institute of Technology, Atlanta, GA 30322
Lee A D Cooper , Departments of Biomedical Informatics Emory University School of Medicine / Georgia Institute of Technology, Atlanta, GA 30322
pp. 928-935

America Tweets China: A fine-grained analysis of the state and individual characteristics regarding attitudes towards China (Abstract)

Yu Wang , Department of Political Science, University of Rochester, Rochester, NY, 14627, USA
Jianbo Yuan , Department of Computer Science, University of Rochester, Rochester, NY, 14627, USA
Jiebo Luo , Department of Computer Science, University of Rochester, Rochester, NY, 14627, USA
pp. 936-943

A data-driven approach to extract connectivity structures from diffusion tensor imaging data (Abstract)

Yu Jin , Institute for Advanced Computer Studies and Department of Electrical and Computer Engineering, University of Maryland, College Park, USA
Joseph F. JaJa , Institute for Advanced Computer Studies and Department of Electrical and Computer Engineering, University of Maryland, College Park, USA
Rong Chen , Department of Radiology, University of Maryland, Baltimore, Baltimore, USA
Edward H. Herskovits , Department of Radiology, University of Maryland, Baltimore, Baltimore, USA
pp. 944-951

A MapReduce based k-NN joins probabilistic classifier (Abstract)

Georgios Chatzigeorgakidis , University of Péloponnèse, Department of Informatics and Telecommunications, Tripolis, Greece
Sophia Karagiorgou , R.C. ATHENA, Institute for the Management of Information Systems, Athens, Greece
Spiros Athanasiou , R.C. ATHENA, Institute for the Management of Information Systems, Athens, Greece
Spiros Skiadopoulos , University of Péloponnèse, Department of Informatics and Telecommunications, Tripolis, Greece
pp. 952-957

Scalable k-NN based text clustering (Abstract)

Alessandro Lulli , University of Pisa, Italy
Thibault Debatty , Royal Military Academy, Brussels, Belgium
Matteo Dell'Amico , Symantec Research Labs
Pietro Michiardi , EURECOM, Campus SophiaTech, France
Laura Ricci , University of Pisa, Italy
pp. 958-963

An ensemble learning based approach for building airfare forecast service (Abstract)

Yuwen Chen , Shanghai Jiao Tong University, Shanghai, China
Jian Cao , Shanghai Jiao Tong University, Shanghai, China
Shanshan Feng , Shanghai Jiao Tong University, Shanghai, China
Yudong Tan , Ctrip.com, Shanghai, China
pp. 964-969

Next-term student grade prediction (Abstract)

Mack Sweeney , George Mason University, Fairfax, VA, United States
Jaime Lester , George Mason University, Fairfax, VA, United States
Huzefa Rangwala , George Mason University, Fairfax, VA, United States
pp. 970-975

Predicting the location of users on Twitter from low density graphs (Abstract)

Sofia Apreleva , SeaChange International, Santa Monica, CA
Alejandro Cantarero , SeaChange International, Santa Monica, CA
pp. 976-983

How not to drown in a sea of information: An event recognition approach (Abstract)

Elias Alevizos , Institute of Informatics & Telecommunications, NCSR Demokritos, Athens, Greece
Alexander Artikis , Department of Maritime Studies, University of Piraeus, Greece
Kostas Patroumpas , Department of Informatics, University of Piraeus, Greece
Marios Vodas , Department of Informatics, University of Piraeus, Greece
Yannis Theodoridis , Department of Informatics, University of Piraeus, Greece
Nikos Pelekis , Department of Statistics & Insurance Science, University of Piraeus, Greece
pp. 984-990

Smog disaster forecasting using social web data and physical sensor data (Abstract)

Jiaoyan Chen , College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Huajun Chen , College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Daning Hu , Department of Informatics, University of Zurich, Zurich, Switzerland
Jeff Z. Pan , Department of Computing Science, The University of Aberdeen, Aberdeen, United Kingdom
Yalin Zhou , College of Computer Science and Technology, Zhejiang University, Hangzhou, China
pp. 991-998

Large scale support vector regression for aviation safety (Abstract)

Kamalika Das , UARC, NASA Ames Research Center, Moffett Field, CA 94035
Kanishka Bhaduri , Intuit Inc., 2700 Coast Ave., Mountain View, CA 94043
Bryan L. Matthews , SGT Inc., NASA Ames Research Center Moffett Field, CA 94035
Nikunj C. Oza , NASA Ames Research Center Moffett Field, CA 94035
pp. 999-1006

City users' classification with mobile phone data (Abstract)

Lorenzo Gabrielli , Dep. of Information Engineering, University of Pisa - Italy
Barbara Furletti , ISTI - CNR, Pisa - Italy
Roberto Trasarti , ISTI - CNR, Pisa - Italy
Fosca Giannotti , ISTI - CNR, Pisa - Italy
Dino Pedreschi , Dep. of Computer Science, University of Pisa - Italy
pp. 1007-1012

Traffic forecasting in complex urban networks: Leveraging big data and machine learning (Abstract)

Florin Schimbinschi , Department of Computing and Information Systems, The University of Melbourne
Xuan Vinh Nguyen , Department of Computing and Information Systems, The University of Melbourne
James Bailey , Department of Computing and Information Systems, The University of Melbourne
Chris Leckie , Department of Computing and Information Systems, The University of Melbourne
Hai Vu , Department of Computing and Information Systems, The University of Melbourne
Rao Kotagiri , Department of Computing and Information Systems, The University of Melbourne
pp. 1019-1024

Prediction of physiological subsystem failure and its impact in the prediction of patient mortality (Abstract)

Karla Caballero Barajas , University of California Santa Cruz, Santa Cruz, USA
Ram Akella , University of California Berkeley, Santa Cruz, USA
pp. 1025-1030

Efficient distributed maximum matching for solving the container exchange problem in the maritime industry (Abstract)

Fei Shao , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Li-Yung Ho , Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Jan-Jan Wu , Institute of Information Science, Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
Pangfeng Liu , Department of Computer Science and Information Engineering, Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan
pp. 1031-1036

Cell analytics in compound hit selection of bacterial inhibitors (Abstract)

Robert P. Trevino , School of Computing, Informatics, and Decision Systems Engineering Arizona State University, Tempe, Arizona 85281, USA
Steve A. Kawamoto , UES, Inc., Dayton, Ohio 45432, USA
Thomas J. Lamkin , Air Force Research Laboratory, 711th HPW/RHXBC, Wright Patterson Air Force Base, Dayton, OH 45433, USA
Huan Liu , School of Computing, Informatics, and Decision Systems Engineering Arizona State University, Tempe, Arizona 85281, USA
pp. 1037-1042

Mining target users for online marketing based on App Store data (Abstract)

Xiuqiang He , Noah's Ark Lab, Huawei, Hong Kong
Wenyuan Dai , Noah's Ark Lab, Huawei, Hong Kong
Guoxiang Cao , Noah's Ark Lab, Huawei, Hong Kong
Ruiming Tang , Noah's Ark Lab, Huawei, Hong Kong
Mingxuan Yuan , Noah's Ark Lab, Huawei, Hong Kong
Qiang Yang , The Hong Kong University of Science and Technology
pp. 1043-1052

Scalable community discovery from multi-faceted graphs (Abstract)

Ahmed Metwally , Google Inc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043
Jia-Yu Pan , Google Inc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043
Minh Doan , Google Inc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043
Christos Faloutsos , Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213
pp. 1053-1062

Towards real-time customer experience prediction for telecommunication operators (Abstract)

Ernesto Diaz-Aviles , IBM Research - Ireland
Fabio Pinelli , IBM Research - Ireland
Karol Lynch , IBM Research - Ireland
Zubair Nabi , IBM Research - Ireland
Yiannis Gkoufas , IBM Research - Ireland
Eric Bouillet , IBM Research - Ireland
Francesco Calabrese , IBM Research - Ireland
Eoin Coughlan , IBM Now Factory - Ireland
Peter Holland , IBM Now Factory - Ireland
Jason Salzwedel , Vodacom - South Africa
pp. 1063-1072

Early experience with optimizing I/O performance using high-performance SSDs for in-memory cluster computing (Abstract)

I. Stephen Choi , Memory Solutions Lab., Samsung Semiconductor Inc., San Jose, CA
Weiqing Yang , School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
Yang-Suk Kee , Memory Solutions Lab., Samsung Semiconductor Inc., San Jose, CA
pp. 1073-1083

An evaluation of alternative shared-nothing architecture for analytical processing systems (Abstract)

Hyunsik Choi , Gruter Inc.
Jongyoung Park , Gruter Inc.
Yong In Lee , SW R&D Center, Device Solutions, Samsung Electronics Co., Ltd.
Kangho Roh , SW R&D Center, Device Solutions, Samsung Electronics Co., Ltd.
Kwanghyun La , SW R&D Center, Device Solutions, Samsung Electronics Co., Ltd.
pp. 1084-1093

Controlled experiments for decision-making in e-Commerce search (Abstract)

Anjan Goswami , WalmartLabs, 860 W California Ave, Sunnyvale, CA 94085
Wei Han , WalmartLabs, 860 W California Ave, Sunnyvale, CA 94085
Zhenrui Wang , WalmartLabs, 860 W California Ave, Sunnyvale, CA 94085
Angela Jiang , WalmartLabs, 860 W California Ave, Sunnyvale, CA 94085
pp. 1094-1102

Semantics for Big Data access & integration: Improving industrial equipment design through increased data usability (Abstract)

Jenny Weisenberg Williams , Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
Paul Cuddihy , Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
Justin McHugh , Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
Kareem S. Aggour , Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
Arvind Menon , Combustion Control and Methods, GE Power & Water, Greenville, SC 29615 USA
Steven M. Gustafson , Knowledge Discovery Lab, GE Global Research, Niskayuna, NY 12309 USA
Timothy Healy , Combustion Control and Methods, GE Power & Water, Greenville, SC 29615 USA
pp. 1103-1112

Batch-mode active learning for technology-assisted review (Abstract)

Tanay Kumar Saha , Department of Computer Science, Indiana University Purdue University Indianapolis, IN
Mohammad Al Hasan , Department of Computer Science, Indiana University Purdue University Indianapolis, IN
Chandler Burgess , iControl ESI®, 16479 N. Dallas Parkway Addison, TX, 75001
Md Ahsan Habib , iControl ESI®, 16479 N. Dallas Parkway Addison, TX, 75001
Jeff Johnson , iControl ESI®, 16479 N. Dallas Parkway Addison, TX, 75001
pp. 1134-1143

A pipeline for extracting and deduplicating domain-specific knowledge bases (Abstract)

Mayank Kejriwal , The University of Texas at Austin
Qiaoling Liu , CareerBuilder LLC, 5550-A Peachtree Parkway, Norcross, GA 30092
Ferosh Jacob , CareerBuilder LLC, 5550-A Peachtree Parkway, Norcross, GA 30092
Faizan Javed , CareerBuilder LLC, 5550-A Peachtree Parkway, Norcross, GA 30092
pp. 1144-1153

EXOS: Expansion on session for enhancing effectiveness of query auto-completion (Abstract)

Fang-Hsiang Su , Columbia University, New York, NY USA
Manas Somaiya , eBay Inc., San Jose, CA USA
Shrish Mishra , eBay Inc., San Jose, CA USA
Rajyashree Mukherjee , eBay Inc., San Jose, CA USA
pp. 1154-1163

ADMM based scalable machine learning on Spark (Abstract)

Sauptik Dhar , Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304, USA
Congrui Yi , Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242, USA
Naveen Ramakrishnan , Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304, USA
Mohak Shah , Research and Technology Center, Robert Bosch LLC, Palo Alto, CA 94304, USA
pp. 1174-1182

Record-aware compression for big textual data analysis acceleration (Abstract)

Dapeng Dong , Mobile and Internet Systems Laboratory, University College Cork, Ireland
John Herbert , Mobile and Internet Systems Laboratory, University College Cork, Ireland
pp. 1183-1190

Automotive big data: Applications, workloads and infrastructures (Abstract)

Andre Luckow , Innovation Lab, BMW Group IT Research Center, Information Management Americas, Greenville, South Carolina, USA
Ken Kennedy , Innovation Lab, BMW Group IT Research Center, Information Management Americas, Greenville, South Carolina, USA
Fabian Manhardt , Innovation Lab, BMW Group IT Research Center, Information Management Americas, Greenville, South Carolina, USA
Emil Djerekarov , Innovation Lab, BMW Group IT Research Center, Information Management Americas, Greenville, South Carolina, USA
Bennie Vorster , Innovation Lab, BMW Group IT Research Center, Information Management Americas, Greenville, South Carolina, USA
Amy Apon , Clemson University, Clemson, South Carolina, USA
pp. 1201-1210

Cost-sensitive optimization of automated inspection (Abstract)

Goktug T. Cinar , Robert Bosch LLC, 4009 Miranda Avenue, Palo Alto, CA 94304 USA
Jeffrey Thompson , Robert Bosch LLC, 4009 Miranda Avenue, Palo Alto, CA 94304 USA
Soundar Srinivasan , Robert Bosch LLC, 4009 Miranda Avenue, Palo Alto, CA 94304 USA
pp. 1211-1219

Query sense disambiguation leveraging large scale user behavioral data (Abstract)

Mohammed Korayem , CareerBuilder, Norcross, GA, USA
Camilo Ortiz , Bloomberg, New York, NY, USA
Khalifeh AlJadda , CareerBuilder, Norcross, GA, USA
Trey Grainger , CareerBuilder, Norcross, GA, USA
pp. 1230-1237

Personalized expertise search at LinkedIn (Abstract)

Viet Ha-Thuc , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
Ganesh Venkataraman , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
Mario Rodriguez , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
Shakti Sinha , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
Senthil Sundaram , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
Lin Guo , LinkedIn, 2029 Stierlin Ct, Mountain View, CA, USA
pp. 1238-1247

Mining lifestyle personas at scale in e-commerce (Abstract)

Kang Li , Search and Data Mining, Groupon, Palo Alto, CA 94306
Vinay Deolalikar , Search and Data Mining, Groupon, Palo Alto, CA 94306
Neeraj Pradhan , Search and Data Mining, Groupon, Palo Alto, CA 94306
pp. 1254-1261

SDFS: Secure distributed file system for data-at-rest security for Hadoop-as-a-service (Abstract)

Petros Zerfos , IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
Hangu Yeo , IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
Brent D. Paulovicks , IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
Vadim Sheinin , IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
pp. 1262-1271

Open research challenges with Big Data - A data-scientist's perspective (Abstract)

Sreenivas R. Sukumar , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN, 37831, USA
pp. 1272-1278

Maritime situation analysis framework: Vessel interaction classification and anomaly detection (Abstract)

Hamed Yaghoubi Shahir , Software Technology Lab, School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
Uwe Glasser , Software Technology Lab, School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
Amir Yaghoubi Shahir , Software Technology Lab, School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
Hans Wehn , MDA Systems Ltd. Research & Development, Richmond, BC, Canada
pp. 1279-1289

PAIRS: A scalable geo-spatial data analytics platform (Abstract)

Levente J Klein , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Fernando J Marianno , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Conrad M Albrecht , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Marcus Freitag , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Siyuan Lu , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Nigel Hinds , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Xiaoyan Shao , IBM TJ Watson Research Center Yorktown Heights, NY 10598
Sergio Bermudez Rodriguez , Osram Sylvania, Beverly, MA 01915
Hendrik F Hamann , IBM TJ Watson Research Center Yorktown Heights, NY 10598
pp. 1290-1298

Post-purchase recommendations in large-scale online marketplaces (Abstract)

Jayasimha Katukuri , Anonymous
Tolga Konik , Anonymous
Rajyashree Mukherjee , eBay Inc., San Jose, USA
Santanu Kolay , Turn Inc, San Jose, USA
pp. 1299-1305

Revenue maximization for telecommunications company with social viral marketing (Abstract)

Hong-Han Shuai , National Taiwan University
Chih-Ya Shen , Academia Sinica
Hsiang-Chun Hsu , Academia Sinica
De-Nian Yang , Academia Sinica
Chung-Kuang Chou , National Taiwan University
Jihg-Hong Lin , Chunghwa Telecom Laboratories
Ming-Syan Chen , National Taiwan University
pp. 1306-1310

Developer toolchains for large-scale analytics: Two case studies (Abstract)

Stephanie Rosenthal , Software Engineering Institute, Carnegie Mellon University, Pittsburgh PA USA
Scott McMillan , Software Engineering Institute, Carnegie Mellon University, Pittsburgh PA USA
Matthew E. Gaston , Software Engineering Institute, Carnegie Mellon University, Pittsburgh PA USA
pp. 1311-1316

Enterprise subscription churn prediction (Abstract)

Ramakrishna Vadakattu , eBay Inc., Bengaluru, India
Bibek Panda , eBay Inc., Bengaluru, India
Swarnim Narayan , eBay Inc., Bengaluru, India
Harshal Godhia , eBay Inc., Bengaluru, India
pp. 1317-1321

Data deidentification in medical transcriptions using regular expressions and machine learning (Abstract)

Joshua Seeger , NORC at the University of Chicago, 1 North State Street, 14th Floor, Chicago, IL 60602
Aron Culotta , Illinois Institute of Technology, Stuart Building 10 West 31st Street, Room 235, Chicago, IL 60616
Jason Keller , NORC at the University of Chicago, 1 North State Street, 14th Floor, Chicago, IL 60602
Patrick van Kessel , NORC at the University of Chicago, 1 North State Street, 14th Floor, Chicago, IL 60602
Michael Jugovich , NORC at the University of Chicago, 1 North State Street, 14th Floor, Chicago, IL 60602
pp. 1322-1323

Macau: Large-scale skill sense disambiguation in the online recruitment domain (Abstract)

Qinlong Luo , Data Science R & D, 5550-A Peachtree Parkway, Norcross, GA 30092, USA
Meng Zhao , Data Science R & D, 5550-A Peachtree Parkway, Norcross, GA 30092, USA
Faizan Javed , Data Science R & D, 5550-A Peachtree Parkway, Norcross, GA 30092, USA
Ferosh Jacob , Data Science R & D, 5550-A Peachtree Parkway, Norcross, GA 30092, USA
pp. 1324-1329

Genomic analysis with MapReduce (Abstract)

Wei Yi Liu , Data Analytics Technology & Applications Research Institute, Institute for Information Industry, Taipei, Taiwan
Hui-I Hsiao , Data Analytics Technology & Applications Research Institute, Institute for Information Industry, Taipei, Taiwan
Shih Yao Dai , Data Analytics Technology & Applications Research Institute, Institute for Information Industry, Taipei, Taiwan
pp. 1330-1335

Eagle: User profile-based anomaly detection for securing Hadoop clusters (Abstract)

Chaitali Gupta , eBay Inc. San Jose, CA, USA
Ranjan Sinha , eBay Inc. San Jose, CA, USA
Yong Zhang , eBay Inc. San Jose, CA, USA
pp. 1336-1343

Investigating insurance fraud using social media (Abstract)

Manuel Diaz-Granados , Rutgers Discovery Informatics Institute, Rutgers University, USA
Javier Diaz-Montes , Rutgers Discovery Informatics Institute, Rutgers University, USA
Manish Parashar , Rutgers Discovery Informatics Institute, Rutgers University, USA
pp. 1344-1349

A document-based data model for large scale computational maritime situational awareness (Abstract)

Luca Cazzanti , NATO STO Centre for Maritime Research and Experimentation (CMRE), La Spezia, Italy
Leonardo M. Millefiori , NATO STO Centre for Maritime Research and Experimentation (CMRE), La Spezia, Italy
Gianfranco Arcieri , NATO STO Centre for Maritime Research and Experimentation (CMRE), La Spezia, Italy
pp. 1350-1356

Modeling social influences from call records and mobile web browsing histories (Abstract)

Jhao-Yin Li , Department of Electrical Engineering, National Taiwan University
Mi-Yen Yeh , Institute of Information Science, Academia Sinica
Ming-Syan Chen , Department of Electrical Engineering, National Taiwan University
Jihg-Hong Lin , Big Data Laboratory, Chunghwa Telecom Laboratories
pp. 1357-1361

Next generation biobanks (Abstract)

Christian Seebode , ORTEC medical, Berlin, Germany
Matthias Ort , ORTEC medical, Berlin, Germany
Peter Hufnagl , Institut für Pathologie, Charité Universitätsmedizin Berlin, Berlin, Germany
Christian R. A. Regenbrecht , Institut für Pathologie, Charité Universitätsmedizin Berlin, Berlin, Germany
pp. 1362-1367

Data driven predictive analytics for a spindle's health (Abstract)

Divya Sardana , University of Cincinnati
Raj Bhatnagar , University of Cincinnati
Radu Pavel , University of Cincinnati
Jon Iverson , Techsolve, Inc., Cincinnati
pp. 1378-1387

A "smart component" data model in PLM (Abstract)

Yunpeng Li , Department of Mechanical and Aerospace Engineering, Syracuse University, Syracuse, New York 13244, USA
Utpal Roy , Department of Mechanical and Aerospace Engineering, Syracuse University, Syracuse, New York 13244, USA
Seung-Jun Shin , Systems Integration Division, National Institute of Standards and Technology (NIST) Gaithersburg, Maryland 20899, USA
Y. Tina Lee , Systems Integration Division, National Institute of Standards and Technology (NIST) Gaithersburg, Maryland 20899, USA
pp. 1388-1397

Big data process analytics for continuous process improvement in manufacturing (Abstract)

Nenad Stojanovic , NISSATECH INNOVATION, CENTRE DOO, Nis, Serbia
Marko Dinic , NISSATECH INNOVATION, CENTRE DOO, Nis, Serbia
Ljiljana Stojanovic , FZI Forschungszentrum Informatik am KIT, Karlsruhe, Germany
pp. 1398-1407

Automated uncertainty quantification analysis using a system model and data (Abstract)

Saideep Nannapaneni , Department of Civil & Environmental Engineering, Vanderbilt University, Nashville, TN 37235, USA
Sankaran Mahadevan , Department of Civil & Environmental Engineering, Vanderbilt University, Nashville, TN 37235, USA
David Lechevalier , Le2i, Université de Bourgogne, BP 47870, 21078 Dijon, France
Anantha Narayanan , Department of Mechanical Engineering, University of Maryland, College Park, MD 20742, USA
Sudarsan Rachuri , Systems Integration Division, Engineering Laboratory National Institute of Standards and Technology, Gaithersburg, MD 20899, USA
pp. 1408-1417

A neural network meta-model and its application for manufacturing (Abstract)

David Lechevalier , Le2i, Université de Bourgogne BP 47870, 21078 Dijon, France
Steven Hudak , University of Maryland, Baltimore County, 21250, MD, US
Ronay Ak , Systems Integration Division, Engineering Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899, USA
Y. Tina Lee , Systems Integration Division, Engineering Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899, USA
Sebti Foufou , CSE Department, College of Engineering, Qatar University, Qatar
pp. 1428-1435

Performance assessment and uncertainty quantification of predictive models for smart manufacturing systems (Abstract)

Luca Oneto , DITEN - University of Genoa Via Opera Pia 11A, I-16145, Genoa, Italy
Ilenia Orlandi , DIBRIS - University of Genoa Via Opera Pia 13, I-16145, Genoa, Italy
Davide Anguita , DIBRIS - University of Genoa Via Opera Pia 13, I-16145, Genoa, Italy
pp. 1436-1445

Time complexity and architecture of a cloud based prognostics system for a multi-client condition monitoring activity (Abstract)

Ashwin K. Thillai Natarajan , Dept. of Mechanical and Industrial Engineering, Northeastern University, Boston, MA 02115
Sagar Kamarthi , Dept. of Mechanical and Industrial Engineering, Northeastern University, Boston, MA 02115
pp. 1446-1450

Real-time energy prediction for a milling machine tool using sparse Gaussian process regression (Abstract)

Jinkyoo Park , Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
Kincho H. Law , Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
Raunak Bhinge , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
Mason Chen , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
David Dornfeld , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
Sudarsan Rachuri , Systems Integration Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
pp. 1451-1460

Parallel Particle Swarm Optimization (PPSO) clustering for learning analytics (Abstract)

Kannan Govindarajan , Athabasca University, Edmonton, Canada
David Boulanger , Athabasca University, Edmonton, Canada
Vivekanandan Suresh Kumar , Athabasca University, Edmonton, Canada
Kinshuk , Athabasca University, Edmonton, Canada
pp. 1461-1465

High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm (Abstract)

Jeyhun Karimov , Computer Engineering Department, TOBB University of Economics and Technology, Ankara, Turkey
Murat Ozbayoglu , Computer Engineering Department, TOBB University of Economics and Technology, Ankara, Turkey
pp. 1473-1478

Agile text mining with Sherlok (Abstract)

pp. 1479-1484

Scalable adaptive label propagation in Grappa (Abstract)

Golnoosh Farnadi , Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
Zeinab Mahdavifar , Center for Data Science, University of Washington Tacoma, US
Ivan Keller , Department of Computer Science, KU Leuven, Belgium
Jacob Nelson , Department of Computer Science and Engineering, University of Washington, US
Ankur Teredesai , Center for Data Science, University of Washington Tacoma, US
Marie-Francine Moens , Department of Computer Science, KU Leuven, Belgium
Martine De Cock , Center for Data Science, University of Washington Tacoma, US
pp. 1485-1491

QueRIE reloaded: Using matrix factorization to improve database query recommendations (Abstract)

Magdalini Eirinaki , San Jose State University, San Jose, CA, USA
Sweta Patel , VISA Inc., San Francisco, CA, USA
pp. 1500-1508

Monitoring adolescent alcohol use via multimodal analysis in social multimedia (Abstract)

Ran Pang , University of Rochester, Rochester, NY 14627
Agustin Baretto , University of Rochester, Rochester, NY 14627
Henry Kautz , University of Rochester, Rochester, NY 14627
Jiebo Luo , University of Rochester, Rochester, NY 14627
pp. 1509-1518

An efficient map-reduce algorithm for computing formal concepts from binary data (Abstract)

Raj Bhatnagar , University of Cincinnati
Lalit Kumar , University of Cincinnati
pp. 1519-1528

Learning relaxed 3-clusters from pairs of related datasets (Abstract)

Jagadeesh Patchala , Department of Electrical Engineering and Computing Systems, University of Cincinnati, Cincinnati, OH, USA
Raj Bhatnagar , Department of Electrical Engineering and Computing Systems, University of Cincinnati, Cincinnati, OH, USA
pp. 1529-1538

Parallel information fusion method for microarray data analysis (Abstract)

Jun Meng , School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Rui Li , School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Jing Zhang , School of Computer Science and Technology, Dalian University of Technology, Dalian, China
pp. 1539-1544

A-Star algorithm based on-demand routing protocol for hierarchical LEO/MEO satellite networks (Abstract)

Xuezhi Ji , Science and Technology on Integrated Information System Laboratory, Institute of Software Chinese Academy of Sciences, Beijing, China
Lixiang Liu , Science and Technology on Integrated Information System Laboratory, Institute of Software Chinese Academy of Sciences, Beijing, China
Pei Zhao , Science and Technology on Integrated Information System Laboratory, Institute of Software Chinese Academy of Sciences, Beijing, China
Dapeng Wang , Science and Technology on Integrated Information System Laboratory, Institute of Software Chinese Academy of Sciences, Beijing, China
pp. 1545-1549

Granular modeling with fuzzy comparators (Abstract)

Lukasz Sosnowski , Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland and Dituel Sp. z o.o., Ostrobramska 101 lok. 206 04-041 Warsaw, Poland
Marcin Szczuka , Institute of Mathematics University of Warsaw, Banacha 2 02-097 Warsaw, Poland
Dominik Slezak , Institute of Mathematics University of Warsaw Banacha 2, 02-097 Warsaw, Poland and Infobright Inc., Krzywickiego 34/2, 19 02-078 Warsaw, Poland
pp. 1550-1555

Agglomerative algorithm to discover semantics from unstructured big data (Abstract)

I-Jen Chiang , College of Business Administration, Taipei Medical University, Taipei, Taiwan 110
pp. 1556-1563

A granular approach for identifying user knowledge (Abstract)

Alexander Denzler , University of Fribourg
Marcel Wehrle , University of Fribourg
Andreas Meier , University of Fribourg
pp. 1564-1569

Twitter opinion mining for adverse drug reactions (Abstract)

Liang Wu , Department of Computer Science, San Jose State University, San Jose, CA
Teng-Sheng Moh , Department of Computer Science, San Jose State University, San Jose, CA
Natalia Khuri , Department of Bioengineering, Stanford University, Stanford, CA
pp. 1570-1574

Data decomposition and dual clustering for clinical care management (Abstract)

Shusaku Tsumoto , Department of Medical Informatics, School of Medicine, Shimane University, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
Shoji Hirano , Department of Medical Informatics, School of Medicine, Shimane University, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
Haruko Iwata , Division of Nursing, Shimane University Hospital, Shimane University, 89-1 Enya-cho Izumo, Shimane 693-8501 Japan
pp. 1575-1584

Holistic entity matching across knowledge graphs (Abstract)

Maria Pershina , New York University
Mohamed Yakout , Microsoft Research
Kaushik Chakrabarti , Microsoft Research
pp. 1585-1590

GrC-based statistic optimization algorithm for big truth table (Abstract)

Chen Ze-hua , College of Information Engineering, Taiyuan University of Technology Taiyuan, Shanxi, P.R. China
Ma He , College of Information Engineering, Taiyuan University of Technology Taiyuan, Shanxi, P.R. China
Zhang Yu , College of Information Engineering, Taiyuan University of Technology Taiyuan, Shanxi, P.R. China
pp. 1591-1596

Mining incomplete data with many attribute-concept values and "do not care" conditions (Abstract)

Patrick G. Clark , Department of Electrical Eng. and Computer Sci., University of Kansas, Lawrence, KS 66045, USA
Jerzy W. Grzymala-Busse , Department of Electrical Eng. and Computer Sci., University of Kansas, Lawrence, KS, USA
pp. 1597-1602

Chinese wall security policies information flows in business cloud (Abstract)

Tsau-Young T. Y. Lin , Institute of Data Science and Computing, San Jose State University and GrC Society, San Jose, CA 95192
pp. 1603-1607

Granular formalization of medical diagnostic process (Abstract)

Shusaku Tsumoto , Department of Medical Informatics, Faculty of Medicine, Shimane University, 89-1 Enya-cho Izumo 693-8501 Japan
Shoji Hirano , Department of Medical Informatics, Faculty of Medicine, Shimane University, 89-1 Enya-cho Izumo 693-8501 Japan
pp. 1608-1614

Mobile gesture-based iPhone user authentication (Abstract)

Karan Khare , Department of Computer Science, San Jose State University, San Jose, CA
Teng-Sheng Moh , Department of Computer Science, San Jose State University, San Jose, CA
pp. 1615-1621

Cost and data exploration considerations for big data prediction on the cloud (Abstract)

Chris Tseng , Computer Science Dept., San Jose State University
Tien Nguyen , Computer Science Dept., San Jose State University
Chetan Sharma , Computer Science Dept., San Jose State University
pp. 1622-1628

Mining local gazetteers of literary Chinese with CRF and pattern based methods for biographical information in Chinese history (Abstract)

Chao-Lin Liu , Department of Computer Science, National Chengchi University, Taiwan
Chih-Kai Huang , Department of Computer Science, National Chengchi University, Taiwan
Hongsu Wang , Institute for Quantitative Social Science, Harvard University, USA
Peter K. Bol , Institute for Quantitative Social Science, Harvard University, USA
pp. 1629-1638

Towards a mobile social data commons (Abstract)

Giles Greenway , Department of Digital Humanities, King's College London, United Kingdom
Leonard Mack , Open Data Institute, London, United Kingdom
Tobias Blanke , Department of Digital Humanities, King's College London, United Kingdom
Mark Cote , Department of Digital Humanities, King's College London, United Kingdom
Tom Heath , Open Data Institute, London, United Kingdom
pp. 1639-1642

Scaling out for extreme scale corpus data (Abstract)

Matthew Coole , School of Computing and Communications, Lancaster University, Lancaster, Lancashire, UK
Paul Rayson , School of Computing and Communications, Lancaster University, Lancaster, Lancashire, UK
John Mariani , School of Computing and Communications, Lancaster University, Lancaster, Lancashire, UK
pp. 1643-1649

Metaphor mining in historical German novels: An unsupervised learning approach (Abstract)

Stefan Pernes , University of Würzburg, Würzburg, Germany
pp. 1650-1652

Predicting social trends from non-photographic images on Twitter (Abstract)

Mehrdad Yazdani , Qualcomm Institute, California Institute for Telecommunication and Information Technology, University of California - San Diego, La Jolla, California 92037
Lev Manovich , Computer Science, The Graduate Center, City University of New York, New York, New York
pp. 1653-1660

The coding of literary form: Data mining and the information structure of historical texts (Abstract)

Dallas Liddle , Department of English, Augsburg College, Minneapolis, Minnesota
pp. 1661-1666

Plot arceology: A vector-space model of narrative structure (Abstract)

Benjamin M. Schmidt , Department of History; NuLab for Texts, Maps, and Networks, Northeastern University, Boston, Massachusetts, USA
pp. 1667-1672

A method for cross-document narrative alignment of a two-hundred-sixty-million word corpus (Abstract)

Ben Miller , Departments of English and Communication, Georgia State University
Jennifer Olive , Department of English, Georgia State University
Shakthidhar Gopavaram , Department of Computer Science, Indiana University
Yanjun Zhao , Department of Computer Science, Troy University
Ayush Shrestha , Department of Applied Linguistics, Georgia State University
Cynthia Berger , Department of Applied Linguistics, Georgia State University
pp. 1673-1677

Mixed-initiative social media analytics at the World Bank: Observations of citizen sentiment in Twitter data to explore "trust" of political actors and state institutions and its relationship to social protest (Abstract)

Nadya A. Calderon , Simon Fraser University
Brian Fisher , Simon Fraser University
Jeff Hemsley , Syracuse University
Billy Ceskavich , Syracuse University
Greg Jansen , University of Maryland
Richard Marciano , University of Maryland
Victoria L. Lemieux , The World Bank and The University of British Columbia
pp. 1678-1687

Workload-driven adaptive data partitioning and distribution - The Cumulus approach (Abstract)

Ilir Fetai , Department of Mathematics and Computer Science, University of Basel, Switzerland
Damian Murezzan , Department of Mathematics and Computer Science, University of Basel, Switzerland
Heiko Schuldt , Department of Mathematics and Computer Science, University of Basel, Switzerland
pp. 1688-1697

Account clustering in multi-tenant storage management environments (Abstract)

Gabor Madl , Cloud Systems Analytics, IBM Research, Almaden, San Jose, CA 95120
Ramani Routray , Cloud Systems Analytics, IBM Research, Almaden, San Jose, CA 95120
Yang Song , Cloud Systems Analytics, IBM Research, Almaden, San Jose, CA 95120
Rakesh Jain , Cloud Systems Analytics, IBM Research, Almaden, San Jose, CA 95120
pp. 1698-1707

Fine-tuning the consistency-latency trade-off in quorum-replicated distributed storage systems (Abstract)

Marlon McKenzie , Electrical and Computer Engineering, University of Waterloo, Canada
Hua Fan , Electrical and Computer Engineering, University of Waterloo, Canada
Wojciech Golab , Electrical and Computer Engineering, University of Waterloo, Canada
pp. 1708-1717

Priority register: Application-defined replacement orderings for ad hoc reconciliation (Abstract)

Sathiya Prabhu Kumar , LISITE Laboratory, ISEP Paris, Paris, France
Sylvain Lefebvre , LISITE Laboratory, ISEP Paris, Paris, France
Minyoung Kim , Computer Science Laboratory, SRI International, Menlo Park, CA, USA
Mark Oliver Stehr , Computer Science Laboratory, SRI International, Menlo Park, CA, USA
pp. 1718-1727

A generalized flow for multi-class and binary classification tasks: An Azure ML approach (Abstract)

Matthew Bihis , Electrical Engineering, University of Washington, Bothell, USA
Sohini Roychowdhury , Electrical Engineering, University of Washington, Bothell, USA
pp. 1728-1737

Comparison of eager and quorum-based replication in a cloud environment (Abstract)

Alexander Stiemer , Department of Mathematics and Computer Science, University of Basel, Switzerland
Ilir Fetai , Department of Mathematics and Computer Science, University of Basel, Switzerland
Heiko Schuldt , Department of Mathematics and Computer Science, University of Basel, Switzerland
pp. 1738-1748

Towards a taxonomy of standards in smart data (Abstract)

Alexander Lenk , Accompanying Research of the BMWi Smart Data Technology Program, FZI Research Center for Information Technology, Berlin, Germany
Leif Bonorden , Accompanying Research of the BMWi Smart Data Technology Program, FZI Research Center for Information Technology, Berlin, Germany
Astrid Hellmanns , Accompanying Research of the BMWi Smart Data Technology Program, FZI Research Center for Information Technology, Berlin, Germany
Nico Roedder , Accompanying Research of the BMWi Smart Data Technology Program, FZI Research Center for Information Technology, Berlin, Germany
Stefan Jaehnichen , Accompanying Research of the BMWi Smart Data Technology Program, FZI Research Center for Information Technology, Berlin, Germany
pp. 1749-1754

Marlin: Taming the big streaming data in large scale video similarity search (Abstract)

Nan Zhu , McGill University, Montreal, Quebec, Canada
Wenbo He , McGill University, Montreal, Quebec, Canada
Yu Hua , Huazhong University of Science and Technology, Wuhan, Hubei, China
Yixin Chen , McGill University, Montreal, Quebec, Canada
pp. 1755-1764

Indexing historical spatio-temporal data in the cloud (Abstract)

Chong Zhang , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Xiaoying Chen , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Bin Ge , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Weidong Xiao , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
pp. 1765-1774

Push-based system for molecular simulation data analysis (Abstract)

Vladimir Grupcev , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118 Tampa, FL 33620, U.S.A.
Yi-Cheng Tu , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118 Tampa, FL 33620, U.S.A.
Joseph Fogarty , Department of Physics, University of South Florida, 4202 E. Fowler Ave., ISA 2019, Tampa, FL 33620, U.S.A.
Sagar Pandit , Department of Physics, University of South Florida, 4202 E. Fowler Ave., ISA 2019, Tampa, FL 33620, U.S.A.
pp. 1775-1784

Challenges and opportunities on network resource management in DCN with SDN (Abstract)

Guan Xu , School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
Jun Yang , School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
Bin Dai , School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
pp. 1785-1790

On the implementation of Zigzag codes for distributed storage system (Abstract)

Lijia Lu , Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, China
Hui Li , Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, China
Jun Chen , Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, China
Bing Zhu , Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, China
Weijuan Yin , Shenzhen Huadong Feitian Network Development Co., Ltd., Shenzhen, China
pp. 1791-1796

A comprehensive evaluation of NoSQL datastores in the context of historians and sensor data analysis (Abstract)

Arun Kumar Kalakanti , Data-centric Systems Research Group, Siemens Corporate Research and Technologies, Siemens Technology and Services Pvt. Ltd., Bangalore, India
Vinay Sudhakaran , Data-centric Systems Research Group, Siemens Corporate Research and Technologies, Siemens Technology and Services Pvt. Ltd., Bangalore, India
Varsha Raveendran , Data-centric Systems Research Group, Siemens Corporate Research and Technologies, Siemens Technology and Services Pvt. Ltd., Bangalore, India
Nisha Menon , Data-centric Systems Research Group, Siemens Corporate Research and Technologies, Siemens Technology and Services Pvt. Ltd., Bangalore, India
pp. 1797-1806

Learning classifiers from remote RDF data stores augmented with RDFS subclass hierarchies (Abstract)

Harris T. Lin , Department of Computer Science, Iowa State University, Ames, IA 50011, USA
Ngot Bui , Artificial Intelligence Research Laboratory, Center for Big Data Analytics and Discovery Informatics, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA 16802, USA
Vasant Honavar , Artificial Intelligence Research Laboratory, Center for Big Data Analytics and Discovery Informatics, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA 16802, USA
pp. 1807-1813

DISTINGER: A distributed graph data structure for massive dynamic graph processing (Abstract)

Guoyao Feng , David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Xiao Meng , David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Khaled Ammar , David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
pp. 1814-1822

LiteMat: A scalable, cost-efficient inference encoding scheme for large RDF graphs (Abstract)

Olivier Cure , LIP6 CNRS UMR 7606, Sorbonne Universites, UPMC Univ Paris 06, F-75005, Paris, France
Hubert Naacke , LIP6 CNRS UMR 7606, Sorbonne Universites, UPMC Univ Paris 06, F-75005, Paris, France
Tendry Randriamalala , LIP6 CNRS UMR 7606, Sorbonne Universites, UPMC Univ Paris 06, F-75005, Paris, France
Bernd Amann , LIP6 CNRS UMR 7606, Sorbonne Universites, UPMC Univ Paris 06, F-75005, Paris, France
pp. 1823-1830

MQuery: A query language for scientific meshes (Abstract)

Alireza Rezaei Mahdiraji , Jacobs University, 28759 Bremen, Germany
Peter Baumann , Jacobs University, 28759 Bremen, Germany
pp. 1831-1838

A fast parallel algorithm for counting triangles in graphs using dynamic load balancing (Abstract)

Shaikh Arifuzzaman , Network Dynamics & Simulation Science Laboratory, Virginia Bioinformatics Institute
Maleq Khan , Network Dynamics & Simulation Science Laboratory, Virginia Bioinformatics Institute
Madhav Marathe , Network Dynamics & Simulation Science Laboratory, Virginia Bioinformatics Institute
pp. 1839-1847

Scalable storage structure for pattern matching on big graph data (Abstract)

Janani Balaji , Department of Computer Science, Georgia State University, Atlanta, Georgia 30303
Rajshekhar Sunderraman , Department of Computer Science, Georgia State University, Atlanta, Georgia 30303
pp. 1848-1855

Employing in-memory data grids for distributed graph processing (Abstract)

Serafettin Tasci , Computer Science & Engineering Department, University at Buffalo, SUNY
Murat Demirbas , Computer Science & Engineering Department, University at Buffalo, SUNY
pp. 1856-1864

Current security threats and prevention measures relating to cloud services, Hadoop concurrent processing, and big data (Abstract)

Ather Sharif , Department of Computer Science, Saint Joseph's University, Philadelphia, PA 19131
Sarah Cooney , Department of Computer Science, Saint Joseph's University, Philadelphia, PA 19131
Shengqi Gong , Department of Computer Science, Saint Joseph's University, Philadelphia, PA 19131
Drew Vitek , Department of Computer Science, Saint Joseph's University, Philadelphia, PA 19131
pp. 1865-1870

Security for the scientific data services framework (Abstract)

Jinoh Kim , Department of Computer Science, Texas A&M University, Commerce, TX, 75429, USA
Bin Dong , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
Surendra Byna , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
Kesheng Wu , Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
pp. 1871-1875

A novel framework for mitigating insider attacks in big data systems (Abstract)

Santosh Aditham , Dept of Computer Science and Engineering, University of South Florida, Tampa, USA
Nagarajan Ranganathan , Dept of Computer Science and Engineering, University of South Florida, Tampa, USA
pp. 1876-1885

Heterogeneous k-anonymization with high utility (Abstract)

Katerina Doka , National Technical University of Athens, Greece
Mingqiang Xue , I2R, Singapore
Dimitrios Tsoumakos , Ionian University, Greece
Panagiotis Karras , Skoltech, Russia
Alfredo Cuzzocrea , University of Trieste and ICAR-CNR, Italy
Nectarios Koziris , National Technical University of Athens, Greece
pp. 1886-1890

Multi-probe random projection clustering to secure very large distributed datasets (Abstract)

Lee A. Carraher , University of Cincinnati, Cincinnati, OH 45221-0030
Philip A. Wilsey , University of Cincinnati, Cincinnati, OH 45221-0030
Anindya Moitra , University of Cincinnati, Cincinnati, OH 45221-0030
Sayantan Dey , University of Cincinnati, Cincinnati, OH 45221-0030
pp. 1891-1900

Fast summarization and anonymization of multivariate big time series (Abstract)

Dymitr Ruta , Etisalat British Telecom Innovation Center, Khalifa University of Science, Technology and Research, Abu Dhabi, UAE
Ling Cen , Etisalat British Telecom Innovation Center, Khalifa University of Science, Technology and Research, Abu Dhabi, UAE
Ernesto Damiani , Etisalat British Telecom Innovation Center, Khalifa University of Science, Technology and Research, Abu Dhabi, UAE
pp. 1901-1904

Toward big data risk analysis (Abstract)

Ernesto Damiani , Etisalat British Telecom Innovation Center, Khalifa University of Science, Technology and Research, Abu Dhabi, UAE
pp. 1905-1909

A distributed framework for supporting adaptive ensemble-based intrusion detection (Abstract)

Alfredo Cuzzocrea , University of Trieste and ICAR-CNR, Trieste, Italy
Gianluigi Folino , ICAR-CNR, Rende, Italy
Pietro Sabatino , ICAR-CNR, Rende, Italy
pp. 1910-1916

Simplifying web analytics for digital marketing (Abstract)

Andy Bengel , Software Engineer, Marketing Analyst, InfoTrust LLC, Blue Ash, Ohio, USA
Amin Shawki , Software Engineer, Marketing Analyst, InfoTrust LLC, Blue Ash, Ohio, USA
Dippy Aggarwal , Computer Science and Engineering, University of Cincinnati, Cincinnati, Ohio, USA
pp. 1917-1918

PAUSE: A privacy architecture for heterogeneous big data environments (Abstract)

Dawn N. Jutla , Sobey School of Business, Saint Mary's University, Halifax, Nova Scotia, Canada
Peter Bodorik , Faculty of Computer Science, Dalhousie University
pp. 1919-1928

Spatio-temporal queries in HBase (Abstract)

Xiaoying Chen , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Chong Zhang , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Bin Ge , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
Weidong Xiao , Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, P.R. China
pp. 1929-1937

Component based dataflow processing framework (Abstract)

V. Gyurjyan , TJNAF, Newport News, VA
A. Bartle , Mechdyne, Co., Virginia Beach, VA
C. Lukashin , NASA Langley Research Center, Hampton, VA
S. Mancilla , Universidad Técnica Federico Santa María, Chile
R. Oyarzun , Universidad Técnica Federico Santa María, Chile
A. Vakhnin , Science Systems and Applications Inc., Hampton, VA
pp. 1938-1942

Earth science data fusion with event building approach (Abstract)

C. Lukashin , NASA Langley Research Center, Hampton, VA
A. Bartle , Mechdyne Corporation, Virginia Beach, VA
E. Callaway , Mechdyne Corporation, Virginia Beach, VA
V. Gyurjyan , Thomas Jefferson National Accelerator Facility, Newport News, VA
S. Mancilla , University Tecnica Federico Santa Maria, Chile
R. Oyarzun , University Tecnica Federico Santa Maria, Chile
A. Vakhnin , Science Systems and Applications Inc., Hampton, VA
pp. 1943-1947

Climate model diagnostic analyzer (Abstract)

Seungwon Lee , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA U.S.A.
Lei Pan , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA U.S.A.
Chengxing Zhai , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA U.S.A.
Benyang Tang , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA U.S.A.
Terry Kubar , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA U.S.A.
Jia Zhang , Department of Software Engineering, Carnegie Mellon University, Silicon Valley, CA U.S.A.
Wei Wang , Department of Software Engineering, Carnegie Mellon University, Silicon Valley, CA U.S.A.
pp. 1948-1952

High performance analysis of big spatial data (Abstract)

David Haynes , Minnesota Population Center, University of Minnesota
Suprio Ray , Faculty of Computer Science, University of New Brunswick, Fredericton
Steven M. Manson , Department of Geography, University of Minnesota
Ankit Soni , Minnesota Population Center, University of Minnesota
pp. 1953-1957

International standard "OGC® moving features" to address "4Vs" on locational bigdata (Abstract)

Akinori Asahara , Hitachi Ltd., Center for Technical Innovation - Systems. Research and Development Group, Kokubunji-shi, Tokyo, Japan
Hideki Hayashi , Hitachi Ltd., Center for Technical Innovation - Systems. Research and Development Group, Kokubunji-shi, Tokyo, Japan
Nobuhiro Ishimaru , Hitachi Ltd., Defense Systems Company, Chiyoda-ku, Tokyo, Japan
Ryosuke Shibasaki , Center for Spatial Information Science, University of Tokyo, Meguro-ku, Tokyo, Japan
Hiroshi Kanasugi , Center for Spatial Information Science, University of Tokyo, Meguro-ku, Tokyo, Japan
pp. 1958-1966

Optimizing Apache Nutch for domain specific crawling at large scale (Abstract)

Luis A. Lopez , NSIDC, Boulder, Colorado
Ruth Duerr , The Ronin Institute, Boulder, Colorado
Siri Jodha Singh Khalsa , University of Colorado Boulder, Boulder, Colorado
pp. 1967-1971

A Hadoop-based visualization and diagnosis framework for earth science data (Abstract)

Shujia Zhou , Northrop Grumman Information Technology, McLean, VA 22102
Xi Yang , Illinois Institute of Technology, Chicago, IL 60616
Xiaowen Li , Morgan State University, Baltimore, MD 21251
Toshihisa Matsui , University of Maryland, College Park, MD 20742
Si Liu , Illinois Institute of Technology, Chicago, IL 60616
Xian-He Sun , Illinois Institute of Technology, Chicago, IL 60616
Weikuo Tao , NASA Goddard Space Flight Center, Greenbelt, MD 20771
pp. 1972-1977

Enabling scientific data storage and processing on big-data systems (Abstract)

Saman Biookaghazadeh , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University
Yiqi Xu , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University
Shujia Zhou , Northrop Grumman Information Technology
Ming Zhao , School of Computing, Informatics, and Decision Systems Engineering, Arizona State University
pp. 1978-1984

Light-weight parallel Python tools for earth system modeling workflows (Abstract)

Kevin Paul , National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, Colorado 80305
Sheri Mickelson , National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, Colorado 80305
John M. Dennis , National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, Colorado 80305
Haiying Xu , National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, Colorado 80305
David Brown , National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, Colorado 80305
pp. 1985-1994

WDCloud: An end to end system for large-scale watershed delineation on cloud (Abstract)

In Kee Kim , University of Virginia, Charlottesville, VA, 22903
Jacob Steele , University of Virginia, Charlottesville, VA, 22903
Anthony M. Castronova , Utah State University, Logan, UT, 84322
Jonathan L. Goodall , University of Virginia, Charlottesville, VA, 22903
Marty Humphrey , University of Virginia, Charlottesville, VA, 22903
pp. 1995-2004

Integrating "Big" geoscience data into the petascale national environmental research interoperability platform (NERDIP): Successes and unforeseen challenges (Abstract)

Lesley Wyborn , National Computational Infrastructure, Australian National University, Canberra, Australia
Benjamin J. K. Evans , National Computational Infrastructure, Australian National University, Canberra, Australia
pp. 2005-2009

An optimized interestingness hotspot discovery framework for large gridded spatio-temporal datasets (Abstract)

Fatih Akdag , Computer Science Department, University of Houston
Christoph F. Eick , Computer Science Department, University of Houston
pp. 2010-2019

Detecting environmental disasters in digital news archives (Abstract)

Amelia Yzaguirre , Dalhousie University, Halifax, Canada
Robert Warren , Dalhousie University, Halifax, Canada
Mike Smit , Dalhousie University, Halifax, Canada
pp. 2027-2035

Is Apache Spark scalable to seismic data analytics and computations? (Abstract)

Yuzhong Yan , Department of Computer Science, Prairie View A&M University, Prairie View, TX
Lei Huang , Department of Computer Science, Prairie View A&M University, Prairie View, TX
Liqi Yi , Intel Corporation, 2111 NE 25th Ave., Hillsboro, OR
pp. 2036-2045

On the efficient evaluation of array joins (Abstract)

Peter Baumann , Jacobs University, 28759 Bremen, Germany
Vlad Merticariu , rasdaman GmbH, 28759 Bremen, Germany
pp. 2046-2055

Business information modeling: A methodology for data-intensive projects, data science and big data governance (Abstract)

Torsten Priebe , Simplity s.r.o. Vienna, Austria
Stefan Markus , Simplity s.r.o. Vienna, Austria
pp. 2056-2065

Towards methods for systematic research on big data (Abstract)

Manirupa Das , Department of Computer Science and Engineering, The Ohio State University
Renhao Cui , Department of Computer Science and Engineering, The Ohio State University
David R. Campbell , Department of Computer Science and Engineering, The Ohio State University
Gagan Agrawal , Department of Computer Science and Engineering, The Ohio State University
Rajiv Ramnath , Department of Computer Science and Engineering, The Ohio State University
pp. 2072-2081

Towards a big data theory model (Abstract)

Marco Pospiech , Department of Management and Information, TU Bergakademie Freiberg, Freiberg, Germany
Carsten Felden , Department of Management and Information, TU Bergakademie Freiberg, Freiberg, Germany
pp. 2082-2090

Exploring the process of doing data science via an ethnographic study of a media advertising company (Abstract)

Jeffrey S. Saltz , School of Information Studies, Syracuse University, Syracuse, NY, USA
Ivan Shamshurin , School of Information Studies, Syracuse University, Syracuse, NY, USA
pp. 2098-2105

Forecast UPC-level FMCG demand, Part I: Exploratory analysis and visualization (Abstract)

Dazhi Yang , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Gary S. W. Goh , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Chi Xu , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Allan N. Zhang , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Orkan Akcan , Antuit Singapore, Singapore
pp. 2106-2112

Forecast UPC-level FMCG demand, Part II: Hierarchical reconciliation (Abstract)

Dazhi Yang , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Gary S. W. Goh , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Siwei Jiang , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Allan N. Zhang , Singapore Institute of Manufacturing Technology (SIMTech) Agency for Science, Technology and Research (A*STAR) Singapore, Singapore
Orkan Akcan , Antuit, Singapore, Singapore
pp. 2113-2121

Sparsity adjusted information gain for feature selection in sentiment analysis (Abstract)

B. Y. Ong , Planning and Operations Management Group, Singapore Institute of Manufacturing Technology (SIMTech), A*STAR, Singapore, Singapore
S. W. Goh , Planning and Operations Management Group, Singapore Institute of Manufacturing Technology (SIMTech), A*STAR, Singapore, Singapore
Chi Xu , Planning and Operations Management Group, Singapore Institute of Manufacturing Technology (SIMTech), A*STAR, Singapore, Singapore
pp. 2122-2128

Dynamic aggregation for time series forecasting (Abstract)

S. Iosevich , Prognos, an Antuit Company 1011 Lake Street, Suite 308 Oak Park, IL 60301
G. Arutyunyants , Prognos, an Antuit Company 1011 Lake Street, Suite 308 Oak Park, IL 60301
Z. Hou , Prognos, an Antuit Company 1011 Lake Street, Suite 308 Oak Park, IL 60301
pp. 2129-2131

Big data analytics for empowering milk yield prediction in dairy supply chains (Abstract)

W. J. Yan , Planning and Operations Management Group Singapore Institute of Manufacturing Technology 71 Nanyang Drive, Singapore 638075
X. Chen , Meme Analytics Pte. Ltd. 18 Boon Lay Way, #09-155, Tradehub 21, Singapore 609966
O. Akcan , Antuit Pte. Ltd., 10 Hoe Chiang Road, #06-03, Keppel Towers, Singapore 089315
J. Lim , Planning and Operations Management Group Singapore Institute of Manufacturing Technology 71 Nanyang Drive, Singapore 638075
D. Yang , Planning and Operations Management Group Singapore Institute of Manufacturing Technology 71 Nanyang Drive, Singapore 638075
pp. 2132-2137

Profit estimation error analysis in recommender systems based on association rules (Abstract)

Gurdal Ertek , Rochester Institute of Technology - Dubai, Dubai Silicon Oasis, Dubai, UAE
Xu Chi , Singapore Institute of Manufacturing Technology, Singapore
Gabriel Yee , Singapore Institute of Manufacturing Technology, Singapore
Ong Boon Yong , Singapore Institute of Manufacturing Technology, Singapore
Byung-Geun Choi , Singapore Institute of Manufacturing Technology, Singapore
pp. 2138-2142

Graph-based analysis of resource dependencies in project networks (Abstract)

Gurdal Ertek , Rochester Institute of Technology - Dubai, Dubai Silicon Oasis, Dubai, U.A.E.
Byung-Geun Choi , Singapore Institute of Manufacturing Technology, Singapore
Xu Chi , Singapore Institute of Manufacturing Technology, Singapore
DaZhi Yang , Singapore Institute of Manufacturing Technology, Singapore
Ong Boon Yong , Singapore Institute of Manufacturing Technology, Singapore
pp. 2143-2149

A data fusion framework for large-scale measurement platforms (Abstract)

Prapa Rattadilok , Smart Data Technologies Centre, Robert Gordon University, Aberdeen, UK
John McCall , Smart Data Technologies Centre, Robert Gordon University, Aberdeen, UK
Trevor Burbridge , British Telecom (BT), Ipswich, UK
Andrea Soppera , British Telecom (BT), Ipswich, UK
Philip Eardley , British Telecom (BT), Ipswich, UK
pp. 2150-2158

Sensor event mining with hybrid ensemble learning and evolutionary feature subset selection model (Abstract)

Nijat Mehdiyev , Institute for Information Systems (IWi), German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
Julian Krumeich , Institute for Information Systems (IWi), German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
Dirk Werth , Institute for Information Systems (IWi), German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
Peter Loos , Institute for Information Systems (IWi), German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
pp. 2159-2168

Optimization of system architecture for Big Data analysis in climate science (Abstract)

Huikyo Lee , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
Luca Cinquini , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
Daniel Crichton , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
Amy Braverman , Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA
pp. 2169-2172

In-situ analytics for tomographic imaging in sensor network (Abstract)

Goutham Kamath , Department of Computer Science, Georgia State University
Wen-Zhan Song , Department of Computer Science, Georgia State University
pp. 2173-2176

Ontology-driven data access at the NASA earth exchange (Abstract)

Beth Huffer , Lingua Logica LLC, Denver, CO, United States
Marc Cotnoir , Computer Sciences Corporation, Moffett Field, CA, United States
Jonathan Gleason , Science Directorate, NASA Langley Research Center, Hampton, VA, United States
pp. 2177-2181

Constrained region selection method based on configuration space for visualization in scientific dataset search (Abstract)

Shin'ichi Takeuchi , National Institute of Information and Communications Technology, Kyoto 619-0289, Japan
Komei Sugiura , National Institute of Information and Communications Technology, Kyoto 619-0289, Japan
Yuhei Akahoshi , National Institute of Information and Communications Technology, Kyoto 619-0289, Japan
Koji Zettsu , National Institute of Information and Communications Technology, Kyoto 619-0289, Japan
pp. 2191-2200

Enhancing science support in SQL (Abstract)

Peter Baumann , rasdaman GmbH, Hans-Hermann-Sieling 17, 28759 Bremen, Germany
Dimitar Misev , rasdaman GmbH, Hans-Hermann-Sieling 17, 28759 Bremen, Germany
pp. 2201-2204

Modeling community detection using slow mixing random walks (Abstract)

Ramezan Paravi Torghabeh , Department of Electrical Engineering, University of Hawai'i at Manoa, Honolulu, HI
Narayana Prasad Santhanam , Department of Electrical Engineering, University of Hawai'i at Manoa, Honolulu, HI
pp. 2205-2211

Dimensional scalability of supervised and unsupervised concept drift detection: An empirical study (Abstract)

Jorge David Destephen Lavaire , University of Central Missouri
Anshuman Singh , University of Central Missouri
Mahmoud Yousef , University of Central Missouri
Sumi Singh , University of Central Missouri
Xiaodong Yue , University of Central Missouri
pp. 2212-2218

Efficient change detection for high dimensional data streams (Abstract)

Spiros V. Georgakopoulos , Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Sotiris K. Tasoulis , Department of Applied Mathematics, Liverpool John Moores University, Liverpool, United Kingdom
Vassilis P. Plagianakos , Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
pp. 2219-2222

Big data analytics for demand response: Clustering over space and time (Abstract)

Charalampos Chelmis , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Jahanvi Kolte , Institute of Technology, Nirma University, Ahmedabad, Gujarat, India
Viktor K. Prasanna , Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
pp. 2223-2232

Finding banded patterns in big data using sampling (Abstract)

Fatimah B Abdullahi , Department of Computer Science, University of Liverpool, Ashton Street Liverpool, L69 3BX United Kingdom
Frans Coenen , Department of Computer Science, University of Liverpool, Ashton Street Liverpool, L69 3BX United Kingdom
Russell Martin , Department of Computer Science, University of Liverpool, Ashton Street Liverpool, L69 3BX United Kingdom
pp. 2233-2242

Scalable preference queries for high-dimensional data using map-reduce (Abstract)

Gheorghi Guzun , Electrical and Computer Engineering, The University of Iowa, Iowa City, USA
Joel E. Tosado , Electrical and Computer Engineering, The University of Iowa, Iowa City, USA
Guadalupe Canahuate , Electrical and Computer Engineering, The University of Iowa, Iowa City, USA
pp. 2243-2252

Discovering time-evolving influence from dynamic heterogeneous graphs (Abstract)

Chuan Hu , Department of Computer Science, New Mexico State University, New Mexico, 88003
Huiping Cao , Department of Computer Science, New Mexico State University, New Mexico, 88003
pp. 2253-2262

Combining activity-evaluation information with NMF for trust-link prediction in social media (Abstract)

Kanji Matsutani , Department of Electronics and Informatics, Ryukoku University, Japan
Masahito Kumano , Department of Electronics and Informatics, Ryukoku University, Japan
Masahiro Kimura , Department of Electronics and Informatics, Ryukoku University, Japan
Kazumi Saito , School of Administration and Informatics, University of Shizuoka, Japan
Kouzou Ohara , Department of Integrated Information Technology, Aoyama Gakuin University, Japan
Hiroshi Motoda , Institute of Scientific and Industrial Research, Osaka University, Japan
pp. 2263-2272

Identifying actionable messages on social media (Abstract)

Nemanja Spasojevic , Lithium Technologies / Klout, San Francisco, CA
Adithya Rao , Lithium Technologies / Klout, San Francisco, CA
pp. 2273-2281

Klout score: Measuring influence across multiple social networks (Abstract)

Adithya Rao , Lithium Technologies / Klout, San Francisco, CA
Nemanja Spasojevic , Lithium Technologies / Klout, San Francisco, CA
Zhisheng Li , Lithium Technologies / Klout, San Francisco, CA
Trevor Dsouza , Lithium Technologies / Klout, San Francisco, CA
pp. 2282-2289

Top (k1, k2) Distance-based outliers detection in an uncertain dataset (Abstract)

Fei Liu , College of Computer, National University of Defense Technology, 410073, Changsha, P.R. China
Yan Jia , College of Computer, National University of Defense Technology, 410073, Changsha, P.R. China
pp. 2290-2299

Understanding the time characteristic of user behavior on online forums (Abstract)

Guirong Chen , School of Information and Navigation, Air Force Engineering University of PL, Xi'an, China
Ning Wang , Department Nineteen, Aeronautical Computing Technique Research Institute, Xi'an, China
Fengqin Zhang , School of Information and Navigation, Air Force Engineering University of PL, Xi'an, China
Hua Jiang , School of Information and Navigation, Air Force Engineering University of PL, Xi'an, China
pp. 2300-2306

Characterizing super spreading in microblog: An epidemic-based model (Abstract)

Yu Liu , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, 10 Xitucheng Road, Haidian District, Beijing, China
Bin Wu , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, 10 Xitucheng Road, Haidian District, Beijing, China
Bai Wang , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, 10 Xitucheng Road, Haidian District, Beijing, China
pp. 2307-2313

A community detection method based on K-shell (Abstract)

Yang Wang , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China
Liutong Xu , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China
Bin Wu , Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China
pp. 2314-2319

How much is your information worth? A method for revenue generation for your information (Abstract)

Divya Rao , School of Computer Engineering, Nanyang Technological University, Singapore
Wee Keong Ng , School of Computer Engineering, Nanyang Technological University, Singapore
pp. 2320-2326

Efficient large scale distributed matrix computation with spark (Abstract)

Rong Gu , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
Yun Tang , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
Zhaokang Wang , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
Shuai Wang , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
Xusen Yin , Intel Corporation, Beijing, China, 100190
Chunfeng Yuan , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
Yihua Huang , National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University, Nanjing, China 210093
pp. 2327-2336

A collaborative filtering algorithm fusing user-based, item-based and social networks (Abstract)

Bailing Wang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Junheng Huang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Libing Ou , Department of Mathematics, Harbin Institute of Technology at Weihai, Weihai, China
Rui Wang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
pp. 2337-2343

Mining the relation between dorm arrangement and student performance (Abstract)

Man Li , Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, Beijing University of Posts & Telecommunications, Beijing 100876, China
Ruisheng Shi , Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, Beijing University of Posts & Telecommunications, Beijing 100876, China
pp. 2344-2347

A proactive discovery and filtering solution on phishing websites (Abstract)

Lv Fang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Wang Bailing , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Huang Junheng , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Sun Yushan , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Wei Yuliang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
pp. 2348-2355

Finding community structure via rough K-means in social network (Abstract)

Yunlei Zhang , School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
Bin Wu , School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China
pp. 2356-2361

A survey of semantic similarity and its application to social network analysis (Abstract)

Shuang Zhang , School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Xuefeng Zheng , School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Changjun Hu , School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
pp. 2362-2367

Dynamic community detection based on game theory in social networks (Abstract)

Fei Jiang , School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Jin Xu , School of Electronics Engineering and Computer Science, Peking University, Beijing, China
pp. 2368-2373

The value of analytical queries on Social Networks (Abstract)

Michel de Rougemont , University of Paris II and LIAFA-CNRS
Guillaume Vimont , University Paris II
pp. 2374-2383

A collaborative filtering algorithm based on social network information (Abstract)

Rui Wang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Bailing Wang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
Junheng Huang , Department of Computer Science and Technology, Harbin Institute of Technology at Weihai, Weihai, China
pp. 2384-2389

Ties that matter (Abstract)

Garisha Chowdhary , Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India, 700108
Sanghamitra Bandyopadhyay , Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India, 700108
pp. 2398-2403

Sentiment expression via emoticons on social media (Abstract)

Hao Wang , Silicon Valley Laboratory, IBM San Jose, USA
Jorge A. Castanon , Silicon Valley Laboratory, IBM San Jose, USA
pp. 2404-2408

On compressing massive streaming graphs with Quadtrees (Abstract)

Michael Nelson , School of Computer Science, University of Oklahoma, Norman, OK, USA
Sridhar Radhakrishnan , School of Computer Science, University of Oklahoma, Norman, OK, USA
Amlan Chatterjee , Department of Computer Science, California State University Dominguez Hills, Carson, CA, USA
Chandra N. Sekharan , Department of Computer Science, Loyola University Chicago, Chicago, IL, USA
pp. 2409-2417

Social set visualizer: A set theoretical approach to big social data analytics of real-world events (Abstract)

Benjamin Flesch , Copenhagen Business School, Denmark
Ravi Vatrapu , Copenhagen Business School, Denmark
Raghava Rao Mukkamala , Copenhagen Business School, Denmark
Abid Hussain , Copenhagen Business School, Denmark
pp. 2418-2427

A novel symbolization technique for time-series outlier detection (Abstract)

Gavin Smith , Horizon Digital Economy Research, The University of Nottingham, UK
James Goulding , Horizon Digital Economy Research, The University of Nottingham, UK
pp. 2428-2436

Volatility matrix inference in high-frequency finance with regularization and efficient computations (Abstract)

Jian Zou , Department of Mathematical Sciences, Worcester Polytechnic Institute
Yunbo An , Department of Mathematical Sciences, Worcester Polytechnic Institute
Hong Yan , Department of Mathematical Sciences, Worcester Polytechnic Institute
pp. 2437-2444

Shaping data: Visualization under construction (Abstract)

Oliver Bieh-Zimmert , TU Bergakademie Freiberg
Carsten Felden , TU Bergakademie Freiberg
pp. 2445-2452

Immersive visualization for materials science data analysis using the Oculus Rift (Abstract)

Margaret Drouhard , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Chad A. Steed , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Steven Hahn , Neutron Data Analysis and Visualization Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Thomas Proffen , Neutron Data Analysis and Visualization Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Jamison Daniel , National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN 37831
Michael Matheson , National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN 37831
pp. 2453-2461

Spatio-temporal similarity search method for disaster estimation (Abstract)

Hideki Hayashi , Center for Technology Innovation - System Engineering, Research & Development Group, Hitachi, Ltd. 1-280, Higashi-koigakubo Kokubunji-shi, Tokyo, 185-8601 Japan
Akinori Asahara , Center for Technology Innovation - System Engineering, Research & Development Group, Hitachi, Ltd. 1-280, Higashi-koigakubo Kokubunji-shi, Tokyo, 185-8601 Japan
Natsuko Sugaya , IT Platform Division Group, Telecommunication Systems Company, Hitachi, Ltd., 292, Yoshida-cho, Totsuka-ku, Yokohama, Kanagawa, 244-0817 Japan
Yuichi Ogawa , IT Platform Division Group, Telecommunication Systems Company, Hitachi, Ltd., 292, Yoshida-cho, Totsuka-ku, Yokohama, Kanagawa, 244-0817 Japan
Hitoshi Tomita , Social Innovation Business Promotion Division, Hitachi, Ltd., Akihabara Daibiru Building, 18-13, Soto-Kanda 1-chome, Chiyoda-ku, Tokyo, 101-8608 Japan
pp. 2462-2469

Scalable dental computing on cyberinfrastructure (Abstract)

Hui Zhang , University of Louisville, KY, USA
Riqing Chen , Fujian Agriculture and Forestry University, P.R. China
Guangchen Ruan , Indiana University, IN, USA
Masatoshi Ando , Indiana University, IN, USA
pp. 2470-2478

Wrangler's user environment: A software framework for management of data-intensive computing system (Abstract)

Christopher Jordan , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
David Walling , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
Weijia Xu , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
Stephen A. Mock , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
Niall Gaffney , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
Dan Stanzione , Texas Advanced Computing Center/University of Texas at Austin, Austin, TX, USA
pp. 2479-2486

Visual analysis of large-scale LiDAR point clouds (Abstract)

Wanbo Luo , Fujian Surveying and Mapping Institute, P.R. China
Hui Zhang , University of Louisville, KY, USA
pp. 2487-2492

A database-based distributed computation architecture with Accumulo and D4M: An application of eigensolver for large sparse matrix (Abstract)

Yin Huang , Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, 21250
Yelena Yesha , Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, 21250
Shujia Zhou , Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD, 21250
pp. 2493-2500

Texture-based edge bundling: A web-based approach for interactively visualizing large graphs (Abstract)

Jieting Wu , Computer Science and Engineering, University of Nebraska-Lincoln
Lina Yu , Computer Science and Engineering, University of Nebraska-Lincoln
Hongfeng Yu , Computer Science and Engineering, University of Nebraska-Lincoln
pp. 2501-2508

Big data provenance: Challenges, state of the art and opportunities (Abstract)

Jianwu Wang , Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD, U.S.A.
Daniel Crawl , San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, U.S.A.
Shweta Purawat , San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, U.S.A.
Mai Nguyen , San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, U.S.A.
Ilkay Altintas , San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, U.S.A.
pp. 2509-2516

Performance evaluation of enabling logistic regression for big data with R (Abstract)

Ruizhu Huang , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas
Weijia Xu , Texas Advanced Computing Center, University of Texas at Austin, Austin, Texas
pp. 2517-2524

Skill grouping method: Mining and clustering skill differences from body movement BigData (Abstract)

Shinichi Yamagiwa , Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba, Ibaraki, 305-8573 Japan
Yoshinobu Kawahara , Institute of Scientific and Industrial Research, Osaka University, Osaka, 567-0047 Japan
Noriyuki Tabuchi , MIZUNO Corporation, Suminoe, Osaka, 559-8510 Japan
Yoshinobu Watanabe , MIZUNO Corporation, Suminoe, Osaka, 559-8510 Japan
Takeshi Naruo , MIZUNO Corporation, Suminoe, Osaka, 559-8510 Japan
pp. 2525-2534

Regularized and sparse stochastic k-means for distributed large-scale clustering (Abstract)

Vilen Jumutc , KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Rocco Langone , KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
Johan A. K. Suykens , KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
pp. 2535-2540

Join algorithms on GPUs: A revisit after seven years (Abstract)

Ran Rui , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118, Tampa, Florida, USA
Hao Li , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118, Tampa, Florida, USA
Yi-Cheng Tu , Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118, Tampa, Florida, USA
pp. 2541-2550

A data-driven approach towards patient identification for telehealth programs (Abstract)

Martha Ganser , Robert Bosch Healthcare Systems, Inc., Palo Alto, CA
Sauptik Dhar , Robert Bosch Research and Technology Center, Palo Alto, CA
Unmesh Kurup , Robert Bosch Research and Technology Center, Palo Alto, CA
Carlos Cunha , Robert Bosch Data Mining Services, Palo Alto, CA
Aca Gacic , Robert Bosch Healthcare Systems, Inc., Palo Alto, CA
pp. 2551-2559

Ensemble prediction of vascular injury in Trauma care: Initial efforts towards data-driven, low-cost screening (Abstract)

Max Metzger , Decision Management Systems, Charles River Analytics, Inc., Cambridge, MA, USA
Michael Howard , Decision Management Systems, Charles River Analytics, Inc., Cambridge, MA, USA
Lee Kellogg , Decision Management Systems, Charles River Analytics, Inc., Cambridge, MA, USA
Rishi Kundi , Division of Vascular Surgery, University of Maryland School of Medicine, Baltimore, MD USA
pp. 2560-2568

M-SEQ: Early detection of anxiety and depression via temporal orders of diagnoses in electronic health data (Abstract)

Jinghe Zhang , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA
Haoyi Xiong , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA
Yu Huang , Department of Computer Science, University of Virginia, Charlottesville, VA
Hao Wu , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA
Kevin Leach , Department of Computer Science, University of Virginia, Charlottesville, VA
Laura E. Barnes , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA
pp. 2569-2577

Using clinical data, hypothesis generation tools and PubMed trends to discover the association between diabetic retinopathy and antihypertensive drugs (Abstract)

Katherine Senter , University of Pennsylvania College of Arts and Sciences
Sreenivas R. Sukumar , Oak Ridge National Laboratory Computational Sciences and Engineering Division
Robert M. Patton , Oak Ridge National Laboratory Computational Sciences and Engineering Division
Edward Chaum , The University of Tennessee Health Science Center, Hamilton Eye Institute
pp. 2578-2582

Enabling graph appliance for genome assembly (Abstract)

Rina Singh , Knowledge Discovery Lab, Tennessee Technological University, TN, USA
Jeffrey A. Graves , Knowledge Discovery Lab, Tennessee Technological University, TN, USA
Sangkeun Lee , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, TN, USA
Sreenivas R. Sukumar , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, TN, USA
Mallikarjun Shankar , Computational Sciences and Engineering Division, Oak Ridge National Laboratory, TN, USA
pp. 2583-2590

A memory capacity model for high performing data-filtering applications in Samza framework (Abstract)

Tao Feng , LinkedIn Corp, 2029 Stierlin Court, Mountain View, CA 94043, USA
Zhenyun Zhuang , LinkedIn Corp, 2029 Stierlin Court, Mountain View, CA 94043, USA
Yi Pan , LinkedIn Corp, 2029 Stierlin Court, Mountain View, CA 94043, USA
Haricharan Ramachandra , LinkedIn Corp, 2029 Stierlin Court, Mountain View, CA 94043, USA
pp. 2600-2605

Robust and distributed web-scale near-dup document conflation in Microsoft Academic Service (Abstract)

Chieh-Han Wu , Microsoft Research, Redmond One Microsoft Way, Redmond, WA, USA
Yang Song , Microsoft Research, Redmond One Microsoft Way, Redmond, WA, USA
pp. 2606-2611

Evaluation of data quality of multisite electronic health record data for secondary analysis (Abstract)

Alicia L. Nobles , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, USA
Ketki Vilankar , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, USA
Hao Wu , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, USA
Laura E. Barnes , Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, USA
pp. 2612-2620

CrowdMD: Crowdsourcing-based approach for deduplication (Abstract)

Asma Abboura , RUR Laboratory, University of Oran 1, Oran, Algeria
Soror Sahri , Université Paris Descartes Sorbonne Paris Cité, Paris, France
Mourad Ouziri , Université Paris Descartes Sorbonne Paris Cité, Paris, France
Salima Benbernou , Université Paris Descartes Sorbonne Paris Cité, Paris, France
pp. 2621-2627

Distributed life cycle scheduling for cascaded data processing (Abstract)

Lavanya Sainik , Centre Of Excellence Charging, Billing & Mediation, Ericsson India Global Services Pvt. Ltd. Gurgaon, Haryana, 122003, India
pp. 2637-2643

Big data, big data quality problem (Abstract)

David Becker , The MITRE Corporation
Trish Dunn King , The MITRE Corporation
Bill McMullen , The MITRE Corporation
pp. 2644-2653

Data quality issues in big data (Abstract)

Dhana Rao , Department of Biology, East Carolina University, Greenville, North Carolina, USA
Venkat N Gudivada , Department of Computer Science, East Carolina University, Greenville, North Carolina, USA
Vijay V. Raghavan , Center for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, LA, USA
pp. 2654-2660

Machine learning for stress detection from ECG signals in automobile drivers (Abstract)

N. Keshan , Advanced Wireless Systems Research Center, State University of New York at Oswego, Oswego, NY 13126, USA
P. V. Parimi , Advanced Wireless Systems Research Center, State University of New York at Oswego, Oswego, NY 13126, USA
I. Bichindaritz , Computer Science Department, State University of New York at Oswego, Oswego, NY 13126, USA
pp. 2661-2669

Sequential pattern mining of electronic healthcare reimbursement claims: Experiences and challenges in uncovering how patients are treated by physicians (Abstract)

Kunal Malhotra , College of Computing, Georgia Institute of Technology, Atlanta, Georgia, USA
Tanner C. Hobson , Computational Science and Engineering, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Silvia Valkova , IMS Government Solutions, Plymouth Meeting, Pennsylvania, USA
Laura L. Pullum , Computational Science and Engineering, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Arvind Ramanathan , Computational Science and Engineering, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
pp. 2670-2679

SQL-like big data environments: Case study in clinical trial analytics (Abstract)

Akshay Grover , Computer Science and Electrical Engineering, University of Maryland, Baltimore County
Jay Gholap , Information Systems, University of Maryland, Baltimore County
Vandana P. Janeja , Information Systems, University of Maryland, Baltimore County
Yelena Yesha , Computer Science and Electrical Engineering, University of Maryland, Baltimore County
Raghu Chintalapati , Ekagra Software Technologies
Harsh Marwaha , Ekagra Software Technologies
Kunal Modi , Ekagra Software Technologies
pp. 2680-2689

Exploring spatio-temporal-theme correlation between physical and social streaming data for event detection and pattern interpretation from heterogeneous sensors (Abstract)

Minh-Son Dao , National Institute of Information and Communications Technology, 3-5 Hirakidai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan
Koji Zettsu , National Institute of Information and Communications Technology, 3-5 Hirakidai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan
Siripen Pongpaichet , University of California, Irvine, USA
Laleh Jalali , University of California, Irvine, USA
Ramesh Jain , University of California, Irvine, USA
pp. 2690-2699

Microdata analysis of the accommodation survey in Japanese tourism statistics (Abstract)

Aki-Hiro Sato , Graduate School of Informatics, Kyoto University, Kyoto, Japan
pp. 2700-2708

Detecting rumor patterns in streaming social media (Abstract)

Shihan Wang , Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama, Japan
Takao Terano , Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama, Japan
pp. 2709-2715

A collaborative framework for annotating energy datasets (Abstract)

Hong-An Cao , Department of Computer Science, ETH Zurich, Switzerland
Tri Kurniawan Wijaya , Department of Computer Science, EPFL, Switzerland
Karl Aberer , Department of Computer Science, EPFL, Switzerland
Nuno Nunes , Madeira Interactive Technologies Institute, Funchal, Portugal
pp. 2716-2725

The relation between firm age distributions and the decay rate of firm activities in the United States and Japan (Abstract)

Atushi Ishikawa , Department of Business Information, Kanazawa Gakuin University, Kanazawa, Japan
Shouji Fujimoto , Department of Business Information, Kanazawa Gakuin University, Kanazawa, Japan
Takayuki Mizuno , National Institute of Informatics, Department of Informatics, The Graduate University for Advanced Studies, PRESTO, Japan Science and Technology Agency, Tokyo, Japan
Tsutomu Watanabe , Graduate School of Economics, University of Tokyo, Tokyo, Japan
pp. 2726-2731

An epidemic simulation with a delayed stochastic SIR model based on international socioeconomic-technological databases (Abstract)

Aki-Hiro Sato , Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Honmachi, Yoshida, Sakyo-ku, Kyoto 606-8501 Japan
Isao Ito , Graduate School of Medicine, Kyoto University, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
Hidefumi Sawai , Universal Communication Research Institute, National Institute of Information and Communications Technology, 910 North Building, 3-1 Ofuka-cho, Kita-ku, Osaka 530-0011 Japan
Kentaro Iwata , Graduate School of Medicine, Kobe University, 7-5-2 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
pp. 2732-2741

A spatio-temporal multimedia big data framework for a large crowd (Abstract)

Bilal Sadiq , KACST GIS Technology Innovation Center, Umm Al Qura University, Saudi Arabia
Faizan Ur Rehman , KACST GIS Technology Innovation Center, Umm Al Qura University, Saudi Arabia
Akhlaq Ahmad , College of Engineering and Islamic Architecture, Umm Al Qura University, Saudi Arabia
Md. Abdur Rahman , College of Computer and Information Systems, Umm Al Qura University, Saudi Arabia
Sohaib Ghani , KACST GIS Technology Innovation Center, Umm Al Qura University, Saudi Arabia
Abdullah Murad , Innovation and Entrepreneurship Institute, Umm Al Qura University, Saudi Arabia
Saleh Basalamah , KACST GIS Technology Innovation Center, Umm Al Qura University, Saudi Arabia
Ahmad Lbath , Dept. of Computer Science, LIG, University of Grenoble Alpes, France
pp. 2742-2751

Directional decision lists (Abstract)

Marc Goessling , Department of Statistics, University of Chicago, Chicago, IL 60637
Shan Kang , Robert Bosch LLC, Research and Technology Center North America, Palo Alto, CA 94304
pp. 2762-2766

Analysis of key operation performance data in manufacturing systems (Abstract)

Ningxuan Kang , Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI 53706, USA
Cong Zhao , Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI 53706, USA
Jingshan Li , Department of Industrial and Systems Engineering, University of Wisconsin, Madison, WI 53706, USA
John A. Horst , National Institute of Standards and Technology, Gaithersburg, MD 20899-8230, USA
pp. 2767-2770

Outlier detection for large scale manufacturing processes (Abstract)

Abhinav Jauhri , Carnegie Mellon University, Pittsburgh, PA
Bradley McDanel , Harvard University, Cambridge, MA
Chris Connor , Intel Corporation, Hillsboro, OR
pp. 2771-2774

Fast detection of material deformation through structural dissimilarity (Abstract)

Daniela Ushizima , CRD, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Talita Perciano , CRD, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Dilworth Parkinson , Advanced Light Source, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
pp. 2775-2781

Data analytics and uncertainty quantification for energy prediction in manufacturing (Abstract)

Ronay Ak , Department of Energy, SUPELEC Gif-Sur-Yvette, 91192, France
Raunak Bhinge , Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
pp. 2782-2784

Lambda architecture for cost-effective batch and speed big data processing (Abstract)

Mariam Kiran , School of Computer Science, University of Bradford
Peter Murphy , Energy Sciences Network (ESnet)
Inder Monga , Energy Sciences Network (ESnet)
Jon Dugan , Energy Sciences Network (ESnet)
Sartaj Singh Baveja , Netaji Subhas Institute of Technology, New Delhi
pp. 2785-2792

Network-aware resource management for scalable data analytics frameworks (Abstract)

Thomas Renner , Technische Universität Berlin, Germany
Lauritz Thamsen , Technische Universität Berlin, Germany
Odej Kao , Technische Universität Berlin, Germany
pp. 2793-2800

Preparing, storing, and distributing multi-dimensional scientific data (Abstract)

Ranjeet Devarakonda , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Yaxing Wei , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Michele Thornton , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Ben Mayer , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Peter Thornton , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Bob Cook , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
pp. 2811-2813

Use of a metadata documentation and search tool for large data volumes: The NGEE Arctic example (Abstract)

Ranjeet Devarakonda , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Les Hook , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Terri Killeffer , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Misha Krassovski , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Tom Boden , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
Stan Wullschleger , Climate Change Science Institute, Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN USA
pp. 2814-2816

Data optimised computing for heterogeneous big data computing applications (Abstract)

Erica Yang , Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Derek Ross , Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Srikanth Nagella , Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Martin Turner , Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Winfried Kockelmann , ISIS Neutron Facility Rutherford Appleton Laboratory Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Genoveva Burca , ISIS Neutron Facility Rutherford Appleton Laboratory Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
Federico Montesino Pouzols , ISIS Neutron Facility Rutherford Appleton Laboratory Science and Technology Facilities Council Harwell Science and Innovation Campus, Oxfordshire UK
pp. 2817-2819

Top-k computations in MapReduce: A case study on recommendations (Abstract)

Vasilis Efthymiou , ICS-FORTH & Univ. of Crete, Greece
Kostas Stefanidis , ICS-FORTH, Greece
Eirini Ntoutsi , LMU, Germany
pp. 2820-2822

A LSTM-based method for stock returns prediction: A case study of China stock market (Abstract)

Kai Chen , Shanghai Jiaotong University, Shanghai, China
Yi Zhou , Shanghai Jiaotong University, Shanghai, China
Fangyan Dai , MD Anderson Cancer Center, Houston, USA
pp. 2823-2824

Predicting various types of user attributes in Twitter by using personalized pagerank (Abstract)

Kazuya Uesato , Waseda University, Tokyo, Japan
Hiroki Asai , Waseda University, Tokyo, Japan
Hayato Yamana , Waseda University, Tokyo, Japan
pp. 2825-2827

Large-scale learning with AdaGrad on Spark (Abstract)

Asmelash Teka Hadgu , L3S Research Center, Hannover, Germany
Aastha Nigam , University of Notre Dame, Indiana, USA
Ernesto Diaz-Aviles , IBM Research, Dublin, Ireland
pp. 2828-2830

Parallelizing natural language techniques for knowledge extraction from cloud service level agreements (Abstract)

Sudip Mittal , University of Maryland, Baltimore County, Baltimore, MD 21250, USA
Karuna P. Joshi , University of Maryland, Baltimore County, Baltimore, MD 21250, USA
Claudia Pearce , University of Maryland, Baltimore County, Baltimore, MD 21250, USA
Anupam Joshi , University of Maryland, Baltimore County, Baltimore, MD 21250, USA
pp. 2831-2833

Gradient-based signatures for big multimedia data (Abstract)

Christian Beecks , Data Management and Exploration Group, RWTH Aachen University, Germany
Merih Seran Uysal , Data Management and Exploration Group, RWTH Aachen University, Germany
Thomas Seidl , Data Management and Exploration Group, RWTH Aachen University, Germany
pp. 2834-2835

Indexing media storms on Flink (Abstract)

Dimitrios Rafailidis , Department of Informatics, Aristotle University of Thessaloniki
Stefanos Antaris , Department of Computer Science, University of Cyprus
pp. 2836-2838

Scaling NLP algorithms to meet high demand (Abstract)

Connor Stokes , Raytheon BBN Technologies Corp., 10 Moulton St. Cambridge, MA, USA
Anoop Kumar , Raytheon BBN Technologies Corp., 10 Moulton St. Cambridge, MA, USA
Frederick Choi , Raytheon BBN Technologies Corp., 10 Moulton St. Cambridge, MA, USA
Ralph Weischedel , Raytheon BBN Technologies Corp., 10 Moulton St. Cambridge, MA, USA
pp. 2839

The NIST data science evaluation series: Part of the NIST information access division data science initiative (Abstract)

Bonnie J. Dorr , National Institute of Standards and Technology
Craig S. Greenberg , National Institute of Standards and Technology
Peter Fontana , National Institute of Standards and Technology
Mark Przybocki , National Institute of Standards and Technology
Marion Le Bras , National Institute of Standards and Technology
Cathryn Ploehn , National Institute of Standards and Technology
Oleg Aulov , National Institute of Standards and Technology
Wo Chang , National Institute of Standards and Technology
pp. 2840-2842

Flexible ingest framework: A scalable architecture for dynamic routing through composable pipelines (Abstract)

Alexei Samoylov , Informatics Laboratory, Lockheed Martin Advanced Technology Laboratories, 1825 Barrett Lakes Blvd NW, Kennesaw, GA, USA
Jason Schlachter , Informatics Laboratory, Lockheed Martin Advanced Technology Laboratories, 1825 Barrett Lakes Blvd NW, Kennesaw, GA, USA
pp. 2843-2845

A scalable solution for group feature selection (Abstract)

Priya Govindan , Rutgers University
Ruobing Chen , Robert Bosch LLC
Katya Scheinberg , Lehigh University
Soundararajan Srinivasan , Robert Bosch LLC
pp. 2846-2848

Performance of graph reconstruction method for large-scale web graph analysis (Abstract)

Ryota Takei , Future University Hakodate, Hokkaido, Japan
Ayahiko Niimi , Future University Hakodate, Hokkaido, Japan
pp. 2852-2854

Low latency analytics for streaming traffic data with Apache Spark (Abstract)

Altti Ilari Maarala , Department of Computer Science and Engineering, University of Oulu, Finland
Mika Rautiainen , Department of Computer Science and Engineering, University of Oulu, Finland
Miikka Salmi , Department of Computer Science and Engineering, University of Oulu, Finland
Susanna Pirttikangas , Department of Computer Science and Engineering, University of Oulu, Finland
Jukka Riekki , Department of Computer Science and Engineering, University of Oulu, Finland
pp. 2855-2858

How to make money from your information and keep your privacy (Abstract)

Divya Rao , School of Computer Engineering, Nanyang Technological University, Singapore
Wee Keong Ng , School of Computer Engineering, Nanyang Technological University, Singapore
pp. 2859-2861

Scheduling of Big Data application workflows in cloud and inter-cloud environments (Abstract)

B. Kezia Rani , Department of Computer Science, Jawaharlal Nehru Technological University, Hyderabad, India
A. Vinaya Babu , Department of Computer Science, Jawaharlal Nehru Technological University, Hyderabad, India
pp. 2862-2864

Patient-like-mine: A real time, visual analytics tool for clinical decision support (Abstract)

Peter Li , Department of Surgery, Mayo Clinic, Rochester, MN USA
Simon N. Yates , Department of Surgery, Mayo Clinic, Rochester, MN USA
Jenna K. Lovely , Department of Surgery, Mayo Clinic, Rochester, MN USA
David W. Larson , Department of Surgery, Mayo Clinic, Rochester, MN USA
pp. 2865-2867

A pricing mechanism using social media and web data to infer dynamic consumer valuations (Abstract)

Samuel D. Johnson , Computer Science Dept., University of California Davis, Davis, California, USA
Kang-Yu Ni , Information and Systems Sciences Lab., HRL Laboratories, Malibu, California, USA
pp. 2868-2870

Efficient keyword search on graphs using MapReduce (Abstract)

Yifan Hao , New Mexico State University, Las Cruces, NM
Huiping Cao , New Mexico State University, Las Cruces, NM
Yan Qi , Turn Inc., San Francisco, CA
Chuan Hu , New Mexico State University, Las Cruces, NM
Sukumar Brahma , New Mexico State University, Las Cruces, NM
Jingyu Han , Nanjing University of Posts and Telecommunications, Nanjing, China
pp. 2871-2873

Non-blocking one-phase commit made possible for distributed transactions over replicated data (Abstract)

Yuqing Zhu , Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
pp. 2874-2876

A large scale examination of vehicle recorder data to understand relationship between drivers' behaviors and their past driving histories (Abstract)

Daisaku Yokoyama , The University of Tokyo, Meguro-ku Komaba 4-6-1, Tokyo, Japan, 153-8505
Masashi Toyoda , The University of Tokyo, Meguro-ku Komaba 4-6-1, Tokyo, Japan, 153-8505
pp. 2877-2879

Online pattern mining for high-dimensional data streams (Abstract)

Yoshitaka Yamamoto , University of Yamanashi, 4-3-11 Takeda, Kofu-city, Japan
Koji Iwanuma , University of Yamanashi, 4-3-11 Takeda, Kofu-city, Japan
pp. 2880-2882

Modeling the learning behaviors of massive open online courses (Abstract)

Zhenhui Liu , Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jingjing He , Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Yufei Xue , Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Zhenzhong Huang , Institute of Education, Tsinghua University, 100084, Beijing, China
Manli Li , Institute of Education, Tsinghua University, 100084, Beijing, China
Zhihui Du , Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
pp. 2883-2885

Data confidentiality challenges in big data applications (Abstract)

Jian Yin , Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, Richland, WA
Dongfang Zhao , Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, Richland, WA
pp. 2886-2888

Factorization machines with follow-the-regularized-leader for CTR prediction in display advertising (Abstract)

Anh-Phuong Ta , Zebestof company - CCM Benchmark group, Paris, France
pp. 2889-2891

Taxi trip time prediction using similar trips and road network data (Abstract)

Aakash Deep Singh , Department of Mathematics and Computing, Indian Institute of Technology, Delhi
Wei Wu , Data Analytics Department, Institute for Infocomm Research
Shili Xiang , Data Analytics Department, Institute for Infocomm Research
Shonali Krishnaswamy , Data Analytics Department, Institute for Infocomm Research
pp. 2892-2894

Using Word2Vec to process big text data (Abstract)

Long Ma , Computer Science Department, Georgia State University, Atlanta, Georgia
Yanqing Zhang , Computer Science Department, Georgia State University, Atlanta, Georgia
pp. 2895-2897

MHT: A light-weight scalable zero-hop MPI enabled distributed key-value store (Abstract)

Xiaobing Zhou , Hortonworks
Tonglin Li , Illinois Institute of Technology
Ke Wang , Intel
Dongfang Zhao , Illinois Institute of Technology
Iman Sadooghi , Illinois Institute of Technology
Ioan Raicu , Illinois Institute of Technology
pp. 2901-2903

Big Data: Cloud computing in genomics applications (Abstract)

Hangu Yeo , Department of Next Generation Systems, IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
Catherine H. Crawford , Department of Next Generation Systems, IBM T. J. Watson Research Center, Yorktown Heights, NY U.S.A.
pp. 2904-2906

Integrating semantic knowledge into Tag-LDA model through cloud model (Abstract)

Maoyuan Zhang , Department of Computer Science and Technology, Central China Normal University, Wuhan, P.R. China
Fang Yuan , Department of Computer Science and Technology, Central China Normal University, Wuhan, P.R. China
Jianping Zhu , Department of Computer Science and Technology, Central China Normal University, Wuhan, P.R. China
pp. 2907-2909

A case study to apply mobile technology into individual's local community (Abstract)

Yunkai Liu , Computer and Information Science Department, Gannon University, Erie, PA, United States
Christopher Magno , Criminal Justice Department, Gannon University, Erie, PA, United States
pp. 2910-2912

Clairvoyant-push: A real-time news personalized push notifier using topic modeling and social scoring for enhanced reader engagement (Abstract)

Biying Tan , DataSpark Pte Ltd Singapore Telecommunications Limited, Singapore
Kajanan Sangaralingam , DataSpark Pte Ltd Singapore Telecommunications Limited, Singapore
Vivek Kumar Singh , Information Systems Decision Sciences, Muma College of Business, University of South Florida, Florida, USA
Chandra Sekhar Saripaka , DataSpark Pte Ltd Singapore Telecommunications Limited, Singapore
Giuseppe Manai , DataSpark Pte Ltd Singapore Telecommunications Limited, Singapore
pp. 2913-2915

Using probabilistic approach to joint clustering and statistical inference: Analytics for big investment data (Abstract)

Hua Fang , University of Massachusetts Medical School, Worcester, MA, USA
Honggang Wang , University of Massachusetts Dartmouth, North Dartmouth, MA, USA
Chonggang Wang , InterDigital Communications, Princeton, NJ, USA
Mahmoud Daneshmand , Stevens Institute of Technology, Holmdel, NJ, USA
pp. 2916-2918

Towards a subgraph/supergraph cached query-graph index (Abstract)

Jing Wang , School of Computing Science, University of Glasgow, Glasgow, UK
Nikos Ntarmos , School of Computing Science, University of Glasgow, Glasgow, UK
Peter Triantafillou , School of Computing Science, University of Glasgow, Glasgow, UK
pp. 2919-2921

30 Day hospital readmission analysis (Abstract)

Ratna Madhuri Maddipatla , Data Science and Business Analytics, University of North Carolina, Charlotte, Charlotte, North Carolina
Mirsad Hadzikadic , Data Science and Business Analytics, University of North Carolina, Charlotte, Charlotte, North Carolina
Dipti Patel Misra , Health Informatics, University of North Carolina, Charlotte, Charlotte, North Carolina
Lixia Yao , Software and Information Systems, University of North Carolina, Charlotte, Charlotte, North Carolina
pp. 2922-2924

Using pairwise difference features to measure temporal changes in the microbial ecology (Abstract)

M. Yazdani , California Institute for Telecommunications and Information Technology, University of California San Diego, USA
L. Smarr , Department of Computer Science and Engineering, University of California San Diego, USA
pp. 2925-2927

A timeline visualization system for road traffic big data (Abstract)

Ardi Imawan , Department of Big Data, Pusan National University, Busan, South Korea
Joonho Kwon , Department of Big Data, Pusan National University, Busan, South Korea
pp. 2928-2929

Text retrieval based on the feature conversion of vector space (Abstract)

Maoyuan Zhang , School of Computer, Central China Normal University, Wuhan, China
Jianping Zhu , School of Computer, Central China Normal University, Wuhan, China
Lijun Hua , School of Computer, Central China Normal University, Wuhan, China
Fang Yuan , School of Computer, Central China Normal University, Wuhan, China
pp. 2933-2935

Big data gathering and mining pipelines for CRM using open-source (Abstract)

Kang Li , Search and Data Mining, Groupon Palo Alto, CA 94306
Vinay Deolalikar , Search and Data Mining, Groupon Palo Alto, CA 94306
Neeraj Pradhan , Search and Data Mining, Groupon Palo Alto, CA 94306
pp. 2936-2938

Unified framework for clinical data analytics (U-CDA) (Abstract)

Jay Gholap , Information Systems, University of Maryland, Baltimore County, USA
Vandana P. Janeja , Information Systems, University of Maryland, Baltimore County, USA
Yelena Yesha , Computer Science and Electrical Engineering, University of Maryland, Baltimore County, USA
pp. 2939-2941

A novel initialization method for particle swarm optimization-based FCM in big biomedical data (Abstract)

Chanpaul J. Wang , Department of Quantitative Health Science, University of Massachusetts Medical School, Worcester, USA
Hua Fang , Department of Quantitative Health Science, University of Massachusetts Medical School, Worcester, USA
Chonggang Wang , InterDigital Communications, Princeton, New Jersey, USA
Mahmoud Daneshmand , Stevens Institute of Technology, Hoboken, New Jersey, USA
Honggang Wang , Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth, North Dartmouth, MA, USA
pp. 2942-2944

Algorithmic content generation for products (Abstract)

Chandra Khatri , eBay Inc., 2065 Hamilton Av., San Jose, CA
Suman Voleti , eBay Inc., 2065 Hamilton Av., San Jose, CA
Sathish Veeraraghavan , eBay Inc., 2065 Hamilton Av., San Jose, CA
Nish Parikh , eBay Inc., 2065 Hamilton Av., San Jose, CA
Atiq Islam , eBay Inc., 2065 Hamilton Av., San Jose, CA
Shifa Mahmood , eBay Inc., 2065 Hamilton Av., San Jose, CA
Neeraj Garg , eBay Inc., 2065 Hamilton Av., San Jose, CA
Vivek Singh , eBay Inc., 2065 Hamilton Av., San Jose, CA
pp. 2945-2947

Hotspots of news articles: Joint mining of news text & social media to discover controversial points in news (Abstract)

Ismini Lourentzou , Department of Computer Science, University of Illinois at Urbana-Champaign
Graham Dyer , Department of Computer Science, University of Illinois at Urbana-Champaign
Abhishek Sharma , Department of Computer Science, University of Illinois at Urbana-Champaign
ChengXiang Zhai , Department of Computer Science, University of Illinois at Urbana-Champaign
pp. 2948-2950

Improving the quality of semantic relationships extracted from massive user behavioral data (Abstract)

Khalifeh AlJadda , CareerBuilder, Norcross, GA, USA
Mohammed Korayem , CareerBuilder, Norcross, GA, USA
Trey Grainger , CareerBuilder, Norcross, GA, USA
pp. 2951-2953

Analysis of star ratings in consumer reviews: A case study of Yelp (Abstract)

Maruthi Prithivirajan , School of Information Systems, Singapore Management University, Singapore
Vivian Lai , School of Information Systems, Singapore Management University, Singapore
Kyong Jin Shim , School of Information Systems, Singapore Management University, Singapore
Koo Ping Shung , School of Information Systems, Singapore Management University, Singapore
pp. 2954-2956

From stars to patients: Lessons from space science and astrophysics for health care informatics (Abstract)

S. G. Djorgovski , California Institute of Technology, Pasadena, CA 91125, USA
A. A. Mahabal , California Institute of Technology, Pasadena, CA 91125, USA
D. J. Crichton , Jet Propulsion Laboratory, Pasadena, CA 91109, USA
B. Chaudhry , TupleHealth Washington, DC 20008, USA
pp. 2957-2959

Author index (PDF)

pp. 1-22