Delegation-Based I/O Mechanism for High Performance Computing Systems
Found in: IEEE Transactions on Parallel and Distributed Systems
By Arifa Nisar, Wei-Keng Liao, Alok Choudhary
Issue Date:February 2012
pp. 271-279
Massively parallel applications often require periodic data checkpointing for program restart and post-run data analysis. Although high performance computing systems provide massive parallelism and computing power to fulfill the crucial requirements of the...
 
High Performance Data Mining Using R on Heterogeneous Platforms
Found in: Parallel and Distributed Processing Workshops and PhD Forum, 2011 IEEE International Symposium on
By Prabhat Kumar, Berkin Ozisikyilmaz, Wei-Keng Liao, Gokhan Memik, Alok Choudhary
Issue Date:May 2011
pp. 1720-1729
The exponential increase in the generation and collection of data has led us into a new era of data analysis and information extraction. Conventional systems based on general-purpose processors are unable to keep pace with the heavy computational requirement...
 
Enabling active storage on parallel I/O software stacks
Found in: Mass Storage Systems and Technologies, IEEE / NASA Goddard Conference on
By Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, Alok Choudhary
Issue Date:May 2010
pp. 1-12
As data sizes continue to increase, the concept of active storage is well fitted for many data analysis kernels. Nevertheless, while this concept has been investigated and deployed in a number of forms, enabling it from the parallel I/O software stack has ...
 
Social media evolution of the Egyptian revolution
Found in: Communications of the ACM
By Alok Choudhary, Diana Palsetia, Kathy Lee, Wei-Keng Liao, William Hendrix
Issue Date:May 2012
pp. 74-80
Twitter sentiment was revealed, along with popularity of Egypt-related subjects and tweeter influence on the 2011 revolution.
     
Multicollective I/O: A technique for exploiting inter-file access patterns
Found in: ACM Transactions on Storage (TOS)
By Alok Choudhary, Gokhan Memik, Mahmut T. Kandemir, Wei-Keng Liao
Issue Date:August 2006
pp. 349-369
The increasing gap between processor cycle times and access times to storage devices makes it necessary to use powerful optimizations. This is especially true for applications in the parallel computing domain that frequently perform large amounts of file I...
     
Dynamic Alignment and Distribution of Irregularly Coupled Data Arrays for Scalable Parallelization of Particle-in-Cell Problems
Found in: Parallel Processing Symposium, International
By Wei-keng Liao, Chao-wei Ou, Sanjay Ranka
Issue Date:April 1996
pp. 57
Particle-in-cell (PIC) plasma simulation codes require two data arrays---particle array and field array---for storing the lists of particles and electromagnetic fields, respectively. In every iteration the two are updated based on the values of each other....
 
Scaling parallel I/O performance through I/O delegate and caching system
Found in: SC Conference
By Arifa Nisar, Wei-keng Liao, Alok Choudhary
Issue Date:November 2008
pp. 1-12
Increasingly complex scientific applications require massive parallelism to achieve the goals of fidelity and high computational performance. Such applications periodically offload checkpointing data to file system for post-processing and program resumptio...
 
AHPIOS: An MPI-Based Ad Hoc Parallel I/O System
Found in: Parallel and Distributed Systems, International Conference on
By Florin Isaila, Javier Garcia Blas, Jesus Carretero, Wei-keng Liao, Alok Choudhary
Issue Date:December 2008
pp. 253-260
This paper presents the design and implementation of a portable ad-hoc parallel I/O system (AHPIOS). AHPIOS virtualizes on-demand available distributed storage resources and allows the files to be striped over several storage devices. Additionally, the des...
 
Evaluating I/O characteristics and methods for storing structured scientific data
Found in: Parallel and Distributed Processing Symposium, International
By A. Ching, A. Choudhary, Wei-keng Liao, L. Ward, N. Pundit
Issue Date:April 2006
pp. 49
Many large-scale scientific simulations generate large, structured multi-dimensional datasets. Data is stored at various intervals on high performance I/O storage systems for checkpointing, post-processing, and visualization. Data storage is very I/O inten...
 
IOPin: Runtime Profiling of Parallel I/O in HPC Systems
Found in: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
By Seong Jo Kim, Seung Woo Son, Wei-keng Liao, Mahmut Kandemir, Rajeev Thakur, Alok Choudhary
Issue Date:November 2012
pp. 18-23
Many I/O- and data-intensive scientific applications use parallel I/O software to access files in high performance. On modern parallel machines, the I/O software consists of several layers, including high-level libraries such as Parallel netCDF...
 
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure
Found in: 2012 SC - International Conference for High Performance Computing, Networking, Storage and Analysis
By Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok Choudhary
Issue Date:November 2012
pp. 1-11
DBSCAN is a well-known density-based clustering algorithm capable of discovering arbitrary-shaped clusters and eliminating noise data. However, parallelization of DBSCAN is challenging as it exhibits an inherently sequential data access order. Moreover, exis...
 
On the path to sustainable, scalable, and energy-efficient data analytics: Challenges, promises, and future directions
Found in: 2012 International Green Computing Conference (IGCC)
By Sriram Lakshminarasimhan, Prabhat Kumar, Wei-keng Liao, Alok Choudhary, Vipin Kumar, Nagiza F. Samatova
Issue Date:June 2012
pp. 1-6
As scientific data is reaching exascale, scalable and energy-efficient data analytics is quickly becoming a top-notch priority. Yet, a sustainable solution to this problem is hampered by a number of technical challenges that get exacerbated with the emergi...
 
Supporting computational data model representation with high-performance I/O in parallel netCDF
Found in: High-Performance Computing, International Conference on
By Kui Gao, Chen Jin, Alok Choudhary, Wei-keng Liao
Issue Date:December 2011
pp. 1-10
Parallel computational scientific applications have been described by their computation and communication patterns. From a storage and I/O perspective, these applications can also be grouped into separate data models based on the way data is organized and ...
 
Community Dynamics and Analysis of Decadal Trends in Climate Data
Found in: Data Mining Workshops, International Conference on
By William Hendrix, Isaac K. Tetteh, Ankit Agrawal, Fredrick Semazzi, Wei-keng Liao, Alok Choudhary
Issue Date:December 2011
pp. 9-14
The application of complex networks to study complex phenomena, including the Internet, social networks, food networks, and others, has seen a growing interest in recent years. In particular, the use of complex networks and network theory to analyze the be...
 
SES: Sentiment Elicitation System for Social Media Data
Found in: Data Mining Workshops, International Conference on
By Kunpeng Zhang, Yu Cheng, Yusheng Xie, Daniel Honbo, Ankit Agrawal, Diana Palsetia, Kathy Lee, Wei-keng Liao, Alok Choudhary
Issue Date:December 2011
pp. 129-136
Social media is becoming a major and popular technological platform that allows users to discuss and share information. Information is generated and managed through either computer or mobile devices by one person and consumed by many other persons. Most of...
 
Learning to Group Web Text Incorporating Prior Information
Found in: Data Mining Workshops, International Conference on
By Yu Cheng, Kunpeng Zhang, Yusheng Xie, Ankit Agrawal, Wei-keng Liao, Alok Choudhary
Issue Date:December 2011
pp. 212-219
Clustering similar items for web text has become increasingly important in many Web and Information Retrieval applications. For several kinds of web text data, it is much easier to obtain some external information other than textual features which can be u...
 
Design and Evaluation of MPI File Domain Partitioning Methods under Extent-Based File Locking Protocol
Found in: IEEE Transactions on Parallel and Distributed Systems
By Wei-keng Liao
Issue Date:February 2011
pp. 260-272
MPI collective I/O has been an effective method for parallel shared-file access and maintaining the canonical orders of structured data in files. Its implementation commonly uses a two-phase I/O strategy that partitions a file into disjoint file domains, a...
 
pFANGS: Parallel high speed sequence mapping for Next Generation 454-roche Sequencing reads
Found in: Parallel and Distributed Processing Workshops and PhD Forum, 2011 IEEE International Symposium on
By Sanchit Misra, Ramanathan Narayanan, Wei-keng Liao, Alok Choudhary, Simon Lin
Issue Date:April 2010
pp. 1-8
Millions of DNA sequences (reads) are generated by Next Generation Sequencing machines every day. There is a need for high performance algorithms to map these sequences to the reference genome to identify single nucleotide polymorphisms or rare transcripts ...
 
Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols
Found in: SC Conference
By Wei-keng Liao, Alok Choudhary
Issue Date:November 2008
pp. 1-12
Collective I/O, such as that provided in MPI-IO, enables process collaboration among a group of processes for greater I/O parallelism. Its implementation involves file domain partitioning, and having the right partitioning is a key to achieving high-perfor...
 
Using MPI file caching to improve parallel write performance for large-scale scientific applications
Found in: 2007 SC - International conference for High Performance Computing, Networking, Storage and Analysis
By Wei-keng Liao, Avery Ching, Kenin Coloma, Arifa Nisar, Alok Choudhary, Jacqueline Chen, Ramanan Sankaran, Scott Klasky
Issue Date:November 2007
pp. 1-11
Typical large-scale scientific applications periodically write checkpoint files to save the computational state throughout execution. Existing parallel file systems improve such write-only I/O patterns through the use of client-side file caching and write-...
 
Noncontiguous locking techniques for parallel file systems
Found in: SC Conference
By Avery Ching, Wei-keng Liao, Alok Choudhary, Robert Ross, Lee Ward
Issue Date:November 2007
pp. 1-12
Many parallel scientific applications use high-level I/O APIs that offer atomic I/O capabilities. Atomic I/O in current parallel file systems is often slow when multiple processes simultaneously access interleaved, shared files. Current atomic I/O solution...
 
Improving MPI Independent Write Performance Using A Two-Stage Write-Behind Buffering Method
Found in: Parallel and Distributed Processing Symposium, International
By Wei-keng Liao, Avery Ching, Kenin Coloma, Alok Choudhary, Mahmut Kandemir
Issue Date:March 2007
pp. 295
Many large-scale production applications often have very long execution times and require periodic data checkpoints in order to save the state of the computation for program restart and/or tracing application progress. These write-only operations often d...
 
An Implementation and Evaluation of Client-Side File Caching for MPI-IO
Found in: Parallel and Distributed Processing Symposium, International
By Wei-keng Liao, Avery Ching, Kenin Coloma, Alok Choudhary, Lee Ward
Issue Date:March 2007
pp. 49
Client-side file caching has long been recognized as a file system enhancement to reduce the amount of data transfer between application processes and I/O servers. However, caching also introduces cache coherence problems when a file is simultaneously acce...
 
Scalable Design and Implementations for MPI Parallel Overlapping I/O
Found in: IEEE Transactions on Parallel and Distributed Systems
By Wei-keng Liao, Kenin Coloma, Alok Choudhary, Lee Ward, Eric Russell, Neil Pundit
Issue Date:November 2006
pp. 1264-1276
We investigate the Message Passing Interface Input/Output (MPI I/O) implementation issues for two overlapping access patterns: the overlaps among processes within a single I/O operation and the overlaps across a sequenc...
 
Design and Evaluation of Database Layouts for MEMS-Based Storage Systems
Found in: Database Engineering and Applications Symposium, International
By Jayaprakash Pisharath, Wei-keng Liao, Alok Choudhary
Issue Date:July 2005
pp. 263-272
MEMS-based storage systems have recently generated significant interest due to their potential to be faster and more efficient than disks, while providing the non-volatility property. Designing data layouts for these devices is a challenging, important and...
 
Collective caching: application-aware client-side file caching
Found in: High-Performance Distributed Computing, International Symposium on
By Wei-keng Liao, K. Coloma, A. Choudhary, L. Ward, E. Russell, S. Tideman
Issue Date:July 2005
pp. 81-90
Parallel file subsystems in today's high-performance computers adopt many I/O optimization strategies that were designed for distributed systems. These strategies, for instance client-side file caching, treat each I/O request process independently, due to ...
 
Scalable High-level Caching for Parallel I/O
Found in: Parallel and Distributed Processing Symposium, International
By Kenin Coloma, Alok Choudhary, Wei-keng Liao, Lee Ward, Eric Russell, Neil Pundit
Issue Date:April 2004
pp. 96b
In order for I/O systems to achieve high performance in a parallel environment, they must either sacrifice client-side file caching, or keep caching and deal with complex coherency issues. The most common technique for dealing with cache coherency...
 
Processor-Embedded Distributed MEMS-Based Storage Systems for High-Performance I/O
Found in: Parallel and Distributed Processing Symposium, International
By Steve C. Chiu, Wei-keng Liao, Alok N. Choudhary
Issue Date:April 2004
pp. 91b
Built upon new data organization and access characteristics, MEMS-based storage devices have come under consideration as an alternative to disks for large data-intensive applications. While not already in commercial production, MEMS-based storage devices h...
 
A High-Performance Application Data Environment for Large-Scale Scientific Computations
Found in: IEEE Transactions on Parallel and Distributed Systems
By Xiaohui Shen, Wei-keng Liao, Alok Choudhary, Gokhan Memik, Mahmut Kandemir
Issue Date:December 2003
pp. 1262-1274
Effective high-level data management is becoming an important issue with more and more scientific applications manipulating huge amounts of secondary-storage and tertiary-storage data using parallel processors. A major ...
 
Efficient Structured Data Access in Parallel File Systems
Found in: Cluster Computing, IEEE International Conference on
By Avery Ching, Alok Choudhary, Wei-keng Liao, Robert Ross, William Gropp
Issue Date:December 2003
pp. 326
Parallel scientific applications store and retrieve very large, structured datasets. Directly supporting these structured accesses is an important step in providing high-performance I/O solutions for these applications. High-level interfaces such ...
 
Parallel netCDF: A High-Performance Scientific I/O Interface
Found in: SC Conference
By Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham, Andrew Siegel, Brad Gallagher, Michael Zingale
Issue Date:November 2003
pp. 39
Dataset storage, exchange, and access play a critical role in scientific applications. For such purposes netCDF serves as a portable, efficient file format and programming interface, which is popular in numerous scientific application domains. However, the...
 
Scalable Implementations of MPI Atomicity for Concurrent Overlapping I/O
Found in: Parallel Processing, International Conference on
By Wei-keng Liao, Alok Choudhary, Kenin Coloma, George K. Thiruvathukal, Lee Ward, Eric Russell, Neil Pundit
Issue Date:October 2003
pp. 239
For concurrent I/O operations, atomicity defines the results in the overlapping file regions simultaneously read/written by requesting processes. Atomicity has been well studied at the file system level, such as POSIX standard. In this paper, we investigat...
 
Noncontiguous I/O Accesses Through MPI-IO
Found in: Cluster Computing and the Grid, IEEE International Symposium on
By Avery Ching, Alok Choudhary, Kenin Coloma, Wei-keng Liao, Robert Ross, William Gropp
Issue Date:May 2003
pp. 104
I/O performance remains a weakness of parallel computing systems today. While this weakness is partly attributed to rapid advances in other system components, I/O interfaces available to programmers and the I/O methods supported by file systems have tradit...
 
Design and Evaluation of a Parallel HOP Clustering Algorithm for Cosmological Simulation
Found in: Parallel and Distributed Processing Symposium, International
By Ying Liu, Wei-keng Liao, Alok Choudhary
Issue Date:April 2003
pp. 82a
Clustering, or unsupervised classification, has many uses in fields that depend on grouping results from large amounts of data, an example being the N-body cosmological simulation in astrophysics. In this paper, we study a particular clustering algorithm us...
 
Noncontiguous I/O through PVFS
Found in: Cluster Computing, IEEE International Conference on
By Avery Ching, Alok Choudhary, Wei-keng Liao, Rob Ross, William Gropp
Issue Date:September 2002
pp. 405
With the tremendous advances in processor and memory technology, I/O has risen to become the bottleneck in high-performance computing for many applications. The development of parallel file systems has helped to ease the performance gap, but I/O still rema...
 
I/O Analysis and Optimization for an AMR Cosmology Application
Found in: Cluster Computing, IEEE International Conference on
By Jianwei Li, Wei-keng Liao, Alok Choudhary, Valerie Taylor
Issue Date:September 2002
pp. 119
In this paper, we investigate the data access patterns and file I/O behaviors of a production cosmology application that uses the adaptive mesh refinement (AMR) technique for its domain decomposition. This application was originally developed using Hierarc...
 
An Integrated Graphical User Interface for High Performance Distributed Computing
Found in: Database Engineering and Applications Symposium, International
By Xiaohui Shen, Wei-keng Liao, Alok Choudhary
Issue Date:July 2001
pp. 0237
It is very common that modern large-scale scientific applications employ multiple compute and storage resources in a heterogeneously distributed environment. Working effectively and efficiently in such an environment is one of the major concerns for ...
 
Design and Evaluation of I/O Strategies for Parallel Pipelined STAP Applications
Found in: Parallel and Distributed Processing Symposium, International
By Wei-keng Liao, Alok Choudhary, Donald Weiner, Pramod Varshney
Issue Date:May 2000
pp. 655
This paper presents experimental results for a parallel pipeline STAP system with I/O task implementation using the parallel file systems on the Intel Paragon and the IBM SP. In our previous work, a parallel pipeline model was designed for radar signal pro...
 
Multi-Threaded Design and Implementation of Parallel Pipelined STAP on Parallel Computers with SMP Nodes
Found in: Parallel Processing Symposium, International
By Wei-keng Liao, Alok Choudhary, Donald Weiner, Pramod Varshney
Issue Date:April 1999
pp. 448
This paper presents performance results for the multi-threaded design and implementation of a parallel pipelined Space-Time Adaptive Processing (STAP) algorithm on parallel computers with Symmetrical Multiple Processor (SMP) nodes. In particular, the paper...
 
Efficient pairwise statistical significance estimation for local sequence alignment using GPU
Found in: 2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
By Yuhong Zhang, Sanchit Misra, Daniel Honbo, Ankit Agrawal, Wei-keng Liao, Alok Choudhary
Issue Date:February 2011
pp. 226-231
Pairwise statistical significance has been found to be quite accurate in identifying related sequences (homologs), which is a key step in numerous bioinformatics applications. However, it is computationally and data intensive, particularly for a large amount...
 
Dynamic file striping and data layout transformation on parallel system with fluctuating I/O workload
Found in: 2013 IEEE International Conference on Cluster Computing (CLUSTER)
By Seung Woo Son, Saba Sehrish, Wei-keng Liao, Ron Oldfield, Alok Choudhary
Issue Date:September 2013
pp. 1-8
As the number of compute cores on modern parallel machines increases to more than hundreds of thousands, scalable and consistent I/O performance is becoming hard to obtain due to fluctuating file system performance. This fluctuation is often caused by rebu...
   
Parallel hierarchical clustering on shared memory platforms
Found in: 2012 19th International Conference on High Performance Computing (HiPC)
By William Hendrix, Md. Mostofa Ali Patwary, Ankit Agrawal, Wei-keng Liao, Alok Choudhary
Issue Date:December 2012
pp. 1-9
Hierarchical clustering has many advantages over traditional clustering algorithms like k-means, but it suffers from higher computational costs and a less obvious parallel structure. Thus, in order to scale this technique up to larger datasets, we present ...
   
Scalable parallel OPTICS data clustering using graph algorithmic techniques
Found in: Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis (SC '13)
By Alok Choudhary, Fredrik Manne, Wei-keng Liao, Ankit Agrawal, Diana Palsetia, Mostofa Ali Patwary
Issue Date:November 2013
pp. 1-12
OPTICS is a hierarchical density-based data clustering algorithm that discovers arbitrary-shaped clusters and eliminates noise using adjustable reachability distance thresholds. Parallelizing OPTICS is considered challenging as the algorithm exhibits a str...
     
Mining diabetes complication and treatment patterns for clinical decision support
Found in: Proceedings of the 22nd ACM international conference on Conference on information & knowledge management (CIKM '13)
By Ankit Agrawal, Jie Tang, Lu Liu, Yu Cheng, Alok Choudhary, Wei-keng Liao
Issue Date:October 2013
pp. 279-288
The fast development of hospital information systems (HIS) produces a large volume of electronic medical records, which provides a comprehensive source for exploratory analysis and statistics to support clinical decision-making. In this paper, we investiga...
     
Improving collective I/O performance by pipelining request aggregation and file access
Found in: Proceedings of the 20th European MPI Users' Group Meeting (EuroMPI '13)
By Alok Choudhary, Karen Schuchardt, Saba Sehrish, Seung Woo Son, Wei-keng Liao
Issue Date:September 2013
pp. 37-42
In this paper, we propose a multi-buffer pipelining approach to improve collective I/O performance by overlapping the dominant request aggregation phases with the I/O phase in the two-phase I/O implementation. Our pipelining method first divides the collec...
     
Sentiment identification by incorporating syntax, semantics and context information
Found in: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '12)
By Alok Choudhary, Ankit Agrawal, Daniel Honbo, Doug Downey, Kunpeng Zhang, Wei-keng Liao, Yu Cheng, Yusheng Xie
Issue Date:August 2012
pp. 1143-1144
This paper proposes a method based on conditional random fields to incorporate sentence structure (syntax and semantics) and context information to identify sentiments of sentences within a document. It also proposes and evaluates two different active lear...
     
Mining millions of reviews: a technique to rank products based on importance of reviews
Found in: Proceedings of the 13th International Conference on Electronic Commerce (ICEC '11)
By Alok Choudhary, Kunpeng Zhang, Wei-keng Liao, Yu Cheng
Issue Date:August 2011
pp. 1-8
As online shopping becomes increasingly more popular, many shopping web sites encourage existing customers to add reviews of products purchased. These reviews make an impact on the purchasing decisions of potential customers. At Amazon.com for instance, so...
     
Noncontiguous locking techniques for parallel file systems
Found in: Proceedings of the 2007 ACM/IEEE conference on Supercomputing (SC '07)
By Alok Choudhary, Avery Ching, Lee Ward, Robert Ross, Wei-keng Liao
Issue Date:November 2007
pp. 24-31
Many parallel scientific applications use high-level I/O APIs that offer atomic I/O capabilities. Atomic I/O in current parallel file systems is often slow when multiple processes simultaneously access interleaved, shared files. Current atomic I/O solution...
     
Using MPI file caching to improve parallel write performance for large-scale scientific applications
Found in: Proceedings of the 2007 ACM/IEEE conference on Supercomputing (SC '07)
By Alok Choudhary, Arifa Nisar, Avery Ching, Jacqueline Chen, Kenin Coloma, Ramanan Sankaran, Scott Klasky, Wei-keng Liao
Issue Date:November 2007
pp. 24-31
Typical large-scale scientific applications periodically write checkpoint files to save the computational state throughout execution. Existing parallel file systems improve such write-only I/O patterns through the use of client-side file caching and write-...
     