Composing Data Parallel Code for a SPARQL Graph Engine
Found in: 2013 International Conference on Social Computing (SocialCom)
By Vito Giovanni Castellana, Antonino Tumeo, Oreste Villa, David Haglin, John Feo
Issue Date:September 2013
pp. 691-699
The emergence of petascale triple stores has motivated the investigation of alternatives to traditional table-based relational methods. Since triple stores represent data as structured tuples, graphs are a natural data structure for encoding their informati...
 
Exploring hardware support for scaling irregular applications on multi-node multi-core architectures
Found in: 2013 IEEE 24th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
By Simone Secchi, Marco Ceriani, Antonino Tumeo, Oreste Villa, Gianluca Palermo, Luigi Raffo
Issue Date:June 2013
pp. 309-313
With the recent emergence of large-scale knowledge discovery, data mining and social network analysis, irregular applications have gained renewed interest. Cache-based architectures do not provide optimal performance with such workloads, mainly due to the low ...
 
Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping
Found in: 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
By Marco Ceriani, Gianluca Palermo, Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:April 2013
pp. 238
Knowledge discovery applications are an emerging class of irregular applications that exploit graph-based data structures, present poor locality and analyze very big data sets that require multi-node systems for processing. Current clusters, which exploit ...
 
Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer
Found in: IEEE Transactions on Parallel and Distributed Systems
By Oreste Villa, Antonino Tumeo, Simone Secchi, Joseph B. Manzano
Issue Date:December 2012
pp. 2266-2279
Irregular applications, such as data mining or graph-based computations, show unpredictable memory/network access patterns and control structures. Massively multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2, and XMT, appea...
 
A High Performance Computing Network and System Simulator for the Power Grid: NGNS^2
Found in: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
By Oreste Villa, Antonino Tumeo, Selim Ciraci, Jeff A. Daily, Jason C. Fuller
Issue Date:November 2012
pp. 313-322
Designing and planning next generation power grid systems composed of large power distribution networks, monitoring and control networks, autonomous generators and consumers of power requires advanced simulation infrastructures. The objective is to predict ...
 
Efficient Sorting on the Tilera Manycore Architecture
Found in: 2012 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
By Alessandro Morari, Antonino Tumeo, Oreste Villa, Simone Secchi, Mateo Valero
Issue Date:October 2012
pp. 171-178
We present an efficient implementation of the radix sort algorithm for the Tilera TILEPro64 processor. The TILEPro64 is one of the first successful commercial manycore processors. It is composed of 64 tiles interconnected through multiple fast Networks-on-...
 
Designing Next-Generation Massively Multithreaded Architectures for Irregular Applications
Found in: Computer
By Antonino Tumeo, Simone Secchi, Oreste Villa
Issue Date:August 2012
pp. 53-61
Massively multithreaded architectures like the Cray XMT address the needs of irregular data-intensive applications better than commodity clusters. A proposed evolution of the XMT integrates multicore processors and next-generation interconnects, along with...
 
A Bandwidth-Optimized Multi-core Architecture for Irregular Applications
Found in: Cluster Computing and the Grid, IEEE International Symposium on
By Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:May 2012
pp. 580-587
This paper presents an architecture for high performance computing systems specifically targeted to irregular applications. We show how a multi-core paradigm can benefit from next-generation memories and networks, while still resorting to fine-grained mult...
 
Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems
Found in: Cluster Computing, IEEE International Conference on
By Long Chen, Oreste Villa, Guang R. Gao
Issue Date:September 2011
pp. 386-394
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU programming paradigms, e.g., CUDA, cannot satisfactorily address certain issu...
 
Aho-Corasick String Matching on Shared and Distributed-Memory Parallel Architectures
Found in: IEEE Transactions on Parallel and Distributed Systems
By Antonino Tumeo, Oreste Villa, Daniel G. Chavarría-Miranda
Issue Date:March 2012
pp. 436-443
String matching requires a combination of (sometimes all) the following characteristics: high and/or predictable performance, support for large data sets and flexibility of integration and customization. This paper compares several software-based implement...
 
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
Found in: Cluster Computing and the Grid, IEEE International Symposium on
By Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:May 2011
pp. 275-284
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the...
 
Accelerating DNA analysis applications on GPU clusters
Found in: Application Specific Processors, Symposium on
By Antonino Tumeo, Oreste Villa
Issue Date:June 2010
pp. 71-76
DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery is able to provide, in a few hours, large input streams of data which need to be matched against exponentially growing databases of known fragments. The...
 
Input-independent, scalable and fast string matching on the Cray XMT
Found in: Parallel and Distributed Processing Symposium, International
By Oreste Villa, Daniel Chavarria-Miranda, Kristyn Maschhoff
Issue Date:May 2009
pp. 1-12
String searching is at the core of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast, and ofte...
 
High-speed string searching against large dictionaries on the Cell/B.E. Processor
Found in: Parallel and Distributed Processing Symposium, International
By Daniele Paolo Scarpazza, Oreste Villa, Fabrizio Petrini
Issue Date:April 2008
pp. 1-12
Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, ...
 
Accelerating Real-Time String Searching with Multicore Processors
Found in: Computer
By Oreste Villa, Daniele Paolo Scarpazza, Fabrizio Petrini
Issue Date:April 2008
pp. 42-50
String searching is at the core of tools used to search, filter, and protect data, but this has become increasingly difficult to do in real time as communication speed grows. The authors present an optimization strategy for a popular algorithm that fully e...
 
Efficient Breadth-First Search on the Cell/BE Processor
Found in: IEEE Transactions on Parallel and Distributed Systems
By Daniele Paolo Scarpazza, Oreste Villa, Fabrizio Petrini
Issue Date:October 2008
pp. 1381-1395
Multi-core processors are a shift of paradigm in computer architecture that promises a dramatic increase in performance. But they also bring an unprecedented level of complexity in algorithmic design and software development. In this paper we describe the ...
 
Challenges in Mapping Graph Exploration Algorithms on Advanced Multi-core Processors
Found in: Parallel and Distributed Processing Symposium, International
By Oreste Villa, Daniele Paolo Scarpazza, Fabrizio Petrini, Juan Fernandez Peinador
Issue Date:March 2007
pp. 63
Multi-core processors are a shift of paradigm in computer architecture that promises a dramatic increase in performance. But multi-core processors also bring an unprecedented level of complexity in algorithmic design and software development. In this paper...
 
Peak-Performance DFA-based String Matching on the Cell Processor
Found in: Parallel and Distributed Processing Symposium, International
By Daniele Paolo Scarpazza, Oreste Villa, Fabrizio Petrini
Issue Date:March 2007
pp. 444
The security of your data and of your network is in the hands of intrusion detection systems, virus scanners and spam filters, which are all critically based on string matching. But network links are getting faster and faster, and string matching is gettin...
 
Exploring Efficient Hardware Support for Applications with Irregular Memory Patterns on Multinode Manycore Architectures
Found in: IEEE Transactions on Parallel and Distributed Systems
By Marco Ceriani, Simone Secchi, Oreste Villa, Antonino Tumeo, Gianluca Palermo
Issue Date:August 2014
pp. 1
With computing systems becoming ubiquitous, numerous data sets of extremely large size are becoming available for analysis. Often the data collected have complex, graph based structures, which makes them difficult to process with traditional tools. Moreove...
 
Scaling Semantic Graph Databases in Size and Performance
Found in: IEEE Micro
By Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, Antonino Tumeo, Jesse Weaver, David Haglin, Sutanay Choudhury, John Feo
Issue Date:July 2014
pp. 16-26
This article presents GEMS, a full software system for accelerating large-scale graph databases on commodity clusters. Unlike current approaches, GEMS addresses graph databases by primarily employing graph-based methods, which is reflected at all levels of...
 
Accelerating subsurface transport simulation on heterogeneous clusters
Found in: 2013 IEEE International Conference on Cluster Computing (CLUSTER)
By Oreste Villa, Nitin Gawande, Antonino Tumeo
Issue Date:September 2013
pp. 1-8
Reactive transport numerical models simulate chemical and microbiological reactions that occur along a flow-path. These models have to compute reactions for a large number of locations. They solve the set of ordinary differential equations (ODEs) that desc...
   
Second Workshop on Irregular Applications: Architectures & Algorithms - IA^3 2012
Found in: 2012 SC Companion: High-Performance Computing, Networking, Storage and Analysis (SCC)
By John Feo, Antonino Tumeo, Oreste Villa, Simone Secchi, Mahantesh Halappanavar
Issue Date:November 2012
pp. lxiv-lxv
This workshop, this year in its second edition, aims at bringing together scientists with all these different backgrounds to discuss, define and design methods and technologies for efficiently supporting irregular applications on current and future machine...
   
A Modular Approach to Model Heterogeneous MPSoC at Cycle Level
Found in: Digital Systems Design, Euromicro Symposium on
By Matteo Monchiero, Gianluca Palermo, Cristina Silvano, Oreste Villa
Issue Date:September 2008
pp. 158-164
This paper proposes a system-level cycle-based framework to model and design heterogeneous Multiprocessor Systems-on-Chip (MPSoC), called GRAPES. The approach features flexibility and modularity maintaining high simulation speed despite modeling at cycle le...
 
Toward a data scalable solution for facilitating discovery of scientific data resources
Found in: Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2013)
By Antonino Tumeo, David Haglin, John Feo, Oreste Villa, Sumit Purohit, Alan Chappell, Alessandro Morari, Jesse Weaver, Karen Schuchardt, Sutanay Choudhury
Issue Date:November 2013
pp. 55-60
Science is increasingly motivated by the need to process larger quantities of data. It is facing severe challenges in data collection, management, and processing, so much so that the computational demands of "data scaling" are competing with, and in many f...
     
Exploiting points-to maps for de-/serialization code generation
Found in: Proceedings of the 28th Annual ACM Symposium on Applied Computing (SAC '13)
By Oreste Villa, Selim Ciraci
Issue Date:March 2013
pp. 1712-1719
Serialization code generators for C++ have restrictions on the implementation of dynamic arrays and void/function pointers. If the target program is not implemented with these restrictions, developers have to manually change the source code to facilitate s...
     
Prototyping hardware support for irregular applications
Found in: Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO '13)
By Antonino Tumeo, Marco Ceriani, Oreste Villa, Simone Secchi
Issue Date:January 2013
pp. 1-8
The use of FPGA platforms developed with off-the-shelf soft cores has recently emerged as one of the most promising fast prototyping approaches to design, evaluate and validate new architectural components for multi- and many-core processors. The approach ...
     
Towards efficient execution of irregular applications: panel outline
Found in: Proceedings of the first workshop on Irregular applications: architectures and algorithm (IAAA '11)
By Antonino Tumeo, John Feo, Oreste Villa, Simone Secchi
Issue Date:November 2011
pp. 43-44
This panel seeks to discuss the current challenges for the efficient execution of irregular applications and to propose directions for the development of next generation systems.
     
Irregular applications: architectures & algorithms
Found in: Proceedings of the first workshop on Irregular applications: architectures and algorithm (IAAA '11)
By Antonino Tumeo, John Feo, Oreste Villa, Simone Secchi
Issue Date:November 2011
pp. 1-2
Irregular applications are characterized by irregular data structures, control and communication patterns. Novel irregular high performance applications which deal with large data sets and require have recently appeared. Unfortunately, current high perform...
     
Scalable transparent checkpoint-restart of global address space applications on virtual machines over InfiniBand
Found in: Proceedings of the 6th ACM conference on Computing frontiers (CF '09)
By David M. Brown, Jarek Nieplocha, Oreste Villa, Sriram Krishnamoorthy
Issue Date:May 2009
pp. 227-227
Checkpoint-Restart is one of the most used software approaches to achieve fault-tolerance in high-end clusters. While standard techniques typically focus on user-level solutions, the advent of virtualization software has enabled efficient and transparent s...
     
Efficiency and scalability of barrier synchronization on NoC based many-core architectures
Found in: Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems (CASES '08)
By Cristina Silvano, Gianluca Palermo, Oreste Villa
Issue Date:October 2008
pp. 79-79
Interconnects based on Networks-on-Chip are an appealing solution to address future microprocessor designs where, very likely, hundreds of cores will be connected on a single chip. A fundamental role in highly parallelized applications running on many-core...
     
Exact multi-pattern string matching on the Cell/B.E. processor
Found in: Proceedings of the 2008 conference on Computing frontiers (CF '08)
By Daniele Paolo Scarpazza, Fabrizio Petrini, Oreste Villa
Issue Date:May 2008
pp. 353-358
String searching is the computationally intensive kernel of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push th...
     