Composing Data Parallel Code for a SPARQL Graph Engine
Found in: 2013 International Conference on Social Computing (SocialCom)
By Vito Giovanni Castellana, Antonino Tumeo, Oreste Villa, David Haglin, John Feo
Issue Date:September 2013
pp. 691-699
The emergence of petascale triple stores has motivated the investigation of alternatives to traditional table-based relational methods. Since triple stores represent data as structured tuples, graphs are a natural data structure for encoding their informati...
 
Exploring hardware support for scaling irregular applications on multi-node multi-core architectures
Found in: 2013 IEEE 24th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
By Simone Secchi, Marco Ceriani, Antonino Tumeo, Oreste Villa, Gianluca Palermo, Luigi Raffo
Issue Date:June 2013
pp. 309-313
With the recent emergence of large-scale knowledge discovery, data mining and social network analysis, irregular applications have gained renewed interest. Cache-based architectures do not provide optimal performance with such workloads, mainly due to the low ...
 
Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping
Found in: 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
By Marco Ceriani, Gianluca Palermo, Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:April 2013
pp. 238
Knowledge discovery applications are an emerging class of irregular applications that exploit graph-based data structures, present poor locality and analyze very big data sets that require multi-node systems for processing. Current clusters, which exploit ...
 
Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer
Found in: IEEE Transactions on Parallel and Distributed Systems
By Oreste Villa, Antonino Tumeo, Simone Secchi, Joseph B. Manzano
Issue Date:December 2012
pp. 2266-2279
Irregular applications, such as data mining or graph-based computations, show unpredictable memory/network access patterns and control structures. Massively multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2, and XMT, appea...
 
A High Performance Computing Network and System Simulator for the Power Grid: NGNS^2
Found in: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
By Oreste Villa, Antonino Tumeo, Selim Ciraci, Jeff A. Daily, Jason C. Fuller
Issue Date:November 2012
pp. 313-322
Designing and planning next generation power grid systems composed of large power distribution networks, monitoring and control networks, autonomous generators and consumers of power requires advanced simulation infrastructures. The objective is to predict ...
 
Efficient Sorting on the Tilera Manycore Architecture
Found in: 2012 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
By Alessandro Morari, Antonino Tumeo, Oreste Villa, Simone Secchi, Mateo Valero
Issue Date:October 2012
pp. 171-178
We present an efficient implementation of the radix sort algorithm for the Tilera TILEPro64 processor. The TILEPro64 is one of the first successful commercial manycore processors. It is composed of 64 tiles interconnected through multiple fast Networks-on-...
 
Designing Next-Generation Massively Multithreaded Architectures for Irregular Applications
Found in: Computer
By Antonino Tumeo, Simone Secchi, Oreste Villa
Issue Date:August 2012
pp. 53-61
Massively multithreaded architectures like the Cray XMT address the needs of irregular data-intensive applications better than commodity clusters. A proposed evolution of the XMT integrates multicore processors and next-generation interconnects, along with...
 
A Bandwidth-Optimized Multi-core Architecture for Irregular Applications
Found in: Cluster Computing and the Grid, IEEE International Symposium on
By Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:May 2012
pp. 580-587
This paper presents an architecture for high performance computing systems specifically targeted to irregular applications. We show how a multi-core paradigm can benefit from next-generation memories and networks, while still resorting to fine-grained mult...
 
Aho-Corasick String Matching on Shared and Distributed-Memory Parallel Architectures
Found in: IEEE Transactions on Parallel and Distributed Systems
By Antonino Tumeo, Oreste Villa, Daniel G. Chavarría-Miranda
Issue Date:March 2012
pp. 436-443
String matching requires a combination of (sometimes all) the following characteristics: high and/or predictable performance, support for large data sets and flexibility of integration and customization. This paper compares several software-based implement...
 
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
Found in: Cluster Computing and the Grid, IEEE International Symposium on
By Simone Secchi, Antonino Tumeo, Oreste Villa
Issue Date:May 2011
pp. 275-284
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the...
 
A Compact Transactional Memory Multiprocessor System on FPGA
Found in: International Conference on Field Programmable Logic and Applications
By Matteo Pusceddu, Simone Ceccolini, Gianluca Palermo, Donatella Sciuto, Antonino Tumeo
Issue Date:September 2010
pp. 578-581
In this paper we present a rapid prototyping platform on a single Field Programmable Gate Array (FPGA) with support for software transactional memory. The system is composed only of off-the-shelf cores and is useful for porting and early validation of prog...
 
Accelerating DNA analysis applications on GPU clusters
Found in: Application Specific Processors, Symposium on
By Antonino Tumeo, Oreste Villa
Issue Date:June 2010
pp. 71-76
DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery is able to provide, in a few hours, large input streams of data which need to be matched against exponentially growing databases of known fragments. The...
 
A multiprocessor self-reconfigurable JPEG2000 encoder
Found in: Parallel and Distributed Processing Symposium, International
By Antonino Tumeo, Simone Borgio, Davide Bosisio, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Issue Date:May 2009
pp. 1-8
This paper presents a multiprocessor architecture prototype on a Field Programmable Gate Array (FPGA) with support for hardware and software multithreading. Thanks to partial dynamic reconfiguration, this system can, at run time, spawn both software and h...
 
Lightweight DMA management mechanisms for multiprocessors on FPGA
Found in: Application-Specific Systems, Architectures and Processors, IEEE International Conference on
By Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Issue Date:July 2008
pp. 275-280
This paper presents a multiprocessor system on FPGA that adopts Direct Memory Access (DMA) mechanisms to move data between the external memory and the local memory of each processor. The system integrates all standard DMA primitives via a fast Application ...
 
A Dual-Priority Real-Time Multiprocessor System on FPGA for Automotive Applications
Found in: Design, Automation and Test in Europe Conference and Exhibition
By Antonino Tumeo, Marco Branca, Lorenzo Camerini, Marco Ceriani, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Issue Date:March 2008
pp. 1039-1044
This paper presents the implementation of a dual-priority scheduling algorithm for real-time embedded systems on a shared memory multiprocessor on FPGA. The dual-priority microkernel is supported by a multiprocessor interrupt controller to trigger periodic...
 
A Pipelined Fast 2D-DCT Accelerator for FPGA-based SoCs
Found in: VLSI, IEEE Computer Society Annual Symposium on
By Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Issue Date:March 2007
pp. 331-336
Multimedia applications, and in particular the encoding and decoding of standard image and video formats, are usually a typical target for Systems-on-Chip (SoC). The bi-dimensional Discrete Cosine Transformation (2D-DCT) is a commonly used frequen...
 
An Internal Partial Dynamic Reconfiguration Implementation of the JPEG Encoder for Low-Cost FPGAs
Found in: VLSI, IEEE Computer Society Annual Symposium on
By Antonino Tumeo, Matteo Monchiero, Gianluca Palermo, Fabrizio Ferrandi, Donatella Sciuto
Issue Date:March 2007
pp. 449-450
This paper presents the design of a JPEG Encoder which exploits this feature. We propose a mixed HW/SW architecture, where most compute-intensive components of the application are mapped to application-specific HW cores. These cores can be alternated on th...
 
Exploring Efficient Hardware Support for Applications with Irregular Memory Patterns on Multinode Manycore Architectures
Found in: IEEE Transactions on Parallel and Distributed Systems
By Marco Ceriani, Simone Secchi, Oreste Villa, Antonino Tumeo, Gianluca Palermo
Issue Date:August 2014
pp. 1
With computing systems becoming ubiquitous, numerous data sets of extremely large size are becoming available for analysis. Often the data collected have complex, graph based structures, which makes them difficult to process with traditional tools. Moreove...
 
Scaling Semantic Graph Databases in Size and Performance
Found in: IEEE Micro
By Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, Antonino Tumeo, Jesse Weaver, David Haglin, Sutanay Choudhury, John Feo
Issue Date:July 2014
pp. 16-26
This article presents GEMS, a full software system for accelerating large-scale graph databases on commodity clusters. Unlike current approaches, GEMS addresses graph databases by primarily employing graph-based methods, which is reflected at all levels of...
 
Accelerating subsurface transport simulation on heterogeneous clusters
Found in: 2013 IEEE International Conference on Cluster Computing (CLUSTER)
By Oreste Villa, Nitin Gawande, Antonino Tumeo
Issue Date:September 2013
pp. 1-8
Reactive transport numerical models simulate chemical and microbiological reactions that occur along a flow-path. These models have to compute reactions for a large number of locations. They solve the set of ordinary differential equations (ODEs) that desc...
   
Second Workshop on Irregular Applications: Architectures & Algorithms - IA^3 2012
Found in: 2012 SC Companion: High-Performance Computing, Networking, Storage and Analysis (SCC)
By John Feo, Antonino Tumeo, Oreste Villa, Simone Secchi, Mahantesh Halappanavar
Issue Date:November 2012
pp. lxiv-lxv
This workshop, this year in its second edition, aims at bringing together scientists with all these different backgrounds to discuss, define and design methods and technologies for efficiently supporting irregular applications on current and future machine...
   
Toward a data scalable solution for facilitating discovery of scientific data resources
Found in: Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2013)
By Antonino Tumeo, David Haglin, John Feo, Oreste Villa, Sumit Purohit, Alan Chappell, Alessandro Morari, Jesse Weaver, Karen Schuchardt, Sutanay Choudhury
Issue Date:November 2013
pp. 55-60
Science is increasingly motivated by the need to process larger quantities of data. It is facing severe challenges in data collection, management, and processing, so much so that the computational demands of "data scaling" are competing with, and in many f...
     
Prototyping hardware support for irregular applications
Found in: Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO '13)
By Antonino Tumeo, Marco Ceriani, Oreste Villa, Simone Secchi
Issue Date:January 2013
pp. 1-8
The use of FPGA platforms developed with off-the-shelf soft cores has recently emerged as one of the most promising fast prototyping approaches to design, evaluate and validate new architectural components for multi- and many-core processors. The approach ...
     
Towards efficient execution of irregular applications: panel outline
Found in: Proceedings of the first workshop on Irregular applications: architectures and algorithm (IAAA '11)
By Antonino Tumeo, John Feo, Oreste Villa, Simone Secchi
Issue Date:November 2011
pp. 43-44
This panel seeks to discuss the current challenges for the efficient execution of irregular applications and to propose directions for the development of next generation systems.
     
Irregular applications: architectures & algorithms
Found in: Proceedings of the first workshop on Irregular applications: architectures and algorithm (IAAA '11)
By Antonino Tumeo, John Feo, Oreste Villa, Simone Secchi
Issue Date:November 2011
pp. 1-2
Irregular applications are characterized by irregular data structures, control and communication patterns. Novel irregular high performance applications which deal with large data sets and require have recently appeared. Unfortunately, current high perform...
     
Multiprocessor systems-on-chip synthesis using multi-objective evolutionary computation
Found in: Proceedings of the 12th annual conference on Genetic and evolutionary computation (GECCO '10)
By Antonino Tumeo, Donatella Sciuto, Fabrizio Ferrandi, Marco Ceriani, Pier Luca Lanzi
Issue Date:July 2010
pp. 1267-1274
In this paper, we apply multi-objective evolutionary computation to the synthesis of real-time, embedded, heterogeneous, multiprocessor systems (briefly, Multiprocessor Systems-on-Chip or MP-SoCs). Our approach simultaneously explores the architecture, the...
     
Mapping pipelined applications onto heterogeneous embedded systems: a bayesian optimization algorithm based approach
Found in: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis (CODES+ISSS '09)
By Antonino Tumeo, Christian Pilato, Donatella Sciuto, Fabrizio Ferrandi, Lorenzo Camerini, Marco Branca, Pier Luca Lanzi
Issue Date:October 2009
pp. 443-452
In this paper we propose a flow based on the Bayesian Optimization Algorithm (BOA) for mapping pipelined applications on a heterogeneous multiprocessor platform on Field Programmable Gate Array (FPGA) with customizable processors. BOA is a Probabilistic Mo...
     
Evolutionary algorithms for the mapping of pipelined applications onto heterogeneous embedded systems
Found in: Proceedings of the 11th Annual conference on Genetic and evolutionary computation (GECCO '09)
By Antonino Tumeo, Christian Pilato, Donatella Sciuto, Fabrizio Ferrandi, Lorenzo Camerini, Marco Branca, Pier Luca Lanzi
Issue Date:July 2009
pp. 46-52
In this paper, we compare four algorithms for the mapping of pipelined applications on a heterogeneous multiprocessor platform implemented using Field Programmable Gate Arrays (FPGAs) with customizable processors. Initially, we describe the framework and t...
     
HW/SW methodologies for synchronization in FPGA multiprocessors
Found in: Proceeding of the ACM/SIGDA international symposium on Field programmable gate arrays (FPGA '09)
By Antonino Tumeo, Christian Pilato, Donatella Sciuto, Fabrizio Ferrandi, Gianluca Palermo
Issue Date:February 2009
pp. 1-2
Modern Field Programmable Gate Arrays (FPGA) can be programmed with multiple soft-core processors. These solutions can be used for MultiProcessor Systems-on-Chip (MPSoCs) prototyping or even for final implementation. Nevertheless, efficient synchronization ...
     
A dual-priority real-time multiprocessor system on FPGA for automotive applications
Found in: Proceedings of the conference on Design, automation and test in Europe (DATE '08)
By Antonino Tumeo, Donatella Sciuto, Fabrizio Ferrandi, Gianluca Palermo, Lorenzo Camerini, Marco Branca, Marco Ceriani, Matteo Monchiero
Issue Date:March 2008
pp. 1-30
This paper presents the implementation of a dual-priority scheduling algorithm for real-time embedded systems on a shared memory multiprocessor on FPGA. The dual-priority microkernel is supported by a multiprocessor interrupt controller to trigger periodic...
     
A design kit for a fully working shared memory multiprocessor on FPGA
Found in: Proceedings of the 17th Great Lakes symposium on VLSI (GLSVLSI '07)
By Antonino Tumeo, Donatella Sciuto, Fabrizio Ferrandi, Gianluca Palermo, Matteo Monchiero
Issue Date:March 2007
pp. 219-222
This paper presents a framework to design a shared memory multiprocessor on a programmable platform. We propose a complete flow, composed of a programming model and a template architecture. Our framework permits to write a parallel application by using a s...
     