IEEE Transactions on Computers

Transactions on Computers Media Center

Our volunteers share with the wider community their views and experiences on a variety of topics. The volunteers can range from associate editors to authors, reviewers or members from the research community at large. The interviews are intended to cover a wide spectrum of topics that are relevant to our community. These topics can be in the form of “shared experiences” and “lessons learned” or highlighting a new technological or theoretical breakthrough. We hope that members of the community will actively participate in making this new feature a great success. For information on submitting multimedia content, please click here.

A Word from the Editor-in-Chief,
Albert Y. Zomaya

In Their Own Words
Minimizing Energy Consumption of Embedded Systems via Optimal Code Layout

by Chen-Wei Huang and Shiao-Li Tsao

 

As the performance gap between CPU and memory widens every year, it is getting harder for the CPU to obtain a timely response from main memory. At the same time, the access energy of CPU registers differs from that of SDRAM by orders of magnitude. A badly designed memory system will therefore drag down system performance and also limit the battery life of portable devices. This situation will be more challenging in the multi-core era, where memory systems have to supply more data at a faster pace and in a more energy-efficient manner.

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.122

Progressive Congestion Management Based on Packet Marking and Validation Techniques

by Joan-Lluis Ferrer, Elvira Baydal, Antonio Robles, Pedro Lopez, and Jose Duato

 

Congestion management in multistage interconnection networks remains a serious problem that has not been completely solved. To avoid the degradation of network performance when congestion appears, several congestion management mechanisms have been proposed. Most of these mechanisms are based on explicit congestion notification: switches detect congestion and, depending on the applied strategy, mark packets to warn the source hosts. In response, source hosts apply corrective actions to adjust their packet injection rate. Although these proposals seem quite effective, they either exhibit some drawbacks or are partial solutions. Some of them penalize flows that are not responsible for congestion, whereas others can cope only with congestion situations that last for a short time. In this paper, we present an overview of the different strategies to detect and correct congestion in multistage interconnection networks, and propose a new mechanism referred to as Marking and Validation Congestion Management (MVCM). MVCM targets this kind of lossless network and is based on a more refined packet marking strategy combined with a fair set of corrective actions, which makes the mechanism able to manage congestion effectively regardless of the congestion degree. Evaluation results show the effectiveness and robustness of the proposed mechanism.
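
The notify-and-throttle loop that explicit congestion notification relies on can be sketched in a few lines of Python. This is not MVCM: the additive-increase, multiplicative-decrease rule used here is a generic stand-in, and the function name `adjust_injection_rate` and all parameter values are assumptions made for illustration.

```python
def adjust_injection_rate(rate, marked, max_rate=1.0, decrease=0.5, increase=0.05):
    """Return the source host's new injection rate after one feedback event.

    rate     -- current injection rate as a fraction of link capacity
    marked   -- True if the acknowledged packet carried a congestion mark
    The decrease/increase constants are illustrative, not taken from the paper.
    """
    if marked:
        # A marked packet signals congestion: back off multiplicatively.
        return rate * decrease
    # No mark: probe for spare bandwidth additively, capped at the link rate.
    return min(max_rate, rate + increase)
```

A source would call this once per acknowledgement, for example `rate = adjust_injection_rate(rate, ack_was_marked)`, where `ack_was_marked` is a hypothetical flag extracted from the returning packet.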

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.146

Scalable Tree-Based Architectures for IPv4/v6 Lookup Using Prefix Partitioning

by Hoang Le and Viktor K. Prasanna

 

Memory efficiency and dynamically updateable data structures for Internet Protocol (IP) lookup have regained much interest in the research community. In this paper, we revisit the classic tree-based approach for solving the longest prefix matching (LPM) problem used in IP lookup. In particular, we target our solutions for a class of large and sparsely-distributed routing tables, such as those potentially arising in the next-generation IPv6 routing protocol. Due to longer prefix lengths and much larger address space, preprocessing such routing tables for tree-based LPM can significantly increase the number of prefixes and/or memory stages required for IP lookup. We propose a prefix partitioning algorithm (DPP) to divide a given routing table into k groups of disjoint prefixes (k is given). The algorithm employs dynamic programming to determine the optimal split lengths between the groups to minimize the total memory requirement. Our algorithm demonstrates a substantial reduction in the memory footprint compared with those of the state-of-the-art in both IPv4 and IPv6 cases. Two proposed linear pipelined architectures, which achieve high throughput and support incremental updates, are also presented. The proposed algorithm and architectures achieve a memory efficiency of 1 byte of memory for each byte of prefix for both IPv4 and IPv6. As a result, our design scales well to support either larger routing tables, longer prefix lengths, or both. The total memory requirement depends solely on the number of prefixes. Implementations on 45 nm ASIC and a state-of-the-art FPGA device (for a routing table consisting of 330K prefixes) show that our algorithm achieves 980 and 410 million lookups per second, respectively. These results are well suited for 100Gbps lookup. The implementations also scale to support larger routing tables and longer prefix length when we go from IPv4 to IPv6. Additionally, the proposed architectures can easily interface with external SRAMs to ease the limitation of on-chip memory of the target devices.
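
For readers unfamiliar with longest prefix matching, the classic tree-based lookup can be sketched as a binary trie. This is neither the authors' DPP algorithm nor their pipelined architecture; the class name `BinaryTrie`, its methods, and the routing entries in the usage note are illustrative assumptions.

```python
import ipaddress


class BinaryTrie:
    """Toy binary trie for longest prefix matching (one address family per trie)."""

    def __init__(self):
        # Each node is a dict: children live under keys 0 and 1,
        # and a stored next hop lives under the key "hop".
        self.root = {}

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        width = net.max_prefixlen  # 32 for IPv4, 128 for IPv6
        node = self.root
        # Walk (and create) one node per prefix bit, most significant first.
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        bits = int(addr)
        width = addr.max_prefixlen
        node = self.root
        best = node.get("hop")  # a default route, if one was inserted
        for i in range(width):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.get(bit)
            if node is None:
                break
            # Remember the deepest (longest) prefix seen so far.
            best = node.get("hop", best)
        return best
```

With the hypothetical routes `10.0.0.0/8` and `10.1.0.0/16` inserted, a lookup of `10.1.2.3` returns the hop stored for the longer `/16` prefix. A trie like this visits up to one node per address bit, which hints at why preprocessing and memory layout matter for long IPv6 prefixes.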

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.130

CPU Accounting for Multicore Processors

by Carlos Luque, Miquel Moreto, Francisco J. Cazorla, Roberto Gioiosa, Alper Buyuktosunoglu, and Mateo Valero

 

In single-threaded processors and Symmetric Multiprocessors, the execution time of a task depends on the other tasks it runs with (the workload), since the Operating System (OS) time-shares the CPU(s) between tasks in the workload. However, the time accounted to a task is roughly the same regardless of the workload in which the task runs, since the OS takes into account those periods in which the task is not scheduled onto a CPU. Chip Multiprocessors (CMPs) introduce complexities when accounting CPU utilization, since the CPU time to account to a task depends not only on the time that the task is scheduled onto a CPU, but also on the amount of hardware resources it receives during that period.

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.152

Importance of Coherence Protocols with Network Applications on Multi-Core Processors

by Kyueun Yi, Won W. Ro, and Jean-Luc Gaudiot

 

As the Internet and information technology have continued to develop, the need for fast packet processing in computer networks has also grown in importance. All emerging network applications require deep packet classification as well as security-related processing, and they should run at line rate. Network speed and the complexity of network applications will continue to increase, so future network processors should simultaneously meet two requirements: high performance and high programmability. We will show that the performance of single processors will not be sufficient to support future demands. Instead, we will have to turn to multi-core processors, which can exploit the parallelism in network workloads.

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.199

Statistical Reliability Estimation of Microprocessor-based Systems

by A. Savino, A. Benso, A. Bosio, S. Di Carlo, G. Politano, and G. Di Natale

 

What is the probability that the execution state of a given microprocessor running a given application is correct, in a certain working environment with a given soft-error rate? Trying to answer this question using fault injection can be very expensive and time consuming. This paper proposes the baseline for a new methodology, based on microprocessor error probability profiling, that aims at estimating fault injection results without the need of a typical fault injection setup. The proposed methodology is based on two main ideas: a one-time fault-injection analysis of the microprocessor architecture to characterize the probability of successful execution of each of its instructions in presence of a soft-error, and a static and very fast analysis of the control and data flow of the target software application to compute its probability of success. The presented work goes beyond the dependability evaluation problem; it also has the potential to become the backbone for new tools able to help engineers to choose the best hardware and software architecture to structurally maximize the probability of a correct execution of the target software.
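
The second idea, combining per-instruction success probabilities into a probability for the whole application, can be illustrated with a short Python sketch. It assumes a straight-line instruction trace and independent errors, which is a simplification of the control and data flow analysis the abstract describes. The probability values and the names `P_SUCCESS` and `program_success_probability` are hypothetical.

```python
import math

# Hypothetical per-instruction probabilities of correct execution under a
# given soft-error rate. Real values would come from the one-time
# fault-injection characterization of the processor.
P_SUCCESS = {"add": 0.9999, "load": 0.9995, "store": 0.9996, "branch": 0.9998}


def program_success_probability(trace, p_success=P_SUCCESS):
    """Estimate the probability that every instruction in `trace` executes correctly.

    Assumes errors in different instructions are independent, so the
    per-instruction probabilities multiply.
    """
    # Summing logarithms avoids floating-point underflow on long traces.
    log_p = sum(math.log(p_success[op]) for op in trace)
    return math.exp(log_p)
```

For a trace such as `["load", "add", "store"]` the estimate is simply the product of the three table entries. Handling branches and loops would require weighting paths by their execution frequency.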

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.188

Low Overhead Soft Error Mitigation Techniques for High-Performance and Aggressive Designs

by Naga Durga Prasad Avirneni and Arun Somani

 

The threat of soft error induced system failure in computing systems has become more prominent, as we adopt ultra-deep submicron process technologies. In this paper, we propose two efficient soft error mitigation schemes, namely Soft Error Mitigation (SEM) and Soft and Timing Error Mitigation (STEM), using the approach of multiple clocking of data for protecting combinational logic blocks from soft errors. Our first technique, SEM, based on distributed and temporal voting of three registers, unloads the soft error detection overhead from the critical path of the systems. SEM is also capable of ignoring false errors and recovers from soft errors using in-situ fast recovery avoiding recomputation. Our second technique, STEM, while tolerating soft errors, adds timing error detection capability to guarantee reliable execution in aggressively clocked designs that enhance system performance by operating beyond worst-case clock frequency. We also present a specialized low overhead clock phase management scheme that ably supports our proposed techniques. Timing annotated gate level simulations, using 45nm libraries, of a pipelined adder-multiplier and DLX processor show that both our techniques achieve near 100% fault coverage. For DLX processor, even under severe fault injection campaigns, SEM achieves an average performance improvement of 26.58% over a conventional triple modular redundancy voter based soft error mitigation scheme, while STEM outperforms SEM by 27.42%.
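
The voting step at the heart of any three-register scheme can be sketched as a bitwise majority function. The sketch below is the generic 2-of-3 voter used in triple modular redundancy. It does not model SEM's clock phases or its in-situ recovery, and the function names are illustrative assumptions.

```python
def majority_vote(r0, r1, r2):
    """Bitwise 2-of-3 majority of three register samples (as integers)."""
    # Each output bit is 1 exactly when at least two inputs have a 1 there.
    return (r0 & r1) | (r1 & r2) | (r0 & r2)


def samples_disagree(r0, r1, r2):
    """Return True when the three samples are not all identical."""
    return not (r0 == r1 == r2)
```

If one of the three samples has a flipped bit, `majority_vote` still returns the value held by the other two, while `samples_disagree` flags that an error occurred.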

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.31

Automated Generation of Performance and Dependability Models for the Assessment of Wireless Sensor Networks

by Catello Di Martino, Marcello Cinque, and Domenico Cotroneo

 

Wireless Sensor Networks (WSNs) are widely recognized as a promising solution to build next-generation monitoring systems. Their industrial uptake, however, is still compromised by the low level of trust in their performance and dependability. Whereas analytical models represent a valid means to assess non-functional properties via simulation, their wide use is still limited by the complexity and dynamicity of WSNs, which lead to unaffordable modeling costs. To reduce this gap between research achievements and industrial development, we present a framework for the assessment of WSNs based on the automated generation of analytical models. The framework hides modeling details and allows designers to focus on simulation results to drive their design choices. Models are generated from a high-level specification of the system and a preliminary characterization of its fault-free behavior, obtained by exploiting behavioral simulators. The benefits of the framework are shown in the context of two case studies, based on the wireless monitoring of civil structures.

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.96

A Parallel Hardware Architecture for Real-Time Object Detection with Support Vector Machines

by Christos Kyrkou and Theocharis Theocharides

 

Object detection applications are often associated with real-time performance constraints that stem from the embedded environments in which they are deployed. Consequently, researchers have proposed dedicated hardware architectures that use a variety of classification algorithms targeting object detection. Support Vector Machines (SVMs) are among the most popular classification algorithms used in object detection, yielding high accuracy rates. However, existing SVM hardware implementations that attempt to speed up SVM classification have targeted either simple applications only or SVM training. As such, few of the proposed hardware architectures are generic enough to be used in a variety of object detection applications. Hence, this work presents a parallel array architecture for SVM-based object detection, in an attempt to show the advantages and performance benefits that stem from a dedicated hardware solution. The proposed hardware architecture provides parallel processing, resource sharing among the processing units, and efficient memory management. Furthermore, the size of the array is scalable to the hardware demands, and the architecture can also handle a variety of applications such as multi-class classification problems. A prototype of the proposed architecture was implemented on an FPGA platform and evaluated using three popular detection applications, demonstrating real-time performance (40-122 fps for a variety of applications).

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.113

PETCAM: A Power Efficient TCAM Architecture for Forwarding Tables

by Tania Banerjee Mishra and Sartaj Sahni

 

Ternary Content Addressable Memory (TCAM) is a hardware device which can support high-speed table lookups and is an attractive solution for applications such as packet forwarding and classification. We investigate various TCAM architectures recently proposed for TCAM power and memory reduction in packet forwarding and show that far better power and memory performance is possible when we use an optimal prefix set for the given routing table. Compared to existing approaches, our experimental results demonstrate that our approach can significantly reduce both power (8% - 98%) and TCAM memory (45% - 78%) requirements.
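
To show what a ternary lookup does, the Python sketch below emulates a TCAM in software: each entry stores a value and a mask, masked-out bits are "don't care", and the first matching entry wins. Real TCAMs compare all entries in parallel in hardware. The 8-bit key width, the function names, and the prefixes in the usage note are assumptions for illustration and are unrelated to the PETCAM design itself.

```python
def make_entry(prefix_bits, next_hop, width=8):
    """Build a (value, mask, next_hop) entry from a bit string such as '1011'."""
    length = len(prefix_bits)
    # Mask has 1s over the prefix bits and 0s over the don't-care bits.
    mask = ((1 << length) - 1) << (width - length)
    value = (int(prefix_bits, 2) << (width - length)) if length else 0
    return value, mask, next_hop


def build_table(entries):
    """Order entries by decreasing prefix length so the first match is the longest."""
    return sorted(entries, key=lambda e: bin(e[1]).count("1"), reverse=True)


def tcam_lookup(table, key):
    """Return the next hop of the first (highest-priority) matching entry."""
    for value, mask, next_hop in table:
        if (key & mask) == value:
            return next_hop
    return None
```

For example, a table built from the hypothetical prefixes `""` (default), `"10"`, and `"1011"` resolves the key `0b10110001` to the entry for `"1011"`, because longer prefixes are placed first. Every stored entry costs memory and is activated on each search, which is why reducing the prefix set lowers both power and TCAM size.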

The full article can be found here: http://doi.ieeecomputersociety.org/10.1109/TC.2011.84

IEEE Transactions on Computers: Paolo Montuschi

 

After a short self-introduction, the speaker briefly informs readers and subscribers of the Computer Society Digital Library (CSDL) about the availability in the CSDL of IEEE Transactions on Computers issues for download in an electronic format well suited to reading on mobile devices. The speaker presents a couple of visual examples and provides a web address where additional information on these new features can be found. Finally, the speaker gives a preview of the future collaboration between Computing Now, the Computer Society's web-only journal, and the IEEE Transactions on Computers.

For more information on IEEE Transactions on Computers, visit http://www.computer.org/tc

 

IEEE Transactions on Computers: Q&A with Dr. Elisardo Antelo

 

Professor Elisardo Antelo shares his experience as an Associate Editor for the IEEE TC. He also offers some insights from his research journey.

For more information on IEEE Transactions on Computers, visit http://www.computer.org/tc


What is the OnlinePlus publication model?

It is our new publication model, a hybrid of online-only and print that gives subscribers the best of both worlds: online access plus a printed book of article abstracts and a searchable interactive disk that lets readers access content anywhere without an internet connection, all for less than a traditional print subscription.


Essential Sets: Industry's Interest in Computer Arithmetic Research: Part I, Dr. Schwarz's view

Dr. Eric Schwarz describes the important aspects of computer arithmetic research. He lists open questions that research still needs to answer and identifies the topics of greatest interest to industry.

Purchase the Essential Sets here:

Volume 1:

www.computer.org/portal/web/store?product_id=ES0000033&category_id=TechSets

Volume 2:

www.computer.org/portal/web/store?product_id=ES0000034&category_id=TechSets

 


Essential Sets: Industry's Interest in Computer Arithmetic Research: Part II, Dr. Hu's view

Dr. Hu describes the important aspects of computer arithmetic research. He lists open questions that research still needs to answer and identifies the topics of greatest interest to industry.

Purchase the Essential Sets here:

Volume 1:

www.computer.org/portal/web/store?product_id=ES0000033&category_id=TechSets

Volume 2:

www.computer.org/portal/web/store?product_id=ES0000034&category_id=TechSets

 


Concurrent On-Line Testing and Error/Fault Resilience of Digital Systems

Guest editor Cecilia Metra discusses the "Concurrent On-Line Testing and Error/Fault Resilience of Digital Systems" theme issue for IEEE Transactions on Computers. View the issue here: http://www.computer.org/portal/web/csdl/transactions/tc#3