VLSI Circuits and Systems Letter

Volume 3, Issue 2, June 2017

Editorial

Feature Member

- Nagarajan Ranganathan, Distinguished University Professor Emeritus, University of South Florida

Feature Articles

- Abir J Mondal, Alak Majumder, Bidyut K Bhattacharyya and Pinaki Chakraborty, A Process Aware Delay Circuit with Reduce Impact of Input Switching at GHz Frequencies
- Bipasha Nath, Alak Majumder, Monalisa Das, Abir J Mondal, Pinaki Chakraborty, Bidyut K Bhattacharyya, Voltage Keeper Based 28.27µW New Frequency Divider Circuit in 90nm Technology for Gigascale SerDes Application
- Sandeep Kakde, Yashika Gaidhani, Tejas Thubrikar, Shailesh Kamble, Nikit Shah, Power Efficient Test Pattern Generator Using Bit-Swapping LFSR Technique
- Sunil Kumar and Balwinder Raj, Estimation of Stability and Performance metric for Inward Access Transistor based 6T SRAM Cell Design using n-type/p-type DMDG-GDOV TFET
- Neha Gupta and Vaibhav Neema, Design and Analysis of DECODER circuit with Source biasing Technique for Memory array application

Updates

- Upcoming conferences and workshops
- Call for papers and proposals
- TCVLSI Awards
- Funding Opportunities
- Job Openings
- Ph.D. Fellowships Available

Outreach and Community

Call for Contributions
Editorial

The VLSI Circuits and Systems Letter is affiliated with the Technical Committee on VLSI (TCVLSI) under the IEEE Computer Society. It aims to report recent advances in VLSI technology, education and opportunities and, consequently, grow the research and education activities in the area. The letter, published twice a year, covers the design methodologies for advanced VLSI circuit and systems, including digital circuits and systems, analog and radio-frequency circuits, as well as mixed-signal circuits and systems. The emphasis of TCVLSI falls on integrating the design, computer-aided design, fabrication, application, and business aspects of VLSI while encompassing both hardware and software.

TCVLSI sponsors a number of premium conferences and workshops, including, but not limited to, ASAP, ASYNC, ISVLSI, IWLS, SLIP, and ARITH. Emerging research topics and state-of-the-art advances in VLSI circuits and systems are reported at these events on a regular basis. Best paper awards are presented at these conferences each year to promote high-quality research. In addition to these research activities, TCVLSI also supports a variety of educational activities related to TCVLSI. TCVLSI sponsors several student travel grants for the following meetings: ASAP 2017, ISVLSI 2017, IWLS 2017, and SLIP 2017. The funds offset student travel costs to these meetings and encourage more student participation. The organizing committees of these meetings undertake the task of selecting the right candidates for these awards.

This issue of the VLSI Circuits and Systems Letter highlights Prof. Nagarajan Ranganathan, Distinguished University Professor Emeritus at the University of South Florida, as our “Feature Member”. It also showcases state-of-the-art developments covering several emerging areas: low-power and robust circuit design and test, emerging devices and circuits, etc. Professional articles are solicited from technical experts to provide an in-depth review of these areas. The articles can be found in the “Feature Articles” section. The “Updates” section summarizes upcoming conferences/workshops, calls for papers and proposals, funding opportunities, job openings and Ph.D. fellowships. Finally, a dedicated “Outreach and Community” section summarizes our outreach activities.

We would like to express our great appreciation to all Associate Editors (Hideharu Amano, Mike Borowczak, Prasun Ghosal, Shiyan Hu, Michael Hübner, Helen Li, Anirban Sengupta, Jawar Singh, Saket Srivastava, Yiyu Shi, Yasuhiro Takahashi, Jun Tao, Himanshu Thapliyal and Qi Zhu) for their dedicated effort and strong support in organizing this letter. The complete editorial board information is available at: https://www.computer.org/web/tcvlsi/editorial-board. We are thankful to our web chair, Mike Borowczak, for his professional service in making the letter publicly available on the Internet. We wish to thank all authors who have contributed their professional articles to this issue. We hope that you enjoy reading the letter! The call for contributions for the next issue is available at the end of this issue, and we encourage you to submit articles, news, etc. to an associate editor covering that scope.

Saraju Mohanty  
Chair TCVLSI and Editor  
University of North Texas

Xin Li  
TCVLSI Editor  
Duke University and Duke Kunshan University
Nagarajan "Ranga" Ranganathan (S'81-M'88-SM'92-F'02) received the B.E. (Honors) degree in Electrical and Electronics Engineering from National Institute of Technology, Trichy, India, 1983, and the Ph.D. degree in Computer Science from the University of Central Florida, Orlando in 1988. He is a Distinguished University Professor Emeritus of Computer Science and Engineering at the University of South Florida, Tampa. During 1998-99, he was a Professor of Electrical and Computer Engineering at the University of Texas at El Paso. His research interests include VLSI System Design, VLSI Design Automation, Multimetric optimization in Hardware and Software Systems, Computer Architecture, Reversible Logic and Quantum Computing. He has developed many special purpose VLSI circuits and systems for computer vision, image and video processing, pattern recognition, data compression and signal processing applications. He has developed several VLSI CAD algorithms based on decision theory, game theory, auction theory and Fuzzy modeling. He has co-authored over 310 publications in refereed journals and conferences, five book chapters and co-owns twelve U.S. patents and three pending. He mentored and served as major professor for 27 PhD and 69 MS theses students to graduation. He has edited three books titled VLSI Algorithms and Architectures: Fundamentals, VLSI Algorithms and Architectures: Advanced Concepts, IEEE CS Press, 1993, VLSI for Pattern Recognition and Artificial Intelligence, World Scientific Publishers, 1995 and co-authored a book titled, Low Power High Level Synthesis for Nanoscale CMOS Circuits, Springer, June 2008.

Dr. Ranganathan was elected as a Fellow of IEEE in 2002 for his contributions to algorithms and architectures for VLSI systems. He was elected Fellow of AAAS in 2012. He is a member of the IEEE, IEEE Computer Society, IEEE Circuits and Systems Society and the VLSI Society of India. He has served on the editorial boards for the journals: Pattern Recognition (1993-97), VLSI Design (1994-1998), IEEE Transactions on VLSI Systems (1995-97), IEEE Transactions on Circuits and Systems (1997-99), IEEE Transactions on Circuits and Systems for Video Technology (1997-00), IEEE Transactions on Computers (2008-12), IEEE Transactions on CAD (2008-10) and ACM Transactions on Design Automation of Electronic Systems (2007-09). He was the chair of the IEEE Computer Society Technical Committee on VLSI during 1997-01, served on the steering committee of the IEEE Transactions on VLSI Systems during 1999-01 and 2007-2010, the steering committee chair during 2002-03 and the Editor-in-Chief (EiC) for two consecutive terms during 2003-07. He served as the program co-chair for ICVLSID'94, ISVLSI'96, ISVLSI'05, and ICVLSID'08 and as general co-chair for ICVLSID'95, IWVLSI'98, ICVLSID'98, ISVLSI'05, ISVLSI'09 and ISVLSI'12. He served on technical program committees of international conferences including ICCD, ICPP, IPPS, SPDP, ICHPC, HPCA, GLSVLSI, ASYNC, ISQED, ISLPED, CAMP, ISCAS, VLSID, MSE and ICCAD.

Dr. Ranganathan received the USF Outstanding Research Achievement Award (2002), USF President's Faculty Excellence Award (2003), USF Theodore-Venette Askounes Ashford Distinguished Scholar Award (2003), SIGMA XI Scientific Honor Society Tampa Bay Chapter Outstanding Faculty Researcher Award (2004), Distinguished University Professor honorific title and the university gold medallion honor (2007), USF Outstanding Undergraduate Teaching Award (2009), three Best Paper Awards at the Intl. Conf. on VLSI Design (1995, 2004 and 2006) and the IEEE Circuits and Systems Society VLSI Transactions Best Paper Award (2009). He was elected as a Fellow of the American Association for the Advancement of Science (AAAS) in 2013. Dr. Ranganathan served as the Faculty Liaison on the Academics and Campus Environment (ACE) Group of the University of South Florida Board of Trustees (2011-2013) and retired with Emeritus status in 2016.
Q1. Tell us a little about your research area and what motivated you to get into it?

With a bachelor's degree in electrical and electronics engineering, I had developed a strong interest in logic design and circuit theory. Although I did my Ph.D. in computer science, in a program that was basically theoretical computer science located in the College of Arts and Sciences while the computer engineering department was housed in the College of Engineering, I was fascinated by VLSI circuits and systems. This was in the early 1980s, as VLSI was becoming a mainstream technology. On the other hand, theory of algorithms and automata theory were my favorites as a student. The introduction to Professor Amar Mukherjee made it a clear decision to work on VLSI algorithms and architectures for specific applications that until then were implemented purely as software algorithms, applications needing intensive, on-the-fly computations in real time. The field of ASICs was just then emerging, with very little work published. At the time CMOS was just emerging and nMOS was still the common technology. The idea was to move from implementing algorithms on programmable machines, coded in high-level languages and as assembler routines on CISC-based microcontrollers and microprocessors, towards implementing compute-intensive software algorithms as hardware algorithms mapped onto application-specific integrated circuits and systems.

My initial goal was to implement an operating system as a hardwired ASIC in 1984, at a time when researchers were still working on hardware memory management controllers. When I approached Professor Manichandy at UTA at that time for his opinion, he said, “it is a great idea but not for a single person attempting to accomplish such a massive task within Ph.D research”, warning that one could get stuck for many years without completing the degree. Thus, the dissertation research was narrowed down to designing hardware algorithms for data compression. After starting my faculty career, I worked on VLSI algorithms, architectures, circuits and systems for image processing, computer vision, signal processing and communications, neural networks, expert systems, etc. The strategy was to convert software algorithms into circuits and, where possible, develop new algorithms that more naturally allowed hardware design. The mapped algorithms had to use extensive pipelining and parallelism for speed, and the maximum possible logic minimization for minimal power and energy consumption. The objective was to get rid of the overheads involved in traditional software having to be compiled and run on programmable processors and to realize those algorithms as special-purpose VLSI circuits. Such a circuit, built for a single application, can attain the maximum possible speed and avoids the complex control needed in programmable processors and microcontrollers. This field of ASICs was redefined as embedded circuits and systems, since ASICs were primarily embedded within application systems in the 90s.

In 1993, Professor Don Bouldin started the IEEE Transactions on VLSI Systems, and a few of my papers appeared in the early issues that year. TVLSI became the primary platform for reporting my work, being the appropriate journal for VLSI design of circuits and systems. The VLSI mainstream community was focused on CAD, with TCAD, and that grew into a massive field. Before TVLSI, I was trying hard to find suitable conferences and journals to publish our work. Having worked with students on a huge number of problems in a wide variety of applications, and with a computer science background, I started working in the CAD area. Being late in the game, I found it hard to obtain grants in CAD, but continued to work on problems like gate as well as wire sizing and optimization, synthesis, power estimation and optimization, leakage reduction, etc., applying modeling techniques from decision theory, game theory, auction theory, Bayesian networks, etc. As speculation pointed towards quantum computing, reversible logic was my next interest; at the time, only reversible logic designs for some logic functions and adders were found in the literature. Thus, our group focused on and developed novel reversible logic gates and circuits for a wide variety of arithmetic functions like multipliers, dividers and shifters, and circuits for smartcards, etc. The most recent work has focused on techniques for data security in clouds and big data systems, and the pending patents are currently getting attention from industry.

Q2. What are some of your proudest accomplishments?

I think the biggest accomplishment is the mentoring of research students and seeing them establish themselves significantly in their chosen fields. Examples in academia are Professors Vijaykrishnan Narayanan at Penn State, Saraju Mohanty at University of North Texas, Rajarathnam Chandramouli at Stevens Institute of Technology, Matt Morrison at Ole Mississippi, Sanjukta Bhanja, now an Associate Dean at USF, etc. In industry, examples include Ashok Murugavel at Intel, Shankar Arumugavelu, now CIO and Chief Executive Vice President at Verizon, and several CTOs of small companies. Several students were employed in various industries including Intel, Samsung, TI, LSI Logic, Fujitsu, Cadence, Synopsys, Juniper and the like. Thus, training students towards successful careers is the most rewarding accomplishment as part of the academic profession. The Best Paper awards won by my students come to my mind. In terms of my research, all the credit can be attributed to my 96 research students. The application of modeling techniques from vastly different fields to VLSI CAD; the development of novel VLSI circuits and systems for pattern matching and recognition; the development of a lossless data compression method that outperformed all the versions of the original JPEG standard within months after it was released; and, further, the VLSI circuit for the JPEG standard named JAGUAR can be considered significant accomplishments. The idea of applying game theory to VLSI CAD won several best paper awards.

Q3. How do you see your research field shaping up and what are the major directions?

The future is promising due to the proliferation of applications needing real time responses and the progress in nanotechnology from materials to devices is key to designing new hardware circuits and systems based on those devices. New architectures and circuits as well as CAD will need to be realized to suit and exploit the advantages of nano devices towards nano computing systems.

Q4. Can you talk a little about the history of TCVLSI? What motivated you to initiate TCVLSI?

The IEEE-CS Annual Workshop on VLSI was initiated by researchers like Don Bouldin, Amar Mukherjee, Richard Newton, Jan Rabaey, P. A. Subramanyam of Bell Labs, etc. Since the mid-eighties, while still a student, I began volunteering in this workshop every year. For a few years this workshop was dormant. In 1988, I approached Professor Amar Mukherjee and we jointly revived this annual meeting and later renamed it ISVLSI. Around 2012, my former student Prof. Vijay Narayanan at Penn State took over as the steering committee chair. TCVLSI was formed around the late eighties, with its major activities being ISVLSI, IEEE Transactions on VLSI Systems, etc. Thus, I have been involved with and played a leadership role in TCVLSI and ISVLSI for over 25 years.

Q5. What advice would you give to junior researchers and graduate students?

In general, it is important to keep abreast of the latest advances in the research field by regularly attending internationally reputed major conferences and reading journals. It is important to watch out for emerging trends. The right approach is to choose problems that will have impact in the future, at least in the next ten years. This also keeps the research relevant when seeking sponsored funding. In academia, it is important to involve graduate students with fresh minds and mentor them to become future researchers. Certainly, they can be encouraged and inspired by making sure their papers are well written and that they are the primary authors. This is a fundamental responsibility of an academic researcher. Based on the students’ goals for academia or industry, the research area and problems should be defined differently, and the nature of the entire research should be tuned accordingly. Strict discipline and following milestones are key to success. Graduate students should be trained to work well in a team as well as to be self-starters of research initiatives. Any researcher needs passion for the chosen field, strong focus, willingness to learn, acceptance of intermediate failures such as rejection of a submission, and concentrated hard work to achieve success. Also, quality should be assured by publishing only in topmost journals such as IEEE and ACM transactions and avoiding secondary ones. Finally, a graduate student must learn the entire current state of the art in the chosen field.

Q6. What profession would you be in if you weren't in this field?

I entered electrical engineering and computer science being attracted to mathematics and technology at a time when NASA was accomplishing the moon landings and the space shuttle programs. If that had not been my path, I would have pursued medical research, where constant advances are needed to solve health problems vital to human life.
Q7. Any final thoughts?

I loved my academic career of research and teaching, which I pursued for about 30 years, but it came to an abrupt end due to accidents resulting in major health issues. Otherwise I have no regrets about my career. The former students that I mentored and trained to be researchers have turned out to be outstandingly successful researchers and hold important positions and status in their careers. They keep in touch with me and keep me informed of their progress in life and career. It has been a very rewarding and satisfying experience to have mentored 96 graduate students in research. During my tenure, the University of South Florida grew from not being ranked to being within the top 30 research institutions as ranked by the National Research Council, and I enjoyed being a part of it. I became one of the youngest Distinguished University Professors at USF. I have always strived to do my best at USF.
Feature Articles

A Process Aware Delay Circuit with Reduce Impact of Input Switching at GHz Frequencies
Abir J Mondal¹, Alak Majumder¹, Bidyut K Bhattacharyya² and Pinaki Chakraborty³
1 Department of Electronics and Communication Engineering, National Institute of Technology, Arunachal Pradesh, India
2 Department of Electronics and Communication Engineering, National Institute of Technology, Agartala, India
3 Department of Basic and Applied Science, National Institute of Technology, Arunachal Pradesh, India

Abstract – Conventional delay circuits with one input cannot generate a delay between the input and output when operated at GHz frequencies. In addition, when a 1 GHz signal is switched at the input of a conventional delay circuit, the signal suffers from distortion inside the chip. A resistance can be used to control the delay, but it also causes voltage loss inside the chip. The resistance also acts as a filter, which eliminates higher frequency signals and limits the operating frequency of a signal. Considering all these pros and cons, this work suggests a method of designing a differential scheme made of two identical drivers and a comparator to generate a delayed signal from the original signal, thus enhancing the operating data rate from 1 Gbps to 4 Gbps in a 180 nm process technology.

1. Introduction
The sensitivity of the signal delay to a varying voltage at the input pin of an IC chip is significant when designing a delay circuit for an application. Moreover, the advancement of VLSI technology does not allow the server data rate, in terms of Gbit/sec, to increase dramatically while communicating between two chips using a copper interconnect (channel) [1]. The use of such interconnects results in insertion losses in the channel [2], thereby causing signal distortion. In order to reconstruct the signal, one requires various delay schemes to be generated inside the chip [3] at higher speed. It has been observed that even inside the chip the signal strength is degraded because of the metal line resistance and the input capacitance of the device the signal is driving. Use of additional resistance to generate signal delay degrades the signal integrity and reduces the amplitude of a signal propagating inside a chip. Likewise, the conventional delay circuits designed inside the chip suffer from distortion and significant signal loss when operated in the gigahertz range. Such deterioration occurs due to a fluctuating voltage at the input of a chip, resulting from an external source on another chip.

Inverter chains and RC circuits [4] are often used in peripheral circuit design to generate a desired time delay, though they suffer from large die area and severe switching power loss [5]. On the other hand, the current starved delay element [6] controls the delay time by manipulating the charge/discharge current, thus providing an advantageously longer delay time. Bazes [7] uses a voltage source to control the delay time. A thyristor based delay element [8] is similar to the current starved one and uses a current mirror to adjust the delay time. However, by changing the position of the current controlling transistor in the current starved circuit, it is possible to obtain an alternative delay gate, which is called output split [9]. The tunable delay element [10] turns out to be low power in nature. In all the above circuits, the problem is to control the delay, since the signal strength reduces and makes such delay control difficult when the input switches in the gigahertz range. As a remedial measure, the authors describe and analyze an alternate delay circuit structure based on a differential signaling approach. The simulation setup described ultimately provides a precise delay time after eliminating the signal distortion and loss associated with the conventional delay circuits.

2. Conventional Delay Circuits
The increase in data rate to the gigahertz range makes it difficult to design circuits that operate at higher frequencies inside as well as outside the chip. Outside the chip, the data rate is limited by the channel characteristics, and inside the chip it is limited by RC delays, both generating a loss in the range of 60-120 dB while operating at a 1-10 Gbit/sec data rate. When the input slew rate is fast, or the input switches at 1 GHz, the delay circuits shown in Figure 1 [6, 9] suffer from significant signal losses. Depending on the channel characteristics, the edge rate of the input \( V_{IN} \) in Fig. 1, which emerges from another chip, may vary. The rise time of the signal entering \( V_{IN} \) has to be smaller than 100 ps in order to operate at 1 GHz, and that causes a problem for a conventional delay circuit. When the pulse at the input switches from low to high, \( M_2 \) turns on and starts discharging the load capacitance \( C_2 \). The on-resistance \( R_1 \) of \( M_1 \) sets the discharge rate by controlling the discharge current. But the charging time of the capacitor is not controlled by any current control transistor like \( M_1 \). \( R_1 \) is controlled by varying the input voltage \( V_D \) at the transistor \( M_1 \) in Figure 1. Therefore, designing such a chip requires two power supplies, one Vdd and the other \( V_D \) (variable), for controlling the effective \( R_1 \).

To estimate the effective \( R_1 \), the discharging current is first obtained using simulation. Thereafter, Eq. (1) is used to determine the switching time.

\[
\frac{dI}{dt} + \frac{I}{R_1 \times C_2} = 0 \tag{1}
\]

where \( C_2 \) denotes the load capacitance and \( I(t) \) depicts the discharging current at a given time \( t \) through \( R_1 \). Once the current vs. time is plotted, \( R_1 \) is determined from the time constant (\( \tau \)) of the said plot using Eq. (2)

\[
R_1 \times C_2 \approx \tau \tag{2}
\]

For \( C_2 = 1 \) fF and Vdd = 1.8 V, the value of \( R_1 \) obtained from a data fit using Eq. (1) and Eq. (2) is 360 kΩ. The effect of a high value of \( R_1 \) is seen when the signal arrives at the input pad with some slew rate, which may be expressed as \( \frac{dV_{IN}}{dt} \). Since the pulse arises at the input, current flows towards the supply voltage, followed by a gradual charging and discharging. On the other hand, when \( \frac{dV_{IN}}{dt} \) is high and \( R_1 \) approaches infinity, \( C_2 \), instead of being discharged, gets charged by a current flowing through the input capacitance. Thus, generating the delay by controlling the current through the ground path using the NMOS gate and a separate power supply proves to be a problem.
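The extraction of \( R_1 \) from the discharge current can be sketched numerically. Eq. (1) has the solution \( I(t) = I_0 e^{-t/(R_1 C_2)} \), so a straight-line fit of \( \ln I \) against \( t \) gives the time constant, and Eq. (2) then gives \( R_1 \). The snippet below is only an illustration under stated assumptions: the current samples are synthesized from the 360 kΩ and 1 fF values quoted in the text rather than taken from the authors' simulation, and the helper name `fit_time_constant` is hypothetical.

```python
import math

def fit_time_constant(times, currents):
    """Least-squares fit of ln(I) versus t; returns tau = -1 / slope."""
    logs = [math.log(i) for i in currents]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    num = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    den = sum((t - t_mean) ** 2 for t in times)
    return -den / num

C2 = 1e-15        # load capacitance from the text (1 fF)
VDD = 1.8         # supply voltage from the text (V)
R1_ASSUMED = 360e3  # used only to synthesize sample data (ohms)

# Synthetic discharge current following the solution of Eq. (1).
tau_assumed = R1_ASSUMED * C2
times = [k * 0.05e-9 for k in range(1, 21)]
currents = [(VDD / R1_ASSUMED) * math.exp(-t / tau_assumed) for t in times]

tau = fit_time_constant(times, currents)
r1_estimate = tau / C2  # Eq. (2): R1 * C2 ~ tau
print(f"tau = {tau * 1e9:.3f} ns, R1 = {r1_estimate / 1e3:.1f} kOhm")
```

With real simulation data the samples would be noisy, so the fitted value would only approximate the effective resistance.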

3. Proposed Delay Circuit

To counter the issues associated with [6, 9] at higher frequencies, the authors propose the circuit described in Figure 2. Here, the extra power supply is discarded and the NMOS gate is connected directly to the ground, which translates to \( R_1 = 0 \), or no transistor, to control the ground current. Introduction of an additional resistance \( R_2 \) shows that the delay is possible with larger values of \( R_2 \). But the signal amplitude reduces drastically as \( R_2 \rightarrow \infty \). Therefore, it is irrelevant whether a transistor or a pure resistor is included in the conventional delay circuit in Figure 1. In Fig. 2, when the input switches from high to low, \( C_2 \) gets charged through PMOS \( M_3 \). During low to high switching, \( M_2 \) turns on, creating a conducting path from output to ground, but \( C_2 \) remains charged for a moment. The resistance \( R_2 \) controls the discharging rate through \( M_2 \) and provides protection during switching without overcharging the load capacitance. The resistor actually prevents the flow of current towards the supply when switching occurs rapidly.

Further, to address an additional issue, an initial pulse having a switching period of 20 ns (50 MHz) is applied at the input of Figure 2 and the value of \( R_2 \) is taken to be the same as \( R_1 \). Keeping all the transistor dimensions identical, Figure 3a illustrates that the input/output waveform corresponding to a 20 ns (0.1 Gbit/sec) switching period changes with the change in $R_2$, thereby producing a precise delay. In contrast, when the switching period is reduced to 1 ns (equivalent to approximately 1 Gbit/sec data rate), the corresponding output waveforms, shown in Figure 3b, suffer from significant signal distortion. The distortion is much more significant for higher $R_2$ because $C_2$ does not charge and discharge completely due to the large RC time constant ($\tau$).
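The effect of the time constant on the output swing can be illustrated with a first-order RC estimate. This is a rough sketch, not the authors' simulation: it assumes each logic level lasts half of the switching period, uses the 1 fF load quoted in Section 2, ignores the transistor on-resistance and parasitics, and the helper `settled_fraction` and the sample resistances are illustrative choices.

```python
import math

def settled_fraction(r_ohm, c_farad, window_s):
    """Fraction of the full swing reached after window_s for a first-order RC."""
    tau = r_ohm * c_farad
    return 1.0 - math.exp(-window_s / tau)

C2 = 1e-15  # 1 fF load capacitance quoted in Section 2

# Assumption: each logic level lasts half of the switching period.
HALF_PERIOD_SLOW = 10e-9   # 20 ns period
HALF_PERIOD_FAST = 0.5e-9  # 1 ns period

for r2 in (30e3, 100e3, 360e3):
    slow = settled_fraction(r2, C2, HALF_PERIOD_SLOW)
    fast = settled_fraction(r2, C2, HALF_PERIOD_FAST)
    print(f"R2 = {r2 / 1e3:5.0f} kOhm: slow {slow:.3f}, fast {fast:.3f}")
```

In this simplified model the slow case settles almost completely for every resistance, while the fast case loses swing as the resistance grows, which matches the qualitative trend described above.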

![Figure 2 Schematic of the proposed delay circuit.](image)

![Figure 3 a) Plot of input and output voltages corresponding to 20 ns switching frequency and different $R_2$ and b) Plot of input and output voltages corresponding to 1 ns switching frequency and different $R_2$.](image)

To avoid signal loss inside the chip due to fast switching and to obtain full voltage swing at the output, Figure 4a suggests a differential signaling technique that consists of two identical driver blocks and a comparator and aims to generate a delay between two signals. The driver blocks correspond to the design block shown in Figure 2, and the comparator is realized using the circuit given in Figure 4b. At first, input pulses switching at gigahertz frequencies are applied to the two driver blocks in low-to-high and high-to-low form. The corresponding outputs, though suffering from significant losses, act as the inputs to the comparator. An ideal comparator outputs logic 1 if $V_{out,1} - V_{out,2} > 0$, or else logic 0. On comparing the two inputs, the resulting outputs are displayed at $V_{out1}$ and $V_{out2}$. Further, the resulting outputs at $V_{out,1}$ and $V_{out,2}$ generate an eye, the amplitude of which decides the correctness of the comparator output. The comparator circuit compares the inputs in the form of an eye and gives full voltage swing. The proposed approach also eliminates the common mode noise appearing at the input, so the aberrations at the output are removed even when operating in the gigahertz range. This allows an input of small amplitude to be sampled accurately as long as the eye generated at the output of the two drivers has a voltage swing of 0.5 V. Finally, irrespective of whether the previous pulse appears as 1 or 0, the eye allows the signal integrity of thousands of pulses to be evaluated at a glance.
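The ideal-comparator rule and the rejection of common mode noise can be shown with a small behavioral sketch. The voltage samples and noise values below are hypothetical numbers chosen for illustration, not values from the simulation; only the decision rule (logic 1 when the difference is positive) comes from the text.

```python
def ideal_comparator(v_out_1, v_out_2):
    """Return logic 1 if v_out_1 - v_out_2 > 0, else logic 0."""
    return 1 if (v_out_1 - v_out_2) > 0 else 0

# Hypothetical attenuated driver outputs (volts) for the bit pattern 1, 0, 1, 0.
driver_1 = [0.9, 0.4, 0.9, 0.4]
driver_2 = [0.4, 0.9, 0.4, 0.9]   # complementary driver

# Hypothetical disturbance that appears identically on both lines.
common_mode_noise = [0.2, -0.1, 0.15, -0.2]

clean = [ideal_comparator(a, b) for a, b in zip(driver_1, driver_2)]
noisy = [
    ideal_comparator(a + n, b + n)
    for a, b, n in zip(driver_1, driver_2, common_mode_noise)
]

# The eye amplitude is taken here as the smallest separation of the two lines.
eye_amplitude = min(abs(a - b) for a, b in zip(driver_1, driver_2))
print(clean, noisy, eye_amplitude)
```

Because the same disturbance is added to both inputs, it cancels in the difference, so the decisions are unchanged. A real comparator additionally needs the eye amplitude to exceed its own threshold, which is the 0.5 V condition stated above.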

4. Results and Analysis

The proposed delay circuit and the circuit developed using the differential signalling approach are simulated using TSMC 180 nm technology in LTspice. Although the use of PTM [11] transistor models results in better performance compared to the 180 nm technology, the authors aim at the maximum operating frequency and performance achievable with 180 nm technology. The delay circuit in Fig. 2 is simulated to obtain a discharging current of $5 \times 10^{-6}$ A (5 µA).

Thereafter, Eq. (1) and Eq. (2) are used along with the estimated current to determine the effective resistance \( R_1 \) i.e. 360 kΩ. To get rid of the problems associated with the conventional delay circuit, \( R_1 \) is fixed at zero and a resistance \( R_2 \) of 360 kΩ is inserted between output and load capacitance. Using Eq. (3), the total resistance offered during discharging becomes 400 kΩ. Since the initially chosen \( R_2 \) is 360 kΩ, the resistance offered by \( M_2 (R_T) \) comes out to be 40 kΩ.

\[
V = V_0 \left\{ 1 - e^{-\frac{t}{\tau}} \right\} \tag{3}
\]
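One way to read this step is to invert Eq. (3) for the time constant, divide by the load capacitance to get the total resistance, and subtract \( R_2 \) to obtain the transistor contribution. The text does not state which waveform sample was used, so the sample point below is hypothetical and is synthesized from the 400 kΩ total quoted above; the helper `tau_from_sample` is likewise an illustrative name.

```python
import math

def tau_from_sample(v, v0, t):
    """Invert Eq. (3): tau = -t / ln(1 - v / v0)."""
    return -t / math.log(1.0 - v / v0)

C2 = 1e-15   # load capacitance from the text (1 fF)
R2 = 360e3   # inserted resistance from the text (ohms)
V0 = 1.8     # full swing, taken as Vdd (V)

# Hypothetical sample point, synthesized from the 400 kOhm total in the text.
tau_assumed = 400e3 * C2
t_sample = 0.4e-9
v_sample = V0 * (1.0 - math.exp(-t_sample / tau_assumed))

tau = tau_from_sample(v_sample, V0, t_sample)
r_total = tau / C2
r_t = r_total - R2  # resistance attributed to M2
print(f"R_total = {r_total / 1e3:.1f} kOhm, R_T = {r_t / 1e3:.1f} kOhm")
```

The subtraction reproduces the 40 kΩ figure only because the sample was generated from the quoted total; with measured data the result depends on where along the waveform the sample is taken.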

To remove the signal loss associated with 1 GHz switching frequency and to attain full voltage swing at the output, differential inputs are applied to the driver blocks of Figure 4a in a manner shown in Figure 5a. Thus, Figure 5b depicts that with \( R_2 = 360 \) kΩ the driver outputs do not overlap. Further, there is no eye and the comparator outputs in Figure 5c get corrupted. Under this circumstance, the amplitude of the eye needs to be adjusted to develop full and correct voltage swing at the comparator outputs. So, the value of \( R_2 \) is carefully adjusted and it is observed from Figure 6a that with 1 gigahertz switching frequency and \( R_2 = 100 \) kΩ an eye with certain amplitude is appearing at the driver outputs.

Figure 4 Schematic of the a) proposed delay circuit using differential signalling approach and b) conventional comparator circuit

Figure 5 a) Differential inputs b) driver outputs (eye) and c) comparator outputs using differential signalling approach with \( R_2 = 360 \) kΩ.
Figure 6 Plot of a) driver outputs (eye), b) comparator outputs using differential signaling approach with $R_2=100 \, \text{k}\Omega$, c) driver outputs (eye) and d) comparator outputs using differential signaling approach with $R_2=30 \, \text{k}\Omega$

As illustrated in Figure 6b, the eye amplitude is not enough for the comparator to generate correct and full voltage swing. The value of $R_2$ is again adjusted and Figure 6c depicts that with $R_2=30 \, \text{k}\Omega$ an eye with appropriate amplitude is formed at the driver outputs. Figure 6d illustrates the corresponding full voltage swing at the comparator outputs. It is also apparent from the above that the generated eye must have a voltage drop of 0.5 V so that the comparator gives full voltage swing. The signals that are studied and sampled are $S_{1010101010}$ and $S_{01010101010}$.

The power drawn by the proposed delay circuit of Figure 4a as a function of frequency is shown in Figure 7. It is interesting to note that the dynamic current for the proposed device, designed in a 180 nm process technology and operating at 1.8 V, comes out to be around 74.4 $\mu$A for 20 transistors accommodated in an area of about 39.2 $\mu$m$^2$. The intercept of about 17.7 $\mu$A corresponds to the leakage current for this set-up.

The input signal at the receiver due to process skew is shown in Figure 8a. It has been assumed that the power supply fluctuates within 5% of the allowed Vdd specification. Further, jitter within 5% is also present due to crosstalk and other issues involving the rise and fall times of the signal. Using a Monte Carlo experiment, the eye diagram is plotted in Figure 8b, which indicates that the eye is still open with some marginal loss. Under noise and jitter, the eye height and eye width vary in the manner given in Table 1. In spite of the marginal loss, a clear comparator output is evident in Figure 8c.

The plot of resistance ($R_2$) versus channel length in Figure 9 illustrates how the resistance value changes across various process files while the operating frequency is kept fixed at 1 GHz. It is important to note that as the channel length reduces, the value of $R_2$ increases, which is attributed to the larger current being drawn. Therefore, an increase in $R_2$ is essentially required to keep the delay time fixed. On the other hand, to reduce the delay time $R_2$ has to be fixed. Thus, different channel lengths require different resistance values to obtain a constant delay of 310 ps, as reported in Table 2. The curve in Figure 9 portrays a constant delay of 310 ps, which is independent of the channel length or the scaling of the process file.

<table>
<thead>
<tr>
<th>technology (nm)</th>
<th>process &amp; temperature (°C)</th>
<th>delay (ps)</th>
<th>$R_2$ (kΩ)</th>
<th>eye width (ps)</th>
<th>eye amplitude (V)</th>
</tr>
</thead>
<tbody>
<tr>
<td>180</td>
<td>TT at 27</td>
<td>310</td>
<td>30</td>
<td>498.4</td>
<td>1.1</td>
</tr>
<tr>
<td>130</td>
<td>TT at 27</td>
<td>310</td>
<td>98</td>
<td>457.6</td>
<td>0.75</td>
</tr>
<tr>
<td>90</td>
<td>TT at 27</td>
<td>310</td>
<td>130</td>
<td>421</td>
<td>0.68</td>
</tr>
<tr>
<td>65</td>
<td>TT at 27</td>
<td>310</td>
<td>180</td>
<td>418.5</td>
<td>0.52</td>
</tr>
<tr>
<td>45</td>
<td>TT at 27</td>
<td>310</td>
<td>300</td>
<td>419.5</td>
<td>0.54</td>
</tr>
<tr>
<td>32</td>
<td>TT at 27</td>
<td>310</td>
<td>475</td>
<td>368</td>
<td>0.40</td>
</tr>
</tbody>
</table>
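
The trend in the table above (a larger $R_2$ at smaller technology nodes for the same 310 ps delay) can be summarised with a simple curve fit. The sketch below fits a power law $R_2 = a L^b$ by least squares on the logarithms of the tabulated values. The power-law form is an assumption made only for illustration; it is not the empirical relation used by the authors.

```python
import math

# (technology node in nm, R2 in kOhm) for a constant 310 ps delay, from the table
data = [(180, 30), (130, 98), (90, 130), (65, 180), (45, 300), (32, 475)]

def fit_power_law(points):
    """Least-squares fit of ln(R2) = ln(a) + b*ln(L); returns (a, b)."""
    xs = [math.log(l) for l, _ in points]
    ys = [math.log(r) for _, r in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b

a, b = fit_power_law(data)
for l, r in data:
    # Compare each tabulated resistance with the fitted curve
    print(f"{l:4d} nm  tabulated {r:4d} kOhm  fitted {a * l ** b:7.1f} kOhm")
```

A negative exponent `b` reflects the observation in the text that $R_2$ must increase as the channel length shrinks.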

5. Comparison with Conventional Circuits

In order to compare the proposed delay circuit with the CSI and OSI circuits discussed in Figure 1, the three delay circuits are simulated using TSMC 180 nm technology in LTspice. The W/L ratios of the transistors, $R_1$ and $R_2$ of all three circuits are chosen with the aim of developing approximately equal delay. Figure 10 shows the output of the three delay circuits. Though the performance of a delay circuit is temperature dependent, in most applications a very precise and stable delay is desirable. Accordingly, the three delay circuits are simulated at two different temperatures and the corresponding results are given in Table 3. The data reported in this table clearly indicate that the proposed circuit has the least sensitivity to temperature variation.

In comparison with CSI and OSI, the proposed delay circuit consumes higher power, around 163.6 μW at 250 MHz. The dynamic and static power components constitute 133 μW and 30.6 μW, respectively. With regard to estimation of delay time, Figure 10 shows that the time to reach 1 V is 0.95 ns for the proposed circuit in comparison to nearly 1.1 ns for CSI and OSI. Thus the proposed method achieves a gain of about 0.15 ns, which may cause the dynamic power to increase marginally. The rise time of the proposed design is much less and corresponds to an operating frequency increase of about 14%, i.e. from 0.91 GHz to 1.05 GHz.
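
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below sums the two power components and converts the times to reach 1 V into equivalent frequencies. Treating each time as one full period is an assumption made here to reproduce the quoted 1.05 GHz and 0.91 GHz values.

```python
def equivalent_frequency_ghz(t_ns):
    """Frequency (GHz) whose period equals t_ns nanoseconds."""
    return 1.0 / t_ns

p_dynamic_uw, p_static_uw = 133.0, 30.6
p_total_uw = p_dynamic_uw + p_static_uw        # total power at 250 MHz

t_proposed_ns, t_conventional_ns = 0.95, 1.1   # time to reach 1 V
gain_ns = t_conventional_ns - t_proposed_ns    # time gained by the proposed circuit
f_proposed = equivalent_frequency_ghz(t_proposed_ns)
f_conventional = equivalent_frequency_ghz(t_conventional_ns)
```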

6. Conclusions

At high speed, a signal carried on non-differential lines arrives at the receiver with its amplitude distorted by the channel characteristics. Considering that the signal reaches the input with reduced magnitude and is spread by dispersion, this work proposes differential signal lines together with a delay circuit designed to match the input signals. The results of Monte Carlo analysis with 5% variation in the process file for both active and passive components of the design indicate a negligible impact on the outputs of the proposed circuit. The empirical relation between channel length and resistance gives rise to a fixed delay of 310 ps, which is independent of the channel length or the scaling of the process file.

Reference

Voltage Keeper Based 28.27µW New Frequency Divider Circuit in 90nm Technology for Gigascale SerDes Application

Bipasha Nath¹, Alak Majumder¹, Monalisa Das¹, Abir J Mondal¹, Pinaki Chakraborty¹, Bidyut K Bhattacharyya²
¹Department of ECE, NIT Arunachal Pradesh, Yupia-791112, India
²Department of ECE, NIT Agartala, Tripura – 799055, India

Abstract - The frequency divider circuit, used in applications such as RF and microwave systems, is essential when individual components of a system are to be driven at different operating frequencies. Though a Current Mode Logic (CML) based frequency divider runs at a higher switching frequency, it remains a point of concern due to its large power, larger area and low output swing. This paper presents an ultra-low power frequency divider circuit using hybrid logic. It uses a voltage keeper to refresh the dynamic node so as to hold the output logic level undistorted and to obtain a full swing at the output. The simulation is carried out at 6.66 GHz switching frequency with a supply voltage of 1.1 V using the 90nm Predictive Technology Model (PTM) in LTspice IV. The proposed circuit is also tested in a 4:1 serializer, an important segment of the transmitter module in high speed serial link systems.

1. Introduction

Frequency dividers play a vital role in RF and microwave systems, such as in phase-locked loop frequency synthesizers, to generate multiple ranges of frequencies from a reference frequency. Microwave frequency division is a novel approach that employs subharmonic generation without the use of oscillators. Further, as each successive stage divides the frequency by two, it can be termed as ‘mapping of wideband input into narrowband output’, which makes dividers useful in signal processing applications. They are also used at the Defence Research Organizations for ultra-broadband countermeasure systems, digital signal manipulation and frequency multiplication [1].

Frequency dividers are broadly classified as analogue and digital dividers. Analogue dividers include the regenerative or Miller frequency divider and the injection-locked frequency divider. The regenerative frequency divider uses a low pass filter followed by an amplifier, with the output fed back through a mixer. It thus removes the higher frequency noise, making it useful in low-phase-noise frequency synthesizers. The injection-locked frequency divider [2] uses the concept of forced oscillation in non-linear oscillators having a self-oscillation (free-running) frequency and works at high frequency and speed. In contrast, digital dividers have simple structures, large bandwidth and good robustness over process variations, but suffer from higher power consumption at frequencies beyond GHz.

It is found that the CML based divider is a better choice for very high frequency operation. However, due to its reduced output swing, large size and the presence of high static current, its power consumption is large. Thus, for low power applications CML cannot be considered; rather, it is replaced with static CMOS / hybrid logic. In 2001, Wu et al. [3] presented a 19 GHz / 0.5 mW frequency divider using a shunt-peaking locking range enhancement technique to increase the frequency range. Another wide band frequency divider was proposed in [4], which uses an LC oscillator for a wide locking range but incurs an area penalty. A frequency divider circuit implemented for different biomedical applications is presented in [5], which covers the frequency range required for the medical implant communication service (MICS) and the industrial, scientific and medical (ISM) frequency bands. Article [6] proposed an architecture for Neutrino experiments at 4 GHz. Since individual components in a system may have different operating frequencies, the generation of different frequencies at a lower power cost is a point of concern. Hence, an attempt has been made in this paper to come up with an ultra-low power frequency divider circuit with fewer transistors and a higher operating frequency.

The paper is organized as follows: Section 2 presents the proposed architecture and describes its operation in detail. This is followed by the simulation results and different performance analyses in Section 3. Section 4 depicts the application of the proposed divider in a SerDes system. Finally, Section 5 concludes the work.

2. Proposed Frequency Divider

The proposed architecture of the frequency divider, comprising 12 transistors, along with its transient response (up to frequency divided by 16) is shown in figure 1.

A. Working Principle

The gate terminal of NMOS transistor M1 is driven by a complementary clock, which regulates the flow of current through the device. The input is fed to M1 and its output (node A) is directly connected to the gate terminals of PMOS M3 and NMOS M4, which together constitute a CMOS inverter. Therefore, when the potential at node A ($V_A$) is greater than the threshold voltage ($V_{TH}$) of M4, the node capacitance at node B is discharged to ground. Conversely, when $V_A$ is low enough to turn on M3, the node capacitance at B is charged up to $V_{DD}$. This stable charging to $V_{DD}$ and discharging to ground are denoted as logic ‘1’ and logic ‘0’ respectively. So, if $V_A$ is at logic ‘0’, the potential at node B ($V_B$) will be logic ‘1’, which keeps the PMOS M2 in cut-off and holds the logic ‘0’ at node A. Similarly, when $V_A$ is at logic ‘1’, $V_B$ will be at logic ‘0’, turning the PMOS M2 ON, which holds the logic ‘1’. The beauty of this circuit is that, even if the logic level at node A degrades due to M1 (an NMOS is a poor conductor of logic ‘1’), it is restored by the power supply $V_{DD}$. Here, the PMOS M2 acts as a voltage keeper, keeping a strict eye on the logic levels of node A. The architecture consisting of transistors M1, M2, M3 and M4 works as a latch and may be considered the MASTER latch. A similar architecture comprising M5, M6, M7 and M8 is repeated as the SLAVE latch, where the gate terminal of M5 is driven by the clock. The complement of the SLAVE latch output (obtained by M9 and M10) is used to drive M1 of the MASTER latch, which makes the circuit work as a T flip-flop, i.e. as a frequency divider. The reason for cascading the two counterparts as MASTER and SLAVE is to make the architecture edge triggered. Cascading the proposed divider circuit N times generates multiple frequencies down to $f_{in}/2^N$ (i.e., $f_{in}/2$, $f_{in}/4$, $f_{in}/8$ and so on).
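
The divide-by-two behaviour and the cascading to $f_{in}/2^N$ can be illustrated with a purely behavioural model. The sketch below is not a transistor-level model of the proposed circuit: it simply treats each stage as an ideal T flip-flop that toggles on every rising clock edge, and the function names are chosen here for illustration only.

```python
def divide_by_two(clock):
    """Toggle the output on every rising edge of `clock` (a list of 0/1 samples)."""
    out, q, prev = [], 0, 0
    for level in clock:
        if prev == 0 and level == 1:   # rising edge: the T flip-flop toggles
            q ^= 1
        out.append(q)
        prev = level
    return out

def cascade(clock, n_stages):
    """Return the outputs of n cascaded divide-by-two stages."""
    outputs, signal = [], clock
    for _ in range(n_stages):
        signal = divide_by_two(signal)
        outputs.append(signal)
    return outputs

def rising_edges(signal):
    """Count 0-to-1 transitions, assuming the signal starts from 0."""
    return sum(1 for a, b in zip([0] + signal[:-1], signal) if a == 0 and b == 1)

clock = [0, 1] * 64                    # 64 input clock periods
stages = cascade(clock, 4)             # f_in/2, f_in/4, f_in/8, f_in/16
```

Counting rising edges with `rising_edges` on each entry of `stages` shows the edge count halving from one stage to the next, which is the behavioural counterpart of the transient response up to divide-by-16.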

B. Significance of Voltage Keeper circuits

The concept of the voltage keeper arose from the need for a refreshing technique for every dynamic node within integrated circuits. Its job is to continuously hold the logic voltage level at a particular node with the help of the power supply $V_{DD}$. However, if the transistor acting as keeper is left ON all the time, it will leak current from $V_{DD}$, leading to large power dissipation. Therefore, a control is needed to turn the keeper circuit ON only when it is required. In addition, the ON resistance of the keeper transistor should be at an optimal value, so that the current flows through it quickly. Otherwise, the dynamic node may get contaminated with cross-talk during the refreshing of the logic level.

In our proposed architecture, the voltage keepers M2 and M6 are employed to hold the logic voltage levels at nodes X and Y, and are controlled by the voltages at nodes B and C respectively.

3. Results and Discussions

The proposed frequency divider circuit is designed and simulated using 90nm PTM technology [7] at a supply voltage of 1.1 V. The clock frequency used is 6.66 GHz with a rise time and fall time of 10 ps each. The performance parameters of the ÷2 circuit are measured and shown in table I.

Table I: Performance Parameters of the proposed divider circuit

To check the reliability and robustness of the proposed circuit, we have simulated it in 1000 Monte Carlo runs at three different process corners (Fast–Fast, Typical–Typical and Slow–Slow). The results are shown in table II. It is seen that even with 5% process skew the circuit produces parameter values almost identical to those obtained with no skew. This indicates that the proposed frequency divider functions correctly under process variation, with only a small deviation in average power and delay. The circuit is therefore a suitable choice across temperatures and continues to work well even if some skew is present at its intermediate nodes.

Table II: Process Variation of Performance Parameters through Monte-Carlo

<table>
<thead>
<tr>
<th>Process Corners</th>
<th>No Skew</th>
<th>5% Process Skew</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Mean</td>
<td>Std. Deviation</td>
</tr>
<tr>
<td>FF (0ºC)</td>
<td>23.826</td>
<td>22.462</td>
</tr>
<tr>
<td>TT (27ºC)</td>
<td>29.524</td>
<td>23.714</td>
</tr>
<tr>
<td>SS (90ºC)</td>
<td>29.040</td>
<td>29.350</td>
</tr>
</tbody>
</table>

We have analyzed the delay and average power of the proposed circuit with a variation in $V_{DD}$ at different process corners, as shown in figures 2a and 2b. It is seen that the delay increases with a reduction of the supply voltage and an increase in temperature. However, the power reduces with the scaling down of $V_{DD}$ and temperature. So, at 0ºC, the circuit is most reliable in terms of both power and delay. As the temperature increases from 27 ºC to 90 ºC, the performance of the circuit degrades, but only by a very small margin. The power dissipation of the proposed circuit is compared with existing frequency divider architectures available in the literature and is shown in figure 3. The bar chart reveals that the proposed voltage keeper based circuit consumes the least power.

4. Application of Proposed Divider in Serializer

Though parallel communication is intrinsically faster than serial communication, serial data transfer is preferred in high-speed data links such as chip-to-chip communication over backplanes, computer networks and computer peripheral buses. SerDes (Serializer - Deserializer) is considered to be the heart of communication in high speed links. Such a device is capable of converting data from parallel to serial and vice versa. A serializer is a circuit made of high-speed 2:1 multiplexers, which performs parallel to serial data conversion. Each level of the serializer works at a frequency which is double the operating frequency of the previous level; hence the importance of the frequency divider. The basic block diagram of a 4:1 serializer circuit and the truth table of its operation are given in figure 4.

Figure 4a contains three 2:1 MUXes at two levels, where if the second level works at a frequency of \( f_{\text{in}} \), the previous level works at half the frequency of \( f_{\text{in}} \). The input to the select line of the second level of the serializer is driven from the output of the frequency divider. The multiplexer blocks of the first level in figure 4a are given pulse inputs A, B, C and D. These inputs are serialized at the output of the 2\(^{nd}\) level following figure 4b, as per the logic levels of the select lines in both multiplexer levels.

The 4:1 serializer based on the proposed frequency divider is simulated using 90nm PTM. The divider circuit, driven by a 6.66 GHz clock (\( f_{\text{in}} \)), outputs a frequency of 3.33 GHz to be used as \( f_{\text{in}}/2 \). The transient response of the serializer is plotted in figure 5. It is observed that the output follows the truth table given in figure 4b, depending on the logic at the select lines. For the logic level 00 at \( f_{\text{in}}/2 \) and \( f_{\text{in}} \) respectively, A is passed to the output. Similarly, for 01, 10 and 11, C, B and D are passed respectively.
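
The select-line behaviour described above can be sketched as a small functional model of a two-level multiplexer tree. The pairing of the inputs at the first level, (A, B) on one multiplexer and (C, D) on the other, is an assumption chosen so that the model reproduces the stated order A, C, B, D; the actual wiring is given only in figure 4a.

```python
def mux2(sel, in0, in1):
    """2:1 multiplexer: pass in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0

def serializer_4to1(a, b, c, d, sel_half, sel_full):
    """Two-level 4:1 serializer.

    sel_half: select of the first level (toggles at f_in/2)
    sel_full: select of the second level (toggles at f_in)
    The pairing (A, B) and (C, D) is an assumption chosen to match the text.
    """
    first_top = mux2(sel_half, a, b)
    first_bottom = mux2(sel_half, c, d)
    return mux2(sel_full, first_top, first_bottom)

# Select values (f_in/2, f_in) stepped through 00, 01, 10, 11
order = [serializer_4to1("A", "B", "C", "D", s_half, s_full)
         for s_half, s_full in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```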

5. Conclusion

In this paper, a novel frequency divider circuit is proposed in hybrid CMOS logic incorporating the voltage keeper principle, which holds the output logic level so as to obtain full swing at the output. The power dissipation is 28.27 µW, which is much lower than that of the existing designs available in the literature. The circuit proves to be resilient, as it offers a definitive output at three different process corners simulated through Monte Carlo analysis. As the circuit runs at a high switching frequency, it is tested in a 4:1 serializer architecture to demonstrate its suitability for high speed data link applications. The circuit may also be used in designing the programmable frequency divider of an RF synthesizer circuit for wireless access.

References

Power Efficient Test Pattern Generator Using Bit-Swapping LFSR Technique

Sandeep Kakde¹, Yashika Gaidhani², Tejas Thubrikar³, Shailesh Kamble⁴, Nikit Shah⁵

¹²³Department of Electronics Engineering, Yeshwantrao Chavan College of Engineering, Nagpur, India.
⁴Department of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur, India.
⁵Department of Electrical Engineering, San Jose State University, California 95192.

Abstract – The testing of VLSI circuits poses many challenges in terms of area, power, and latency. Bit-swapping test pattern generation is a key technique for testing complex VLSI designs. In this paper, a 32-bit Bit-swapping test pattern generator is proposed for testing VLSI designs. The 32-bit test pattern generator is implemented with an efficient LFSR and with multiplexers for bit swapping, which achieves low power consumption. The switching activity between two consecutive test vectors is reduced, which results in low power consumption. The proposed test pattern generator consumes 32 mW of power with a latency of 4.713 ns. The improvement in the switching activity required for 32-bit test pattern generation is presented in this paper. It is observed that the total power consumed in the Bit-Swapping linear feedback shift register is 34.69% less than in the conventional LFSR. The design is implemented in Verilog HDL using the Xilinx 13.1 ISE design suite.

Introduction

Testing digital circuits with low power consumption is a major challenge today. The development of the microelectronics industry allows a complex digital system to be designed on a single chip. To test such a complex VLSI design, the Built-in-self-test (BIST) technique has been extensively studied and is widely used nowadays. In BIST, the pattern is generated and applied to the circuit under test (CUT) by an on-chip system, and minimizing hardware overhead is a major concern of BIST implementation. BIST uses an LFSR to generate the test patterns, which results in high power consumption because the switching activity between two consecutive patterns is high. Power dissipation is a major problem in systems-on-chip (SoCs), which contain a very large number of transistors. In CMOS technology, three types of power dissipation occur:

- **Static Power Dissipation**: It occurs when the system is in steady state.
  \[ P_{\text{static}} = V_{DD} \cdot I_{\text{leakage}} \]

- **Short Circuit Power Dissipation**: It occurs when current flows directly from the power supply to ground through a short-circuit path.
  \[ P_{\text{short}} = V_{DD} \cdot I_{\text{short}} \]

- **Dynamic Power Dissipation**: It is the dominant component of power dissipation in CMOS devices, accounting for about 90% of the overall power consumption. It arises from the switching caused by the different input combinations applied to the device during testing.
  \[ P_{\text{dynamic}} = \beta C V_{DD}^2 f \]

  where \( \beta \) is the switching activity, \( C \) is the load capacitance, \( f \) is the switching frequency and \( V_{DD} \) is the supply voltage.
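
The three expressions above translate directly into code. The sketch below is a minimal illustration; every numerical value in it (supply voltage, capacitance, frequency, switching activity) is a hypothetical operating point chosen for the example and is not taken from the paper.

```python
def static_power(vdd, i_leakage):
    """P_static = V_DD * I_leakage."""
    return vdd * i_leakage

def short_circuit_power(vdd, i_short):
    """P_short = V_DD * I_short."""
    return vdd * i_short

def dynamic_power(beta, c_load, vdd, freq):
    """P_dynamic = beta * C * V_DD^2 * f."""
    return beta * c_load * vdd ** 2 * freq

# Hypothetical operating point (all values are assumptions for illustration)
vdd = 1.2            # supply voltage in volts
p_dyn_full = dynamic_power(beta=0.5, c_load=10e-12, vdd=vdd, freq=100e6)
p_dyn_reduced = dynamic_power(beta=0.25, c_load=10e-12, vdd=vdd, freq=100e6)
```

Since the dynamic term is linear in \( \beta \), halving the switching activity halves the dynamic power at a fixed supply voltage, capacitance and frequency, which is the lever that low-transition test pattern generators rely on.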

In terms of power consumption, the short-circuit and static power dissipation are smaller than the dynamic power dissipation. Therefore, dynamic power dissipation is the crucial source of power dissipation in CMOS. In general, the power consumption of a system in test mode is higher than in normal mode. The four reasons for the increase in power dissipation during test mode are: high switching activity caused by the nature of the test patterns, parallel activation of internal cores during testing, power consumed by extra design-for-test circuitry, and low correlation among test vectors.

Some previous approaches used to reduce power consumption during the testing of VLSI circuits are discussed in this section. In [1], a test pattern generator model using a Linear Feedback Shift Register is introduced, which is used efficiently for wireless communication applications; this TPG method can also be used for the secure transmission of codes with low power consumption and for applications such as data compression and PN sequence generation. Another technique that reduces power consumption during testing is the low-transition linear feedback shift register, which gives a higher correlation between two test patterns than the conventional linear feedback shift register; this in turn reduces the switching activity, i.e. the number of transitions between two test vectors, and hence the power consumption during testing [2]. In [3-4], test patterns with high fault coverage are generated for Built-In-Self-test applications in a way that reduces the number of transitions at the scan inputs during scan shift operations; this reduces the switching activity in the circuit under test (CUT) and ultimately the test power, with high target fault coverage and without increasing the test sequence length. In [5], a reduction in static power is achieved using a dual-threshold Bit-swapping LFSR, and an automatic test pattern generator is implemented at lower cost. In [6], two strategies, LOC (Launch-off-Capture) and Bit-swapping, are used to test the ISCAS'89 S27 benchmark circuit with efficient area overhead and low power consumption. In [7], a low power test pattern generator is proposed that reduces the power consumption during test mode with a minimum number of switching activities by using an LP-LFSR in place of the conventional LFSR. The low correlation between test vectors is addressed in [7] [6]. In [9], a low power LFSR architecture with high fault coverage is proposed for BIST architectures used in security applications.

Design Approach

This paper presents a test pattern generator using a 32-bit linear feedback shift register and a 32-bit Bit-swapping linear feedback shift register. The idea of this paper is to increase the correlation between two test vectors and reduce the number of transitions, i.e. the switching activity. The conventional linear feedback shift register (LFSR) and the Bit-Swapping LFSR are discussed in subsections A and B respectively.

A. LINEAR FEEDBACK SHIFT REGISTER

The Linear Feedback Shift Register (LFSR) generates test patterns known as pseudo-random patterns. An LFSR consists of a number of flip-flops connected in series, with an exclusive-OR gate used as feedback. The 32-bit LFSR structure that generates the random test patterns is shown in Fig. 1. The exclusive-OR gate in Fig. 1 takes its two inputs from the outputs of the last and first flip-flops, and its output is fed to the input of the first flip-flop. The shift register is made of a number of D flip-flops connected in series and driven by the same clock.

![32-bit LFSR Structure](image)

Fig. 1: 32-bit LFSR Structure
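
The structure described above can be sketched as a bit-level software model: the first flip-flop receives the exclusive-OR of the first and last flip-flop outputs, and every other flip-flop takes the value of its left neighbour. This is an illustrative Python model rather than the authors' Verilog implementation, and the seed is a hypothetical value.

```python
def lfsr_step(state):
    """One clock of the LFSR described above.

    state[0] is flip-flop 1 (C1) and state[-1] is the last flip-flop (C32).
    C1(next) = C1 XOR C32; every other flip-flop takes its left neighbour.
    """
    feedback = state[0] ^ state[-1]
    return [feedback] + state[:-1]

def lfsr_patterns(seed, count):
    """Return `count` successive patterns, starting with the seed itself."""
    patterns, state = [], list(seed)
    for _ in range(count):
        patterns.append(state)
        state = lfsr_step(state)
    return patterns

seed = [1] + [0] * 31          # hypothetical non-zero 32-bit seed
patterns = lfsr_patterns(seed, 5)
```

Note that an all-zero seed maps to itself under this update rule, so a non-zero seed is needed for the register to produce a changing sequence.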

Limitation of LFSR:
The Linear Feedback Shift Register is used to generate pseudo-random test patterns. These random patterns result in high switching activity, i.e. a large number of transitions between two consecutive test patterns, which in turn increases the power consumption during the testing of any digital circuit. To overcome this limitation, we introduce the technique named Bit-swapping Linear Feedback Shift Register, discussed in section B.

B. BIT-SWAPPING LINEAR FEEDBACK SHIFT REGISTER

By adding some extra circuitry to the Linear Feedback Shift Register we can create a Bit-Swapping LFSR (BS-LFSR). The Bit-swapping LFSR consists of an LFSR and 2x1 multiplexers. It is a modified version of the conventional LFSR which generates a pseudo-random pattern. In the BS-LFSR, the function of the 2x1 Mux is to swap the bits of the LFSR. With this swapping we can reduce the number of transitions, i.e. the switching activity, between consecutive test patterns. This reduction in switching activity reduces the average power consumption. Fig. 2 shows the 32-bit Bit-swapping LFSR structure.

Fig. 2: 32-bit Bit-swapping LFSR Structure

Design Steps:
1. The Bit-swapping LFSR generates the same number of bit patterns as the conventional LFSR, but the test patterns differ in form.
2. The Bit-swapping LFSR produces the same number of 0s and 1s, since the 2x1 multiplexers only swap the bits of consecutive flip-flops.
3. The structure in Fig. 2 generates 32-bit test patterns using a conventional LFSR and 32 2x1 multiplexers.
4. The function of the multiplexers is to swap the output bit of flip-flop 1 with 2, 3 with 4, and so on up to 31 with 32.
5. With this swapping, the transitions between consecutive patterns are reduced.
6. A 32-bit LFSR begins with an initial seed and generates test patterns for $2^{32}$ clock cycles until it returns to its initial seed.
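
The design steps above can be sketched as a software model of the multiplexer row placed after the LFSR. The text does not state which signal drives the multiplexer select lines; the sketch assumes that the last flip-flop (C32) is the select and that the pairs are swapped when it is 0. Both choices are assumptions made for illustration, as are the function names.

```python
def lfsr_step(state):
    """Conventional LFSR clock: C1(next) = C1 XOR C32, the rest shift right."""
    return [state[0] ^ state[-1]] + state[:-1]

def bit_swap(state, select):
    """Model the 2x1 multiplexers: swap cells (1,2), (3,4), ... when select is 0."""
    if select:
        return list(state)
    out = list(state)
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

def count_transitions(patterns):
    """Total number of bit flips between consecutive patterns."""
    return sum(a != b for p, q in zip(patterns, patterns[1:]) for a, b in zip(p, q))

def generate(seed, count):
    """Return (LFSR patterns, bit-swapped patterns) for `count` clocks."""
    plain, swapped, state = [], [], list(seed)
    for _ in range(count):
        plain.append(state)
        swapped.append(bit_swap(state, select=state[-1]))  # select = C32 (assumption)
        state = lfsr_step(state)
    return plain, swapped
```

Calling `count_transitions` on the two lists returned by `generate` gives a direct way to compare the switching activity of the conventional and bit-swapped sequences for a chosen seed and length.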

In Table 1, without bit swapping the number of transitions in C1 and C2 is 8 each. After applying the bit swapping technique, the number of transitions in the corresponding outputs is 8 and 4 respectively. The total number of transitions is therefore 16 without bit swapping and 12 with bit swapping. This decrease in the number of transitions results in a reduction of power consumption, and the peak power is also reduced. It is important to note that the overall saving in the number of transitions between the outputs of the multiplexers is 25%. This is because the value of C1 in the present state affects both the value of C2 and its own value in the next state (C2 (Next) = C1 and C1 (Next) = “C1 Xor C32”).

To see the effect of each register on the transition savings, consider Table 1. O1 saves one transition when moving from state [0,0,1] to next state [1,0,0], from [0,1,1] to [1,0,0], from [1,0,1] to [0,1,0], or from [1,1,1] to [0,1,0], whereas O1 adds one transition when moving from state [0,1,0] to next state [0,0,0], from [0,1,0] to [0,0,1], from [1,0,0] to [1,1,0], or from [1,0,0] to [1,1,1]. Therefore, O1 adds four transitions and saves four transitions, so the overall effect on O1 is neutral. On the other hand, O2 saves one transition when moving from state [0,1,0] to next state [0,0,0], from [0,1,0] to [0,0,1], from [0,1,1] to [1,0,0], from [1,0,0] to [1,1,0], from [1,0,0] to [1,1,1], and from [1,0,1] to [0,1,0], whereas O2 incurs two additional transitions, one when moving from state [0,0,1] to [1,0,0] and one from [1,1,1] to [0,1,0]. As a result, O2 saves four transitions overall, where each initial state has a probability of 1/8 and each final state a probability of 1/2.
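
The transition counts quoted above can be reproduced by enumerating every state [C1, C2, C32] together with both possible next values of C32. The sketch below does this with the update rules given in the text. The swap polarity (the pair is swapped when C32 is 0) is an assumption inferred from the state transitions listed above, not something stated explicitly.

```python
from itertools import product

def count_pair_transitions():
    """Count transitions over all (C1, C2, C32) states and both next values of C32."""
    totals = {"C1": 0, "C2": 0, "O1": 0, "O2": 0}
    for c1, c2, sel, sel_next in product((0, 1), repeat=4):
        c1_next, c2_next = c1 ^ sel, c1          # LFSR update rules from the text
        # Multiplexer outputs: swap the pair when the select bit (C32) is 0
        o1, o2 = (c2, c1) if sel == 0 else (c1, c2)
        o1_next, o2_next = (c2_next, c1_next) if sel_next == 0 else (c1_next, c2_next)
        totals["C1"] += c1 != c1_next
        totals["C2"] += c2 != c2_next
        totals["O1"] += o1 != o1_next
        totals["O2"] += o2 != o2_next
    return totals

totals = count_pair_transitions()
```

Under these assumptions the enumeration gives 8 transitions each for C1 and C2, and 8 and 4 for O1 and O2, i.e. 16 in total without swapping and 12 with swapping, which matches the 25% saving stated in the text.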
### Table 1

**NUMBER OF TRANSITION IN EACH REGISTER IN LFSR WITHOUT APPLYING BIT SWAPPING LFSR TECHNIQUE AND WITH BIT SWAPPING**

<table>
<thead>
<tr>
<th>States</th>
<th>Next state</th>
<th>Transition</th>
<th>States</th>
<th>Next state</th>
<th>Transition</th>
</tr>
</thead>
<tbody>
<tr>
<td>C1</td>
<td>C2</td>
<td>C32</td>
<td>C1</td>
<td>C2</td>
<td>C32</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>Σ</td>
<td>Transition</td>
<td>8</td>
<td>8</td>
<td>16</td>
<td>8</td>
</tr>
</tbody>
</table>

**Fig. 3: C17 Benchmark circuit**

**Implementation Model**

The implementation model is developed in the Xilinx 13.1 ISE design suite with Verilog coding. Fig. 3 shows one of the ISCAS'85 benchmark circuits, named C17. This circuit is designed in a good and a bad (faulty) version, each producing its own signature, and is tested with both the LFSR and the Bit-swapping LFSR.

The testing of the above circuit with the LFSR and the Bit-swapping LFSR is shown in Fig. 4. The test patterns are generated through the LFSR and the Bit-swapping LFSR respectively. The generated test patterns are applied to the good circuit and the bad circuit for fault detection. A comparator compares the signatures of the good and bad circuits and indicates for which test pattern the circuit is faulty.
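
The good-circuit versus bad-circuit comparison can be sketched in software using the standard ISCAS'85 c17 netlist of six NAND gates. The injected fault (internal net N11 stuck at a fixed value) is a hypothetical example, and the sketch applies all 32 input patterns exhaustively instead of LFSR-generated patterns; neither choice is taken from the paper.

```python
from itertools import product

def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)

def c17(n1, n2, n3, n6, n7, stuck_n11=None):
    """ISCAS'85 c17: six NAND gates, outputs (N22, N23).

    stuck_n11 optionally forces internal net N11 to 0 or 1 (hypothetical fault).
    """
    n10 = nand(n1, n3)
    n11 = nand(n3, n6) if stuck_n11 is None else stuck_n11
    n16 = nand(n2, n11)
    n19 = nand(n11, n7)
    return nand(n10, n16), nand(n16, n19)

def detecting_patterns(stuck_value):
    """Input patterns for which the good and faulty outputs differ."""
    return [p for p in product((0, 1), repeat=5)
            if c17(*p) != c17(*p, stuck_n11=stuck_value)]

detected = detecting_patterns(0)   # patterns that expose N11 stuck-at-0
```

The comparison inside `detecting_patterns` plays the role of the comparator in Fig. 4: a pattern detects the fault only when the good and faulty outputs disagree.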
Simulation Result:

Fig. 4: Flow Chart for Testing

Fig. 5: 32-bit Pattern generation using LFSR

Fig. 6: 32-bit Pattern generation using Bit-swapping LFSR

The simulation results are obtained from the Xilinx 13.1 ISE design suite with the target device xc6slx16-3csg324, for which a VCD file is generated after post-simulation. XPower Analyzer is used to determine the dynamic as well as the quiescent power. The results are obtained for each case, and the comparison is made on the basis of power, latency, and area overhead as shown in the following table.

### Table 2

**Comparison of Power Dissipation, Latency, No. of LUT’s for Conventional LFSR and Bit-Swapping LFSR**

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type of Random Sequence generator</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Conventional LFSR</td>
</tr>
<tr>
<td>No. of Bits</td>
<td>32-bit</td>
</tr>
<tr>
<td>No. of Slices Register</td>
<td>32</td>
</tr>
<tr>
<td>No. of bounded IOBs</td>
<td>34</td>
</tr>
<tr>
<td>Total Power (mW)</td>
<td>49.00</td>
</tr>
<tr>
<td>Latency (ns)</td>
<td>3.668</td>
</tr>
<tr>
<td>No. of LUT’s</td>
<td>12</td>
</tr>
</tbody>
</table>
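
The reported power saving can be checked from the two total power figures. The sketch below uses the 49.00 mW of the conventional LFSR from Table 2 and the 32 mW quoted in the abstract for the Bit-Swapping LFSR; the helper name `percent_saving` is introduced here for illustration.

```python
def percent_saving(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline

p_lfsr_mw = 49.00      # conventional LFSR, from Table 2
p_bs_lfsr_mw = 32.00   # Bit-Swapping LFSR, as quoted in the abstract
saving = percent_saving(p_lfsr_mw, p_bs_lfsr_mw)
```

With these two figures the saving evaluates to about 34.69%, in agreement with the value stated in the paper.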
Conclusion

This paper presents an effective Verilog HDL implementation of a low power test pattern generator using the Bit-Swapping LFSR technique. It also shows that the number of transitions during test pattern generation is greatly reduced compared to the conventional LFSR technique. The total power consumed in the Bit-Swapping linear feedback shift register is 34.69% less than in the conventional LFSR. It is concluded that the low power Bit-Swapping LFSR is very useful for power optimization during test mode.


Estimation of Stability and Performance metric for Inward Access Transistor based 6T SRAM Cell Design using n-type/p-type DMDG-GDOV TFET

Sunil Kumar and Balwinder Raj
Department of Electronics and Communication Engineering, National Institute of Technology Jalandhar, India.

Abstract: In SRAM cell design using novel devices, stability is the major concern under variability and supply voltage scaling in the nanoscale regime. This work presents the stability and performance criteria of a standard 6T SRAM cell using inward-access n-type TFET (IA-nTFET) and inward-access p-type TFET (IA-pTFET) access transistors. A DMDG-GDOV TFET is used to build the 6T SRAM cell. Owing to its steep subthreshold swing and very low off current, the TFET is suitable for very-low-power circuit applications under both static and dynamic conditions. However, the inherently unidirectional current conduction of the TFET poses a significant issue for the access transistors of an SRAM cell, which are needed for read and write access. Hence, we analyze the 6T TFET SRAM cell design using both n-type and p-type inward-access transistor techniques and find that the inward-access p-TFET gives better stability and performance than the inward-access n-TFET at VDD = 0.9 V. However, at a low VDD of 0.6 V, inward-access p-TFET transistors are also not very reliable for the 6T SRAM cell. Therefore, to enhance the read/write ability of the 6T SRAM cell, wordline lowering is used as a Write Assist (WA) technique and wordline raising is used as a Read Assist (RA) technique.

1. INTRODUCTION

Nowadays, multiple processing cores are being integrated on a single chip, in which SRAM is heavily used in the memory hierarchy as on-chip cache in processors [1, 2]. Hence, SRAM cell design has followed technology scaling. However, in the nanoscale regime the SRAM cell is more vulnerable to process variations and threshold voltage mismatch than logic circuits [3, 4]. The cell becomes less stable with lower supply voltage, increasing leakage currents, and increasing variability due to technology scaling. Since the thermal carrier distribution in MOSFETs limits the subthreshold swing to 60 mV/decade and restricts low-voltage operation, the result is a low ON-to-OFF current ratio and increased leakage current [5]. At low voltages, the drive current of the MOSFET inevitably drops due to reduced gate overdrive, resulting in large signal transition delays [6, 7].
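The 60 mV/decade figure quoted above follows from the thermal voltage kT/q: the ideal MOSFET subthreshold swing is ln(10)·kT/q. A minimal check, assuming room temperature (300 K) and an ideal body factor of 1:

```python
import math

BOLTZMANN = 1.380649e-23            # J/K
ELECTRON_CHARGE = 1.602176634e-19   # C


def thermal_swing_limit_mv(temperature_k: float = 300.0) -> float:
    """Return ln(10) * kT/q in mV/decade (ideal MOSFET limit, body factor 1)."""
    thermal_voltage = BOLTZMANN * temperature_k / ELECTRON_CHARGE
    return math.log(10) * thermal_voltage * 1e3


if __name__ == "__main__":
    print(f"{thermal_swing_limit_mv():.1f} mV/decade at 300 K")
```

At 300 K this evaluates to roughly 59.5 mV/decade, which is the value usually rounded to 60 mV/decade.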

Recently, the tunnel FET (TFET) has shown the ability to operate at very low voltages. It modulates an inter-band tunneling (IBT) current, unlike the MOSFET, which relies on a thermal distribution process (kT/q) [8, 9]. Owing to its quantum-mechanical electron transport mechanism, the TFET has a steep slope below 60 mV/decade, very low OFF current, and a high Ion/Ioff ratio [10]. On the other hand, the TFET has some limitations, such as low ON current (which depends on the bandgap of the semiconductor material), parasitic ambipolar behavior, and inherently uni-directional current conduction due to the asymmetrical p-i-n device structure [11-12], which is critical for access-transistor-based circuits and SRAMs [11-15]. Read and write stability of the 6T SRAM cell is characterized by the noise margin, which decreases with scaling of the device dimensions and reduction of the supply voltage. Therefore, SRAM design using assist techniques requires maintaining bias voltages such that the improvement in one margin is balanced against the degradation in the other [16-17]. Read and write stability depend strongly on wordline and cell voltage biasing. Therefore, these two voltage tuning knobs are suggested so that the overall reliability of the SRAM cell can be increased.

In this work, we analyze the stability and performance criteria of a 6T SRAM cell using inward-access transistors based on the previously proposed Dual Material Double Gate (DMDG) TFET with emphasis on the Gate Drain Overlap (GDOV) device design [18]. We investigate the 6T SRAM cell design based on both inward-facing n-type and p-type access transistors.

The rest of this paper is organized as follows. Section II briefly discusses the device design, operation, and characteristics of the DMDG-GDOV TFET. Section III describes the 6T SRAM cell design and operation using the access techniques. Section IV discusses its static and dynamic stability analysis. Section V introduces the write assist (WA) and read assist (RA) techniques to improve write and read ability, together with the performance evaluation. Finally, Section VI concludes the paper.
2. DEVICE OPERATION OF DMDG-GDOV TUNNEL FETS

A. The device structure and simulation Methodology

In this work, we consider the DMDG-GDOV TFET for its capability to achieve a steep slope below 60 mV/decade, low off current, and high $I_{ON}$, as well as a reduced ambipolar effect, which otherwise degrades the switching performance of the transistor. All the parameters are comparable to the values in previous TFET studies [19]. The DMDG-GDOV TFETs are analyzed using Silvaco Atlas, version 5.19.20.R [20]. A nonlocal band-to-band tunneling (BTBT) model is used to account for the tunneling current along the lateral direction, with actual spatial charge transfer across the tunnel barrier. A concentration-dependent mobility model is included in the simulation. The Band Gap Narrowing (BGN) model is used for the high doping concentrations in the source and drain regions. Fermi-Dirac statistics, the Shockley-Read-Hall (SRH) recombination model, and the Auger model under high electric field are also used.

![Fig. 1. Device Structures of (a) n-type DMDG-GDOV TFET and (b) p-type DMDG-GDOV TFET.](image)

Fig. 1 presents the device structures of the Si-based n-type and p-type DMDG-GDOV TFETs. Both TFETs have a channel length of 32 nm, and the gate overlap on the drain side is 19 nm for both devices. The silicon body thickness is $t_{Si}=10\text{nm}$, and the source and drain doping concentrations are $10^{20}/\text{cm}^3$ and $10^{19}/\text{cm}^3$, respectively. A 3 nm high-k dielectric (HfO$_2$, permittivity 25) is used as the gate insulator, and a low-k dielectric (SiO$_2$) is used over the source and drain regions on both sides to reduce the parasitic capacitances. The intrinsic channel is lightly p-doped, i.e. $10^{17}/\text{cm}^3$.

Fig. 1(a) and Fig. 1(b) show the DMDG-GDOV TFET with metal gate $M_1$ aligned to the source junction and metal gate $M_2$ overlapping the drain region (i.e. GDOV = 19 nm from the drain-channel junction) for both types of TFET. In both structures, high-k (HfO$_2$) and low-k (SiO$_2$) dielectrics are used, with the high-k dielectric layered on the top and bottom of the channel region. The gate electrode work functions of the two materials, $M_1$ and $M_2$, are chosen to effectively control the tunnel barrier width at the source-channel interface or at the drain-channel interface, depending on whether a positive or negative gate voltage is applied for current conduction. Fig. 2(a) and Fig. 2(b) show the energy band diagrams of the n- and p-type DMDG-GDOV TFETs in the OFF and ON states.

In the OFF state ($V_{GS} =0.0V$, $V_{DS} =0.9V$ for the n-type TFET and $V_{GS} =0.0V$, $V_{DS} =-0.9V$ for the p-type TFET), the energy difference between the conduction band in the channel region and the valence band in the source region at the tunnel junction is larger than in the ON state, as shown by the thick line. Therefore, the probability of electrons tunneling from source to channel is very low, resulting in a low off-state current. In the ON state ($V_{GS} =0.9V$, $V_{DS} =0.9V$ for the n-type TFET and $V_{GS} =-0.9V$, $V_{DS} =-0.9V$ for the p-type TFET), the direction of electron tunneling, from the Valence Band (VB) of the source to the Conduction Band (CB) of the channel region in the n-TFET and from the VB of the channel region to the CB of the source region in the p-TFET, is highlighted by arrowheads. With an applied gate bias, the conduction band in the channel region moves toward the valence band in the source region (dotted line), which reduces the tunneling barrier width. The window thus created for electron transmission from the source to the channel region results in inter-band tunneling and yields a sufficiently high on current. Fig. 3 shows the $I_{D}-V_{GS}$ characteristics of both the n- and p-type DMDG-GDOV devices. In both the OFF and ON states, the band-to-band tunneling probability depends on $V_{GS}$ at a fixed drain-to-source bias. Hence, when a positive gate voltage is applied, it couples to the channel region and narrows the tunnel barrier width at the source-channel interface.
Fig. 2. Energy-band diagrams of (a) n-type and (b) p-type DMDG-GDOV TFET devices biased in the OFF state and the ON state.

Fig. 3. $I_D$-$V_{GS}$ characteristics of both n-type and p-type DMDG-GDOV TFET devices from numerical simulation; the ambipolar current is considered at $V_{GS}$=-0.9V, $V_{DS}$=0.9V.

Thus, a drain current of approximately 0.4 mA is obtained for both types of devices, and a very low off current, on the order of femtoamperes, is obtained at $V_{GS}$=0 V. The OFF current of both device types is made very low by adjusting the work function difference between the two gate electrode metals, which is required to design a low-power SRAM cell. The purpose of overlapping the metal gate on the drain-side region is to suppress the parasitic ambipolar behavior of the DMDG-GDOV TFET when a negative gate voltage is applied, as shown in Fig. 3.

It is important to consider the $I_D$-$V_D$ characteristics of the DMDG-GDOV TFET, since the saturation voltage plays a vital role in the noise-margin characteristics of digital circuits. DMDG-GDOV TFET has $V_{TH}$ and hence shows delayed output saturation characteristics. In addition, the TFET shows uni-directional current conduction due to its asymmetric p-i-n structure. Since the TFET can operate as both n-type and p-type (depending on the biasing), the OFF state is not completely controlled by the gate-to-source voltage when the device is used as an access transistor.

From Fig. 4 it is evident that the output current of the TFET is unidirectional due to the asymmetric source/drain doping of the p-i-n structured device. When an n-type TFET is biased with negative $V_{DS}$, a junction is formed between the P+ channel and the N+ doped region, and this junction behaves like a forward-biased p-n junction diode. In such conditions, $V_{GS}$ loses control over the channel region, resulting in drain current flowing in the opposite direction. However, the drain current level is lower than in the normal condition of the nTFET with +ve \( V_{GS} \). In the output characteristics, as the drain voltage is increased, both the surface potential and the inversion charge carrier density increase at a fixed applied \( V_{GS} \), which results in a low channel resistance.

**Fig. 4.** Output characteristics of DMDG-GDOV nTFET for reverse bias condition at various \( V_{GS} \) level

**Fig. 5.** Output characteristics of both n- and p-type DMDG-GDOV TFET device by numerical simulation for forwards biased condition at different \( V_{GS} \)=0.9V, 0.7V, 0.5V, 0.3V.

However, unlike in MOSFETs, in the TFET the inversion charge is created from the drain side to the source side [19]. Because of this, at low \( V_{DS} \) the channel potential is pinned by the strong inversion charge. Thus, with a small increase in \( V_{DS} \) the tunnelling width reduces effectively, causing a linear increase in current. With a further increase in \( V_{DS} \), the drain current no longer increases linearly; the source region enters weak inversion, pinching off the inversion charge, which results in current saturation, as shown in Fig. 5.

3. **6T SRAM CELL DESIGN WITH ACCESS TECHNIQUES AND ITS CHARACTERIZATION**

A. **SRAM Cell Operation**

In the standard 6T SRAM cell design, the access transistors require bi-directional current conduction. One of the main intrinsic limitations of the TFET is that it conducts current in only one direction, which may affect the read and write operations in terms of noise margins. To overcome this limitation, various techniques [21-25] have been proposed. In this work, DMDG-GDOV TFET based n-type and p-type access transistor techniques are used to design an SRAM cell at a 32 nm channel length to improve the stability, power dissipation, and delay at low \( V_{DD} \). To compare the stability characteristics of the two access transistor techniques in a standard 6T SRAM cell based on the proposed TFET device structure, the transfer characteristics of the TFET device are obtained through device simulation in the Atlas tool across a range of voltages and captured in a Verilog-A model to perform the SRAM cell operation. The $I_{DS}$ ($V_{GS}$, $V_{DS}$), $C_{GD}$ ($V_{GS}$, $V_{DS}$) and $C_{GS}$ ($V_{GS}$, $V_{DS}$) characteristics are extracted into two-dimensional look-up tables for designing TFET based circuits. The HSPICE simulator is then used for SRAM cell design and simulation. Fig. 6(a) and Fig. 6(b) show the schematics of the 6T SRAM cell circuits based on the DMDG-GDOV TFET, which operate as bi-stable circuits with two stable states during read and hold periods. The cell content is stored at nodes Q and QB, which are connected by two cross-coupled TFET inverters. Two access transistors, IBTN1/IBTP3 and IBTN2/IBTP4, connected between BL/Q and BLB/QB respectively, are used for read/write operation. To activate the access transistors for read/write, the wordline (WL) is connected to the gates of both access transistors, and the supply voltage is applied to the sources of both pTFETs in the inverters. The transistors in the cross-coupled inverters always conduct in one direction. Therefore, a TFET inverter consumes much less static power than a CMOS inverter because of its very low off-state current.
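A two-dimensional look-up-table device model of the kind described above evaluates a quantity such as $I_{DS}$ between tabulated bias points by interpolation. The following Python sketch shows bilinear interpolation on a ($V_{GS}$, $V_{DS}$) grid. The function names and the table values are hypothetical placeholders, not data extracted from the Atlas simulations, and the actual Verilog-A model may interpolate differently (for example on a logarithmic current scale).

```python
from bisect import bisect_right


def _bracket(axis: list[float], value: float) -> tuple[int, float]:
    """Return the lower grid index and fractional position of value on axis."""
    value = min(max(value, axis[0]), axis[-1])            # clamp to table range
    i = min(bisect_right(axis, value) - 1, len(axis) - 2)
    frac = (value - axis[i]) / (axis[i + 1] - axis[i])
    return i, frac


def lookup(vgs_axis: list[float], vds_axis: list[float],
           table: list[list[float]], vgs: float, vds: float) -> float:
    """Bilinearly interpolate table[i][j] sampled at (vgs_axis[i], vds_axis[j])."""
    i, fx = _bracket(vgs_axis, vgs)
    j, fy = _bracket(vds_axis, vds)
    low = table[i][j] * (1 - fy) + table[i][j + 1] * fy
    high = table[i + 1][j] * (1 - fy) + table[i + 1][j + 1] * fy
    return low * (1 - fx) + high * fx


if __name__ == "__main__":
    axis = [0.0, 0.3, 0.6, 0.9]                            # volts
    # Arbitrary placeholder numbers, NOT device data.
    ids = [[10.0 * i + j for j in range(4)] for i in range(4)]
    print(lookup(axis, axis, ids, vgs=0.45, vds=0.75))
```

The same routine can serve the $C_{GD}$ and $C_{GS}$ tables, since each is a function of the same two bias voltages.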

![Fig. 6. Schematic of 6T SRAM cell design using (a) inward-access nTFET and (b) inward-access pTFET.](image)

The operating principle of the TFET based 6T SRAM cell depends mainly on the access transistor technique because of the asymmetric source/drain doping. Fig. 7(a) to 7(d) show the read/write operation of the 6T SRAM cell with inward-facing n-type and p-type TFET access transistors. During the read operation, uni-directionality has no implication for either access transistor type, because at any time only one current path exists (either IBTN1/IBTP3 to IBTN3/IBTN1 or IBTN2/IBTP4 to IBTN4/IBTN2), and it is in the inward direction. Hence, there is no ambiguity for either type of access transistor. The read stability is determined by the sizing ratio of the pull-down and access transistors, commonly referred to as the cell ratio ($\beta$), which balances the performance, stability, and area of the SRAM cell. During the write operation, flipping the bitcell content from the previously stored value (i.e. node Q at $V_{DD}$ and node QB at GND) requires the following actions. At the start of the write operation, the bit-lines are set to BL=0 and BLB=1, followed by activating the word-line (WL) high for the IA-nTFET or low for the IA-pTFET. With the inward-access nTFET, the durations of the WL on-state and of the “0” state of the written BL are restricted; as a result, the write margin is pessimistic compared with the inward-access pTFET, as shown in Fig. 15. Hence, inward nTFETs cannot be used as access transistors, and inward pTFETs are the only appropriate choice, as they provide both low static power and successful read/write operation.
Fig. 7. SRAM Cell structure operation during read/write for both IA-n/p TFET.

Fig. 8. Butterfly curve for inward access n-type DMDG-GDOV TFET based 6T SRAM cell for Read static Noise Margin (RSNM) at different $V_{DD}$.
Fig. 9. Butterfly curve for inward access p-type DMDG-GDOV TFET based 6T SRAM cell for Read static Noise Margin (RSNM) at different $V_{DD}$.

Fig. 10. Butterfly curve for inward access p-type DMDG-GDOV TFET based 6T SRAM cell for Write static Noise Margin (WSNM) at $V_{DD}$=0.9V.

**B. Static Stability Analysis**

For read and write operation, the static noise margin (SNM) is a convenient way to represent the worst case. A read failure happens when the low output level of one inverter rises above the trip point of the opposite inverter. This case is exacerbated when the supply voltage is scaled, owing to the decrease in the trip point of the opposite inverter. Fig. 8 and Fig. 9 show the read static noise margin (RSNM) of the IA-nTFET and IA-pTFET SRAM cells at different $V_{DD}$. These figures show RSNM = 150 mV at $V_{DD}$=0.9V for both types of IA-TFET. At $V_{DD}$=0.6V the RSNM is 106 mV and 113 mV, and at $V_{DD}$=0.5V it is 45 mV and 50 mV, for the IA-nTFET and IA-pTFET respectively. Since scaling $V_{DD}$ directly affects the SNM of the SRAM cell, the RSNM of both types of IA-TFET based SRAM cell decreases as the supply voltage is reduced, as shown in Fig. 8 and Fig. 9. The RSNM is degraded more at $V_{DD}$=0.5V than at $V_{DD}$=0.6V, because the ON-state current of the DMDG-GDOV TFET device drops when a supply voltage below 0.6V is applied, as shown in Fig. 4. From Fig. 9, it is clear that the IA-pTFET SRAM cell improves on the RSNM of the IA-nTFET cell at the lower $V_{DD}$ values (113 mV versus 106 mV at 0.6V, and 50 mV versus 45 mV at 0.5V). Fig. 10 shows the WSNM of only the inward-access pTFET at $V_{DD}$=0.9 V, because the inward-access nTFET is not able to produce sufficient WSNM, and hence its WSNM cannot be determined.
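A static noise margin is commonly read off a butterfly plot as the side of the largest square that fits inside the smaller lobe. One standard way to compute it is to rotate the two transfer curves by 45 degrees and take the largest gap between them along the rotated axis. The sketch below applies that idea to a hypothetical tanh-shaped inverter curve; the curve shape, the `gain` parameter, and the function names are assumptions for illustration and do not model the DMDG-GDOV TFET cell or its access transistors.

```python
import math


def inverter_vtc(vin: float, vdd: float, gain: float) -> float:
    """Hypothetical smooth inverter transfer curve centred at vdd/2."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - 0.5 * vdd) / vdd))


def interp_clamped(xs: list[float], ys: list[float], x: float) -> float:
    """Linear interpolation on strictly ascending xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for k in range(len(xs) - 1):
        if xs[k] <= x <= xs[k + 1]:
            t = (x - xs[k]) / (xs[k + 1] - xs[k])
            return ys[k] + t * (ys[k + 1] - ys[k])
    return ys[-1]


def static_noise_margin(vdd: float, gain: float, points: int = 401) -> float:
    """Side of the largest square inside the smaller butterfly lobe (volts)."""
    root2 = math.sqrt(2.0)
    grid = [vdd * k / (points - 1) for k in range(points)]
    out = [inverter_vtc(x, vdd, gain) for x in grid]
    # Rotated coordinates of curve 1, the points (x, f(x)).
    u1 = [(x - y) / root2 for x, y in zip(grid, out)]
    v1 = [(x + y) / root2 for x, y in zip(grid, out)]
    # Curve 2 is the mirror image (f(y), y): u changes sign, v is unchanged.
    u2 = [-u for u in reversed(u1)]
    v2 = list(reversed(v1))
    # Gap between the curves along the diagonal, evaluated at equal u.
    gaps = [v - interp_clamped(u2, v2, u) for u, v in zip(u1, v1)]
    # Largest diagonal in each lobe; a square's side is its diagonal / sqrt(2).
    return min(max(gaps), max(-g for g in gaps)) / root2


if __name__ == "__main__":
    for vdd in (0.9, 0.6, 0.5):
        print(f"VDD={vdd} V  SNM={static_noise_margin(vdd, gain=8.0) * 1e3:.0f} mV")
```

For a read SNM, the two curves would be taken with the wordline active and the bitlines precharged, so the inverter curve above would be replaced by the simulated or measured cell characteristics.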

**C. Dynamic Stability Characterization**

In nanoscale technology, reliability has become an important issue for SRAM cell design [25, 26]. Both local mismatch and scaled $V_{DD}$ degrade read stability and write ability. Traditional SRAM cell design criteria use static stability analysis in terms of the static noise margin (SNM) for read failure and the write noise margin (WNM) for write failure [27]. These metrics, which assume an infinite access time, are considered optimistic for write ability and pessimistic for read stability. The static approach is not able to characterize the impact of dynamic dependencies on cell stability within a finite duration [28, 29]. The read SNM generally overestimates read failure. Therefore, a study of the dynamic stability of an SRAM bitcell is required to determine whether read and write operations succeed in the time domain. Dynamic stability metrics, derived from the SRAM under dynamic access, have been proposed to provide a better estimate of SRAM cell stability at low $V_{DD}$.
D. Dynamic Read/Write operation in 6T SRAM Cell

Cell content stability during read, and write ability, can be quantified over a finite pulse duration, which depends on the WL pulse and the high/low state of the BL, by the dynamic read noise margin (DRNM) [26] and the critical wordline pulse width ($WL_{crit}$) [30]. DRNM is defined as the minimum voltage difference between the internal nodes of the cross-coupled inverters over a finite duration.

During the read operation, the bitline voltage is quickly discharged through the pull-down transistor; hence the voltage on the internal node ‘Q’ never becomes high enough to change the content of the cell. The margin between the two nodes Q and QB can be defined as the minimum voltage difference over time during the read operation. For the write operation, the cell stability metric $WL_{crit}$ is the minimum duration of the wordline pulse needed to flip the state of Q and QB.
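Both dynamic metrics defined above can be extracted from transient simulation data. The sketch below computes DRNM from sampled Q and QB waveforms and searches for the critical wordline pulse width by bisection. The function names, the waveform samples, and the `flips` predicate are hypothetical; in practice `flips(width)` would wrap a transient simulation that reports whether the cell content flipped for a given pulse width.

```python
from typing import Callable, Sequence


def dynamic_read_noise_margin(q: Sequence[float], qb: Sequence[float]) -> float:
    """Minimum |V(Q) - V(QB)| over the sampled read window, in volts."""
    return min(abs(a - b) for a, b in zip(q, qb))


def critical_wordline_width(flips: Callable[[float], bool],
                            low: float, high: float,
                            tol: float = 1e-12) -> float:
    """Bisect for the shortest pulse width (seconds) at which flips() is True.

    Assumes flips(low) is False, flips(high) is True, and that the outcome
    changes only once between them.
    """
    if flips(low) or not flips(high):
        raise ValueError("need flips(low) == False and flips(high) == True")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if flips(mid):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Made-up samples for illustration only.
    q_samples = [0.90, 0.85, 0.88, 0.90]
    qb_samples = [0.00, 0.20, 0.05, 0.00]
    print("DRNM (V):", dynamic_read_noise_margin(q_samples, qb_samples))

    # Stand-in for a simulator call: pretend the cell flips at 80 ps or more.
    print("WLcrit (s):", critical_wordline_width(lambda w: w >= 80e-12,
                                                 low=10e-12, high=200e-12))
```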

Figure 11 and Figure 12 show the read stability of the IA-nTFET 6T SRAM cell for wordline pulse widths of $T_S=50\text{ps}$ and $T_L=150\text{ps}$ respectively. Figure 11 illustrates the critical role of the pulse width during read access: with the shorter WL pulse, compared with Figure 12, the cell content is successfully retained at the internal nodes Q and QB, because the bit-line capacitance is dominated by the diffusion capacitance of all the access transistors sharing the same bit-line. The wordline pulse width $T_S$ is sufficient to discharge the bitline (BL) and produce the differential voltage needed to overcome the offset and trigger the sense amplifier. In Figure 12, the wordline pulse width $T_L$ is longer than $T_S$. The longer pulse keeps the SRAM bitcell under heavy load, which results in the bitcell flipping to the opposite state before the wordline pulse is deasserted, indicating a severe read failure. This implies that there is a critical wordline pulse width, $T_{access}$ ($T_S < T_{access} < T_L$), for which the sense amplifier is on the threshold of a completed read access; it is described as the read access time. This is akin to the dynamic read access failure discussed in [28, 31, 32]. Thus, the wordline pulse width plays an important role in executing a correct read operation.

**Fig. 11.** Read transient waveforms of the inward access n-type DMDG GDOV TFET for 6T SRAM Cell showing successful dynamic read operation at 50ps at $V_{DD}=0.9\text{V}$.
Fig. 12. Read transient waveforms of the inward access n-type DMDG GDOV TFET for 6T SRAM Cell showing read failure for increased wordline pulse width 150ps at $V_{DD}=0.9V$.

Fig. 13. Read transient waveforms of the inward access p- and n-type DMDG GDOV TFET for 6T SRAM Cell showing dynamic read operation for 100 ps wordline pulse width at $V_{DD}=0.9V$.

Fig. 13 shows the case of a 100 ps wordline pulse. A successful read operation occurs for both inward-access transistor types, where the DRNM of the IA-pTFET is 672 mV and the DRNM of the IA-nTFET is 600 mV at $V_{DD}=0.9V$. In Fig. 14, when the WL pulse width increases, read failure occurs in the IA-nTFET based SRAM cell, whereas the IA-pTFET cell retains its data better. Hence, the cell remains stable during the read operation when a short WL pulse is provided; when the WL pulse width is increased, the cell content at the inverter nodes can flip. Fig. 15 shows the transient write operation of the DMDG-GDOV TFET cell with inward-pTFET and inward-nTFET access transistors.

Fig. 15 shows that the IA-pTFET is able to flip the cell content of the SRAM within the given critical WL pulse width, whereas the IA-nTFET fails to flip the data at the two nodes Q and QB of the cross-coupled inverters within the 100 ps wordline pulse. Fig. 16 and Fig. 17 analyze the cell stability of the 6T DMDG-GDOV TFET SRAM based on inward-access transistors during read and write operation, respectively. The cell stability depends on the ratio of the transistor widths in the inverter and the access transistor. Fig. 17 shows that $WL_{crit}$ for $\beta>1$ is larger for the IA-nTFET than for the IA-pTFET, which indicates a write failure. Therefore, inward nTFETs cannot be used as access transistors. Fig. 16 shows the DRNM of the 6T TFET SRAM cell for both types of IA-TFET and for the reported work [33], which used an IA-pTFET in the SRAM cell. The plot of DRNM versus cell ratio ($\beta$) shows that the dynamic read noise margin of the DMDG-GDOV TFET based IA-pTFET SRAM cell significantly outperforms those of the IA-nTFET and single-gate IA-pTFET SRAM cells.
Fig. 16. DRNM comparison for the 6T SRAM cell based on DMDG-GDOV TFET with inward-pTFET/inward-nTFET and reported data [33].

Fig. 17. $WL_{crit}$ comparison for the 6T SRAM cell based on DMDG-GDOV TFET with inward-pTFET/inward-nTFET.

For cell ratio $\beta \geq 1$, the DRNM of the IA-pTFET increases, whereas the DRNM of the IA-nTFET remains approximately constant at 500 mV. This is because a pTFET used as the access transistor acts as a pull-up device on the storage node, which reduces the strength of the access transistor and consequently increases the drive current through the pull-down transistor of the inverter in the SRAM cell.

4. 6T SRAM ASSIST TECHNIQUES

As scaling continues, circuit assist techniques are increasingly essential to protect the functional window of operation of the 6T cell. Power and cost are clearly vital factors in determining the optimal assist method that meets the functional margin and delay requirements. Scaling of the power supply ($V_{DD}$) and variation in threshold voltage reduce the performance of the SRAM cell in terms of stability metrics. Hence, circuit assist techniques must trade off read stability against write ability to retain the functional operation of the 6T cell. Moreover, in a standard SRAM cell designed with tunnel FETs, uni-directionality is a major concern for the access transistors during the write operation; therefore, the inward-pTFET access transistor is used, especially at low $V_{DD}$.
Assist voltages are expressed as a percentage of the supply voltage ($V_{DD}$), because they are usually set by voltage dividers or charge redistribution and are therefore proportional to $V_{DD}$ [16, 34, 35]. The assist techniques used for read ability and write ability are called Read Assist (RA) and Write Assist (WA), respectively. There are four types of RA technique, namely $V_{DD}$ raising, GND lowering, wordline raising, and bitline lowering, and four types of WA technique, namely $V_{DD}$ lowering, GND raising, wordline lowering, and bitline raising. Of these, we use only one WA and one RA technique to fulfill the write and read ability criteria. Before applying assist techniques, the access, pull-up (PU), and pull-down (PD) transistors are first sized for reliable read and write operation. Since a pTFET is used as the access transistor, wordline lowering is used for write assist, unlike in CMOS based SRAM cell design, where the nMOS access transistor uses wordline raising. For a successful write operation the cell ratio ($\beta$) is chosen below 1, i.e. 0.4, and for the read operation it is set to 1, at a 0.6V power supply, as shown in Fig. 18 and Fig. 19. Wordline lowering improves the write operation by increasing the drive strength of the access transistor, as the source of the IA-pTFET is connected to BL/BLB. The negative wordline voltage increases the magnitude of the gate voltage, which is then sufficient to pull the node voltage down through BL/BLB below the critical threshold voltage. Wordline lowering can be implemented by a charge pump or by capacitive coupling. Similarly, for read assist, the wordline raising technique is used for read stability at low $V_{DD}$.
In this technique, as the wordline voltage is increased, the $V_{GS}$ of the access transistor (AT) decreases, so the drive ability of the AT is reduced and the relative drive strength of the PD transistor increases by the voltage divider rule, thereby improving read ability.

Fig. 14. Read transient waveforms of the inward access p- and n-type DMDG GDOV TFET for 6T SRAM Cell showing successful dynamic read operation and read failure for increased wordline pulse width 200ps at $V_{DD}=0.9V$.

Fig. 20 shows the read static noise margin of the IA-pTFET cell with the RA wordline raising technique at different wordline voltage levels. The figure shows that as the wordline voltage of the IA-pTFET increases, the strength of the access transistor decreases, and therefore the data retention ability at nodes Q and QB increases during the read operation.
Fig. 15. Write transient waveforms of the 6T SRAM cell based on DMDG-GDOV TFET with inward-pTFET/inward-nTFET during wordline pulse width of 100ps showing dynamic write operation at VDD=0.9V.

Fig. 18. Read transient waveforms of the inward access p-type DMDG GDOV TFET for 6T SRAM Cell showing successful dynamic read operation for 100ps wordline pulse width at VDD=0.6V.
5. CELL PERFORMANCE METRICS

The performance of the TFET based SRAM cell is analyzed in terms of read delay and write delay, which are compared between the IA-nTFET 6T SRAM cell, the IA-pTFET 6T SRAM cell, and reported work at different V_{DD}. For the 6T SRAM cell, the read delay is defined as the time from the wordline (WL) voltage reaching half of its activation level to the bitlines developing a voltage difference of 10% of the pre-charged voltage [6]. In this work, the read delay is calculated for the IA-nTFET and IA-pTFET based SRAM cells at V_{DD}=0.9 V and 0.6 V. The read delays are further compared with Jawar Singh et al. [13], as shown in Table I. It is observed that the SRAM cell using the IA-pTFET gives a smaller read access delay than the IA-nTFET and the mixed (inward-outward) access TFET SRAM cell for both V_{DD} levels. This can be attributed to the sizing of the access transistor in favor of the write operation (β is set to 0.6). For read delay, the wordline raising read assist technique helps to achieve the minimum delay at low V_{DD}.
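The read-delay definition above can be applied directly to sampled transient waveforms. The following Python sketch measures the time from the wordline crossing the midpoint between its idle and active levels to the bitline differential reaching 10% of the precharge voltage. The function and parameter names and the sample waveforms are hypothetical, and linear interpolation between samples is an assumption of this sketch.

```python
from typing import Sequence


def first_crossing(t: Sequence[float], y: Sequence[float],
                   level: float, rising: bool) -> float:
    """Linearly interpolated time at which y first crosses level."""
    for k in range(1, len(t)):
        y0, y1 = y[k - 1], y[k]
        crossed = (y0 < level <= y1) if rising else (y0 > level >= y1)
        if crossed:
            return t[k - 1] + (level - y0) * (t[k] - t[k - 1]) / (y1 - y0)
    raise ValueError("waveform never crosses the requested level")


def read_delay(t: Sequence[float], wl: Sequence[float],
               bl: Sequence[float], blb: Sequence[float],
               wl_idle: float, wl_active: float, v_precharge: float) -> float:
    """Time from WL at 50% of its swing to |BL - BLB| = 10% of precharge."""
    wl_mid = 0.5 * (wl_idle + wl_active)
    # Active-high WL (IA-nTFET) rises; active-low WL (IA-pTFET) falls.
    t_start = first_crossing(t, wl, wl_mid, rising=wl_active > wl_idle)
    diff = [abs(a - b) for a, b in zip(bl, blb)]
    t_stop = first_crossing(t, diff, 0.1 * v_precharge, rising=True)
    return t_stop - t_start


if __name__ == "__main__":
    # Made-up samples (time in ps, voltages in V) for illustration only.
    time_ps = [0.0, 10.0, 20.0, 30.0, 40.0]
    wl = [0.0, 0.9, 0.9, 0.9, 0.9]
    bl = [0.9, 0.9, 0.87, 0.78, 0.69]
    blb = [0.9, 0.9, 0.9, 0.9, 0.9]
    print(read_delay(time_ps, wl, bl, blb,
                     wl_idle=0.0, wl_active=0.9, v_precharge=0.9), "ps")
```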

<table>
<thead>
<tr>
<th>Performance Metric</th>
<th>IA-nTFET (V_{DD}=0.9V)</th>
<th>IA-pTFET (V_{DD}=0.9V)</th>
<th>IA-nTFET (V_{DD}=0.6V)</th>
<th>IA-pTFET (V_{DD}=0.6V)</th>
<th>Reported [13] (V_{DD}=0.9V)</th>
<th>Reported [13] (V_{DD}=0.6V)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Read delay (ps)</td>
<td>55</td>
<td>15</td>
<td>100</td>
<td>50</td>
<td>300</td>
<td>500</td>
</tr>
<tr>
<td>Write delay (ps)</td>
<td>290</td>
<td>88</td>
<td>300</td>
<td>100</td>
<td>90</td>
<td>100</td>
</tr>
<tr>
<td>Static power consumption (pW)</td>
<td>0.62</td>
<td>0.62</td>
<td>0.4</td>
<td>0.4</td>
<td>0.9</td>
<td>0.5</td>
</tr>
</tbody>
</table>

Table I shows that the write delay of the IA-pTFET cell is smaller than that of the IA-nTFET cell for both V_{DD} levels, because unidirectional conduction results in a weaker drive current from the IA-nTFET access transistor than from the IA-pTFET, and the large WL_{crit} indicates write failure. However, the IA-pTFET and the mixed access transistor SRAM cells achieve approximately the same write delay. The key benefit of the TFET is its very low off-state leakage current, which yields low static power consumption in all TFET based SRAM cell designs. The IA-nTFET and IA-pTFET SRAM cells consume less static power than the reported work at both V_{DD} levels, as shown in Table I.

6. CONCLUSION

We propose a silicon-based DMDG-GDOV TFET 6T SRAM cell design using inward-access nTFET and inward-access pTFET techniques, in which the SRAM cell using IA-pTFET transistors shows good read as well as write operation and performs better in the stability metrics. The design takes advantage of the TFET's low off-state current and steep slope below 60 mV/decade. At V_{DD}=0.9 V, without any write or read assist technique, the IA-pTFET SRAM cell shows significant read/write stability, and at V_{DD}=0.6V the WA/RA techniques of wordline lowering and wordline raising give successful read/write operation under dynamic and static conditions. The cell performance metrics of both types of IA transistor technique are investigated and compared with reported work, showing that the IA-nTFET and IA-pTFET cells have lower static power consumption than the reported work at both V_{DD} levels.

REFERENCES

ign in deep deep

ion of dynamic SRAM stability in 45 nm CMOS. IEEE J. Solid

–

m on Nanoscale

–

nd Benton H. Calhoun. "Analyzing static and dynamic write margin for nanometer

of the

Design and Analysis of DECODER circuit with Source biasing Technique for Memory array application

Neha Gupta and Vaibhav Neema
Department of Electronics and Telecommunication Engineering
Institute of Engineering and Technology, Devi Ahilya University, Indore – 452017, INDIA
nehagupta121192@gmail.com, vaibhav.neema@gmail.com

Abstract: Researchers have proposed many circuit techniques to reduce leakage power dissipation in memory cells. This work proposes a low-leakage, high-speed row and column decoder circuit for memory array applications. Leakage power loss is a major concern at lower technology nodes because it drains the battery even when a circuit is idle. To reduce the overall power dissipated by the complete memory system, we propose a new circuit technique, the source biasing DECODER, for low leakage current and low delay. The technique is designed and analyzed for memory array applications. Pre-layout and post-layout simulations, with process corner simulation, are compared for the conventional decoder and the source bias decoder. The simulation results show that the proposed source bias DECODER circuit consumes less leakage current. In FF corner simulation, static/leakage power dissipation is up to 89.03% lower than in standard post-layout simulation of the source bias decoder at 25°C. Dynamic power dissipation and delay are reduced by up to 4.48% and 7.32%, respectively, in standard post-layout simulation of the source bias decoder circuit compared with standard post-layout simulation of the conventional basic decoder circuit. The simulations are done at a 1.2 V supply voltage using the Cadence EDA tool with the 180 nm GPDK technology file.

Keywords: static & dynamic power dissipation, leakage current, Sleep transistor, Source biasing.

1. INTRODUCTION

As demand for battery-operated systems such as mobile phones, laptops, and handheld devices increases continuously, power dissipation has become one of the important parameters for VLSI design engineers. A basic decoder design for memory array applications consumes more power during the word selection process, and very little research has addressed this issue. In this work we address it and propose a source bias decoder circuit design technique [1].

This work mainly aims to propose a circuit technique suited to decoder circuit design for memory array applications. For decoder circuit design, VLSI designers are most concerned with the leakage power, dynamic power, and delay of the circuit. Another major parameter to consider in circuit design is the temperature stability of the circuit. The source bias technique is preferred for low-leakage-power circuit design.

In this work, the source bias technique is used to design and implement a decoder circuit for memory array applications. Pre-layout and post-layout simulations at various temperatures are used to check the practical feasibility of the circuit. After layout design, the DRC run, LVS run, and RCX extraction steps are also performed in the configured test bench.

In the proposed source biasing decoder circuit, one additional biasing circuit is used with the conventional decoder to reduce leakage current. The biasing circuit is connected to all NAND circuits using a clustering technique, as in ground-gated circuits. In the clustering technique [2], the area penalty is overcome because one source biasing circuit serves a number of integrated NAND circuit cells. Properly connecting the source biasing circuit in a cluster limits the delay and area of the decoder circuit while keeping leakage current low. The main use of the cluster is to reduce the overall size of the proposed circuit.

The paper is organized as follows: Section 2 describes related work, covering one existing technique and the proposed one. Section 3 defines and explains the simulation parameters. Experimental results, comparison, and discussion of pre-layout and post-layout simulation of the proposed and existing techniques are presented in Section 4. The paper is concluded in Section 5.
2. Related Work

The decoder is one of the essential components in memory array design. In this research work we apply the source biasing technique to decoder circuit implementation. Pre-layout and post-layout simulation results of the conventional circuit and of the proposed circuit using the source biasing technique are used for comparison.

A. Conventional Decoder circuit

This is a standard 3×8 decoder circuit built from NAND gates, with an inverter test bench providing the complemented inputs [3]. The inputs are A, B, and C, and the outputs are D0 to D7, as shown in Fig. 1. The conventional decoder operates normally: for example, when the input is ABC = (001), output line D1 is zero and all other output lines are one, and so on. For dynamic operation (active mode), transient inputs are applied to A, B, and C. The decoder outputs are then transient and remain high at all times except when the input combination satisfies the basic decoder condition for that output.
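The logical behavior described above can be sketched in a few lines of Python. This is only a behavioral illustration, not the transistor-level circuit simulated in this work; the function names `nand3` and `decode_3to8` are hypothetical, and treating A as the most significant bit is an assumption chosen to match the ABC = (001) example.

```python
def nand3(x, y, z):
    """3-input NAND: output is 0 only when all three inputs are 1."""
    return 0 if (x and y and z) else 1


def decode_3to8(a, b, c):
    """Return the active-low outputs [D0, ..., D7] of a NAND-based 3x8 decoder.

    Assumption: A is the most significant bit and C the least significant,
    so ABC = 001 selects D1.
    """
    # Inverters provide the complemented inputs.
    na, nb, nc = 1 - a, 1 - b, 1 - c
    outputs = []
    for i in range(8):
        # Each NAND gate sees the true or complemented form of each input,
        # depending on the bit pattern of the output index i.
        in_a = a if (i >> 2) & 1 else na
        in_b = b if (i >> 1) & 1 else nb
        in_c = c if i & 1 else nc
        outputs.append(nand3(in_a, in_b, in_c))
    return outputs


# ABC = 001: only D1 is low, every other output is high.
print(decode_3to8(0, 0, 1))  # prints [1, 0, 1, 1, 1, 1, 1, 1]
```

For every input combination exactly one output is low, which is the condition the transient simulation checks in active mode.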

Fig. 2 shows the layout diagram of the basic decoder circuit; the arrows indicate the layouts of the inverter and NAND circuits. The RCX extracted view of the conventional decoder circuit is shown in Fig. 3.
B. Source Biasing Decoder circuit

The decoder circuit using source bias is the same as the cluster circuit technique [4], in which a ground-gated sleep transistor is used. Here the sleep transistor is replaced by a source biasing circuit [5], as shown in Fig. 4, in which the source terminal is connected to the output of the inverter, the input terminal is common and connected to the supply voltage, and the body terminal is grounded. This technique is well suited to reducing leakage (static) power.

In Fig. 4, leakage current is calculated when the sleep NMOS transistor is in idle mode. In this mode, 0 V is applied at the gate terminal and 1 V appears at the source terminal from the inverter output, so the sleep transistor is in the off state. The effective voltage is thus reduced to \( V_{dd} - V_{tp} \), where \( V_{tp} \) is the threshold voltage of the PMOS. In this way leakage current is lowest in the source bias decoder circuit.

Source biasing circuit techniques use the source terminal to modify the threshold voltage of a transistor. Depending on the voltage difference between the source and body terminals \( (V_{SB}) \), the threshold voltage can decrease or increase. When a negative voltage is applied across \( V_{SB} \), the transistor is said to be reverse source biased; otherwise it is forward source biased [6].

Fig. 5 shows the layout diagram of the source bias decoder circuit, which uses one additional circuit, the source bias circuit. The RCX extracted view of the source biasing decoder circuit is shown in Fig. 6. The RCX extracted view represents the parasitic resistances and capacitances generated by interconnects and poly lines.

Figure 3: RCX Extracted view of Basic decoder circuit

Figure 4: Source biasing decoder circuit
3. Simulation Parameters

The following parameters are calculated and considered for the comparative study of the proposed and conventional designs:

A. Delay

Delay is calculated between the trigger (input) and target (output) values. It is the time difference between the 50% point of the input transition and the 50% point of the output transition. $t_{\text{PLH}}$ defines the response time of the gate for a low-to-high output transition, while $t_{\text{PHL}}$ refers to a high-to-low transition. The propagation delay [7] is defined as the average of the two:

$$t_p = \frac{t_{\text{PLH}} + t_{\text{PHL}}}{2}.$$
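As a rough illustration of this definition, the sketch below finds the 50% crossing of sampled waveforms by linear interpolation and averages the two transition delays. The helper names and all waveform values are hypothetical and are not taken from the simulations in this work; only the 1.2 V supply matches the setup described here.

```python
def crossing_time(times, values, level):
    """Return the first time the sampled waveform crosses `level`,
    using linear interpolation between neighbouring samples."""
    for i in range(len(times) - 1):
        v0, v1 = values[i], values[i + 1]
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            t0, t1 = times[i], times[i + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(t_plh, t_phl):
    """Average of the low-to-high and high-to-low delays."""
    return (t_plh + t_phl) / 2.0


V_DD = 1.2          # supply voltage in volts
HALF = V_DD / 2.0   # 50% level

# Hypothetical waveforms (time in ns): the input rises, the output falls.
t_in, v_in = [0.0, 0.2, 1.5], [0.0, V_DD, V_DD]
t_out, v_out = [0.0, 0.5, 0.9, 1.5], [V_DD, V_DD, 0.0, 0.0]

# High-to-low output delay measured between the two 50% points.
t_phl = crossing_time(t_out, v_out, HALF) - crossing_time(t_in, v_in, HALF)

# Assume a separately measured low-to-high delay (hypothetical value, ns).
t_plh = 0.8
t_p = propagation_delay(t_plh, t_phl)
```

With these illustrative numbers the output crosses 50% at 0.7 ns and the input at 0.1 ns, so the high-to-low delay is about 0.6 ns and the averaged propagation delay is about 0.7 ns.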

B. Dynamic Power:

Dynamic power arises from logic transitions that cause logic gates to charge and discharge the load capacitance. It is the power dissipated when the sleep transistor is in active mode. Dynamic power is proportional to the square of $V_{\text{dd}}$ [8].

$$P_{\text{dynamic}} = V_{\text{dd}}^2 \cdot f \cdot C_L \cdot \alpha$$

where $P_{\text{dynamic}}$ is the dynamic power, $C_L$ is the load capacitance, $\alpha$ is the driving (transition activity) factor, $f$ is the frequency, and $V_{\text{dd}}$ is the supply voltage.
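A short numerical sketch may help make the quadratic dependence on the supply voltage concrete. The frequency, load capacitance, and activity factor below are hypothetical illustration values, not parameters of the simulated decoder; only the 1.2 V supply is taken from this work.

```python
def dynamic_power(v_dd, f, c_load, alpha):
    """P_dynamic = V_dd^2 * f * C_L * alpha, in watts."""
    return v_dd ** 2 * f * c_load * alpha


# Hypothetical operating point: 100 MHz, 10 fF load, activity factor 0.5.
p_full = dynamic_power(v_dd=1.2, f=100e6, c_load=10e-15, alpha=0.5)
p_half = dynamic_power(v_dd=0.6, f=100e6, c_load=10e-15, alpha=0.5)

# Halving V_dd cuts the dynamic power to one quarter.
ratio = p_full / p_half
```

Here `p_full` is about 0.72 µW and `ratio` is 4, which reflects the square-law dependence stated above.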

C. Leakage current:
Leakage current is the current that flows in a device that is in the “off” state, that is, when the sleep transistor is in the idle state, where ideally no current would flow [9].

D. Static power dissipation:
This is the power dissipation that occurs when the sleep transistor is in standby or idle mode. In this mode, only a very small current flows through the transistor, which reduces the leakage power dissipation [10] of the circuit.

\[ P_{\text{leakage}} = I_{\text{leakage}} \cdot V_{\text{dd}} \]

Where \( P_{\text{leakage}} \) = Leakage power dissipation, \( I_{\text{leakage}} \) = Leakage current, \( V_{\text{dd}} \) = Supply voltage.

E. Static energy and dynamic energy:
The static energy component is proportional to \( V_{\text{dd}} \), whereas the dynamic energy component is proportional to the square of \( V_{\text{dd}} \). The static and dynamic energy components are given by [11]:

\[ E_{\text{static}} = I_{\text{leakage}} \cdot V_{\text{dd}} \cdot T_{\text{delay}} = P_{\text{static}} \cdot T_{\text{delay}} \]

\[ E_{\text{dynamic}} = \alpha \cdot C_{L} \cdot V_{\text{dd}}^{2} \]

where \( \alpha \) is the transition activity, \( C_{L} \) is the load capacitance, \( I_{\text{leakage}} \) is the leakage current, and \( T_{\text{delay}} \) is the circuit delay.

F. Dynamic EDP and Dynamic PDP:
To design energy-efficient circuits for low-power applications, the energy-delay product is an important parameter. The EDP and PDP of any circuit should be as small as possible for low-power circuit design. The dynamic energy-delay product (EDP), the dynamic power-delay product (PDP), and the static PDP are given by [11]:

\[ \text{EDP}_{\text{dynamic}} = E_{\text{dynamic}} \cdot T_{\text{delay}} \]

\[ \text{PDP}_{\text{dynamic}} = E_{\text{dynamic}} \cdot T_{\text{delay}} \cdot f_{\text{clock}} = P_{\text{dynamic}} \cdot T_{\text{delay}} \]

\[ \text{PDP}_{\text{static}} = P_{\text{static}} \cdot T_{\text{delay}} \]
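The energy and product metrics defined above can be evaluated with a small helper, sketched below. The function name `energy_metrics` is hypothetical. The power and delay inputs are the standard-corner post-layout values reported for the source bias decoder in Table 1 of this article (0.492 µW, 0.312 fW, 0.62 ns); the 100 MHz clock frequency is an assumption made only for illustration, since the clock frequency is not stated here.

```python
def energy_metrics(p_dynamic, p_static, t_delay, f_clock):
    """Evaluate the energy, EDP, and PDP definitions (SI units)."""
    # P_dynamic = E_dynamic * f_clock, so E_dynamic = P_dynamic / f_clock.
    e_dynamic = p_dynamic / f_clock
    return {
        "E_static": p_static * t_delay,       # J
        "E_dynamic": e_dynamic,               # J
        "EDP_dynamic": e_dynamic * t_delay,   # J*s
        "PDP_dynamic": p_dynamic * t_delay,   # J
        "PDP_static": p_static * t_delay,     # J
    }


metrics = energy_metrics(
    p_dynamic=0.492e-6,  # W, Table 1 (source bias, post-layout, stat)
    p_static=0.312e-15,  # W, Table 1 (source bias, post-layout, stat)
    t_delay=0.62e-9,     # s, Table 1 (source bias, post-layout, stat)
    f_clock=100e6,       # Hz, assumed for illustration only
)

for name, value in metrics.items():
    print(f"{name}: {value:.4e}")
```

Note that the dynamic PDP and the static PDP do not depend on the assumed clock frequency, whereas the dynamic energy and the dynamic EDP do.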

4. Simulation Result and Discussion
All simulation parameters are calculated in the Cadence EDA tool with the 180 nm GPDK technology file. Post-layout simulation is done after extracting the parasitic capacitance and resistance values and including them in the simulation; these parasitics are generated by the interconnects and poly lines in the layout design of the circuit.

Table 1 shows the comparative study of the source bias decoder circuit and the conventional decoder circuit with process corner simulation at room temperature. Fig. 7 shows the comparative graphs of dynamic power and delay of the source bias decoder and the conventional decoder with process corner simulation at room temperature. Fig. 8 shows the comparative graphs of all parameters for pre-layout and post-layout simulation of the source biasing decoder circuit with process corner simulation at various temperature values. Here, process corner denotes corner simulation with the slow-slow (SS), slow-fast (SF), fast-slow (FS), fast-fast (FF), and standard (stat) corners. Temperature variation is also compared under corner simulation.

Table 1: Comparison of the source bias decoder and the conventional (standard) decoder with process corner simulation at room temperature

<table>
<thead>
<tr>
<th rowspan="2">Parameters</th>
<th rowspan="2">Simulation</th>
<th colspan="5">Standard Decoder</th>
<th colspan="5">Source Bias Decoder</th>
</tr>
<tr>
<th>stat</th><th>SS</th><th>SF</th><th>FS</th><th>FF</th>
<th>stat</th><th>SS</th><th>SF</th><th>FS</th><th>FF</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Dynamic Power (μW)</td>
<td>Pre-layout</td>
<td>0.418</td><td>0.176</td><td>0.355</td><td>0.481</td><td>1.18</td>
<td>0.404</td><td>0.17</td><td>0.34</td><td>0.468</td><td>1.134</td>
</tr>
<tr>
<td>Post-layout</td>
<td>0.513</td><td>0.285</td><td>0.452</td><td>0.574</td><td>1.252</td>
<td>0.492</td><td>0.274</td><td>0.436</td><td>0.56</td><td>1.21</td>
</tr>
<tr>
<td rowspan="2">Delay (ns)</td>
<td>Pre-layout</td>
<td>1.31</td><td>1.665</td><td>1.38</td><td>0.764</td><td>0.134</td>
<td>1.3</td><td>1.656</td><td>1.39</td><td>0.783</td><td></td>
</tr>
<tr>
<td>Post-layout</td>
<td>0.669</td><td>2.096</td><td>2.19</td><td>0.846</td><td>0.401</td>
<td>0.62</td><td>2.035</td><td>2.14</td><td>0.901</td><td>0.454</td>
</tr>
<tr>
<td rowspan="2">Static Power*</td>
<td>Pre-layout</td>
<td>0.179</td><td>0.022</td><td>0.361</td><td>2.81</td><td>3.154</td>
<td>0.29</td><td>0.33</td><td>0.45</td><td>0.126</td><td>1.162</td>
</tr>
<tr>
<td>Post-layout</td>
<td>0.179</td><td>0.022</td><td>0.361</td><td>2.815</td><td>3.154</td>
<td>0.312</td><td>0.316</td><td>0.316</td><td>0.124</td><td>0.034</td>
</tr>
</tbody>
</table>

*Static power is nW in standard decoder and fW in Source bias decoder.

Figure 7: (a) Comparison of Dynamic power of source bias decoder with conventional decoder and (b) Comparison of Delay of source bias decoder with conventional decoder with process corner simulation.

[Figure panels: (a) Dynamic Power Dissipation (μW); (b) Delay (ns); (c) Leakage Current (fA)]
5. Conclusion

In this paper, the source biasing decoder technique is found to be the most suitable circuit design technique for memory array applications. The configuration of the source biasing decoder circuit fulfills the requirements of memory design engineers.

Dynamic power dissipation and delay are reduced by up to 4.30% and 6.29%, respectively, in standard pre-layout simulation, and by up to 4.48% and 7.32%, respectively, in standard post-layout simulation of the source bias decoder circuit, compared with standard pre-layout and post-layout simulation of the conventional basic decoder circuit at room temperature (25°C).
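As a quick arithmetic check, the post-layout delay figure can be reproduced from the standard-corner post-layout delays listed in Table 1 (0.669 ns for the conventional decoder and 0.62 ns for the source bias decoder). The helper name `percent_reduction` is hypothetical, and only this one figure is checked here.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Post-layout delay, standard corner, from Table 1 (ns).
delay_reduction = percent_reduction(baseline=0.669, improved=0.62)
print(f"{delay_reduction:.2f}%")  # prints 7.32%
```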

In FF corner simulation at 25°C, static/leakage power dissipation, leakage current, and delay are reduced by up to 89.03%, 88.84%, and 27.4%, respectively, compared with standard post-layout simulation of the source bias decoder circuit, while dynamic power dissipation is 2.46 times higher (1.21 μW versus 0.492 μW in Table 1).

At 50°C, the same comparison (FF corner simulation versus standard post-layout simulation of the source bias decoder circuit) gives 78.30%, 78.25%, 14.54%, and 2.44 times for static/leakage power dissipation, leakage current, delay, and dynamic power dissipation, respectively. As the temperature continues to increase, all parameter values increase and the results degrade.

Acknowledgment:

We are thankful to the M.P. Council of Science & Technology, Bhopal, India, for financial support under the R&D project scheme, No. 1950/CST/R&D/Phy & Engg Sc/2015, dated 27th Aug 2015.


Updates

Upcoming Conferences/Workshops


Call for Papers/Proposals


Awards

- TCVLSI Student Travel Award ($250) - "On the multidisciplinary control and sensing of a smart hybrid morphing wing" – by Jodin Gurvan in ECMSM 2017.

Job Openings

- None

Ph.D. Fellowships Available

- None

TCVLSI Member News

None
Outreach and Community

[Brief] Broader Impacts – Stop, Collaborate, and Listen
Mike Borowczak
Department of Computer Science, University of Wyoming, Laramie, WY, USA
Andrea C Burrows
Department of Secondary Education, University of Wyoming, Laramie, WY, USA

In early May, the National Science Foundation held one of their annual “NSF Days” on the University of Wyoming campus. A cohort of roughly 10 NSF program officers as well as a multitude of NSF PIs spent the day talking about individual directorates, future trajectories, and most importantly, what it takes to put together successful proposals in the NSF of today. One of the key differentiators in today’s grants is the strength of a proposal’s broader impacts. Some key take-aways from the session:

[1] Your broader impacts must be integrated within your proposal – they can’t simply be “tacked on”. Your broader impacts should be woven into the fabric of your intellectual merit – this is, by far, the biggest challenge, since there is no formula to accomplish this. Examples of these broader impact “riders” include:
   a. “we will embed the results of research into a course” – of course you will/should, that’s your job
   b. “we will talk to K-12 students about our research to interest them in our field” – do you have a background in K-12 pedagogy? This is a pretty shallow attempt, NSF has spent millions on effective transference of research into K-12 classrooms, you should probably do a literature survey.
   c. “this research will change the lives of everyone on the planet because …” If you cannot measure or explicitly state your reach, your broader impacts are likely exaggerated (there are exceptions – e.g. CRISPR).

[2] Broader impacts, like your intellectual merit, are built on the foundation of prior work and research. What you do to accomplish your broader impacts should be grounded in literature and current best-known practices. Do a literature survey or simply a keyword search on what you propose to do for your broader impacts - don’t stop until you find something – while the idea might seem novel in your domain area, the likelihood that the underlying idea has never been tried is slim. After performing a literature survey on just your broader impact idea, you’ll likely realize that you have neither the time nor the expertise to do it all.

[3] Depending on your background and prior experience, it is likely that you will need to find collaborators who can help you develop, implement, and sustain your broader impact goals and objectives. Seeking out collaborators from outside of your technical realm can seem daunting, especially if they come back rejecting your ideas as “groundless within the field of X.” Take a moment, stop, and listen. Successful collaboration between two starkly different domain experts takes open, active communication and trust – your potential collaborator might not understand everything about your intellectual merit, but it’s just as likely that you don’t quite understand all of the theory and research behind what it takes to implement successful, transformative, broader impacts.

Authors’ Note: Every NSF review panel is different and every proposal is different. The summary here is a synthesis of the conversations from one 8-hour day talking with ten program officers, a multitude of NSF PIs, and a handful of reviewers on their perspectives of the biggest shortcoming of most proposals today. So, in summary, take a moment to STOP (and think about whether your broader impacts are well integrated), COLLABORATE (with peers that bring value to both your Intellectual Merit & Broader Impacts) and LISTEN (to those outside of your expertise area). While the authors are not responsible for the success or failure of your current/future proposal, we do hope that you take some of this information to impact future change in your proposals.

Want to see your Broader Impacts/Outreach highlighted here?

Email Mike at Mike.Borowczak@uwyo.edu
Call for Contributions

The VLSI Circuits and Systems Letter aims to provide timely updates on technologies, education and opportunities related to VLSI circuits and systems for TCVLSI members. The letter is published twice a year and contains the following sections:

- **Features**: selective short papers within the technical scope of TCVLSI, “What is” section to introduce interesting topics related to TCVLSI, and short review/survey papers on emerging topics in the areas of VLSI circuits and systems.

- **Opinions**: Discussions and book reviews on recent VLSI/nanoelectronic/emerging circuits and systems for nanocomputing, and “Expert Talks” to include the interviews of eminent experts for their concerns and predictions on cutting-edge technologies.

- **Updates**: Upcoming conferences/workshops of interest to TCVLSI members, call for papers of conferences and journals for TCVLSI members, funding opportunities and job openings in academia or industry relevant to TCVLSI members, and TCVLSI member news.

- **Outreach and Community**: The “Outreach K20” section highlights integrating VLSI computing concepts with activities for K-4, 4-8, 9-12 and/or undergraduate students. It also features student fellowship information as well as a “Puzzle” section for our readership.

We are soliciting contributions to all these four sections. Please directly contact the editors and/or associate editors by email to submit your contributions.

**Submission Deadline:**
All contributions must be submitted by March 7, 2017 in order to be included in the April issue of the letter.

**Editors:**
- Saraju Mohanty, University of North Texas, USA, saraju.mohanty@unt.edu
- Xin Li, Duke University, USA, xinli.ece@duke.edu

**Associate Editors:**
- **Executive**: Yiyu Shi, University of Notre Dame, USA, yshi4@nd.edu
- **Features**: Hideharu Amano, Keio University, Japan, hunga@am.ics.keio.ac.jp
- **Features**: Shiyan Hu, Michigan Technological University, USA, shiyan@mtu.edu
- **Features**: Saket Srivastava, University of Lincoln, United Kingdom, ssrivastava@lincoln.ac.uk
- **Features**: Qi Zhu, University of California, Riverside, USA, gzhu@ece.ucr.edu
- **Opinions**: Prasun Ghosal, Indian Institute of Engineering Science and Technology, India, p.ghosal@it.iiets.ac.in
- **Opinions**: Michael Hübner, Ruhr-University of Bochum, Germany, Michael.Huebner@ruhr-uni-bochum.de
- **Opinions**: Jawar Singh, Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, India, jawar@iiitdmj.ac.in
- **Opinions**: Yasuhiro Takahashi, Gifu University, Japan, vasut@gifu-u.ac.jp
- **Updates**: Helen Li, University of Pittsburgh, USA, hal66@pitt.edu (featured member story)
- **Updates**: Anirban Sengupta, Indian Institute of Technology, Indore, India, asengupt@iiti.ac.in, (awards, member news)
- **Updates**: Jun Tao, Fudan University, China, taojun@fudan.edu.cn (upcoming conferences and workshops, funding opportunities)
- **Updates**: Himanshu Thapliyal, University of Kentucky, USA, hthapliyal@uky.edu (call for papers and proposals, job openings and Ph.D. fellowships)
- **Outreach and Community**: Mike Borowczak, University of Wyoming, USA, mike.borowczak@uwyo.edu