
VLSI Design
Ideas (658)

Lab Assignment 3

DANAI
CHASAKI

ID: 22169216

Objective: Design a CMOS circuit and
layout for a 4-bit accumulator using four instances of the bitslice
accumulator from Lab 2.

This lab involves the circuit design and layout of a 4-bit accumulator.
The accumulator consists of four instances of the bitslice
accumulator from Lab 2. A 4-bit accumulator consists of a 4-bit full
adder and a resettable 4-bit register. Its 7 inputs
are clk, A3, A2, A1, A0, c_in, and reset. Its 5
outputs are Q3, Q2, Q1, Q0 and c_out.
The adder computes the sum of A3, A2, A1, A0, Q3, Q2, Q1, Q0,
and c_in, and generates a sum
S3, S2, S1, S0 and a carry c_out.
The register samples S3, S2, S1, S0 on the rising edge
of clk and stores the result on Q3, Q2, Q1, Q0.
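The behaviour described above can be sketched as a small reference model. This is only a behavioural sketch in Python, not the CMOS design itself; the function name `accumulator_step` is hypothetical, and the model assumes that an active reset simply forces the register to 0000.

```python
def accumulator_step(q, a, c_in, reset):
    """Model one rising clock edge of the 4-bit accumulator.

    q and a are integers in 0..15, c_in and reset are 0 or 1.
    Returns (q_next, c_out). Assumption: reset forces Q to 0000.
    """
    total = a + q + c_in          # 4-bit adder: A + Q + c_in
    s = total & 0xF               # sum bits S3..S0
    c_out = (total >> 4) & 1      # carry out of the most significant bit
    q_next = 0 if reset else s    # register samples S unless reset is active
    return q_next, c_out


# Example: Q = 1100, A = 0110, c_in = 0 wraps around modulo 16.
q_next, c_out = accumulator_step(0b1100, 0b0110, 0, 0)
print(format(q_next, "04b"), c_out)  # 0010 1
```

A model like this is useful as a golden reference when checking simulator waveforms by eye.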

POST: Schematic of the
4 bit accumulator

We
obtain the schematic of the 4-bit accumulator by instantiating 4 individual
symbols of the 1 bit accumulator and interconnecting them as shown above.

Below we
give a detailed view, clearly showing the interconnections and the capacitances
placed at the Co and the Q nodes.

The accumulator instantiated above is a slightly
different version of the one in Lab 2.

The nodes Q0 through Q3 here are actually the buffered values Qout in the previous schematic. Also the pin Co_bar that was used for debugging in Lab 2 has been
removed as it is no longer required. Node Q from Lab 2 also has been made an
internal wire, and is no longer a separate pin.

POST: Validation test
sequence (which includes both inputs and response) and an image of the
simulator output (waveforms).

The
circuit must be validated for all possible input combinations, using minimal
input vectors. This is done by exploiting the inherent parallelism in the
circuit as explained below.

First we
enter the input vectors and the nodes to be analyzed and reset the entire
circuit.

This
sets Q to 0000, and S (Sum) to 0000.

In the
next clock cycle, we set A = 1111, Cin = 1 and Q is
0000 (prev. value of S)

On addition, the carry ripples through all four stages, which tests the
CARRY logic in each of the 4 individual bit slices.

Testing the Sum logic is simpler. We set Cin = 0 and test 0+0, 0+1 and
1+1 using the sequences shown above. The 1+0 case is the mirror image
of the 0+1 case.
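The ripple effect and the three sum cases can be illustrated with a bit-level model. This is a minimal sketch that assumes a textbook full adder per bit slice; the names `full_adder` and `ripple_add4` are hypothetical and do not correspond to cells in the schematic.

```python
def full_adder(a, b, c):
    """Textbook 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c
    c_out = (a & b) | (c & (a ^ b))
    return s, c_out


def ripple_add4(a_bits, q_bits, c_in):
    """Add two 4-bit values given LSB first; return (sums, carries)."""
    sums, carries = [], []
    c = c_in
    for a, q in zip(a_bits, q_bits):
        s, c = full_adder(a, q, c)   # carry of this slice feeds the next
        sums.append(s)
        carries.append(c)
    return sums, carries


# A = 1111, Q = 0000, Cin = 1: every slice produces a carry.
sums, carries = ripple_add4([1, 1, 1, 1], [0, 0, 0, 0], 1)
print(sums, carries)  # [0, 0, 0, 0] [1, 1, 1, 1]

# Sum logic with Cin = 0: 0+0, 0+1 and 1+1.
print(full_adder(0, 0, 0), full_adder(0, 1, 0), full_adder(1, 1, 0))
# (0, 0) (1, 0) (0, 1)
```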

The
IRSIM command file is shown below.

sim.cmd input file:

stepsize 50
vector Q Q3 Q2 Q1 Q0
vector A A3 A2 A1 A0
vector S S3 S2 S1 S0
vector Carry Cout Co2 Co1 Co0
analyzer Reset phi Cin A Q S Carry
vector in phi Cin Reset

set in 101
set A 0000
s
set in 001
set A 0000
s

| testing the carry logic via ripple effect
set in 110
set A 1111
s
set in 010
set A 1111
s

| testing the sum logic
| testing if 0+0 = 0
set in 100
set A 0000
s
set in 000
set A 0000
s

| testing if 0+1 = 1
set in 100
set A 1000
s
set in 000
set A 1000
s

| testing if 1+1 = 0
set in 100
set A 1000
s
set in 000
set A 1000
s

IRSIM output

The
schematic was extracted into a netlist using the schm2sim.pl
perl script and IRSIM was run.

The
IRSIM output file shows that the Carry and SUM logic work correctly, as
explained in the section above. Also, the above simulation shows that the reset
logic is working correctly and we can also observe that the value of Sum (S) is
latched at the positive clock (phi) edge and passed onto Q.

Thus the
circuit functionality is validated.

POST: An image of the
layout, with the total height and width annotated. Description of changes made
to the bitslice layout of Lab 2.

Layout
of the 4-bit accumulator:

The layout of the accumulator is shown below. The aspect ratio is
approximately 2:1. The 1-bit accumulator had an aspect ratio of about
3:1; placing 4 of those slices side by side and adding the wiring that
interconnects them increased the width. The width could be reduced
further, gaining a few more lambda of space, by packing the individual
blocks together, as the visible gaps show. However, this would heavily
reduce design clarity and increase debugging complexity. The PHI and
RESET signals run across the circuit in a horizontal strip of M1 (in
level 3, on either side of the gnd signal).

Changes
made to the layout of lab2:

The accumulator instantiated above is a slightly modified version of
that in Lab 2. Nodes Q0 through Q3 here are actually the buffered value
Qout in the previous schematic. The Co_bar pin that was used for
debugging in Lab 2 has been removed, as it is no longer required. Node Q
from Lab 2 has been made an internal wire and is no longer a separate
pin, to avoid conflict with the pin names required by this problem. The
rest of the layout was not modified and is essentially the same. Four
instances were placed one after the other and interconnected using
wires. The labelling of the nodes was also changed to meet the new
requirements.

Algorithmic
verification of layout

POST: Hand calculations
of the expected final sum, Simulator waveforms corresponding to the above
algorithm, the static and dynamic power dissipation of the accumulator as
reported by the simulator (no hand calculations).

The last
4 digits of my student ID are 9216.

The
algorithm executed is as follows:

1) Reset the accumulator.

2) Set c_in=0 for the duration of this test.

3) Let c_out be an output, and let the sum wrap around in case of an overflow (i.e. modulo-16 addition).

4) Set A=9 (A3=1, A2=0, A1=0, A0=1) and pulse the clock.

5) Set A=2 (A3=0, A2=0, A1=1, A0=0) and pulse the clock.

6) Set A=1 (A3=0, A2=0, A1=0, A0=1) and pulse the clock.

7) Set A=6 (A3=0, A2=1, A1=1, A0=0) and pulse the clock.

The final sum will be Q=0010.

Hand Calculations:

Clk(t)  Reset  A(t)  Q(t-1)  S(t)  Carry out(t)
0       1      0000  0000    0000  0
1       0      1001  0000    1001  0
2       0      0010  1001    1011  0
3       0      0001  1011    1100  0
4       0      0110  1100    0010  1

FINAL SUM: 0010.
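The hand calculation can be cross-checked with a few lines of Python. This is a sketch of the modulo-16 accumulation only, with c_in held at 0 as in the algorithm; the helper name `accumulate` is hypothetical.

```python
def accumulate(digits):
    """Accumulate digits modulo 16, starting from Q = 0000 after reset.

    Returns one row per clock pulse: (A, previous Q, S, carry out).
    """
    q = 0
    rows = []
    for a in digits:
        total = q + a                       # c_in is 0 throughout this test
        s, c_out = total % 16, total // 16  # 4-bit sum and carry out
        rows.append((format(a, "04b"), format(q, "04b"),
                     format(s, "04b"), c_out))
        q = s                               # register stores S on the clock edge
    return rows


for row in accumulate([9, 2, 1, 6]):
    print(*row)
# 1001 0000 1001 0
# 0010 1001 1011 0
# 0001 1011 1100 0
# 0110 1100 0010 1
```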

IRSIM simulation

Input command file:

stepsize 50
vector Q Q3 Q2 Q1 Q0
vector A A3 A2 A1 A0
vector S S3 S2 S1 S0

analyzer phi Reset Cin Cout A Q S
vector in phi Cin Reset

set in 101
set A 0000
s
set in 001
set A 0000
s

set in 100
set A 1001
s

set in 000
set A 1001
s

set in 100
set A 0010
s

set in 000
set A 0010
s

set in 100
set A 0001
s

set in 000
set A 0001
s

set in 100
set A 0110
s

set in 000
set A 0110
s
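Because the command file repeats the same clock-high and clock-low pattern for every value of A, it could also be generated by a script. The sketch below is a hypothetical helper, not part of the lab flow; it only emits the commands already used in the file above (stepsize, vector, analyzer, set, s) and assumes the same `in` vector ordering of phi, Cin, Reset.

```python
def irsim_commands(a_values, stepsize=50):
    """Build the text of an IRSIM command file for a list of A values."""
    lines = [
        f"stepsize {stepsize}",
        "vector Q Q3 Q2 Q1 Q0",
        "vector A A3 A2 A1 A0",
        "vector S S3 S2 S1 S0",
        "analyzer phi Reset Cin Cout A Q S",
        "vector in phi Cin Reset",
    ]

    def clock(in_high, in_low, a):
        # One clock pulse: phi high for one step, then phi low for one step.
        bits = format(a, "04b")
        return [f"set in {in_high}", f"set A {bits}", "s",
                f"set in {in_low}", f"set A {bits}", "s"]

    lines += clock("101", "001", 0)       # reset cycle: Reset = 1
    for a in a_values:
        lines += clock("100", "000", a)   # normal cycle: Cin = 0, Reset = 0
    return "\n".join(lines)


print(irsim_commands([9, 2, 1, 6]))
```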

IRSIM output

The average power dissipation reported by the HSPICE simulation (log
file) is 423 µW, and the maximum power dissipation is 89 mW.

The
dynamic power dissipation is plotted with respect to phi as shown below.

POST: A description of
the critical path, the test sequence which exercises it, and an image of the
simulator output with the clock frequency annotated.

Critical
path analysis:

The critical path is the longest (slowest) path from an input to an
output. From the topology of the 4-bit ripple-carry adder under
consideration, we estimate that the critical path consists of carry
propagation (the ripple effect) through the first 3 blocks, followed by
sum and carry-out generation in the 4th block.

To test this, we use a shorter version of the test sequence given in the
problem statement. The idea is to generate a Cout of 1, then vary the
clock frequency and check for distortion in the Cout signal. By
distortion, we mean that the Cout signal is clipped and does not reach
the expected 2.5 V level.

The test sequence is described below.

1) Reset the accumulator.

2) Set c_in=0 for the duration of this test.

3) Let c_out be an output, and let the sum wrap around in case of an overflow (i.e. modulo-16 addition).

4) Set A=5 (A3=0, A2=1, A1=0, A0=1) and pulse the clock.

5) Set A=6 (A3=0, A2=1, A1=1, A0=0) and pulse the clock.

6) Set A=7 (A3=0, A2=1, A1=1, A0=1) and pulse the clock. (Here Cout should be 1.)

The output Q3Q2Q1Q0 should be 0010.

Normal
operation:

At 500 MHz, the output is distorted: Cout does not register the
required 1, leading to an erroneous result.

The maximum frequency at which the circuit operates correctly is
approximately 450 MHz, based on simulations.
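For reference, the two clock frequencies quoted above convert to clock periods as follows. This is plain arithmetic (period = 1 / frequency), not a simulator result, and the helper name `period_ns` is hypothetical.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / freq_mhz


for f in (450, 500):
    print(f"{f} MHz -> {period_ns(f):.2f} ns")
# 450 MHz -> 2.22 ns
# 500 MHz -> 2.00 ns
```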

POST: A table of the
setup times and an image of the simulator output with the setup times
annotated.

Setup
time estimation:

Since
there is a flip-flop in each bit slice that samples Sum on the positive edge of
the clock (phi), there would be set-up time requirements on the individual
inputs A0, A1, A2, A3 and Cin that must be met
in order for the system to function correctly.

Definition:
Setup time is
defined as the time that the data must arrive before the clock edge (in this
case the rising edge), in order for it to be sampled correctly.

We
evaluate the setup times by trial and error as shown below.

Each bit slice is identical, and hence the setup times of A0, A1, A2 and
A3 would be the same. The small variation in their individual setup
times due to clock propagation delay (skew) from one end of the layout
to the other can be ignored, given that we are trying to obtain a
first-order estimate of the various delays and times via simulation.
Hence we determine the setup time of one of the A inputs, say A0, and
that of Cin.
To
determine the setup time, the algorithm is as shown below.

We need to sample Q = 1 through the flip-flop, which means we need
Sum = 1.

Now Sum = A + Q + Cin (for a single bit, the sum bit is the XOR of the
three inputs).

Setup time of A is obtained by setting A=1, Q=Cin=0.

Setup time of Cin is obtained by setting Cin=1, A=Q=0.
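The trial-and-error search can be sketched as a simple sweep: move the data edge closer to the clock edge until the flip-flop no longer captures the value. In the sketch below, `captures_correctly` stands for one simulator run at a given lead time; both it and `fake_flop` are hypothetical stand-ins, and the 400 ps threshold in `fake_flop` is only a placeholder used to make the example runnable.

```python
def find_setup_time(captures_correctly, start_ps=1000, step_ps=50):
    """Return the smallest tested lead time (ps) that still captures data.

    captures_correctly(lead_ps) should report whether data arriving
    lead_ps before the rising clock edge is sampled correctly.
    Returns None if even the starting lead time fails.
    """
    lead = start_ps
    last_good = None
    while lead >= 0 and captures_correctly(lead):
        last_good = lead      # this lead time still works
        lead -= step_ps       # move the data edge closer to the clock edge
    return last_good


def fake_flop(lead_ps, threshold_ps=400):
    """Stand-in for a simulation run (assumed threshold, not measured)."""
    return lead_ps >= threshold_ps


print(find_setup_time(fake_flop))  # 400
```

The resolution of the estimate is limited by the step size, so a finer step gives a tighter bound at the cost of more simulation runs.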

Setup
time of A0 = 400ps

In the
above figure, Sum S0 is sampled by the flip-flop and passed on to Q0

Setup
time violation on A results in data not being sampled
(below)

Setup
time analysis of Cin

Setup
time was found to be 300ps

Setup time violation (on Cin)
example.

Node  Setup Time
A0    400 ps
A1    400 ps
A2    400 ps
A3    400 ps
Cin   300 ps

POST: A table of the
propagation delays, a description of how you chose your test sequences and an
image of the simulator output with the propagation delays annotated.

Propagation
delay estimation:

Propagation
delay is defined as the time it takes for a signal to propagate from one end to
the other end of the circuit. It may also be defined as the time it takes for
the outputs to change with respect to the input. The worst case propagation
delay is obtained using the critical path of the circuit.

The test
sequence is explained below

1) Reset the circuit. (Sets all Q's to 0.)

2) Set Cin = 0, and all A's to logic 1. This will result in all the Sums going high and trigger all the Q's to 1 on the next clock edge.

3) Measure the delay from Phi to Q.

4) Reset the circuit again. This will result in all the Q's going back to logic 0.

5) Measure the delay from Reset to Q.

The propagation delays are expected to be in the following order:
Cin > A0 > A1 > A2 > A3.

The actual simulation results are given below.

Based on the above estimation, we find that the propagation delays of
all the Q outputs are the same. This is expected, since the layouts are
identical.

Signal  Tp from Phi  Tp from Reset
Q0      1.57 ns      420 ps
Q1      1.57 ns      420 ps
Q2      1.57 ns      420 ps
Q3      1.57 ns      420 ps

Propagation delay to C_out estimation

The propagation delay of Cout from Reset and from Phi is estimated
using the same method as above. To obtain the delays from A0, A1, A2, A3
and Cin, we use the following method.

1) Reset the system. This results in Q3Q2Q1Q0 = 0000.

2) Set A3A2A1A0 = 1111. This results in Sum = 1111 and Q = 1111 on the next clock edge.

3) Depending on which propagation delay estimate is required, set the corresponding input to 1 and the remaining inputs to 0.

For example, if the delay from Cin is required, set Cin = 1, A = 0000
and clock the circuit. This results in Cout = 1, and we can then measure
the propagation delay from Cin. Similarly, if the delay from A3 is
required, set Cin = 0, A = 1000 and clock the circuit. This results in
Cout = 1, and we can then measure the propagation delay from A3.
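A quick arithmetic check confirms that each of these stimuli does raise Cout when Q = 1111. This is a behavioural sketch only; the helper `c_out` and the `stimuli` table are hypothetical names used for illustration.

```python
def c_out(a, q, c_in):
    """Carry out of a 4-bit addition A + Q + c_in."""
    return (a + q + c_in) >> 4


Q = 0b1111  # register contents after step 2

# One stimulus per input under test: (A value, Cin value).
stimuli = {
    "Cin": (0b0000, 1),
    "A0": (0b0001, 0),
    "A1": (0b0010, 0),
    "A2": (0b0100, 0),
    "A3": (0b1000, 0),
}

print("idle", c_out(0, Q, 0))           # idle 0: Cout is low before the stimulus
for name, (a, cin) in stimuli.items():
    print(name, c_out(a, Q, cin))       # each stimulus drives Cout to 1
```

Since Cout is 0 with all inputs low and 1 for every stimulus, the rising edge on Cout can be attributed to the single input that was changed.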

Input  Tp to Cout
Cin    1100 ps
A0     1100 ps (1.1 ns)
A1     900 ps
A2     700 ps
A3     400 ps
Phi    400 ps
Reset  60 ps (0.06 ns)
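The reported A-to-Cout delays can be compared against the expected ordering, and the differences between neighbouring inputs give a rough per-stage ripple delay. This is only a reading of the numbers reported in this section, not an additional measurement.

```python
# Reported propagation delays to Cout, in picoseconds.
delays_ps = {"Cin": 1100, "A0": 1100, "A1": 900, "A2": 700, "A3": 400}

order = ["Cin", "A0", "A1", "A2", "A3"]
values = [delays_ps[name] for name in order]

# Expected trend: inputs further from Cout should not be faster.
non_increasing = all(x >= y for x, y in zip(values, values[1:]))
print(non_increasing)  # True

# Delay difference between adjacent A inputs (one carry stage apart).
a_values = values[1:]
stage_steps = [x - y for x, y in zip(a_values, a_values[1:])]
print(stage_steps)  # [200, 200, 300]
```

The measured order agrees with the expected trend, except that Cin and A0 show the same delay rather than Cin being strictly slower.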