文档库

最新最全的文档下载
当前位置:文档库 > Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

Performance-driven Synthesis of Asynchronous Controllers Kenneth Y.Yun Bill Lin David L.Dill Srinivas Devadas

Department of ECE,University of California,San Diego,CA92093

IMEC,Kapeldreef75,B-3001Leuven,Belgium

Computer Systems Laboratory,Stanford University,Stanford,CA94305 Department of EECS,MIT,Cambridge,MA02139

Abstract

We examine the implications of a new hazard-free combinational logic synthesis method[8],which generates multiplexor trees from binary decision diagrams(BDDs)—representations of logic func-tions factored recursively with respect to input variables—on extended burst-mode asynchronous synthesis.First,the use of the BDD-based synthesis reduces the constraints on state minimization and assignment,which reduces the number of additional state vari-ables required in many cases.Second,in cases where conditional signals are sampled,it eliminates the need for state variable changes preceding output changes,which reduces overall input to output la-tency.Third,selection variables can easily be ordered to minimize the latency on a user-speci?ed path,which is important for optimiz-ing the performance of systems that use asynchronous components. We present extensive evaluations showing that,with only minimal optimization,the BDD-based synthesis gives comparable results in area with our previous exact two-level synthesis method.We also give a detailed example of the speci?ed path optimization.

1Introduction

There have been many recent advances in asynchronous circuits and systems,both in tool design[1,2,3,5,7,9,10,12,13,16,15,20, 21,22]and actual systems design[6,9,10,11,14,17].However, for maximum acceptability,it is imperative to be able to synthesize circuits that work with existing systems,which are largely made out of synchronous components.One particularly promising design style is the extended-burst-mode[23].

This paper describes a new synthesis algorithm for asynchronous controllers speci?ed in extended-burst-mode[23].This algorithm assumes the target implementation to be a combinational circuit with both primary outputs and state variables fed back.The combi-national circuit is derived from a Binary Decision Diagram[4]using

a recently developed hazard-free combinational synthesis method

[8].This algorithm is designed,?rst and foremost,to minimize output latency.

This new approach has many advantages over other synthe-sis methods[15,16,23,22],which implement the combinational logic as two-level AND-OR circuits.In many cases,the circuits synthesized using this new method have considerably lower out-put latencies than the circuits synthesized by the method in[23]. Furthermore,this method provides a platform to further minimize the delay on user-speci?ed input/output path,which can be very important for achieving high performance in systems that use asyn-chronous components.

Supported by the Semiconductor Research Corporation,Contract no.93-DJ-205.

Supported by the EXACT project under the program ESPRIT6143of the European Commission

Supported by the NSF Young Investigator Award 2Controller Speci?cation and Implementation

In this section,we summarize a design style,called extended burst-mode,and an implementation style,called3D,to specify and imple-ment asynchronous controllers for heterogeneous systems—sys-tems with both synchronous and asynchronous components.These topics have been discussed in more detail in previous publications [23,22].

Abstract Performance-driven Synthesis of Asynchronous Controllers

ok? frin? /

frin* dackn? / dreq?

frin* dackn? / dreq?

Figure1:Extended-burst-mode Speci?cation.

Figure1shows an example of an extended burst-mode speci?-cation.Signals not enclosed in angle brackets and ending with or are terminating signals.These are edge signals.The signals enclosed in angle brackets are conditionals,which are level sig-nals whose values are sampled when all of the terminating edges associated with them have occurred.A conditional can be read“if is high”and can be read“if is low.”A state transition occurs only if all of the conditions are met and all the terminating edges have appeared.A signal ending with an asterisk is a directed don’t care.If is a directed don’t care,there must be a sequence of state transitions in the machine labeled with.If a state transition is labeled with,the following state transitions in the machine must be labeled with or with or(the terminating edge for the directed don’t care).This speci?cation de-scribes a state machine having a conditional input,3edge inputs,and2outputs.Consider the state transitions out of state4.The behavior of the machine at this point is:“if is low when falls,change the current state from4to5and lower the output;if is high when falls,change the current state from4to2and

lower output.”

A directed don’t care may change at most once during a sequence of state transitions it labels,i.e.,directed don’t cares are monotonic signals,and,if doesn’t change during this sequence,it must change during the state transition its terminating edge labels.A terminating edge which is not immediately preceded by a directed don’t care is called compulsory,since it must appear during the state transition it labels.is low when the speci?cation is in state4.It can rise at any point as the machine moves through states5and6or through state2,depending on the level of,and it must have risen by the time the machine moves to states6or2,because the terminating edge appears between states5and6and between4and2.

The input signals are globally partitioned into level signals(con-ditionals),which can never be used as edge signals,and edge signals (terminating or directed don’t care),which can never be used as level signals.If a level signal is not mentioned on a particular state transition,it may change freely.If an edge signal is not mentioned, it is not allowed to change.

More generally,an extended-burst-mode asynchronous?nite state machine[23]is speci?ed by a state diagram which consists of a?nite number of states,a set of labeled state transitions connecting pairs of states,and a start state.Each state transition is labeled with a set of conditional signal levels and two sets of signal edges:an input burst and an output burst.An output burst is a set of output edges, and an input burst is a non-empty set of input edges(terminating or directed don’t care),at least one of which must be compulsory.

In a given state,when all the speci?ed conditional signals have correct values and when all the speci?ed terminating edges in the input burst have appeared,the machine generates the corresponding output burst and moves to a new state.Speci?ed edges in the input burst may appear in arbitrary temporal order.However,the condi-tional signals must stabilize to correct levels before any compulsory edge in the input burst appears and must hold their values until after all of the terminating edges appear.The minimum delay from the conditional stabilizing to the?rst compulsory edge is called the setup time.Similarly,the minimum delay from the last terminating edge to the conditional changing is called the hold time.Actual values of setup and hold time of conditional signals with respect to“sampling”edges depend on the implementation.Conditional signal levels need not be stable outside of the speci?ed sampling periods.Each signal speci?ed as a directed don’t care may change its value monotonically at any time including during output bursts, unless it is already at the level speci?ed by the next terminating edge.Outputs may be generated in any order,but the next set of compulsory edges from the next input burst may not appear until the machine has stabilized.This requirement—the environment must wait generating the next set of compulsory edges until the circuit stabilizes—called the fundamental-mode environmental constraint.

There is an additional restriction to extended-burst-mode spec-i?cations,called the distinguishability constraint,which prevents ambiguity among multiple input bursts emanating from a single state:For every pair of input bursts and from the same state,ei-ther the conditions are mutually exclusive,or the set of compulsory edges in is not a subset of the set of all possible edges in.

2.13D Implementation

A3D asynchronous?nite state machine is a4-tuple

where is a non-empty set of primary input symbols,a non-empty set of primary output symbols,a possibly empty set of internal state variable symbols,and:

is a next-state function.The hardware implementation of a3D state machine(see?gure2)is a combinational network,which implements the next-state function,with the outputs of the network fed back as inputs to the network.There are no explicit storage

Abstract Performance-driven Synthesis of Asynchronous Controllers

Figure2:3D Asynchronous State Machine. elements such as latches,?ip-?ops or C-elements in a3D machine.

A3D implementation of an extended-burst-mode speci?cation is obtained from the next-state table,a3-dimensional tabular rep-resentation of.The next state of every reachable state must be speci?ed in the next-state table;the remaining entries are don’t cares.

A machine cycle consists of an input burst followed by a con-current output and state burst.Initially or after completion of the previous output and state burst,the machine waits for an input burst to arrive.When the machine detects that all of the terminating edges of the input burst have appeared,it generates a concurrent output/state burst,which may be empty.

2.22-level Synthesis

In the3D implementation of extended-burst-mode machines,no fed-back output or state variable change arrives at the network input until all of the speci?ed edges in the output and state burst have appeared at the network output.These conditions are met by inserting delays in the feedback paths as necessary.A3D machine can then be viewed as a combinational network alternately excited by a set of input edges(during an input burst)and by a set of fed-back output and state variable edges(during an output/state burst).Thus each burst is a generalized transition of inputs to the combinational network,as described below:

Generalized transition

A generalized transition is a triple where is a mapping from a set of inputs to a set of input types,a start-cube,and an end-cube.There are three types of inputs:rising edge,falling edge,and level signals.Edge inputs can only change monotonically. Level inputs must remain constant or unde?ned(don’t care),which implies that each level input must hold the same value in both and or be unde?ned in both and.Level inputs,if they are unde?ned,may change non-monotonically,

A generalized transition cube is the smallest cube that contains the start-and end-cubes and.It represents the set of all minterms that can be reached during a legal transition from a point in start-cube to a point in end-cube,assuming that the inputs can change in arbitrary order.Open generalized transition cubes,,,and,denote,, and respectively.Note that,if.The start-subcube is a maximal subcube of such that the value of every rising edge input in is0,if it is in,and the value of every falling edge input in is1,if it is in.The end-subcube is a maximal subcube of such that the value of every rising edge input in is1,if it is in,and the value of every falling edge input in is0,if it is in.Intuitively,the longest transitions,disregarding non-monotonic signals,are those that lead from to.

(a) function?hazard?free static transition

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

(c) function?hazard?free

dynamic transition

(b) static function hazard(d) dynamic function hazard

Figure3:Generalized Transition Cube.

A generalized transition is a static transition for iff

;it is a dynamic transition for iff. No change in level inputs can enable output changes directly,that is,at least one edge input must change from0to1or from1to0in a generalized dynamic transition.

During a generalized transition,each output signal is assumed to change its value at most once.If not,a function hazard is said to be present.Below is the de?nition of function hazard adapted for generalized transitions:

De?nition1A combinational function contains a function haz-ard during a generalized transition iff

1.there exists a pair of minterms in such that

,or

2.there exists a pair of minterms in such that

,or

3.there exists a pair in such that and

and.

An extended-burst-mode transition is a generalized transition with the following requirements:

1.For every pair of minterms and in,.

2.For every pair of minterms and in,. Theorem1Every extended-burst-mode transition is function-hazard-free.

An edge signal that changes from0or to1or from1or to0during an extended-burst-mode transition from to is a terminating signal in.An edge signal whose value is in is a directed don’t care in.A level signal whose value is in is an undirected don’t care.In a dynamic extended-burst-mode transition,the output is enabled to change only after all of the terminating edges appear.

In order for a2-level AND-OR implementation of an output or a state variable function to be hazard-free,a set of covering require-ments[23,16]must be satis?ed for each burst,i.e.,extended-burst-mode transition.It was shown in[23]that it is not always possible to satisfy the covering requirements for all of the speci?ed bursts under the presence of non-monotonically changing(unde?ned)con-ditionals,if the single transition time(STT)state assignment[19]is used.The approach taken in[23]was to insert a state burst between a conditional input burst and the corresponding output burst in order to guarantee that the covering requirements can be satis?ed for all of the speci?ed bursts.Unfortunately,the early state burst between the input burst and output burst increased the input/output latency signi?cantly,and also tended to increase the circuit area.

3BDD-based Combinational Synthesis

In this paper,we use a new BDD based combinational synthe-sis technique from[8].This approach imposes a different set of requirements to guarantee freedom from all hazards,but we will show that it is always possible to meet these requirements without the multiple transition time state assignment that was required in the method of[23],resulting in greatly reduced latency in many cases.

Combinational networks that describe next-state functions are constructed from a BDD(binary decision diagram)description. The basic gates that comprise combinational networks are ANDs, ORs,NANDs,NORs,inverters,and MUXes.We assume that every basic gate is atomic,i.e.,a single transition of a gate input cannot cause a multiple transitions at the output.We do not rely on inertial delays,that is,we assume a pure gate delay model,and allow for arbitrary wire delays.

3.1Multiplexer Networks derived from BDDs

The following de?nition of a Binary Decision Diagram is from[4].

De?nition2A Binary Decision Diagram is a rooted,directed graph with vertex set containing two types of vertices.A non-terminal vertex has as attributes an argument index

1and two children.A terminal vertex has as attributes a value01.

The correspondence between a BDD and a Boolean function is de?ned as below:

De?nition3A Binary Decision Diagram having root vertex denotes a function de?ned recursively as:

1.If is a terminal vertex:

(a)If1,then1.

(b)If0,then0.

2.If is a non-terminal vertex with,then

1

11.

is called the decision variable for vertex.

In addition,

1.Each decision variable occurs at most once on every path from

a terminal vertex to the root vertex,

2.A reduced BDD is a BDD in which for any

vertex and no two subgraphs are identical.

A reduced ordered BDD(ROBDD)is a canonical form with the following restriction:for any non-terminal vertex,if is a non-terminal,then,and if is a non-terminal,then.

A reduced free BDD(free BDD)is a BDD which does not require a strict variable ordering(unlike in an OBDD)but still requires that each decision variable is encountered at most once when traversing a path from a terminal vertex to the root vertex.

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

f

(b)

Abstract Performance-driven Synthesis of Asynchronous Controllers

(c) c

b

f f

Figure4:(a)BDD(b)MUX network derived from BDD(c)Sim-pli?ed network(by constant propagation).

A multi-level network can be derived directly from a BDD by replacing each vertex with a two-input MUX with the decision variables as the inputs of the MUXes.Figure4b shows a MUX network derived from the BDD in?gure4a.If one or more input of a MUX is constant,the MUX can be replaced with a simpler gate,such as a NAND or a NOR.This constant propagation is carried out topologically from inputs to outputs.Figure4c shows an equivalent network after the constant propagation.

3.2Hazard-free Combinational Synthesis

We use the approach from[8]to synthesize hazard-free combina-tional circuits under extended-burst-mode transitions.This method is based on building a BDD for a speci?ed function and deriving a multi-level circuit from it.To ensure that the resulting multi-level circuit is hazard-free,a requirement called the trigger signal order-ing(TSO)must be satis?ed.This requirement imposes constraints on the variable ordering of the BDD.It was shown in[8]that if this variable ordering is satis?ed,then the resulting multi-level circuit is free of logic hazards for a set of speci?ed transitions.Note that every input change in[8]was assumed to be monotonic during each transition.We will prove that the resulting circuit is free of logic hazards for a set of speci?ed extended-burst-mode transitions,in which some inputs may change non-monotonically,as long as the TSO requirement is satis?ed.

A trigger state is a state in which an input change enables the output to change.A trigger signal is an input signal whose transition enables the output to change;a non-trigger signal is an input signal which is enabled to change but cannot by itself enable the output to change.The TSO requirement states that trigger signals in a trigger state must appear before the non-trigger signals of the same trigger state in the variable ordering.

In the generalized transition cube that corresponds to an extended-burst-mode dynamic transition,all terminating signals are trigger signals in one or more minterms,because terminating edges can appear in any temporal order and the last one that appears is a trigger signal.Note that no terminating signal can be a non-trigger signal,because no output change can be enabled until all termi-nating edges appear.Furthermore,all don’t care signals(directed or undirected)are non-trigger signals in one or more minterms, because their values may change anywhere,including in the trig-ger states,in the generalized transition cube.Clearly,no don’t care signal can be a trigger signal in any minterm in the generalized tran-sition cube,because don’t care signals can never enable outputs to change.Therefore,we can impose a set of ordering requirements, which do not con?ict,as a suf?cient condition for hazard freedom per generalized transition cube,although the TSO requirement in [8]is an imposition on each trigger state in the transition cube.

Now we can state the variable ordering requirements for the extended-burst-mode transitions as follows:Along every path from root to terminal of the BDD whose corresponding cube intersects the generalized transition cube,no don’t care signal of a dynamic transition appears before a terminating signal of the same.

Here,we prove that the combinational network derived from a reduced free BDD description is hazard-free during an extended-burst-mode transition as long as the BDD satis?es the variable ordering requirement for the transition stated above.

Lemma1If is an extended-burst-mode transition for, then for every don’t care signal in

and for every minterm in.

De?nition4Subtransitions:

1.If is not a constant0in,is a subtran-

sition of with the value of?xed to1.

2.If is not a constant1in,is a subtran-

sition of with the value of?xed to0.

Note that,if is a constant1in, and,if is a constant0in.

Lemma2If is an extended-burst-mode transition for, then

1.is an extended-burst-mode transition for,if

is not a constant0;

2.is an extended-burst-mode transition for,if

is not a constant1.

Corollary1If is an extended-burst-mode dynamic tran-sition for and is a don’t care in,then is an extended-burst-mode dynamic transition for and

for.

Corollary2Static transitions of cofactors:

1.is a static transition for if is an

extended-burst-mode transition for and is a falling termi-nating signal.

2.is a static transition for if is an

extended-burst-mode transition for and is a rising termi-nating signal.

Theorem2The combinational network derived from a reduced BDD(ordered or free)description of is hazard-free during an extended-burst-mode transition if it satis?es the variable ordering requirement for the transition:no don’t care signal appears before a terminating signal.

Proof:We prove by induction on the number of variables.

Base case:The sole input of the network is connected to the select input of the multiplexor.The other input terminals are con-nected to a constant1or0.Since the multiplexor is atomic and only the select input can change,is hazard-free.

Inductive hypothesis:Now assume that a combinational net-work derived from a reduced BDD representation of an-input function(1),which satis?es the variable ordering require-ments for an extended-burst-mode transition,is hazard-free during the extended-burst-mode transition.

Now consider the network derived from a reduced BDD representation of function with1input variables and an extended-burst-mode transition for.Assume that the select input of the multiplexor driving the output of is and the data inputs are and,the Shannon cofactors of with respect to and.Then is an extended-burst-mode transition for if is not a constant0,and is for if is not a constant1,by Lemma2.Since satis?es the variable ordering requirements,so do and.Therefore,is hazard-free if is not a constant0,and is hazard-free if is not a constant1,by the inductive hypothesis.We will consider3cases:is a constant, is a don’t care,and is a terminating signal.

1.is a constant:

The multiplexor is a wire as long as remains constant.Thus, if0,is hazard-free since is not a constant1.

Likewise,is hazard-free,if 1.

2.is a don’t care:

First we prove by contradiction that must be a static transition for if is a don’t care.Assume that

is a dynamic transition for.By Corollary1,is an extended-burst-mode dynamic transition for.Suppose

remains at1while changes.Then the change in

propagates to,which means that there is a terminating signal that enables to change,regardless of,violating the variable ordering requirement.

By Lemma1,for every in .Therefore,and are static transitions of same type,that is,both00or both11, for and respectively.By the inductive hypothesis,and are hazard-free,therefore constant.Since the multiplexor is atomic,is hazard-free.

3.is a terminating signal:

Without loss of generality,consider only the case in which

rises.By Corollary2,undergoes a static transition.By the inductive hypothesis,both and are hazard-free.Consider the case in which0.We prove by contradiction that

must rise or remain a constant.Assume that is initially1and falls to0,but rises?rst.Since0and1initially,

is enabled to rise as rises.This is a static function hazard, since is assumed to be0at the end of the transition.Thus

must rise or remain a constant;in both cases,is hazard-free.

Similarly,we can prove that is hazard-free when1,by proving that must fall or remain a constant.

Lemma3There always exists a BDD that satis?es the variable ordering requirements for two dynamic input burst transitions from a speci?cation state.

Proof:For every pair of state transitions and ema-nating from state,there exist a compulsory transition signal in the input burst of and in the input burst of such that is a constant in the input burst of and is a constant in the input burst of.If and appear before any other variable involved in the ordering,all variable ordering requirements can be satis?ed in the sub-BDDs.

Our strategy is to build an ROBDD using a global variable ordering,if such an ordering can be found,or to build a free BDD. If no global order exists,there is always a variable that can appear ?rst.This variable partitions the function into a left and right BDD. The left and right BDDs do not have to have the same variable order, so they can be constructed recursively using the same method.

In this3D implementation of extended-burst-mode machines, only input bursts can be dynamic transitions.If a unique code is assigned to each speci?cation state,we can always use fed-back state variables as partitioning variables,so that the variable ordering requirements of each speci?cation state are satis?ed locally in each partition.Therefore,there exists a hazard-free free BDD implementation for every legal extended-burst-mode speci?cation. State minimization is also constrained to avoid creating variable ordering con?icts.

4Sequential Synthesis Procedure

The sequential synthesis procedure consists of the following three steps:(1)hazard-free state assignment(2)hazard-free state mini-mization(3)state encoding.

4.1Hazard-free State Assignment

We describe an algorithm for simultaneously minimizing and assigning states which always results in a hazard-free single-transition-time(STT)state assignment.In contrast,the previous algorithm for extended-burst-mode[23]use a multiple-transition-time assignment:a state variable change was required before an output change,increasing latency signi?cantly.Indeed,it can be shown that multiple transitions are necessary for extended-burst-mode when implemented with2-level AND-OR logic,so the use of BDDs has an inherent performance bene?t.

This algorithm builds a primitive next-state table—a3-dimensional table with-axis representing the input bit vector, -axis the output bit vector,and-axis the speci?cation states. The algorithm assigns,according to the extended-burst-mode se-mantics,a next state,which consists of two components:next outputs and next speci?cation state,to entries in the table.An-plane of the primitive next-state table is called a layer.Initially, each speci?cation state is assigned to a unique layer.The algo-rithm then collapses the primitive next-state table into a reduced next-state table by merging compatible speci?cation states without violating TSO requirements.

In order to describe the primitive next-state table construction formally,a formal de?nition of the extended-burst-mode speci?-cation,adapted from the de?nition of the burst-mode speci?ca-tion in[16],is needed.An extended-burst-mode speci?cation is a directed graph,0,where

is a?nite set of states;is the set of state transitions;1is the set of conditional inputs;

1is the set of edge inputs;1

is the set of outputs;0is the unique start state;labels each state transition with a set of conditional inputs;and

are labeling functions used to de?ne the unique entry cube of each state.The function:01de?nes the values of the conditional inputs.The function:01de?nes the values of the edge inputs and the function:01 de?nes the values of the outputs upon entry to each state.

Labeling functions and are derived from graph.:de?nes the set of edge input changes(input burst)and:de?nes the set of output changes(output burst).(and denote the power set of inputs and the power set of outputs respectively.)

Given a state transition,,is in the input burst iff 11,is in the input burst iff

00,and is in the input burst iff.That is,

iff.Similarly, is in the output burst iff10,is in the output burst iff0 1.That is,

iff.Let

.

The unique entry condition is satis?ed by the above de?nition. The remaining requirements to ensure well-formed speci?cations are:

Every input burst must contain a compulsory edge.That is,for every state transition,there exists

such that.

Every pair of transitions emanating from the same state must satisfy the distinguishability constraint.That is,for every pair,

,im-plies that either or and are mu-tually exclusive,that is,there exists such that

.

For every sequence of state transitions,1

,with1and,there exists 1such that.

Given an extended-burst-mode speci?cation,,as described above,let be the set of conditional input bit vectors 1011,be the set of edge input bit vectors1011, and be the set of output bit vectors1

011.A primitive next-state table is de?ned as

,where is the set of speci?cation states, and:and:

01de?ne the next speci?cation state function and the next output function respectively.

Below,we describe the next-state assignment.is a sym-bolic code assigned to state.

denotes a cube in01,where is the length of state variable bit vectors.Of course,the actual value of will not be determined until the state encoding is done.in the place of con-ditional inputs is a shorthand notation,meaning that all conditional inputs are.is de?ned as:for all1,

if:

otherwise

The next-state assignment is as follows:For all1

and for every state transition,

Conditional input burst setup:

for all in,and, where

,

.

Input burst:

for all in,and, where

,

.

Output/state burst:

for all in,and,where

,

.4.2Hazard-free State Minimization

In the next step,the algorithm transforms the primitive next-state ta-ble into a reduced next-state table by merging layers.Speci?cation states can be merged into a common layer,iff they are compatible, as described below.

De?nition5States and in are compatible(denoted as )iff for every in,

1.and

2.and

3.if for all1,then for every pair

of state transitions,and,is a terminating signal in and is a don’t care in imply that

is a terminating signal in or is a don’t care in, that is,

implies or.

Informally,the third criterion states that no input burst from has con?icting ordering requirements with an input burst in that has identical values of fed-back outputs.

The state minimization and encoding to complete the sequential synthesis are described in[22].

5Path Optimization Example

In synchronous designs,one of the important design objectives is to carefully balance the computation blocks so that no part of the circuits are idle while other parts are busy because the clock period is determined by the worst-case delay of all the computation blocks.However,asynchronous designs can proceed immediately upon receipt of a completion signal,so it is often desirable to create highly unbalanced asynchronous circuits,where one path has been optimized,possibly at the expense of others,to optimize overall system performance.

To illustrate this point and also provide a nice circuit example, we consider a hypothetical problem posed by Ivan Sutherland[18]. When the circuit in?gure5detects a transition on(request),if is high,it signals on2to start an expensive operation“C”,then signals completion on when it receives2from the operation; otherwise,it does nothing except signal as quickly as possible. It is quite likely in this situation that the designer would want the minimum possible latency from to in this case to maximize overall system performance.The implementation shown does not do a very good job of minimizing this latency,because there would be a signi?cant delay from to1for most implementations along with an additional delay through the“merge”element,which is an XOR gate.

This controller uses the2-phase signaling,i.e.,every transition of a signal is considered as a request or acknowledge.We have included the extended-burst-mode speci?cation in?gure5.If

is sampled high when(request)toggles,the controller toggles2, signaling the block C to begin a computation.When C?nishes the computation,it toggles2;the controller then toggles(acknowl-edge).However,if is sampled low when toggles,then the controller toggles directly.

The result of applying our synthesis method from[23]turned out to be remarkably similar to the naive design at the top of?gure5, which we found disappointing.Our hand designs were better,but also unsatisfactory.However,using this new BDD-based approach, we were able to produce an extremely good result:the latency from to was just the delay from the select input to the output of a single multiplexor.The solution was generated by building a BDD so that the decision variable is placed as close to the output

Abstract Performance-driven Synthesis of Asynchronous Controllers

Abstract Performance-driven Synthesis of Asynchronous Controllers

Figure5:Select/Merge Example.

as possible,while satisfying other requirements to keep the circuit hazard-free.The?nal implementation is shown in?gure6.

It would be very easy to generalize this idea by allowing the user to specify a set of particular paths to optimize,in order of priority.The variable ordering in the BDD could be chosen to put the high-priority inputs as close as possible to the output,subject to the correctness constraints imposed by TSO,etc.

6Experiments

We modi?ed the3D synthesis tool described in[23],in particular, the hazard-free state assignment and combinational synthesis steps. We used this modi?ed tool in conjunction with the combinational synthesis tool[8]to perform experiments(see table1)on many examples previously synthesized by the method described in[23]. With very modest efforts to?nd the optimal variable order(we tried a few random orderings and picked the best result),most of the examples required less area than the previous method,primar-ily because of the reduction in the number of state variables due to simpler state assignment.For a minority of the examples,area increased somewhat.We believe that the area results will be fur-ther improved with the development of heuristic variable ordering algorithms tuned to our application.

A more important issue is output latency.Out of39examples synthesized,24of them(the names with*)previously required state variable changes before output changes for some of the speci?ed transitions.In these cases,using BDD-based synthesis rather than 2-level synthesis improved the output latency by100%or

Abstract Performance-driven Synthesis of Asynchronous Controllers

more. References

[1]V.Akella and G.Gopalakrishnan.SHILPA:A high-level

Figure6:Select/Merge Implementation.

synthesis system for self-timed circuits.In ICCAD-92,pp 587–591.

[2]P.Beerel and T.Meng.Automatic gate-level synthesis of

speed-independent circuits.In ICCAD-92,pp581–586. [3] E.Brunvand and R.F.Sproull.Translating concurrent pro-

grams into delay-insensitive circuits.In ICCAD-89,pp262–265.

[4]R.E.Bryant.Graph-based algorithms for boolean func-

tion manipulation.IEEE Transactions on Computers,C-35(8):677-691,August1986.

[5]T.-A.Chu.Synthesis of self-timed VLSI circuits from graph-

theoretic speci?cations.Technical Report MIT-LCS-TR-393, MIT,1987.Ph.D.Thesis.

[6] A.Davis,W.Coates,and K.Stevens.The Post Of?ce ex-

perience:designing a large asynchronous chip.In HICSS, V olume I,pp409–418,January1993.

[7]http://www.wendangku.net/doc/91c7400f844769eae009ed1d.htmlvagno and A.Sangiovanni-Vincentelli.Algorithms for

synthesis and testing of hazard-free asynchronous circuits.

Kluwer Academic,1993.

[8] B.Lin and S.Devadas.Synthesis of Hazard-Free Multi-level

Logic Implementations under Multiple-Input Changes from Binary Decision Diagrams.In This Proceedings.

[9] A.J.Martin.Programming in VLSI:From communi-

cating processes to delay-insensitive VLSI circuits.In

C.A.R.Hoare,editor,UT Year of Programming Institute

on Concurrent Programming,Addison-Wesley,1990. [10]Teresa H.Meng.Synchronization Design for Digital Systems.

Kluwer Academic,1990.

[11] C.E.Molnar,T.-P.Fang,and F.U.Rosenberger.Synthesis

of delay-insensitive modules.In Henry Fuchs,editor,1985 Chapel Hill Conference on Very Large Scale Integration,pp 67–86.CSP,Inc.,1985.

[12] C.W.Moon,P.R.Stephan,and R.K.Brayton.Speci?ca-

tion,synthesis,and veri?cation of hazard-free asynchronous circuits.In ICCAD-91,pp322–325.

[13] C.Myers and T.Meng.Synthesis of timed asynchronous

circuits.In ICCD-92,pp279–284.

Speci?cation State vars Total

States/Primary added literals

Transitions In Out2-level BDD2-level BDD iccad93ex*342220206 edac93ex*4532213220 condtest*4532213023 dff1*4622202812 dff2*4622202812 sr2*81223328255 sr2x2*8203342131130 q424422112724 select2ph*4822204256 selmerge2ph*81232218947 sin1317343471108 ring-counter8812114564 binary-counter323214339480 binary-counter-co3232153310488 pe-send-ifc1114532290115 pe-rcv-ifc121544328485 dramc121476107176 cache-ctrl38491619117041379 tsend*22307455328583 isend*24327457490887 trcv*16227432175111 ircv*16227432188124 tsend-bm*111464229688 trcv-bm*810643177104 isend-bm*12156433177103 ircv-bm*81064318078 tsend-csm*111464439267 trcv-csm*81064327072 isend-csm*1215643414281 ircv-csm*81064328076 abcs23339733199278 stetson-p13138131433376754 stetson-p2252781244178319 stetson-p38114210167 biu-?fo2dma*11135255125166?focellctrl3322111620 scsi-targ-send*7842335350 scsi-init-send*7842223137 scsi-init-rcv-sync4531112016

Table1:Experimental Results.

[14]S.Nowick,M.Dean,D.Dill,and M.Horowitz.The design

of a high-performance cache controller:a case study in asyn-chronous synthesis.In HICSS,volume I,pp419–427,January 1993.

[15]S.M.Nowick and B.Coates.Automated design of high-

performance unclocked state machines.In ICCD-94.

[16]Steven M.Nowick.Automatic synthesis of burst-mode asyn-

chronous controllers.Ph.D.Thesis,Stanford University,1993.

[17]S.M.Nowick,K.Y.Yun,and D.L.Dill.Practical asyn-

chronous controller design.In ICCD-92,pp341–345.

[18]Ivan Sutherland,1994.Private communication.

[19]Stephen H.Unger.Asynchronous Sequential Switching Cir-

cuits.Wiley-Interscience,1969.[20]Peter Vanbekbergen.Synthesis of asynchronous controllers

from graph-theoretic speci?cations.Ph.D.Thesis,Interuni-versitair Micro-Elektronica Centrum,1993.

[21] C.Ykman-Couvreur,B.Lin,G.Goossens,and H.De Man.

Synthesis and optimization of asynchronous controllers based on extended lock graph theory.In EDAC-93,pp512–517. [22]Kenneth Y.Yun and David L.Dill.Automatic synthesis of

3D asynchronous state machines.In ICCAD-92,pp576–580.

[23]Kenneth Y.Yun and David L.Dill.Unifying Syn-

chronous/Asynchronous State Machine Synthesis.In ICCAD-93,pp255–260.