# CCIT Report #585 April 2006

**Delay-Optimal Ordering of Wires in Interconnect Channels** Konstantin Moiseev, Shmuel Wimer and Avinoam Kolodny

# ABSTRACT

The problem of ordering and sizing parallel wires residing in a single metal layer within an interconnect channel is addressed in this paper. Wires are ordered such that cross-capacitances between neighboring wires are optimally shared for circuit delay minimization. Using an Elmore delay model including cross capacitances, an optimal wire ordering is uniquely determined, such that average signal delay can be minimized by proper allocation of inter-wire spaces. For uniform-width wires, the optimal order depends on the size of drivers, and is independent of size of receivers. The optimal order corresponds to minimal differences between driver resistances of neighboring wires. This result applies to most practical VLSI design scenarios. The problem of simultaneously ordering and optimizing variablewidth wires is addressed also. The same ordering method is shown to be advantageous for minimizing the critical wire delay in most problem instances. Examples for 65-nanometer technology are analyzed and discussed.

#### Keywords

routing, wire ordering, wire spacing.

#### **1. INTRODUCTION**

Cross-capacitances between wires in interconnect structures have a major effect on circuit timing. The importance of this effect grows with technology scaling [1], [2]. In this paper, delays in a bundle of parallel wires (with different drivers and loads) are minimized by choosing an optimal ordering of the nets, in addition to optimal allocation of wire widths and inter-wire spaces. The total width of the structure is a given constraint. Ordinary delay minimization ignores the net ordering degree of freedom and treats the order of signals in the channel as given. A brute-force approach to determine the best ordering is to generate all wire permutations, and for each permutation solve the wire-width and space optimization problem [25]. This approach, however, is computationally infeasible for bundles exceeding just a few wires. The existence of a typical optimal wire ordering, which ensures that wire sizing and spacing yields best circuit timing, is proven in this paper. This optimal order is applicable for most VLSI circuit design scenarios. Its derivation is straight forward and requires only simple sorting. Heuristics for solving the most general cases of this problem are described and evaluated.

Net-ordering for delay optimization has not been addressed in previous works. For a fixed order of wires, the problem of allocating widths and spaces to maximize performance in tuning of bus structures was proposed in [3] and solved in [25]. The wire sizing problem has been addressed in [4] and [5] for a single net. Sizing and spacing multiple nets with consideration of coupling capacitance has been addressed in [6] for general interconnect layouts by converting cross capacitance to effective fringe capacitance. Coupling capacitance has been addressed explicitly in the context of physical design for minimizing crosstalk noise [7],[8] or dynamic power [9]. Some authors treated wire sizing for throughput optimization in buses using uniform wire widths and spaces [21], [26], [27]. Several variants of netreordering have been applied for improving layout efficiency [13], and for noise reduction [8], [14] [15] [16] [17]. Swapping of wires for power reduction was applied in [18]. Vittal et al. [14] have suggested without proof to reduce crosstalk noise by sorting wires in order of driver strength, which is closely related to our result in delay minimization.

# 2. PROBLEM FORMULATION

Consider a channel of *n* signal nets  $\sigma_0, ..., \sigma_{n-1}$  between two side-walls (wires at fixed locations, connected to  $V_{\alpha}$  or  $V_{ss}$ ) as shown in Fig. 1.  $S_i$  and  $S_{i+1}$ , respectively, denote spaces to the lower and upper neighbors of wire  $W_i$ . The length of each wire is L. The sum of wire widths and spaces between the lower and upper side walls is given in the following constraint (1), which represents the total width A of the available area for laying out the bundle of wires. Throughout the rest of this paper the terms left (right) and lower (upper) are used interchangeably for the sake of convenience.

$$g\left(\overline{W},\overline{S}\right) = \sum_{j=0}^{n-1} W_i + \sum_{j=0}^n S_i = A \tag{1}$$



Figure 1: Signal drivers (modeled as voltage sources with series resistances), interconnect channel wires of length L, and receivers (modeled as load capacitances). Timing optimization is performed by reordering the signal wires and by allocating wire widths and spaces, for a given constrained channel width A.

Signal delays are expressed by an Elmore model using simple approximations for wire capacitances and wire resistance. The delay of signal  $\sigma_i$  was derived in [25] and is given by

$$\Delta_i\left(\overline{W},\overline{S}\right) = a + R_i\left(kW_i + g + C_i\right) + \frac{b + eC_i}{W_i} + \left(hR_i + \frac{d}{W_i}\right)\left(\frac{1}{S_i} + \frac{1}{S_{i+1}}\right)(2)$$

The coefficients a, b, d, e, k, g, h are technology dependent parameters. This model includes effects of wire resistance (inversely proportional to wire width  $W_i$ ) and effects of wire capacitance terms (area capacitance is proportional to  $W_i$ , cross-capacitances to neighboring wires are inversely proportional to spaces  $S_i$  and  $S_{i+1}$ ). Miller coupling factors can be included in the last term to account for worst-case crosstalk effect on delay.

Let  $\pi \in \Pi$  denote an order (permutation) of the signals in the interconnect channel, taken from the set of all n! possible orders. We are seeking an order  $\pi^*$  of the channel signals, that (after wire width and space allocation) yields the minimum average wire delay, i.e minimum total sum of delays given in (3):

$$f(\pi, \overline{W}, \overline{S}) = \sum_{i=0}^{n-1} \Delta_i(\pi, \overline{W}, \overline{S}) = \sum_{i=0}^{n-1} \left\{ a + R_i(kW_i + g + C_i) + \frac{b + eC_i}{W_i} + \left(hR_i + \frac{d}{W_i}\right) \left(\frac{1}{S_i} + \frac{1}{S_{i+1}}\right) \right\}.$$
(3)

This objective function is mathematically convenient because it is differentiable, and it is also a useful performance metric in industrial practice [25].

Assume for the moment that the order  $\pi$  of the signals in the channel is given. Then, in order to minimize (3) subject to (1) we differentiate f and g by all of their sizing variables:

$$\frac{\partial f}{\partial W_{i}} = kR_{i} - \frac{eC_{i} + b}{W_{i}^{2}} - \frac{d}{W_{i}^{2}} \left(\frac{1}{S_{i}} + \frac{1}{S_{i+1}}\right), 0 \le i \le n-1$$

$$\frac{\partial f}{\partial S_{i}} = -\frac{1}{S_{i}^{2}} \left[h(R_{i-1} + R_{i}) + d\left(\frac{1}{W_{i-1}} + \frac{1}{W_{i}}\right)\right], 0 \le i \le n-1$$

$$\frac{\partial f}{\partial S_{0}} = -\frac{1}{S_{0}^{2}} \left[hR_{0} + \frac{d}{W_{0}}\right]$$

$$\frac{\partial f}{\partial S_{n}} = -\frac{1}{S_{n}^{2}} \left[hR_{n-1} + \frac{d}{W_{n-1}}\right]$$

$$(4)$$

$$\frac{\partial g}{\partial W_i} = 1, 0 \le i \le n - 1$$
$$\frac{\partial g}{\partial S_i} = 1, 0 \le i \le n$$

At the minimum there exists some real number  $\lambda$  (Lagrange multiplier), satisfying  $\nabla f = \lambda \nabla g$ . Rearranging and substituting yields the following:

$$\lambda = kR_{i} - \frac{eC_{i} + b}{W_{i}^{2}} - \frac{d}{W_{i}^{2}} \left(\frac{1}{S_{i}} + \frac{1}{S_{i+1}}\right), 0 \le i \le n - 1$$

$$\lambda = -\frac{1}{S_{i}^{2}} \left[h(R_{i-1} + R_{i}) + d\left(\frac{1}{W_{i-1}} + \frac{1}{W_{i}}\right)\right], 0 < i \le n - 1$$

$$\lambda = -\frac{1}{S_{0}^{2}} \left[hR_{0} + \frac{d}{W_{0}}\right]$$

$$\lambda = -\frac{1}{S_{n}^{2}} \left[hR_{n-1} + \frac{d}{W_{n-1}}\right]$$
(5)

The above equations and the area constraint equation (1) impose 2n+2 algebraic equations in 2n+2 unknown variables  $\lambda, W_0 \cdots W_{n-1}, S_0 \cdots S_n$ . Solving and substituting into (3) produces minimal total sum of signal delays for the assumed order  $\pi$ .

The order of wires affects the sum of delays primarily because every pair of adjacent signal drivers is associated with a shared cross-capacitance between the wires. It makes sense to allocate large spaces to a wire driven by a weak driver, in order to reduce the driver's load. Strong drivers can cope with large cross-capacitances corresponding to narrow spaces. Consequently, in order to best utilize the total area given for the channel, weak drivers should share the same large spacing. Thus, a signal with a weak driver can benefit from the large space allocated anyway to its neighbor, which also has a weak driver, and vice versa. Similarly, strong drivers can share a small inter-wire space. The space sharing idea is illustrated in Fig. 2. There, the channel is comprised of some signals with weak drivers (W) and some with strong drivers (S). The ordering in Fig. 2(b) is superior to Fig. 2(a), which is apparently the worst. Wire sizing and spacing optimization aiming at minimizing the total sum of delays will yield smaller (better) delays for configuration 2(b), in comparison with 2(a).



Figure 2: Space sharing in two interconnect channel configurations. a) Interleaved placement of strong and weak drivers, b) Sorted placement of signals according to driver strength.

Consider now  $\pi \in \Pi$  as variable, and find which order yields the minimum total sum of delays after wire sizing and spacing for each ordering, as

discussed in the previous section. One needs therefore to solve the following problem:

Minimize 
$$f(\pi, \overline{W}, \overline{S})$$
, subject to  $\pi \in \Pi$  and constraint (1)  

$$\begin{cases}
\min_{\pi \in \Pi} f(\pi, \overline{W}, \overline{S}); \\
\sum_{j=0}^{n-1} W_j + \sum_{j=0}^n S_j = A
\end{cases}$$
(6)

In this formulation, both signal ordering and wire sizing are considered simultaneously. The main result of this paper is that the signal ordering can typically be solved independently of the wire sizing. Moreover, the optimal order can be derived directly from the setting of the given problem, by positioning the wires according to the effective resistances of their drivers, in a characteristic form of *symmetric hill*. It is shown below that the independence of sizing and ordering, and the optimality of symmetric hill ordering are valid for most problem instances which occur in practical circuit design.

# 3. OPTIMALITY OF SYMMETRIC HILL ORDER

#### 3.1 Wires of uniform width

For the sake of clarity, assume first that all the wires have the same width W while spaces can vary among wires. Hence, wire sizing means finding the optimal W and allocating optimal spaces between wires. For any order  $\pi \in \Pi$ , minimizing the total sum of delays involves only n+2 variables  $(W, S_0, \dots, S_n)$ . The following conditions are necessary for optimum:

$$\partial f / \partial S_i + \lambda \, \partial g / \partial S_i = 0, 0 \le i \le n \tag{7}$$

$$\partial f / \partial S_i = -2d / W S_i^2 - h (R_{i-1} + R_i) / S_i^2 = 0, 0 < i < n$$
(8)

$$\partial f / \partial S_0 = -d / W S_0^2 - h R_0 / S_0^2$$
  

$$\partial f / \partial S_n = -d / W S_n^2 - h R_n / S_n^2$$
  

$$\partial g / \partial S_i = 1, 0 \le i \le n$$
(9)

Substitution of (9) and (8) into (7) yields  

$$\lambda = 1/S_i^2 (2d/W + h(R_{i-1} + R_i)), 0 < i < n \qquad (10)$$

$$\lambda = 1/S_0^2 (d/W + hR_0)$$

$$\lambda = 1/S_n^2 (d/W + hR_{n-1})$$

From (10) we obtain the following expressions for spaces at the optimum:  $S_{i} = \sqrt{1/\lambda \left( \frac{2d}{W} + h(R_{i} + R_{i-1}) \right)}, 0 < i < n \qquad (11)$   $S_{0} = \sqrt{1/\lambda \left( \frac{d}{W} + hR_{0} \right)}$   $S_{n} = \sqrt{1/\lambda \left( \frac{d}{W} + hR_{n-1} \right)}$ 

These optimal spaces depend on resistances of the signal drivers, but are independent of capacitive loads. Substitution of (11) into (1) yields the following expression for  $\lambda$ 

$$\lambda = \frac{1}{A - nW} \left( \sum_{i=1}^{n-1} \sqrt{\frac{2d}{W} + h(R_{i-1} + R_i)} + \sqrt{\frac{d}{W} + hR_0} + \sqrt{\frac{d}{W} + hR_{n-1}} \right)^2. (12)$$

Further substitution into (3) produces the following expression for the minimal total sum of delays.

Equation (13) consists of two terms  $f_1$  and  $f_2$ . The first term  $f_1$  is invariant for different orders of signals. In the second term  $f_2$ , the indices of adjacent signals interact with each other in square root terms, thus making  $f_2$  dependent on the order of signals in the channel. The physical reason for this is that cross capacitance between adjacent wires is determined by the space they share with each other.

What is the order  $\pi \in \Pi$  that minimizes  $f_2$ ? The interaction of adjacent signals (indices) appears in a sum of successive square roots of driver resistance. This hints that the sought order will enforce the two resistances under each square root to be as close as possible to each other. Notice also that  $f_2$  contains separate square roots with driver resistance of the leftmost and rightmost signals. Therefore, it makes sense to position the wires with lowest driver resistances (strongest drivers) at the left and right ends of the channel. As proven below, symmetric hill ordering captures the above reasoning, and indeed yields the minimum in average wire delay.

**Definition:** *Symmetric hill ordering.* Given a sequence  $(R_i)_{i=1}^n$  of positive real numbers, assume without loss of generality that they are increasingly ordered, e.g.,  $R_1 \le R_2 \le \cdots \le R_{n-1} \le R_n$ . Split this sequence into odd and even subsequences  $R_1 \le R_3 \le \cdots \le R_{2k-1} \le R_{2k+1} \le \cdots$  and  $R_2 \le R_4 \le \cdots \le R_{2k} \le R_{2k+2} \le \cdots$ . Reverse the order of numbers in the even subsequence, thus turning it into monotonic decreasing sequence. Finally, concatenate the odd and the modified (reversed) even subsequences into one sequence. The order of numbers in the new sequence thus obtained is called *Symmetric hill ordering* as it resembles climbing descending a symmetric hill. Fig. 3 demonstrates the described process. For proving the optimality of symmetric hill ordering, a few more definitions and properties are introduced.



**Definition:** successive roots sum. Let  $(R_0, \dots, R_{n-1})$  be a sequence of positive real numbers (driver's resistance) and let  $\varphi(R)$  be a positive, non decreasing function of driver resistance. The sum of roots of successive numbers of the sequence  $\sqrt{\varphi(R_0)} + \sum_{i=1}^{i=n-1} \sqrt{\varphi(R_{i-1}) + \varphi(R_i)} + \sqrt{\varphi(R_{n-1})}$  is called successive roots sum.

The term  $f_2$  in (13) is a successive roots sum where  $\varphi(R) = d/W + hR$ . We'll show in the following that *successive roots sum* gets its minimum when the sequence of R values is in symmetrical hill order.

**Property:** *pair swapping*. Let  $(R_i, R_j, R_k, R_l)$  be a quadruple of real positive numbers, such that the internal numbers satisfy  $R_j \ge R_k$ , while the external numbers satisfy  $R_i \le R_l$ . There exists then :

$$\sqrt{R_i + R_j} + \sqrt{R_j + R_k} + \sqrt{R_k + R_l} \ge \sqrt{R_i + R_k} + \sqrt{R_k + R_j} + \sqrt{R_j + R_l}$$
(14)

The above inequality states that swapping the numbers of the internal pair  $(R_j, R_k)$  such that their relation will be the same as that of the external pair will yield smaller successive roots sum.

**Proof:** In (14) the middle terms on both sides are equal; therefore it is sufficient to prove that  $\sqrt{R_i + R_j} + \sqrt{R_k + R_l} \ge \sqrt{R_i + R_k} + \sqrt{R_j + R_l}$ . Squaring the two sides one needs to show that  $(R_i + R_j) \times (R_k + R_l) \ge (R_i + R_k) \times (R_j + R_l)$ . Expanding both sides, it is left to show that  $(R_i - R_l) \times (R_k - R_j) \ge 0$ , which indeed follows from the assumption on the relations of the internal and external pairs. •

The pair swapping property exists also when instead of R we consider  $\varphi(R)$ , where function  $\varphi$  is positive non-increasing. Just substitute  $\rho = \varphi(R)$ , and all the above arguments hold for  $\rho$ . **Property:** *optimal insertion.* Let  $(R_0, \dots, R_{n-1})$  be a sequence of positive real numbers ordered as a symmetric hill. Let  $R > \max{R_0, \dots, R_{n-1}}$ . Then the location where inserting R into the sequence minimizes the new successive root sum, is at the center between the two largest numbers. Hence the new sequence is also in symmetric hill order.

**Proof:** Let us insert *R* arbitrarily into the sequence between  $R_i$  and  $R_{i+1}$ , thus resulting in the quadruples  $(R_{i-1}, R_i, R, R_{i+1})$  and  $(R_i, R, R_{i+1}, R_{i+2})$  in the new sequence of n+1 numbers. If  $R_i$  and  $R_{i+1}$  were not the two center numbers of the old sequence (top of the hill), at least one of these quadruples satisfies the condition of pair swapping. Therefore, the successive roots sum of the new sequence can be reduced by appropriate swapping of R with its left or right neighbor. If R is inserted before  $R_1$  or after  $R_n$ , a swap with  $R_1$  (or  $R_n$ ) decreases the sum of successive roots. The only position where the pair swapping condition does not exist is in between the two largest numbers of the old sequence. Such insertion creates a new sequence ordered as a symmetric hill.

The optimal insertion property exists also when instead of R we consider  $\varphi(R)$ , where function  $\varphi$  is positive non-increasing.

**Definition:** *local maximum.* Let  $(R_0, \dots, R_{n-1})$  be a sequence of positive real numbers. The number  $R_j, 0 < j < n-1$ , is called a local maximum of  $(R_0, \dots, R_{n-1})$  if both  $R_j \ge R_{j-1}$  and  $R_j \ge R_{j+1}$ .

**Property:** *local maximum elimination.* Let  $(R_0, \dots, R_{n-1})$  be a sequence of positive real numbers. Let  $(R_i, R_{i+1})$  and  $(R_j, R_{j+1}, R_{j+2})$  be two disjoint subsequences, where  $R_{j+1}$  is a local maximum and  $R_i > R_{j+1} > R_{i+1}$ . Then, moving  $R_{j+1}$  in between  $R_i$  and  $R_{i+1}$  decreases the successive root sum of the sequence.

**Proof:** For the sake of convenience let denote us  $R_i, R_{i+1}, R_i, R_{i+1}, R_{i+2}$  by a, x, y, b, z, respectively. In this notation if a > b > x and b is a local maximum of (y, b, z), we need to show that  $\sqrt{a+x} + \sqrt{y+b} + \sqrt{b+z} > \sqrt{a+b} + \sqrt{b+x} + \sqrt{y+z}$ . From the setting we 0 < x, y, z < b and a > b. Therefore. if know that we define  $g(x, y, z) = \sqrt{a+x} + \sqrt{y+b} + \sqrt{b+z} - \sqrt{a+b} - \sqrt{b+x} - \sqrt{y+z},$ (15)it is sufficient to show that g is positive in the region 0 < x, y, z < b.

The region of definition of g is a box, thus convex, in x - y - z space. Therefore, if we show that g is monotonic and has positive values on the corners of the box 0 < x, y, z < b, then it is positive in the entire box.

Indeed, all the first derivatives of g are non zero in the box.

$$\partial g/\partial x = 1/2\sqrt{a+x} - 1/2\sqrt{b+x} > 0$$
,  $\partial g/\partial y = 1/2\sqrt{b+y} - 1/2\sqrt{y+z} < 0$   
and  $\partial g/\partial z = 1/2\sqrt{b+z} - 1/2\sqrt{y+z} < 0$ , for  $0 < x, y, z < b$ .

It remains to show that g is positive in the eight corners of the box, given at points (0,0,0), (b,0,0), (0,b,0), (0,0,b), (b,b,0), (0,b,b), (b,0,b), (b,b,b). A straightforward substitution in the expression of g in (15) shows its positive value at every corner. •

The local maximum elimination property exists also when instead of R we consider  $\varphi(R)$ , where function  $\varphi$  is positive non-increasing.

Based on the above properties, we are ready to prove the theorem of optimal signal ordering in a channel.

**Theorem 1 (optimal ordering of uniform-width wires):** Let a signal channel have arbitrary drivers, arbitrary capacitive loads and uniform wire width. Then the symmetric hill ordering of the signals in the channel according to driver resistance yields minimum total sum of delays (equivalent to average signal delay).

**Proof:** It was shown in (13) that for any order of the signals, the minimized total sum of delays f consists of two terms  $f_1$  and  $f_2$ . The term  $f_1$  captures the delays resulting from the capacitive loads, a component that is independent of the signal order in the channel. The term  $f_2$  captures the delay contributed by the cross capacitances of the signals, a component which depends on the signal order. It is therefore sufficient to minimize  $f_2$ .

Let  $\pi^* = (R_0, \dots, R_{n-1})$  be the drivers' resistance symmetric hill order of the channel, and denote by  $f_2(\pi^*)$  the corresponding term in the minimized total sum of delays obtained. We'll show by induction that for any other order  $\pi$  of driver resistances  $f_2(\pi^*) \leq f_2(\pi)$ .

For a channel comprised of one or two signals the induction hypothesis trivially exists. For a channel of three signals, the optimality of symmetric hill ordering follows from the optimal insertion property. Put the two smaller driver resistances, say  $R_{\alpha}$  and  $R_{\beta}$  in the channel first. Then, the optimal insertion property dictates the location of  $R_{\gamma}$  at the center, thus resulting in symmetric hill order. If  $R_{\alpha}$  ( $R_{\beta}$ ) and  $R_{\gamma}$  are placed first, a direct calculation shows that  $R_{\beta}$  ( $R_{\alpha}$ ) needs to reside such that  $R_{\gamma}$  is located at the center.

By the induction hypothesis, the symmetric hill order is optimal for any n-1 signals channel. Assume on the contrary that there exists a n signal channel whose optimal order  $\pi'$  is not symmetric hill. It follows from the non optimality of  $\pi^*$  that  $f_2(\pi^*) > f_2(\pi')$ .

Let  $(R_l, R_x, R_r)$  be the center triplet of  $\pi^*$ , namely,  $R_x$  is the largest resistance. There are two possibilities: triplet  $(R_l, R_x, R_r)$  exists or doesn't exist in  $\pi'$ .

If it exists, let us delete  $R_x$  from both  $\pi'$  and  $\pi^*$ , thus inducing channels of n-1 signals  $\pi'^{,n-1}$  and  $\pi^{*,n-1}$ . The first is not symmetrically hill ordered, while the second is. It follows from the induction hypothesis that  $f_2(\pi^{*,n-1}) < f_2(\pi'^{,n-1})$ . However, the magnitude of the difference in  $f_2$  between the *n* signal channel and its n-1 signal channel induced by  $R_x$  deletion is the same for  $\pi'$  and  $\pi^*$  and equals to  $\Delta = \sqrt{2d/W + h(R_l + R_x)} + \sqrt{2d/W + h(R_l + R_x)} - \sqrt{2d/W + h(R_l + R_r)}$ , where *W* is the uniform wire width and *d* and *h* are technology parameters. Therefore  $f_2(\pi^*) = f_2(\pi^{*,n-1}) + \Delta < f_2(\pi'^{,n-1}) + \Delta = f_2(\pi')$ ,

This is a contradiction to  $f_2(\pi^*) > f_2(\pi')$  that followed from the non optimality hypothesis of  $\pi^*$ .

Consider now the case where triplet  $(R_l, R_x, R_r)$  doesn't exist in  $\pi'$ . Then there are two possibilities. In the first, the triplet appears in  $\pi'$  as a subsequence  $(R_x, \max(R_l, R_r), \min(R_l, R_r))$ . The pair swapping property can be applied on the quadruple  $((R, R_x, \max(R_l, R_l), \min(R_l, R_l)))$ , hence  $f_2$  can be reduced. In the second possibility, in any order at least one of  $R_l$  and  $R_r$  is a local maximum in  $\pi'$ , say  $R_l$ . Then applying the local maximum elimination property to  $R_1$  and moving it to be adjacent to  $R_x$ , will decrease  $f_2$  value of the newly created order. This again contradicts the optimality assumption of  $\pi'$ .

The theorem states that the minimization of total sum of delays in a channel whose wires have uniform widths yields the best (minimal) result if the signals are ordered in a symmetric hill according to the effective resistances of their drivers. Notice that although wire width W is uniform, it is still a variable and should be optimally set together with the spaces  $(S_0, \dots, S_n)$  between the wires in order to minimize the total sum of delays. This is a simplification of the total sum of delays minimization problem [25], where individual wires can have different widths  $(W_1, \dots, W_n)$ .

#### 3.2 Non-uniform wire widths

In the following we'll prove the optimality of the symmetric hill ordering for a more general cases with non-uniform wire width. First we assume that wire widths are matched to driver strengths. It is shown below that this dependency is common in most practical VLSI designs, and minimal total sum of delays (equivalent to average signal delay) is obtained by symmetric hill ordering. Next, we discuss the most general case where wire width can be varied arbitrarily.

# 3.2.1 <u>Wire widths matched to driver strengths</u>

Let  $\psi(R)$  be a positive, non decreasing function of the driver resistance R, and let the corresponding wire width be defined by

$$W = 1/\psi(R). \tag{16}$$

In the former discussion of uniform wire width  $\psi(R)$  was simply a constant. The relation in (16) represents impedance matching, where a stronger driver (smaller R) is assigned a wider wire with a lower impedance.

Substitution of (16) into (10) yields the following relations between the wire spacing and their corresponding driver resistance.

$$S_{i} = \sqrt{\frac{1}{\lambda} \left( d\left(\psi\left(R_{i-1}\right) + \psi\left(R_{i}\right)\right) + h\left(R_{i-1} + R_{i}\right) \right)}, 0 < i < n$$
(17)  
$$S_{0} = \sqrt{\frac{1}{\lambda} \left( d\psi\left(R_{0}\right) + hR_{0} \right)}$$
  
$$S_{n} = \sqrt{\frac{1}{\lambda} \left( d\psi\left(R_{n-1}\right) + hR_{n-1} \right)}$$

The corresponding total sum of delays can again be split into order independent and order dependent terms as below.

$$f = na + b \sum_{i=0}^{n-1} \psi(R_i) + k \sum_{i=0}^{n-1} \frac{R_i}{\psi(R_i)} + g \sum_{i=0}^{n-1} R_i + e \sum_{i=0}^{n-1} \psi(R_i) C_i + \sum_{i=0}^{n-1} R_i C_i +$$
(18)

$$\underbrace{\frac{1}{A-n\sum_{i=0}^{n-1}\frac{1}{\psi(R_i)}}\left[\sum_{i=0}^{n-2}\sqrt{d(\psi(R_i)+\psi(R_{i+1}))+h(R_i+R_{i+1})}+\sqrt{d\psi(R_0)+hR_0}+\sqrt{d\psi(R_{n-1})+hR_{n-1}}\right]^2}.$$

The term  $f_1$  in (18) captures the capacitive load driven by the signal, which is independent of the signals order. The term  $f_2$  captures the cross capacitance between signals, and it depends on their order. The term  $f_2$  satisfies the definition of successive root sums, and all three properties are valid. The optimal signal ordering theorem can now be extended as follows.

**Theorem 2 (optimal ordering of variable-width wires):** Let a signal channel have arbitrary drivers, arbitrary capacitive loads and wire width inversely proportional to the corresponding driver resistance. Then the symmetric hill ordering of the signals in the channel according to driver resistances yields minimum total sum of delays.

**Proof:** All properties of symmetric hill order still hold when instead of *R* we consider positive non-increasing function  $\varphi(R)$ .

The function  $\psi(R) = \alpha + \beta R$ , where  $\alpha$  and  $\beta$  are real positive number is admissible, providing further minimization in comparison with the case of uniform width. The minimum total sum of delays is obtained by first ordering the signals according to Theorem 2. Then a minimization of total sum of delays for that order takes place, where the wire spacing  $(S_0, \dots, S_n)$  and the parameters  $\alpha$  and  $\beta$  are the optimization variables. Notice that  $\beta = 0$  is the case of uniform wire width.

#### 3.2.2 Further Generalization of wire widths

Assume now that wire width can vary arbitrarily. It is no longer true that symmetric hill ordering yields the minimum total sum of delays. This general case might be caused by large capacitive loads, since the optimal setting of wire width depends on the corresponding load. This in turn affects the optimal order within the channel.

The most general problem setting such that symmetric hill ordering still yields minimal total sum of delays can be characterized by writing the relation between wire widths and driver resistances at minimum total sum of delays. At the minimum, equations (1) and (4) satisfy:

$$\frac{\partial f}{\partial W_i} + \lambda \frac{\partial g}{\partial W_i} = 0, 0 \le i \le n - 1.$$
(19)

Differentiating (1) and (4) we obtain

$$\frac{\partial f}{\partial W_i} = -\frac{1}{W_i} \left( b + \frac{d}{S_i} + \frac{d}{S_{i+1}} + eC_i \right) + kR_i \quad ; \quad \frac{\partial g}{\partial W_i} = 1$$
(20)

Substituting (20) into (19) yields

$$W_{i} = \sqrt{\frac{b + \frac{d}{S_{i}} + \frac{d}{S_{i+1}} + eC_{i}}{\lambda + kR_{i}}}$$
(21)

Equation (21) describes the dependence of wire width at minimum total sum of delays on the corresponding driver resistance, spacing to neighbor wires, and the capacitive load driven by the wire. Whenever equation (21) implies a monotonic non-increasing relation between wire width and driver resistance for all signals in the channel, symmetric hill order must yield the minimum total sum of delays among all possible orders.

Driver resistance appears in the denominator of (21), yielding a non increasing relation. For the term  $d/S_i + d/S_{i+1}$  at the nominator it has been

shown in (17) that the spaces are monotonically increasing with driver resistance, which also imposes a non increasing relation between  $W_i$  and  $R_i$ . The only remaining term is the capacitive load at the nominator. In order to obtain a monotonic relation in (21) we impose the following condition on resistance of drivers and their corresponding capacitive loads.

**Theorem 3:** Let a *n* signal channel have arbitrary drivers and capacitive loads. Let  $\sigma_i, \sigma_j, 0 \le i, j \le n-1$  be any two signals and let  $(R_i, C_i)$  and  $(R_j, C_j)$  be their driver resistance and capacitive load, respectively. If the relation  $R_i \ge R_j$  implies  $C_i \le C_j$  and vice versa, then the symmetric hill order yields minimum total sum of delays among all orders.

**Proof:** If follows from equation (21) that wire widths are non increasing functions of driver resistance. Therefore theorem 2 is satisfied.

#### 4. IMPLICATIONS FOR CIRCUIT DESIGN

The above results are applicable to layout optimization for circuit performance improvement, by net ordering combined with wire sizing and spacing in interconnect channels. This section discusses two variants of the performance optimization problem, namely critical signal delay minimization, and impact of worst-case crosstalk effect on delays.

## 4.1.1 <u>Critical signal delay minimization (MinMax problem)</u>

Optimization of the interconnect layout as described above yields a distribution of delays within the bundle of wires which minimizes the

average signal delay. It is possible to perform further tuning of the layout by wire sizing and spacing, such that the worst wire delay in the distribution will be reduced (MinMax optimization problem). Such improvement is achieved by allocating more area to this wire, hence other wires become slower. Applying this tuning iteratively, the delays of all wires must be equal [25]. Typically, a slight improvement in the delay of the slowest wire is gained at the expense of significant increase in the delays of all the other signals, until they all become equally critical. In industrial design practice, tuning each interconnect channel layout until all wire delays are completely balanced is not a desirable approach [11][12]. Optimization of average wire delay (which is equivalent to maximization of average slack) is more common [25].

Despite the limited practicality, it is still interesting to find which wire ordering leads to the best possible worst wire delay for MinMax delay optimization. Formally, f in (3) is replaced by

$$f = \max_{0 \le i \le n-1} \left\{ \left( a + \frac{b}{W_i} + \frac{d}{W_i S_i} + \frac{d}{W_i S_{i+1}} + \frac{eC_i}{W_i} + kR_i W_i + gR_i + \frac{hR_i}{S_i} + \frac{hR_i}{S_{i+1}} + R_i C_i \right) \right\}$$

Given signal order, an iterative method to find MinMax optimum can be used [25]. The following observations have been made with regard to MinMax delay optimization, based on extensive numerical experimentation:

 Symmetric hill ordering is not necessarily optimal for the MinMax delay problem. When there are some excessively large load capacitances, or when there are only small differences among driver resistances, then the optimal order is most likely different from symmetric hill.

- Symmetric hill order is optimal for many practical instances of the MinMax delay problem. The conditions which guarantee this property for a particular problem instance remain an open question. However, symmetric hill ordering can be used as a useful heuristic.
- 3. The optimal MinMax delay is quite sensitive to net ordering. When random channel layouts are optimized heuristically by symmetric hill ordering (followed by wire sizing and spacing for MinMax delay), the percentage of delay improvement often exceeds the improvement gained in optimizing for average wire delay.

#### 4.1.2 Impact of non-uniform crosstalk assumption

A worst-case crosstalk effect on delay occurs if adjacent wires make simultaneous logic transitions in opposite directions [19][20][22][24]. Although this is pessimistic, designers often assume such a worst case condition in delay modeling. This is done by multiplying cross capacitances between active wires by a Miller coupling factor (usually a factor of 2 is assumed [10]). Since sidewall shielding wires are inactive, they don't induce a Miller effect. Consequently, the delay expression (2) should be modified for worst-case crosstalk, such that all the internal 1/S terms are multiplied by 2, except the terms  $1/S_0$  and  $1/S_n$ , which are left unchanged. Note that the sidewall wires appear in the equations as if they were signals with zero-resistance drivers. In symmetric hill order without assuming worst-case crosstalk, the wires with strongest (lowest-resistance) drivers are placed next to the sidewalls. The outermost spaces  $S_0$  and  $S_n$  tend to be smaller than others, since they are not shared by two signals and the strong drivers can handle their capacitances.

However, when the problem is modified by Miller coupling factors for worst-case crosstalk, the sidewall spaces have lower effective capacitances, and they are most appropriate for the weakest drivers (which are the most sensitive to crosstalk [23]). Consequently, the preferred wire ordering can become a "symmetric valley" instead of a "symmetric hill": the wires with weakest drivers are placed near the shields and the other wires are sorted such that the drivers with smallest resistance are in the middle of the channel. The resistances are monotonically increasing from the middle towards the sides, thus minimizing the differences between neighboring drivers.

## 5. EXPERIMENTAL RESULTS

Numerical experiments for various problem instances were performed using 65 nanometer technology parameters. Continuous optimization has been used, and results were verified for allowed discrete sizes as required by the technology. Delay improvements were verified by SPICE simulations of several circuits before and after optimization.

#### Experiment 1

Random problem instances using five signals were evaluated as follows: Each signal was assigned a driver randomly. The range of driver resistances was 50  $\Omega$  to 3  $K\Omega$ , and load capacitances in the range 10-200fF were assigned accordingly, to avoid excessive driver loading, such that the conditions of theorem 3 were always satisfied. For each problem the wire widths and spaces were optimized once to yield minimum total sum of delays, and again to yield minimum worst-wire delay (MinMax). This was repeated for all the 5!=120 possible orders. The procedure was done for eight different channel widths A – 1.5, 2, 2.5, 3, 3.5, 7, 9.5 and  $12 \mu m$ , and five different channel lengths L – 300, 500, 800, 1200 and 1500  $\mu m$ . The optimization impact (% improvement of best versus worst ordering, after width/space optimization, averaged over 20 random problem instances of each width and length configuration) is presented in Table 1. This experiment demonstrates that net ordering can significantly improve results of wire sizing and spacing optimization, especially when width is tightly constrained. All obtained optimal orders for total sum of delays minimization came out as symmetric hills (as expected, since theorem 3 is always satisfied in this example).

## Experiment 2

The impact of net ordering on interconnect channels containing a large number of wires was evaluated, using 15 representative interconnect channels in 65nm technology. The number of signal wires per channel varied from 10 to 128, averaging 49. Width of each channel was determined by budgeting 2 minimal metal pitches per wire. Driver resistances varied from 50  $\Omega$  to 2.5  $K\Omega$ , averaging 1.24  $K\Omega$ . Exhaustive search to find the worst and best ordering is infeasible for such problems. Instead, a poor ordering has been guessed, and the corresponding signal delays were compared with results of symmetric hill ordering. The experiment confirmed that symmetric hill net ordering and sizing optimization, up to 18.3% in average delays were obtained. On average, the interconnect delay improvement in this experiment was 11.8%, which is equivalent to 5% of the clock cycle.

#### Table 1

Average improvement (best vs. worst ordering) for random problem instances, in sum-of-delays (upper half cell) and worst wire delay (lower half cell)

| Channel | Channel width |        |        |        |        |       |        |       |
|---------|---------------|--------|--------|--------|--------|-------|--------|-------|
| length  | 1.5 μm        | 2 µm   | 2.5 μm | 3 µm   | 3.5 µm | 7 μm  | 9.5 μm | 12 µm |
| 300 µm  | 10.14%        | 9.13%  | 8.13%  | 7.25%  | 6.62%  | 3.12% | 2.25%  | 1.97% |
|         | 17.19%        | 14.98% | 12.7%  | 10.86% | 9.84%  | 4.6%  | 2.86%  | 2.13% |
| 500 μm  | 11.31%        | 9.5%   | 8.21%  | 7.46%  | 6.71%  | 3.32% | 2.43%  | 2.14% |
|         | 17.24%        | 15.18% | 13.29% | 10.81% | 9.64%  | 5.13% | 3.07%  | 2.94% |
| 800 µm  | 9.82%         | 8.76%  | 7.79%  | 7.32%  | 6.5%   | 2.47% | 1.92%  | 1.05% |
|         | 16.22%        | 14.11% | 13.08% | 11.09% | 9.98%  | 5.14% | 3.24%  | 1.83% |
| 1200 µm | 8.78%         | 8.23%  | 7.38%  | 6.89%  | 6.35%  | 2.24% | 1.7%   | 1.1%  |
|         | 14.18%        | 14.58% | 13%    | 11.63% | 9.84%  | 5.13% | 2.72%  | 1.51% |
| 1500 µm | 7.63%         | 7.2%   | 6.94%  | 6.54%  | 6.12%  | 2.1%  | 1.81%  | 0.97% |
|         | 14.13%        | 14.02% | 12.97% | 11.51% | 10.15% | 4.99% | 2.62%  | 2%    |

## Experiment 3

This example demonstrates how the set of wire driver resistances influences ordering optimization impact. The effect of signal ordering on MinmAx delay in channels with both strong and weak drivers is shown in Table 2. A bundle of 7 signals with driver-load pairs of (50  $\Omega$ , 50 fF) or (3  $K\Omega$ , 5 fF) is examined for various numbers of the weak drivers. Channel width and length were A=3  $\mu m$  and L=500  $\mu m$ . As can be expected, when the numbers of strong and weak drivers were about equal, signal ordering is most effective. The worst ordering is indeed the interleaved one, described in Figure 2a, while the best one is clearly symmetric hill.

| Table 2                                                                            |
|------------------------------------------------------------------------------------|
| % improvement of best versus worst ordering, after width/space optimization, for a |
| signal channel with two driver strengths                                           |

| No. of weak drivers | Percent of improvement in worst delay |  |  |  |  |
|---------------------|---------------------------------------|--|--|--|--|
| 1                   | 0.11%                                 |  |  |  |  |
| 2                   | 8%                                    |  |  |  |  |
| 3                   | 12.7%                                 |  |  |  |  |
| 4                   | 16.3%                                 |  |  |  |  |
| 5                   | 10.76%                                |  |  |  |  |
| 6                   | 5.25%                                 |  |  |  |  |

#### Experiment 4

This example demonstrates the influence of driver's resistances range on ordering optimization impact. The range of drivers is specified by the ratio  $\frac{R_{\text{max}}}{R_{\text{min}}}$ , where  $R_{\text{max}}$  and  $R_{\text{min}}$  are the largest and the smallest driver resistances in a set of wires being ordered. 19 different 7-wire sets were evaluated, with driver resistances distributed uniformly around a constant average of  $1 K\Omega$ . In these sets,  $\frac{R_{\text{max}}}{R_{\text{min}}}$  varied from 1 (all drivers equal) to 6.4. Channel length is 700  $\mu m$  and width is 3  $\mu m$  in all cases. The results are presented in Figure 4. As can be seen, optimization impact increases with resistance range. Worst wire delay optimization is influenced much more

than optimization of average delay. For larger range of driver resistances the increase in delay improvement is negligible.



Figure 4. Influence of relative range of drivers on optimization impact

#### Experiment 5

The delay improvement by wire ordering is examined for different metal levels. Problem instances of 500  $\mu m$  length and variable widths have been used. Technology parameters were extracted for for 5<sup>th</sup>, 6<sup>th</sup>, 7<sup>th</sup> and 8<sup>th</sup> metal levels in an industrial 65nm process. Results of 20 random minmax optimization problem instances for each {channel width; metal level} combination were calculated. In order to compare different metal levels, channel widths are expressed in terms of the metal pitch  $W_{min} + S_{min}$ . Channel widths thus in each case is  $k(W_{min} + S_{min}) + S_{min}$ , where k = 6,...12. The average delay improvement is presented in Fig. 5. As can be seen, the optimization

impact is about the same for the different metal levels, and the optimization is more beneficial when the channel width is more constrained.



Figure 5. The relation between metal level and delay improvement for channels of different widths

## Experiment 6

In this experiment, delays obtained by exhaustive simultaneous ordering/sizing/spacing optimization are compared with results of heuristics using symmetric hill order for total sum of delays objective. Another set of random 1600 instances were generated similar to example 1, with the same set of channel widths and lengths. Heuristic wire width assignment with the inverse linear width function  $\psi(R) = \alpha + \beta R$  was applied. For each value of channel width and length, the delay difference between the optimal result

of exhaustive search and the optimal result of the heuristic was expressed as a fraction of the delay difference between best and worst results of the exhaustive search. On average for all these problem instances, the global minimum was approached as closely as 0.37%.

# 6. CONCLUSION

Reordering of wires in a constrained-width interconnect channel can improve results of wire-sizing and spacing for timing optimization. The optimal order of wires generally depends on both wire driver resistances and load capacitances. Analysis of average delay minimization showed that when wire widths are uniform or are specified by a monotonic nonincreasing function of driver resistance, the optimal order can be determined directly. This optimal order depends on driver resistances only, and takes the form of a symmetric hill. Load capacitances do not affect the optimal order under these conditions. The general problem of simultaneous net-ordering, wire-sizing and spacing optimization has been presented. In the general case, the optimal solution might be dominated by load capacitances, and the optimal order may not be symmetric hill. Solution heuristics were proposed for the general case. Numerical experiments demonstrated that delay improvements in the range of about 10% are attainable in state-of-the art technology, and heuristic results approach the global optimum within approximately 0.5%.

## 7. REFERENCES

[1] R. Ho, K. Mai and M. Horowitz, "The future of wires", *Proceedings of the IEEE*, Vol. 89, no. 4, Apr. 2001.

- [2] D. Sylvester and K. Keutzer, "Getting to the bottom of deep submicron", *in Proc. ICCAD*, pp. 203-211, 1998.
- [3] A. Kahng, S. Muddu, E. Sarto and R. Sharma, "Interconnect Tuning Strategies for High-Performance ICs", Proc. Design, Automation and Testing in Europe, Paris, February 1998.
- [4] J. Cong and K. S. Leung, "Optimal Wiresizing Under Elmore Delay Model". *IEEE Trans. on Computer-Aided Design*, vol. 14, no. 3, pp. 321-336, Mar. 1995.
- [5] S. S. Sapatnekar, "Wire Sizing as a Convex Optimization Problem: Exploring the Area Delay Tradeoff", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 15, No. 8, pp. 1001 - 1011, August 1996.
- [6] J. Cong, L. He, C. K. Koh and Z. Pan, "Interconnect Sizing and Spacing with Consideration of Coupling Capacitance", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 20, no. 9, pp.1164-1169, September 2001.
- [7] M. Becer, D. Blaauw, V. Zolotov, R. Panda, I. Hajj, "Analysis of Noise Avoidance Techniques in Deep-Submicron Interconnects Using a Complete Crosstalk Noise Model", *IEEE/ACM Design Automation and Test in Europe Conference (DATE)*, pp. 456-463, March 2002.
- [8] D. A. Kirkpatrick and A. L. Sangiovanni-Vincentelli. "Techniques for crosstalk avoidance in the physical design of high performance digital systems", *In Proc. IEEE/ACM Int. Conf. Computer Aided Design*, pp. 616--619, Nov. 1994.

- [9] R. Arunachalam, E. Acar, and S. R. Nassif, "Optimal shielding/spacing metrics for low power design", *Proceedings of IEEE Computer Society Annual Symposium on VLSI*, pp. 167-172, 2003.
- [10] P. Chen, D. A. Kirkpatrick and K. Keutzer, "Miller Factor for Gate-Level Coupling Delay Calculation", *in Proc. ICCAD*, pp.68-74, 2000.
- [11] M. Hashimoto and H. Onodera, "Increase in delay uncertainty by performance optimization", *Proc. IEEE International Symposium on Circuits and Systems* (*ISCAS*), pages 379--382, 2001.
- [12] X. Bai, C. Visweswariah, P. N. Strenski, D. J. Hathaway, "Uncertainty-Aware Circuit Optimization", *The 39th Design Automation Conference (DAC)*, pp.58-63, 2002.
- [13] P. Groeneveld, "Wire ordering for detailed routing", *IEEE Design and Test*, vol.6, issue 6, pp. 6-17, Nov. 1989
- [14] A.Vittal, L Chen, M. Marek-Sadowska, K. Wang, S Yang, "Crosstalk in VLSI Interconnections", *IEEE Transactions on CAD Design of IC and Systems*, Vol.18, No. 12, Dec. 1999
- [15] T. Gao and C. Liu, "Minimum Crosstalk Channel Routing", *TechnicalDigest Int. Conf. on CAD*, pp. 692-696, 1993
- [16] J. Ma and L. He, "Formulae and Applications of Interconnect Estimation Considering Shield Insertion and Net Ordering", 2001 International Conference on Computer-Aided Design (ICCAD '01), pp. 327-332, Nov. 2001

- [17] P. Gupta and A. Kahng. "Wire Swizzling to Reduce Delay Uncertainty Due to Capacitive Coupling." *Proc. IEEE Intl. Conf. on VLSI Design*, January, p.431, 2004.
- [18] E. Macii, M. Poncino and S. Salerno, "Combining Wire Swapping and Spacing for Low-Power Deep-Submicron Buses", *Proc.of the 13th ACM Great Lakes symposium on VLSI*, pp. 198-202, 2003
- [19] T. Sato, Y. Cao, D. Sylvester and C. Hu, "Characterization of interconnect coupling noise using in-situ delay change curve measurements", *Proc. IEEE* ASIC/SoC Conf., pp. 321-325, 2000
- [20] P. Chen and K. Keutzer, "Toward true crosstalk noise analysis", Proc. ICCAD, pp. 132-137, 1999
- [21] T. Lin and L. Pillegi, "Throughput-Driven IC Communication Fabric Synthesis", *Proc. ICCAD*, pp. 274-279, 2002.
- [22] T. Sato, Y. Cao, K. Agarwal, D. Sylvester, and C. Hu, "Bidirectional Closed-Form Transformation Between On-Chip Coupling Noise Waveforms and Interconnect Delay-Change Curves", *IEEE Transactions on Computer-Aided Design* of Integrated Circuits and Systems, pp. 560-572, March 2003
- [23] O. Milter, A. Kolodny, "Crosstalk noise reduction in synthesized digital logic circuits", IEEE transactions on VLSI, pp. 1153 – 1158, Dec. 2003
- [24] A. Kahng, S. Muddu and D. Vidhani, "Noise and Delay Uncertainty Studies for Coupled RC Interconnects", Proc. of Twelfth Annual IEEE International ASIC/SO SOC Conference, pp. 3-8, Sept. 1999
- [25] S. Wimer, S. Michaely, K. Moiseev and A. Kolodny, "Optimal Bus Sizing in Migration of Processor Design", *IEEE Transactions on Circuits and Systems – I*, vol. 53. no. 5, May 2006.
- [26] A. Naeemi, R. Venkatesan, and J. D. Meindl, "Optimal global interconnects for GSI," *IEEE Trans. Electron. Devices*, vol. 50, pp. 980-987, April 2003.

[27] D. Pamunuwa, L. Zheng and H. Tenhunen, "Maximizing throughput over parallel wire structures in the deep submicrometer regime", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Volume 11 Issue 2, 2003.