# Net-Ordering for Optimal Sharing of Cross-Capacitances in Nanometer Interconnect Design

1

# Konstantin Moiseev, Shmuel Wimer and Avinoam Kolodny

# ABSTRACT

This paper addresses the problem of ordering and sizing parallel wires in a single metal layer within an interconnect bus of a given width, such that cross-capacitances are optimally shared for circuit delay minimization. Using an Elmore delay model including cross capacitances for a bundle of fixed-width wires, we show that an optimal wire ordering is uniquely determined, such that best timing can be obtained by proper allocation of inter-wire spaces. The optimal ordering, called BMI (Balanced Monotonic Interleaved) depends on the size of drivers, and is independent of size of receivers. The paper also addresses the problem of simultaneously ordering and optimizing variable-width wires. A heuristic approach for wire ordering and sizing is presented. Examples for 90-nanometer technology are analyzed and discussed.

#### **Categories and Subject Descriptors**

B.7.1 [Integrated Circuits]: Design Styles – *Microprocessors;* B.7.2 [Integrated Circuits]: Design Aids – *Placement and Routing.* 

# **General Terms**

Performance, Design

#### Keywords

routing, wire ordering, wire spacing.

#### **1. INTRODUCTION**

Cross-capacitances between wires in interconnect structures have a major effect on circuit timing. The importance of this effect grows with technology scaling. Since cross-capacitance between two wires depends on interwire spacing and affects the delays of both wires, allocation of inter-wire spaces and wire widths becomes an optimization problem for bus structures under a total area constraint [1]. This paper addresses a more general problem, where delays in a bundle of parallel wires (with different drivers and loads) are minimized by choosing an optimal



Figure 1. a. – interleaved placement of wires; b – sorted placement of wires

ordering of the nets, in addition to optimal allocation of wire widths and inter-wire spaces. The total width of the structure is a given constraint. The problem is motivated by the following example: Assume a bus of 2n signals, n of them with strong drivers of size A (small effective output resistance) and n others with weak drivers of size B (large driver resistance). Consider two ways to reorder these signals in the bus. In the first (see Fig.1a), signals with drivers A and B are interleaved (A, B, A, B etc). When the corresponding wires and their spaces are sized for delay minimization, we expect type-B wires to have larger spacing to their neighbors, since their drivers are too weak to drive large cross-capacitances. Because of the interleaved arrangement, type B wires will share these large spaces with their type-A neighbors, which don't require such large spacing. The order shown in Fig. 1b allows wires of type A to share small spaces, and wires of type B to share large spaces. The second configuration obtains better circuit timing, because it

allows more effective space allocation than the first ordering.

This example demonstrates that wire ordering according to driver strength can improve results of delay minimization. Ordinary delay minimization ignores the net ordering degree of freedom and treats the order of signals in the bus as given. A brute-force approach to determine the best ordering is to generate all signal permutations in the bus, and for each permutation solve the wire-width and space optimization problem. This approach, however, is computationally infeasible when size of the bus exceeds a few signals. This paper proves the existence of optimal wire ordering that yields best delay minimization by wire sizing and space allocation. The paper describes an efficient algorithm to find the optimal order for a wide range of practical cases. The paper also presents and evaluates heuristics for solving the most general cases of this problem.

#### 1.1 Related Works

The problem of allocating widths and spaces to maximize performance in bus structures was proposed in [1]. The wire sizing problem has been addressed in [2] and [3] for a single net. Sizing and spacing multiple nets with consideration of coupling capacitance has been addressed in [4] for general interconnect layouts. Coupling capacitance has been addressed explicitly in the context of physical design for minimizing crosstalk noise [5,6, 8, 9] or dynamic power [7].

Several variants of net-reordering have been applied for improving layout efficiency [10], and for noise reduction [6, 11, 12, 13, 14, 15, 16]. Swapping of wires for power reduction was applied in [17]. Vittal et al. [11] have suggested to reduce capacitive coupling noise by sorting wires in order of driver strength, which is closely related to our results. Our analysis is focused on the effect of coupling capacitances on nominal delay. We avoid worst-case assumptions about delay uncertainty because of transient crosstalk, which may be circumvented by functional or temporal separation [18, 21]. However, since weakly-driven signals are most sensitive to crosstalk noise [20], our approach also reduces noise sensitivity.

### **PROBLEM FORMULATION**

#### **1.2 Interconnect configuration**

Circuit structure and notation are shown in Figure 2, illustrating n signal nets  $\sigma_0, ..., \sigma_{n-1}$  between two shield wires.

 $S_i$  and  $S_{i+1}$ , respectively, denote spaces to the left and right neighbors of wire  $W_i$ . The length of all the wires is L.

The total sum of wire widths and spaces is constrained to be A, representing the area available for laying out the interconnecting wires of the bus.

$$g\left(\overline{W},\overline{S}\right) = \sum_{j=0}^{n-1} W_j + \sum_{j=0}^n S_j = A$$
(2.1)

## 1.3 Delay model

The delay  $\Delta_i$  of signal  $\sigma_i$  can be calculated from the  $\pi$ -model equivalent circuit shown in Figure 3, where  $R_{d_i}$  is the effective output resistance of the driver,  $R_{w_i}$  is the wire resistance,  $C_{w_i}$  is the area and fringe capacitance,  $C_{c_i}$  and  $C_{c_{i+1}}$  are the coupling capacitances to the right and left neighboring signals, and  $C_{l_i}$  is the capacitive load of the receiver's input.

Using an Elmore model with first order approximation for capacitances, the delay can be expressed as [Error! Reference source not found.]:

$$\Delta_{i} = \left(a + \frac{b}{W_{i}} + \frac{d}{W_{i}S_{i}} + \frac{d}{W_{i}S_{i+1}} + \frac{eC_{i}}{W_{i}} + kR_{i}W_{i} + gR_{i} + \frac{hR_{i}}{S_{i}} + \frac{hR_{i}}{S_{i+1}} + R_{i}C_{i}\right)^{(2.2)}$$

where coefficients of wire widths, spaces, driver resistances and load capacitances are technology-dependent constants denoted a, b, d, e, k, g, h.

### 1.4 Objective functions

Let  $f_1$  given in (2.3) be the objective function we wish to minimize.  $f_1$  denotes the sum of all signal delays. It is commonly used since it captures the contributions of all signals to circuit timing,.

$$f_1 = \sum_{i=0}^{n-1} \left(a + \frac{b}{W_i} + \frac{d}{W_i S_i} + \frac{d}{W_i S_{i+1}} + \frac{eC_i}{W_i} + kR_iW_i + gR_i + \frac{hR_i}{S_i} + \frac{hR_i}{S_{i+1}} + R_iC_i\right)$$
(2.3)

Sometimes it is appropriate to speed-up the slowest signal in of the bus . The objective function for such MinMax optimization is

$$f_{2} = \max_{0 \le i \le n-1} \left\{ \left(a + \frac{b}{W_{i}} + \frac{d}{W_{i}S_{i}} + \frac{d}{W_{i}S_{i+1}} + \frac{eC_{i}}{W_{i}} + kR_{i}W_{i} + gR_{i} + \frac{hR_{i}}{S_{i}} + \frac{hR_{i}}{S_{i+1}} + R_{i}C_{i}\right) \right\}$$
(2.4)

This paper addresses the minimization of  $f_1$ . Comparisons with  $f_2$  are given for some of the examples presented below.

## 1.5 Environment conditions and ordering of wires

By rearranging terms (2.2) can be written for each wire as follows (wire index was omitted for simplicity):

$$\Delta = a'R_d + b'C_l + R_dC_l + d' . (2.5)$$

 $R_d$  and  $C_l$  denote driver resistance and load capacitance of the wire. The pair  $(R_d, C_l)$  represents *environment* conditions for the wire. All other terms encapsulate wire intrinsic parameters: either fixed technology parameters or wire characteristics (width, length, spaces to neighbors). Our goal is to perform signal ordering in the bus according to environmental conditions of the wires, followed by optimizing the intrinsic parameters, so as to minimize  $f_1$ .

#### **1.6 Optimization problems**

For the sake of clarity we first consider in section 3 the special case of all wires having the same given width W. Optimization therefore explores net-ordering and wire-spacing. Later in section 4 we address the more general problem of delay optimization by simultaneous wire ordering, wire width assignment, and inter-wire space allocation.

## 2. OPTIMAL ORDERING AND SPACING OF *n* EQUAL-WIDTH WIRES

### 2.1 Dependence of objective function on order of wires

Let all wires have a fixed width W, so  $f_1$  is a function of n + 1 variables  $S_i$ . The solution of minimizing  $f_1$  under the constraint g implies

$$\frac{\partial f_1}{\partial S_j} + \lambda \frac{\partial g}{\partial S_j} = 0, \ 0 \le j \le n \quad (3.1)$$

where  $\lambda$  is a Lagrange multiplier.

The partial derivatives of  $f_1$  and g with respect to  $S_i$  are:

$$\frac{\partial f_1}{\partial S_i} = -\frac{d}{WS_i^2} - \frac{d}{WS_i^2} - \frac{hR_i}{S_i^2} - \frac{hR_{i-1}}{S_i^2},$$

$$\frac{\partial f_1}{\partial S_0} = -\frac{d}{WS_0^2} - \frac{hR_0}{S_0^2}, \quad \frac{\partial f_1}{\partial S_n} = -\frac{d}{WS_n^2} - \frac{hR_{n-1}}{S_n^2}, \quad 0 < i < n$$

$$\frac{\partial g}{\partial S_i} = 1, \quad 0 \le i \le n \qquad (3.3)$$

Substituting (3.2) and (3.3) to (3.1) and solving yields

$$\lambda = \frac{1}{S_i^2} (\frac{2d}{W} + hR_i + hR_{i-1}), \ 0 < i < n$$

$$\lambda = \frac{1}{S_0^2} (\frac{d}{W} + hR_0), \ \lambda = \frac{1}{S_n^2} (\frac{d}{W} + hR_{n-1})$$
(3.4)

Rearranging the terms in (3.4), the following interesting property of the minimum sum of delays is obtained.

**Property 1.1.** At minimum sum of delays in a signal bus, the total sum of squares of even spaces is equal to the total sum of squares of odd spaces.

$$S_0^2 + S_2^2 + S_4^2 + \dots + S_{n-1}^2 = S_1^2 + S_3^2 + S_5^2 + \dots + S_n^2$$
(3.5)

Notice that (3.5) holds regardless of the given wire width (equal for all signals) and wire environment conditions. Though not proven in this paper, (3.5) holds when signals can have different wire widths which are optimized together with the spaces in order to minimize the total sum of delays. Property 1 reflects the fact that adjacent wires in the bus share common spaces.

Deriving  $S_i$  explicitly from (3.4) in terms of environment conditions, we obtain:

$$S_{i} = \sqrt{\frac{1}{\lambda} \left(\frac{2d}{W} + hR_{i} + hR_{i-1}\right)}, 0 < i \le n-1$$

$$S_{0} = \sqrt{\frac{1}{\lambda} \left(\frac{d}{W} + hR_{0}\right)},$$

$$S_{n} = \sqrt{\frac{1}{\lambda} \left(\frac{d}{W} + hR_{n-1}\right)}$$
(3.6)

A direct consequence of (3.6) is that when all wire widths are equal, the optimal spaces between wires are proportional to the square root of the sum of driver resistances of signals sharing a common space. Using (2.1) and (3.6) we can derive  $\lambda$ . Further substitution of  $\lambda$  and (3.6) into (2.3), the minimal total sum of delays is expressed in terms of technology parameters, given wire width, bus area constraint and environment conditions:

$$f_{1} = n\left(a + \frac{b}{W}\right) + \left(kW + g\right)\sum_{i=0}^{n-1} R_{i} + \frac{e}{W}\sum_{i=0}^{n-1} C_{i} + \sum_{i=0}^{n-1} C_{i}R_{i} + \frac{1}{A - nW}\left(\sum_{i=0}^{n-2} \sqrt{\left(\frac{2d}{W} + hR_{i} + hR_{i+1}\right)} + \sqrt{\frac{d}{W} + hR_{0}} + \sqrt{\frac{d}{W} + hR_{n-1}}\right)^{2}$$
(3.7)

Let's

define

 $f_{12} = \frac{1}{A - nW} \left( \sum_{i=0}^{n-2} \sqrt{\left(\frac{2d}{W} + hR_i + hR_{i+1}\right)} + \sqrt{\frac{d}{W} + hR_0} \right)$ 

$$f_{11} = n \left( a + \frac{b}{W} \right) + \left( kW + g \right) \sum_{i=0}^{m-1} R_i + \frac{e}{W} \sum_{i=0}^{m-1} C_i + \sum_{i=0}^{m-1} C_i R_i$$
 and 
$$+ \sqrt{\frac{d}{W} + hR_{n-1}} \right)^2,$$

such that  $f_1 = f_{11} + f_{12}$ . Notice that  $f_{11}$  is independent of the order of signals in the bus, while  $f_{12}$  depends on wire ordering. Consequently, there exists an order which minimizes the total sum of delays. A conclusion from the expressions in (3.7) is the following:

**Corollary:** For uniform wire widths, wire ordering affects the minimal sum of delays via driver resistances, while the effect load capacitances is order-insensitive. If all driver resistances are equal, the optimal total sum of signal delays in a bus is independent of their order.

The physical reason for this is that a wire's load capacitance affects only the delay of that signal, while crosscapacitances affect the delay of the wires sharing it. In the following we describe how to obtain this order, and prove that this order is indeed the optimal one. Take the driver with the largest resistance to reside at the center of the bus channel. Then at each turn take a driver in monotonically decreasing order of resistance and locate it alternately to the left and right of the signal bus as shown in figure 4. We call the resulting order <u>BMI (Balanced Monotonic Interleaved)</u>.

Let us prove now that BMI order is indeed the optimal one which yields minimal total sum of delays. Without loss of generality let us assume that all  $R_i$  are different. First, we introduce some notation.

 $\{R_0, ..., R_{n-1}\}$  denotes the set of  $R_0, ..., R_{n-1}$ , while  $(R_0, ..., R_{n-1})$  denotes a specific order. A sum of the form  $\sum_{i=0}^{n-2} \sqrt{f(R_i) + f(R_{i+1})} + \sqrt{f(R_0)} + \sqrt{f(R_{n-1})}$  is called a  $\Phi$ -sum  $\Phi_n(f, \Pi_n)$ , where f is a continuous function and  $\Pi_n = (R_0, ..., R_{n-1})$  is a given order (permutation). The term  $f_{12}$  is thus a  $\Phi$ -sum with  $f(R) = \frac{d}{2} + hR$ 

$$f(R) = \frac{\alpha}{W} + hR \tag{3.8}$$

**Definition 3.1 (formal definition of BMI order):** Given a bus of *n* signals with driver resistances  $R_0, \ldots, R_{n-1}$ , the order (permutation) of signals  $\Pi^* = (R_0, \ldots, R_{n-1})$  is called <u>Balanced Monotonic Interleaved (BMI) order</u> if it satisfies

$$R_0 < R_{n-1} < R_1 < R_{n-2} < \dots < R_{\left\lfloor \frac{n}{2} \right\rfloor^{-1}} < R_{\left\lfloor \frac{n}{2} \right\rfloor^{+1}} < R_{\left\lfloor \frac{n}{2} \right\rfloor}.$$
(3.9)

Notice that the reversed permutation which satisfies  $R_{n-1} < R_0 < R_{n-2} < R_1 < \dots$ , is also BMI.

**Definition 3.2 (Maximum point):** Given a permutation  $\Pi = (R_0, ..., R_x, R_y, R_z, ..., R_{n-1})$ ,  $R_y$  is called <u>maximum</u> <u>point</u> of  $\Pi$  if

$$R_{v} > \max\left\{R_{x}, R_{z}\right\}, \qquad (3.10)$$

namely,  $R_y$  is not smaller than its left and right neighbors.

**Definition 3.3 (End value).** Given the permutation  $\Pi = (R_0, ..., R_{n-1})$ ,  $R_0$  and  $R_{n-1}$  are called <u>end values</u> of  $\Pi$ .



Figure 2. Interconnect configuration



Figure 4. Building BMI order from sorted set of wires

 $\Phi$ -sums posses some properties that are presented below. These are required later to prove that BMI order is optimal. **Property 3.1 (Indifference).** Let the permutation  $\Pi_n = (R_0, ..., R_{n-1})$  be augmented to a new permutation  $\Pi_{n+1} = (R_0, ..., R_i, R_{new}, R_{i+1}, ..., R_{n-1})$  by adding a new value  $R_{new}$ , or let it be modified into another permutation  $\Pi'_n = (R_0, ..., R_x, R_i, R_{x+1}, ..., R_{i-1}, R_{i+1}, ..., R_{n-1})$  by moving  $R_i$  to another position. Then the change in the value of  $\Phi_{n+1}(f, \Pi_{n+1})$  and  $\Phi_n(f, \Pi'_n)$  are affected only by the inserted or moved values and their old and new neighbors in the permutations.

**Property 3.2 (Pair balancing).** If the permutation  $\Pi_n = (R_0, ..., R_{n-1})$  includes a subsequence  $\Psi = (R_\alpha, R_\beta, R_\gamma, R_\delta)$  where  $R_\beta > R_\gamma$  (called internal pair) and  $R_\alpha < R_\delta$  (external pair), then swapping  $R_\beta$  and  $R_\gamma$  will decrease  $\Phi(f, \Pi_n)$  (Fig. 5).

**Property 3.3 (Set balancing).** Let  $\Pi_n = (R_0, ..., R_{n-1})$  be BMI-ordered and let a new value be added to the ordered set (permutation), resulting in  $\Pi_{n+1}$ . Then the position of  $R_x$  which minimizes the increase of the  $\Phi$ -sum is between the two largest values in  $\Pi_n$ . Notice that  $R_x$  is now the new (and single) maximum point.

**Property 4 (Maximum point elimination):** Let  $\Pi_n = (R_0, ..., R_{n-1})$  be a permutation, and let  $\Gamma = (R_\alpha, R_\beta)$  and  $\Omega = (R_\gamma, R_\delta, R_\varepsilon)$  be its subsequences. If  $R_\delta$  is a maximum point and there exists  $R_\alpha > R_\delta > R_\beta$ , then reinserting  $R_\delta$  between  $R_\alpha$  and  $R_\beta$  will decrease  $\Phi(\Pi_n)$  (Fig 6).

## 2.2 Optimal order theorem

<u>Theorem 1 (Optimal order)</u>: Given a bus whose wires are of uniform width W, the BMI order of signals in the bus yields minimum total sum of delays.

**Proof:** Let f be defined by (3.8) and  $\Pi^*$  be the BMI permutation. We'll show by induction that

$$\min_{\Pi} \Phi_n(f, \Pi) = \Phi_n(f, \Pi^{-}).(3.11)$$

Buses with one and two wires are trivially BMI. The theorem holds for 3 wires since we can always put first two signals in the bus, and then augment it by the addition of  $R_3$ , which satisfies  $R_3 > \max(R_1, R_2)$ . The set balancing property ensures that placing the third wire in between the two others is optimal. The order thus obtained is BMI.

Assume now that the theorem holds for *n* wires. Denote the corresponding permutation by  $\Pi^{*,n}$ . Let us show that for any  $R_x$  added to the bus, the optimal order of signals in the n + 1-signal bus is also BMI.

Without loss of generality we may assume that

$$R_x > \max\{R_0...R_{n-1}\}$$
 (3.12)

Indeed, if this was not the case, we could always select  $R_x = \max\{R_0...R_{n-1}, R_x\}$ . The optimal order for the *n* signals  $R_0...R_{n-1}$  is BMI by the induction assumption. Let us now find the optimal location in the bus for the newly added wire. By the set balancing property, minimal increase in  $\Phi$ -sum results when  $R_x$  is placed in between the two largest values in  $\Pi^{*,n}$ . Let us mark the left and the right by  $R_l$  and  $R_r$ , respectively. The augmentation of  $\Pi^{*,n}$  by inserting  $R_x$  results a new set and a corresponding order  $\Pi^{*,n+1}$ , which is also BMI.

One needs now to prove that among all the permutations of n+1 wires,  $\Pi^{*,n+1}$  is indeed the one for which total sum



Figure 5. Pair balancing property

Figure 6. Maximum point elimination property

of delays is minimized. Let  $\Pi^{',n+1}$  be another permutation, which we assume to be the optimal. There are two possibilities. In the first  $\Pi^{',n+1}$  contains one of the subsequences  $(R_l, R_x, R_r)$  or  $(R_r, R_x, R_l)$ . In this case,  $\Pi^{',n+1}$  is definitely suboptimal. This follows by removing  $R_x$ , thus leaving a permutation  $\Pi^{',n}$ , which by the induction hypothesis is inferior compared to  $\Pi^{*,n}$ . The increase of  $\Phi$ -sum caused by inserting  $R_x$  to  $\Pi^{',n}$  and  $\Pi^{*,n}$  is the same by the indifference property, since in both  $\Pi^{',n+1}$  and  $\Pi^{*,n+1} R_x$  has the same neighbors. Consequently,  $\Pi^{',n+1}$  couldn't be optimal.

In the second case  $R_x$ ,  $R_l$  and  $R_r$  are not adjacent in  $\Pi^{',n+1}$ . Recall that except for  $R_x$ ,  $R_l$  and  $R_r$  are the largest. One can then use the pair balancing and maximum point elimination properties swap values and obtain the subsequence  $(R_l, R_x, R_r)$  or  $(R_r, R_x, R_l)$ . Since pair balancing and maximum point elimination decrease the total sum of delays, the resulting permutation is better. This contradicts the optimality of  $\Pi^{',n+1}$ . If it happens that one of  $R_x$ ,  $R_l$  and  $R_r$  is the side value in  $\Pi^{',n+1}$ , bus walls can be thought of as a wires with zero driver resistance. Then all properties decreasing  $\Phi$ -sum apply.

The optimal order theorem guarantees that minimal sum of delays is obtained in the case of uniform wire width if signals are BMI-ordered. Then space optimization takes place, yielding the optimal inter-wire spaces as defined by (3.6). In order to obtain further delay improvement, the uniform wire width W can also be included in the optimization. The entire optimization flow is as follows:

#### Algorithm Uniform\_Width\_Optimization {

Arrange wires in BMI order. Minimize\_ Function  $f_1(W, S_0, ...., S_n)$ 

### 3. PROBLEM EXTENSION FOR UNEQUAL WIRE WIDTHS

### 3.1 W<sub>i</sub> preassigned as a function of R<sub>i</sub>

Uniform wire width may lead into sub optimal sum of delays. On the other hand, assigning arbitrary widths to wires may destruct the property that among all orders BMI is optimal. Fortunately, for a wide and practical range of assignments of different wire widths to different signals, BMI order is optimal.

Assume that wires widths are assigned as follows:

$$W_i = \frac{1}{\psi(R_i)},\tag{4.1}$$

where  $\psi$  is a monotonically non-decreasing functions of  $R_i$ .

Such assignment is practically common, as one attempts to balance the resistance of the driver and the resistance of the driven line. Substituting (4.1) in (3.4) yields:

$$S_{i} = \sqrt{\frac{1}{\lambda} \left( d\psi(R_{i}) + d\psi(R_{i-1}) + hR_{i} + hR_{i-1} \right)}, \ 0 < i \le n-1$$

$$S_{0} = \sqrt{\frac{1}{\lambda} \left( d\psi(R_{0}) + hR_{0} \right)},$$

$$S_{n} = \sqrt{\frac{1}{\lambda} \left( d\psi(R_{n-1}) + hR_{n-1} \right)}$$
(4.2)

(3.7) in this case becomes:

$$f_{1} = na + b\sum_{i=0}^{n-1} \psi(R_{i}) + k\sum_{i=0}^{n-1} \frac{R_{i}}{\psi(R_{i})} + g\sum_{i=0}^{n-1} R_{i} + e\sum_{i=0}^{n-1} \psi(R_{i})C_{i} + \sum_{i=0}^{n-1} C_{i}R_{i} + \frac{1}{A - \sum_{i=0}^{n-1} W_{i}} \left(\sum_{i=0}^{n-2} \sqrt{(d\psi(R_{i}) + d\psi(R_{i+1}) + hR_{i} + hR_{i+1})} + \sqrt{d\psi(R_{0}) + hR_{0}} + \sqrt{d\psi(R_{n-1}) + hR_{n-1}}\right)^{2}$$

$$(4.3)$$

The second part of (4.3) is a  $\Phi$ -sum expression due to the function  $\psi$  used in (4.1). Theorem 1 is now extended for the more general case:

# <u>Theorem 2</u>: Let wire width be a monotonic non-increasing function of driver resistance. The net-ordering yielding minimal sum of delays is then BMI.

Notice that the previously discussed case of uniform wire width W is a particular case of theorem 2. The proof of the theorem is very similar to the proof of Theorem 1 and the associated  $\Phi$ -sum properties.

The function  $\psi(R)$  needs to be selected carefully. The goal is to obtain minimal sum of delays close to the absolute minimum which could be achieved in the space of all orderings, wire widths and wire spacing assignments. A simple, yet practical, wire width function is the following inverse linear.

$$W_i(R_i) = \frac{\alpha}{\beta + \gamma R_i}, \qquad (4.4)$$

where  $\alpha$ ,  $\beta$  and  $\gamma$  are positive constants. Next section demonstrates that this function yields delays which are very close to the global minimum.

## 3.2 Simultaneous optimization of net-ordering, wire sizing and wire spacing

So far we have discussed cases where wire widths were pre-assigned in a certain way, such that BMI net-ordering yields the minimum sum of delays. In the most general case, both wire widths and spaces can vary arbitrarily, yielding 2n + 1 equations. The introduction of wire ordering optimization makes the problem much harder to solve due to the combinatorial nature of the latter. In this case the optimal order of wires is not necessarily BMI. The solution of the most general problem is very complex, as it involves the exploration of many permutations.

In order to make the computational effort acceptable, the following heuristic is proposed. It is based on the BMI order and yields near-optimal solutions. The complex optimization problem is divided into two successive simpler ones. The first assigns wire widths by some parameterized monotonic non-increasing function such as (4.4). BMI order is now guaranteed to be the optimal. Then continuous optimization which explores for the optimal values of wire spacing and the width-function parameters (e.g.  $\alpha$ ,  $\beta$  and  $\gamma$  in (4.4)). This heuristic reduces time complexity of the optimization

problem from O(n!) to  $O(n \cdot p)$ , where p is the number of parameters in the width function. Each wire width and space optimization uses only n + p variables rather than 2n + 1. Experiments show that a well-chosen width-function yields ordering, widths and spaces that result in total sum of delays which is very close to the global optimum.

Algorithm Parametric\_Width\_Optimization {

**Define**  $Wi=Width_function(Ri, \alpha, \beta, \gamma, ....)$ 

Arrange wires in BMI order.

**Minimize\_Function**  $f_l(\alpha, \beta, \gamma, ..., S_0, ..., S_n)$ 

}

## 4. RESULTS AND DISCUSSION

In the following results obtained by MATLAB experiments for various problem instances are given. We used 90 nanometer technology parameters. Coefficients in delay equation (2.2) were calculated based on [19]. In all the examples, the bus area constraint A (total sum of wire widths and spaces) was taken to be 11.1  $\mu m$ , unless otherwise specified. Length L of wires was taken to be 600  $\mu m$ .

The first experiment demonstrates the existence of a typical optimal ordering that behaves similarly to BMI. We ran 20 random problem instances using five signals. Each signal was assigned a driver–receiver pair drawn randomly. The range of driver resistances is 100  $\Omega$  to 2  $K\Omega$  and the range of load capacitances is 10 fF to 200 fF. For each problem

we then optimized the wire widths and spaces to yield minimum total sum of delays. This was done for all the 5!=120 possible order permutations. For each problem, the permutation yielding minimum total sum of delays is presented in Figure 7 a. There, the graphs present the driver resistance versus its optimal location. Though the optimal ordering is not always BMI (this is the most general problem), it behaved very similar to BMI. Further averaging of the results of the 20 optimal orders is illustrated in Figure 7b. Average driver resistance as function of signal location in the bus speaks for itself.

The next experiment shows the delay improvement obtained by net-ordering optimization. There, 40 problems were randomly drawn from the same range as before. The difference in minimal total sum of delays between the best and worst ordering was about 3.6% on average, ranging from 1.1% to 8.0%

#### Table 1

| Bus width,<br>[µm] | No. of<br>wires | Best delay<br>obtained | Worst delay<br>obtained | Diff., % |
|--------------------|-----------------|------------------------|-------------------------|----------|
| 11.1               | 5               | 177.4219               | 184.7639                | 4.13     |
| 13.1               | 6               | 201.274                | 211.4628                | 5.06     |
| 15.1               | 7               | 202.9721               | 213.4702                | 5.17     |
| 17.1               | 8               | 261.846                | 277.1902                | 5.86     |
| 19.1               | 9               | 277.9003               | 294.2682                | 5.89     |
| 21.1               | 10              | 350.8222               | 374.5193                | 6.75     |

The third experiment examines the potential improvement in minimizing total sum of delays made by signal ordering for various bus sizes. Table 1 summarizes the results. Sizes of 5 to 10 signals were examined. Resistance of drivers were randomly drawn from the range  $[100 \Omega, 2 K\Omega]$  and load capacitances are all set to 10 fF. Results of best and worst ordering are presented. The experiment gives the reason to believe that net ordering becomes more effective as size of bus increases.

The next example shows in Table 2 the effect of signal ordering on busses comprising of both strong and weak drivers. A bus of 7 signals comprising driver – load pairs of (100  $\Omega$  – 50 fF) and (1.9  $K\Omega$  – 5 fF) was examined for various numbers of weak drivers. As could be expected, when the number of strong and weak drivers is equal, signal ordering is most effective. The worst ordering was indeed the interleaved one described in Figure 1a, while the best one was clearly BMI.

#### Table 2.

| No. of weak<br>drivers | Best delay obtained, [ps] | Worst delay obtained, [ps] | Diff., % |
|------------------------|---------------------------|----------------------------|----------|
| 1                      | 169.0378                  | 169.7011                   | 0.39     |
| 2                      | 204.7659                  | 214.6185                   | 4.81     |
| 3                      | 245.1067                  | 265.0063                   | 8.12     |
| 4                      | 289.3409                  | 321.7007                   | 11.19    |
| 5                      | 337.3242                  | 362.7043                   | 7.52     |
| 6                      | 391.6022                  | 405.4936                   | 3.55     |

In the last example, delays obtained by exhaustive simultaneous ordering/sizing/spacing optimization are compared with results of heuristics using BMI order. 10 sets of five (driver\_resistance, load\_capacitance) pairs with the same characteristics as in the previous examples were created. For each set of pairs all permutations were generated. After that, 3 different optimizations were performed. First, minimization of total sum of delays (optimization by wire width-space resizing) was performed for each permutation and the one giving best total sum of delays was chosen. In addition, heuristics Uniform\_Width\_Optimization and Parametric\_Width\_Optimization with the inverse linear width function (eq. 4.4) were applied. The results are presented in Table 3. Notice that in columns 4 and 6 delay interval between results of exhaustive search. As can be seen, by using uniform width optimization, global minimum can be approached as close as 12.4% in average, and by choosing parametric width optimization, it is approached as close as 2.8%. By choosing more complicated width functions, the global minimum can be approached with very small deviation.

Wire ordering optimization technique can also be used together with minimizing delay of the most critical wire  $f_2$  (2.4)

. Such optimization problem still has not been solved analytically, but numerical experiments show that an optimal order of wires exists in this case also and it seems to be BMI or very close to BMI. The experiments indicate that wire ordering optimization, together with minimizing maximum delay in a bus can result in significant performance



a. 20 random sets of wires b. Average of sets from (a).

x-axis: location of wires in a bus

y-axis: wire driver resistances, [K  $\Omega$  ]

improvement. For  $5 \mu m$ -width and 1000  $\mu m$ -length bus with 5 wires with driver resistances of 100  $\Omega$ , 200  $\Omega$ , 500  $\Omega$ , 1.2 K $\Omega$  and 1.5 K $\Omega$  and load capacitances all 10 fF, obtained improvement of 10.42%.

# Table 3

| Exhaustive<br>search best<br>delay, [ps] | Exhaustive<br>search | Uniform width opt.  |       | Parametric width opt. |      |
|------------------------------------------|----------------------|---------------------|-------|-----------------------|------|
|                                          | worst<br>delay, [ps] | Num.<br>value, [ps] | %     | Num.<br>value, [ps]   | %    |
| 261.6658                                 | 267.6162             | 262.6209            | 16.05 | 262.0982              | 7.28 |
| 246.9674                                 | 256.7411             | 248.8906            | 19.68 | 247.0419              | 0.76 |
| 345.3575                                 | 352.6115             | 345.5324            | 2.41  | 345.4035              | 0.63 |
| 255.4197                                 | 264.2321             | 256.3625            | 10.70 | 255.4783              | 0.66 |
| 302.1004                                 | 321.3083             | 303.4084            | 6.81  | 302.4597              | 1.87 |
| 259.0154                                 | 266.2539             | 260.4453            | 19.75 | 259.4824              | 6.45 |
| 194.9445                                 | 203.8133             | 197.3182            | 26.76 | 195.0626              | 1.33 |
| 290.1618                                 | 293.0519             | 290.3148            | 5.29  | 290.211               | 1.70 |
| 232.874                                  | 246.2123             | 234.0159            | 8.56  | 233.0841              | 1.58 |
| 297.6708                                 | 305.0678             | 298.2873            | 8.33  | 298.1222              | 6.10 |
|                                          |                      |                     |       |                       |      |

# 5. CONCLUSION

We have shown that reordering of wires in a certain way can improve results of timing optimization by wire-sizing and spacing, for a wiring channel of constrained size. It has been shown that the optimal order of wires generally depends on both wire driver resistances and load capacitances. Analysis of sum-of-delays minimization (which is equivalent to minimization of the average signal delay in the channel) with shield wires at the sides of the channel, showed that when wire widths are uniform or are specified by a monotonic non-increasing function of wire driver resistance, the optimal order is BMI (Balanced Monotonic Interleaved) and depends on driver resistances only. The general problem of simultaneous net-ordering, wire-sizing and spacing optimization was presented, and solution heuristics were proposed, reducing complexity from O(n!) to O(np), and the number of optimization variables from 2n + 1 to n + p. Numerical experiments demonstrated heuristic results approaching the global optimum within approximately 3%.

#### 6. REFERENCES

1. Kahng, S. Muddu, E. Sarto and R. Sharma, "Interconnect Tuning Strategies for High-Performance ICs", *Proc. Design, Automation and Testing in Europe*, Paris, February 1998.

2. J. Cong and K. S. Leung, "Optimal Wiresizing Under Elmore Delay Model". *IEEE Trans. on Computer-Aided Design*, vol. 14, no. 3, pp. 321-336, Mar. 1995.

3. S. S. Sapatnekar, "Wire Sizing as a Convex Optimization Problem: Exploring the Area Delay Tradeoff", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 15, No. 8, pp. 1001 - 1011, August 1996.

4. J. Cong, L. He, C. K. Koh and Z. Pan, "Interconnect Sizing and Spacing with Consideration of Coupling Capacitance", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 20, no. 9, pp.1164-1169, September 2001.

5. M. Becer, D. Blaauw, V. Zolotov, R. Panda, I. Hajj, "Analysis of Noise Avoidance Techniques in Deep-Submicron Interconnects Using a Complete Crosstalk Noise Model", *IEEE/ACM Design Automation and Test in Europe Conference (DATE)*, March 2002, pp. 456-463.

6. D. A. Kirkpatrick and A. L. Sangiovanni-Vincentelli. "Techniques for crosstalk avoidance in the physical design of high performance digital systems", *In Proc. IEEE/ACM Int. Conf. Computer Aided Design*, pp. 616--619, Nov. 1994.

7. R. Arunachalam, E. Acar, and S. R. Nassif, "Optimal shielding/spacing metrics for low power design", *Proceedings of IEEE Computer Society Annual Symposium on VLSI*, pp. 167-172, 2003.

8. J. Cong, D. Z. Pan and P. V. Srinivas, "Improved Crosstalk Modeling for Noise Constrained interconnect Optimization", *ACM/IEEE International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems*, pp. 14-20, Austin, December 2000.

9. H. Zhou, D.F. Wong, "Global Routing with Crosstalk Constraints", *Proceedings of the 35th annual conference on Design automation*, vol. 00, pp. 374-377, 1998

P. Groeneveld, "Wire ordering for detailed routing", *IEEE Design and Test*, vol.6, issue 6, pp. 6-17, Nov.

11. A.Vittal, L Chen, M. Marek-Sadowska, K-P. Wang, S Yang, "Crosstalk in VLSI Interconnections", *IEEE Transactions on CAD Design of IC and Systems*, Vol.18, No. 12, Dec. 1999

12. Tong Gao and C.L. Liu, "Minimum Crosstalk Channel Routing", *TechnicalDigest Int. Conf. on CAD*, 1993, pp 692-696.

13. James D.Z. Ma and Lei He, "Formulae and Applications of Interconnect Estimation Considering Shield Insertion and Net Ordering", 2001 International Conference on Computer-Aided Design (ICCAD '01), pp.327-332, Nov. 2001.

14. Rony Kay and Rob A. Rutenbar, "Wire Packing: A Strong Formulation of Crosstalk-Aware Chip-Level Track/Layer Assignment with an Efficient Integer Programming Solution", *Proc. of 2000 Int. Symposium on Physical Design*, 2000, pp. 61-68.

15. P. Gupta and A.B. Kahng. "Wire Swizzling to Reduce Delay Uncertainty Due to Capacitive Coupling." *Proc. IEEE Intl. Conf. on VLSI Design*, January, 2004.

16. K. S. Jhang, S. Ha, C.S. Jhon, "A Segment Rearrangement Approach to Channel Routing under the Crosstalk Constraints", *APCCAS '94., 1994 IEEE Asia-Pacific Conference on Circuits and Systems*, 1994, pp. 536-541

17. E. Macii, M. Poncino, S. Salerno, "Combining Wire Swapping and Spacing for Low-Power Deep-Submicron Buses", *Proc. of the 13th ACM Great Lakes symposium on VLSI*, 2003, pp. 198-202

18. Desmond A. Kirkpatrick and Alberto L. Sangiovanni-Vincentelli, "Digital Sensitivity: Predicting Signal Interaction using Functional Analysis", *Proc.IEEE/ACM International Conference on Computer Aided Design*, 1996, pp. 536-541.

19. Berkeley Predictive Technology Model, <u>http://www-device.eecs.berkeley.edu/~ptm/</u>

20. T. Sato, Y. Cao, D. Sylvester and C. Hu, "Characterization of interconnect coupling noise using in-situ delay change curve measurements", *Proc. IEEE ASIC/SoC Conf.*, 2000, pp. 321-325.

21. P. Chen and K. Keutzer, "Toward true crosstalk noise analysis", Proc. ICCAD, 1999, pp. 132-137

A. Abou-Seido, B. Nowak and C. Chu, "Fitted Elmore Delay: A Simple and Accurate Interconnect Delay Model", *IEEE Transactions on VLSI Systems*, vol. 12, no. 7, pages 691-696