I. INTRODUCTION
In recent years, programmable logic devices (PLDs) have been
used in various fields, such as prototyping, high-performance
computing, and networking. Field programmable gate arrays
(FPGAs) [1], [2], which can realize complex sequential
circuits, are the most widely used PLDs. An FPGA consists
of look-up tables (LUTs) as logic elements and switch
blocks (SBs) as routing elements, and the LUTs and SBs are
connected with wires (Fig. 1(a)). To realize a circuit on an
FPGA, each LUT works as a logic cell, and the SBs determine the
connections among the LUTs (logic cells). To our knowledge,
however, the area for routing resources can exceed
80% of the total area of an FPGA (e.g., [3], [4]), which
degrades area efficiency. In other words, the number of logic
elements per unit area is small in FPGAs. Since SBs account
for a large portion of the area for routing resources, a new
reconfigurable device called the memory-based PLD (MPLD¹),
which has no SBs as routing elements, has been proposed [5]
to improve area efficiency. An MPLD consists of an array of
multiple-output look-up tables (MLUTs), each of which can be used
as either a logic element or a routing element (Fig. 1(b)). By
¹ MPLD™
[Figure 1. (a) FPGA: LUTs, connection blocks, and switch blocks. (b) MPLD: MLUTs.]
[Figure 2. (a) MLUT: a 64-bit SRAM with address inputs addr[3:0] and data outputs data[3:0]. (b) AD pair.]
Figure 3. Two segments of nets and one logic cell realized by an MLUT
("-" marks a don't-care bit)
addr[3:0]   data[3:0]
0 0 0 0     0 0 - 0
0 0 0 1     0 0 - 0
0 0 1 0     0 0 - 0
0 0 1 1     0 0 - 1
0 1 0 0     1 0 - 0
0 1 0 1     1 0 - 0
0 1 1 0     1 0 - 0
0 1 1 1     1 0 - 1
1 0 0 0     0 1 - 0
1 0 0 1     0 1 - 0
1 0 1 0     0 1 - 0
1 0 1 1     0 1 - 1
1 1 0 0     1 1 - 0
1 1 0 1     1 1 - 0
1 1 1 0     1 1 - 0
1 1 1 1     1 1 - 1
[Figure 4. Tile of MLUTs: adjacent lines, short vertical distant lines, long distant lines skipping four MLUTs, and MLUTs with long vertical, long 45-degree, and long 135-degree distant lines.]
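The truth table of Fig. 3 can be read as the contents of a small memory: data[3] follows addr[2], data[2] follows addr[3] (the two net segments), and data[0] is the AND of addr[1] and addr[0] (the logic cell). The following is a minimal sketch of that reading. The helper names `build_mlut` and `read_mlut` are hypothetical, and storing the don't-care bit data[1] as 0 is an assumption.

```python
def build_mlut():
    """Return the 16 words (4 data bits each) of the MLUT shown in Fig. 3."""
    table = []
    for addr in range(16):
        a3 = (addr >> 3) & 1
        a2 = (addr >> 2) & 1
        a1 = (addr >> 1) & 1
        a0 = addr & 1
        d3 = a2        # net segment: addr[2] -> data[3]
        d2 = a3        # net segment: addr[3] -> data[2]
        d1 = 0         # don't-care in the table; 0 is an assumption
        d0 = a1 & a0   # logic cell: 2-input AND
        table.append((d3 << 3) | (d2 << 2) | (d1 << 1) | d0)
    return table


def read_mlut(table, addr):
    """Look up the 4-bit data word stored at a 4-bit address."""
    return table[addr & 0xF]


mlut = build_mlut()
print(format(read_mlut(mlut, 0b0111), "04b"))  # row "0 1 1 1" of the table
```

Sixteen words of four bits give the 64 bits of SRAM labeled in Fig. 2, so one MLUT can hold routing and logic at the same time.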
module decoder(a, b, c, d, z);
  input a, b, c, d;
  output z;
  assign z = (a & ~b) & (c | d);
endmodule
(a) RTL
[Figure 5. Data conversion. Panel (a): RTL; remaining labels: logic cell 1, logic cell 2.]
s = s0;
sb = s;
c = cost(s);
T = T0;
M = M0;
while (T > Tend) {
    loop_cnt = 0;
    while (loop_cnt < M) {
        sn = neighbor(s);
        cn = cost(sn);
    }
    T = *T;    // cooling
    M = *M;
}
return sb;
Figure 6.
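The listing of Fig. 6 gives only the loop structure of the simulated annealing (SA): the acceptance step and the coefficients that scale T and M do not appear in it. The following runnable sketch fills these in under stated assumptions: a Metropolis acceptance test, and hypothetical parameters `alpha` (cooling rate) and `beta` (growth of the inner loop count). None of these is confirmed by the source.

```python
import math
import random


def anneal(s0, cost, neighbor, T0=10.0, Tend=0.01, M0=20,
           alpha=0.9, beta=1.05, rng=None):
    """Simulated annealing with the loop structure of Fig. 6.

    alpha, beta, and the Metropolis acceptance test are assumptions.
    """
    rng = rng or random.Random(0)
    s, c = s0, cost(s0)
    sb, cb = s, c                # best solution seen so far
    T, M = T0, float(M0)
    while T > Tend:
        loop_cnt = 0
        while loop_cnt < M:
            sn = neighbor(s, rng)
            cn = cost(sn)
            # Always accept improvements; accept a worse solution
            # with probability exp(-(cn - c) / T).
            if cn <= c or rng.random() < math.exp((c - cn) / T):
                s, c = sn, cn
                if c < cb:
                    sb, cb = s, c
            loop_cnt += 1
        T = alpha * T            # cooling
        M = beta * M
    return sb


# Toy usage: minimize (x - 3)^2 over the integers.
best = anneal(0, lambda x: (x - 3) ** 2,
              lambda x, rng: x + rng.choice((-1, 1)))
print(best)
```

In a placement setting, `cost` would evaluate the cost function of Expression (1) and `neighbor` would apply one of the moves described in the next subsection.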
B. Neighbor function
When using SA, it is necessary to define a
method to generate neighbor solutions from the current solution
and a method to compare them. Our placement algorithm
adopts two ways to generate a neighbor solution: the
migration of a logic cell and the exchange of
logic cells.
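The two moves can be sketched as follows. This is an illustration under several assumptions: the MLUT array is modeled as a plain rectangular grid, Manhattan distance stands in for the shortest path length without distant lines, and migration only targets empty MLUTs. The names `migrate`, `exchange`, and `distance` are hypothetical.

```python
import random


def distance(a, b):
    # Assumption: Manhattan distance approximates the shortest path
    # length over adjacent lines between two MLUTs.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def migrate(placement, grid_w, grid_h, max_range, rng):
    """Move one random cell to a random empty MLUT within max_range."""
    new = dict(placement)
    cell = rng.choice(sorted(new))
    used = set(new.values())
    targets = [(x, y) for x in range(grid_w) for y in range(grid_h)
               if (x, y) not in used
               and distance(new[cell], (x, y)) <= max_range]
    if targets:
        new[cell] = rng.choice(targets)
    return new


def exchange(placement, rng):
    """Swap the MLUTs of two randomly chosen cells."""
    new = dict(placement)
    a, b = rng.sample(sorted(new), 2)
    new[a], new[b] = new[b], new[a]
    return new


rng = random.Random(1)
p0 = {"g1": (0, 0), "g2": (1, 0), "g3": (2, 2)}
p1 = migrate(p0, 4, 4, max_range=2, rng=rng)
p2 = exchange(p0, rng)
print(p1, p2)
```

Both functions return a new placement and leave the current one untouched, so a rejected move costs nothing to undo.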
In the migration of a logic cell, a logic cell and an MLUT
are selected at random, and the cell is placed at the MLUT.
In this move, a maximum range of move is given to limit
the migration distance. The migration distance of a cell is
defined as the length of the shortest path without distant
lines from the source MLUT to the destination MLUT of
the cell. The length of a path is the number of MLUTs
on the path. The maximum range of move, m, is set to
the length of the longer (horizontal or vertical) edge of the
rectangle covering all the cells (MLUTs). bb_x(n) (bb_y(n)) is
the length of the bounding box along the x-axis (y-axis). Note
that, since our target MPLD can be considered a diagonal
grid of MLUTs, we define the bounding boxes and the x- and y-axes
diagonally, as shown in Fig. 7. MLUT(k, l) refers to the
MLUT at point (k, l) in the diagonal Cartesian coordinate
system.
2) Congestion: Congestion in Expression (1) represents
the estimated congestion of the routes of nets. When the
estimated routes of nets are congested in a specific area, this
factor becomes high, so minimizing it reduces the overcrowding of nets. In
the estimation, we assume that each net is routed within
its bounding box without any detour, allowing routes
to overlap. Note that the bounding boxes of nets can overlap
each other. An overlap means that the nets may interfere
with each other, and the more bounding boxes overlap in a
specific area, the higher this possibility becomes. Therefore, the area in
which nets concentrate is roughly estimated from the number
of overlapping bounding boxes.
In addition, we consider the direction of routes, because
only nets whose routes have the same direction can
interfere with each other in the MPLD architecture. Thus, the congestion
level is calculated separately for each wire of an AD pair,
and the congestion level of an AD pair is defined
as the sum of the congestion levels of its wires, as shown
in Fig. 8. The congestion of a placement solution is defined by
Expression (3).
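[Figure 7. Bounding box of a net, with side lengths bb_x(n) and bb_y(n) along the diagonal x- and y-axes. Labels: source, destination, other, wire, MLUT.]
\[ \text{congestion} = \sum_{n=1}^{N_{net}} \ldots \]
[The summand of this expression is missing from the source.]
\[ c_x(n) = \frac{1}{bb_y(n) + 1} \tag{4} \]
\[ c_y(n) = \frac{1}{bb_x(n) + 1} \tag{5} \]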
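To make the estimation concrete, the sketch below computes the per-net levels c_x(n) = 1/(bb_y(n) + 1) and c_y(n) = 1/(bb_x(n) + 1) and accumulates them over every MLUT inside each net's bounding box. It rests on assumptions: the side length of a bounding box is taken as max minus min of the coordinates, the two directions are tracked per MLUT instead of per wire of an AD pair, and the per-MLUT summation is only one plausible reading of how the levels combine. The function names are hypothetical.

```python
def bounding_box(cells):
    """cells: list of (k, l) MLUT coordinates connected by one net."""
    ks = [k for k, _ in cells]
    ls = [l for _, l in cells]
    return min(ks), max(ks), min(ls), max(ls)


def net_levels(cells):
    """Return (c_x, c_y) of a net, following Expressions (4) and (5)."""
    k0, k1, l0, l1 = bounding_box(cells)
    bbx, bby = k1 - k0, l1 - l0      # assumption: length = max - min
    return 1.0 / (bby + 1), 1.0 / (bbx + 1)


def congestion_map(nets):
    """Sum c_x and c_y of every net over the MLUTs in its bounding box."""
    levels = {}
    for cells in nets:
        k0, k1, l0, l1 = bounding_box(cells)
        cx, cy = net_levels(cells)
        for k in range(k0, k1 + 1):
            for l in range(l0, l1 + 1):
                x, y = levels.get((k, l), (0.0, 0.0))
                levels[(k, l)] = (x + cx, y + cy)
    return levels


nets = [[(0, 0), (3, 2)], [(2, 1), (4, 1)]]
levels = congestion_map(nets)
print(levels[(2, 1)])   # MLUT covered by both bounding boxes
```

A net with a tall bounding box spreads its x-direction level over more rows, so each MLUT receives a smaller share. The values 0.33, 0.25, and 0.17 in Fig. 8 match 1/3, 1/4, and 1/6.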
[Figure 8. Congestion levels of AD pairs around MLUT(k, l) for nets i and j, with neighboring MLUTs at (k, l+1), (k+1, l), (k-1, l), and (k, l-1). Example levels: 0.33, 0.17, 0.25, and 0; the level of an AD pair is the sum over its wires, e.g., 0.42 (= 0.25 + 0.17).]
[Figure 9. Effect of nearness. Labels: cell; "cannot be routed".]
V. ROUTING ALGORITHM
B. Pre-routing
In pre-routing, nets are routed one by one with Dijkstra's shortest path
algorithm, ignoring the previously routed nets. That is, nets can overlap each other in this step.
A multi-terminal net is decomposed into two-terminal nets
when it is routed. In this step, every edge has the same
unit length.
C. Actual Routing
According to the result of pre-routing, the estimated
congestion of each edge is defined as the number of
nets passing through the edge in pre-routing. In actual routing,
each edge has an initial length defined by its estimated
congestion, and nets are routed one
by one. The length of an edge is updated to infinity
once the edge is used, so that no later net can use it. For each net, a shortest path is found
as its route by using Dijkstra's algorithm.
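The two routing steps can be sketched as below. This is an illustration, not the authors' implementation: the routing graph is an undirected adjacency list, nets are already two-terminal, and the initial edge length in actual routing is taken as 1 plus the pre-routing usage (the exact mapping from estimated congestion to length is an assumption). The function names are hypothetical.

```python
import heapq

INF = float("inf")


def dijkstra(adj, length, src, dst):
    """Return a shortest path from src to dst as a node list, or None."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, INF):
            continue                      # stale heap entry
        for v in adj[u]:
            nd = d + length[frozenset((u, v))]
            if nd < dist.get(v, INF):     # edges of infinite length never relax
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def route(adj, nets):
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    # Pre-routing: unit lengths, nets may overlap; count nets per edge.
    unit = {e: 1.0 for e in edges}
    usage = {e: 0 for e in edges}
    for src, dst in nets:
        path = dijkstra(adj, unit, src, dst) or []
        for u, v in zip(path, path[1:]):
            usage[frozenset((u, v))] += 1
    # Actual routing: congestion-based lengths, infinity once an edge is used.
    length = {e: 1.0 + usage[e] for e in edges}
    routes = []
    for src, dst in nets:
        path = dijkstra(adj, length, src, dst)
        routes.append(path)
        for u, v in zip(path or [], (path or [])[1:]):
            length[frozenset((u, v))] = INF
    return routes


# 2 x 3 grid: 0-1-2 on top, 3-4-5 below, with vertical links.
adj = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
       3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
routes = route(adj, [(0, 2), (0, 2)])
print(routes)
```

In this toy case both nets prefer the top row in pre-routing, so its edges become expensive; the first net then detours through the bottom row and leaves the top row free for the second.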
Table II
RESULT OF C1908
 p    q    r    routing R.   MLUT U.   success R.
 1    1    1    100.00%      28.91%     40%
 0    1    1     95.01%      N/A         0%
 5    1    1    100.00%      28.12%     10%
10    1    1     95.01%      N/A         0%
15    1    1     92.87%      N/A         0%
20    1    1     92.87%      N/A         0%
 1    0    1     95.25%      N/A         0%
 1    5    1    100.00%      28.95%     60%
 1   10    1    100.00%      30.03%     60%
 1   15    1    100.00%      31.90%     70%
 1   20    1    100.00%      32.84%     50%
 1    1    0     90.26%      N/A         0%
 1    1    5    100.00%      30.94%     90%
 1    1   10    100.00%      30.94%    100%
 1    1   15    100.00%      35.89%    100%
 1    1   20    100.00%      35.89%     70%
VI. EXPERIMENTS
We conducted experiments to evaluate our proposed algorithm and the MPLD's ability. In the experiments, circuits
from the ISCAS'85 and ISCAS'89 benchmark circuit suites
were placed and routed on MPLDs with 15×30, 33×36,
63×60, and 93×90 MLUTs, where an MPLD with H×W
MLUTs consists of W vertical columns, each of which has H
vertically arranged MLUTs. The positions of the primary I/Os
of each circuit were given manually. Each of the coefficients
p, q, and r in the cost function of placement was set to 0, 1,
5, 10, 15, and 20. For each combination of coefficient
values, placement and routing were performed 10 times, and
the best solution was chosen. Note that our cost function
with q = 0 and r = 0 is equivalent to that of VPR [6].
was set to 4.
Table I shows the experimental results for the ISCAS'89
benchmark circuits. In the table, "routing R." represents the
ratio of the number of successfully routed nets to the
number of all nets. "MLUT U." (usage) represents the
ratio of the number of used MLUTs to the total number of MLUTs. Note that MLUTs
are also used as routing elements. The columns p, q, and r give
the coefficients in the cost function with which the best result
(the highest routing R., and then the lowest MLUT
usage) was achieved. "P&R success R." (ratio) represents the
ratio of the number of circuits successfully routed to the
total number of circuits. We observed that our method
VII. CONCLUSIONS
In this paper, we proposed a placement and routing algorithm for a new reconfigurable device, the MPLD, that considers
the overcrowding of logic gates and the interference of nets, and
evaluated its effectiveness and the MPLD's ability to realize a
circuit. The experimental results showed that our proposed
algorithm suppresses the overcrowding of logic gates, and
Table I
PLACEMENT AND ROUTING RESULT OF ISCAS'89
                                       our method             VPR equiv. (q = r = 0)
circuit   #gates  MLUTs    p   q   r   routing R.  MLUT U.    routing R.  MLUT U.
s27           10  15×30    1   5   0   100%         1.06%     100%         1.07%
s298         119  15×30    5   5   0   100%         8.43%     100%         8.93%
s349         161  15×30    1   5   0   100%        11.47%     100%        11.73%
s386         159  15×30    5   1   1   100%        17.16%     100%        17.38%
s420         196  15×30    1   1   5   100%        13.65%     100%        14.02%
s510         211  15×30   10  10  20   100%        41.89%      93.16%     N/A
s526n        194  15×30    1   1  10   100%        13.80%     100%        14.48%
s731         393  15×30    5   5  15   100%        31.71%      98.33%     N/A
s832         287  33×36   10   5  10   100%        21.48%      92.33%     N/A
s953         395  63×60   20   1  15   100%        12.05%      95.58%     N/A
s1238        508  63×60   15   5  15   100%        14.62%      92.88%     N/A
s1488        653  63×60    1   5   1   100%        17.19%      92.20%     N/A
s5378       2779  93×90    1   1   1   100%        17.97%      81.87%     N/A
s13207      7951  93×90    1  15  10   100%        27.48%      99.83%     N/A
s35932     16095  93×90    1  10   5    96.2%      N/A         92.49%     N/A
s38584     19253  93×90    1   5  10    96.4%      N/A         86.31%     N/A
P&R success R.                          87.5%                  37.5%
Figure 10.
[8] Synopsys, Inc., "RTL Synthesis & Test." [Online]. Available: http://www.synopsys.com/Tools/Implementation/RTLSynthesis/
[9] Altera Corporation, "Design Software." [Online]. Available: http://www.altera.com/products/software/
[10] Berkeley Logic Synthesis and Verification Group, "ABC: A System for Sequential Synthesis and Verification." [Online]. Available: http://www.eecs.berkeley.edu/~alanmi/abc, Release 70930.
[11] C. E. Cheng, "Accurate and Efficient Placement Routability Modeling," in Proc. IEEE/ACM ICCAD, Nov. 1994, pp. 690-695.
[12] H. Shirota, S. Shibatani, and M. Terai, "A New Rip-Up and Reroute Algorithm for Very Large Scale Gate Arrays," IEICE Trans. on Fundamentals, vol. E80-A, no. 3, pp. 506-513, March 1997.