EDA Environment For Evaluating A New Switch-Block-Free Reconfigurable Architecture (MPLD)

2011 International Conference on Reconfigurable Computing and FPGAs
Masatoshi NAKAMURA, Masato INAGI,
Kazuya TANIGAWA, Tetsuo HIRONAKA
Graduate School of Information Sciences
Hiroshima City University
3-4-1 Ozukahigashi, Asaminami, Hiroshima, 731-3194 Japan
Email: {inagi, tanigawa, hironaka}@hiroshima-cu.ac.jp
Masayuki SATO, Takashi ISHIGURO

R & D Center
Taiyo Yuden Co., Ltd.
6-16-20 Ueno, Taito, Tokyo, 110-0005 Japan
Email: {m-sato, tad-ishiguro}@jty.yuden.co.jp
Abstract: In this study, we developed and implemented a placement and routing algorithm for a new switch-block-free fine-grain reconfigurable device, called MPLD, as an evaluation environment for its ability to realize sequential circuits. An MPLD consists of an array of multiple-output LUTs (MLUTs), which work as logic elements and/or routing elements, and has no switch blocks for routing, unlike FPGAs. Thus, when the logic cells of a circuit are placed on an MPLD, MLUTs need to be reserved for routing around the placed logic cells. Our simulated annealing-based placement algorithm for MPLDs avoids overcrowding logic cells and reserves routing space by considering (1) detailed estimated wire congestion and (2) the distance between logic cells in its cost function. In experiments, we confirmed that sequential circuits were successfully placed and routed on MPLDs in our evaluation environment.

Keywords: MPLD; FPGA; placement; routing; EDA tool.

I. INTRODUCTION

In recent years, programmable logic devices (PLDs) have been used in various fields, such as prototyping, high-performance computing, and networking. Field programmable gate arrays (FPGAs) [1], [2], which can realize complex sequential circuits, are the most widely used PLDs. An FPGA consists of look-up tables (LUTs) as logic elements and switch blocks (SBs) as routing elements, and LUTs and SBs are connected with wires (Fig. 1(a)). To realize a circuit on an FPGA, an LUT works as a logic cell, and SBs determine the connections among LUTs (logic cells). To our knowledge, however, the area for routing resources can reach more than 80% of the total area of an FPGA (e.g., [3], [4]), which degrades area efficiency. In other words, the number of logic elements per unit area is small in FPGAs. Since SBs account for a large portion of the area for routing resources, a new reconfigurable device, called memory-based PLD (MPLD¹), which has no SBs as routing elements, has been proposed [5] to improve area efficiency. An MPLD consists of an array of multiple-output look-up tables (MLUTs), each of which can be used as either a logic element or a routing element (Fig. 1(b)). With this structure, an MPLD aims to realize a circuit more efficiently than standard FPGAs.

Figure 1. FPGA fabric and MPLD fabric: (a) FPGA, built from LUTs, connection blocks, and switch blocks; (b) MPLD, built from MLUTs only.

To use an MPLD as a circuit, we need to configure the MLUTs on the MPLD. To generate configuration data, it is necessary to go through the same processes as for FPGAs. In other words, it is necessary to write a logic design of the target circuit in a hardware description language (HDL), to synthesize the netlist of the circuit, and to conduct placement and routing of the circuit on the device. Among these processes, placement and routing depend on the target device. However, no placement and routing algorithms for MPLDs have been reported yet.

Placement and routing techniques for FPGAs have been studied for more than two decades. VPR [6] is an example of an open-source placement and routing tool for FPGAs. In placement and routing, logic cells are allocated (i.e., placed) to LUTs, and they are connected by using wires and routing elements, such as SBs in FPGAs. Since the amount of routing elements is fixed in an FPGA, the degree of routing freedom is ensured by preparing plenty of routing resources in the device. On the other hand, in an MPLD, it is ensured by not distinguishing logic elements from routing elements. It is determined by the dispersion of placed logic cells, which act as obstacles. Since the existing placement and routing algorithms for FPGAs do not consider adjusting the degree of routing freedom, directly applying those algorithms to MPLDs overcrowds the placed logic cells and causes a shortage of routing elements between logic cells. As a result, the target circuit may fail to be mapped onto an MPLD.

In this study, we develop a placement and routing tool for MPLDs to construct an evaluation environment for MPLDs, and confirm that a sequential circuit can be automatically placed and routed on the device. Our placement algorithm for MPLDs is based on a meta-heuristic optimization algorithm called simulated annealing (SA) [7], which is often adopted by placement algorithms for LSIs including FPGAs. In order to prevent the logic cells from overcrowding and to reserve MLUTs for routing, our algorithm adopts a cost function that considers
1) detailed estimated wire congestion and
2) distance between logic cells as obstacles
in addition to the estimated total wire length considered in VPR. To confirm the ability of MPLDs to realize a sequential circuit, we perform some experiments using the evaluation environment.

The rest of this paper is organized as follows. In Section II, the basic structure of MPLD is shown. Section III introduces the entire process of our EDA tool for MPLDs. In Sections IV and V, our placement and routing algorithms are presented. Section VI shows the evaluation results. Finally, Section VII concludes this paper.

¹ MPLD™ is a trademark of TAIYO YUDEN CO., LTD.
978-0-7695-4551-6/11 $26.00 © 2011 IEEE
DOI 10.1109/ReConFig.2011.31
II. STRUCTURE OF MPLD

The most important characteristic of an MPLD is that there is no difference between logic elements and routing elements. As shown in Fig. 1, an MPLD consists of only MLUTs, each of which can be used as either a logic element or a routing element. An MLUT with N input and N output terminals is realized by a $2^N \times N$-bit memory module (Fig. 2(a)). Fig. 3 illustrates an example of an MLUT working as both a logic element and a routing element. We call a pair consisting of one bit of its address line and one bit of its data line an AD pair (Fig. 2(b)).

Figure 2. Basic structures of MLUT and MPLD: (a) MLUT; (b) AD pair (addr & data pair); (c) basic structure of MPLD.

Figure 3. Two segments of nets and one logic cell realized by an MLUT: (a) 4-input/4-output MLUT (a 64-bit SRAM with addr[3:0] and data[3:0]); (b) truth table (memory data), reproduced below, where "-" marks an unused output bit.

| addr [3][2][1][0] | data [3][2][1][0] |
|---|---|
| 0 0 0 0 | 0 0 - 0 |
| 0 0 0 1 | 0 0 - 0 |
| 0 0 1 0 | 0 0 - 0 |
| 0 0 1 1 | 0 0 - 1 |
| 0 1 0 0 | 1 0 - 0 |
| 0 1 0 1 | 1 0 - 0 |
| 0 1 1 0 | 1 0 - 0 |
| 0 1 1 1 | 1 0 - 1 |
| 1 0 0 0 | 0 1 - 0 |
| 1 0 0 1 | 0 1 - 0 |
| 1 0 1 0 | 0 1 - 0 |
| 1 0 1 1 | 0 1 - 1 |
| 1 1 0 0 | 1 1 - 0 |
| 1 1 0 1 | 1 1 - 0 |
| 1 1 1 0 | 1 1 - 0 |
| 1 1 1 1 | 1 1 - 1 |

As shown in Fig. 2(c), an MPLD basically consists of MLUTs diagonally connected by AD pairs. These AD pairs, which connect diagonally adjacent MLUTs, are called adjacent lines. MPLDs also have AD pairs for connecting distant MLUTs, called distant lines, as shown in Fig. 4. Some distant lines are used to connect MLUTs and other elements, such as flip-flops and I/O pads. The best structure for distant lines is still under investigation.

We assume that the MPLDs discussed in this paper consist of MLUTs with 7 AD pairs and have the structure shown in Fig. 4.

Figure 4. Adjacent lines and distant lines. The legend distinguishes adjacent lines, short vertical distant lines, and long distant lines skipping four MLUTs, and marks MLUTs with a flip-flop, with a long vertical distant line, with a long 45-degree distant line, and with a long 135-degree distant line within a tile of MLUTs.

III. FRAMEWORK OF OUR EVALUATION ENVIRONMENT

Our evaluation environment for MPLDs is implemented as an EDA tool for MPLDs.

First, in logic synthesis, an architecture-independent gate-level netlist (Fig. 5(b)) is synthesized from the register transfer level (RTL) design (Fig. 5(a)) of the target circuit written in an HDL. This is done with a logic synthesizer, such as Design Compiler [8]. In our experiments, Altera Quartus II [9] was used for logic synthesis.

Next, in technology mapping, the gate-level netlist is converted to a cell-level netlist (Fig. 5(c)), making cells by clustering gates. Each cell is represented by a look-up table, and must be smaller than an MLUT in terms of the number of inputs and the number of outputs. In the current version of our EDA tool, this is done by the technology mapping tool proposed in [10], which is also used for FPGAs.

After technology mapping, the cells in the cell-level netlist are placed in the target MPLD (Fig. 5(d)). An MLUT can be shared by several cells as long as their total size is less than that of the MLUT. After the placement of cells, the route of each net, which electrically connects cells, is decided in the routing process.

Finally, the bitstream data is generated as the configuration data of the target MPLD.

Figure 5. Data conversion: (a) RTL; (b) Boolean netlist; (c) logic cells; (d) placement result; (e) routing result. The RTL of panel (a) is:

    module decoder(a,b,c,d,z);
      input a,b,c,d;
      output z;
      assign z = (a&~b)&(c|d);
    endmodule

IV. PLACEMENT ALGORITHM

Placement for MPLDs is the process of allocating cells to MLUTs. Our proposed placement algorithm is based on simulated annealing (SA) [7], which is often used for circuit placement. In this section, we explain how to fit SA to the placement process for MPLDs.

A. Simulated Annealing (SA)

The flow of SA is shown in Fig. 6. SA generates a neighbor solution from the current solution, compares them by using a cost function, and decides whether or not to move to the neighbor solution. SA searches for the optimal solution by repeating this. The feature of this method is its use of the temperature parameter T. When deciding whether to move, if the cost of the neighbor solution is higher (i.e., worse) than that of the current solution, the move is accepted with a probability controlled by T. The higher T is, the higher the accept ratio is. T is decreased by the cooling process after enough neighbor solutions have been generated at the current temperature. At the beginning of SA, the solution space is widely searched. As the temperature T decreases, the searched area in the solution space converges. Then, at the end of SA, the area near the optimal solution is intensively searched. This method reduces the possibility of being trapped near a local optimum by stochastically accepting worse solutions. SA is terminated when T reaches the termination temperature Tend.

The number of loops M refers to the number of neighbor solutions generated at the same temperature. In our implementation, the initial temperature T0 is set, by binary search, to the temperature at which the accept ratio is about 90%. In our proposed placement method, the number of loops M0 at the initial temperature is defined as $M_0 = 10 \times N_{net}^{1.33}$, where $N_{net}$ is the number of nets of the target circuit. After M neighbor solution generations, T is cooled.

In the cooling process, T and M are updated as follows: $T_{k+1} = \alpha T_k$ and $M_{k+1} = \beta M_k$. In our algorithm, $\alpha$ and $\beta$ are set to 0.9 and 1, respectively.

    s = s0;                  // s: current sol., s0: initial sol.
    sb = s;                  // sb: best sol.
    c = cost(s);             // c: current cost, cost(s): cost of s
    T = T0;                  // T: current temp., T0: initial temp.
    M = M0;                  // M: number of loops, M0: initial M
    while (T > Tend) {                     // repeat until the terminal temp.
      loop_cnt = 0;
      while (loop_cnt < M) {               // repeat neighbor generation
        sn = neighbor(s);                  // make a neighbor solution
        cn = cost(sn);                     // calculate the cost of the neighbor
        if (rand() < e^{(c-cn)/T}) {       // accept check (comparing the costs)
          // update the current solution
          s = sn;
          c = cn;
          update_best_sol(sb, s);          // update the best solution if necessary
        }
        loop_cnt = loop_cnt + 1;
      }
      T = α*T;                             // cooling
      M = β*M;
    }
    return sb;

Figure 6. Pseudo code of simulated annealing

B. Neighbor function

Two things must be defined when using SA: a method to make neighbor solutions from the current solution and a method to compare them. Our placement algorithm adopts two ways to make a neighbor solution. One is the migration of a logic cell, and the other is the exchange of logic cells.
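As an illustration only, the SA loop of Fig. 6 can be sketched in Python with the two ingredients named above, a neighbor function and a cost function, passed in as parameters. The function names (`anneal`, `make_toy_problem`), the one-dimensional toy placement, and the fixed starting temperature are assumptions made for this sketch and are not part of the authors' tool; only the acceptance test, the cooling rule with factors 0.9 and 1, and the loop count of 10 times the number of nets to the power 1.33 follow the text. The toy migration move is also restricted to free slots, whereas an MLUT may hold several cells.

```python
import math
import random


def anneal(s0, cost, neighbor, t0, t_end, m0, alpha=0.9, beta=1.0, seed=0):
    """SA loop with the structure of Fig. 6; returns (best solution, best cost)."""
    rng = random.Random(seed)
    s, c = s0, cost(s0)
    best, best_cost = s, c
    t, m = t0, m0
    while t > t_end:                      # repeat until the terminal temperature
        for _ in range(int(m)):           # M neighbor generations per temperature
            sn = neighbor(s, rng)
            cn = cost(sn)
            # Better moves are always taken; worse ones with probability e^((c-cn)/T).
            if cn <= c or rng.random() < math.exp((c - cn) / t):
                s, c = sn, cn
                if c < best_cost:
                    best, best_cost = s, c
        t *= alpha                        # cooling: T_{k+1} = alpha * T_k
        m *= beta                         # M_{k+1} = beta * M_k
    return best, best_cost


def make_toy_problem(num_slots, nets):
    """Hypothetical 1-D placement: cell i sits in slot pos[i]; nets are cell pairs."""
    def cost(pos):
        return sum(abs(pos[a] - pos[b]) for a, b in nets)

    def neighbor(pos, rng):
        pos = list(pos)
        free = [x for x in range(num_slots) if x not in pos]
        if free and rng.random() < 0.5:   # migration: move one cell to a free slot
            pos[rng.randrange(len(pos))] = rng.choice(free)
        else:                             # exchange: swap the slots of two cells
            i, j = rng.sample(range(len(pos)), 2)
            pos[i], pos[j] = pos[j], pos[i]
        return tuple(pos)

    return cost, neighbor


nets = [(0, 1), (1, 2), (2, 3)]
cost, neighbor = make_toy_problem(num_slots=8, nets=nets)
start = (0, 7, 1, 6)
m0 = int(10 * len(nets) ** 1.33)          # M0 = 10 * Nnet^1.33
best, best_cost = anneal(start, cost, neighbor, t0=10.0, t_end=0.01, m0=m0)
```

The explicit `cn <= c` test gives the same decisions as the single comparison in Fig. 6, since the exponential exceeds 1 for improving moves, and it avoids evaluating a very large exponential at low temperatures.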
In the migration of a logic cell, a logic cell and an MLUT are selected at random, and the cell is placed at the MLUT. In this move, a maximum range of move is given to limit the migration distance. The migration distance of a cell is defined as the length of the shortest path without distant lines from the source MLUT to the destination MLUT of the cell. The length of a path is the number of MLUTs on the path. The maximum range of move, m, is set to the length of the longer (horizontal or vertical) edge of the target MPLD at first. Then, it is gradually decreased by the cooling function $m_{k+1} = \max(4, 0.9 m_k)$. By considering this range, a wider area of the solution space is searched at the beginning, and an intensive search is conducted at the end of the optimization. In the exchange of cells, two cells are selected at random, and then their positions (the MLUTs they belong to) are exchanged.

Note that the total number of input (output) signals of the cells in an MLUT must be less than the number of input (output) terminals of the MLUT. Infeasible neighbor solutions are canceled, and other solutions are generated until a feasible one is found.

C. Cost function

Since SA searches for the optimal solution by comparing the evaluation values of solutions, the cost function is one of the most important factors of SA. In our cost function, in addition to the total wire length and wire congestion, which are also used for FPGAs, a cell nearness penalty that reserves routing area is considered. When calculating these factors, the routing directions of signals are also considered.

Our proposed cost function for MPLDs is defined as

$$\mathrm{Cost} = p \cdot \mathrm{length} + q \cdot \mathrm{congestion} + r \cdot \mathrm{nearness}, \tag{1}$$

where length, congestion, and nearness are the factors defined in the following itemization, and p, q, and r are user-defined coefficients to balance the factors. To ease the calculation, distant lines are not considered in the cost function.

1) Length: Length in Expression (1) represents the estimated total wire length. Its role is to reduce wire length. It is formulated as shown in Expression (2).

$$\mathrm{length} = \sum_{n=1}^{N_{net}} q(n)\{bb_x(n) + bb_y(n)\}, \tag{2}$$

where $q(n)$ is the weight for the net depending on the number of cells to which the net connects [11], and $bb_x(n) + bb_y(n)$ is the half perimeter of the bounding box of all the cells (MLUTs) to which the net connects. The bounding box of cells (MLUTs) refers to the minimum rectangle covering all the cells (MLUTs). $bb_x(n)$ ($bb_y(n)$) is the length of the bounding box along the x-axis (y-axis). Note that, since our target MPLD can be considered as a diagonal grid of MLUTs, we define bounding boxes and the x- and y-axes diagonally, as shown in Fig. 7. MLUT(k, l) refers to the MLUT at point (k, l) in the diagonal Cartesian coordinate system.

2) Congestion: Congestion in Expression (1) represents the estimated congestion of the routes of nets. When the estimated routes of nets are congested in a specific area, the factor becomes high. It reduces the overcrowding of nets. In the estimation, we assume that each net is routed within the area of its bounding box without any detour, allowing routes to overlap. Note that the bounding boxes of nets can overlap each other. This means that nets may interfere with each other, and the possibility increases if many bounding boxes overlap in a specific area. Therefore, the area in which nets concentrate is roughly estimated from the number of overlapping bounding boxes of nets.

In addition, we consider the direction of routes, because only nets whose routes have the same direction can interfere with each other in the MPLD architecture. Thus, the congestion level is calculated separately for each wire of an AD pair, and the congestion level of an AD pair is defined as the sum of the congestion levels of its wires, as shown in Fig. 8. The congestion of a placement solution is defined as Expression (3).

$$\mathrm{congestion} = \sum_{\mathrm{MLUT}(k,l)} \{(g_{x}(k,l))^2 + (g_{-x}(k,l))^2 + (g_{y}(k,l))^2 + (g_{-y}(k,l))^2\}, \tag{3}$$

where $g_{x}(k,l)$ is the congestion level of the wire from MLUT(k, l) in the x direction. Similarly, $g_{-x}(k,l)$, $g_{y}(k,l)$, and $g_{-y}(k,l)$ are the congestion levels of the wires from MLUT(k, l) in the -x, y, and -y directions, respectively. The congestion level of a wire of an AD pair is the sum of all the congestion levels given by the bounding boxes covering the wire. The congestion level given by a bounding box has a direction. Suppose MLUT(u, v) is the signal source of a net. The congestion level of the bounding box is added to the wire from MLUT(k, l) in the x direction (-x direction) if k > u (k < u). The congestion levels in the y direction and the -y direction are added in the same way.

The congestion level in the x direction (y direction) of the bounding box of net n is calculated by Expression (4) (Expression (5)).

$$c_x(n) = \frac{1}{bb_y(n) + 1}, \tag{4}$$

$$c_y(n) = \frac{1}{bb_x(n) + 1}. \tag{5}$$

The congestion levels of the congested areas are emphasized by squaring $g_{x}$, $g_{-x}$, $g_{y}$, and $g_{-y}$ in Expression (3).

Figure 7. Bounding box and wire length. The panel marks the source, destination, and other MLUTs of a net, its wire, and the bounding box with edge lengths bbx(n) and bby(n).
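The length and congestion factors can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the authors' implementation: the names `bounding_box`, `length_cost`, and `congestion_cost` are hypothetical, ordinary integer grid coordinates stand in for the diagonal coordinate system, the edge length of a bounding box is taken as maximum minus minimum coordinate, the first cell of each net is treated as its signal source, and the net weight of [11] is replaced by a constant 1. Each bounding box contributes 1/(bby + 1) to the x-direction and -x-direction wires and 1/(bbx + 1) to the y-direction and -y-direction wires of the MLUTs it covers, and the per-wire levels are squared and summed.

```python
from collections import defaultdict


def bounding_box(cells):
    """Return (kmin, kmax, lmin, lmax) of a net given as a list of (k, l) cells."""
    ks = [k for k, _ in cells]
    ls = [l for _, l in cells]
    return min(ks), max(ks), min(ls), max(ls)


def length_cost(nets, weight=lambda num_cells: 1.0):
    """Weighted half-perimeter wire length; weight() is a stand-in for q(n)."""
    total = 0.0
    for cells in nets:
        k0, k1, l0, l1 = bounding_box(cells)
        total += weight(len(cells)) * ((k1 - k0) + (l1 - l0))
    return total


def congestion_cost(nets):
    """Sum of squared directional congestion levels over all covered MLUTs."""
    g = defaultdict(float)                # (k, l, direction) -> congestion level
    for cells in nets:
        u, v = cells[0]                   # assumption: first cell is the source
        k0, k1, l0, l1 = bounding_box(cells)
        bbx, bby = k1 - k0, l1 - l0
        cx, cy = 1.0 / (bby + 1), 1.0 / (bbx + 1)
        for k in range(k0, k1 + 1):
            for l in range(l0, l1 + 1):
                if k > u:
                    g[(k, l, "x")] += cx
                if k < u:
                    g[(k, l, "-x")] += cx
                if l > v:
                    g[(k, l, "y")] += cy
                if l < v:
                    g[(k, l, "-y")] += cy
    return sum(level ** 2 for level in g.values())


net = [(0, 0), (2, 1)]                    # source at (0, 0), one sink at (2, 1)
single = congestion_cost([net])
stacked = congestion_cost([net, net])     # two nets with identical bounding boxes
```

Because the levels are squared after accumulation, two nets stacked on the same bounding box cost four times as much as one, which is twice what the same two nets would cost if their bounding boxes did not overlap.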
For example, in the case that the bounding boxes of nets i and j overlap each other as shown in Fig. 8, the congestion levels given by the bounding boxes of nets i and j are as follows.

$$c_x(i) = 0.25, \quad c_y(i) = 0.17, \qquad c_x(j) = 0.17, \quad c_y(j) = 0.33$$

Figure 8. Example of calculation of congestion. The panel shows the sources and destinations of nets i and j and the resulting congestion levels of the AD pairs around MLUT(k, l); one annotated value is 0.42 (= 0.25 + 0.17).

3) Nearness: Unlike FPGAs, an MPLD uses MLUTs as both logic elements and routing elements. Thus, MLUTs for routing are necessary between MLUTs containing logic cells. However, if the logic cells are overcrowded in a specific area, it becomes impossible to secure enough routing elements for connection. To resolve this problem, our cost function introduces the nearness between logic cells as part of the cost.

Nearness in Expression (1) represents how close logic cells are to each other. The cost of nearness of a pair of logic cells a and b is defined as

$$p_{nearness}(a, b) = \begin{cases} \gamma - d(a, b) & (0 < d(a, b) \le \gamma) \\ 0 & (\text{otherwise}), \end{cases}$$

where $\gamma$ is a user-defined constant corresponding to the number of tracks in each routing channel in an FPGA, and $d(a, b)$ is the distance (i.e., the shortest path length in the MPLD) between logic cells a and b. Then, the nearness of the placement solution is defined as

$$\mathrm{nearness} = \sum_{a, b \in V_c} p_{nearness}(a, b),$$

where $V_c$ is the set of the logic cells of the cell-level netlist. It works as a repulsive force between cells to secure MLUTs for routing (Fig. 9).

Figure 9. Effect of nearness: (a) without nearness, where occupied MLUTs act as obstacles and a net between cells cannot be routed; (b) with nearness.

V. ROUTING ALGORITHM

The routing process decides the paths of nets among placed logic cells. Our detailed routing algorithm is based on Dijkstra's shortest path algorithm. Also, our rip-up and re-routing algorithm is based on [12]. When using these algorithms, we consider wire congestion. The details are described in the following.

A. Flow of Routing Process

Our routing algorithm consists of three steps: pre-routing, actual routing, and rip-up and re-routing. In pre-routing, each net is routed ignoring the previously routed nets. This is done to estimate wire congestion, and the estimated congestion is used in actual routing. The nets routed in pre-routing are removed before actual routing. In actual routing, each net is routed considering the estimated congestion and the previously routed nets. If it fails to route all the nets, rip-up and re-routing is conducted. In this step, nets passing through a congested area are removed and re-routed.

In this process, an MPLD is modeled as a directed graph G(V, E), where V is a set of the vertices corresponding to MLUTs, and E is a set of the edges corresponding to wires of AD pairs. Note that distant lines are also considered in G(V, E).

B. Pre-routing

In pre-routing, nets are routed one by one, ignoring the previously routed nets, by using Dijkstra's shortest path algorithm. That is, nets can overlap each other in this step. A multi-terminal net is decomposed into two-terminal nets when the net is routed. In this step, each edge has the same unit length.

C. Actual Routing

According to the result of pre-routing, the estimated congestion of each edge is defined as the number of nets passing through the edge in pre-routing. In this step, each edge has an initial length defined by the estimated congestion of the edge. In actual routing, nets are routed one by one. Thus, the length of an edge is updated to infinity when the edge is used. For each net, a shortest path is found as its route by using Dijkstra's algorithm.

D. Rip-up and Re-routing

If not all the nets can be routed in actual routing, rip-up and re-routing is conducted by using a simplified version of the algorithm of [12]. It consists of three phases: routing with violations, local rip-up and re-routing, and global rip-up and re-routing.

Routing with violations is almost the same as our actual routing. However, the length of each edge is updated not to infinity but to a large value when the edge is used. In local and global rip-up and re-routing, the routing area is considered as an array of square regions. The local window of a square region sr refers to the 3x3 square regions centered on sr.

In local rip-up and re-routing, first, a square region with many violations is selected. Then, nets passing through the region are removed and re-routed in the local window of the region in the order of their costs given by a cost function, until all the nets passing through the region are processed or all the violations in the region are removed. At most Lrr square regions with violations are processed in local rip-up and re-routing. We set Lrr to 500 in our experiments.

In global rip-up and re-routing, a net passing through a congested region is selected, and re-routing is performed. Global rip-up and re-routing is repeatedly performed until all the violations are removed or it has been performed Grr times. We set Grr to 50 in our experiments.

If violations remain in the final routing solution, the routing process has failed. For more details, please refer to [12].
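The first two routing steps of this section, pre-routing and actual routing, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' router: the helper names (`grid`, `dijkstra`, `route`) are hypothetical, a small 4-neighbor grid stands in for the directed MPLD graph G(V, E) with its diagonal AD pairs and distant lines, every net is already a two-terminal (source, destination) pair, and the initial edge length in actual routing is assumed to be 1 plus the number of pre-routed nets on the edge, since the text does not give the exact function. Rip-up and re-routing is left out.

```python
import heapq
import math
from collections import Counter


def grid(width, height):
    """Directed 4-neighbor grid graph used as a stand-in for G(V, E)."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y): [(x + dx, y + dy) for dx, dy in steps
                     if 0 <= x + dx < width and 0 <= y + dy < height]
            for x in range(width) for y in range(height)}


def dijkstra(adj, length, src, dst):
    """Shortest src->dst path as a list of directed edges, or None if unreachable."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue                       # stale heap entry
        for v in adj[u]:
            nd = d + length(u, v)
            if nd < dist.get(v, math.inf): # infinite-length edges never relax
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]


def route(adj, nets):
    """Pre-routing to estimate congestion, then actual routing net by net."""
    usage = Counter()                      # nets per edge in pre-routing
    for src, dst in nets:
        for edge in dijkstra(adj, lambda u, v: 1.0, src, dst) or []:
            usage[edge] += 1
    # Assumed initial length: 1 + estimated congestion of the edge.
    length = {(u, v): 1.0 + usage[(u, v)] for u in adj for v in adj[u]}
    routes = {}
    for net in nets:
        path = dijkstra(adj, lambda u, v: length[(u, v)], *net)
        routes[net] = path
        for edge in path or []:
            length[edge] = math.inf        # a used edge is closed to later nets
    return routes


adj = grid(3, 3)
nets = [((0, 0), (2, 0)), ((0, 1), (2, 1))]
routes = route(adj, nets)
```

A net whose entry in `routes` is `None` could not be routed, which is the situation the rip-up and re-routing step of Section V-D is meant to resolve.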
VI. EXPERIMENTS

We conducted experiments to evaluate our proposed algorithm and the ability of MPLDs. In the experiments, circuits from the ISCAS85 and ISCAS89 benchmark circuit suites were placed and routed on MPLDs with 15×30, 33×36, 63×60, and 93×90 MLUTs, where an MPLD with H×W MLUTs consists of W vertical columns, each of which has H vertically arranged MLUTs. The positions of the primary I/Os of each circuit were manually given. Each of the coefficients p, q, and r in the cost function of placement was set to 0, 1, 5, 10, 15, and 20. For each combination of the coefficient values, placement and routing were performed 10 times, and the best solution was chosen. Note that our cost function with q = 0 and r = 0 is equivalent to that of VPR [6]. The user-defined constant $\gamma$ of the nearness factor was set to 4.

Table I shows the experimental results for the ISCAS89 benchmark circuits. In the table, routing R. represents the ratio of the number of nets successfully routed to the number of all nets. MLUT U. (usage) represents the ratio of used MLUTs. Note that MLUTs are also used as routing elements. p, q, and r are the coefficients in the cost function with which the best result (the highest routing R., and then the lowest MLUT usage) is achieved. P&R success R. (ratio) represents the ratio of the number of circuits successfully routed to the total number of circuits. We observed that our method dramatically improved the P&R success ratio compared to VPR, from 37.5% with the VPR-equivalent cost function to 87.5%.

Moreover, according to Table I, when a circuit is small compared to the target MPLD, the best p tends to be large compared to the corresponding q and r, and vice versa. This is because there is enough room for routing when the circuit is small, and there is little room when the circuit is large. This implies that congestion and nearness work well to improve the placement when only a small number of MLUTs can be used for routing.

Table II shows the placement and routing results of c1908 from the ISCAS85 benchmark circuits for various patterns of p, q, and r. Circuit c1908 was mapped (i.e., placed and routed) on an MPLD with 33×36 MLUTs. Success R. refers to the success rate of mapping. Note that mapping was conducted 10 times for each pattern of the coefficients. With respect to the weight for wire length, p = 1 is the best. This is because a large p leads to overcrowding of logic cells to shorten the total wire length. Next, with respect to the weight for wire congestion, q = 1 was the best, though it did not affect the results much except when q = 0. With respect to the weight for nearness, a large r dramatically improved the success rate of mapping, though it slightly worsened the total wire length and, in turn, MLUT usage. This shows the high effectiveness of the nearness factor in placement for MPLDs.

Fig. 10 illustrates the placement and routing result of a circuit with 98 gates, s208 from the ISCAS89 benchmark suite, on a 9×12 MPLD without long distant lines. Note that I/O pad assignment was given before placement.

Table II. Result of c1908 in each cost function setting

| p | q | r | routing R. | MLUT U. | success R. |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 100.00% | 28.91% | 40% |
| 0 | 1 | 1 | 95.01% | N/A | 0% |
| 5 | 1 | 1 | 100.00% | 28.12% | 10% |
| 10 | 1 | 1 | 95.01% | N/A | 0% |
| 15 | 1 | 1 | 92.87% | N/A | 0% |
| 20 | 1 | 1 | 92.87% | N/A | 0% |
| 1 | 0 | 1 | 95.25% | N/A | 0% |
| 1 | 5 | 1 | 100.00% | 28.95% | 60% |
| 1 | 10 | 1 | 100.00% | 30.03% | 60% |
| 1 | 15 | 1 | 100.00% | 31.90% | 70% |
| 1 | 20 | 1 | 100.00% | 32.84% | 50% |
| 1 | 1 | 0 | 90.26% | N/A | 0% |
| 1 | 1 | 5 | 100.00% | 30.94% | 90% |
| 1 | 1 | 10 | 100.00% | 30.94% | 100% |
| 1 | 1 | 15 | 100.00% | 35.89% | 100% |
| 1 | 1 | 20 | 100.00% | 35.89% | 70% |
Table I. Placement and routing result of ISCAS89

| circuit | #gates | MLUTs | p | q | r | our method: routing R. | our method: MLUT U. | VPR equiv. (q = r = 0): routing R. | VPR equiv. (q = r = 0): MLUT U. |
|---|---|---|---|---|---|---|---|---|---|
| s27 | 10 | 15×30 | 1 | 5 | 0 | 100% | 1.06% | 100% | 1.07% |
| s298 | 119 | 15×30 | 5 | 5 | 0 | 100% | 8.43% | 100% | 8.93% |
| s349 | 161 | 15×30 | 1 | 5 | 0 | 100% | 11.47% | 100% | 11.73% |
| s386 | 159 | 15×30 | 5 | 1 | 1 | 100% | 17.16% | 100% | 17.38% |
| s420 | 196 | 15×30 | 1 | 1 | 5 | 100% | 13.65% | 100% | 14.02% |
| s510 | 211 | 15×30 | 10 | 10 | 20 | 100% | 41.89% | 93.16% | N/A |
| s526n | 194 | 15×30 | 1 | 1 | 10 | 100% | 13.80% | 100% | 14.48% |
| s731 | 393 | 15×30 | 5 | 5 | 15 | 100% | 31.71% | 98.33% | N/A |
| s832 | 287 | 33×36 | 10 | 5 | 10 | 100% | 21.48% | 92.33% | N/A |
| s953 | 395 | 63×60 | 20 | 1 | 15 | 100% | 12.05% | 95.58% | N/A |
| s1238 | 508 | 63×60 | 15 | 5 | 15 | 100% | 14.62% | 92.88% | N/A |
| s1488 | 653 | 63×60 | 1 | 5 | 1 | 100% | 17.19% | 92.20% | N/A |
| s5378 | 2779 | 93×90 | 1 | 1 | 1 | 100% | 17.97% | 81.87% | N/A |
| s13207 | 7951 | 93×90 | 1 | 15 | 10 | 100% | 27.48% | 99.83% | N/A |
| s35932 | 16095 | 93×90 | 1 | 10 | 5 | 96.2% | N/A | 92.49% | N/A |
| s38584 | 19253 | 93×90 | 1 | 5 | 10 | 96.4% | N/A | 86.31% | N/A |

P&R success R.: 87.5% (our method), 37.5% (VPR equiv.)

Figure 10. Placement and routing result of s208

VII. CONCLUSIONS

In this paper, we proposed a placement and routing algorithm for a new reconfigurable device, MPLD, considering the overcrowding of logic gates and the interference of nets, and evaluated its effectiveness and the ability of MPLDs to realize a circuit. The experimental results showed that our proposed algorithm suppresses the overcrowding of logic gates, and then the interference of nets. As a result, circuits were successfully implemented on MPLDs.

We are trying to put the device into commercial use in the near future. Prototype chips of MPLD have been fabricated, and we have confirmed that our placement and routing algorithm can drive the chips.

Our future work includes further evaluation of MPLDs, improvement and enhancement of the placement and routing algorithm for MPLDs (e.g., consideration of timing, introduction of a clustering technique to handle large circuits), and automatic fitting of parameters p, q, and r in our cost function.

REFERENCES

[1] Altera Corporation, Altera FPGAs. [Online]. Available: http://www.altera.com/devices/fpga/
[2] Xilinx Incorporation, FPGA Families. [Online]. Available: http://www.xilinx.com/products/silicon-devices/fpga/
[3] A. Singh and M. Marek-Sadowska, "FPGA Interconnect Planning," in Proc. the 2002 Int. Workshop on System-level Interconnect Prediction, 2002, pp. 23-30.
[4] Z. Marrakchi, H. Mrabet, U. Farooq, and H. Mehrez, "FPGA Interconnect Topologies Exploration," Int. Journal of Reconfigurable Computing, vol. 2009, pp. 1-13, 2009.
[5] N. Hirakawa, M. Yoshihara, K. Tanigawa, T. Hironaka, and M. Sato, "A PLD Architecture for High Performance Computing," in Proc. the 2008 Int. Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems, 2008, pp. 35-42.
[6] V. Betz and J. Rose, "VPR: A New Packing, Placement and Routing Tool for FPGA Research," in Proc. IEEE FPL 1997, 1997, pp. 213-222.
[7] P. J. M. van Laarhoven and E. H. L. Aarts, Simulated Annealing: Theory and Applications, D. Reidel, Dordrecht, The Netherlands, 1987.
[8] Synopsys, Inc., RTL Synthesis & Test. [Online]. Available: http://www.synopsys.com/Tools/Implementation/RTLSynthesis/
[9] Altera Corporation, Design Software. [Online]. Available: http://www.altera.com/products/software/
[10] Berkeley Logic Synthesis and Verification Group, ABC: A System for Sequential Synthesis and Verification. [Online]. Available: http://www.eecs.berkeley.edu/~alanmi/abc, Release 70930.
[11] C. E. Cheng, "Accurate and Efficient Placement Routability Modeling," in Proc. IEEE/ACM ICCAD, Nov. 1994, pp. 690-695.
[12] H. Shirota, S. Shibatani, and M. Terai, "A New Rip-Up and Reroute Algorithm for Very Large Scale Gate Arrays," IEICE Trans. on Fundamentals, vol. E80-A, no. 3, pp. 506-513, March 1997.