6, December 2013
Received August 15, 2013; Revised October 31, 2013; Accepted November 8, 2013; Published December 19,
2013
Abstract: As processing nodes scale up, traditional electronic networks struggle to supply
on-chip communication efficiently because of unacceptable latency, power, and area consumption.
Alternative interconnects, such as radio frequency interconnect (RF-I) and optical interconnect,
have been explored as interconnection backbones. Hybrid hierarchical architectures that combine
traditional interconnects with emerging interconnects have been widely adopted to achieve a good
trade-off between latency and power. Compared with other alternative interconnects, the hybrid
hierarchical architecture with a wireless/RF-I backbone is more cost-efficient and feasible because
of its compatibility with complementary metal oxide semiconductor technology, and it has become one
of the mainstream designs for chip multi-processor systems. However, how to utilize the
wireless/RF-I backbone efficiently is a new challenge for designers. Based on an analysis of existing
typical hybrid hierarchical wireless/RF-I architectures (HHWAs), the key problems in the design of
HHWAs are identified here, and related potential solutions are provided. In particular, strategies for
resource management of wireless/RF-I are explored in detail, and different solutions are discussed.
This work is expected to serve as a basis for future HHWA designs.
Keywords: Network-on-chip, radio frequency interconnect, wireless interconnect
This research was supported by the Beijing Municipal Natural Science Foundation (No.4122010, 2012.1 - 2014.12).
DOI: 10.6029/smartcr.2013.06.004
Xiao et al.: A Tutorial for Key Problems in the Design of Hybrid Hierarchical NoC Architectures with Wireless/RF
Introduction
As we enter the era of multiple cores and beyond, the number of cores, coprocessors, and on-chip accelerators grows
rapidly. The dramatic increase in these processing elements (PEs) imposes a tremendous challenge for on-chip
communication, which demands not only high performance, including lower latency and higher bandwidth, but also low
energy and area cost. According to the International Technology Roadmap for Semiconductors (ITRS) [1],
improving characteristics of metal wires will no longer satisfy performance requirements, and new interconnect paradigms
are needed. Different revolutionary approaches, such as optical interconnect [2][3], radio frequency interconnect (RF-I)
[4][5][6], and wireless interconnect with complementary metal oxide semiconductor (CMOS) ultra wide band (UWB)
technology [7][8], have been explored. But these emerging interconnects have associated antenna and transceiver area,
extra integrated components and power overheads, and thus need to be placed and used optimally to achieve the best
performance without undue overhead [9][10]. Although the traditional planar metal interconnects suffer from limitations
arising from multi-hop communication, which result in high latency and power consumption, they are still highly effective
and suitable for short distances. The vast improvements in CMOS technology have led to wires with only 0.18 pJ/bit of
energy consumption at 1 mm for a 32 nm technology design [11]. For these reasons, many
researchers have adopted hybrid hierarchical wireless/RF-I architectures (HHWAs) to obtain good trade-offs between latency
and power with limited extra cost [12][13][14][15][16]. An HHWA is characterized by local traditional wired interconnection
and global wireless/RF-I interconnection, and provides some unique benefits, including the following: (1) Instead of the
multi-hop paths of traditional interconnection, wireless/RF-I uses a single hop for long-distance communication, which reduces
power consumption while providing high bandwidth and low latency without excessive overhead. (2) By combining
traditional networks on a chip (NoCs) with emerging interconnects, an HHWA exploits the respective merits of both. (3)
Compared with optical interconnects in hybrid architectures, using wireless/RF-I as a global communication backbone
offers better feasibility and cost-efficiency because of its CMOS compatibility.
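Benefit (1) can be illustrated with a rough hop-count estimate. The following minimal sketch is not taken from any of the cited architectures: it assumes a 2-D mesh with dimension-order routing, square subnets of side `subnet_side`, and one wireless/RF-I access point placed at the lower-left router of each subnet. The function names and the placement rule are hypothetical.

```python
def mesh_hops(src, dst):
    """Wired hop count on a 2-D mesh with dimension-order routing (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def access_point(node, subnet_side):
    """ASSUMPTION: each subnet's access point is its lower-left router."""
    x, y = node
    return ((x // subnet_side) * subnet_side, (y // subnet_side) * subnet_side)


def hybrid_hops(src, dst, subnet_side):
    """Hop count when one wireless/RF-I hop may link two subnet access points."""
    ap_src = access_point(src, subnet_side)
    ap_dst = access_point(dst, subnet_side)
    wired = mesh_hops(src, dst)
    if ap_src == ap_dst:
        return wired  # same subnet: stay on the local wired network
    # wired hops to the local access point, one backbone hop, wired hops to dst
    via_backbone = mesh_hops(src, ap_src) + 1 + mesh_hops(ap_dst, dst)
    return min(wired, via_backbone)


if __name__ == "__main__":
    # Corner-to-corner traffic on a 16x16 mesh with 4x4 subnets.
    print(mesh_hops((0, 0), (15, 15)), hybrid_hops((0, 0), (15, 15), 4))
```

Under these assumptions the backbone pays off only for distant pairs; neighbouring routers in different subnets still take the wired path because `hybrid_hops` keeps the smaller of the two counts.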
Because the architecture combines emerging technologies with traditional interconnects, new design challenges arise that
might become bottlenecks to performance improvement. This work explores the key problems in HHWA design and provides
related potential solutions, and it is expected to serve as a basis for future HHWA designs. The rest
of the paper is organized as follows. In Section 2, we provide a brief overview of the new alternative interconnect
technologies (wireless and RF-I) and how they can be leveraged for on-chip communication. Based on the availability of
these two interconnect technologies, we discuss the topology of HHWAs and explore the existing typical HHWAs in
Section 3. Because wireless/RF-I resource management is so important in HHWAs, we survey and analyze in depth
the resource arbitration mechanisms in Section 4. In Section 5, we summarize the key problems in HHWA design and
provide related feasible solutions. Finally, we conclude our work in Section 6.
RF-I/Wireless
RF-I
Radio frequency interconnect has been proposed as a high-aggregate bandwidth, low-latency alternative to traditional
interconnect [4][5][19]. Its benefits have been demonstrated for off-chip, on-board communication, as well as for on-chip
interconnection networks [20][21][22].
Unlike conventional metallic wires that require charging and discharging the whole wire to signify either 0 or 1,
RF-I modulates information on an electromagnetic carrier wave that is continuously sent along the transmission line
(Figure 1). RF-I has been projected to scale better than traditional RC wires in terms of delay and power consumption; it
can allow signal transmission across a 400 mm² die in 0.3 ns via propagation at the effective speed of light [5], as opposed
to less than, or equal to, 4 ns on a repeated bus.
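The 0.3 ns figure can be sanity-checked with a time-of-flight estimate. The sketch below is only an order-of-magnitude check: the effective relative permittivity of 4 and the square 20 mm x 20 mm die with a 40 mm corner-to-corner Manhattan route are assumptions, not values from [5].

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum


def propagation_delay_ns(distance_mm, eps_eff=4.0):
    """Time of flight along a transmission line, in nanoseconds.

    eps_eff is an ASSUMED effective relative permittivity of the line;
    the signal travels at c / sqrt(eps_eff).
    """
    velocity = C_VACUUM_M_PER_S / eps_eff ** 0.5
    return (distance_mm * 1e-3) / velocity * 1e9


# ASSUMPTION: a 400 mm^2 die is a 20 mm x 20 mm square, so a
# corner-to-corner Manhattan route is 40 mm long.
delay_ns = propagation_delay_ns(40.0)

if __name__ == "__main__":
    print(f"{delay_ns:.3f} ns")
```

With these assumed values the estimate lands slightly below 0.3 ns, which is consistent with the figure quoted above.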
Instead of trying to aggressively expand baseband bandwidth (which often involves power-hungry compensation
techniques to achieve a flat channel frequency response), RF-I divides bandwidth into frequency domains, each becoming a
narrow-band signal, which saves power. By doing this, RF-I also improves bandwidth efficiency by sending many
simultaneous streams of data over a single transmission line. This particular technique is referred to as multi-band RF-I [6].
As shown in Figure 2, there are N mixers on the transmitting (Tx) side in multi-band RF-I, where N is the number of
senders sharing the transmission line. Each mixer up-converts an individual data stream into a specific channel (or frequency
band). On the receiving (Rx) side, N additional mixers are employed to down-convert each signal back to the original data,
and N low-pass filters (LPFs) are used to isolate the data from residual high-frequency components. Based on shortcut
selection, each transmitter or receiver in the topology is tuned to a particular frequency (or disabled entirely) to
implement the shortcuts [5][6].
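The up-conversion and down-conversion steps can be mimicked with a small numerical model. This is a toy sketch, not a circuit model of multi-band RF-I: mixing is modelled as multiplication by a cosine carrier, the low-pass filter as a sum over one bit period, and the carrier spacings and `SAMPLES_PER_BIT` are assumed values chosen so that the bands are orthogonal over a bit period.

```python
import math

SAMPLES_PER_BIT = 64  # ASSUMED oversampling factor


def carrier(cycles_per_bit, n):
    """Carrier sample n for a band with an integer number of cycles per bit."""
    return math.cos(2 * math.pi * cycles_per_bit * n / SAMPLES_PER_BIT)


def transmit(streams, bands):
    """Tx mixers: up-convert each bit stream to its band and sum on the shared line."""
    n_bits = len(streams[0])  # all streams are assumed to have equal length
    line = [0.0] * (n_bits * SAMPLES_PER_BIT)
    for bits, cycles in zip(streams, bands):
        for i, bit in enumerate(bits):
            level = 1.0 if bit else -1.0  # antipodal baseband signal
            for n in range(SAMPLES_PER_BIT):
                line[i * SAMPLES_PER_BIT + n] += level * carrier(cycles, n)
    return line


def receive(line, cycles):
    """Rx mixer plus crude low-pass filter: mix down, then sum over each bit period."""
    bits = []
    for start in range(0, len(line), SAMPLES_PER_BIT):
        acc = sum(line[start + n] * carrier(cycles, n) for n in range(SAMPLES_PER_BIT))
        bits.append(1 if acc > 0 else 0)
    return bits


if __name__ == "__main__":
    streams = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
    bands = [4, 8, 12]  # carrier cycles per bit period, one band per sender
    shared_line = transmit(streams, bands)
    print([receive(shared_line, b) for b in bands])
```

Because each carrier completes a whole number of cycles per bit period, the other bands sum to zero at every receiver, which is what lets several streams share one line in this model.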
[Figure legend: C = core; $ = L2 cache bank; RF-I transmission line; router]
Wireless
Unlike RF-I, wireless interconnection does not need a physically laid-out transmission channel; the
communication medium is free space [23]. Wireless communication can use different frequency ranges, from several
gigahertz to thousands of gigahertz [24].
The on-chip antenna is one of the most important components of an HHWA, and also one of the most difficult to integrate
on-chip, because passive devices such as inductors consume the dominant portion of the transceiver area. Fortunately,
as CMOS technology improves, both the size and the cost of the antenna and its required circuits will decrease
dramatically, which makes it feasible to integrate multiple on-chip antennas [12]. An example of the necessary
components of wireless transceivers for millimeter wave (mm-wave) links in a chip multiprocessor system is shown in
Figure 3. A metal zigzag antenna was demonstrated to support wireless network-on-a-chip (WiNoC) [25] and was used to
design an mm-wave wireless NoC by Deb et al. [26]. As the transmission frequency increased to the terahertz range, carbon
nanotubes (CNTs) were explored for the on-chip antenna [27], and the feasibility of designing a WiNoC was demonstrated
by Ganguly et al. [15]. Compared with RF-I, which needs the transmission line to span the entire chip area, communication
routing is not limited by the physical channel for wireless interconnection. However, wireless interconnection faces
interference challenges and cost problems, which are proportional to the communication distance.
[Figure: wireless transceiver components and core clusters. Labels: antenna, receiver side, transmitter side, switch, driver, amplifier, LNA, data to be transmitted, modulator, carrier frequency, serializer, demodulator, deserializer, data received, and cluster labels C0 to C3.]
[Figure: hierarchical topology. Labels: global/express network, local networks, shortcuts.]
baseline. WCube offers scalable performance in terms of latency and connectivity compared with other HHWAs, and the
architecture has proven cost-efficient with 1024 nodes.
Figure 7. A two-level WCube structure with a cluster of 16 base routers (i.e., 64 nodes)
[Figure labels: processing node, switch, hub, wireless links, WCube 0 to WCube 2, core, L2 cache, base router, wireless router.]
Unlike WCube, which uses a centralized wireless hub for each group of 64 nodes, the iWISE architecture gives
every router its own transmitter and receiver for each group of routers. As shown in Figure 8, the iWISE architecture
reduces the hop count by distributing these transceivers across the routers, as opposed to the centralized hub found in WCube
[13]. A token scheme is adopted for the wireless routers to share the limited bandwidth, while frequency division
multiplexing (FDM) and time division multiplexing (TDM) are introduced to avoid transmission interference.
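The idea of sharing a channel through a token can be sketched as follows. This is a generic round-robin token-passing model, not the iWISE protocol itself, whose details are not given here; the function name, the one-packet-per-slot rule, and the fixed token order are assumptions.

```python
from collections import deque


def token_arbitrate(queues, slots, start=0):
    """Simulate token passing on one shared wireless channel.

    queues: one deque of pending packets per wireless router.
    Only the current token holder may transmit (one packet per slot, by
    assumption); the token then moves to the next router in fixed order.
    Returns a list of (slot, router, packet) transmissions.
    """
    holder = start
    log = []
    for slot in range(slots):
        if queues[holder]:
            log.append((slot, holder, queues[holder].popleft()))
        # an idle holder wastes its slot: the channel stays unused
        holder = (holder + 1) % len(queues)
    return log


if __name__ == "__main__":
    pending = [deque(["a", "b"]), deque(), deque(["c"])]
    for entry in token_arbitrate(pending, slots=6):
        print(entry)
```

In this model a router with nothing to send still receives the token, so slots go unused when traffic is unevenly spread across routers.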
requirements for the access points. Similarly, another arbitration scheme is needed to decide which access point gains access to a
particular wireless medium (or RF-I channel) in a given period, because all wireless/RF-I access points can tune to the
same channel and can send data to, or receive data from, any other wireless/RF-I access point in the network. Therefore, two
important problems in wireless/RF-I resource management are (1) how to allocate the wireless/RF-I resource of a specific
access point among multiple transmission requests from the PEs (or the base routers in the local network), and (2) how to
allocate a specific wireless medium or RF-I channel among multiple wireless/RF-I access points in a given period.
The solutions to these two problems explored so far by different research groups can be broadly classified into three classes,
depending on the specific implementation of the HHWA.
[Figure labels: traditional wired link, wireless link, router, core, Set 0, Set 2, Set 3.]
arbitration, which allocates the channels in real time to communicating pairs on demand with low arbitration latency, power,
and hardware cost, faces a channel utilization problem and long arbitration latency with non-uniform communication.
However, modern and future CMPs tend not to exhibit this uniformity due to spatial communication heterogeneity. So
stream arbitration was proposed by Xiao et al. [31] as an efficient dynamic bandwidth utilization scheme that can deal with
both spatial and temporal communication heterogeneity. Unlike token arbitration, where channels are coupled to receivers,
a channel in stream arbitration can be used to send packets from any sender to any receiver, which efficiently addresses the
problem of spatial communication heterogeneity. Since stream arbitration is inherently a dynamic arbitration scheme, it
also efficiently handles temporal communication heterogeneity. Stream arbitration partitions the aggregate bandwidth into
arbitration channels and data channels. Active sources (nodes that want to send flits through wireless/RF-I) compete in the
arbitration channel for the data channels in order to talk to their desired destination nodes. Stream arbitration is a distributed
mechanism without a centralized arbiter, and it can be implemented independently and simply. In a case study with a modeled
RF-I network, stream arbitration proved to be an efficient resource arbitration scheme for emerging network
technologies.
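The split between an arbitration channel and data channels can be sketched in a few lines. This is a heavily simplified model and not the mechanism of [31]: it assumes that requests appear on the arbitration channel in a rotating node order, that every node observes the same request stream and therefore computes the same grants, and that a receiver accepts one sender per round. All names are hypothetical.

```python
def stream_arbitrate(requests, num_nodes, num_data_channels, first=0):
    """Grant data channels to active sources for one arbitration round.

    requests: dict mapping an active source node to its destination node.
    first: node whose request appears first in this round (rotating priority).
    Returns a dict mapping each granted source to (data_channel, destination).
    """
    grants = {}
    busy_receivers = set()
    next_channel = 0
    for offset in range(num_nodes):
        src = (first + offset) % num_nodes
        if src not in requests:
            continue  # inactive node: nothing on the arbitration channel
        if next_channel >= num_data_channels:
            break  # all data channels are taken for this round
        dst = requests[src]
        if dst in busy_receivers:
            continue  # ASSUMPTION: one sender per receiver per round
        grants[src] = (next_channel, dst)
        busy_receivers.add(dst)
        next_channel += 1
    return grants


if __name__ == "__main__":
    active = {0: 5, 1: 5, 2: 7, 6: 3}
    print(stream_arbitrate(active, num_nodes=8, num_data_channels=2))
    print(stream_arbitrate(active, num_nodes=8, num_data_channels=2, first=1))
```

Because a data channel is handed to whichever sender and receiver pair wins the round, no channel is tied to a fixed receiver, and no central arbiter is involved since each node can evaluate the same function locally.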
Routing
The routing strategy determines the path a packet takes from its source to its destination. Due to the different transmission
characteristics of RF-I/wireless compared with traditional wired interconnects, and the harsh requirements for on-chip
design of a hierarchical architecture, the routing mechanism in an HHWA should be simple and reliable, without incurring
too much power, area and latency overhead. We divide the routing mechanism into local routing and global routing according to
whether wireless/RF-I is used. Local routing depends on the topology of the subnets. For example, if the PEs within a subnet are
connected in a mesh, then data routing within the subnet follows dimension-order routing. Global routing concerns whether
and how to use the RF-I/wireless interconnects. Flow control, deadlock avoidance and RF-I/wireless resource management
strategy are key problems in global routing design. Kim et al. [23] and Deb et al. [24] analyzed the different strategies
adopted by existing HHWAs, and provide very good references and guidance for future HHWA designs. A comprehensive
study that quantifies the merits and limitations of the different strategies and their implementation challenges, with an
informative comparative analysis, still needs to be carried out [24].
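For the local-routing example, dimension-order (XY) routing on a mesh subnet can be sketched as below. The coordinate convention and the function name are assumptions made for illustration; the sketch covers only the wired local path, not the global wireless/RF-I decision.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst under XY routing.

    The packet first travels along the X dimension until the column
    matches, then along the Y dimension. Routers are (x, y) tuples.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Resolving the dimensions in a fixed order keeps the router logic simple, and on a mesh it is a standard way to avoid routing deadlock.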
Transmission reliability
Although wireless/RF-I performs well for long-distance transmission with high bandwidth, low latency and low energy
consumption, bit errors are a challenge to reliable message transmission. Within the maximum
communication distance of future CMPs, 1.5 cm, the bit-error rate (BER) of the on-chip wireless channel is less than 10⁻⁹,
which is far higher than that of RC wires. (Current RC wires have an extremely low BER of approximately 10⁻¹⁴ [12].)
Error control coding (ECC) was explored by Ganguly et al. [37], who showed that by implementing joint crosstalk avoidance
triple error correction and simultaneous quadruple error detection codes [38] in the wireline links and Hamming-code-based
product codes (H-PCs) in the wireless links of a hierarchical wireless NoC with CNT antennas, it is possible to
improve the overall reliability of the wireless NoC manifold. However, ECC introduces timing and area
overhead and also incurs a fixed overhead on every packet [12][15]. The WCube work devised a novel and simple loss
management solution that combines a zero-signaling-overhead scheme, overhearing-and-retransmission (OAR), based on
overhearing on intermediate hops, with an on-demand, checksum-based error-detection and retransmission scheme at
the last hop [12]. OAR detects and recovers packet losses without extra signaling overhead by using a buffer-based mechanism.
At the destination, the packet is verified against its checksum and is retransmitted if the checksum does not match. This
solution is simple and incurs less extra cost than ECC, but the forwarding sequence of packets must be preserved to ensure
correct transmission.
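Two parts of this discussion lend themselves to a short sketch: how a bit-error rate translates into a packet-error probability, and how a checksum check at the last hop can trigger retransmission. The code below is an illustration under assumptions, not the WCube implementation: bit errors are taken as independent, CRC-32 from Python's `zlib` stands in for the unspecified checksum, and the 512-bit packet size, the retry limit, and the `flaky_channel` helper are hypothetical.

```python
import math
import zlib


def packet_error_probability(ber, n_bits):
    """P(at least one bit error) = 1 - (1 - BER)^n, assuming independent errors."""
    return -math.expm1(n_bits * math.log1p(-ber))


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 (ASSUMED checksum) to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(packet: bytes) -> bool:
    """Recompute the checksum at the destination and compare."""
    payload, tag = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == tag


def send_with_retransmission(payload, channel, max_attempts=4):
    """Resend until the checksum matches; returns (payload, attempts used)."""
    packet = make_packet(payload)
    for attempt in range(1, max_attempts + 1):
        received = channel(packet)
        if verify(received):
            return received[:-4], attempt
    raise RuntimeError("packet not delivered within max_attempts")


def flaky_channel():
    """Hypothetical channel that flips one bit of the first packet only."""
    state = {"calls": 0}

    def channel(packet):
        state["calls"] += 1
        if state["calls"] == 1:
            corrupted = bytearray(packet)
            corrupted[0] ^= 0x01
            return bytes(corrupted)
        return packet

    return channel


if __name__ == "__main__":
    print(packet_error_probability(1e-9, 512), packet_error_probability(1e-14, 512))
    print(send_with_retransmission(b"flit-0", flaky_channel()))
```

For an assumed 512-bit packet, the two BER values quoted above differ by about five orders of magnitude in packet-error probability, which is why the wireless links need some form of error handling while the cost of a retransmission is paid only when a checksum actually fails.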
Scalability
To target future large-scale CMPs, scalability is one of the most important problems for the design of an on-chip hybrid
hierarchical architecture. Lee et al. [12] proposed the WCube recursive wireless interconnect structure, which offers
connectivity to thousands of cores in CMPs. A case study with a network consisting of 1024 PEs proved efficient with
WCube and demonstrated a reduced observed latency of 20% to 45% compared to current 2-D wired mesh designs. Since
future communication patterns tend towards the non-uniform and heterogeneous, Xiao et al. [31] proposed a cluster-based
hierarchical architecture that uses a local transmission line (TL) for each core cluster, and a global TL to connect the local TLs.
A network with 16x16 RF nodes for a 32x32 router NoC (each 2x2 group of routers shares one RF node) proved efficient in average
network latency and energy consumption with a hierarchical TL architecture and hierarchical stream arbitration, compared
with an architecture with a single TL spanning the whole chip [31]. A three-level architecture with traditional RC interconnects,
RF-I, and wireless links is another potential solution for scalability, and a detailed implementation needs to
be proposed in future designs.
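The sharing ratio in this configuration is simple arithmetic: a 32x32 router mesh in which every 2x2 group of routers shares one RF node yields 16x16 RF nodes. A minimal sketch of that mapping follows; the coordinate scheme and function names are assumptions, and the grouping of RF nodes onto local TLs is not modelled because it is not specified here.

```python
from collections import Counter


def rf_node_of(router, block=2):
    """Map a router (x, y) to the RF node shared by its block x block group."""
    x, y = router
    return (x // block, y // block)


def rf_node_load(mesh_side=32, block=2):
    """Count how many routers share each RF node on a mesh_side x mesh_side mesh."""
    return Counter(
        rf_node_of((x, y), block)
        for x in range(mesh_side)
        for y in range(mesh_side)
    )


if __name__ == "__main__":
    load = rf_node_load()
    print(len(load), set(load.values()))
```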
Conclusion
Because hybrid hierarchical wireless/RF-I architectures combine emerging interconnects with traditional ones, they raise
new design challenges. Based on an analysis of the existing typical HHWAs, we explored strategies for
wireless/RF-I resource management for the first time and discussed the strengths and weaknesses of different solutions.
The key problems in hybrid hierarchical wireless/RF-I architecture design are explored, and related potential solutions are
provided, which we expect to serve as a basis for future HHWA designs. The performance benefits of different HHWAs
need to be benchmarked quantitatively in future work, and physical implementations need to be investigated
in detail.
References
[1] International Technology Roadmap for Semiconductors (ITRS), 2012.
[2] A. Shacham, K. Bergman, L. P. Carloni, Photonic networks-on-chip for future generations of chip multiprocessors, IEEE Transactions on Computers, vol. 57, no. 9, pp. 1246-1260, 2008.
[3] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, J. H. Ahn, Corona: System Implications of Emerging Nanophotonic Technology, in Proc. of the 35th Annual International Symposium on Computer Architecture (ISCA08), Washington, DC, USA, pp. 153-164, 2008.
[4] M. F. Chang, I. Verbauwhede, C. Chien, Z. Xu, J. Kim, J. Ko, Q. Gu, B. Lai, Advanced RF/baseband interconnect schemes for inter- and intra-ULSI communications, IEEE Transactions on Electron Devices, vol. 52, no. 7, pp. 1271-1285, 2005.
[5] M. F. Chang, E. Socher, R. Tam, J. Cong, G. Reinman, RF interconnects for communications on-chip, in Proc. of the 2008 international symposium on Physical design (ISPD08), ACM New York, NY, pp. 78-83, 2008.
[6] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher, S.-W. Tam, CMP Network-on-Chip Overlaid with Multi-Band RF-Interconnect, in Proc. of the IEEE Int'l Symposium on High-Performance Computer Architecture (HPCA), Salt Lake City, UT, February, pp. 191-202, 2008.
[7] D. Zhao, Y. Wang, SD-MAC: Design and Synthesis of A Hardware-Efficient Collision-Free QoS-Aware MAC Protocol for Wireless Network-on-Chip, IEEE Transactions on Computers, vol. 57, no. 9, pp. 1230-1245, Sep. 2008.
[8] Y. Wang, D. Zhao, The Design and Synthesis of a Synchronous and Distributed MAC Protocol for Wireless Network-on-Chip, in Proc. IEEE Int'l Conf. Computer-Aided Design, Nov. 2007.
[9] S. Deb, K. Chang, et al., Design of an Efficient NoC Architecture using Millimeter-Wave Wireless Links, in Proc. of 13th Int'l Symposium on Quality Electronic Design, pp. 165-172, Mar. 2012.
[10] L. P. Carloni, P. Pande, Y. Xie, Networks-on-chip in emerging interconnect paradigms: Advantages and challenges, in Proc. of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, pp. 93-102, 2009.
[11] H. S. Wang, X. Zhu, L. S. Peh, S. Malik, Orion: A power-performance simulator for interconnection networks, in Proc. of the 35th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 294-305, Nov. 2002.
[12] S. B. Lee et al., A scalable micro wireless interconnect structure for CMPs, in Proc. ACM Annu. Int. Conf. Mobile Comput. Network. (MobiCom), pp. 20-25, 2009.
[13] D. D. Tomaso et al., iWise: Inter-router wireless scalable express channels for Network-on-Chips (NoCs) architecture, in Proc. Annu. Symp. High Performance Interconnects, pp. 11-18, 2011.
[14] W. J. Dally, Express cubes: Improving the performance of k-ary n-cube interconnection networks, IEEE Trans. Computers, vol. 40, no. 9, pp. 1016-1023, Sep. 1991.
[15] A. Ganguly, K. Chang, S. Deb, P. Pande, B. Belzer, C. Teuscher, Scalable hybrid wireless network-on-chip architectures for multicore systems, IEEE Trans. Computers, vol. 60, no. 10, pp. 1485-1502, Oct. 2011.
[16] M. F. Chang, J. Cong, A. Kaplan, C. Liu, M. Naik, J. Premkumar, G. Reinman, E. Socher, S.-W. Tam, Power reduction of CMP communication networks via RF-interconnects, in Proc. of the 41st annual IEEE/ACM International Symposium on Microarchitecture (MICRO 41), Washington, DC, USA, pp. 376-387, 2008.
[17] W. J. Dally, T. B, Principles and Practices of Interconnection Networks. Waltham, MA: Morgan Kaufmann, 2004.
[18] K. Chang, S. Deb, et al., Performance Evaluation and Design Trade-offs for Wireless Network-on-Chip Architecture, ACM Journal on Emerging Technologies in Computing Systems, vol. 8, no. 8, 2012.
[19] M. F. Chang, V. P. Roychowdhury, L. Zhang, H. Shin, Y. Qian, RF/wireless interconnect for inter- and intra-chip communications, Proceedings of the IEEE, vol. 89, no. 4, Apr. 2001.
[20] J. Ko, J. Kim, Z. Xu, Q. Gu, C. Chien, M. Chang, An RF/baseband FDMA-interconnect transceiver for reconfigurable multiple access chip-to-chip communication, in Proc. of Dig. Tech. Papers Int. Solid-State Circuits Conf., vol. 1, pp. 338-602, Feb. 2005.
[21] H. Wu, L. Nan, S.-W. Tam, et al., A 60GHz on-chip RF-Interconnect with λ/4 coupler for 5Gbps bi-directional communication and multi-drop arbitration, in Proc. of Custom Integrated Circuits Conference (CICC), pp. 1-4, 2012.
[22] Y. Kim, G.-S. Byun, A. Tang, C.-P. Jou, H.-H. Hsien, G. Reinman, J. Cong, M. F. Chang, An 8Gb/s/pin 4pJ/b/pin single-t-line dual (Base+RF) band simultaneous bidirectional mobile memory I/O interface, in Proc. of the IEEE International Solid-State Circuits Conference (ISSCC), pp. 50-51, 2012.
[23] J. Kim, K. Choi, et al., Exploiting New Interconnect Technologies in On-Chip Communication, IEEE Journal on emerging and selected topics in circuits and systems, vol. 2, no. 2, pp. 124-136, June 2012.
[24] S. Deb, A. Ganguly, P. Pande, D. Heo, B. Belzer, Wireless NOC as interconnection backbone for multicore chips: Promises and challenges, IEEE Journal on emerging and selected topics in circuits and systems, vol. 2, no. 2, pp. 228-239, June 2012.
[25] J. Lin et al., Communication using antennas fabricated in silicon integrated circuits, IEEE J. Solid-State Circuits, vol. 42, no. 8, pp. 1678-1687, Aug. 2007.
[26] S. Deb et al., Enhancing performance of Network-on-Chip architectures with millimeter-wave wireless interconnects, in Proc. IEEE Int. Conf. ASAP, pp. 73-80, 2010.
[27] K. Kempa et al., Carbon nanotubes as optical antennae, Adv. Mater., vol. 19, pp. 421-426, 2007.
[28] D. J. Watts, S. H. Strogatz, Collective dynamics of small-world networks, Nature, vol. 393, pp. 440-442, 1998.
[29] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher, S.-W. Tam, CMP Network-on-Chip Overlaid with Multi-Band RF-Interconnect, UCLA Computer Science Department Technical Report UCLA/CSD-TR-07-0032, Dec. 2007.
[30] A. Kumar, L.-S. Peh, N. K. Jha, Token flow control, in Proc. of the 41st IEEE/ACM International Symposium on Microarchitecture (MICRO 08), pp. 342-353, 2008.
[31] C. Xiao, M.-C. Frank Chang, J. Cong, M. Gill, Z. Huang, C. Liu, G. Reinman, H. Wu, Stream Arbitration: Towards Efficient Bandwidth Utilization for Emerging On-Chip Interconnects, ACM Transactions on Architecture and Code Optimization, vol. 9, no. 4, Jan. 2013.
[32] A. E. Eiben, J. E. Smith, Introduction to Evolutionary Computing, Springer Berlin, 2003.
[33] M. Sipper, Evolution of Parallel Cellular Machines: The Cellular Programming Approach, Springer Berlin, 1997.
[34] S. Kirkpatrick, C. D. Gelatt Jr., M. P. Vecchi, Optimization by simulated annealing, Science, vol. 220, pp. 671-680, 1983.
[35] T. Jansen, I. Wegener, A comparison of simulated annealing with a simple evolutionary algorithm on pseudo-boolean functions of unitation, Theor. Comput. Sci, vol. 386, pp. 73-93, 2007.
[36] International technology roadmap for semiconductors, 2007 edition.
[37] A. Ganguly et al., A unified error control coding scheme to enhance the reliability of a hybrid wireless Network-on-Chip, in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst, pp. 277-285, 2011.
[38] A. Ganguly et al., Crosstalk-aware channel coding schemes for energy efficient and reliable NoC interconnects, IEEE Trans. Very Large Scale (VLSI) Syst., vol. 17, no. 11, pp. 1626-1639, Nov. 2009.
[39] N. Hardavellas, M. Ferdman, B. Falsafi, A. Ailamaki, Reactive NUCA: near-optimal block placement and replication in distributed caches, in Proc. of the 36th annual international symposium on Computer architecture (ISCA '09), ACM, New York, NY, USA, pp. 184-195, 2009.
[40] H. Lee, S. Cho, R. C. Bruce, StimulusCache: Boosting Performance of Chip Multiprocessors with Excess Cache, Proc. of the IEEE Int'l Symposium on High-Performance Computer Architecture (HPCA), Bangalore, India, Jan. 2010.
Chunhua Xiao received her B.S. in Electronic Information Engineering from Shijiazhuang
Tiedao University, Hebei Province, China, in 2007, and her M.S. in Computer Science from
Beijing University of Technology, Beijing, China, in 2010. She is currently a PhD student in
the Department of Computer Science and Technology, Beijing University of Technology. Her
research interests include embedded system co-design, multi-processor system-on-chip, and
network-on-chip.
Zhangqin Huang received his B.S., M.S., and PhD in Computer Science from Xi'an Jiaotong
University, China, in 1986, 1989 and 2000, respectively. He is currently the Deputy Director of
the Embedded Software and Systems Institute (ESSI), Beijing University of Technology (BJUT),
China. His current research interests include co-design for embedded software and hardware,
Internet-based human-computer interaction, multi-processor system-on-chip, mass data
storage, and network information security.
Da Li received his B.S., M.S., and PhD in Computer Science from Xi'an Jiaotong University,
China, in 2002, 2006 and 2012, respectively. He is currently an instructor at the Embedded Software
and Systems Institute (ESSI), Beijing University of Technology (BJUT). His research interests
include embedded FPGA system design and multi-core processors.