
Universitat Politècnica de Catalunya
Ph.D. Thesis
Interference in Wireless Networks

Cancelation, Impact, Practical Management,
and Complexity
Author:
Eduard Calvo Page
Advisors:
Dr. Javier Rodríguez Fonollosa

Dr. Josep Vidal Manzano
SPCOM Group
Department of Signal Theory and Communications
Universitat Politècnica de Catalunya
Barcelona, December 2008
Abstract
The layered organization of traditional wired network design has consistently regarded communication links as bit pipes delivering data at some fixed rate with a certain error probability. While this model of the underlying physical layer may be appropriate for wired networks, it is certainly naive in the wireless domain. Unlike the fixed wired network, where the channel is time-invariant, the propagation physics of wireless channels and potential user mobility render wireless network conditions highly dynamic and time-varying. Worse still, the pipe model disregards multiuser interference: when many users are given shared access to the same limited pool of resources, the transmission rates of the communication links become coupled and the decomposition of the network into a set of independent single-user links becomes meaningless.
While the role of multiuser interference is widely recognized in the design of prospective systems, its impact on network performance is diverse. In this respect, the aim of the present Ph.D. thesis is to adopt a broad approach to the study of multiuser interference in wireless networks, recognizing it as a phenomenon with many different facets, of which we concentrate on four: cancelation, impact, practical management, and complexity.
We start by studying when and how to perform partial interference cancelation, a technique
that requires full statistical knowledge of the interfering signals at the receivers. We find that
coding and decoding complexity can be traded whenever interference is under the control of the
same source. In other words, the need for partial interference cancelation at the receivers can be
alleviated through the use of appropriate coding techniques exploiting signal correlation at the
transmitters. Additionally, we propose a transmission strategy based on superposition coding
and aided decoding that yields an achievable region at least as large as the best long-standing
region for the interference channel.
Useful as it is, interference cancelation becomes infeasible in applications backed by decentralized wireless networks with uncoordinated nodes. That leaves each sender-destination pair armed only with point-to-point (single-user) strategies. This motivates the study of the totally asynchronous interference channel with single-user receivers. Since its capacity region is rather involved, achievable rates are evaluated through simpler single-letter inner and outer bounds. The study of these bounds reveals that the impact of interference on the achievable rates can be mitigated through statistical signal design. In addition, the performance losses associated with the lack of transmission synchronism and the use of single-user decoders are quantified in the low- and high-power regimes and in the low- and high-interference regimes.
Next, the focus is on how to manage interference in a practical scenario where the receivers
are again interference unaware but now frame-synchronous. A practical transmission scheme
that allows for the design of optimal allocation policies of the limited transmission resources
of the network is proposed. Giving special attention to a cellular configuration under practical
conditions, efficient allocation schemes achieving Pareto and sequential optimality, respectively,
are proposed and compared. The emphasis at this point is on the performance-complexity and
throughput-fairness tradeoffs.
While recognizing that multiuser interference and the availability of receiver information modify the fundamental limits and the practical figures of merit of wireless networks, the thesis concludes by studying a related aspect: how multiuser interference affects the complexity required to evaluate these quantities. Efficient methods for the evaluation of the capacity region of multiuser channels are proposed and, unlike in the single-user case, nonconvexities in the optimization problems unavoidably need to be faced.
Resumen
The layered structure that governs network design has traditionally modeled communication links as pipes that carry data at a fixed rate with a certain error probability. While this implicit model of the physical layer may be appropriate for wired networks, it is certainly too simple in the wireless domain. The pipe model ignores multiuser interference: when several users are given shared access to the same limited pool of resources, the transmission rates of the different communication links become coupled and the network can no longer be decomposed into a set of independent point-to-point links.
While the role of multiuser interference is widely recognized in the design of future systems, its impact on network performance is diverse. The aim of this Ph.D. thesis is therefore to adopt a broad approach to the study of multiuser interference, recognizing it as a phenomenon with multiple facets, of which four are addressed: cancelation, impact, practical management, and complexity.
We start by studying the when and the how of partial interference cancelation, a technique that requires full statistical knowledge of the interfering signals at the receivers. It is shown that coding and decoding complexity can be traded whenever the interference is under the control of the same source. In other words, the need to cancel interference at the receivers can be relaxed through coding techniques that exploit signal correlation at the transmitters. Additionally, a transmission strategy based on superposition coding and aided decoding is proposed that yields an achievable region at least as large as the best known region for the interference channel.
Useful as it is, interference cancelation is impractical in applications backed by decentralized wireless networks with uncoordinated nodes. Each source-destination pair is thus confined to point-to-point (single-user) strategies. This motivates the study of the totally asynchronous interference channel with single-user receivers. Since its capacity region is involved, achievable rates are evaluated by means of simpler inner and outer bounds. The study of these bounds reveals that the impact of interference can be mitigated through the statistical design of the transmitted signals. In addition, the performance losses associated with the loss of synchronism and the use of single-user receivers are quantified in the low- and high-power regimes as well as in the low- and high-interference regimes.
Next, attention turns to how to manage interference in a practical scenario where the receivers are again unaware of the interference but now maintain frame synchronism. With special attention to a cellular network configuration, efficient resource allocation schemes achieving Pareto and sequential optimality are proposed and compared. The emphasis of the analysis is on the performance-complexity and throughput-fairness tradeoffs.
Recognizing that multiuser interference and the information about it available at the receivers determine both the fundamental limits and the practical figures of merit of wireless networks, the thesis concludes with the study of a related aspect: the impact of interference on the complexity required to evaluate these quantities. Efficient methods for the evaluation of the capacity region of multiuser channels are proposed and, unlike in the single-user case, the optimization problems involved exhibit unavoidable nonconvexities.
To Rosa Maria,
To my parents,
Acknowledgments
Doing a doctorate is not a nine-to-seven job. It is staying with a colleague to push toward the deadline until dawn. It is also going home at five with your head about to burst from so many equations. It is solving problems in the shower. But it is also realizing that the proof had a flaw and, sometimes, being able to fix it. It is not more of the same. It is an unceasing process of personal growth that culminates in the birth of a thesis, the fruit of countless longings, hopes, sufferings, and dreams. In a doctorate, the line that separates work and fun becomes blurred and thin and, without passion, knowledge is of no use. Where there is a will there is a way, as the saying goes.
After four years of intense work, the time has come to close a stage that has allowed me to live many experiences. Writing these lines is without doubt among the most gratifying of them, because it means that the desired goal has been reached and because I very much want to be grateful and fair to all those who have contributed to making this period a time to remember fondly.
I want to start by thanking my thesis advisors, Javier R. Fonollosa and Josep Vidal. Having two-headed supervision has been a very enriching experience, both professionally and personally. I have many things to thank them for but, for the sake of brevity and of keeping the exaltation of friendship in check, I will highlight their unwavering support for my work, in the good moments as well as in the not so good ones. And whoever says that a thesis has none of the latter is lying.
I am very grateful to Milica Stojanovic from the Massachusetts Institute of Technology/Northeastern University for giving me another opportunity to establish a fruitful collaboration with her, meet each other, and enjoy the New England way of living. Zoran Zvonar is the perfect gentleman: I want to be like him when I grow up.
For teaching me what passion for work is and leading by example, thanks to Gregori Vázquez and Daniel P. Palomar.
For letting me enjoy teaching, thanks to my students and colleagues at Epsilon.
My colleagues from D5 deserve a temple. For the technical discussions, the early-morning coffees, the debates over beer and bravas, the nights when we set the city on fire and, in essence, for creating a productive, cheerful, and healthy working environment. I do not understand how they have not yet realized that putting up with my bad jokes in exchange for my learning so much from them is a very bad deal. All of them deserve my most sincere thanks, although chance, professional activity, or personal affinity has made the relationship with some of them go beyond the strictly professional. So I keep with me the determination of Andreu Urruela, the love of work well done of José Antonio López, the dedication of Alejandro Ramírez, the wit of Xavi Artigas, the argumentative skill of Julio Rolón, the sensitivity of Luis García, and the character of Pau Closas. I admire you and I count you among my Friends.
My Friends JoanMa Izquierdo and Jordi Cerdà have been with me since adolescence and also deserve sincere thanks for always standing by.
I feel very fortunate to have the family I have. For being my fan club but also my first critics. And that cannot be paid for with money. Their unconditional support and affection have always been fundamental. To my parents Selesio and Carmen, a thousand thanks for having done everything possible so that I could have the best education and be able to study: although I am the one who will cross the finish line of this race, you drove me in a limousine to the starting line. I am very proud of you. And of my sister Sonia and my brother-in-law Santi (the best best man I know). And of my new family on Rosa Maria's side, who have also followed the progress of this thesis with enthusiasm. And although he does not know it yet, my nephew Héctor has made me very happy. Héctor, your uncle had no arguments left to answer your mom when she asked whether he had not said he would finish the thesis by the time she became a mother...
They say that it is no use climbing a very tall ladder if, when you reach the top, you realize it is leaning against the wrong wall. Tall or short, what does it matter: my ladder leans on you, Rosa Maria. You make me want to be a better person. Good people do not necessarily write good theses. But if they write them with love and have you by their side, they are happy. And that is what the movie was about, right?
That's all, folks.
Eduard Calvo Page
December 2008
This work has been partially funded by the Spanish Ministerio de Educación y Ciencia through the Formación de Profesorado Universitario (F.P.U.) program, under grant AP-2004-3549.
Contents
Abstract
Resumen
List of Figures
List of Tables
Notation
Acronyms
1 Introduction
    1.1 Motivation and Objectives
    1.2 Thesis Outline
    1.3 Research Contributions
2 Partial Interference Cancelation: When and How
    2.1 Introduction
        2.1.1 When
        2.1.2 How
        2.1.3 Summary of contributions
    2.2 Simultaneous vs Alternate Partial Interference Cancelation in the BC
        2.2.1 Definitions
        2.2.2 The two-layered random binning achievable rate region
        2.2.3 Equality of Marton's region and the two-layered random binning region
        2.2.4 Comparison with the interference channel
    2.3 Partial Interference Cancelation through Aided Decoding
        2.3.1 Definitions
        2.3.2 The achievability result
        2.3.3 An example
    2.4 Conclusions
    2.A Appendix: Proof of Theorem 2.2
    2.B Appendix: Proof of Theorem 2.3
        2.B.1 Proof of R_MT ⊆ R
        2.B.2 Proof of R ⊆ R_MT
    2.C Appendix: Proof of Theorem 2.4
3 The Totally Asynchronous Interference Channel with Single-User Receivers
    3.1 Introduction
        3.1.1
    3.2 The Capacity Region
    3.3 The Gaussian IC
        3.3.1 Definition of optimality
        3.3.2 Finite expansion analysis of mutual information
    3.4 On the Optimality of Gaussian-Distributed Codes
    3.5 Numerical Results
    3.6 Conclusions
4 Optimal Resource Allocation in Cellular Networks with Partial CSI
    4.1 Introduction
        4.1.1 Motivation
        4.1.2 Adopted network setup
        4.1.3
    4.2 System Model and Preliminaries
    4.3 Relay-Assisted Transmission
        4.3.1 Maximum instantaneous achievable rates
        4.3.2 Universal concave lower bounds on the achievable rates
    4.4 Achievable Instantaneous Rates
        4.4.1 DL instantaneous achievable rate region
        4.4.2 UL instantaneous achievable rate region
    4.5 Maximum Network Utility Rate Allocation Policies
        4.5.1 User utility functions
        4.5.2 Network utility maximization
    4.6 Simulation Results
    4.7 Conclusions
    4.A Appendix: Proof of Proposition 4.1
5 Multiuser Interference and Evaluation of Capacity Regions
    5.1 Introduction
        5.1.1 The DMAC
        5.1.2 The dDMBC
        5.1.3
    5.2 The DMAC
        5.2.1 The capacity region as a rank-one constrained optimization problem
        5.2.2 Relaxation methods
        5.2.3 Performance analysis of marginalization
        5.2.4 Numerical results
    5.3 The Degraded DMBC
        5.3.1 The capacity region as a DC optimization problem
        5.3.2 Optimality conditions
        5.3.3 The BEC-BSC degraded broadcast channel
    5.4 Conclusions
    5.A Appendix: Proof of Proposition 5.1
    5.B Appendix: Proof of Lemma 5.2
    5.C Appendix: Proof of Proposition 5.4
    5.D Appendix: Proof of Proposition 5.5
    5.E Appendix: Proof of Proposition 5.7
    5.F Appendix: Proof of Theorem 5.1
    5.G Appendix: Proof of Lemma 5.4
    5.H Appendix: Proof of Lemma 5.5
    5.I Appendix: Proof of Proposition 5.8
6 Conclusions
Bibliography
List of Figures
2.1 The 1 × 2 discrete memoryless broadcast channel without common information.
2.2 The 2 × 2 Discrete Memoryless IC (DMIC).
2.3 An example 2 × 2 modified interference channel.
2.4 The regions $R_{HK}(P_{Z_1 Z_2 U_1 U_2 X_1 X_2})$ (dashed) and $R(P_{Z_1 Z_2 X_1 X_2})$ (solid) computed for the binary interference channel of Figure 2.3. Units are [nat/ch. use].
2.5 Diagram of the coding scheme: two-layered random binning.
2.6 Coding scheme for the interference channel. The role of the time-sharing random variable Q has been omitted.
3.1 Achievable rate regions of Gaussian-, uniformly-, and ternary-distributed codes for a fully symmetric GIC with $P = 15$ and $c = 0.1$ (left) and $c = 1/\sqrt{2}$ (right), which correspond to signal-to-interference ratios of 20 dB and 3 dB, respectively.
3.2 Achievable symmetric rate of Gaussian-, uniformly-, and ternary-distributed codes for two different values of $c$ yielding theoretical threshold powers of 1 and 10 (left). Comparison between the theoretical value of $P_{th}(c)$ (3.63) and the threshold power of uniformly-distributed codes as a function of the coupling coefficient (right).
3.3 Achievable symmetric rates in the low-power regime, $P = 1$ (left), and high-power regime, $P = 1000$ (right), as a function of the coupling coefficient $c$.
4.1 DL cooperation protocol: the DL phase is split into two subphases attending to the half-duplex nature of the RS.
4.2 Exact ergodic capacity (solid lines) and Lemma 1 lower bound (dashed lines) vs snr for different antenna configurations and Rayleigh fading.
4.3 Exact ergodic capacity (solid lines) and Lemma 2 lower bound (dashed lines) vs $\text{snr}_1$ for different values of $\text{snr}_2$ and Rayleigh fading. The antenna configuration is $n_r = n_{t,1} = n_{t,2} = 2$.
4.4 Network utility achieved by global (blue) and sequential (red) optimization, with (solid) and without (dashed) relaying infrastructure, and deployment layout.
4.5 Per-user served throughput of global (blue) and sequential (red) optimization, with (solid) and without (dashed) relaying infrastructure.
4.6 Network utility achieved by sequential optimization with time-domain pre-scheduling allowing a maximum of 12 (black), 8 (blue), and 4 (red) users per frame, and deployment layout.
4.7 Average per-user served throughput per QoS class achieved by sequential optimization with time-domain pre-scheduling allowing a maximum of 12 (black), 8 (blue), and 4 (red) users per frame.
4.8 Maximum user delay (number of frames idle) achieved by sequential optimization with time-domain pre-scheduling allowing a maximum of 12 (black), 8 (blue), and 4 (red) users per frame.
4.9 Average steady-state per-user and link-direction throughput versus fairness index. The corresponding parameter values are 0.25, 0.50, 0.75, 1, and 5.
5.1 The boundary of C is obtained by solving (5.5)-(5.9) for each parameter value in [0, 1].
5.2 The support of the randomly generated probability distributions q is the largest circle centered at E{q} = q that fits within the probability simplex.
5.3 The capacity region C of the nBS-MAC, in [bit/ch. use], for different values of its two channel parameters. Note that the parameter pair (0, 0) corresponds to the BS-MAC.
5.4 The probability p as a function of its argument for different values of the two channel parameters. Note that the parameter pair (0, 0) corresponds to the probability in (5.58).
5.5 Bounds on the capacity region for DMAC1. Units are [bit/ch. use].
5.6 Bounds on the capacity region for DMAC2. Units are [bit/ch. use].
5.7 Bounds on the capacity region for DMAC3. Units are [bit/ch. use].
5.8 The BEC-BSC degraded broadcast channel.
5.9 The capacity region $C_{\text{BEC-BSC}}$ of the BEC-BSC degraded broadcast channel, in [bit/ch. use], for different values of the channel parameter.
List of Tables
2.1 Situations of decoding error and associated rate constraints.
2.2 Rate constraints for the achievability of a $(R_1, R_2) = (R_{10} + R_{11}, R_{20} + R_{22})$ rate pair.
4.1 Physical layer setup of the simulated scenario.
4.2 Relative execution times per transmission frame of the MATLAB implementations of all the resource allocation strategies.
Notation
Throughout this dissertation, boldface lower-case letters shall denote column vectors, with $\mathbf{0}_n$ and $\mathbf{1}_n$ standing for the all-zero and all-one column vectors of length $n$, respectively. We shall denote by $x_i$ the $i$-th entry of vector $\mathbf{x}$, and use $\leq$ and $\geq$ for scalar and component-wise inequalities interchangeably. Similarly, $\mathbf{z} = \min\{\mathbf{x}, \mathbf{y}\}$ denotes the vertical stacking of the component-wise minimum of two vectors. Boldface upper-case letters are used for matrices, with $\mathbf{I}_n$ standing for the $n \times n$ identity matrix and $A_{i,j}$ denoting the entry in the $i$-th row and $j$-th column of matrix $\mathbf{A}$, whose transpose and Hermitian are $\mathbf{A}^T$ and $\mathbf{A}^H$, respectively. The $i$-th ordered eigenvalue (singular value) of a square (arbitrary) matrix $\mathbf{A}$ is denoted by $\lambda_i(\mathbf{A})$, where $\lambda_i(\mathbf{A}) \geq \lambda_{i+1}(\mathbf{A})$. Whenever needed, the superscript $(\cdot)^\star$ shall denote the optimal value of a variable.
As for random variables, we shall denote by $X_k^n$ an $n$-dimensional random vector taking on the value $x_k^n$ over the finite set $\mathcal{X}_k^n$ with probability $P_{X_k^n}(x_k^n)$. The $i$-th component of $X_k^n$ is denoted by $X_{k,i}$, whereas $X_k^i = [X_{k,1} \ldots X_{k,i}]$ and $X_{k,i}^n = [X_{k,i} \ldots X_{k,n}]$. If the context is appropriate (no ambiguity is possible), we shall use the equivalence $[a, b] \equiv \{a, a + 1, \ldots, b\}$, for arbitrary integers $a, b$ such that $b > a$. Other specific notation is introduced as follows:
$(x)^+$                      The projection of the real variable $x$ onto the non-negative semiaxis, i.e., $(x)^+ = \max\{x, 0\}$.
$\propto$                    Proportional to.
$\delta[x]$                  Kronecker delta. That is, $\delta[x] = 1$ if $x = 0$, and $\delta[x] = 0$ otherwise.
$\text{Co}\{S\}$             Convex hull of the set $S$.
$\mathcal{N}(\boldsymbol{\mu}, \mathbf{C})$   Multivariate Normal (Gaussian) distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{C}$.
$\mathbf{A} \odot \mathbf{B}$   Hadamard (i.e., element-wise) product of matrices. That is, $[\mathbf{A} \odot \mathbf{B}]_{i,j} = A_{i,j} B_{i,j}$.
$a \ll b$                    $a$ much smaller than $b$.
$a \gg b$                    $a$ much greater than $b$.
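As an informal illustration of the scalar and element-wise operators listed above, the following minimal sketch implements a few of them in plain Python. The function names (`pos_part`, `kron_delta`, `int_range`, `cmin`, `hadamard`) are hypothetical and chosen only for this example; vectors and matrices are represented as plain lists, which is an assumption of the sketch and not a convention of the dissertation.

```python
def pos_part(x):
    """(x)^+ = max{x, 0}: projection of a real number onto the non-negative semiaxis."""
    return max(x, 0.0)


def kron_delta(x):
    """Kronecker delta: 1 if x equals 0, and 0 otherwise."""
    return 1 if x == 0 else 0


def int_range(a, b):
    """The integer set [a, b] = {a, a + 1, ..., b}, for integers with b > a."""
    return list(range(a, b + 1))


def cmin(x, y):
    """Component-wise minimum of two equal-length vectors (given as lists)."""
    return [min(xi, yi) for xi, yi in zip(x, y)]


def hadamard(A, B):
    """Hadamard (element-wise) product of two equally sized matrices (lists of rows)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


if __name__ == "__main__":
    print(pos_part(-2.5), pos_part(3.0))
    print(kron_delta(0), kron_delta(4))
    print(int_range(2, 5))
    print(cmin([1, 5], [3, 2]))
    print(hadamard([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Plain lists keep the sketch dependency-free; an array library would be the natural choice for anything beyond a notation check.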
Acronyms
AB          Arimoto-Blahut
AEP         Asymptotic Equipartition Property
AWGN        Additive White Gaussian Noise
BA-MAC      Binary Adder MAC
BC          Broadcast Channel
BEC         Binary Erasure Channel
BS          Base Station
BSC         Binary Symmetric Channel
BS-MAC      Binary Switching MAC
CSI         Channel State Information
DC          Difference of Convex
dDMBC       Degraded DMBC
DL          Downlink
DMAC        Discrete Memoryless MAC
DMBC        Discrete Memoryless BC
DMC         Discrete Memoryless Channel
DMIC        Discrete Memoryless IC
FDD         Frequency Division Duplexing
FFT         Fast Fourier Transform
FUSC        Full Usage of Sub-Channels
GIC         Gaussian IC
H&K         Han and Kobayashi
IC          Interference Channel
IFFT        Inverse FFT
KKT         Karush-Kuhn-Tucker
LOS         Line Of Sight
MAC         Multiple Access Channel
MIMO        Multiple-Input Multiple-Output
MS          Mobile Station
nBS-MAC     Noisy BS-MAC
NLOS        Non-Line Of Sight
NU          Network Utility
OFDMA       Orthogonal Frequency Division Multiple Access
PUSC        Partial Usage of Sub-Channels
QoS         Quality Of Service
RC          Relay Channel
RS          Relay Station
SISO        Single-Input Single-Output
TDD         Time Division Duplexing
TDMA        Time Division Multiple Access
UL          Uplink
Chapter 1
Introduction
More than half a century ago, the birth of Information Theory settled the fundamental principles governing the feasibility of reliable communication in single-user channels. Elegantly subsuming the impact of both channel distortion and noise in the single figure of merit of channel capacity, this new research framework not only put an end to the long-standing folk theorem that regarded transmission rate and accuracy as an unavoidable tradeoff; it also facilitated the advent of modern digital communications and digital data recording: now, reliability could be achieved at any rate below capacity. The rest is well-known history.
In parallel, military interests triggered in the 1960s the creation of another, apparently unrelated, field of research: networking. However, what started with early studies on packet switching and continued with the birth of the Internet could no longer be kept under the defense umbrella. The scalability, robustness to link failure, and business appeal of the Internet turned it into a mass phenomenon, giving rise to new user needs and services that rendered indispensable what was unprecedented. From HTTP to FTP, from file-sharing communities to social networks, the rise of communication networks has changed every single aspect of our behavior, labor, and leisure time.
In studying network performance, the layered organization of traditional wired network design has consistently regarded communication links as bit pipes delivering data at some fixed rate with a certain error probability. While this model of the underlying physical layer may be appropriate for wired networks, it is certainly naive in the wireless domain. Unlike the fixed wired network, where the channel is time-invariant, the propagation physics of wireless channels and potential user mobility render wireless network conditions highly dynamic and time-varying. Worse still, the pipe model disregards multiuser interference: when many users (either information sources or sinks) are given shared access to the same limited pool of resources, the transmission rates of the communication links become coupled and the decomposition of the network into a set of independent single-user links becomes meaningless. It is at this point that Information Theory can come back to help.
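The coupling of link rates mentioned above can be made concrete with a small numerical sketch. It assumes a two-user real Gaussian interference channel with unit noise power, Gaussian codebooks, an amplitude cross-gain c (so that interference arrives with power c^2 times the transmit power of the other user), and receivers that simply treat interference as noise. These modeling choices, as well as the function name `tin_rates`, are assumptions made for illustration and are not taken from the models developed later in the dissertation.

```python
import math


def tin_rates(p1, p2, c, noise=1.0):
    """Rates in bit/ch. use of two links when each receiver treats the other
    user's signal as additional Gaussian noise (illustrative assumption)."""
    sinr1 = p1 / (noise + c**2 * p2)  # link 1 sees interference power c^2 * p2
    sinr2 = p2 / (noise + c**2 * p1)  # link 2 sees interference power c^2 * p1
    return 0.5 * math.log2(1.0 + sinr1), 0.5 * math.log2(1.0 + sinr2)


if __name__ == "__main__":
    # Raising the power of user 2 increases its own rate but lowers the rate of user 1.
    for p2 in (0.0, 5.0, 15.0):
        r1, r2 = tin_rates(15.0, p2, c=0.5)
        print(f"P2 = {p2:5.1f}  ->  R1 = {r1:.3f}, R2 = {r2:.3f}")
```

With c = 0 the two rates depend only on their own powers and the bit-pipe abstraction holds; for any c > 0 each rate depends on both powers, which is the coupling that prevents the network from being split into independent single-user links.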
Information Theory has not been unaware of the multiuser interference problem in multiterminal scenarios, as the first studies of multiuser channels date back to as early as 1961, only 13 years after its birth in 1948. Besides, interference was not the only inherent phenomenon of wireless networks that received attention from the information-theoretic community: issues such as cooperation and feedback also played an important role. In an impressive effort of synthesis, every possible degree of freedom of physical layer interaction between the terminals of a wireless network was condensed into five multiuser channel models: the two-way channel, the broadcast channel, the multiple access channel, the relay channel, and the interference channel. Any network under study can therefore be decomposed into some combination of these building blocks.
While the study of just one of the previous canonical multiuser channel models could well fit
the entire scope of a Ph.D. dissertation, it is the aim of the present one to adopt a higher level
approach and study multiuser interference in wireless networks as a phenomenon with many
facets by itself. Thus, from pure network information theory to the organization of real network
scenarios, from optimization theory to statistical analysis, the focus of this dissertation is diverse
in topics, yet it hopes to be specific in conclusions.
1.1 Motivation and Objectives
Although the role of multiuser interference is widely recognized in the design of prospective systems, its impact on network performance is diverse. As an example, wireless sensor networks, wireless multihop networks, and mobile cellular systems, three key scenarios that shall enable most upcoming communication services, exhibit different characteristics that may cause the eventual transmission strategies of the network terminals to differ significantly. In this respect, we identify the information that each receiver has about the structure of the interference as the principal bottleneck constraining both the performance and the sophistication of the interference mitigation techniques that can be carried out. This has motivated the study of four different research lines, each one represented by a different word in the subtitle of the present dissertation. Namely,
Cancelation. When the receivers of a network have complete knowledge of the statistical
structure of the interference,
Is it beneficial to allow all the receivers of the network to perform partial interference
cancelation simultaneously?
If so, what is the most advantageous partial interference cancelation scheme?
Impact. If no knowledge at all is available at the receivers, what is the performance loss
experienced? Can proper signal design mitigate the performance degradation?
Practical Management. If no knowledge at all is available at the receivers, but some sort
of centralized coordination is possible, is there a way of getting rid of the performance
losses? If so, how can we get the most out of the available transmission resources?
Complexity. In evaluating the fundamental limits of wireless networks, what is the complexity
increase due to multiuser interference?
An individual chapter is dedicated to each of the above research lines and a brief summary
of these chapters is presented next.
1.2 Thesis Outline
Chapter 2
This chapter focuses first on the benefits of allowing all the receivers of a network to perform
partial interference cancelation simultaneously. While it is clear that such a feature can never lead
to performance losses, we find that sometimes it does not produce rate gains either, as compared
to a situation where cancelation is performed alternately at the receivers. Intuitively, when
all the interference is under the control of the same source, correlated coding can relieve
the users from performing partial interference cancelation simultaneously. This work has led to
a layered generalization, for multiuser coding, of the random binning technique devised by
Slepian and Wolf in the context of coding of correlated sources.
Besides, borrowing from the work of Marton for the broadcast channel, a novel transmission
strategy for the interference channel based on superposition coding and aided decoding is found
to yield an achievable region at least as large as the best known achievability result for this
channel.
Chapter 3
While Chapter 2 implicitly assumed perfect knowledge of the codebooks of the interfering users,
Chapter 3 focuses on the opposite situation of interference-unaware receivers. That is often the
case in decentralized wireless networks with uncoordinated nodes, and motivates the study of the
totally asynchronous interference channel with single-user receivers. The capacity region of this
channel is characterized within an Information Spectrum approach, although more amenable
single-letter inner and outer bounds are also provided. As an interesting result, Gaussian-distributed
codes are found not to be optimal for the Gaussian case, as other practical codes are
shown to outperform them. An analytical characterization of the conditions for the existence
of other input statistics superior to Gaussian reveals that, essentially, the channel needs to be
interference-limited.
Chapter 4
As a result of the performance losses attributed to the lack of interference information in Chapter
3, the motivation of Chapter 4 is i) to mitigate them in a practical scenario where the receivers
are also interference-unaware but frame-synchronous, and ii) to propose a practical transmission
scheme that allows for the design of optimal allocation policies of the limited transmission
resources of the network. Special attention is given to a cellular configuration, and, under
practical conditions, efficient allocation schemes achieving Pareto and sequential optimality,
respectively, are proposed and compared. The emphasis in this chapter is on the performance-complexity
and throughput-fairness tradeoffs.
Chapter 5
While the focus of Chapters 2, 3, and 4 is on how multiuser interference and the availability of
receiver information modify the fundamental limits (achievable rates in Chapter 2 and capacity
regions in Chapter 3) or the practical figures of merit (user throughput, Chapter 4) of wireless
networks, a related and so far unexplored aspect is how the same issues impact the evaluation of these
quantities. Thus, Chapter 5 is devoted to proposing efficient methods for evaluating capacity regions
of multiuser channels. It will be shown that, unlike the single-user case, multiuser interference
brings about non-negligible complexity in the computation of the performance limits. Particular
attention is paid to the multiple access channel and the degraded broadcast channel.
1.3 Research Contributions
The work conducted within the present thesis resulted in the publication of several contributions
in technical journals and international conferences. The details of the research contributions in
each chapter are as follows.
Chapter 2
The achievability result for the interference channel based on superposition coding and aided
decoding led to the following contribution:
E. Calvo, Javier R. Fonollosa, and J. Vidal, A simple achievable rate region for the
interference channel, unpublished.
Unfortunately, by the time we were to submit this journal paper (March 2006) we found that,
in an independent work, another set of authors had already published the same exact result one
month before [Cho06a].
Chapter 3
The main results of this chapter are currently under review in one journal paper:
[Cal08a] E. Calvo, J. R. Fonollosa, and J. Vidal, The totally asynchronous interference channel with single-user receivers, submitted to IEEE Transactions on Information
Theory, September 2008.
Chapter 4
One journal paper currently under review and one conference paper summarize the research
contributions of the chapter:
[Cal08e] E. Calvo, J. Vidal, and J. R. Fonollosa, Optimal resource allocation in relay-
assisted cellular networks with partial CSI, submitted to IEEE Transactions on Signal
Processing, May 2008.
[Cal07e] E. Calvo, J. Vidal, and J. R. Fonollosa, Resource allocation in multihop OFDMA
broadcast networks, Proc. IEEE Workshop on Signal Process. Advances for Wireless
Commun. (SPAWC), Helsinki, Finland, June 2007.
Chapter 5
Regarding the evaluation of capacity regions of multiuser channels, the research contributions
of this chapter are in the form of one journal paper currently under review and two conference
papers:
[Cal07d] E. Calvo, D. P. Palomar, J. R. Fonollosa, and J. Vidal, On the computation of
the capacity region of the discrete MAC, submitted to IEEE Transactions on Information
Theory, May 2007.
[Cal07c] E. Calvo, D. P. Palomar, J. R. Fonollosa, and J. Vidal, The computation of
the capacity region of the discrete MAC is a rank-one non-convex optimization problem,
Proc. Intl. Symposium on Information Theory (ISIT), pp. 2396-2400, Nice, France, June
2007.
[Cal08c] E. Calvo, D. P. Palomar, J. R. Fonollosa, and J. Vidal, The computation of
the capacity region of the discrete degraded BC is a non-convex DC problem, Proc. Intl.
Symposium on Information Theory (ISIT), pp. 1721-1725, Toronto, Canada, July 2008.
Other contributions not directly related with this dissertation

Apart from the topics covered in this dissertation, other research areas have been addressed
during the period of Ph.D. studies. Some of these topics were related with research projects
for private industry and public administrations, and the most relevant publications are listed
below.
• Research in multiuser detection for underwater communications:
[Cal08d] E. Calvo and M. Stojanovic, Efficient channel estimation-based multi-user
detection for underwater CDMA systems, accepted for publication in IEEE Journal
of Oceanic Engineering, 2008.
[Cal05] E. Calvo and M. Stojanovic, A coordinate descent algorithm for multichannel
multiuser detection in underwater acoustic DS-CDMA systems, Proc. IEEE
OCEANS Europe Conference, Brest, France, June 2005.
• Research in resource allocation with perfect CSI:
[Cal07b] E. Calvo, J. R. Fonollosa, and J. Vidal, Near-optimal joint power and rate
allocation for OFDMA broadcast channels, Proc. IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, HI, April 2007.
[Cal07a] E. Calvo and J. R. Fonollosa, Efficient resource allocation for orthogonal
transmission in broadcast channels, Proc. IEEE Workshop on Signal Process. Advances
for Wireless Commun. (SPAWC), Helsinki, Finland, June 2007.
• Additional research in resource allocation with imperfect CSI:
[Muñ07] O. Muñoz, J. Vidal, A. Agustín, E. Calvo, and A. Alcon, Resource management
for relaying-enhanced WiMAX: OFDM and OFDMA, Workshop Trends in
Radio Resource Management, Barcelona, November 2007.
• Research in wireless sensor networks:
[Clo07] P. Closas, E. Calvo, J. Fernandez, and A. Pagès, Coupling noise effect in
self-synchronizing wireless sensor networks, Proc. IEEE Workshop on Signal Process.
Advances for Wireless Commun. (SPAWC), Helsinki, Finland, June 2007.
• Research in 4G systems:
[Cal08b] E. Calvo, I. Kovács, L. García, and J. R. Fonollosa, A reconfigurable
downlink air interface: design, simulation methodology, and performance evaluation,
Proc. ICT-Mobile Summit, Stockholm, Sweden, June 2008.
Chapter 2
Partial Interference Cancelation: When and How
Wireless networks are made of several information sources and sinks that share the same transmission resources in their pursuit of reliable communication of multiple information flows. The
fact that the transmission resources are always limited in practice together with the potential
conflicts of interest that may arise between neighboring links couples individual performance
and complicates analysis. A rigorous way of approaching analysis is through the study of multiuser channels borrowed from Network Information Theory [Cov06, Ch. 14]: the broadcast
channel (BC) [Cov72, Ber73, Gal74, Cov75, Mar79, Cov81a, Cov98, Wei06], the multiple-access
channel (MAC) [Ahl71, Lia72], the relay channel (RC) [Cov79], and the interference channel
(IC) [Car75, Car78, Han81, Car83].
Simple as they are, many problems regarding the characterization of their fundamental communication
limits (capacity regions) remain long-standing open problems. We believe that the main source
of difficulty in their analysis is multiuser interference. In non-degenerate channels we cannot assume
that the transmissions of the sources take place over orthogonal channels: the channel output of
each receiver depends on the codewords transmitted by all the senders. This fact creates interdependencies
between achievable rates and renders impossible the decomposition of a network
into a set of independent single-user links.
Given that the presence of multiuser interference is often unavoidable, the next question is
what to do with it. The right answer depends on the degree of knowledge that each receiver has
about it. If completely unaware of the interference, each sender-destination pair shall use single-user
codes and expect performance losses (see Chapter 3 for quantification and minimization
of these losses). Conversely, if the identities and the codebooks of the interferers are perfectly
known at each destination, the solution is to perform partial interference cancelation. This latter
situation is the focus of this chapter.
2.1 Introduction
In a strict sense, the only two channels that suffer from multiuser interference are the BC and
the IC. While in the MAC all the signals received at the destination have to be reliably decoded
and, hence, cannot be regarded as interference, in the RC there is no interference at all since
the role of the relay is to enhance the communication of the principal sender-destination pair.
Yet both the BC and the IC can be viewed as the two sides of the same tapestry: the BC can
be interpreted as an IC where all the senders are physically located inside the same transmitter,
or an IC where all senders can cooperate.
The study of the IC has direct applications in wireless network models where communication
between different sender-receiver pairs can take place simultaneously. Some examples are wireless
sensor networks, wireless multi-hop networks, and mobile cellular systems with small bandwidth
reuse factor suffering from large inter-cell interference. On the other hand, the BC models the
situation in which one sender wishes to transmit information to a number of receivers (the
information for each receiver can be different or not). This description perfectly matches the
downlink of a mobile cellular system, with the base station acting as the only sender of the
scenario. In this chapter, we shall concentrate on these two channels with two clear objectives
in mind regarding the potential of partial interference cancelation:
The analysis of when it is useful from an achievable rate standpoint.
The analysis of how to perform it efficiently.
We address the first item in Section 2.2, the second in Section 2.3, and conclude this chapter in
Section 2.4.
2.1.1 When
We start focusing on the BC. The capacity region of the discrete memoryless BC (DMBC) is
known when the channel is degraded, that is, when the different channel outputs form a Markov
chain in some specific order. In this case, Bergmans [Ber73] provided an achievable rate region
that was later found to be the capacity region thanks to the converse theorems of Wyner [Wyn73]
(for the specific case of binary symmetric channels) and Gallager [Gal74] (general case). For
general DMBCs, the first achievable rate region was obtained in Cover's seminal paper [Cov72],
and was later extended by Van der Meulen [Meu75] and Cover [Cov75]. They considered the
general situation in which a common message may be sent to all the receivers. Here, the approach
will be that the information for each user is independent and there is no common message. For
a review of the most important contributions in the context of the DMBC the reader is referred
to [Meu77] and [Cov98].
The most advantageous approach so far to the DMBC without common information came
along with the work of Marton [Mar79] (see El Gamal and Van der Meulen [Gam81] for a simpler
proof based on typicality arguments). The most salient feature of Marton's contribution is that
it allows for arbitrarily correlated auxiliary random variables, hence enlarging the region with
respect to independent intermediate encoding. Marton's general achievable rate region for the
1 × 2 discrete memoryless broadcast channel is stated in Theorem 2.1.
Theorem 2.1 (Marton) Consider the region $\mathcal{R}_{MT}(P_{W U_1 U_2 X})$ consisting of the set of $(R_1, R_2)$
rate pairs satisfying
$$R_1 \le I(W U_1; Y_1) \tag{2.1}$$
$$R_2 \le I(W U_2; Y_2) \tag{2.2}$$
$$R_1 + R_2 \le \min\{I(W; Y_1), I(W; Y_2)\} + I(U_1; Y_1 | W) + I(U_2; Y_2 | W) - I(U_1; U_2 | W) \tag{2.3}$$
for some joint probability distribution $P_{W U_1 U_2 X}$ defined on $\mathcal{W} \times \mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{X}$. Denote by $\mathcal{P}$ the
set of all such distributions. The region
$$\mathcal{R}_{MT} = \bigcup_{P_{W U_1 U_2 X} \in \mathcal{P}} \mathcal{R}_{MT}(P_{W U_1 U_2 X}) \tag{2.4}$$
is achievable for the 1 × 2 DMBC.
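As an illustration of how the bounds (2.1)-(2.3) of Theorem 2.1 can be evaluated numerically for one fixed joint distribution, the following is a minimal sketch. The function names, the array layout `p_wu1u2x[w, u1, u2, x]`, and the toy channel (a 4-ary input whose two bits are seen noiselessly by receivers 1 and 2) are assumptions made for illustration only; they are not part of the original text.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def marginal_entropy(joint, axes):
    """Entropy of the variables living on `axes` of a joint pmf array."""
    drop = tuple(a for a in range(joint.ndim) if a not in axes)
    return entropy(joint.sum(axis=drop))

def cond_mi(joint, a, b, c=()):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C); a, b, c are tuples of axes."""
    return (marginal_entropy(joint, a + c) + marginal_entropy(joint, b + c)
            - marginal_entropy(joint, a + b + c) - marginal_entropy(joint, c))

def marton_bounds(p_wu1u2x, ch1, ch2):
    """Right-hand sides of (2.1), (2.2), (2.3) for one fixed joint pmf.

    p_wu1u2x[w, u1, u2, x] is the joint pmf; chk[x, y] = P_{Yk|X}(y|x).
    """
    j1 = p_wu1u2x[..., None] * ch1   # axes: (W, U1, U2, X, Y1)
    j2 = p_wu1u2x[..., None] * ch2   # axes: (W, U1, U2, X, Y2)
    W, U1, U2, Y = (0,), (1,), (2,), (4,)
    r1 = cond_mi(j1, W + U1, Y)                       # I(W U1; Y1)
    r2 = cond_mi(j2, W + U2, Y)                       # I(W U2; Y2)
    rsum = (min(cond_mi(j1, W, Y), cond_mi(j2, W, Y))
            + cond_mi(j1, U1, Y, W) + cond_mi(j2, U2, Y, W)
            - cond_mi(j1, U1, U2, W))                 # I(U1;U2|W) needs no output
    return r1, r2, rsum

# Toy example (assumed): W constant, U1 and U2 independent fair bits, X = (U1, U2).
p = np.zeros((1, 2, 2, 4))
for u1 in range(2):
    for u2 in range(2):
        p[0, u1, u2, 2 * u1 + u2] = 0.25
ch1 = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # Y1 = U1
ch2 = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)  # Y2 = U2
r1_max, r2_max, rsum_max = marton_bounds(p, ch1, ch2)
```

In this toy case each receiver sees its own bit without noise, so the three bounds evaluate to 1, 1, and 2 bits. Sweeping the joint pmf over a grid and taking the union of the resulting regions would approximate (2.4), at a cost that grows quickly with the alphabet sizes.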

The region RMT is characterized by two features. Namely, i) the auxiliary random variables
U1 and U2, bearing private information for each of the receivers respectively, are correlated using
the random binning technique borrowed from Slepian-Wolf coding of correlated sources [Sle73];
ii) the additional auxiliary random variable W, bearing information for only one of the receivers,
is reliably decoded at the intended receiver and simultaneously benefits the decoding of the
non-intended receiver. The non-intended receiver does not reliably decode W but only takes it into
account to compute the typical sets for decoding its information-bearing signal. In this sense,
the non-intended receiver partially mitigates the interference arising from the transmission of
information to the other user by considering W. The full region is obtained by time-sharing between
the situations where W is intended for the first or for the second receiver.
Whereas in Marton's setting the maximal points of the region are achieved when only one
receiver performs partial interference cancelation on the signals intended for the other user, we
study whether an increase in the achievable rates can be realized by allowing both receivers
to cancel interference simultaneously. To that end, we consider in Section 2.2 an extension of
Marton's coding/decoding schemes in which both receivers are allowed to simultaneously take
into account information about the message of the undesired user. The extended coding scheme
uses four random variables: W1, W2, U1, U2. As in Marton's setting, U1 and U2 bear private
information to receiver 1 and receiver 2, respectively. The difference comes with the signal W1,
which is intended for receiver 1 but is taken into account in receiver 2; the role of W2 is dual.
The pair (U1, U2) is generated conditionally on (W1, W2), and both pairs, (W1, W2) and (U1, U2),
are correlated using a layered random binning scheme. Hence, the two-layered random binning
coding scheme is suitable for arbitrary joint probability distributions PW1 W2 U1 U2 X.
To illustrate whether or not this enhanced feature may lead to an increase in the achievable
rates, we analyze the probability of error of the proposed coding and decoding schemes and derive
a new achievable rate region, the two-layered random binning region. We prove in Section 2.2
that Marton's region and the two-layered random binning region are equal. That is, for the BC
the achievable rates are the same whether only one user at a time or both of them perform partial
interference cancelation of the undesired signal.
This result is indeed not valid for the IC [Car78], for which the best achievable regions
[Han81, Cho06a, Cho06b] are obtained with coding/decoding strategies which allow for simultaneous partial interference cancelation. In general, these strategies can be shown to strictly
outperform the regions obtained with non-simultaneous partial interference cancelation. The
rationale behind this fact is that in the BC, since all the interference is originated by the same
source, we can benefit from being able to correlate the signals intended for both users. In contrast, in the IC, there is room for rate increase as the codewords of the users are forced to be
independent since no cooperation between sources is possible.
Section 2.2 is organized as follows: in Section 2.2.1 we give the definitions that
will be needed in Section 2.2.2, where the analysis of the two-layered random binning region is
performed. The two-layered random binning region is shown to be equal to Marton's region in
Section 2.2.3, hence showing that the generalization of Marton's scheme does not increase the
achievable rates in the BC. Section 2.2.4 draws a parallel with the IC, to which the result
of Section 2.2.3 cannot be extrapolated (a counterexample is provided).
2.1.2 How
If the previous section discussed the convenience of allowing all the receivers to perform partial
interference cancelation simultaneously, we now move on to study how to perform cancelation in
a way that the achievable rates are the largest possible. While Section 2.2 unveils some desirable
features of an interference cancelation strategy for the BC, it is their application to the IC that
yields a real improvement.
The characterization of the capacity region of the general IC remains as an open problem,
although it has been solved in several cases. Namely:
When all the channel outputs are statistically equivalent [Car83].
When the interference is very strong [Car75] or strong [Sat78, Han81, Sat81, Cos87].
A class of discrete additive degraded interference channels [Ben79].
A class of deterministic interference channels [Gam82].
Other references that obtained inner and outer bounds on the capacity region of the IC are
[Car78, Han81, Car83] for the general case and [Car78, Han81, Car83, Cos87, Sas04, Kra04] for the
Gaussian case.
Our work particularly draws on the previous research by Carleial [Car78] and Han and
Kobayashi (H&K) [Han81]. Both provided achievable rate regions for the discrete memoryless
IC (DMIC) which were based on the application of the superposition coding technique envisioned
by Cover for the broadcast channel [Cov72, Ber73]. While [Car78] considered successive
superposition, [Han81] used simultaneous superposition. In particular, the most general achievable
rate region derived in [Han81] is shown to include the achievable rate region of [Car78]
and has remained the best achievable rate region for the DMIC so far. Contrary to what
is discussed in [Han81], we do not think that the goodness of the region of [Han81] must be
attributed to the superiority of simultaneous superposition coding over successive superposition
coding but to the superiority of the decoding scheme that was used in [Han81] with respect to
that of [Car78].
Following the terminology of [Cov72, Ber73, Meu75], Carleial's and H&K's regions are based
on encapsulating cloud centers and additional high-resolution information (satellites) inside
the transmitters' codewords. The behavior of the decoding at each receiver is ruled by the
information-bearing random variables of the desired transmitter and the cloud center of the
interfering sender. While Carleial adopted a sequential decoding strategy, H&K applied joint
decoding. Since the former is a special case of the latter, H&K's decoding strategy is superior.
In Section 2.3, we extend H&K's decoding strategy by making explicit that each receiver is
only interested in the information-bearing random variables of its desired user. For instance,
receiver 1 may consider the ensemble of cloud centers of sender 2 to aid the decoding of its
intended message: ambiguity on the actual message carried by the cloud center of sender 2 can
be allowed as long as there is no ambiguity on its desired message. While H&K and Carleial
forced receiver 1 to reliably decode the message borne by the cloud center of sender 2, we just
consider that cloud center as an aid for the decoding of the intended message. The concept
of aided decoding is also present in the work of Marton [Mar79] for the broadcast channel.
As previously explained, in [Mar79, Thm. 2] an information-bearing random variable W is
communicated from one sender to one receiver that reliably decodes it, while the non-intended
receiver uses it just for aiding its decoding. In the context of the IC, the decoding of each
receiver is aided by the cloud centers of the interfering sender.
We show in Section 2.3 that the gains associated with the use of aided decoding instead of joint
decoding are in the form of fewer rate constraints arising from the expression of the probability
of error. That leads to an achievable rate region at least as large as H&K's region. While
potential rate gains could also be expected from this result, [Cho06b] showed that both regions
are equal. This does not imply that simplification is the only benefit of aided decoding:
having fewer constraints, the resulting error exponents in the finite blocklength regime can be
larger with aided decoding than with H&K's decoding strategy.

The coverage of this item in Section 2.3 is structured in the following manner. Section 2.3.1
addresses the definitions and some preliminaries regarding the 2 × 2 DMIC. The achievable rate
region with successive superposition and aided decoding is presented in Section 2.3.2, where
it is compared with the regions by Carleial [Car78] and H&K [Han81], and Section 2.3.3 shows
an example of a 2 × 2 DMIC for which the proposed achievable rate region outperforms H&K's
region for a given choice of the input statistics.
2.1.3 Summary of contributions
This chapter focuses first on the convenience of allowing all the receivers of a network to perform
partial interference cancelation simultaneously. The study of this issue has resulted in the
following contributions:
The coding/decoding strategy of Marton [Mar79] for the BC is extended to allow for a
scenario of simultaneous interference cancelation by all the receivers, giving rise to the
two-layered random binning region.
The two-layered random binning region and Marton's region are shown to be equal, thus
concluding that simultaneous and alternate partial interference cancelation result in the
same achievable rates in the BC.
This fact has been compared to what happens in the IC, where simultaneous partial
interference cancelation clearly outperforms alternate cancelation.
From the study of the coding scheme of Marton for the BC, some ideas have been extrapolated
to the IC, where the target has been to extend the long-standing H&K achievable region
[Han81]. In this respect, the contributions are the following:
A new decoding scheme, aided decoding, is introduced and shown to result in an achievable
rate region which contains only a subset of the inequalities that describe H&K's region.
An example is provided where, for some fixed input statistics, the achievable rates with
aided decoding strictly outperform those of H&K.
Based upon [Cho06b], which shows that the H&K region and the aided decoding region are
equal, the advantages of the aided decoding region are limited to simplicity (fewer rate
constraints and fewer auxiliary random variables) and potentially better error exponents.
[Figure 2.1 (block diagram): the encoder maps the messages (M1, M2) into the codeword X^n(M1, M2), which is sent over the marginal channels PY1|X and PY2|X; receiver k observes Yk^n and outputs the estimate m̂k(Yk^n), k = 1, 2.]
Figure 2.1: The 1 × 2 discrete memoryless broadcast channel without common information.
2.2 Simultaneous vs Alternate Partial Interference Cancelation in the BC
2.2.1 Definitions
A 1 × 2 DMBC without common information consists of one information source (sender) and 2
receivers. The sender wishes to communicate independent information to each of the receivers.
Definition 2.1 A 1 × 2 discrete memoryless broadcast channel (DMBC) consists of one finite
input alphabet set $\mathcal{X}$, two finite output alphabet sets $\mathcal{Y}_1$, $\mathcal{Y}_2$, and two marginal transition
probability distributions $P_{Y_1|X}$ and $P_{Y_2|X}$ such that
$$P\{y_k^n | x^n\} = \prod_{j=1}^{n} P_{Y_k|X}(y_{k,j} | x_j), \quad k = 1, 2.$$
Note that the DMBC is not defined in terms of its joint transition probability $P_{Y_1 Y_2|X}$ since
the capacity region depends only on the marginal distributions [Cov06, Thm. 14.6.1].
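The memoryless property in Definition 2.1 can be made concrete with a short sketch: the likelihood of an output sequence is the product of per-letter transition probabilities of the corresponding marginal channel. The helper names and the binary symmetric marginals used below are illustrative assumptions, not taken from the original text.

```python
import numpy as np

def sequence_likelihood(channel, x_seq, y_seq):
    """P{y^n | x^n} for a memoryless channel.

    channel[x, y] = P_{Y|X}(y | x); the result is the product over the
    n letter pairs (x_j, y_j), as in Definition 2.1.
    """
    channel = np.asarray(channel, dtype=float)
    return float(np.prod(channel[np.asarray(x_seq), np.asarray(y_seq)]))

def bsc(p):
    """Hypothetical marginal: binary symmetric channel with crossover p."""
    return np.array([[1 - p, p], [p, 1 - p]])

x = [0, 1, 1, 0]
lik_1 = sequence_likelihood(bsc(0.1), x, [0, 1, 0, 0])  # receiver 1, one flipped letter
lik_2 = sequence_likelihood(bsc(0.2), x, [0, 1, 1, 0])  # receiver 2, no flipped letter
```

Each receiver is handled with its own marginal, which is consistent with the remark that the joint transition probability is not needed.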

Definition 2.2 A $((2^{nR_1}, 2^{nR_2}), n)$ code for the 1 × 2 DMBC without common information
consists of two sets of integers $[1, 2^{nR_k}]$ ($k = 1, 2$) called the message sets, one encoding function,
$$X^n : [1, 2^{nR_1}] \times [1, 2^{nR_2}] \to \mathcal{X}^n \tag{2.5}$$
and 2 decoding functions
$$\hat{m}_k : \mathcal{Y}_k^n \to [1, 2^{nR_k}], \quad k = 1, 2. \tag{2.6}$$
The sender draws the indices $(M_1, M_2)$ independently and uniformly from the message sets
$[1, 2^{nR_k}]$ ($k = 1, 2$) and sends the corresponding codeword $X^n(M_1, M_2)$, of length $n$, over the
channel. The $k$-th receiver uses its channel output, $Y_k^n$, to create an estimate of the message
$M_k$, $\hat{m}_k(Y_k^n)$. This process is shown in Figure 2.1.
Since the message for each user is drawn independently, the average probability of error of
a $((2^{nR_1}, 2^{nR_2}), n)$ code for the DMBC is defined as
$$P_e^{(n)} = 2^{-n(R_1 + R_2)} \sum_{\substack{m_k \in [1, 2^{nR_k}] \\ k = 1, 2}} P\{ \{\hat{m}_1(Y_1^n) \neq m_1\} \cup \{\hat{m}_2(Y_2^n) \neq m_2\} \,|\, (m_1, m_2) \text{ sent} \}. \tag{2.7}$$
Definition 2.3 A rate pair $(R_1, R_2)$ is said to be achievable for the 1 × 2 DMBC if there exists
a sequence of $((2^{nR_1}, 2^{nR_2}), n)$ codes with $P_e^{(n)} \to 0$ for arbitrarily large block length $n$.
2.2.2 The two-layered random binning achievable rate region
Consider the following generalization of Marton's setting:

Two auxiliary random variables W1 and W2 are correlated using the random binning
technique.
A pair of auxiliary random variables U1 and U2 are generated conditionally on W1 and W2

and are correlated also using the random binning technique.
The signals $(W_k, U_k)$ bear the information of $M_k$ to the $k$-th receiver, $k = 1, 2$.
The $k$-th receiver decodes $(W_k, U_k)$ aided by $W_j$ ($j \neq k$) when computing the typical sets,
$k = 1, 2$. That is, each user performs partial interference cancelation by considering the
signal $W_j$.
The achievable rate region based on the above points is called the two-layered random binning
region, and is stated in the following theorem.
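Theorem 2.2 Consider the region $\mathcal{R}(P_{W_1 W_2 U_1 U_2 X})$ consisting of the set of $(R_1, R_2)$ rate pairs
that can be expressed as $R_1 = R_{W_1} + R_{U_1}$, $R_2 = R_{W_2} + R_{U_2}$ satisfying
$$R_{W_1} \le H(W_1) \; ; \; R_{U_1} \le \tilde{R}_{U_1} \tag{2.8}$$
$$R_{W_2} \le H(W_2) \; ; \; R_{U_2} \le \tilde{R}_{U_2} \tag{2.9}$$
$$R_{W_1} + R_{W_2} \le H(W_1 W_2) \; ; \; R_{U_1} + R_{U_2} \le \tilde{R}_{U_1} + \tilde{R}_{U_2} - I(U_1; U_2 | W_1 W_2) \tag{2.10}$$
$$\tilde{R}_{U_1} \le I(U_1; Y_1 | W_1 W_2) \; ; \; \tilde{R}_{U_2} \le I(U_2; Y_2 | W_1 W_2) \tag{2.11}$$
$$R_{W_1} + \tilde{R}_{U_1} \le I(W_1 U_1; Y_1 | W_2) \; ; \; R_{W_2} + \tilde{R}_{U_2} \le I(W_2 U_2; Y_2 | W_1) \tag{2.12}$$
$$R_{W_2} + \tilde{R}_{U_1} \le I(W_2 U_1; Y_1 | W_1) \; ; \; R_{W_1} + \tilde{R}_{U_2} \le I(W_1 U_2; Y_2 | W_2) \tag{2.13}$$
$$R_{W_1} + R_{W_2} + \tilde{R}_{U_1} \le I(W_1 W_2 U_1; Y_1) \; ; \; R_{W_1} + R_{W_2} + \tilde{R}_{U_2} \le I(W_1 W_2 U_2; Y_2) \tag{2.14}$$
for some joint probability distribution $P_{W_1 W_2 U_1 U_2 X}$ defined on $\mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{X}$ and
non-negative reals $\tilde{R}_{U_1}$, $\tilde{R}_{U_2}$. Denote by $\mathcal{P}'$ the set of all such distributions. The region
$$\mathcal{R} = \mathrm{Co}\left( \bigcup_{P_{W_1 W_2 U_1 U_2 X} \in \mathcal{P}'} \mathcal{R}(P_{W_1 W_2 U_1 U_2 X}) \right), \tag{2.15}$$
called the two-layered random binning region, is achievable for the 1 × 2 DMBC.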
Proof. See Appendix 2.A.
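To make the structure of the constraints (2.8)-(2.14) easier to follow, the sketch below checks whether one candidate rate split satisfies them for a fixed distribution. It assumes that the entropies and mutual informations have already been computed and are passed in a dictionary; the function name, the dictionary keys, and the toy values are hypothetical and serve only as an illustration.

```python
def in_two_layered_region(rates, aux, q, tol=1e-12):
    """Check (2.8)-(2.14) for one rate split.

    rates: (RW1, RU1, RW2, RU2), so that R1 = RW1 + RU1 and R2 = RW2 + RU2.
    aux:   (Rt_U1, Rt_U2), the auxiliary non-negative reals of the theorem.
    q:     dict of information quantities in bits (keys are hypothetical labels).
    """
    rw1, ru1, rw2, ru2 = rates
    t1, t2 = aux
    constraints = [  # pairs (left-hand side, right-hand side)
        (rw1, q["H(W1)"]), (ru1, t1),                                    # (2.8)
        (rw2, q["H(W2)"]), (ru2, t2),                                    # (2.9)
        (rw1 + rw2, q["H(W1W2)"]),
        (ru1 + ru2, t1 + t2 - q["I(U1;U2|W1W2)"]),                       # (2.10)
        (t1, q["I(U1;Y1|W1W2)"]), (t2, q["I(U2;Y2|W1W2)"]),              # (2.11)
        (rw1 + t1, q["I(W1U1;Y1|W2)"]), (rw2 + t2, q["I(W2U2;Y2|W1)"]),  # (2.12)
        (rw2 + t1, q["I(W2U1;Y1|W1)"]), (rw1 + t2, q["I(W1U2;Y2|W2)"]),  # (2.13)
        (rw1 + rw2 + t1, q["I(W1W2U1;Y1)"]),
        (rw1 + rw2 + t2, q["I(W1W2U2;Y2)"]),                             # (2.14)
    ]
    nonneg = min(rw1, ru1, rw2, ru2, t1, t2) >= 0
    return nonneg and all(lhs <= rhs + tol for lhs, rhs in constraints)

# Toy values (assumed): W1, W2 constant, U1 and U2 independent bits seen noiselessly.
q_toy = {
    "H(W1)": 0.0, "H(W2)": 0.0, "H(W1W2)": 0.0, "I(U1;U2|W1W2)": 0.0,
    "I(U1;Y1|W1W2)": 1.0, "I(U2;Y2|W1W2)": 1.0,
    "I(W1U1;Y1|W2)": 1.0, "I(W2U2;Y2|W1)": 1.0,
    "I(W2U1;Y1|W1)": 1.0, "I(W1U2;Y2|W2)": 1.0,
    "I(W1W2U1;Y1)": 1.0, "I(W1W2U2;Y2)": 1.0,
}
feasible = in_two_layered_region((0.0, 1.0, 0.0, 1.0), (1.0, 1.0), q_toy)
infeasible = in_two_layered_region((0.0, 1.2, 0.0, 1.0), (1.0, 1.0), q_toy)
```

With these toy values the split carrying one bit per user passes every constraint, whereas asking for 1.2 bits on $R_{U_1}$ violates (2.8). Describing the full region (2.15) would additionally require a search over the auxiliary reals and distributions followed by a convex hull, which this sketch does not attempt.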
2.2.3 Equality of Marton's region and the two-layered random binning region
For the purpose of comparing Marton's achievable rate region to the achievable rate region
with simultaneous partial interference cancelation, the two-layered random binning region, we
consider the following alternative rephrasing of $\mathcal{R}_{MT}$.
Lemma 2.1 Consider the region $\mathcal{R}_{MT}(P_{W_1 W_2 U_1 U_2 X})$ consisting of the set of $(R_1, R_2)$ rate pairs
satisfying
$$R_1 \le I(W_1 W_2 U_1; Y_1) \tag{2.16}$$
$$R_2 \le I(W_1 W_2 U_2; Y_2) \tag{2.17}$$
$$R_1 + R_2 \le \min\{I(W_1 W_2; Y_1), I(W_1 W_2; Y_2)\} \tag{2.18}$$
$$\qquad + I(U_1; Y_1 | W_1 W_2) + I(U_2; Y_2 | W_1 W_2) - I(U_1; U_2 | W_1 W_2) \tag{2.19}$$
for some joint probability distribution $P_{W_1 W_2 U_1 U_2 X}(w_1, w_2, u_1, u_2, x) \in \mathcal{P}'$. Then, the following
holds,
$$\mathcal{R}_{MT} = \bigcup_{P_{W_1 W_2 U_1 U_2 X} \in \mathcal{P}'} \mathcal{R}_{MT}(P_{W_1 W_2 U_1 U_2 X}). \tag{2.20}$$
Proof. Note that
$$\mathcal{R}_{MT} = \bigcup_{P_{W U_1 U_2 X} \in \mathcal{P}} \mathcal{R}_{MT}(P_{W U_1 U_2 X}) \subseteq \bigcup_{P_{W_1 W_2 U_1 U_2 X} \in \mathcal{P}'} \mathcal{R}_{MT}(P_{W_1 W_2 U_1 U_2 X}) \tag{2.21}$$
since we can always set $(W_1, W_2) = (W, \emptyset)$ or $(W_1, W_2) = (\emptyset, W)$ in $\mathcal{R}_{MT}(P_{W_1 W_2 U_1 U_2 X})$, where
$\emptyset$ is a constant. On the other hand, consider the Markov chain $(W_1, W_2) \to W \to (U_1, U_2)$
obtained when $W = f(W_1, W_2)$, a deterministic function of $(W_1, W_2)$. Then, by the data
processing theorem and the fact that $W = f(W_1, W_2)$ we obtain
$$I(W U_k; Y_k) \le I(W_1 W_2 U_k; Y_k), \quad k = 1, 2 \tag{2.22}$$
$$I(W; Y_k) \le I(W_1 W_2; Y_k), \quad k = 1, 2 \tag{2.23}$$
$$I(U_k; Y_k | W) = I(U_k; Y_k | f(W_1, W_2)) = I(U_k; Y_k | W_1 W_2), \quad k = 1, 2 \tag{2.24}$$
$$I(U_1; U_2 | W) = I(U_1; U_2 | f(W_1, W_2)) = I(U_1; U_2 | W_1 W_2). \tag{2.25}$$
The mutual information terms on the rightmost side of the above relations depend
on $P_{W_1 W_2 U_1 U_2 X}$. Since we can cover all the distributions $P_W$ by passing the distribution $P_{W_1 W_2}$
through a deterministic function, the following holds
$$\mathcal{R}_{MT} = \bigcup_{P_{W U_1 U_2 X} \in \mathcal{P}} \mathcal{R}_{MT}(P_{W U_1 U_2 X}) \supseteq \bigcup_{P_{W_1 W_2 U_1 U_2 X} \in \mathcal{P}'} \mathcal{R}_{MT}(P_{W_1 W_2 U_1 U_2 X}), \tag{2.26}$$
which concludes the proof.
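For readers who want to verify the relation numbered (2.22), one possible derivation uses only the chain rule for mutual information and the assumption $W = f(W_1, W_2)$ made in the proof. It is offered as a sketch of the reasoning rather than as the argument of the original proof:

```latex
\begin{align*}
I(W_1 W_2 U_k; Y_k)
  &= I(W\, W_1 W_2 U_k; Y_k)
     && \text{since } W = f(W_1, W_2) \\
  &= I(W U_k; Y_k) + I(W_1 W_2; Y_k \mid W U_k)
     && \text{chain rule} \\
  &\ge I(W U_k; Y_k), \qquad k = 1, 2,
     && \text{non-negativity of mutual information.}
\end{align*}
```

The relation numbered (2.23) follows from the same three steps with $U_k$ removed.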
The two-layered random binning region R of Section 2.2.2 is the natural extension of Marton's
achievable rate region to the case in which both users perform simultaneous partial interference
cancelation on the signal intended for the other user. On the other hand, Lemma 2.1 allows us
to compare it with Marton's region expressed in terms of the same ensemble of auxiliary random
variables. The main result of this section is stated in the following theorem.
Theorem 2.3 The achievable rates are not enlarged when both users perform simultaneous
partial interference cancelation with respect to time-sharing between only one user canceling
interference, i.e.,
$$\mathcal{R} = \mathcal{R}_{MT}. \tag{2.27}$$
Marton's achievable rate region and the two-layered random binning region are equal.
Proof. See Appendix 2.B.
2.2.4 Comparison with the interference channel
For the general discrete memoryless IC, the best achievable rate region is that of Han and Kobayashi [Han81], which is based on embedding the message $M_k$ into the codeword $X_k$ ($k = 1, 2$) in two steps: first, some partial information about $M_k$ is borne by an auxiliary random variable; second, the auxiliary random variable together with the rest of the information is embedded into $X_k$. While the auxiliary random variable of each user helps the decoding of the interfered user when computing the typical sets (like the signals $W_1, W_2$ of region $\mathcal{R}$), the information not embedded in that variable (analogous to the indexes $j_1, j_2$ of $\mathcal{R}$) remains private. The improvement on [Han81] discussed in Section 2.3 is that the partial information carried by the auxiliary random variable of the interfering transmitter (analogous to the indexes $i_1, i_2$ of $\mathcal{R}$) need not be reliably decoded at the interfered receiver.
In contrast to the BC, the best achievable rates for the IC [Han81] are achieved by coding schemes that allow for simultaneous partial interference cancelation. These schemes can be shown to strictly outperform time-sharing among non-simultaneous interference cancelation setups, whereas in the DMBC (Theorem 2.3) the achievable rates remain unchanged.
Consider a DMIC with very strong interference, that is, a DMIC for which the following holds
\begin{align}
I(X_1; Y_1 | X_2) &\leq I(X_1; Y_2) \tag{2.28} \\
I(X_2; Y_2 | X_1) &\leq I(X_2; Y_1). \tag{2.29}
\end{align}
The capacity region of such a channel [Car75] is given by the convex hull of the set of $(R_1, R_2)$ rate pairs satisfying
\begin{align}
R_1 &\leq I(X_1; Y_1 | X_2) \tag{2.30} \\
R_2 &\leq I(X_2; Y_2 | X_1) \tag{2.31}
\end{align}
for some product probability distribution $P_{X_1} P_{X_2}$. The capacity region is achieved when both receivers perform successive decoding. This is equivalent to considering that both receivers aid their decoders by including the codeword of the undesired transmitter when computing the typical sets. Therefore, the capacity-achieving scheme performs simultaneous interference cancelation.
If we consider the time-sharing of two non-simultaneous interference cancelation schemes, we are only able to achieve the region given by the convex hull of the set of $(R_1, R_2)$ rate pairs satisfying
\begin{align}
R_1 &\leq I(X_1; Y_1 | X_2) \tag{2.32} \\
R_2 &\leq I(X_2; Y_2) \tag{2.33}
\end{align}
or
\begin{align}
R_1 &\leq I(X_1; Y_1) \tag{2.34} \\
R_2 &\leq I(X_2; Y_2 | X_1) \tag{2.35}
\end{align}
for some product probability distribution $P_{X_1} P_{X_2}$. Clearly, the last region is a subset of the capacity region. For some channels, it is a strict subset (consider for instance the Gaussian IC [Car78]). Hence, for DMICs with strong interference, simultaneous interference cancelation can strictly outperform non-simultaneous interference cancelation. The consideration of the interference channel with very strong interference shows that the result of Theorem 2.3 cannot be easily extended to other channels but that it describes a particular characteristic of the BC.
This fact can be explained in the following manner. A $1 \times 2$ DMBC can be viewed as a $2 \times 2$ DMIC in which both information sources are physically located inside a unique transmitter and hence there is only one input codeword. Whereas in the interference channel the codewords of each transmitter are independent, in the broadcast channel all the information originates at the same logical entity and hence the codebooks of the auxiliary random variables can be arbitrarily correlated to improve the achievable rates. This advantage at the coding side of the channel provides the transmitter with additional degrees of freedom that can be exploited in the form of clever coding schemes for which non-simultaneous interference cancelation is capable of achieving the same rates as if simultaneous interference cancelation were performed with more complicated codes. Since in the interference channel there is no possibility to correlate the codewords of the transmitters, it is strictly beneficial for the achievable rates that both receivers simultaneously cancel out part of their interference.
2.3 Partial Interference Cancelation through Aided Decoding

2.3.1 Definitions
Following the terminology of [Han81], we shall distinguish between the $2 \times 2$ DMIC and the $2 \times 2$ modified DMIC.

[Figure 2.2: The $2 \times 2$ Discrete Memoryless IC (DMIC). Block diagram: $M_k \to X_k^n(M_k) \to P_{Y_k|X_1 X_2}(y_k|x_1, x_2) \to Y_k^n \to \hat{m}_k(Y_k^n)$, $k = 1, 2$.]
2.3.1.1 The $2 \times 2$ DMIC
Definition 2.4 A $2 \times 2$ DMIC consists of two finite input alphabet sets $\mathcal{X}_1, \mathcal{X}_2$, two finite output alphabet sets $\mathcal{Y}_1, \mathcal{Y}_2$, and two marginal transition probability distributions $P_{Y_1|X_1 X_2}$ and $P_{Y_2|X_1 X_2}$ such that
$$P\{y_k^n | x_1^n, x_2^n\} = \prod_{j=1}^{n} p(y_{k,j} | x_{1,j}, x_{2,j}), \quad k = 1, 2.$$
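The memoryless property in Definition 2.4 can be illustrated with a short sketch. The binary channel table `p1` below is a hypothetical example introduced only for illustration (its numbers do not come from the text); the point is simply that the block likelihood factors into per-letter terms.

```python
from math import prod

# Hypothetical binary channel: p1[(x1, x2)] = P(Y1 = 1 | x1, x2).
# These numbers are illustrative assumptions, not taken from the text.
p1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.7, (1, 1): 0.9}


def symbol_prob(y, x1, x2, p_one):
    """Single-letter transition probability p(y | x1, x2) for binary y."""
    q = p_one[(x1, x2)]
    return q if y == 1 else 1.0 - q


def block_prob(y_seq, x1_seq, x2_seq, p_one):
    """Memoryless extension: product of the per-letter probabilities."""
    assert len(y_seq) == len(x1_seq) == len(x2_seq)
    return prod(symbol_prob(y, a, b, p_one)
                for y, a, b in zip(y_seq, x1_seq, x2_seq))


# p(1|0,0) * p(0|1,0) * p(1|1,1) for a block of length n = 3
likelihood = block_prob([1, 0, 1], [0, 1, 1], [0, 0, 1], p1)
```

For fixed input sequences, summing `block_prob` over all output sequences of the same length gives one, as expected of a conditional distribution.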

Definition 2.5 A $((2^{nR_1}, 2^{nR_2}), n)$ code for the $2 \times 2$ DMIC consists of two sets of integers $[1, 2^{nR_k}]$ ($k = 1, 2$), called the message sets, two encoding functions,
$$X_k^n : [1, 2^{nR_k}] \to \mathcal{X}_k^n, \quad k = 1, 2, \tag{2.36}$$
and two decoding functions
$$\hat{m}_k : \mathcal{Y}_k^n \to [1, 2^{nR_k}], \quad k = 1, 2. \tag{2.37}$$
Sender $k$ draws an integer $M_k$ uniformly from its message set $[1, 2^{nR_k}]$ and sends the corresponding codeword $X_k^n(M_k)$, of length $n$, over the channel ($k = 1, 2$). Receiver $k$ uses its channel output $Y_k^n$ to create an estimate $\hat{m}_k(Y_k^n)$ of the transmitted message. This process is shown in Figure 2.2.
Since the message of each user is drawn independently, the average per-receiver probabilities of error of a $((2^{nR_1}, 2^{nR_2}), n)$ code for the DMIC can be expressed as
$$P_k^{(n)}(e) = 2^{-n(R_1 + R_2)} \sum_{\substack{m_k \in [1, 2^{nR_k}] \\ k = 1, 2}} P\{\hat{m}_k(Y_k^n) \neq m_k \,|\, (m_1, m_2)\ \text{sent}\}, \quad k = 1, 2. \tag{2.38}$$
Definition 2.6 A rate tuple $(R_1, R_2)$ is said to be achievable for the $2 \times 2$ DMIC if there exists a sequence of $((2^{nR_1}, 2^{nR_2}), n)$ codes with $P_k^{(n)}(e) \to 0$, $k = 1, 2$, for an arbitrarily large block length $n$.
2.3.1.2 The $2 \times 2$ modified DMIC
In the modified DMIC, the way the information is coded and sent through the channel is different.
Essentially, each user sends two independent flows of information instead of just one.

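Definition 2.7 A $((2^{nR_{10}}, 2^{nR_{11}}, 2^{nR_{20}}, 2^{nR_{22}}), n)$ code for the $2 \times 2$ modified DMIC consists of four sets of integers $[1, 2^{nR_{k0}}]$, $[1, 2^{nR_{kk}}]$ ($k = 1, 2$), called the message sets, two encoding functions,
$$X_k^n : [1, 2^{nR_{k0}}] \times [1, 2^{nR_{kk}}] \to \mathcal{X}_k^n, \quad k = 1, 2, \tag{2.39}$$
and two decoding functions
$$(\hat{m}_{k0}, \hat{m}_{kk}) : \mathcal{Y}_k^n \to [1, 2^{nR_{k0}}] \times [1, 2^{nR_{kk}}], \quad k = 1, 2. \tag{2.40}$$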

Sender $k$ draws two integers $(M_{k0}, M_{kk})$ uniformly from its message sets $[1, 2^{nR_{k0}}] \times [1, 2^{nR_{kk}}]$ and sends the corresponding codeword $X_k^n(M_{k0}, M_{kk})$, of length $n$, over the channel ($k = 1, 2$). $M_{k0}$ will be termed the low-resolution message, while $M_{kk}$ will be termed the high-resolution message, $k = 1, 2$. Receiver $k$ uses its channel output $Y_k^n$ to create an estimate $(\hat{M}_{k0}(Y_k^n), \hat{M}_{kk}(Y_k^n))$ of the transmitted messages.

Again, the messages of each user are drawn independently, and the average per-receiver probabilities of error of a $((2^{nR_{10}}, 2^{nR_{11}}, 2^{nR_{20}}, 2^{nR_{22}}), n)$ code for the $2 \times 2$ modified DMIC can be expressed as
$$P_k^{(n)}(e) = 2^{-n(R_{10} + R_{11} + R_{20} + R_{22})} \sum_{\substack{m_{10}, m_{11} \\ m_{20}, m_{22}}} P\{(\hat{m}_{k0}(Y_k^n), \hat{m}_{kk}(Y_k^n)) \neq (m_{k0}, m_{kk}) \,|\, (m_{10}, m_{11}, m_{20}, m_{22})\ \text{sent}\}, \quad k = 1, 2. \tag{2.41}$$
Definition 2.8 A rate tuple $(R_{10}, R_{11}, R_{20}, R_{22})$ is said to be achievable for the $2 \times 2$ modified DMIC if there exists a sequence of $((2^{nR_{10}}, 2^{nR_{11}}, 2^{nR_{20}}, 2^{nR_{22}}), n)$ codes with $P_k^{(n)}(e) \to 0$, $k = 1, 2$, for an arbitrarily large block length $n$.
There are two main reasons for considering the modified interference channel:
• Any achievability result in the modified channel can be extrapolated to the original one: if $(R_{10}, R_{11}, R_{20}, R_{22})$ is achievable for the former, $(R_1, R_2) = (R_{10} + R_{11}, R_{20} + R_{22})$ is achievable for the latter [Han81, Cor. 2.1]. In fact, the achievable regions of Carleial, H&K, and the proposed one have been obtained as projections onto the $(R_1, R_2)$ plane of achievable regions in the modified setup. This will become explicit in Section 2.3.2.
• By splitting the information flow of one sender-receiver pair into a high-resolution and a low-resolution message component, partial interference cancelation arises in a natural way by trying to cancel out the impact of the low-resolution signal of the interfering user.
2.3.2 The achievability result
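Consider the region $\mathcal{R}(P_{Q Z_1 Z_2 X_1 X_2})$ consisting of the set of $(R_1, R_2)$ rate pairs such that $R_1 = R_{10} + R_{11}$, $R_2 = R_{20} + R_{22}$ satisfying
\begin{align}
R_{11} &\leq I(X_1; Y_1 | Z_1 Z_2 Q); & R_{22} &\leq I(X_2; Y_2 | Z_1 Z_2 Q) \tag{2.42} \\
R_{10} + R_{11} &\leq I(X_1; Y_1 | Z_2 Q); & R_{20} + R_{22} &\leq I(X_2; Y_2 | Z_1 Q) \tag{2.43} \\
R_{11} + R_{20} &\leq I(X_1 Z_2; Y_1 | Z_1 Q); & R_{22} + R_{10} &\leq I(X_2 Z_1; Y_2 | Z_2 Q) \tag{2.44} \\
R_{10} + R_{11} + R_{20} &\leq I(X_1 Z_2; Y_1 | Q); & R_{20} + R_{22} + R_{10} &\leq I(X_2 Z_1; Y_2 | Q) \tag{2.45}
\end{align}
for some joint probability distribution of the form $P_{Q Z_1 Z_2 X_1 X_2} = P_Q P_{Z_1|Q} P_{X_1|Z_1 Q} P_{Z_2|Q} P_{X_2|Z_2 Q}$ defined on $\mathcal{Q} \times \mathcal{Z}_1 \times \mathcal{Z}_2 \times \mathcal{X}_1 \times \mathcal{X}_2$. Denote by $\mathcal{P}$ the set of all such valid joint distributions. $Q$ is a time-sharing random variable whose support set satisfies $|\mathcal{Q}| \leq 3$ owing to Carathéodory's theorem.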
Theorem 2.4 The region
$$\mathcal{R} = \bigcup_{P_{Q Z_1 Z_2 X_1 X_2} \in \mathcal{P}} \mathcal{R}(P_{Q Z_1 Z_2 X_1 X_2}) \tag{2.46}$$
is achievable for the $2 \times 2$ interference channel.

Proof. Please refer to Appendix 2.C.
2.3.2.1 Carleial's achievable rate region
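Consider the region $\mathcal{R}_C(P_{Q Z_1 Z_2 X_1 X_2})$ consisting of the set of $(R_1, R_2)$ rate pairs such that $R_1 = R_{10} + R_{11}$, $R_2 = R_{20} + R_{22}$ satisfying
\begin{align}
&\{R_{10} \leq I(Z_1; Y_1 | Q);\ R_{20} \leq I(Z_2; Y_1 | Z_1 Q)\}\ \text{or}\ \{R_{10} \leq I(Z_1; Y_1 | Z_2 Q);\ R_{20} \leq I(Z_2; Y_1 | Q)\} \tag{2.47} \\
&\{R_{10} \leq I(Z_1; Y_2 | Q);\ R_{20} \leq I(Z_2; Y_2 | Z_1 Q)\}\ \text{or}\ \{R_{10} \leq I(Z_1; Y_2 | Z_2 Q);\ R_{20} \leq I(Z_2; Y_2 | Q)\} \tag{2.48} \\
&R_{11} \leq I(X_1; Y_1 | Z_1 Z_2 Q);\quad R_{22} \leq I(X_2; Y_2 | Z_1 Z_2 Q) \tag{2.49}
\end{align}
for some joint probability distribution of the form $P_{Q Z_1 Z_2 X_1 X_2} \in \mathcal{P}$ defined on $\mathcal{Q} \times \mathcal{Z}_1 \times \mathcal{Z}_2 \times \mathcal{X}_1 \times \mathcal{X}_2$.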
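Theorem 2.5 (Carleial) The region
$$\mathcal{R}_C = \bigcup_{P_{Q Z_1 Z_2 X_1 X_2} \in \mathcal{P}} \mathcal{R}_C(P_{Q Z_1 Z_2 X_1 X_2}) \tag{2.50}$$
is achievable for the $2 \times 2$ interference channel.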
Although Carleial formulated $\mathcal{R}_C$ using the convex-hull operation instead of the time-sharing random variable $Q$, we have used $Q$ in order to establish a fair comparison with the proposed achievable rate region. Moreover, as pointed out in [Han81], the results obtained with the formulation in terms of $Q$ are conjectured to be strictly superior to the ones obtained with the convex-hull operation for the general IC.
2.3.2.2 Han and Kobayashi's achievable rate region
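Consider the region $\mathcal{R}_{HK}(P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2})$ consisting of the set of $(R_1, R_2)$ rate pairs such that $R_1 = R_{10} + R_{11}$, $R_2 = R_{20} + R_{22}$ satisfying
\begin{align}
R_{10} &\leq I(Z_1; Y_1 | U_1 Z_2 Q); & R_{10} &\leq I(Z_1; Y_2 | U_2 Z_2 Q) \tag{2.51} \\
R_{20} &\leq I(Z_2; Y_1 | U_1 Z_1 Q); & R_{20} &\leq I(Z_2; Y_2 | U_2 Z_1 Q) \tag{2.52} \\
R_{11} &\leq I(U_1; Y_1 | Z_1 Z_2 Q); & R_{22} &\leq I(U_2; Y_2 | Z_1 Z_2 Q) \tag{2.53} \\
R_{10} + R_{20} &\leq I(Z_1 Z_2; Y_1 | U_1 Q); & R_{10} + R_{20} &\leq I(Z_1 Z_2; Y_2 | U_2 Q) \tag{2.54} \\
R_{10} + R_{11} &\leq I(Z_1 U_1; Y_1 | Z_2 Q); & R_{20} + R_{22} &\leq I(Z_2 U_2; Y_2 | Z_1 Q) \tag{2.55} \\
R_{11} + R_{20} &\leq I(U_1 Z_2; Y_1 | Z_1 Q); & R_{22} + R_{10} &\leq I(U_2 Z_1; Y_2 | Z_2 Q) \tag{2.56} \\
R_{10} + R_{11} + R_{20} &\leq I(Z_1 U_1 Z_2; Y_1 | Q); & R_{20} + R_{22} + R_{10} &\leq I(Z_2 U_2 Z_1; Y_2 | Q) \tag{2.57}
\end{align}
for some probability distribution $P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2}$ defined on $\mathcal{Q} \times \mathcal{Z}_1 \times \mathcal{Z}_2 \times \mathcal{U}_1 \times \mathcal{U}_2 \times \mathcal{X}_1 \times \mathcal{X}_2$ such that
• $Z_1$, $Z_2$, $U_1$, and $U_2$ are conditionally independent given $Q$.
• $X_1 = f_1(Z_1, U_1 | Q)$, $X_2 = f_2(Z_2, U_2 | Q)$.
Denote by $\mathcal{P}'$ the set of all such valid distributions.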
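Theorem 2.6 (Han and Kobayashi) The region
$$\mathcal{R}_{HK} = \bigcup_{P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2} \in \mathcal{P}'} \mathcal{R}_{HK}(P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2}) \tag{2.58}$$
is achievable for the $2 \times 2$ interference channel.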

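It is shown in [Han81] that the inclusion $\mathcal{R}_C \subseteq \mathcal{R}_{HK}$ holds. We show next that the inclusion $\mathcal{R}_{HK} \subseteq \mathcal{R}$ also holds.
Lemma 2.2 The achievable rate region $\mathcal{R}_{HK}$ is a subset of $\mathcal{R}$.

Proof. It is sufficient to prove that for each $P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2} \in \mathcal{P}'$, there exists some $S_{Q Z_1 Z_2 X_1 X_2} \in \mathcal{P}$ such that $\mathcal{R}_{HK}(P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2}) \subseteq \mathcal{R}(S_{Q Z_1 Z_2 X_1 X_2})$.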
Let $S_{Q Z_1 Z_2 X_1 X_2} \in \mathcal{P}$ be such that $S_{Q Z_1 Z_2 X_1 X_2} = S_Q S_{Z_1 X_1|Q} S_{Z_2 X_2|Q}$, where $S_Q = P_Q$ and
$$S_{Z_k X_k|Q}(z_k, x_k | q) = \sum_{u_k} P_{Z_k U_k|Q}(z_k, u_k | q)\, \mathbb{1}\{x_k = f_k(z_k, u_k | q)\}, \quad k = 1, 2. \tag{2.59}$$
Note that this choice of $S_{Q Z_1 Z_2 X_1 X_2}$ satisfies $S_{Q Z_1 Z_2 X_1 X_2} = P_{Q Z_1 Z_2 X_1 X_2}$. To show that $\mathcal{R}_{HK}(P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2}) \subseteq \mathcal{R}(S_{Q Z_1 Z_2 X_1 X_2})$, first note that (2.51), (2.52), and (2.54) are constraints that limit the boundary of $\mathcal{R}_{HK}(P_{Q Z_1 Z_2 U_1 U_2 X_1 X_2})$ and have no counterparts in $\mathcal{R}(S_{Q Z_1 Z_2 X_1 X_2})$. On the other hand, the bounds in (2.53), (2.55), (2.56), and (2.57) satisfy
\begin{align}
I(U_k; Y_k | Z_1 Z_2) = I(U_k Z_k; Y_k | Z_1 Z_2) &\leq I(X_k; Y_k | Z_1 Z_2), & k &= 1, 2 \tag{2.60} \\
I(U_k Z_j; Y_k | Z_k) = I(Z_k U_k Z_j; Y_k | Z_k) &\leq I(X_k Z_j; Y_k | Z_k), & k &= 1, 2,\ j \neq k \tag{2.61} \\
I(Z_k U_k; Y_k | Z_j) &\leq I(X_k; Y_k | Z_j), & k &= 1, 2,\ j \neq k \tag{2.62} \\
I(Z_k U_k Z_j; Y_k) &\leq I(X_k Z_j; Y_k), & k &= 1, 2,\ j \neq k. \tag{2.63}
\end{align}
These inequalities follow from the data processing theorem since $(Z_k, U_k) \to X_k \to (Y_1, Y_2)$ forms a Markov chain, $k = 1, 2$. Note that all the inequalities are satisfied with equality if $f_1(\cdot, \cdot)$, $f_2(\cdot, \cdot)$ are such that the following functions
\begin{align}
Z_1 &= h_{11}(X_1, U_1); & Z_2 &= h_{21}(X_2, U_2) \tag{2.64} \\
U_1 &= h_{12}(Z_1, X_1); & U_2 &= h_{22}(Z_2, X_2) \tag{2.65}
\end{align}
exist [Gam82].
2.3.2.3 Comparison of the coding strategies of $\mathcal{R}_C$, $\mathcal{R}_{HK}$, and $\mathcal{R}$
In $\mathcal{R}_C$, Carleial [Car78] considered superposition coding in which the cloud centers $Z_k$ were encapsulated probabilistically into the input codewords $X_k$, while H&K [Han81] considered the simultaneous deterministic encapsulation (represented by the functions $f_1(\cdot, \cdot)$ and $f_2(\cdot, \cdot)$) of the cloud centers $Z_k$ and the high-resolution information-bearing random variables $U_k$. For the achievability of the region $\mathcal{R}$ we adopted Carleial's approach based on the following two arguments: i) Lemma 2.2 shows that such a coding scheme incurs no information loss and may even increase the mutual information; ii) the achievable rate region can be formulated in terms of fewer random variables, obtaining simpler expressions.
We believe that the coding scheme by Carleial embodies a wider class of coding strategies than the ones considered by H&K. H&K require, for a given value of $Q$, the $Z_k$'s and the $U_k$'s to be independent. However, in the sequential superposition encoding approach, the codewords $X_k$ may be formed as a deterministic function of two random variables $Z_k$ and $U_k$, not necessarily meeting the independence constraint.
2.3.2.4 Comparison of the decoding strategies of $\mathcal{R}_C$, $\mathcal{R}_{HK}$, and $\mathcal{R}$
The region $\mathcal{R}_C$ was obtained assuming a sequential behavior of the decoders. Receiver 1, for instance, tries first to reliably decode the cloud centers of both senders in a successive manner (either $Z_1$ first and $Z_2$ second, or vice versa) and finally attempts to reliably decode the high-resolution desired message contained in $X_1$. In contrast, H&K improved the decoding performance by allowing the decoders to perform joint decoding and not requiring a specific decoding order. In this case, receiver 1 tries to reliably decode the information carried by $(Z_1, U_1, Z_2)$ jointly.

[Figure 2.3: An example $2 \times 2$ modified interference channel. Block diagram: for each sender $k$, $M_{k0} \to Z_k^n(M_{k0})$ and $M_{kk} \to U_k^n(M_{kk})$ are combined into $X_k^n$, which reaches the outputs $Y_1^n$ and $Y_2^n$, $k = 1, 2$.]
We propose a decoding scheme that further relaxes the decoding strategy of the receivers. The key point is noticing that, for instance, receiver 1 does not need to reliably decode $Z_2$. Clearly, not considering $Z_2$ at receiver 1 would limit the achievable rates. However, with the aided decoding concept, receiver 1 can benefit from taking $Z_2$ into account without decoding the low-resolution message of sender 2. The rationale behind the aided decoding concept is to include $Z_2$ to form the typical sets at receiver 1 but not to consider ambiguity about the information carried by $Z_2$ as an error event. This way, the probability of error can be decomposed into fewer terms that yield fewer rate constraints, as seen in Section 2.3.2.2.
2.3.3 An example
In order to illustrate Lemma 2.2, we present an example for fixed $Q = \emptyset$ where, for a given probability distribution $P_{Z_1 Z_2 U_1 U_2 X_1 X_2} \in \mathcal{P}'$, the region $\mathcal{R}(P_{Z_1 Z_2 X_1 X_2})$ (where $P_{Z_1 Z_2 X_1 X_2}$ is the appropriate marginalization of $P_{Z_1 Z_2 U_1 U_2 X_1 X_2}$) strictly extends $\mathcal{R}_{HK}(P_{Z_1 Z_2 U_1 U_2 X_1 X_2})$.
Consider the following binary-input ($\mathcal{X}_k = \{0, 1\}$, $k = 1, 2$), binary-output ($\mathcal{Y}_k = \{0, 1\}$, $k = 1, 2$) modified interference channel of Figure 2.3.

Let
$$P_{Y_1|X_1 X_2}(1 | x_1, x_2) = [0.8\ \ 0.9\ \ 0.2\ \ 1], \qquad P_{Y_2|X_1 X_2}(1 | x_1, x_2) = [0.8\ \ 0.1\ \ 0.8\ \ 0.2], \tag{2.66}$$
specify the transition probability matrices of the channels (the columns match the natural ordering of the inputs).

[Figure 2.4: The regions $\mathcal{R}_{HK}(P_{Z_1 Z_2 U_1 U_2 X_1 X_2})$ (dashed) and $\mathcal{R}(P_{Z_1 Z_2 X_1 X_2})$ (solid) computed for the binary interference channel of Figure 2.3. Units are [nat/ch. use].]

Let $P_{Z_1 Z_2 U_1 U_2 X_1 X_2} \in \mathcal{P}'$ be such that
$$P_{Z_1}(1) = 0.8,\ P_{U_1}(1) = 0.9; \qquad P_{Z_2}(1) = 0.8,\ P_{U_2}(1) = 0.9, \tag{2.67}$$
and, as shown in Figure 2.3, $x_k = f_k(z_k, u_k) = z_k \oplus u_k$, $k = 1, 2$. The dashed boxes of Figure 2.3 represent how, for the sequential superposition coding used in $\mathcal{R}$, $X_k$ is obtained in a transparent manner from $Z_k$ in the sense that there is no explicit labeling of the high-resolution information-bearing random variable as done in $\mathcal{R}_{HK}$ (the $U_k$ signal). For the sake of fairness, we compute $\mathcal{R}_{HK}(P_{Z_1 Z_2 U_1 U_2 X_1 X_2})$ with the distribution above and $\mathcal{R}(P_{Z_1 Z_2 X_1 X_2})$ with the same distributions of the cloud centers and the conditional distributions of the $X_k$'s given the $Z_k$'s obtained from (2.67) as well. Both regions are shown in Figure 2.4.
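The mutual information terms of such a binary example can be evaluated by brute-force enumeration. The following is a minimal sketch under two loudly stated assumptions that are not confirmed by the text: the combining function is taken to be $x_k = z_k \oplus u_k$ (XOR), and the "natural ordering" of the inputs is taken to be $(x_1, x_2) = (0,0), (0,1), (1,0), (1,1)$. Only receiver 1 is modeled, and the helper names are hypothetical.

```python
from itertools import product
from math import log

# Assumptions (not confirmed by the text): inputs ordered as
# (x1, x2) = (0,0), (0,1), (1,0), (1,1), and x_k = z_k XOR u_k.
P_Y1 = dict(zip(product((0, 1), repeat=2), (0.8, 0.9, 0.2, 1.0)))
P_Z1 = P_Z2 = 0.8   # P(Z_k = 1), from (2.67)
P_U1 = P_U2 = 0.9   # P(U_k = 1), from (2.67)

NAMES = ("z1", "u1", "z2", "u2", "x1", "x2", "y1")


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v."""
    return p if v == 1 else 1.0 - p


# Joint pmf over (z1, u1, z2, u2, x1, x2, y1); zero-mass points are skipped.
joint = {}
for z1, u1, z2, u2, y1 in product((0, 1), repeat=5):
    x1, x2 = z1 ^ u1, z2 ^ u2
    p = (bern(P_Z1, z1) * bern(P_U1, u1) * bern(P_Z2, z2) * bern(P_U2, u2)
         * bern(P_Y1[(x1, x2)], y1))
    if p > 0:
        joint[(z1, u1, z2, u2, x1, x2, y1)] = p


def entropy(names):
    """Joint entropy (in nats) of the named variables."""
    idx = [NAMES.index(n) for n in names]
    marg = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in idx)
        marg[sub] = marg.get(sub, 0.0) + p
    return -sum(p * log(p) for p in marg.values())


def cond_mi(a, b, c):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    hc = entropy(c) if c else 0.0
    return entropy(a + c) + entropy(b + c) - entropy(a + b + c) - hc


i_u = cond_mi(("u1",), ("y1",), ("z1", "z2"))   # I(U1; Y1 | Z1 Z2)
i_x = cond_mi(("x1",), ("y1",), ("z1", "z2"))   # I(X1; Y1 | Z1 Z2)
```

Under the XOR assumption, $U_1$ can be recovered from $(Z_1, X_1)$, so `i_u` and `i_x` coincide up to floating-point error, which is the equality case described after (2.63).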
Although Figure 2.4 shows that $\mathcal{R}(P_{Z_1 Z_2 X_1 X_2})$ strictly outperforms $\mathcal{R}_{HK}(P_{Z_1 Z_2 U_1 U_2 X_1 X_2})$, this does not imply that the full region $\mathcal{R}$ strictly outperforms $\mathcal{R}_{HK}$. Lemma 2.2 only proves that $\mathcal{R}_{HK} \subseteq \mathcal{R}$, but we have not shown strict improvement of $\mathcal{R}$ over $\mathcal{R}_{HK}$. In this respect, it was recently shown in [Cho06b] that $\mathcal{R}$ and $\mathcal{R}_{HK}$ are equal. That is, after performing the union over $\mathcal{P}$ and $\mathcal{P}'$, respectively, both regions coincide.
2.4 Conclusions
In scenarios where the presence of multiple neighboring sender-receiver pairs originates multiuser interference, the impact on the achievable rates is significant. One useful tool to mitigate the performance losses when the codebooks of the interfering users are known is partial interference cancelation. In this chapter we have studied when and how to perform it, with special emphasis on the BC and the IC.
In the BC, the best known achievable rate region is obtained by time-sharing two situations in which only one of the receivers partially mitigates the interference that arises from the transmission of information to another user. If the coding strategy is generalized so as to comprise more general schemes in which both receivers perform partial interference cancelation simultaneously, the achievable rates remain unchanged. In the interference channel, in contrast, this result does not hold. Intuitively, when all the interference is under the control of the same source, correlated coding (as a way of precoding) can relieve the users from performing partial interference cancelation simultaneously (this is the case in the BC). On the other hand, if the interference is generated by non-cooperative sources that are physically dispersed, the control of each information source over the resulting aggregate signal at the intended receiver is rather limited. This is the case in the IC, where partial interference cancelation significantly improves overall performance.
Motivated by the way interference cancelation is carried out in Marton's decoding strategy for the BC, we also worked towards improving achievability results in the IC through the use of superposition coding at the sources and aided decoding at the receivers. By aided decoding we mean that the decoding at each receiver is aided by the inclusion of a low-resolution message component of the interfering user in the typical sets. This, in contrast to other approaches, does not require reliably decoding part of the message of the interfering user. As a result, some rate constraints are relaxed. However, this only leads to an improvement in simplicity with respect to the best known achievability results, since i) the new achievable region is described using fewer rate inequalities, and ii) it requires fewer auxiliary random variables. Although rate gains are precluded, the smaller number of rate inequalities of the proposed region may result in improved error exponents in the finite blocklength regime. This line falls outside the scope of this dissertation and has not been explored.
2.A Appendix: Proof of Theorem 2.2
Let $m_1 \in [1, 2^{nR_1}]$, $m_2 \in [1, 2^{nR_2}]$ denote the messages intended for receivers 1 and 2, respectively. The messages are encoded into the codeword $X^n(m_1, m_2)$ through a two-stage coding process. To start with, we define the following isomorphisms:
\begin{align}
\varphi_k &: [1, 2^{nR_k}] \to [1, 2^{nR_{W_k}}] \times [1, 2^{nR_{U_k}}] \tag{2.68} \\
m_k &\mapsto (i_k, j_k) = \left(\left\lfloor (m_k - 1) 2^{-nR_{U_k}} \right\rfloor + 1,\ m_k - (i_k - 1) 2^{nR_{U_k}}\right) \tag{2.69}
\end{align}
and their inverses
\begin{align}
\varphi_k^{-1} &: [1, 2^{nR_{W_k}}] \times [1, 2^{nR_{U_k}}] \to [1, 2^{nR_k}] \tag{2.70} \\
(i_k, j_k) &\mapsto m_k = (i_k - 1) 2^{nR_{U_k}} + j_k, \tag{2.71}
\end{align}
where $R_k = R_{W_k} + R_{U_k}$ and $k = 1, 2$. Note that, since $\varphi_k$ is an isomorphism, transmitting the message pair $(m_1, m_2)$ is equivalent to transmitting the message tuple $(i_1, j_1, i_2, j_2) = (\varphi_1(m_1), \varphi_2(m_2))$. The idea of the coding scheme is the following:
• $(i_k, j_k)$ are integers to be communicated to the $k$-th receiver.
• The integers $(i_1, i_2)$ are encoded simultaneously using the random binning technique with the random variables $W_1$ and $W_2$.
• For a given value of $(i_1, i_2)$ the integers $(j_1, j_2)$ are encoded simultaneously using the random binning technique with the random variables $U_1$ and $U_2$ conditioned on $(W_1, W_2)$.
• Receiver 1 attempts to reliably decode $(i_1, j_1)$ aided by $W_2$, i.e., it tries to mitigate the interference that arises from the transmission of $m_2$ by taking into account the random variable that encodes $i_2$ (without eventually decoding it). Receiver 2 operates similarly.
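The message-splitting map of (2.68) to (2.71) can be sketched in a few lines. This is an illustrative sketch only: the parameter `n_u` stands in for $2^{nR_{U_k}}$ and is assumed to be an integer, and the function names are hypothetical.

```python
def split_message(m, n_u):
    """Map m in [1, n_w * n_u] to (i, j) as in (2.69); n_u plays the role of 2^{n R_U}."""
    i = (m - 1) // n_u + 1          # floor((m - 1) / n_u) + 1
    j = m - (i - 1) * n_u
    return i, j


def merge_indices(i, j, n_u):
    """Inverse map as in (2.71): (i, j) -> m."""
    return (i - 1) * n_u + j


# Toy sizes standing in for 2^{n R_W} and 2^{n R_U} (illustrative values).
n_w, n_u = 4, 8
pairs = [split_message(m, n_u) for m in range(1, n_w * n_u + 1)]
```

Every message index lands on a distinct pair $(i, j)$ with $1 \leq i \leq$ `n_w` and $1 \leq j \leq$ `n_u`, and `merge_indices` recovers the original index, which is what makes the two descriptions of the message equivalent.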
Codebook generation. Generate $2^{n\tilde{R}_{W_1}}$ $w_1^n$ sequences of length $n$, with each component drawn i.i.d. $\sim P_{W_1}$. Similarly, generate $2^{n\tilde{R}_{W_2}}$ $w_2^n$ sequences of length $n$, with each component drawn i.i.d. $\sim P_{W_2}$. Distribute the $2^{n\tilde{R}_{W_1}}$ $w_1^n$ sequences uniformly into $2^{nR_{W_1}}$ bins, each one containing $2^{n(\tilde{R}_{W_1} - R_{W_1})}$ sequences on average ($\tilde{R}_{W_1} \geq R_{W_1}$). Proceed similarly with the $w_2^n$ sequences and throw them into $2^{nR_{W_2}}$ bins ($\tilde{R}_{W_2} \geq R_{W_2}$). Let the integers $i_1, i_2$ index the $W_1$ and $W_2$ bins. Let a $W_1 \times W_2$ product bin be the product space of a $W_1$ bin and a $W_2$ bin. The $2^{n(R_{W_1} + R_{W_2})}$ $W_1 \times W_2$ product bins remain conceptually in the first layer.
For each $W_1 \times W_2$ product bin, select arbitrarily a jointly typical pair $(w_1^n, w_2^n)$ and label it as a product bin head. The head of the $(i_1, i_2)$-th $W_1 \times W_2$ product bin is denoted by $(w_1^n(i_1, i_2), w_2^n(i_1, i_2))$. For each product bin, discard all the pairs of sequences that are not product bin heads.
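The uniform distribution of sequences into bins described above can be sketched as follows. This is an illustrative sketch only: the values of `n`, `r_tilde`, and `r` are assumptions chosen so that the powers of two are integers, sequences are represented by their indices rather than by actual i.i.d. draws, and the function name is hypothetical.

```python
import random


def random_binning(num_sequences, num_bins, seed=0):
    """Throw sequence indices 0..num_sequences-1 uniformly at random into bins."""
    rng = random.Random(seed)
    bins = [[] for _ in range(num_bins)]
    for s in range(num_sequences):
        bins[rng.randrange(num_bins)].append(s)
    return bins


# Illustrative values, not taken from the text.
n, r_tilde, r = 10, 0.8, 0.5
num_sequences = round(2 ** (n * r_tilde))   # 2^{n R~} generated sequences
num_bins = round(2 ** (n * r))              # 2^{n R} bins
bins = random_binning(num_sequences, num_bins)
average_size = num_sequences / num_bins     # 2^{n (R~ - R)} sequences per bin on average
```

Individual bins fluctuate around `average_size`, which is why the text speaks of the bin occupancy "on average".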
For each $W_1 \times W_2$ product bin, generate $2^{n\tilde{R}_{U_1}}$ $u_1^n$ sequences of length $n$, with each component drawn i.i.d. $\sim P_{U_1|W_1 W_2}$ (conditioned by the $W_1 \times W_2$ product bin head). Similarly, generate $2^{n\tilde{R}_{U_2}}$ $u_2^n$ sequences of length $n$, with each component drawn i.i.d. $\sim P_{U_2|W_1 W_2}$ (conditioned by the $W_1 \times W_2$ product bin head). Distribute the $2^{n\tilde{R}_{U_1}}$ $u_1^n$ sequences uniformly into $2^{nR_{U_1}}$ bins, each one containing $2^{n(\tilde{R}_{U_1} - R_{U_1})}$ sequences on average ($\tilde{R}_{U_1} \geq R_{U_1}$). Proceed similarly with the $u_2^n$ sequences and throw them into $2^{nR_{U_2}}$ bins ($\tilde{R}_{U_2} \geq R_{U_2}$). Let the integers $j_1, j_2$ index the $U_1$ and $U_2$ bins. Let a $U_1 \times U_2$ product bin be the product space of a $U_1$ bin and a $U_2$ bin. For a given $W_1 \times W_2$ product bin, the $2^{n(R_{U_1} + R_{U_2})}$ $U_1 \times U_2$ product bins remain conceptually in a second layer.

[Figure 2.5: Diagram of the coding scheme: two layered random binning. The indices $(i_1, i_2)$ select a $W_1 \times W_2$ product bin, the indices $(j_1, j_2)$ select a $U_1 \times U_2$ product bin, and the output is $X^n(i_1, j_1, i_2, j_2) \equiv X^n(m_1, m_2)$.]

For each $W_1 \times W_2$ product bin, select arbitrarily in all the associated $U_1 \times U_2$ product bins a jointly typical pair $(u_1^n, u_2^n)$ and label it as a product bin head. The product bin head of the $(j_1, j_2)$-th $U_1 \times U_2$ product bin derived from the $(i_1, i_2)$-th $W_1 \times W_2$ product bin is denoted by $(u_1^n(i_1, i_2, j_1, j_2), u_2^n(i_1, i_2, j_1, j_2))$. For each $U_1 \times U_2$ product bin, discard all the pairs of sequences that are not product bin heads. The surviving $u_k^n$ sequences are arbitrarily indexed as $u_k^n(i_1, i_2, l_k)$, where $l_k \leq 2^{n \min\{\tilde{R}_{U_k}, R_{U_1} + R_{U_2}\}} \leq 2^{n\tilde{R}_{U_k}}$, $k = 1, 2$. Denote by $\mathcal{B}_k(i_1, i_2, j_k)$ the set of $u_k^n(i_1, i_2, l_k)$ sequences belonging to the $j_k$-th $U_k$ bin, whose cardinality is coarsely bounded by $|\mathcal{B}_k(i_1, i_2, j_k)| \leq 2^{n\tilde{R}_{U_k}}$. In addition, $\sum_{j_k = 1}^{2^{nR_{U_k}}} |\mathcal{B}_k(i_1, i_2, j_k)| \leq 2^{n \min\{\tilde{R}_{U_k}, R_{U_1} + R_{U_2}\}} \leq 2^{n\tilde{R}_{U_k}}$ also.
Encoding. If the source wants to transmit the message pair $(m_1, m_2)$, it first computes $(i_1, j_1) = \varphi_1(m_1)$ and $(i_2, j_2) = \varphi_2(m_2)$. Then, the $(i_1, i_2)$-th product bin head of the $W_1 \times W_2$ layer and the $(j_1, j_2)$-th product bin head of the $U_1 \times U_2$ layer derived from $(w_1^n(i_1, i_2), w_2^n(i_1, i_2))$ are selected. An encoding error is declared if no product bin head could be defined in any of the layers. Finally, a codeword $x^n$ of length $n$ is drawn with each component i.i.d. $\sim P_{X|W_1 W_2 U_1 U_2}$ (conditioned on $(w_1^n(i_1, i_2), w_2^n(i_1, i_2), u_1^n(i_1, i_2, j_1, j_2), u_2^n(i_1, i_2, j_1, j_2))$) and sent through the channel. This process is shown in Figure 2.5.
Decoding. The decoding is symmetrical at both receivers.
• Receiver 1 computes the set $A_{\epsilon,1}^{(n)}$ of $\epsilon$-typical $(w_1^n(i_1, i_2), w_2^n(i_1, i_2), u_1^n(i_1, i_2, l_1), y_1^n)$ sequences (for all $l_1$) and chooses $(\hat{i}_1, \hat{j}_1) \in [1, 2^{nR_{W_1}}] \times [1, 2^{nR_{U_1}}]$ as the unique pair of integers such that $(w_1^n(\hat{i}_1, i_2), w_2^n(\hat{i}_1, i_2), u_1^n(\hat{i}_1, i_2, l_1), y_1^n) \in A_{\epsilon,1}^{(n)}$ for some $i_2 \in [1, 2^{nR_{W_2}}]$ and some $l_1$ such that $u_1^n(\hat{i}_1, i_2, l_1) \in \mathcal{B}_1(\hat{i}_1, i_2, \hat{j}_1)$. If none or more than one such $(\hat{i}_1, \hat{j}_1)$ pair can be found, a decoding error is declared.
• Assuming that an estimate of both integers $(\hat{i}_1, \hat{j}_1)$ is available and a decoding error has not been declared, the estimate of the message $m_1$ is obtained as $\hat{m}_1 = \varphi_1^{-1}(\hat{i}_1, \hat{j}_1) = (\hat{i}_1 - 1) 2^{nR_{U_1}} + \hat{j}_1$.
• Receiver 2 operates similarly by interchanging the roles of $w_1^n$ and $w_2^n$, replacing $u_1^n$ by $u_2^n$, and using $\varphi_2^{-1}$.
Analysis of the probability of error. An error event will occur if any of the receivers declares a decoding error, or if the transmitter declares an encoding error in any of the layers. Without loss of generality, it can be assumed that the integers $(m_1, m_2) = (1, 1)$ were sent. Equivalently, assume that $(i_1, i_2, j_1, j_2) = (1, 1, 1, 1)$ were sent (since $\varphi_k(1) = (1, 1)$, $k = 1, 2$). Hence, by the union bound,
$$P_e^{(n)} \leq P_{enc,1}^{(n)} + P_{enc,2}^{(n)} + P_{dec,1}^{(n)} + P_{dec,2}^{(n)}, \tag{2.72}$$
where:
• $P_{enc,1}^{(n)}$ is the probability of not finding a suitable product bin head in the $(1, 1)$-th $W_1 \times W_2$ product bin.
• $P_{enc,2}^{(n)}$ is the probability of not finding a suitable product bin head in the $(1, 1)$-th $U_1 \times U_2$ product bin belonging to the layer derived from the $(1, 1)$-th $W_1 \times W_2$ product bin.
• $P_{dec,k}^{(n)}$ is the probability of decoding error of the $k$-th receiver, that is,
$$P_{dec,k}^{(n)} = P\{(\hat{i}_k, \hat{j}_k) \neq (1, 1) \,|\, (1, 1, 1, 1)\ \text{sent}\}. \tag{2.73}$$
The conditions on the rates that allow an arbitrarily low probability of encoding error at both layers ($P_{enc,i}^{(n)} \to 0$, $i = 1, 2$) can be derived and extrapolated from the work of Marton [Mar79] and the later alternative proof by El Gamal and Van der Meulen [Gam81],
\begin{align}
R_{W_1} &\leq \tilde{R}_{W_1}; & R_{U_1} &\leq \tilde{R}_{U_1} \tag{2.74} \\
R_{W_2} &\leq \tilde{R}_{W_2}; & R_{U_2} &\leq \tilde{R}_{U_2} \tag{2.75} \\
R_{W_1} + R_{W_2} &\leq \tilde{R}_{W_1} + \tilde{R}_{W_2} - I(W_1; W_2); & R_{U_1} + R_{U_2} &\leq \tilde{R}_{U_1} + \tilde{R}_{U_2} - I(U_1; U_2 | W_1 W_2). \tag{2.76}
\end{align}
Now consider the term $P_{dec,1}^{(n)}$. Let us define the following events
$$E(i_1, i_2, j_1) = \{(w_1^n(i_1, i_2), w_2^n(i_1, i_2), u_1^n, y_1^n) \in A_{\epsilon,1}^{(n)}\ \text{for some}\ u_1^n \in \mathcal{B}_1(i_1, i_2, j_1) \,|\, (1, 1, 1, 1)\ \text{sent}\}, \tag{2.77}$$
which allow us to bound the probability of decoding error as
$$P_{dec,1}^{(n)} \leq P\{E_0^c\} + \sum_{j=1}^{6} P\{E_j\}. \tag{2.78}$$
The definition of the error events $E_j$ and their associated rate constraints are the following:

E0) $E_0 \triangleq E(1, 1, 1)$. By the Asymptotic Equipartition Property (AEP) [Cov06, Ch. 3], $P\{E_0^c\} \leq \epsilon$.

E1) $E_1 \triangleq \bigcup_{i_1 \neq 1} E(i_1, 1, 1)$.
\begin{align}
P\{E_1\} &= P\{E(i_1, 1, 1)\ \text{for some}\ i_1 \neq 1\} \overset{(a)}{\leq} 2^{nR_{W_1}} P\{E(2, 1, 1)\} \tag{2.79} \\
&\overset{(b)}{\leq} 2^{nR_{W_1}} |\mathcal{B}_1(2, 1, 1)|\, P\{(w_1^n(2, 1), w_2^n(2, 1), u_1^n(2, 1, l_1), y_1^n) \in A_{\epsilon,1}^{(n)} \,|\, (1, 1, 1, 1)\ \text{sent}\} \tag{2.80} \\
&\overset{(c)}{\leq} 2^{n(R_{W_1} + \tilde{R}_{U_1})} P\{(w_1^n(2, 1), w_2^n(2, 1), u_1^n(2, 1, l_1), y_1^n) \in A_{\epsilon,1}^{(n)} \,|\, (1, 1, 1, 1)\ \text{sent}\} \tag{2.81} \\
&\overset{(d)}{=} 2^{n(R_{W_1} + \tilde{R}_{U_1})} \sum_{(w_1^n, w_2^n, u_1^n, y_1^n) \in A_{\epsilon,1}^{(n)}} P_{W_1 W_2}(w_1^n, w_2^n) P_{U_1|W_1 W_2}(u_1^n | w_1^n, w_2^n) P_{Y_1|W_2}(y_1^n | w_2^n) \tag{2.82} \\
&\leq 2^{n(R_{W_1} + \tilde{R}_{U_1})}\, 2^{n(H(W_1, W_2, U_1, Y_1) + \epsilon)}\, 2^{-n(H(W_1 W_2) - \epsilon)}\, 2^{-n(H(U_1 | W_1 W_2) - 2\epsilon)}\, 2^{-n(H(Y_1 | W_2) - 2\epsilon)} \tag{2.83} \\
&= 2^{n(R_{W_1} + \tilde{R}_{U_1})}\, 2^{-n(H(Y_1 | W_2) - H(Y_1 | W_1 W_2 U_1) - 6\epsilon)} = 2^{n(R_{W_1} + \tilde{R}_{U_1})}\, 2^{-n(I(W_1 U_1; Y_1 | W_2) - 6\epsilon)} \tag{2.84}
\end{align}
The inequalities (a), (b), and (c) follow from the union bound and the symmetry of the code. In (c), it has been used that $|\mathcal{B}_1(2, 1, 1)| \leq 2^{n\tilde{R}_{U_1}}$ (in the worst case scenario, all the $u_1^n$ sequences are concentrated in the same $U_1$ bin). The equality (d) takes into account that when $(1, 1, 1, 1)$ is sent, $y_1^n$ does not depend on $w_1^n(2, 1)$, but may depend on $w_2^n(2, 1)$ if the same $w_2^n$ sequence happens to belong to the product bin head of the $(1, 1)$ and the $(2, 1)$ $W_1 \times W_2$ product bins. In addition, it has been considered that $l_1$ is such that $u_1^n(i_1, i_2, l_1) \in \mathcal{B}_1(2, 1, 1)$. Since $\epsilon$ can be made arbitrarily small, the probability of the error event $E_1$ is driven to zero with $n$ if
$$R_{W_1} + \tilde{R}_{U_1} \leq I(W_1 U_1; Y_1 | W_2). \tag{2.85}$$
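The step from (2.83) to (2.84) only collects exponents. As a worked check, using nothing beyond the chain rule for entropy, the entropy terms combine as

```latex
\begin{align*}
& H(W_1 W_2 U_1 Y_1) - H(W_1 W_2) - H(U_1 | W_1 W_2) - H(Y_1 | W_2) \\
&\quad = H(W_1 W_2 U_1) + H(Y_1 | W_1 W_2 U_1) - H(W_1 W_2 U_1) - H(Y_1 | W_2) \\
&\quad = -\big( H(Y_1 | W_2) - H(Y_1 | W_1 W_2 U_1) \big)
       = -I(W_1 U_1; Y_1 | W_2),
\end{align*}
```

while the slack terms add up to $\epsilon + \epsilon + 2\epsilon + 2\epsilon = 6\epsilon$, which gives the exponent $-n(I(W_1 U_1; Y_1 | W_2) - 6\epsilon)$.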
A similar analysis is performed for the rest of the error events. Note that, in the remaining cases, the mutual information constraining the appropriate sum of rates follows after subtracting $H(Y_1 | W_1 W_2 U_1)$ from the entropy of $Y_1$ conditioned on an appropriate subset of random variables that follows from the dependence tree in each error event. We will use this fact to simplify the derivations of the remaining rate constraints.
E2) $E_2 \triangleq \bigcup_{i_1, i_2 \neq 1} E(i_1, i_2, 1)$. Similarly and up to some epsilons,
$$R_{W_1} + R_{W_2} + \tilde{R}_{U_1} \leq H(Y_1) - H(Y_1 | W_1 W_2 U_1) = I(W_1 W_2 U_1; Y_1). \tag{2.86}$$
Note that in this situation $y_1^n$ is independent of any of the other signals.

E3) $E_3 \triangleq \bigcup_{i_1, j_1 \neq 1} E(i_1, 1, j_1)$. As in E1), the channel output $y_1^n$ may depend on $w_2^n(i_1, 1)$. Hence,
$$R_{W_1} + \tilde{R}_{U_1} \leq H(Y_1 | W_2) - H(Y_1 | W_1 W_2 U_1) = I(W_1 U_1; Y_1 | W_2). \tag{2.87}$$

E4) $E_4 \triangleq \bigcup_{j_1 \neq 1} E(1, 1, j_1)$. Here, $y_1^n$ shows dependence on $w_1^n(1, 1)$ and $w_2^n(1, 1)$. Therefore,
$$\tilde{R}_{U_1} \leq H(Y_1 | W_1 W_2) - H(Y_1 | W_1 W_2 U_1) = I(U_1; Y_1 | W_1 W_2). \tag{2.88}$$

E5) $E_5 \triangleq \bigcup_{i_2, j_1 \neq 1} E(1, i_2, j_1)$. In contrast to E1), $y_1^n$ does not depend on $w_2^n(1, 2)$, but may depend on $w_1^n(1, 2)$ if the same $w_1^n$ sequence happens to belong to the product bin head of the $(1, 1)$ and the $(1, i_2)$ $W_1 \times W_2$ product bins.
$$R_{W_2} + \tilde{R}_{U_1} \leq H(Y_1 | W_1) - H(Y_1 | W_1 W_2 U_1) = I(W_2 U_1; Y_1 | W_1). \tag{2.89}$$

E6) $E_6 \triangleq \bigcup_{i_1, i_2, j_1 \neq 1} E(i_1, i_2, j_1)$. When there is a mismatch in all the variables, $y_1^n$ is independent of $w_1^n$, $w_2^n$, and $u_1^n$, hence yielding
$$R_{W_1} + R_{W_2} + \tilde{R}_{U_1} \leq H(Y_1) - H(Y_1 | W_1 W_2 U_1) = I(W_1 W_2 U_1; Y_1). \tag{2.90}$$
Notice that E1) and E2) are redundant. If we gather the useful rate constraints for receiver 1 together with the extrapolated rate constraints for receiver 2, we obtain the following system of constraints:
\begin{align}
\tilde{R}_{U_1} &\leq I(U_1; Y_1 | W_1 W_2); & \tilde{R}_{U_2} &\leq I(U_2; Y_2 | W_1 W_2) \tag{2.91} \\
R_{W_1} + \tilde{R}_{U_1} &\leq I(W_1 U_1; Y_1 | W_2); & R_{W_2} + \tilde{R}_{U_2} &\leq I(W_2 U_2; Y_2 | W_1) \tag{2.92} \\
R_{W_2} + \tilde{R}_{U_1} &\leq I(W_2 U_1; Y_1 | W_1); & R_{W_1} + \tilde{R}_{U_2} &\leq I(W_1 U_2; Y_2 | W_2) \tag{2.93} \\
R_{W_1} + R_{W_2} + \tilde{R}_{U_1} &\leq I(W_1 W_2 U_1; Y_1); & R_{W_1} + R_{W_2} + \tilde{R}_{U_2} &\leq I(W_1 W_2 U_2; Y_2), \tag{2.94}
\end{align}
which ensure vanishing probabilities of decoding error at both receivers. Taking into account the rate constraints that arose from the necessity of driving to zero the probability of an encoding error at the source
\begin{align}
R_{W_1} &\leq \tilde{R}_{W_1}; & R_{U_1} &\leq \tilde{R}_{U_1} \tag{2.95} \\
R_{W_2} &\leq \tilde{R}_{W_2}; & R_{U_2} &\leq \tilde{R}_{U_2} \tag{2.96} \\
R_{W_1} + R_{W_2} &\leq \tilde{R}_{W_1} + \tilde{R}_{W_2} - I(W_1; W_2); & R_{U_1} + R_{U_2} &\leq \tilde{R}_{U_1} + \tilde{R}_{U_2} - I(U_1; U_2 | W_1 W_2), \tag{2.97}
\end{align}
it can be noticed that the decoding constraints put no constraint on $\tilde{R}_{W_1}$, $\tilde{R}_{W_2}$. Hence, they can be increased up to their maximum value ($\tilde{R}_{W_k} \leq H(W_k)$, $k = 1, 2$, due to the necessity of finding enough typical sequences for the codebook construction). This results in the achievability of $\mathcal{R}(P_{W_1 W_2 U_1 U_2 X})$. By time-sharing arguments, the convex hull of the union over all the possible probability distributions in $\mathcal{P}'$, the region $\mathcal{R}$, is achievable.
2.B Appendix: Proof of Theorem 2.3
We shall prove that $\mathcal{R}_{MT} \subseteq \mathcal{R}$ and $\mathcal{R} \subseteq \mathcal{R}_{MT}$ hold.
2.B.1 Proof of $\mathcal{R}_{MT} \subseteq \mathcal{R}$
Consider the expression of $\mathcal{R}(P_{W_1W_2U_1U_2X})$ and set $W_1 = W$ and $W_2 = \emptyset$ (constant). After the following steps:
• set $R_{W_2} = 0$ ($W_2 = \emptyset$),
• eliminate the variables $\tilde{R}_{U_1}$ and $\tilde{R}_{U_2}$ using the Fourier-Motzkin elimination technique [Dan97],
• set the change of variables: $R_{U_1} = R_1 - R_{W_1}$, $R_{U_2} = R_2$,
• eliminate $R_{W_1}$ using Fourier-Motzkin elimination,
the achievable rate region $\mathcal{R}(P_{WU_1U_2X})$ is suitably projected into the $(R_1, R_2)$ plane. Its expression becomes
$$R_1 \le I(WU_1;Y_1) \tag{2.98}$$
$$R_2 \le I(U_2;Y_2|W) \tag{2.99}$$
$$R_1 + R_2 \le \min\{I(W;Y_1), I(W;Y_2)\} + I(U_1;Y_1|W) + I(U_2;Y_2|W) - I(U_1;U_2|W), \tag{2.100}$$
which depends only on $P_{WU_1U_2X} \in \mathcal{P}$. Similarly, when $W_1 = \emptyset$ and $W_2 = W$ we obtain that the projection of $\mathcal{R}(P_{WU_1U_2X})$ into the $(R_1, R_2)$ plane has the expression
$$R_1 \le I(U_1;Y_1|W) \tag{2.101}$$
$$R_2 \le I(WU_2;Y_2) \tag{2.102}$$
$$R_1 + R_2 \le \min\{I(W;Y_1), I(W;Y_2)\} + I(U_1;Y_1|W) + I(U_2;Y_2|W) - I(U_1;U_2|W), \tag{2.103}$$
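which again depends only on $P_{WU_1U_2X}$. Finally, note that
\begin{align}
\mathcal{R}_{MT} &= \bigcup_{P_{WU_1U_2X} \in \mathcal{P}} \mathcal{R}_{MT}(P_{WU_1U_2X}) \overset{(a)}{=} \mathrm{Co}\Bigg( \bigcup_{P_{WU_1U_2X} \in \mathcal{P}} \mathcal{R}_{MT}(P_{WU_1U_2X}) \Bigg) \tag{2.104} \\
&\overset{(b)}{=} \mathrm{Co}\Bigg( \bigcup_{P_{WU_1U_2X} \in \mathcal{P}} \mathrm{Co}\Big( \mathcal{R}(P_{WU_1U_2X})\big|_{W_2 = \emptyset} \cup \mathcal{R}(P_{WU_1U_2X})\big|_{W_1 = \emptyset} \Big) \Bigg) \tag{2.105} \\
&= \mathrm{Co}\Bigg( \bigcup_{\substack{P_{W_1W_2U_1U_2X} \in \mathcal{P}' \\ W_1 = \emptyset \text{ or } W_2 = \emptyset}} \mathcal{R}(P_{W_1W_2U_1U_2X}) \Bigg) \tag{2.106} \\
&\subseteq \mathrm{Co}\Bigg( \bigcup_{P_{W_1W_2U_1U_2X} \in \mathcal{P}'} \mathcal{R}(P_{W_1W_2U_1U_2X}) \Bigg) = \mathcal{R}, \tag{2.107}
\end{align}
where the equality (a) follows from the convexity of $\mathcal{R}_{MT}$, and (b) follows from noticing that the two maximal points of $\mathcal{R}_{MT}(P_{WU_1U_2X})$ are contained within $\mathcal{R}(P_{WU_1U_2X})|_{W_2 = \emptyset} \cup \mathcal{R}(P_{WU_1U_2X})|_{W_1 = \emptyset}$, i.e., within the union of the two projections (2.98)-(2.100) and (2.101)-(2.103).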
2.B.2 Proof of $\mathcal{R} \subseteq \mathcal{R}_{MT}$
Consider the expression of $\mathcal{R}(P_{W_1W_2U_1U_2X})$ in Theorem 2.2. After the steps
• eliminate the variables $\tilde{R}_{U_1}$ and $\tilde{R}_{U_2}$ using the Fourier-Motzkin elimination technique,
• set the change of variables: $R_{U_1} = R_1 - R_{W_1}$, $R_{U_2} = R_2 - R_{W_2}$,
the projection of $\mathcal{R}(P_{W_1W_2U_1U_2X})$ into the $(R_1, R_2, R_{W_1}, R_{W_2})$ hyperplane is obtained. Among the rate inequalities that define the region in that hyperplane, we reproduce here the following subset:
$$R_1 \le I(W_1U_1;Y_1|W_2) \tag{2.108}$$
$$R_2 \le I(W_2U_2;Y_2|W_1) \tag{2.109}$$
$$R_1 + R_2 \le I(W_1W_2U_1;Y_1) + I(U_2;Y_2|W_1W_2) - I(U_1;U_2|W_1W_2) \tag{2.110}$$
$$R_1 + R_2 \le I(W_2U_1;Y_1|W_1) + I(W_1U_2;Y_2|W_2) - I(U_1;U_2|W_1W_2) \tag{2.111}$$
$$R_1 + R_2 \le I(W_1U_1;Y_1|W_2) + I(W_2U_2;Y_2|W_1) - I(U_1;U_2|W_1W_2) \tag{2.112}$$
$$R_1 + R_2 \le I(U_1;Y_1|W_1W_2) + I(W_1W_2U_2;Y_2) - I(U_1;U_2|W_1W_2), \tag{2.113}$$
which defines a subset of $\mathcal{R}_{MT}(P_{W_1W_2U_1U_2X})$ (Lemma 2.1). Therefore, there is no need to consider the rest of the rate inequalities of $\mathcal{R}(P_{W_1W_2U_1U_2X})$ to show that
$$\mathcal{R}(P_{W_1W_2U_1U_2X}) \subseteq \mathcal{R}_{MT}(P_{W_1W_2U_1U_2X}), \tag{2.114}$$
and, consequently,
$$\mathcal{R} = \mathrm{Co}\Bigg( \bigcup_{P_{W_1W_2U_1U_2X} \in \mathcal{P}'} \mathcal{R}(P_{W_1W_2U_1U_2X}) \Bigg) \subseteq \mathrm{Co}\Bigg( \bigcup_{P_{W_1W_2U_1U_2X} \in \mathcal{P}'} \mathcal{R}_{MT}(P_{W_1W_2U_1U_2X}) \Bigg) \overset{(a)}{=} \mathcal{R}_{MT}, \tag{2.115}$$
where equality (a) follows from the convexity of $\mathcal{R}_{MT}$, which makes the convex hull unnecessary.
2.C Appendix: Proof of Theorem 2.4
It is sufficient to show the achievability of (2.42)-(2.45) for the modified interference channel. Suppose we have a coding scheme like the one in Figure 2.6.
[Figure 2.6: Coding scheme for the interference channel. Sender $k$ maps $M_{k0}$ to $Z_k^n(M_{k0})$ and $(M_{k0}, M_{kk})$ to $X_k^n(M_{k0}, M_{kk})$; the channels $P_{Y_1|X_1X_2}(y_1|x_1,x_2)$ and $P_{Y_2|X_1X_2}(y_2|x_1,x_2)$ deliver $Y_1^n$ and $Y_2^n$, from which receiver $k$ outputs $(\hat{M}_{k0}, \hat{M}_{kk})$, $k = 1, 2$. The role of the time-sharing random variable Q has been omitted.]
Codebook generation. Generate a random sequence $Q^n$ of length $n$ with each element i.i.d. $\sim P_Q$. For each sequence $q^n$, sender 1 generates $2^{nR_{10}}$ independent codewords of length $n$ indexed as $Z_1^n(M_{10}|q^n)$ ($M_{10} \in [1, 2^{nR_{10}}]$) with each element i.i.d. $\sim P_{Z_1|Q}$. For each generated $(q^n, Z_1^n(M_{10}|q^n))$ pair, sender 1 generates $2^{nR_{11}}$ independent codewords of length $n$ indexed as $X_1^n(M_{10}, M_{11}|q^n)$ ($M_{11} \in [1, 2^{nR_{11}}]$) with each element i.i.d. $\sim P_{X_1|Z_1Q}$. The rate of sender 1 when projected into the original interference channel is $R_1 = R_{10} + R_{11}$. Sender 2 proceeds similarly using the same sequence $Q^n$ (its value is revealed to both senders).
Encoding. To send $(m_{k0}, m_{kk}) \in [1, 2^{nR_{k0}}] \times [1, 2^{nR_{kk}}]$, sender $k$ transmits $X_k^n(m_{k0}, m_{kk}|q^n)$, $k = 1, 2$.
Decoding. Receiver 1 looks for the unique pair of integers $(\hat{m}_{10}, \hat{m}_{11}) \in [1, 2^{nR_{10}}] \times [1, 2^{nR_{11}}]$ satisfying
$$(q^n, z_1^n(\hat{m}_{10}|q^n), x_1^n(\hat{m}_{10}, \hat{m}_{11}|q^n), z_2^n(m_{20}|q^n), y_1^n) \in A_\epsilon^{(n)}, \tag{2.116}$$
for some $m_{20} \in [1, 2^{nR_{20}}]$ (one or many of them), where $A_\epsilon^{(n)}$ is the set of $\epsilon$-typical sequences. Note that the value of $q^n$ used in the construction of the codewords is also told to both receivers. If no such pair exists, or more than one does, a decoding error is declared. Note that in order not to have ambiguity in its decision, receiver 1 requires a unique pair $(\hat{m}_{10}, \hat{m}_{11})$ to trigger the jointly typical set with some $z_2^n$ sequence. It might have ambiguity about $m_{20}$ if such $z_2^n$ was not unique. However, this is not a problem for receiver 1 since it is only interested in decoding the information-bearing random variables of sender 1 (an ambiguity on $m_{20}$ does not affect $P_1^{(n)}(e)$). Receiver 2 proceeds similarly.
Analysis of the probability of error. Let us focus on the probability of error at receiver 1. Thanks to the symmetry of the random code, we can express the average probability of error (2.41) at receiver 1 as
$$P_1^{(n)}(e) = P\left\{ (\hat{m}_{10}, \hat{m}_{11}) \neq (1, 1) \,\middle|\, (1, 1, 1, 1) \text{ sent} \right\}, \tag{2.117}$$
where $(1, 1, 1, 1)$ indicates the value of the tuple $(m_{10}, m_{11}, m_{20}, m_{22})$. Next, by defining the events
$$E(m_{10}, m_{11}, m_{20}) \triangleq \left\{ (q^n, z_1^n(m_{10}|q^n), x_1^n(m_{10}, m_{11}|q^n), z_2^n(m_{20}|q^n), y_1^n) \in A_\epsilon^{(n)} \,\middle|\, (1, 1, 1, 1) \text{ sent} \right\} \tag{2.118}$$
we can set a complete list of the different situations in which a decoding error is declared at receiver 1 along with their associated rate constraints in Table 2.1.
Case   Error event at receiver 1                                                                   Associated rate constraint
----   -------------------------------------------------------------------------------------   -----------------------------------------------
E1     $E^c(1, 1, 1)$                                                                              -
E2     $E(m_{10}, 1, 1) \mid E(1, 1, 1)$, $m_{10} \neq 1$                                          $R_{10} < I(X_1;Y_1|Z_2Q)$
E3     $E(1, m_{11}, 1) \mid E(1, 1, 1)$, $m_{11} \neq 1$                                          $R_{11} < I(X_1;Y_1|Z_1Z_2Q)$
E4     $E(m_{10}, m_{11}, 1) \mid E(1, 1, 1)$, $m_{10}, m_{11} \neq 1$                             $R_{10} + R_{11} < I(X_1;Y_1|Z_2Q)$
E5     $E(m_{10}, 1, m_{20}) \mid E(1, 1, 1)$, $m_{10}, m_{20} \neq 1$                             $R_{10} + R_{20} < I(X_1Z_2;Y_1|Q)$
E6     $E(1, m_{11}, m_{20}) \mid E(1, 1, 1)$, $m_{11}, m_{20} \neq 1$                             $R_{11} + R_{20} < I(X_1Z_2;Y_1|Z_1Q)$
E7     $E(m_{10}, m_{11}, m_{20}) \mid E(1, 1, 1)$, $m_{10}, m_{11}, m_{20} \neq 1$                $R_{10} + R_{11} + R_{20} < I(X_1Z_2;Y_1|Q)$
Table 2.1: Situations of decoding error and associated rate constraints.

Note that, contrary to what happened with H&K's decoding scheme, we do not consider $E(1, 1, m_{20})$, for some $m_{20} \neq 1$, an error event for receiver 1. This fact allows us to disregard two inequalities (one for each receiver). The rate constraints associated with each error event are obtained using standard results on joint typicality in a straightforward but rather long manner. For this reason, we will skip the parts of the proof that can be easily derived.
Consider first the error event E1. Thanks to the AEP, we can bound $P\{E^c(1, 1, 1)\} \le \epsilon$.
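With respect to the error event E2, we have:
\begin{align}
P\{E2\} &= P\{E(m_{10}, 1, 1) \text{ for some } m_{10} \neq 1\} \overset{(a)}{\le} 2^{nR_{10}} P\{E(2, 1, 1)\} \tag{2.119} \\
&= 2^{nR_{10}} P\left\{ (q^n, z_1^n(2|q^n), x_1^n(2, 1|q^n), z_2^n(1|q^n), y_1^n) \in A_\epsilon^{(n)} \,\middle|\, (1, 1, 1, 1) \text{ sent} \right\} \tag{2.120} \\
&\overset{(b)}{=} 2^{nR_{10}} \sum_{(q^n, z_1^n, x_1^n, z_2^n, y_1^n) \in A_\epsilon^{(n)}} P_{QZ_1X_1}(q^n, z_1^n, x_1^n) P_{Z_2|Q}(z_2^n|q^n) P_{Y_1|Z_2Q}(y_1^n|z_2^n, q^n) \tag{2.121} \\
&\overset{(c)}{\le} 2^{nR_{10}} 2^{n(H(Q,Z_1,X_1,Z_2,Y_1)+\epsilon)} 2^{-n(H(Q,Z_1,X_1)-\epsilon)} 2^{-n(H(Z_2|Q)-2\epsilon)} 2^{-n(H(Y_1|Z_2Q)-2\epsilon)} \tag{2.122} \\
&= 2^{nR_{10}} 2^{n(H(Y_1|Z_1,X_1,Z_2,Q) - H(Y_1|Z_2Q) + 6\epsilon)} \notag \\
&\overset{(d)}{=} 2^{nR_{10}} 2^{-n(I(X_1;Y_1|Z_2Q) - 6\epsilon)}, \tag{2.123}
\end{align}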
where (a) follows from the symmetry of the code construction and the union bound, (b) follows from the dependence tree between $z_1^n(2|q^n)$, $x_1^n(2, 1|q^n)$, $z_2^n(1|q^n)$, and $y_1^n$. Finally, (c) follows from applying [Cov06, Thm. 14.2.1] and (d) follows since $Z_1 \to X_1 \to Y_1$ forms a Markov chain given $Q$. Using similar upper-bounding techniques for the other cases of error event we obtain the rest of the rate constraints of Table 2.1.
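The passage from (2.122) to (2.123) can be made explicit with the chain rule for entropy. The following worked step is a reading aid rather than part of the original proof; it assumes, as the code construction suggests, that $Z_2$ is independent of $(Z_1, X_1)$ given $Q$.

```latex
\begin{align*}
H(Q,Z_1,X_1,Z_2,Y_1) &= H(Q,Z_1,X_1) + H(Z_2|Q,Z_1,X_1) + H(Y_1|Z_1,X_1,Z_2,Q) \\
                     &= H(Q,Z_1,X_1) + H(Z_2|Q) + H(Y_1|Z_1,X_1,Z_2,Q), \\
\text{exponent of (2.122)} &= n\big[ H(Y_1|Z_1,X_1,Z_2,Q) - H(Y_1|Z_2 Q) + 6\epsilon \big], \\
H(Y_1|Z_1,X_1,Z_2,Q) &= H(Y_1|X_1,Z_2,Q)
   \quad \text{(Markov chain } Z_1 \to X_1 \to Y_1 \text{ given } Q\text{)}, \\
\Rightarrow \text{exponent} &= -n\big[ I(X_1;Y_1|Z_2 Q) - 6\epsilon \big].
\end{align*}
```

The four slack terms add up as $\epsilon + \epsilon + 2\epsilon + 2\epsilon = 6\epsilon$, so the bound on $P\{E2\}$ vanishes with $n$ whenever $R_{10} < I(X_1;Y_1|Z_2Q) - 6\epsilon$.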
Note that the union bound can be used to upper bound $P_1^{(n)}(e)$ by the sum of the probabilities of cases E1 to E7. Hence, if all the rate constraints of Table 2.1 are satisfied, a total rate of $R_1 = R_{10} + R_{11}$ is reliably communicated from sender 1 to receiver 1. Dual constraints with interchanged subindexes arise to minimize $P_2^{(n)}(e)$. Table 2.2 summarizes all the rate constraints that must be fulfilled for the achievability of a $(R_1, R_2) = (R_{10} + R_{11}, R_{20} + R_{22})$ rate pair.
Rate constraint (receiver 1)                        Rate constraint (receiver 2)
-----------------------------------------------   -----------------------------------------------
$R_{11} < I(X_1;Y_1|Z_1Z_2Q)$                       $R_{22} < I(X_2;Y_2|Z_1Z_2Q)$
$R_{10} + R_{11} < I(X_1;Y_1|Z_2Q)$                 $R_{20} + R_{22} < I(X_2;Y_2|Z_1Q)$
$R_{11} + R_{20} < I(X_1Z_2;Y_1|Z_1Q)$              $R_{22} + R_{10} < I(X_2Z_1;Y_2|Z_2Q)$
$R_{10} + R_{11} + R_{20} < I(X_1Z_2;Y_1|Q)$        $R_{20} + R_{22} + R_{10} < I(X_2Z_1;Y_2|Q)$
Table 2.2: Rate constraints for the achievability of a $(R_1, R_2) = (R_{10} + R_{11}, R_{20} + R_{22})$ rate pair. Redundant constraints have been removed.
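As an illustration of how Table 2.2 is used, the sketch below checks whether a given rate split satisfies all eight constraints. It is a minimal sketch, not part of the thesis: the function names, the dictionary keys, and the mutual-information values in the usage example are hypothetical placeholders that would have to be computed for an actual distribution $P_Q P_{Z_1X_1|Q} P_{Z_2X_2|Q}$.

```python
from typing import Dict, List

# Each entry pairs the split rates that are summed with the label of the
# mutual-information term that must strictly exceed the sum (Table 2.2).
# The string labels are hypothetical dictionary keys chosen for readability.
TABLE_2_2 = [
    (("R11",), "I(X1;Y1|Z1Z2Q)"),
    (("R10", "R11"), "I(X1;Y1|Z2Q)"),
    (("R11", "R20"), "I(X1Z2;Y1|Z1Q)"),
    (("R10", "R11", "R20"), "I(X1Z2;Y1|Q)"),
    (("R22",), "I(X2;Y2|Z1Z2Q)"),
    (("R20", "R22"), "I(X2;Y2|Z1Q)"),
    (("R22", "R10"), "I(X2Z1;Y2|Z2Q)"),
    (("R20", "R22", "R10"), "I(X2Z1;Y2|Q)"),
]


def violated_constraints(rates: Dict[str, float], mi: Dict[str, float]) -> List[str]:
    """Return the labels of the Table 2.2 constraints that the split violates."""
    return [
        label
        for names, label in TABLE_2_2
        if sum(rates[name] for name in names) >= mi[label]
    ]


def is_achievable_split(rates: Dict[str, float], mi: Dict[str, float]) -> bool:
    """True if (R10, R11, R20, R22) meets every constraint of Table 2.2."""
    return not violated_constraints(rates, mi)


if __name__ == "__main__":
    # Made-up mutual-information values, for illustration only.
    mi = {
        "I(X1;Y1|Z1Z2Q)": 0.5, "I(X1;Y1|Z2Q)": 1.0,
        "I(X1Z2;Y1|Z1Q)": 0.8, "I(X1Z2;Y1|Q)": 1.2,
        "I(X2;Y2|Z1Z2Q)": 0.5, "I(X2;Y2|Z1Q)": 1.0,
        "I(X2Z1;Y2|Z2Q)": 0.8, "I(X2Z1;Y2|Q)": 1.2,
    }
    rates = {"R10": 0.3, "R11": 0.4, "R20": 0.3, "R22": 0.4}
    print(is_achievable_split(rates, mi), violated_constraints(rates, mi))
```

If the split is feasible, the rate pair $(R_{10} + R_{11}, R_{20} + R_{22})$ is achievable for that distribution; sweeping over splits and distributions would trace the region, which is beyond this sketch.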
Chapter 3
The Totally Asynchronous Interference Channel with Single-User Receivers
In Chapter 2 we discussed the convenience of allowing the receivers to perform partial
interference cancelation. On the one hand, it was found that this feature enhances the achievable
rates as long as each transmitter is not capable of shaping all the multiuser interference (e.g.,
in the Interference Channel). On the other hand, when useful, each receiver does not need to
reliably decode the underlying messages carried by the interfering signals: by including them
in the definition of the typical sets, the chances of correct detection of the desired message are
increased even when the interfering messages cannot be resolved.
It is implicitly assumed in the previous reasoning that the receivers have perfect knowledge
of the codebooks of the interfering users. In other words, each receiver is able to form a finite list
of potential interference realizations to aid the decoding of the desired message. Whereas this
is a reasonable assumption in small-size centralized networks, in many other practical
applications the interferers and/or the coupling channels are unknown, which prevents us
from adopting it unreservedly. In particular, in this chapter we focus on the achievable rates in
decentralized wireless networks with uncoordinated sender-destination pairs by considering an
equivalent information theoretic model.
This motivates the study of the totally asynchronous interference channel with single-user
receivers. Since this channel is not information stable, we characterize its capacity region resorting to information density, although more amenable single-letter inner and outer bounds
are provided as well. The single-letter inner bound is obtained by constraining each sender to
use stationary inputs with i.i.d. letters. Aiming at numerical evaluation of achievable rates, we
subsequently concentrate on this achievable rate region.
When considering the Gaussian case, the Gram-Charlier expansion approximation of mutual
information allows us to i) show that Gaussian-distributed codes are not always optimal, and
ii) characterize sufficient conditions under which other input distributions may increase simultaneously the mutual information of both links beyond the Gaussian achievable rates. These
conditions depend exclusively on the transmit power constraints of the senders and the coupling
coefficients of the channel, and essentially require the receivers to be hampered by at least moderate interference. As a constructive verification of our results, some non-Gaussian-distributed
codes are studied to show that their achievable rates outperform those of Gaussian-distributed
codes when the conditions derived under the finite expansion approximation statistical analysis
are fulfilled.
3.1 Introduction
The interference channel (IC) [Car78] is the network scenario that models the interactions between several disjoint non-cooperative (in the sense that no relays are used for cooperation)
sender-receiver communication links sharing, generally in a non-orthogonal manner, the same
physical medium. Interference is generated across links, and the achievable rates of each pair are
hence coupled. The fact that each destination is interested in decoding only one among all the
information-bearing codewords on which its channel output depends is what makes this channel
difficult to analyze.
Finding a single-letter characterization of the capacity region of the IC remains an open
problem which, however, has been solved in some particular cases: i) statistically equivalent
channel outputs [Car83], ii) very strong [Car75,Car78] or strong interference [Sat78,Han81,Sat81,
Cos87], iii) a class of discrete additive degraded ICs [Ben79], and iv) a class of deterministic
ICs [Gam82]. As for results that hold in general, inner and outer bounds on the capacity
region have been found for discrete memoryless [Car78, Han81, Car83, Cho06b] and Gaussian
ICs [Car78, Han81, Car83, Kra04, Sas04, Sha07, Etk07].
The original definition of the IC assumes perfect frame synchronization between the senders
and the receivers, i.e., that the codewords sent by the transmitters are received in unison at
each destination. Instead, in the absence of frame synchronism, there exist unpredictable offsets
between the epochs at which codewords are received at the destinations. In decentralized wireless
networks consisting of several autonomous sender-receiver pairs this latter situation is likely to be
predominant. Additionally, without a centralized entity broadcasting information about who is
transmitting and when, the destinations may be unaware of the potential interference hampering
the transmission of their intended user. Routing paths that change, fading dynamics, or node
mobility may render useless any interference estimation attempt at the destinations by causing
fast time-varying changes on the network topology and hence on the interference. The following
are example scenarios and applications dealing with asynchronism and unknown interference.
• Decentralized networks with simple receivers [Scu08a, Scu08b, Pan08, Les06]. In an ad-hoc network consisting of a number of noncooperative wideband links sharing the same physical resources, signalling between transmitters (for synchronization purposes) and between transmitters and unintended receivers (to facilitate multi-user decoding) is hampered by interference, propagation delay, and the dynamics of the network topology. In the absence of a common clock reference, a practical choice is to employ OFDMA with independent coding across subcarriers and perform single-user decoding in each of them.
• Networks with non-stationary interference [Big07b]. Consider a cellular system with users logging in and out independently. The statistics of the user session duration result in a stochastic process counting the number of interferers from a neighboring cell that either i) cannot be tracked, or ii) requires neighbor discovery mechanisms and adaptive multiuser decoding techniques of unaffordable complexity. In this scenario, the computational burden of the receivers can be alleviated by treating interference as noise.
• The throughput scaling law of multihop wireless networks [Gup00, Xie04]. The throughput analysis of asymptotically large multihop wireless networks requires the use of mathematically amenable decoding strategies at the receivers. To that end, the assumption of totally asynchronous transmissions in conjunction with single-user decoding (of generally Gaussian-distributed codes) gives rise to closed-form results.
Motivated by the operational and practical constraints of these scenarios, we focus on the
$2 \times 2$ IC in the limited setup where the receivers are restricted to treat interference as noise (we
shall refer to this strategy as single-user detection) and transmission is totally asynchronous (in
the sense of [Hui85]). The lack of frame synchronism has been studied in the discrete multiple-access channel, and diverse results are obtained depending on the degree of asynchronism [Hui85,
Cov81b] and the memory of the channel [Ver89]. To the best of our knowledge, its impact on
the achievable rates of the interference channel has not been addressed previously. Similarly,
the rate penalty for precluding multiuser decoding strategies may also depend on the nature of
the channel as, surprisingly, for sufficiently small interference single-user detection achieves the
sum capacity of the frame-synchronous Gaussian IC [Sha07].
We first consider the discrete alphabet case in Section 3.2, where we define the channel
model and show that it is information unstable [Pin64]. Consequently, the capacity region does
not admit a single-letter characterization, and, rather, is given using an Information Spectrum
approach [Ver94]. This same approach was used in [SB06] to characterize the capacity region
of the frame-synchronous discrete IC. Pursuing analytical results, we provide an achievable rate
region and an outer bound to the capacity region using single-letter expressions. The single-letter inner bound is achieved by stationary inputs with i.i.d. letters and, given its simplicity,
we subsequently focus on this achievable rate region throughout the rest of the chapter.
Next, we move on to the Gaussian IC (GIC). It is a common practice to use Gaussian-distributed codes in the frame-synchronous GIC, the main reasons being that i) they allow for
closed-form characterizations of achievable rate regions, and ii) they achieve capacity in the
extreme cases when interference is strong [Han81, Sat81] or absent. Whether the optimality of
Gaussian-distributed codes can be extrapolated to all the ranges of interference intensity or to
the scope of this work is not known. Interestingly, [Che93] showed that Gaussian inputs fall short
of achieving the capacity region of the GIC when it is characterized using a limiting expression.
However, that tells us little about the possible optimality of Gaussian inputs in single-letter
characterizations of the capacity region (their gap to capacity is known to be at most 1 bit/ch.
use [Etk07]), provided that they exist. Again, when it comes to frame asynchronism and single-user decoding, nothing is known.
In studying whether stationary i.i.d. Gaussian inputs maximize the achievable rates without
frame synchronism and single-user decoders, Section 3.3 defines optimality of input distributions
with respect to the achievable rates. Additionally, the Gram-Charlier expansion of mutual
information is introduced. Making use of this tool, we show in Section 3.4 that Gaussian-distributed codes do not always achieve all the points of the achievable rate region. Additionally,
simple sufficient conditions for non-optimality of Gaussian-distributed codes are derived that
only depend on the coupling coefficients of the channel and the transmit power constraints. For
the symmetric GIC (equal coupling coefficients, equal transmit power constraints) they reduce
to exceeding a threshold transmit power. In other words, Gaussian codewords are not optimal
when the channel is interference-limited.
This result is intuitive in the following sense: only when the distribution of the interference-plus-noise term is dominated by the distribution of the interference can the optimization of each input distribution have an impact on mutual information, and only then can significant gains with respect to Gaussian-distributed codes be expected. Some non-Gaussian-distributed codes are numerically shown
to outperform the Gaussian achievable rates in accordance with the analytical results of Section
3.5, hence constructively proving them. Finally, Section 3.6 concludes the chapter.
3.1.1
The contributions of this chapter are the following:

• The capacity region of the totally asynchronous interference channel with single-user receivers is characterized using an Information Spectrum approach, but practical single-letter inner and outer bounds are also provided.
• A finite series approximation analysis of mutual information subsumes the impact of input statistics on the achievable rates through a quadratic form around the Gaussian achievable rates.
• The statistical analysis of mutual information reveals analytic conditions under which Gaussian-distributed codes are not optimal, and practical alternative codes are shown to outperform the Gaussian achievable rates in excellent accordance with the derived expressions.
3.2 The Capacity Region
Let us denote by $\mathcal{X}_1$ and $\mathcal{X}_2$ the input alphabets of the two senders, and by $\mathcal{Y}_1$ and $\mathcal{Y}_2$ the output alphabets of the two destinations. The $2 \times 2$ frame-asynchronous discrete memoryless IC (DMIC) consists of two conditional distributions $\{P_{Y_k|X_1X_2} : \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{Y}_k\}_{k=1,2}$ that describe the underlying channels, and two collections of distributions $\{P_{D_1,n}, P_{D_2,n}\}_{n=1}^{\infty}$, both defined on $\{0, 1, \ldots, n-1\}_{n=1}^{\infty}$, that describe the degree of asynchronism. We say that the channel is totally asynchronous [Hui85] when $P_{D_1,n}$, $P_{D_2,n}$ are uniform for all $n$. In contrast, a frame-synchronous IC is characterized by $P_{D_1,n}(0) = P_{D_2,n}(0) = 1$ for all $n$.
A code for the $2 \times 2$ frame-asynchronous DMIC consists of two encoding functions
$$X_k^n : \{1, \ldots, 2^{nR_k}\} \to \mathcal{X}_k^n, \quad k = 1, 2, \tag{3.1}$$
and two decoding functions
$$\hat{m}_k : \mathcal{Y}_k^n \to \{1, \ldots, 2^{nR_k}\}, \quad k = 1, 2. \tag{3.2}$$
Sender 1 draws a message $M_1$ uniformly from $\{1, \ldots, 2^{nR_1}\}$ and sends the corresponding codeword $X_1^n$, of length $n$, over the channel. We assume without loss of generality¹ that receiver 1 is frame-synchronized with sender 1, and thus $Y_1^n$ is a sufficient statistic for the message $M_1$. Since receiver 1 is unaware of the presence of an interferer, the random delay $D_1$ (with distribution $P_{D_1,n}$) experienced by the codewords of sender 2 is unknown. Similar arguments apply for sender 2 and receiver 2.
Thus, by treating the transmission of the interferer as noise, each transmitter faces a channel whose conditional distribution depends on the delay of the interferer. That is, the channel realization between transmitter 1 and receiver 1 is determined by the random variable $D_1$, which is chosen according to $P_{D_1,n}$ at the beginning of transmission and then held fixed, i.e.,
$$P_{Y_1^n|X_1^n,D_1}(y_1^n|x_1^n, d_1) = \sum_{x_2^n \in \mathcal{X}_2^n} P_{X_{2,n-d_1+1}^{n}}(x_{2,1}^{d_1}) \, P_{X_{2,1}^{n-d_1}}(x_{2,d_1+1}^{n}) \prod_{i=1}^{n} P_{Y_1|X_1X_2}(y_{1,i}|x_{1,i}, x_{2,i}). \tag{3.3}$$
We shall use $W_1^n$ to denote the channel from sender 1 to receiver 1 without particularizing on the delay instance,
$$P_{W_1^n}(y_1^n|x_1^n) = \sum_{d_1=0}^{n-1} P_{D_1,n}(d_1) P_{Y_1^n|X_1^n,D_1}(y_1^n|x_1^n, d_1) = \sum_{d_1=0}^{n-1} \frac{1}{n} P_{Y_1^n|X_1^n,D_1}(y_1^n|x_1^n, d_1). \tag{3.4}$$
Expression (3.3) reflects the fact that each received frame of receiver 1 depends on two codewords of user 2. Since each of them corresponds to an independent draw of the message $M_2$, they are independent as well. As nothing is imposed on the distribution of the interference, the channel (3.3) may have memory in general. A related model to (3.3) is the composite channel [Eff08], or averaged channel [Ahl68], where each component channel (for every instance of $D_1$) is assumed stationary and ergodic. The composite channel is information unstable [Pin64], and hence so is the frame-asynchronous DMIC with single-user decoders. In order to characterize the capacity of this channel, we can therefore treat each link as a general single-user channel and apply the capacity formula by Verdú and Han [Ver94].
¹ Synchronization can be achieved via the use of periodic preamble sequences at negligible rate penalty [Hui85].
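The structure behind (3.3) can be illustrated with a short sketch: for a delay $d_1$, the interference seen during one frame of receiver 1 is made of the last $d_1$ symbols of one codeword of sender 2 followed by the first $n - d_1$ symbols of the next one. The function name and the list-based representation of codewords are assumptions made for this example only.

```python
from typing import List, Sequence


def interference_frame(prev_codeword: Sequence[int],
                       next_codeword: Sequence[int],
                       delay: int) -> List[int]:
    """Interfering symbols overlapping one received frame of length n.

    The first `delay` positions carry the tail of the previous codeword of
    the interferer (its symbols n-delay+1..n), and the remaining n-delay
    positions carry the head of the next codeword (its symbols 1..n-delay).
    """
    n = len(prev_codeword)
    if len(next_codeword) != n:
        raise ValueError("both codewords must have the same length n")
    if not 0 <= delay < n:
        raise ValueError("delay must lie in {0, ..., n-1}")
    return list(prev_codeword[n - delay:]) + list(next_codeword[:n - delay])


if __name__ == "__main__":
    prev_cw = [1, 2, 3, 4]
    next_cw = [5, 6, 7, 8]
    for d in range(4):
        # delay 0 reproduces the frame-synchronous case
        print(d, interference_frame(prev_cw, next_cw, d))
```

Averaging the resulting channel over a uniformly drawn delay gives the totally asynchronous channel of (3.4).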
Theorem 3.1 The capacity region of the totally-asynchronous DMIC with single-user receivers is given by
$$\mathcal{C} = \bigcup_{P_{X_1}, P_{X_2}} \left\{ R_k \le \sup\{\alpha_k : F_{X_k}(\alpha_k) = 0\}, \; k = 1, 2 \right\}, \tag{3.5}$$
where $P_{X_k} = \{P_{X_k^n}\}_{n=1}^{\infty}$ is a sequence of finite-dimensional distributions, and $F_{X_k}$ is the limit of the cumulative distribution function of the normalized information density of the $k$-th link, i.e.,
$$F_{X_k}(\alpha_k) = \lim_{n \to \infty} P_{X_k^n W_k^n}\left\{ \frac{1}{n} i_{X_k^n W_k^n}(X_k^n; Y_k^n) \le \alpha_k \right\}, \tag{3.6}$$
where the information density amounts to
\begin{align}
i_{X_k^n W_k^n}(x_k^n; y_k^n) &= \log \frac{P_{W_k^n}(y_k^n|x_k^n)}{P_{Y_k^n}(y_k^n)} = \log \frac{\sum_{d_k=0}^{n-1} P_{D_k,n}(d_k) P_{Y_k^n|X_k^n,D_k}(y_k^n|x_k^n, d_k)}{\sum_{d_k=0}^{n-1} P_{D_k,n}(d_k) P_{Y_k^n|D_k}(y_k^n|d_k)} \tag{3.7} \\
&= \log \frac{\sum_{d_k=0}^{n-1} P_{Y_k^n|X_k^n,D_k}(y_k^n|x_k^n, d_k)}{\sum_{d_k=0}^{n-1} P_{Y_k^n|D_k}(y_k^n|d_k)}. \tag{3.8}
\end{align}
Proof. The proof is a natural extension of the capacity formula proved in [Ver94] for any arbitrary
single-user channel.
Given the complex form of the expressions in Theorem 3.1, we aim at deriving simpler
expressions upper and lower bounding the capacity region. To that end, we start by noticing
that the sequences of distributions PX1 and PX2 in (3.5) can be interpreted as the distributions
generating the codebooks of senders 1 and 2, respectively [Eff08]. As these distributions can
be arbitrary in the union in (3.5), an inner bound is obtained when they are constrained to be
i.i.d.
Lemma 3.1 The achievable rate region
$$\mathcal{R} = \bigcup_{P_{X_1}, P_{X_2}} \{ R_1 \le I(X_1;Y_1), \; R_2 \le I(X_2;Y_2) \} \tag{3.9}$$
is an inner bound of the capacity region; $\mathcal{R} \subseteq \mathcal{C}$.

Proof. An inner bound is obtained when we restrict the union in (3.5) to input distributions of the form $P_{X_k^n}(x_k^n) = \prod_{i=1}^{n} P_{X_k}(x_{k,i})$. In this case, since the interference seen by receiver 1 is i.i.d. irrespective of the realization of the random variable $D_1$, the channel between sender 1 and receiver 1 becomes memoryless with conditional distribution
$$P_{Y_1|X_1}(y_1|x_1) = \sum_{x_2 \in \mathcal{X}_2} P_{X_2}(x_2) P_{Y_1|X_1X_2}(y_1|x_1, x_2), \tag{3.10}$$
and the normalized information density $i_{X_1^n W_1^n}(X_1^n; Y_1^n)/n$ converges to $I(X_1;Y_1)$ by the law of large numbers. Thus, $F_{X_1}(\alpha_1) = u(\alpha_1 - I(X_1;Y_1))$, where $u(\cdot)$ is the unit step function, and the supremum in (3.5) is hence $I(X_1;Y_1)$. Lemma 3.1 follows by applying similar arguments to the link between sender 2 and receiver 2 and noticing that the achievable rates now depend on the input distributions through $P_{X_1}$ and $P_{X_2}$ only.
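The single-letter rate in Lemma 3.1 is easy to evaluate numerically for small alphabets. The sketch below first averages the interferer out as in (3.10) and then computes $I(X_1;Y_1)$ in bits. The function names and the noiseless XOR toy channel in the usage example are assumptions chosen for illustration; they do not appear in the thesis.

```python
import math
from typing import List, Sequence


def effective_channel(p_x2: Sequence[float],
                      p_y1_given_x1x2: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average the interferer out as in (3.10).

    p_y1_given_x1x2[x1][x2][y1] is the channel law; the result is indexed
    as [x1][y1].
    """
    n_y = len(p_y1_given_x1x2[0][0])
    return [
        [sum(p_x2[x2] * row[x2][y] for x2 in range(len(p_x2))) for y in range(n_y)]
        for row in p_y1_given_x1x2
    ]


def mutual_information(p_x: Sequence[float], channel: Sequence[Sequence[float]]) -> float:
    """I(X;Y) in bits for input law p_x and channel[x][y] = P(y|x)."""
    n_y = len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x))) for y in range(n_y)]
    mi = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_y):
            joint = px * channel[x][y]
            if joint > 0.0:
                mi += joint * math.log2(channel[x][y] / p_y[y])
    return mi


if __name__ == "__main__":
    # Toy channel (assumption): Y1 = X1 XOR X2 over binary alphabets, no noise.
    xor = [[[1.0 if y == (x1 ^ x2) else 0.0 for y in (0, 1)] for x2 in (0, 1)]
           for x1 in (0, 1)]
    uniform = [0.5, 0.5]
    print(mutual_information(uniform, effective_channel(uniform, xor)))     # uniform interferer
    print(mutual_information(uniform, effective_channel([1.0, 0.0], xor)))  # silent interferer
```

In this toy case a uniformly distributed interferer drives the single-user rate to zero, whereas a deterministic interferer leaves the full bit, which illustrates how strongly the rates in (3.9) depend on the pair $(P_{X_1}, P_{X_2})$.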
Lemma 3.2 The region
$$\mathcal{R}_o = \bigcup_{P_{X_1}, P_{X_2}} \{ R_1 \le I(X_1;Y_1|X_2), \; R_2 \le I(X_2;Y_2|X_1) \} \tag{3.11}$$
is an outer bound to the capacity region; $\mathcal{C} \subseteq \mathcal{R}_o$.

Proof. The proof follows by noticing that
$$\sup\{\alpha_1 : F_{X_1}(\alpha_1) = 0\} \overset{(a)}{\le} \liminf_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n) \overset{(b)}{\le} \liminf_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n|X_2^n) \overset{(c)}{\le} I(X_1;Y_1|X_2), \tag{3.12}$$
where (a) follows from [Ver94, Thm. 8.(h)], (b) is a consequence of the independence between $X_1^n$ and $X_2^n$, and (c) holds for some pair of distributions $P_{X_1}$ and $P_{X_2}$ and follows from [Hui85, Thms. 3.3-3.4]. Similar arguments hold for the link between sender 2 and receiver 2, where [Hui85, Thms. 3.3-3.4] guarantees that $\sup\{\alpha_2 : F_{X_2}(\alpha_2) = 0\} \le I(X_2;Y_2|X_1)$, where $I(X_2;Y_2|X_1)$ is evaluated using the same distributions $P_{X_1}$, $P_{X_2}$ as in $I(X_1;Y_1|X_2)$ (3.12).
Although the outer bound on the capacity region is trivial, it is worth pointing out that both $\mathcal{R}$ and $\mathcal{R}_o$ are characterized in terms of a union of regions, without any convex hull operation. Intuitively, the lack of frame synchronism precludes time-sharing between distributions, as happens in the discrete multiple-access channel [Hui85]. Due to its simplicity and amenability to numerical computation, we subsequently focus on the achievable rate region $\mathcal{R}$ throughout the rest of the chapter.
3.3 The Gaussian IC
Consider the $2 \times 2$ standard-form GIC [Car78],
$$Y_1 = X_1 + c_{21} X_2 + Z_1 \tag{3.13}$$
$$Y_2 = X_2 + c_{12} X_1 + Z_2, \tag{3.14}$$
where $Z_k \sim \mathcal{N}(0, 1)$, $k = 1, 2$, and the codewords $X_1$ and $X_2$ are independent and independent of the noise samples $Z_1$ and $Z_2$. The input codewords satisfy the transmit power constraint $E\{X_k^2\} \le P_k$, $k = 1, 2$. By fully symmetric GIC we denote a GIC where $c_{12} = c_{21} \triangleq c$ and $P_1 = P_2 \triangleq P$. In the totally asynchronous setup with single-user decoders, to maximize the achievable rate region $\mathcal{R}$, the senders must optimize the distribution of their codewords, described by the pdfs $f_{X_1}$ and $f_{X_2}$.
3.3.1 Definition of optimality
Definition 3.1 The input distributions $f_{X_1}$ and $f_{X_2}$ are $\theta$-optimal, $\theta \in [0, \pi/2]$, if they achieve the rate pair of the boundary of $\mathcal{R}$ that intersects the line $R_2 = \tan(\theta) R_1$. Denote such pair by $(R_1^\star(\theta), R_2^\star(\theta))$.
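It is not difficult to show that any pair of $\theta$-optimal distributions is a solution to the optimization problem
\begin{align}
\underset{f_{X_1}, f_{X_2}}{\text{maximize}} \quad & \min\left\{ \frac{I(X_1;Y_1)}{\cos(\theta)}, \frac{I(X_2;Y_2)}{\sin(\theta)} \right\} \tag{3.15} \\
\text{subject to} \quad & f_{X_k}(x) \ge 0 \;\; \forall x \in \mathbb{R}, \; k = 1, 2 \tag{3.16} \\
& \int f_{X_k}(x) \, dx = 1, \; k = 1, 2 \tag{3.17} \\
& \int x^2 f_{X_k}(x) \, dx \le P_k, \; k = 1, 2. \tag{3.18}
\end{align}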
The problem (3.15)-(3.18) is rather involved because of the intricate dependence of the mutual information on $f_{X_1}$ and $f_{X_2}$. Consider for instance the mutual information of the first link,
\begin{align}
I(X_1;Y_1) = I(X_1; X_1 + c_{21} X_2 + Z_1) &= \iint f_{X_1Y_1}(x, y) \log \frac{f_{Y_1|X_1}(y|x)}{f_{Y_1}(y)} \, dx \, dy \tag{3.19} \\
&= \iint f_{X_1}(x) g(y - x) \log \frac{g(y - x)}{\int f_{X_1}(z) g(y - z) \, dz} \, dx \, dy, \tag{3.20}
\end{align}
where $g \triangleq f_{c_{21}X_2 + Z_1}$ is
$$g(x) = f_{c_{21}X_2}(x) * f_{Z_1}(x) = \frac{1}{|c_{21}|} \int f_{X_2}(z/c_{21}) \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - z)^2} \, dz. \tag{3.21}$$
A similar expression holds for the mutual information of the second link.
Instead of finding a pair of $\theta$-optimal distributions, we are interested in knowing whether the common practice of using Gaussian-distributed codewords for the GIC is always an optimal choice in our setup. The achievable rate region of Gaussian-distributed codes is given by²
$$\mathcal{R}^G = \bigcup_{\substack{0 \le p_k \le P_k \\ k = 1, 2}} \left\{ (R_1, R_2) : 0 \le R_1 \le \frac{1}{2} \log\left(1 + \frac{p_1}{1 + c_{21}^2 p_2}\right), \; 0 \le R_2 \le \frac{1}{2} \log\left(1 + \frac{p_2}{1 + c_{12}^2 p_1}\right) \right\}. \tag{3.22}$$
² Unless the logarithm basis is indicated, it can be chosen arbitrarily as long as both sides of the equations have the same units.
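The corner rates in (3.22) are straightforward to evaluate. The following minimal sketch computes the pair of Gaussian single-user rates for a given power allocation; the function name and the choice of base-2 logarithms (rates in bits per channel use) are assumptions made for this example.

```python
import math
from typing import Tuple


def gaussian_rates(p1: float, p2: float, c21: float, c12: float) -> Tuple[float, float]:
    """Rates of (3.22) in bits/channel use when interference is treated as noise.

    p1, p2   : transmit powers actually used (0 <= pk <= Pk)
    c21, c12 : coupling coefficients of the standard-form GIC (3.13)-(3.14)
    """
    if p1 < 0 or p2 < 0:
        raise ValueError("powers must be non-negative")
    r1 = 0.5 * math.log2(1.0 + p1 / (1.0 + c21 ** 2 * p2))
    r2 = 0.5 * math.log2(1.0 + p2 / (1.0 + c12 ** 2 * p1))
    return r1, r2


if __name__ == "__main__":
    print(gaussian_rates(1.0, 1.0, 0.0, 0.0))  # no coupling: two AWGN links
    print(gaussian_rates(1.0, 1.0, 1.0, 1.0))  # symmetric coupling lowers both rates
```

Sweeping `p1` and `p2` over their admissible ranges and taking the union of the resulting rectangles gives a numerical picture of $\mathcal{R}^G$.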
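Gaussian-distributed codes are optimal, i.e. $\mathcal{R}^G = \mathcal{R}$, if and only if a pair of Gaussian distributions $f_{X_1}$, $f_{X_2}$ is $\theta$-optimal $\forall \theta \in [0, \pi/2]$. Let us denote by $(p_1(\theta), p_2(\theta))$ the power allocation that achieves the rate pair of the boundary of $\mathcal{R}^G$ that intersects the line $R_2 = \tan(\theta) R_1$. The region $\mathcal{R}^G$ can hence be rephrased as
$$\mathcal{R}^G = \bigcup_{0 \le \theta \le \pi/2} \left\{ (R_1, R_2) : 0 \le R_1 \le \frac{1}{2} \log\left(1 + \frac{p_1(\theta)}{1 + c_{21}^2 p_2(\theta)}\right), \; 0 \le R_2 \le \frac{1}{2} \log\left(1 + \frac{p_2(\theta)}{1 + c_{12}^2 p_1(\theta)}\right) \right\}. \tag{3.23}$$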
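A brute-force way to approximate the power allocation that defines the boundary point of the Gaussian region along the line $R_2 = \tan(\theta) R_1$ is to maximize the objective of (3.15) over a grid of powers. The sketch below is only an illustration under stated assumptions: the function names, the grid resolution, and the handling of the end points $\theta = 0$ and $\theta = \pi/2$ (where one of the two terms is dropped) are choices made here, not taken from the thesis.

```python
import math
from typing import Tuple


def gaussian_rates(p1: float, p2: float, c21: float, c12: float) -> Tuple[float, float]:
    """Gaussian single-user rates of (3.22), in bits per channel use."""
    r1 = 0.5 * math.log2(1.0 + p1 / (1.0 + c21 ** 2 * p2))
    r2 = 0.5 * math.log2(1.0 + p2 / (1.0 + c12 ** 2 * p1))
    return r1, r2


def gaussian_boundary_powers(theta: float, P1: float, P2: float,
                             c21: float, c12: float,
                             grid: int = 100) -> Tuple[float, float]:
    """Grid search for (p1, p2) maximizing min{R1/cos(theta), R2/sin(theta)}."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    best_value, best_powers = -1.0, (0.0, 0.0)
    for i in range(grid + 1):
        p1 = P1 * i / grid
        for j in range(grid + 1):
            p2 = P2 * j / grid
            r1, r2 = gaussian_rates(p1, p2, c21, c12)
            terms = []
            if cos_t > 1e-12:   # term is unbounded (inactive) when theta = pi/2
                terms.append(r1 / cos_t)
            if sin_t > 1e-12:   # term is unbounded (inactive) when theta = 0
                terms.append(r2 / sin_t)
            value = min(terms)
            if value > best_value:
                best_value, best_powers = value, (p1, p2)
    return best_powers


if __name__ == "__main__":
    for theta in (0.0, math.pi / 4, math.pi / 2):
        print(theta, gaussian_boundary_powers(theta, 2.0, 2.0, 1.0, 1.0, grid=50))
```

The grid search costs `(grid + 1) ** 2` rate evaluations, which is acceptable for a two-user example; a finer or adaptive search would be needed for accurate boundary curves.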
To further explore the optimality of Gaussian-distributed codes, consider the following result.
Lemma 3.3 For any fixed $\theta \in [0, \pi/2]$, if $X_1 \sim \mathcal{N}(0, p_1)$ with $0 \le p_1 \le P_1$, the optimal distribution of $X_2$ that maximizes the achievable rates along the line $R_2 = \tan(\theta) R_1$ is $X_2 \sim \mathcal{N}(0, p_2)$ for some $0 \le p_2 \le P_2$. An analogous result follows if the user indexes are swapped.
Proof. Once $X_1$ follows a Gaussian distribution, the problem at hand is to determine the distribution of $X_2$ solution to
$$\underset{X_2 : E\{X_2^2\} \le P_2}{\text{maximize}} \; \min\left\{ \frac{I(X_1;Y_1)}{\cos(\theta)}, \frac{I(X_2;Y_2)}{\sin(\theta)} \right\}. \tag{3.24}$$
While $I(X_2;Y_2)$, equivalent to the mutual information in an AWGN channel with noise power $(1 + c_{12}^2 p_1)$, is maximized by a Gaussian $X_2$, the mutual information of the first link requires more attention. It can be rephrased as
\begin{align}
I(X_1;Y_1) &= h(c_{21} X_2 + X_1 + Z_1) - h(c_{21} X_2 + Z_1) \tag{3.25} \\
&= h(X_2 + (X_1 + Z_1)/c_{21}) - h(X_2 + Z_1/c_{21}) \tag{3.26} \\
&= h(X_2 + N_1) - h(X_2 + N_2), \tag{3.27}
\end{align}
where $N_1 \sim \mathcal{N}(0, (1 + p_1)/c_{21}^2)$, $N_2 \sim \mathcal{N}(0, 1/c_{21}^2)$, and $h(\cdot)$ stands for differential entropy. The maximization of (3.27) with respect to the distribution of $X_2$ subject to a power constraint $P_2$ falls within the class of problems addressed by [Liu07, Thm. 1], where it was shown that the optimal $X_2$ is Gaussian, too. Since a Gaussian-distributed $X_2$ maximizes simultaneously both mutual informations, it only remains to optimize its power $p_2$, $0 \le p_2 \le P_2$, so that (3.24) is maximized.
According to Lemma 3.3, Gaussian-distributed codes behave as a (possibly local) extremum in the maximization of the achievable rates. They are also a greedy strategy: although this input distribution maximizes mutual information if interference is absent, it also gives rise to the worst additive interference [Iha78, Dig01]. In other words, Gaussian-distributed codes maximize $h(Y_k)$ and $h(Y_k|X_k)$ simultaneously for $k = 1, 2$, but this does not necessarily imply that they maximize $I(X_k;Y_k)$ as well. Since direct construction of $\theta$-optimal distributions (3.15)-(3.18) seems overwhelming, we shall adopt a completely different approach for showing non-optimality of Gaussian-distributed codes. It is based on the relation between mutual information and the shape of the pdf of the codewords, as described by their cumulants.
3.3.2 Finite expansion analysis of mutual information
Denote by $\mathcal{X}$ the support set of a zero-mean continuous random variable $X$ whose pdf is $f_X$. Let $\Phi_X(\omega)$ denote its characteristic function,
$$\Phi_X(\omega) = E\{e^{j\omega X}\}. \tag{3.28}$$
The cumulants [Pap84] $\{\kappa_i(X)\}_{i=1}^{+\infty}$ of $X$ are defined as the coefficients of the Maclaurin series of the natural logarithm of the characteristic function,
$$\log_e(\Phi_X(\omega)) = \sum_{i=1}^{+\infty} \kappa_i(X) \frac{(j\omega)^i}{i!}. \tag{3.29}$$
The cumulants of the zero-mean random variable $X$ can be related to its (central) moments,
and have some interesting properties concerning the shape of $f_X$, as listed below [Nik93].
First-order cumulant: $\kappa_1(X) = E\{X\} = 0$.
Second-order cumulant (variance): $\kappa_2(X) = E\{X^2\} \triangleq \sigma_X^2$.
Third-order cumulant (skewness): $\kappa_3(X) = E\{X^3\}$.
Fourth-order cumulant (kurtosis): $\kappa_4(X) = E\{X^4\} - 3E\{X^2\}^2$.
Symmetry: $f_X(x) = f_X(-x) \Rightarrow \kappa_{2i-1}(X) = 0 \ \forall i \ge 1$.
Independence: $X_1, X_2$ independent $\Rightarrow \kappa_i(X_1 + X_2) = \kappa_i(X_1) + \kappa_i(X_2) \ \forall i$.
Scaling: $\kappa_i(aX) = a^i \kappa_i(X) \ \forall a \in \mathbb{R}$.
Cumulants of the Gaussian distribution: $X \sim \mathcal{N}(0, P) \Rightarrow \kappa_2(X) = P$, $\kappa_i(X) = 0 \ \forall i \ne 2$.
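The listed properties can be checked numerically on simple distributions. The following Python sketch is only an illustration: the helper names `cumulants_from_moments` and `discrete_moments` are hypothetical, and the two test distributions (an equiprobable binary variable and a uniform variable) are chosen here for convenience, not taken from the text above.

```python
import math

def cumulants_from_moments(m2, m3, m4):
    """Cumulants of orders 2, 3 and 4 of a ZERO-MEAN variable from its moments."""
    return m2, m3, m4 - 3.0 * m2 ** 2

def discrete_moments(values, probs):
    """E{X^2}, E{X^3}, E{X^4} of a discrete variable (assumed zero-mean)."""
    return tuple(sum(p * v ** n for v, p in zip(values, probs)) for n in (2, 3, 4))

# Equiprobable binary variable on {-sigma, +sigma}.
sigma = 1.5
k2_bin, k3_bin, k4_bin = cumulants_from_moments(
    *discrete_moments([-sigma, sigma], [0.5, 0.5]))

# Uniform variable on (-a, a) with a = sqrt(3p): moments a^2/3, 0, a^4/5.
p = 2.0
a = math.sqrt(3 * p)
k2_uni, k3_uni, k4_uni = cumulants_from_moments(a ** 2 / 3, 0.0, a ** 4 / 5)

# Scaling property: kappa_4(aX) = a^4 kappa_4(X).
scale = 2.0
_, _, k4_scaled = cumulants_from_moments(
    *discrete_moments([-scale * sigma, scale * sigma], [0.5, 0.5]))

print(k4_bin, k4_uni, k4_scaled)
```

Both examples have symmetric pdfs, so their skewness vanishes, and both have negative kurtosis, as expected for flatter-than-Gaussian densities.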
Skewness measures the lack of symmetry of a distribution, whereas kurtosis can be considered a
measure of the non-Gaussianity (or peakedness) of $X$. Kurtosis is zero for a Gaussian random
variable, it is typically positive for distributions with heavy tails and a peak at zero, and negative
for flatter-than-Gaussian densities with lighter tails. While explicit distributions with infinite
positive kurtosis exist (e.g. a limiting version of Pearson's type VII distribution³ [Pea16]),
kurtosis is fundamentally lower bounded as
\[
\kappa_4(X_k) = E\{X_k^4\} - 3\sigma_X^4 \ge E\{X_k^2\}^2 - 3\sigma_X^4 = -2\sigma_X^4, \tag{3.30}
\]
thanks to Jensen's inequality and the fact that $X$ is zero-mean⁴.
³ The limiting pdf of a Pearson's type VII random variable $X$ with zero mean, variance $\sigma^2$, zero skewness, and
infinite kurtosis is $f_X(x) = \frac{3}{\sigma}\left(2 + (x/\sigma)^2\right)^{-5/2}$.
⁴ The lower bound (3.30) is achievable by the distribution $f_X(x) = \frac{1}{2}\left(\delta(x - \sigma_X) + \delta(x + \sigma_X)\right)$.
Finally, the principal tool that will be used subsequently to analyze the optimality of
Gaussian-distributed codes is the Gram-Charlier expansion [Nik93], which allows us to approximate
the differential entropy $h(X)$ of the near-Gaussian random variable $X$ around the entropy
of a Gaussian random variable with the same variance as $X$. If $\kappa_i^2(X)/\sigma_X^{2i} \ll 1$ for $i > 2$, then
\[
h(X) \approx \frac{1}{2}\log(2\pi e \sigma_X^2) - \left( \frac{1}{12}\frac{\kappa_3^2(X)}{\sigma_X^6} + \frac{1}{48}\frac{\kappa_4^2(X)}{\sigma_X^8} \right)\log(e) \tag{3.31}
\]
is a fourth-order entropy approximation for $X$ [Hel95]. Expression (3.31) will bridge the gap
between mutual information and cumulants in Section 3.4. Before that, let us adopt without
loss of generality the zero-mean assumption on both $X_1$ and $X_2$.
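To get a feeling for the size of the correction terms in (3.31), the approximation can be evaluated directly. The sketch below is an illustration only: it works in nats (so that $\log(e) = 1$), the function name `entropy_fourth_order` is hypothetical, and the uniform density is used as a test case even though it is not near-Gaussian, so a visible gap to its exact entropy is expected.

```python
import math

def entropy_fourth_order(var, k3, k4):
    """Fourth-order entropy approximation (3.31), in nats (log(e) = 1)."""
    gaussian_entropy = 0.5 * math.log(2 * math.pi * math.e * var)
    correction = k3 ** 2 / (12 * var ** 3) + k4 ** 2 / (48 * var ** 4)
    return gaussian_entropy - correction

p = 1.0                                    # common variance of the test cases
h_gauss = entropy_fourth_order(p, 0.0, 0.0)            # Gaussian: no correction
h_unif_approx = entropy_fourth_order(p, 0.0, -1.2 * p ** 2)  # uniform kurtosis
h_unif_exact = math.log(2 * math.sqrt(3 * p))  # exact entropy of U(-sqrt(3p), sqrt(3p))

print(h_gauss, h_unif_approx, h_unif_exact)
```

The approximation always lies below the Gaussian entropy, consistent with the Gaussian being the maximum-entropy distribution for a given variance.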
3.4 On the Optimality of Gaussian-Distributed Codes
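Based on the Gram-Charlier expansion of entropy, mutual information can be approximated in
terms of the cumulants of the input distributions of order four and below. This will allow us to
obtain (fourth-order) analytical results on the optimality of Gaussian-distributed codes.
Lemma 3.4 Assume that both $X_1$ and $X_2$ are zero-mean and $E\{X_k^2\} = p_k$, $k = 1, 2$. A fourth-order
expansion approximation of mutual information is
\begin{align}
I(X_1;Y_1) &\approx \frac{1}{2}\log\left(1 + \frac{p_1}{1 + c_{21}^2 p_2}\right) + \boldsymbol{\kappa}^T A \boldsymbol{\kappa} \tag{3.32} \\
I(X_2;Y_2) &\approx \frac{1}{2}\log\left(1 + \frac{p_2}{1 + c_{12}^2 p_1}\right) + \boldsymbol{\kappa}^T B \boldsymbol{\kappa}, \tag{3.33}
\end{align}
where $\boldsymbol{\kappa} \triangleq [\kappa_3(X_1)\ \kappa_4(X_1)\ \kappa_3(X_2)\ \kappa_4(X_2)]^T$,
\begin{align}
[A]_{1,1} &= -\frac{\log(e)}{12\sigma_{Y_1}^6} & [B]_{1,1} &= \frac{\log(e)c_{12}^6}{12}\left(\sigma_{Y_2|X_2}^{-6} - \sigma_{Y_2}^{-6}\right) \tag{3.34} \\
[A]_{1,3} &= -\frac{\log(e)c_{21}^3}{6\sigma_{Y_1}^6} & [B]_{1,3} &= -\frac{\log(e)c_{12}^3}{6\sigma_{Y_2}^6} \tag{3.35} \\
[A]_{2,2} &= -\frac{\log(e)}{48\sigma_{Y_1}^8} & [B]_{2,2} &= \frac{\log(e)c_{12}^8}{48}\left(\sigma_{Y_2|X_2}^{-8} - \sigma_{Y_2}^{-8}\right) \tag{3.36} \\
[A]_{2,4} &= -\frac{\log(e)c_{21}^4}{24\sigma_{Y_1}^8} & [B]_{2,4} &= -\frac{\log(e)c_{12}^4}{24\sigma_{Y_2}^8} \tag{3.37} \\
[A]_{3,3} &= \frac{\log(e)c_{21}^6}{12}\left(\sigma_{Y_1|X_1}^{-6} - \sigma_{Y_1}^{-6}\right) & [B]_{3,3} &= -\frac{\log(e)}{12\sigma_{Y_2}^6} \tag{3.38} \\
[A]_{4,4} &= \frac{\log(e)c_{21}^8}{48}\left(\sigma_{Y_1|X_1}^{-8} - \sigma_{Y_1}^{-8}\right) & [B]_{4,4} &= -\frac{\log(e)}{48\sigma_{Y_2}^8}, \tag{3.39}
\end{align}
the rest of entries of $A$ and $B$ are zero, and
\[
\sigma_{Y_k}^2 = 1 + c_{jk}^2 p_j + p_k, \quad \sigma_{Y_k|X_k}^2 = 1 + c_{jk}^2 p_j \quad \text{for } k, j = 1, 2,\ j \ne k. \tag{3.40}
\]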
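Proof. Consider $I(X_1;Y_1)$ (the analysis of $I(X_2;Y_2)$ is analogous). By using
\begin{align}
\sigma_{Y_1}^2 &= 1 + c_{21}^2 p_2 + p_1 \tag{3.41} \\
\sigma_{Y_1|X_1}^2 &= 1 + c_{21}^2 p_2 \tag{3.42}
\end{align}
the Gram-Charlier expansion (3.31) can be applied to each entropy term involved in mutual
information,
\begin{align}
I(X_1;Y_1) &= h(Y_1) - h(Y_1|X_1) = h(X_1 + c_{21}X_2 + Z_1) - h(c_{21}X_2 + Z_1) \tag{3.43} \\
&\overset{(a)}{\approx} \frac{1}{2}\log(2\pi e \sigma_{Y_1}^2) - \frac{\log(e)}{12}\frac{(\kappa_3(X_1) + c_{21}^3\kappa_3(X_2))^2}{\sigma_{Y_1}^6} - \frac{\log(e)}{48}\frac{(\kappa_4(X_1) + c_{21}^4\kappa_4(X_2))^2}{\sigma_{Y_1}^8} \tag{3.44} \\
&\quad - \frac{1}{2}\log(2\pi e \sigma_{Y_1|X_1}^2) + \frac{\log(e)}{12}\frac{c_{21}^6\kappa_3^2(X_2)}{\sigma_{Y_1|X_1}^6} + \frac{\log(e)}{48}\frac{c_{21}^8\kappa_4^2(X_2)}{\sigma_{Y_1|X_1}^8} \tag{3.45} \\
&= \frac{1}{2}\log\left(1 + \frac{p_1}{1 + c_{21}^2 p_2}\right) - \frac{\log(e)}{12\sigma_{Y_1}^6}\kappa_3(X_1)^2 - \frac{\log(e)}{48\sigma_{Y_1}^8}\kappa_4(X_1)^2 \tag{3.46} \\
&\quad + \frac{\log(e)c_{21}^6}{12}\left(\sigma_{Y_1|X_1}^{-6} - \sigma_{Y_1}^{-6}\right)\kappa_3(X_2)^2 + \frac{\log(e)c_{21}^8}{48}\left(\sigma_{Y_1|X_1}^{-8} - \sigma_{Y_1}^{-8}\right)\kappa_4(X_2)^2 \tag{3.47} \\
&\quad - \frac{\log(e)c_{21}^3}{6\sigma_{Y_1}^6}\kappa_3(X_1)\kappa_3(X_2) - \frac{\log(e)c_{21}^4}{24\sigma_{Y_1}^8}\kappa_4(X_1)\kappa_4(X_2) \tag{3.48} \\
&\overset{(b)}{=} \frac{1}{2}\log\left(1 + \frac{p_1}{1 + c_{21}^2 p_2}\right) + \boldsymbol{\kappa}^T A \boldsymbol{\kappa}. \tag{3.49}
\end{align}
Expression (a) is obtained when (3.31) is applied in conjunction with the independence property,
the scaling property, and the fact that $Z_1$ is Gaussian (so that its cumulants of order three and
four vanish), while (b) follows from the definition of $A$ (3.34)-(3.39).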
The matrices A and B are upper-triangular, and hence their eigenvalues lie on their diagonal
entries (3.34), (3.36), (3.38) and (3.39). Since A and B are not negative definite (two eigenvalues
are negative and the other two are non-negative), the possibility of finding a vector of cumulants
inducing a pair of distributions outperforming Gaussian-distributed codes is not precluded.
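The structure of $A$ and $B$ can be inspected numerically. The following NumPy sketch builds both matrices from the entries of Lemma 3.4 (in nats, so $\log(e) = 1$) and evaluates the quadratic form for a given vector of cumulants. The function names `expansion_matrices` and `rate_correction`, and the sample values $P = 15$, $c = 1/\sqrt{2}$ and $c = 0.1$, are choices made here for illustration.

```python
import numpy as np

def expansion_matrices(p1, p2, c12, c21):
    """Matrices A and B with the entries (3.34)-(3.39), in nats (log(e) = 1)."""
    var_y1, var_y1_x1 = 1 + c21**2 * p2 + p1, 1 + c21**2 * p2   # (3.40), k = 1
    var_y2, var_y2_x2 = 1 + c12**2 * p1 + p2, 1 + c12**2 * p1   # (3.40), k = 2
    A = np.zeros((4, 4))
    B = np.zeros((4, 4))
    A[0, 0] = -1 / (12 * var_y1**3)
    A[0, 2] = -c21**3 / (6 * var_y1**3)
    A[1, 1] = -1 / (48 * var_y1**4)
    A[1, 3] = -c21**4 / (24 * var_y1**4)
    A[2, 2] = c21**6 / 12 * (var_y1_x1**-3 - var_y1**-3)
    A[3, 3] = c21**8 / 48 * (var_y1_x1**-4 - var_y1**-4)
    B[0, 0] = c12**6 / 12 * (var_y2_x2**-3 - var_y2**-3)
    B[0, 2] = -c12**3 / (6 * var_y2**3)
    B[1, 1] = c12**8 / 48 * (var_y2_x2**-4 - var_y2**-4)
    B[1, 3] = -c12**4 / (24 * var_y2**4)
    B[2, 2] = -1 / (12 * var_y2**3)
    B[3, 3] = -1 / (48 * var_y2**4)
    return A, B

def rate_correction(matrix, kappa):
    """Quadratic form kappa^T M kappa for kappa = [k3(X1), k4(X1), k3(X2), k4(X2)]."""
    kappa = np.asarray(kappa, dtype=float)
    return float(kappa @ matrix @ kappa)

# Fully symmetric channel, symmetric pdfs (zero skewness), common kurtosis 1.
kappa = [0.0, 1.0, 0.0, 1.0]
A_strong, B_strong = expansion_matrices(15.0, 15.0, 1 / np.sqrt(2), 1 / np.sqrt(2))
A_weak, B_weak = expansion_matrices(15.0, 15.0, 0.1, 0.1)
gain_strong = rate_correction(A_strong, kappa)
gain_weak = rate_correction(A_weak, kappa)
print(gain_strong, gain_weak)
```

With these sample values the correction term is positive for the strongly coupled channel and negative for the weakly coupled one, which is the behaviour that the threshold analysis later in this section makes precise.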
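Lemma 3.5 Gaussian-distributed codes are not $\theta$-optimal for some fixed $\theta \in [0, \pi/2]$ if the
problem
\begin{align}
&\text{find} && \boldsymbol{\kappa} \tag{3.50} \\
&\text{subject to} && \min\{\boldsymbol{\kappa}^T A(\theta)\boldsymbol{\kappa},\ \boldsymbol{\kappa}^T B(\theta)\boldsymbol{\kappa}\} > 0 \tag{3.51} \\
& && [\boldsymbol{\kappa}]_2 \ge -2p_1^2(\theta) \tag{3.52} \\
& && [\boldsymbol{\kappa}]_4 \ge -2p_2^2(\theta) \tag{3.53}
\end{align}
is feasible, where $A(\theta)$, $B(\theta)$ are equivalent to $A$, $B$ in Lemma 3.4 but with $E\{X_k^2\} = p_k(\theta)$,
$k = 1, 2$, in (3.40).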
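Proof. Gaussian-distributed codes are not $\theta$-optimal if another pair of distributions achieves a
rate pair outside $\mathcal{R}^G$ in the direction given by the line $R_2 = \tan(\theta)R_1$. That amounts to finding
an input distribution that yields an objective value (3.15)-(3.18) larger than
$\min\left\{ \frac{R_1^G(\theta)}{\cos(\theta)}, \frac{R_2^G(\theta)}{\sin(\theta)} \right\}$, where
\[
(R_1^G(\theta), R_2^G(\theta)) \triangleq \left( \frac{1}{2}\log\left(1 + \frac{p_1(\theta)}{1 + c_{21}^2 p_2(\theta)}\right),\ \frac{1}{2}\log\left(1 + \frac{p_2(\theta)}{1 + c_{12}^2 p_1(\theta)}\right) \right). \tag{3.54}
\]
Since
\begin{align}
\min\left\{ \frac{I(X_1;Y_1)}{\cos(\theta)}, \frac{I(X_2;Y_2)}{\sin(\theta)} \right\}
&\overset{(a)}{\approx} \min\left\{ \frac{R_1^G(\theta) + \boldsymbol{\kappa}^T A \boldsymbol{\kappa}}{\cos(\theta)}, \frac{R_2^G(\theta) + \boldsymbol{\kappa}^T B \boldsymbol{\kappa}}{\sin(\theta)} \right\} \tag{3.55} \\
&\ge \min\left\{ \frac{R_1^G(\theta)}{\cos(\theta)}, \frac{R_2^G(\theta)}{\sin(\theta)} \right\} + \min\left\{ \frac{\boldsymbol{\kappa}^T A \boldsymbol{\kappa}}{\cos(\theta)}, \frac{\boldsymbol{\kappa}^T B \boldsymbol{\kappa}}{\sin(\theta)} \right\}, \tag{3.56}
\end{align}
where (a) follows from Lemma 3.4, a sufficient condition is that (3.50)-(3.53) is feasible, where
(3.52)-(3.53) account for the fundamental lower bound on the kurtosis (3.30).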
In general, it is difficult to find a vector of cumulants satisfying Lemma 3.5 for a given GIC
and $\theta$ due to i) the lack of general closed-form expressions for $p_1(\theta)$ and $p_2(\theta)$ (which are the
solution to a non-convex problem), and ii) the fact that neither $A(\theta)$ nor $B(\theta)$ is positive/negative
definite. However, in some particular cases the optimal power allocation $(p_1(\theta), p_2(\theta))$ can be
found in closed form.
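Lemma 3.6 The power allocation $(p_1(\theta), p_2(\theta))$ that achieves the unique rate pair resulting
from the intersection of the line $R_2 = \tan(\theta)R_1$ and the boundary of $\mathcal{R}^G$ for any fixed $\theta \in [0, \pi/2]$
is given by
\[
(p_1(\theta), p_2(\theta)) =
\begin{cases}
(\{p_1 \ge 0 : \phi(p_1, P_2; \theta) = 0\},\ P_2) & \text{if } \phi(P_1, P_2; \theta) \ge 0 \\
(P_1,\ \{p_2 \ge 0 : \phi(P_1, p_2; \theta) = 0\}) & \text{otherwise,}
\end{cases} \tag{3.57}
\]
where
\[
\phi(x, y; \theta) \triangleq \left(1 + \frac{x}{1 + c_{21}^2 y}\right)^{1/\cos(\theta)} - \left(1 + \frac{y}{1 + c_{12}^2 x}\right)^{1/\sin(\theta)}. \tag{3.58}
\]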
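Proof. Consider that $E\{X_1^2\} = p$ and $E\{X_2^2\} = \alpha p$, for some fixed $0 \le p \le P_1$ and $0 \le \alpha \le P_2/p$.
Then, since
\[
R_1^G = \frac{1}{2}\log\left(1 + \frac{p}{1 + c_{21}^2 \alpha p}\right), \quad R_2^G = \frac{1}{2}\log\left(1 + \frac{\alpha p}{1 + c_{12}^2 p}\right) \tag{3.59}
\]
are both increasing in $p$, which cannot exceed $\min\{P_1, P_2/\alpha\}$ to satisfy the transmit power
constraints, it follows that at least one of the senders must transmit at its maximum power. If
\[
\frac{1}{2\cos(\theta)}\log\left(1 + \frac{P_1}{1 + c_{21}^2 P_2}\right) \ge \frac{1}{2\sin(\theta)}\log\left(1 + \frac{P_2}{1 + c_{12}^2 P_1}\right), \tag{3.60}
\]
which is equivalent to $\phi(P_1, P_2; \theta) \ge 0$, sender two must transmit at power $p_2(\theta) = P_2$ and
sender one must cut down its transmit power until both sides of (3.60) are equal, which is
achieved by $p_1(\theta) = \{p_1 \ge 0 : \phi(p_1, P_2; \theta) = 0\}$. If
\[
\frac{1}{2\cos(\theta)}\log\left(1 + \frac{P_1}{1 + c_{21}^2 P_2}\right) < \frac{1}{2\sin(\theta)}\log\left(1 + \frac{P_2}{1 + c_{12}^2 P_1}\right), \tag{3.61}
\]
a similar result holds.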
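Corollary 3.1 In case $\theta = \pi/4$, (3.57) reduces to
\[
(p_1(\pi/4), p_2(\pi/4)) =
\begin{cases}
\left( \frac{1}{2c_{12}^2}\left(\sqrt{1 + 4c_{12}^2(1 + c_{21}^2 P_2)P_2} - 1\right),\ P_2 \right) & \text{if } \frac{P_1}{1 + c_{21}^2 P_2} \ge \frac{P_2}{1 + c_{12}^2 P_1} \\
\left( P_1,\ \frac{1}{2c_{21}^2}\left(\sqrt{1 + 4c_{21}^2(1 + c_{12}^2 P_1)P_1} - 1\right) \right) & \text{otherwise,}
\end{cases} \tag{3.62}
\]
which is the power allocation that achieves the maximum symmetric achievable rate of
Gaussian-distributed codes, defined as
\[
R_{sym}^G = \max_{(R_1, R_2) \in \mathcal{R}^G} \min\{R_1, R_2\}. \tag{3.63}
\]
Proof. It suffices to show that
\[
\text{sign}(\phi(P_1, P_2; \pi/4)) = \text{sign}\left( \frac{P_1}{1 + c_{21}^2 P_2} - \frac{P_2}{1 + c_{12}^2 P_1} \right), \tag{3.64}
\]
and hence $\{p_1 \ge 0 : \phi(p_1, P_2; \pi/4) = 0\}$ is the unique positive solution to the equation
\[
c_{12}^2 p_1^2 + p_1 - (1 + c_{21}^2 P_2)P_2 = 0. \tag{3.65}
\]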
Corollary 3.2 For a fully symmetrical GIC with $c_{12} = c_{21} \triangleq c$ and $P_1 = P_2 \triangleq P$, (3.62) reduces
to
\[
p_1(\pi/4) = p_2(\pi/4) = P. \tag{3.66}
\]
Proof. It follows after straightforward manipulation of (3.62).
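The power allocation of Lemma 3.6 and its closed form for $\theta = \pi/4$ can be sketched in a few lines of Python. The sketch assumes $\theta$ strictly inside $(0, \pi/2)$ and strictly positive coupling coefficients, and it finds the root of $\phi$ by plain bisection, which is a choice made here and not a method prescribed above. All function names are hypothetical, rates are expressed in bits, and the channel parameters in the usage lines are arbitrary.

```python
import math

def phi(x, y, theta, c12, c21):
    """phi(x, y; theta) of (3.58); it vanishes when R2 = tan(theta) R1."""
    first = (1 + x / (1 + c21**2 * y)) ** (1 / math.cos(theta))
    second = (1 + y / (1 + c12**2 * x)) ** (1 / math.sin(theta))
    return first - second

def bisect_root(f, lo, hi, iterations=200):
    """Root of a monotone f on [lo, hi], assuming a sign change on the interval."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_allocation(P1, P2, theta, c12, c21):
    """Power pair (3.57): one sender at full power, the other backs off."""
    if phi(P1, P2, theta, c12, c21) >= 0:
        return bisect_root(lambda x: phi(x, P2, theta, c12, c21), 0.0, P1), P2
    return P1, bisect_root(lambda y: phi(P1, y, theta, c12, c21), 0.0, P2)

def closed_form_quarter_pi(P1, P2, c12, c21):
    """Closed form (3.62) for theta = pi/4."""
    if P1 / (1 + c21**2 * P2) >= P2 / (1 + c12**2 * P1):
        root = math.sqrt(1 + 4 * c12**2 * (1 + c21**2 * P2) * P2)
        return (root - 1) / (2 * c12**2), P2
    root = math.sqrt(1 + 4 * c21**2 * (1 + c12**2 * P1) * P1)
    return P1, (root - 1) / (2 * c21**2)

def gaussian_rates(p1, p2, c12, c21):
    """Rates of Gaussian-distributed codes (3.54), in bits per channel use."""
    r1 = 0.5 * math.log2(1 + p1 / (1 + c21**2 * p2))
    r2 = 0.5 * math.log2(1 + p2 / (1 + c12**2 * p1))
    return r1, r2

# Arbitrary asymmetric example.
P1, P2, c12, c21 = 10.0, 4.0, 0.6, 0.8
theta = math.pi / 3
p1, p2 = power_allocation(P1, P2, theta, c12, c21)
r1, r2 = gaussian_rates(p1, p2, c12, c21)
numeric = power_allocation(P1, P2, math.pi / 4, c12, c21)
closed = closed_form_quarter_pi(P1, P2, c12, c21)
symmetric = power_allocation(5.0, 5.0, math.pi / 4, 0.7, 0.7)
print((p1, p2), (r1, r2), numeric, closed, symmetric)
```

In the fully symmetric case both senders end up at full power, in agreement with Corollary 3.2.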
Lemma 3.6, Corollary 3.1, and Corollary 3.2 allow us to state the following result.
Theorem 3.2 Gaussian-distributed codes are not optimal for the totally asynchronous GIC
with single-user receivers. For the fully symmetric GIC, $R_{sym}^G$ is outperformed by other input
distributions if
\[
P > P_{th}(c) \triangleq \frac{\sqrt{1 + c^{-4}} - 1}{1 + c^2 - \sqrt{1 + c^4}}, \tag{3.67}
\]
which implies that interference is at least moderate. The maximum symmetric achievable rate
for zero-mean non-Gaussian $X_1$ and $X_2$ with symmetric pdfs, $R_{sym}$, is such that
\[
R_{sym} - R_{sym}^G \propto \kappa_4^2, \tag{3.68}
\]
where $\kappa_4 = \kappa_4(X_1) = \kappa_4(X_2)$ is their common kurtosis.
Proof. Non-optimality of Gaussian-distributed codes follows if (3.50)-(3.53) is shown to be
feasible for some fixed $\theta$, $P_1$, $P_2$, $c_{21}$, and $c_{12}$. To that end, consider the fully symmetric GIC
and $\theta = \pi/4$. By Corollary 3.2 it follows that
\begin{align}
\sigma_Y^2 &\triangleq \sigma_{Y_1}^2 = \sigma_{Y_2}^2 = 1 + (1 + c^2)P \tag{3.69} \\
\sigma_{Y|X}^2 &\triangleq \sigma_{Y_1|X_1}^2 = \sigma_{Y_2|X_2}^2 = 1 + c^2 P, \tag{3.70}
\end{align}
and, according to the symmetry of the channel and the choice of $\theta$, we impose that $X_1$ and $X_2$
have the same distribution (with skewness $\kappa_3$ and kurtosis $\kappa_4$) and that their pdf is symmetric
around zero. The symmetry property forces $\kappa_3 = 0$ and (3.50)-(3.53) reduces to
\begin{align}
&\text{find} && \kappa_4 \notag \\
&\text{subject to} && \kappa_4 \ge -2P^2 \tag{3.71} \\
& && \frac{\log(e)}{48}\left( \frac{c^8}{\sigma_{Y|X}^8} - \frac{(1 + c^4)^2}{\sigma_Y^8} \right)\kappa_4^2 > 0, \tag{3.72}
\end{align}
which is feasible when
\[
\frac{\sigma_Y^2}{\sigma_{Y|X}^2} > \sqrt{1 + c^{-4}} \tag{3.73}
\]
or, equivalently, $P > P_{th}(c)$ (3.67). Moderate interference⁵ accounts for
$\frac{\sqrt{1 + 2P} - 1}{2P} < c^2$ or, equivalently, for transmission with power above
$P_{mod}(c) = \frac{1/2 - c^2}{c^4}$. Since
\[
P_{th}(c) - P_{mod}(c) = \frac{c^2 + \sqrt{1 + c^4} - 1}{2(1 + c^2 - \sqrt{1 + c^4})c^4} \ge 0, \tag{3.74}
\]
a necessary condition for non-optimality of Gaussian codes is that the receivers experience
interference of at least moderate strength. Finally, (3.68) follows from (3.49) noticing that
$R_{sym} - R_{sym}^G$ corresponds to the left hand side of inequality (3.72).
In essence, Theorem 3.2 shows that Gaussian-distributed codes are not optimal when interference
is significant enough (at least moderate). Since $P_{th}(c)$ is decreasing in $c$, the more limited
the achievable rates are because of interference, the easier it is to outperform Gaussian-distributed
codes at lower transmit powers. As far as the maximum symmetric achievable rate is concerned,
kurtosis plays a key role in the performance of non-Gaussian codes (while it also concerns certain
practical implementation aspects [Bha06]). Consistently with the fact that Gaussian codes
are capacity-achieving in the AWGN channel, $P_{th}(c) \to +\infty$ when $c \to 0$.
Interestingly, [Sha07] showed that Gaussian-distributed codes and single-user detection are
sum-rate optimal provided that the interference is low enough (noisy interference, in the
terminology of [Sha07]). Our result is consistent with that of [Sha07], which holds under weaker
interference than weak interference [Cos85], and rules out the optimality of Gaussian codes
and single-user detection for frame-synchronous GICs with stronger interference than moderate
interference.
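The threshold power (3.67) and the moderate-interference power used in the proof of Theorem 3.2 are simple to evaluate. The sketch below is illustrative only; the function names `p_th` and `p_mod` are hypothetical, and the grid of coupling coefficients is an arbitrary choice.

```python
import math

def p_th(c):
    """Threshold power (3.67) of the fully symmetric GIC, for c > 0."""
    return (math.sqrt(1 + c**-4) - 1) / (1 + c**2 - math.sqrt(1 + c**4))

def p_mod(c):
    """Power above which interference is at least moderate: (1/2 - c^2) / c^4."""
    return (0.5 - c**2) / c**4

# Evaluate both powers on a coarse grid of coupling coefficients.
grid = [0.1 * k for k in range(1, 11)]
thresholds = [p_th(c) for c in grid]
gaps = [p_th(c) - p_mod(c) for c in grid]

for c, threshold, gap in zip(grid, thresholds, gaps):
    print(f"c = {c:.1f}  P_th = {threshold:10.3f}  P_th - P_mod = {gap:10.3f}")
```

On this grid the threshold decreases with $c$ and never falls below the moderate-interference power, in line with (3.74).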
3.5 Numerical Results
We shall present here some numerical results to illustrate Theorem 3.2. Particularly, we shall
show the existence of non-Gaussian distributions outperforming Gaussian-distributed codes for
the fully symmetric GIC to support the fourth-order analysis of Section 3.3. To that end, let us
consider the following non-Gaussian distributions:
Uniformly-distributed codes: the codewords are drawn uniformly about zero, i.e., $X_k \sim
\mathcal{U}(-\sqrt{3p_k}, \sqrt{3p_k})$ for some power $0 \le p_k \le P_k$, $k = 1, 2$.
Ternary-distributed codes: the codewords are drawn from the discrete alphabet
$\{-\beta_k, 0, \beta_k\}$ according to the probabilities
\begin{align}
P(X_k = \beta_k) = P(X_k = -\beta_k) &= \frac{p_k}{2\beta_k^2} \tag{3.75} \\
P(X_k = 0) &= 1 - 2P(X_k = \beta_k), \tag{3.76}
\end{align}
where $0 \le p_k \le P_k$ is the transmission power of the $k$-th sender, $\beta_k = (3p_k + \kappa_4(X_k)/p_k)^{1/2}$,
and $\kappa_4(X_k) \ge -2p_k^2$ is the kurtosis, $k = 1, 2$. The rationale of this choice resides in that
(3.75)-(3.76) is the simplest distribution having an arbitrarily large kurtosis, a fact that
in view of Theorem 3.2 is desirable in order to increase the achievable symmetric rate
whenever near-Gaussianity holds.
⁵ A fully symmetric GIC is hampered by moderate interference [Cos85] when $c^2 < 1$ and time-sharing is better
than Gaussian-distributed codes with single-user decoders.
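The ternary construction (3.75)-(3.76) can be checked by building the alphabet for a target power and kurtosis and recomputing both quantities from the resulting probabilities. In the sketch below the function name `ternary_code` and the sample values are hypothetical.

```python
import math

def ternary_code(p, k4):
    """Alphabet and probabilities of (3.75)-(3.76) for power p and kurtosis k4."""
    if k4 < -2 * p**2:
        raise ValueError("kurtosis below the lower bound -2 p^2 of (3.30)")
    amplitude = math.sqrt(3 * p + k4 / p)
    outer = p / (2 * amplitude**2)          # probability of each nonzero symbol
    return [-amplitude, 0.0, amplitude], [outer, 1 - 2 * outer, outer]

def power_and_kurtosis(symbols, probs):
    """Second moment and fourth-order cumulant of a zero-mean discrete variable."""
    m2 = sum(q * s**2 for s, q in zip(symbols, probs))
    m4 = sum(q * s**4 for s, q in zip(symbols, probs))
    return m2, m4 - 3 * m2**2

symbols, probs = ternary_code(2.0, 10.0)
power, kurtosis = power_and_kurtosis(symbols, probs)

# At the lower bound on the kurtosis the zero symbol is never used.
binary_symbols, binary_probs = ternary_code(2.0, -8.0)
print(symbols, probs, power, kurtosis, binary_probs)
```

Raising the kurtosis moves the outer symbols away from zero and concentrates probability on the zero symbol, while the power stays fixed.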
Given a pair of input probability distributions, the mutual information is numerically computed
by discretizing the integrals in (3.19)-(3.21). Although these distributions are not near-Gaussian
(in the sense $\kappa_i^2/\sigma_X^{2i} \ll 1$, $i > 2$), we shall see that the results using the Gram-Charlier expansion
hold even in this situation.
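As a rough illustration of this kind of numerical evaluation, the following NumPy sketch computes $I(X_1;Y_1) = h(Y_1) - h(Y_1|X_1)$ for $Y_1 = X_1 + cX_2 + Z_1$ with discrete inputs and unit-variance Gaussian noise, using a Riemann sum on a uniform grid. It is not a reproduction of the expressions (3.19)-(3.21), which lie outside this excerpt; the function names, grid step, tail width and sample alphabets are all assumptions made here.

```python
import numpy as np

def mixture_entropy_bits(means, weights, grid):
    """Differential entropy (bits) of a unit-variance Gaussian mixture on a grid."""
    means = np.asarray(means, dtype=float)[:, None]
    weights = np.asarray(weights, dtype=float)[:, None]
    components = np.exp(-0.5 * (grid[None, :] - means) ** 2) / np.sqrt(2 * np.pi)
    pdf = (weights * components).sum(axis=0)
    positive = pdf > 0                       # skip points where the pdf underflows
    step = grid[1] - grid[0]
    return float(-np.sum(pdf[positive] * np.log2(pdf[positive])) * step)

def mutual_information_bits(x1, px1, x2, px2, c, step=0.01, tail=10.0):
    """I(X1; Y1) in bits for Y1 = X1 + c X2 + Z1, Z1 ~ N(0, 1), discrete inputs."""
    x1, px1, x2, px2 = (np.asarray(v, dtype=float) for v in (x1, px1, x2, px2))
    interference = c * x2
    received = (x1[:, None] + interference[None, :]).ravel()
    received_weights = (px1[:, None] * px2[None, :]).ravel()
    lo = min(received.min(), interference.min()) - tail
    hi = max(received.max(), interference.max()) + tail
    grid = np.arange(lo, hi + step, step)
    h_y = mixture_entropy_bits(received, received_weights, grid)
    # h(Y1 | X1) equals h(c X2 + Z1) because entropy is shift-invariant.
    h_y_given_x = mixture_entropy_bits(interference, px2, grid)
    return h_y - h_y_given_x

ternary = ([-3.0, 0.0, 3.0], [0.25, 0.5, 0.25])
i_clean = mutual_information_bits([-10.0, 10.0], [0.5, 0.5], [0.0], [1.0], c=0.0)
i_silent = mutual_information_bits([0.0], [1.0], *ternary, c=0.7)
i_ternary = mutual_information_bits(*ternary, *ternary, c=0.7)
print(i_clean, i_silent, i_ternary)
```

The first two calls are sanity checks: a well-separated binary input without interference carries close to one bit, and a constant input carries none.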
Figure 3.1 can be viewed as a constructive proof of Theorem 3.2. The achievable rate regions
of Gaussian-, uniformly-, and ternary-distributed codes are plotted for two different channels
with $P = 15$. For each direction (described by a different value of $\theta \in [0, \pi/2]$), the optimum
transmit powers for Gaussian-distributed codes $(p_1(\theta), p_2(\theta))$ are computed using Lemma 3.6,
and plugged into the rest of the distributions to obtain their achievable rates. For ternary-distributed
codes, the kurtosis of each user's distribution is independently optimized, too. While in the
first channel, $c = 0.1$ (left), transmission is clearly below the threshold $P_{th}(0.1) \approx 9950$ and
none of the proposed non-Gaussian distributions can beat $\mathcal{R}^G$, in the second, $c = 1/\sqrt{2}$ (right),
transmission power is above the threshold $P_{th}(1/\sqrt{2}) \approx 3.24$, and achievable rate gains are
realized, showing that $\mathcal{R}^G$ falls short of achieving the capacity region.
To see how accurate the threshold power (3.67) given in Theorem 3.2 using the fourth-order
analysis of mutual information is, Figure 3.2 (left) plots the achievable symmetric rate of Gaussian-,
uniformly-, and ternary-distributed codes as a function of $P$ for $c = \{0.9245, 0.5436\}$, which
yield the theoretical values of $P_{th} = 1$ and $P_{th} = 10$, respectively. The curves show excellent
agreement with (3.67), and Gaussian-distributed codes are outperformed when the conditions
of Theorem 3.2 are satisfied. To study this in a wider range of values of $c$, Figure 3.2 (right)
shows the comparison between the theoretical value of the threshold power (3.67) and the actual
threshold power of uniformly-distributed codes, which is computed using a bisection method up
to a precision of 2.5%. Agreement between both curves is also observed.
Figure 3.1: Achievable rate regions of Gaussian-, uniformly-, and ternary-distributed codes for
a fully symmetric GIC with $P = 15$ and $c = 0.1$ (left) and $c = 1/\sqrt{2}$ (right), which correspond
to a signal-to-interference ratio value of 20 dB and 3 dB, respectively.
Figure 3.2: Achievable symmetric rate of Gaussian-, uniformly-, and ternary-distributed codes
for two different values of $c$ yielding theoretical threshold powers of 1 and 10 (left). Comparison
between the theoretical value of $P_{th}(c)$ (3.67) and the threshold power of uniformly-distributed
codes as a function of the coupling coefficient (right).
Finally, to evaluate the impact of the lack of synchronization and the use of single-user
decoders on the capacity of the GIC, we have compared the achievable symmetric rates of
Gaussian-, uniformly-, and ternary-distributed codes with the following strategies (which clearly
require synchronization and, some of them, the use of multi-user decoders):
Time-sharing: the achievable symmetric rate is independent of $c$ and equals $\frac{1}{4}\log(1 + 2P)$.
Since we are focusing on the fully symmetric GIC, the symmetric rate is half of the sum-rate
and $\frac{1}{4}\log(1 + 2P)$ is also the symmetric rate of the achievable rate region of Sason [Sas04,
Thm. 1].
Han and Kobayashi (H&K) simplified region: being the most long-standing unbeaten
achievable rate region for the GIC, the general region of Han and Kobayashi [Han81, Thm.
3.2] is widely accepted as the most successful approach towards the capacity region. However,
the excessive number of random variables in which it is expressed and the cardinality
of their alphabets prevent its direct computation. Instead, we will focus on the
symmetric rate of its simpler subregion [Han81, Thm. 4.1], which trades off computational
complexity against achievable rate results.
Shang's outer bound: one of the tightest known outer bounds of the capacity region of the
GIC [Sha07], which improves upon [Kra04, Thm. 1].
Figure 3.3: Achievable symmetric rates in the low-power regime, $P = 1$ (left), and high-power
regime, $P = 1000$ (right), as a function of the coupling coefficient $c$.
Figure 3.3 compares the achievable symmetric rates of the transmission strategies described
above as a function of the coupling coefficient $c$ in both the low-power ($P = 1$, shown left) and
high-power ($P = 1000$, shown right) regimes. When interference is weak, Gaussian-distributed
codes perform indistinguishably from H&K, and are very close to Shang's upper bound. The rest
of the proposed non-Gaussian distributions, although very close to the Gaussian behavior, cannot
improve upon it because for low interference the threshold power is very large (it grows as
$c^{-4}$ for small $c$). Nevertheless, the proposed non-Gaussian distributions, which are based on
totally asynchronous transmission, are able to significantly outperform time-sharing (except for
ternary-distributed codes in the high-power regime, where the discrete nature of their alphabet
makes them reach the upper bound of $\log_2(3)$ bits/ch. use⁶). Therefore, when interference is
weak the losses associated to the lack of transmission synchronism and the use of single-user
decoders are moderate (as long as H&K's performance is taken as a benchmark).
⁶ Note that the symmetric rate is strictly below $\log_2(3)$ close to those values of $c$ which make some of the
ternary symbols of each user overlap at the receiver ($c = \{0.5, 1\}$).
When interference increases up to moderate, time-sharing appears to be the best strategy (as
far as the symmetric rate is concerned). However, it is far from Shang's upper bound, which
might not be tight in this regime. In this case, the use of the proposed non-Gaussian distributions
is beneficial in the high-power regime (where transmission power is above the threshold (3.67)),
where they can significantly reduce the capacity losses due to the lack of synchronism and the use
of single-user decoders. For instance, in the high-power regime, for 3 dB signal-to-interference
ratio the use of ternary-distributed codes can raise the achievable rates by 99.8%.
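The two closed-form baselines of this comparison, Gaussian-distributed codes at full power with single-user decoding and time-sharing, can be evaluated directly for the fully symmetric GIC. The sketch below is an illustration in bits per channel use; the function names and the sample operating points are choices made here, and no attempt is made to reproduce the H&K, Shang, uniform, or ternary curves.

```python
import math

def gaussian_symmetric_rate(P, c):
    """Symmetric rate of Gaussian codes at full power, interference as noise (bits)."""
    return 0.5 * math.log2(1 + P / (1 + c**2 * P))

def time_sharing_rate(P):
    """Symmetric rate of time-sharing, (1/4) log2(1 + 2P), independent of c."""
    return 0.25 * math.log2(1 + 2 * P)

# Low-power, weak coupling: Gaussian codes beat time-sharing.
low_gauss, low_ts = gaussian_symmetric_rate(1.0, 0.1), time_sharing_rate(1.0)

# High-power, 3 dB signal-to-interference ratio: time-sharing is better.
c_3db = 1 / math.sqrt(2)
high_gauss, high_ts = gaussian_symmetric_rate(1000.0, c_3db), time_sharing_rate(1000.0)

# For c = 0.5 the moderate-interference power (1/2 - c^2) / c^4 equals 4,
# and the two strategies achieve the same symmetric rate there.
cross_gauss, cross_ts = gaussian_symmetric_rate(4.0, 0.5), time_sharing_rate(4.0)

print(low_gauss, low_ts, high_gauss, high_ts, cross_gauss, cross_ts)
```

The crossing point illustrates why the moderate-interference condition can be stated as a power threshold for a given coupling coefficient.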
3.6 Conclusions
Motivated by practical constraints arising in decentralized wireless networks with uncoordinated
nodes, we studied the totally asynchronous interference channel with single-user receivers. In
the discrete case, its capacity region was characterized using an Information Spectrum approach
since the channel is not information stable. As single-letter inner and outer bounds of the
capacity region were found, they allowed us to compute achievable rates for the Gaussian case
tackling the rather complex expressions of the capacity region.
Gaussian-distributed codes are capacity-achieving whenever the capacity region of the $2 \times 2$
GIC is known and lead to closed-form characterizations of achievable rate regions. Despite
their natural appeal, in the limited setup where transmission synchronism cannot be guaranteed
and single-user decoders are required, they fall short of maximizing the achievable rates for all
GIC instances. Sufficient conditions for the existence of non-Gaussian codes
leading to higher achievable rates were found, which for the fully symmetric GIC reduce to
ensuring that the transmission powers exceed a threshold that makes interference at least
moderate.
These analytical results were supported by numerical experiments aiming at i) showing
that explicit non-Gaussian distributed codes yielding higher achievable rates than Gaussian-distributed
codes exist (Figure 3.1); ii) checking the agreement between the sufficient conditions
and the performance of the explicitly proposed non-Gaussian codes (Figure 3.2); and iii) quantifying
the losses associated to the lack of transmission synchronism and the use of single-user
decoders and the losses/gains from using non-Gaussian distributed codes in the low- and high-power
and low- and high-interference regimes (Figure 3.3).
Chapter 4
Optimal Resource Allocation in Cellular Networks with Partial CSI
Previous chapters addressed several aspects of multiuser interference, namely the study of how
and when to partially cancel it (Chapter 2), and its impact on the achievable rates whenever that
is not possible (Chapter 3). Focused on canonical small-size scenarios and the infinite blocklength
regime, they disregarded part of the essential characteristics found in most of today's practical
multiuser systems, e.g. cellular systems. In this chapter we try to bridge this gap by targeting
a cellular network with an arbitrary number of nodes using non-ideal codes.
Emerging cellular networks are likely to handle users with heterogeneous quality of service
requirements attending to the nature of their underlying service application, the quality of their
wireless equipment, or even their contract terms. While sharing the same physical resources
(power, bandwidth, transmission time), the utility they get from using them may be very different and arbitrage is needed to optimize the global operation of the network. In this respect, we
investigate resource allocation strategies maximizing network utility under practical constraints.
When it comes to considering a scenario with many users, the metric of interest can no
longer be the capacity region: its complexity becomes as formidable as the chances of extracting
some practical insights out of it for potential extrapolation to prospective system designs. It
is therefore sensible to adopt a particular transmission strategy and try to get the most out of
the stringently limited available resources. In particular, we focus on a cellular network with
half-duplex, MIMO terminals and relaying infrastructure in the form of fixed and dedicated
relay stations. Whereas orthogonal frequency division multiple access is assumed, it is seen as
a frequency diversity enabler since path loss is the only channel state information (CSI) known
at the transmitters, which is refreshed periodically.
The lack of complete centralized CSI of all the links leads us to nullify intracell interference
by assigning resources disjointly. Despite the theoretical suboptimality of this choice from
an information theory standpoint, we see a three-fold interest in avoiding interference in this
scenario:
1. Not knowing the channel states between each destination and its potential undesired transmitters,
the network performance is exposed to uncontrolled losses if resource allocation
is performed ignoring interference.
2. Interference couples the performance of the different links possibly in a non-analytic manner
which prevents closed-form expressions (see (3.19)-(3.20) in Chapter 3).
3. Whenever analytic characterizations of achievable rate degradations due to multiuser interference
exist, they often lead to nonconvex expressions which preclude the use of efficient
global optimization approaches.
With this setup, the performance of a state-of-the-art relay-assisted transmission protocol
is characterized in terms of the ergodic achievable rates, for which novel concave lower bounds
are developed. The use of these bounds allows us to derive two efficient algorithms computing resource allocations in polynomial time, which address the optimization of the uplink and
downlink directions jointly. First, a global optimization algorithm providing one Pareto optimal
solution maximizing network utility during all the validity of one CSI is studied, which acts as
a performance upper bound. Second, a sequential optimization algorithm maximizing network
utility frame by frame is considered as a simpler alternative. The performance of both schemes
has been compared in practical scenarios, giving special attention to the performance-complexity
and throughput-fairness tradeoffs.
4.1 Introduction
4.1.1 Motivation
The deployment of cellular networks has been traditionally associated to the provision of voice
(and low-rate data) service to mobile users. The exclusivity of this purpose, however, is in conflict
with the ubiquitous availability of wireless equipment and the steadily increasing traffic demands
arising from new interactive, multimedia services, which have opened the door to a plethora of
new potential network scenarios. From interactive gaming to wireless broadband access, different
services with heterogeneous quality of service (QoS) requirements shall converge to the same
service network. Regarding this paradigm, we identify three central issues in prospective network
design which motivate the work of this chapter:
How to characterize the user experience of the different services of the network using
homogeneous performance measures?
How to dynamically arbitrate on the shared use of the limited transmission resources of
the network by competing flows which are of different nature?
How to extract the largest possible system spectral efficiency from the physical layer?
With this in mind, the optimization of the operation of the network is hence a matter of allocating
resources (power, bandwidth, rate, transmission time) efficiently for uplink (UL) and downlink
(DL) scheduled transmissions among the serving users such that some network-wide cost function
involving their service experience is maximized along time.
4.1.2 Adopted network setup
In this chapter, we tackle the network design problem adopting a cell-by-cell approach. Hence, we
focus on a cell consisting of one base station (BS) serving M mobile stations (MSs, or users). To
enable the realization of high spectral efficiencies and boost network performance, we assume that
R relay stations (RSs) are deployed within the cell coverage area to enhance the communication
between the users and the BS [Cov79, Cov06, Nab04, Wan05, Och06]. Interpreting the presence
of relays as an extension of the network infrastructure enabling relay-assisted transmission, their
locations are assumed to remain fixed, although they can indeed be optimized beforehand. All
the terminals are assumed half-duplex for practical reasons.
Since the capacity of the relay channel is still an open problem (so is determining the optimal
relaying strategy) we shall adopt here the cooperation protocol of [HM05, Prop. 2], based
on the decode-and-forward strategy [Yu05], which comprises essentially some of the protocols
in [Nab04, Och06, Val03, Doh04b] as particular cases and is able to work with partial knowledge
of the channel state. To make our approach more general, we let the BS, the RSs, and the
MSs be equipped with an arbitrary number of antennas, denoted by $n_{BS}$, $n_{RS}$, and $n_{MS,m}$,
respectively ($n_{MS,m}$ is the number of antennas of the $m$-th MS, $1 \le m \le M$) such that extra
performance gains arising from MIMO [Big04, Tse05, Big07a] can also be captured.
Pursuing the application of our results to practical scenarios we are led to two important choices, the first one being the adoption of orthogonal frequency division multiple-access
(OFDMA). OFDMA can be efficiently implemented via FFT/IFFT, and it is able to combat
the inherent frequency selectivity of wireless channels while at the same time allowing a modular tone-based multiplexing of users. Additionally, it improves upon TDMA with respect to
achievable rates and data latency, and allows for finer granularity in resource allocation [YJC07],
a must in wideband systems. For these and other reasons, it is an appealing choice for upcoming
wireless networking standards [Gro06, Gro07, 20005].
The second choice is related to the availability and quality of channel state information (CSI)
at each network location (BS, RSs, and MSs). In relayless OFDMA networks, centralized
perfect CSI of all the links (in the form of per-tone fading state knowledge) can be used to
allocate resources adaptively. Hence, bandwidth, power, and rate can be optimally assigned to
align with the instantaneous network conditions [Yu06], yielding enormous performance gains.
However, such perfect CSI is likely not to be available in all the (R+M+RM) links of our scenario.
On the one hand, the amount of processing required to take advantage of perfect CSI can be
formidable (the problem has been shown to be NP-hard even for a relayless network [Won99])
and possibly unaffordable. On the other hand, for sufficiently fast time-varying channels, the
necessary CSI refresh rate can exceed the capacity of the limited-rate feedback
channels of the network. Even worse, propagation and processing delays on the feedback channels
may result in outdated, useless CSI at the beginning of a resource allocation phase. Thus,
unlike other works [Yu06, Won99, Bae06, Lee06, Ng07] we shall study the network scenario of all
transmitters having perfect knowledge of the path loss of each of the channels, a slowly varying
scalar parameter, but being ignorant of each per-tone fading state. Although explicit path loss
estimation techniques are out of the scope of this thesis, its accurate estimation seems reasonable
provided that some pilot tones are placed within the transmission bandwidth, which is a common
practice in OFDM-based standards such as the IEEE 802.16 suite [Gro06, Gro07, 20005], the
3GPP LTE [36.], and WiMAX [Teo07] for synchronization purposes.
With this setup, we aim at optimizing the network operation for maximizing network utility
[Ng07, Kel98, Lin06, Pal07, Xue06] in a cell-by-cell approach. Centralized optimization is hence
performed at each BS which, upon collection of CSI, takes scheduling decisions and implements
resource allocation strategies shaping the instantaneous rates of all the users involved in its cell.
One nice feature of our network operation design framework is that the network resources (time,
frequency, power, and rate) devoted to UL and DL transmissions are optimized jointly, instead
of allocating a given portion of total resources to each direction in each transmission frame and
optimizing them separately.
4.1.3
We propose a centralized optimization framework for the maximization of the cell performance
based on the user experience of each serving MS. Under the setup of Section 4.1.2, the CSI of all
the links (path loss) is collected at the BS which, together with the QoS requirements of each
UL and DL flow1 and its current degree of fulfillment, decides the resource allocation strategy to
be followed during some period of time. This strategy is based on the maximization of network
utility, a cell-wide performance measure which combines the service satisfaction of all the users,
and has given rise to the following contributions to the problems raised in Section 4.1.1:
User satisfaction is measured using utility functions. Thus, the same network infrastructure
can flexibly reconfigure to optimally serve a variety of scenarios by properly choosing the
user utility function of each service under operation such that their different profiles are
conveniently reflected.
1
We consider here that each user needs to send and receive information, hence generating one UL flow and
one DL flow. The generalization to the setup where users may require more than one flow per direction (e.g.,
when accessing different services simultaneously using the same equipment) is straightforward, as each pair of UL
and DL flows can be treated as a different virtual user.
An algorithm to efficiently compute a global optimal resource allocation strategy in polynomial time (by solving a series of convex optimization problems) is proposed. It is benchmarked against other simpler, suboptimal strategies able to retain a large fraction of
performance with significant complexity savings.
The optimal operation of the network that maximizes network utility is essentially cross-
layer, as the joint optimization of user scheduling, resource allocation, and relay-assisted
transmission is involved for UL and DL directions.
In characterizing the performance of the adopted relay-assisted transmission protocol, tight
concave lower bounds to the ergodic capacity of MIMO and distributed MIMO channels
are obtained which may find applications outside the problems addressed in this thesis.
The rest of the chapter is structured as follows. Section 4.2 describes the adopted transmission strategy for OFDMA with partial CSI and some preliminaries regarding key system
parameters. Next, Section 4.3 addresses the transmission protocol for relay-assisted communication. Its cell-wide short term performance is analytically characterized in Section 4.4 in terms
of instantaneous achievable rate regions. Then, Section 4.5 builds upon this to i) introduce
user utility functions as a useful tool to characterize user satisfaction with services of different
nature, ii) pose optimal network strategy as the solution to an optimization problem which aims
at maximizing network utility, and iii) propose an iterative algorithm to compute a global optimal solution to this problem in polynomial time. Additionally, a reduced-complexity algorithm
computing a suboptimal network strategy is also proposed and benchmarked against the global
optimal in Section 4.6, where simulation results of practical scenarios are provided. Finally,
Section 4.7 concludes the chapter summarizing results and sketching lines for future work.
4.2
System Model and Preliminaries
Consider the network setup described in Section 4.1.2, where the BS, the RSs, and the MSs are
power constrained to $p_{\mathrm{BS}}^{\max}$, $p_{\mathrm{RS}}^{\max}$, and $p_{\mathrm{MS}}^{\max}$, respectively. In every transmission frame interval,
denoted by $T$, the same network bandwidth $B$ is used in the UL and DL phases, of adjustable
duration via TDD². In each of them, the communication of each BS-MS pair is assisted by
one RS. Let us denote the RS attached to the $m$-th MS by $\mathrm{RS}(m) \in \{1, \ldots, R\}$. The RS
assignment of the network is hence described by the connectivity matrix $\mathbf{L} \in \{0, 1\}^{R \times M}$, where
$L_{r,m} = \delta[r - \mathrm{RS}(m)]$ and $\delta[\cdot]$ is the Kronecker delta. Note that each BS-MS pair is assisted
by one RS, but the same RS can serve more than one BS-MS pair. In fact, the number of
BS-MS pairs assisted by the $r$-th RS equals the number of nonzero entries of the $r$-th row of
the connectivity matrix $\mathbf{L}$.
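As an illustration of the connectivity matrix, the following minimal Python sketch builds it from a list of RS assignments and counts the BS-MS pairs served by each RS. The function names, the 0-based indexing, and the example assignment are assumptions made for illustration only.

```python
def connectivity_matrix(rs_of_ms, num_rs):
    """Return the R x M 0/1 matrix whose (r, m) entry is 1 iff MS m is attached to RS r.

    rs_of_ms[m] is the (0-based, illustrative) index of the RS serving the m-th MS.
    """
    num_ms = len(rs_of_ms)
    matrix = [[0] * num_ms for _ in range(num_rs)]
    for m, r in enumerate(rs_of_ms):
        matrix[r][m] = 1
    return matrix


def pairs_per_rs(matrix):
    """Number of BS-MS pairs assisted by each RS: the row sums of the matrix."""
    return [sum(row) for row in matrix]


# Hypothetical cell with M = 5 MSs and R = 3 RSs.
rs_of_ms = [0, 2, 0, 1, 2]
L = connectivity_matrix(rs_of_ms, num_rs=3)
loads = pairs_per_rs(L)
```

Each column of the matrix contains exactly one nonzero entry, reflecting that every BS-MS pair is assisted by a single RS.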
2
Although the proposed optimization framework can be extended to the FDD mode, we have ruled it out
because it poses more restrictive complexity requirements on the RSs, which should be able to receive and
transmit simultaneously on different frequency bands.
We shall use the vectors $\boldsymbol{\ell}_1(t) \in \mathbb{R}_+^{M \times 1}$, $\boldsymbol{\ell}_2(t) \in \mathbb{R}_+^{R \times 1}$, and $\boldsymbol{\ell}_3(t) \in \mathbb{R}_+^{M \times 1}$ to denote the CSI
collected at the beginning of the $t$-th frame. While $\ell_{1,m}(t)$ stands for the path loss between the
BS and the $m$-th MS, $\ell_{2,r}(t)$ is the path loss between the BS and the $r$-th RS, and $\ell_{3,m}(t)$ is the
path loss between the $m$-th MS and its associated RS (which is the $\mathrm{RS}(m)$-th). All of them are
assumed to satisfy reciprocity.
When OFDM is employed with the only knowledge of the link path loss at each transmitter,
one practical strategy is to perform uniform power allocation among groups of tones sufficiently
far apart such that their individual fading states are uncorrelated and frequency diversity is
enabled. This is the case in the IEEE 802.16e - PUSC and FUSC standards [20005]. With this
approach, coding across a sufficiently large number of tones makes the instantaneous achievable
rate, denoted by $r(t)$, be upper bounded by the ergodic (or average) mutual information thanks
to the law of large numbers³. By ergodic capacity we understand the instantaneous capacity
given some fading state in the frequency domain averaged over all possible fading realizations in
this domain. Therefore, no matter how short the transmission interval is nor how fast the channel
response varies, the ergodic capacity will exclusively depend on the transmission bandwidth and
the link signal-to-noise ratio, snr. The snr suffices to characterize the quality of a link since
interference between neighboring transmitters is prevented by allocating bandwidth among the
different BS-RS-MS triplets in a disjoint manner⁴: each BS-RS-MS triplet is exclusively assigned a fraction
of the total bandwidth. This fraction may vary from UL to DL phases and also
within each of them, depending on whether the RS is active (relay-transmit subphase) or not
(relay-receive subphase). Whichever subphase we focus on, the snr of any link is given by
$$\mathsf{snr} = \frac{\ell\, G\, p}{N_0 F b}, \tag{4.1}$$
where $\ell$ is the path loss, $G$ is the antenna gain, $p$ is the transmit power, $N_0$ is the AWGN one-sided power
spectral density, $F$ is the noise factor, and $b$ is the bandwidth. Whereas the specific values of $p$ and
$b$ are subject to optimization by the BS and $N_0$ and $\ell$ are given, we distinguish between $F_{\mathrm{BS}}$
($G_{\mathrm{BS}}$), $F_{\mathrm{RS}}$ ($G_{\mathrm{RS}}$), and $F_{\mathrm{MS},m}$ ($G_{\mathrm{MS},m}$) to consider the general case of nodes equipped with RF
front-ends of diverse quality.
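The link snr expression (4.1) can be evaluated with a few lines of Python. This is only a sketch: the function name, the linear-scale units, and the numerical values of the example are assumptions chosen for illustration and are not taken from the thesis.

```python
import math


def link_snr(path_loss, antenna_gain, tx_power_w, n0_w_per_hz, noise_factor, bandwidth_hz):
    """Linear-scale snr of (4.1): path loss times gain times power over noise power."""
    return path_loss * antenna_gain * tx_power_w / (n0_w_per_hz * noise_factor * bandwidth_hz)


def to_db(value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(value)


# Illustrative numbers: -100 dB path loss, 10 dB antenna gain, 1 W transmit power,
# N0 = 4e-21 W/Hz, noise factor 5 (about 7 dB), and 1 MHz of bandwidth.
snr = link_snr(1e-10, 10.0, 1.0, 4e-21, 5.0, 1e6)
snr_db = to_db(snr)
snr_half_band = link_snr(1e-10, 10.0, 1.0, 4e-21, 5.0, 0.5e6)
```

For a fixed transmit power, halving the bandwidth doubles the snr, which is why power and bandwidth have to be allocated jointly.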
3
Consider a SISO point-to-point link for simplicity. When the transmit power is uniformly allocated over $N$
tones spanning some total bandwidth $B$, the per-tone snr is constant. If $\{h_i\}_{i=1}^{N}$ denote the fading states of each
tone (assumed i.i.d. and unknown), the achievable rate satisfies
$$r(t) \le \frac{B}{N} \sum_{i=1}^{N} \log_2\left(1 + \mathsf{snr}\,|h_i|^2\right) = \frac{1}{N} \sum_{i=1}^{N} B \log_2\left(1 + \mathsf{snr}\,|h_i|^2\right) \overset{(a)}{\longrightarrow} E\left\{B \log_2\left(1 + \mathsf{snr}\,|h|^2\right)\right\},$$
where (a) follows for large $N$ from the law of large numbers and convergence is in probability. For finite moderate
values of $N$, outage events are not precluded. Their impact on system design, however, is beyond the scope of this
thesis.
4
Inter-cell interference due to frequency reuse in neighboring cells is not considered in this work.
4.3
Relay-Assisted Transmission
4.3.1
Maximum instantaneous achievable rates
The use of RSs in our network setup has the advantage of realizing performance gains arising
from relay-assisted transmission. As the bandwidth is assigned orthogonally (disjointly) to each
BS-RS-MS triplet, intra-cell interference is completely nulled and it suffices to study one single
triplet to describe the overall behavior of the cell.
Since every RS operates in half-duplex mode, the relay is in the receive mode for a given time
duration (we call this period the relay-receive subphase), and in the
transmit mode for the rest (we call this period the relay-transmit subphase)⁵. To illustrate the
cooperation protocol, which is that of [HM05, Prop. 2], consider the specific BS-RS-MS triplet
shown in Figure 4.1, where the DL phase is described and the MS and RS indices are omitted for
simplicity. The matrices $\mathbf{H}_1 \in \mathbb{C}^{n_{\mathrm{MS}} \times n_{\mathrm{BS}}}$, $\mathbf{H}_2 \in \mathbb{C}^{n_{\mathrm{RS}} \times n_{\mathrm{BS}}}$, and $\mathbf{H}_3 \in \mathbb{C}^{n_{\mathrm{MS}} \times n_{\mathrm{RS}}}$ represent the
instantaneous fading states of each of the links at a given tone⁶.

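[Figure 4.1: diagram of one BS-RS-MS triplet. The BS-MS link is labeled $\ell_1$, $\mathbf{H}_1$; the BS-RS link is labeled $\ell_2$, $\mathbf{H}_2$; the RS-MS link is labeled $\ell_3$, $\mathbf{H}_3$. The transmissions are marked as belonging to the relay-rx subphase or to the relay-tx subphase.]
Figure 4.1: DL cooperation protocol: the DL phase is split into two subphases according to the
half-duplex nature of the RS.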
Although the details of the coding scheme can be found in [HM05, App. A], we provide
here a brief sketch of it for the sake of clarity. The BS splits its message into two independent
components: one which is transmitted directly to the MS without the help of the RS, and
another which is transmitted through the RS to the MS. During the relay-receive subphase,
of duration $\tau_1$, the BS transmits one codeword related to the latter message component using
some power $p_1$ while the RS and the MS listen. At the end of this subphase, the RS attempts
to decode this message component. If successful, a relay-transmit subphase of duration $\tau_2$ starts
where both the RS (which re-encodes the decoded message component) and the BS (which
now transmits a codeword associated with its other message component) transmit using powers
$p_3$ and $p_2$, respectively. Otherwise the RS remains silent during this subphase. The receiver
performs successive decoding: it first attempts to decode the relayed message component from
the signal of both subphases and, if successful, it subtracts the signal transmitted by the RS
in the second subphase prior to decoding the unrelayed message component. Assuming that
5
We shall also refer to the protocol subphases as subphase 1 (relay-receive) and subphase 2 (relay-transmit).
6
When UL cooperative transmission is considered, the instantaneous fading states can be described by using
the transposed matrices $\{\mathbf{H}_j^T\}_{j=1}^{3}$.
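communication takes place over bandwidths $B_1$ (relay-receive subphase) and $B_2$ (relay-transmit
subphase) and that uniform power allocation across antennas is performed (see [Lia07] for relay-assisted communication protocols where power allocation is performed assuming perfect CSI),
the achievable rate $r_{\mathrm{DL}}$ in [bit/s] satisfies [HM05]
$$r_{\mathrm{DL}} \le \min\{r_{\mathrm{DL}}^{(1)}, r_{\mathrm{DL}}^{(2)}\}, \tag{4.2}$$
where the min function models whether it is the source-relay link or the source-destination link that acts
as the information bottleneck for the relayed message component, and
$$r_{\mathrm{DL}}^{(1)} = \tau_1 B_1 E\left\{\log_2 \det\left(\mathbf{I}_{n_{\mathrm{RS}}} + \frac{\mathsf{snr}_2}{n_{\mathrm{BS}}} \mathbf{H}_2 \mathbf{H}_2^{\dagger}\right)\right\} + \tau_2 B_2 E\left\{\log_2 \det\left(\mathbf{I}_{n_{\mathrm{MS}}} + \frac{\mathsf{snr}_{1,2}}{n_{\mathrm{BS}}} \mathbf{H}_1 \mathbf{H}_1^{\dagger}\right)\right\} \tag{4.3}$$
$$r_{\mathrm{DL}}^{(2)} = \tau_1 B_1 E\left\{\log_2 \det\left(\mathbf{I}_{n_{\mathrm{MS}}} + \frac{\mathsf{snr}_{1,1}}{n_{\mathrm{BS}}} \mathbf{H}_1 \mathbf{H}_1^{\dagger}\right)\right\} \tag{4.4}$$
$$\qquad\quad + \tau_2 B_2 E\left\{\log_2 \det\left(\mathbf{I}_{n_{\mathrm{MS}}} + \frac{\mathsf{snr}_{1,2}}{n_{\mathrm{BS}}} \mathbf{H}_1 \mathbf{H}_1^{\dagger} + \frac{\mathsf{snr}_3}{n_{\mathrm{RS}}} \mathbf{H}_3 \mathbf{H}_3^{\dagger}\right)\right\}, \tag{4.5}$$
where the snrs amount to
$$\mathsf{snr}_{1,j} = \frac{\ell_1 G_{\mathrm{BS}}\, p_j}{N_0 F_{\mathrm{MS}} B_j}, \qquad \mathsf{snr}_2 = \frac{\ell_2 G_{\mathrm{BS}}\, p_1}{N_0 F_{\mathrm{RS}} B_1}, \qquad \mathsf{snr}_3 = \frac{\ell_3 G_{\mathrm{RS}}\, p_3}{N_0 F_{\mathrm{MS}} B_2}, \tag{4.6}$$
where $j = 1, 2$.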
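To make the structure of (4.2)-(4.5) concrete, the following sketch estimates both rate terms by Monte Carlo in the simplest special case: all nodes have a single antenna and all links undergo unit-variance Rayleigh fading, so that each squared channel gain is exponentially distributed with unit mean. The function names, the parameter names (tau1, tau2 stand for the durations of the relay-receive and relay-transmit subphases), the number of draws, and the snr values are assumptions made for illustration; the multi-antenna case would require a linear algebra library to evaluate the determinants.

```python
import math
import random


def mean_log2(scaled_gains):
    """Sample average of log2(1 + x) over a list of snr-scaled channel gains."""
    return sum(math.log2(1.0 + x) for x in scaled_gains) / len(scaled_gains)


def dl_rate_siso(tau1, tau2, b1_hz, b2_hz, snr11, snr12, snr2, snr3,
                 num_draws=20000, seed=0):
    """Monte Carlo estimate of the two terms of (4.2) for single-antenna nodes."""
    rng = random.Random(seed)
    g1 = [rng.expovariate(1.0) for _ in range(num_draws)]  # BS-MS squared gains
    g2 = [rng.expovariate(1.0) for _ in range(num_draws)]  # BS-RS squared gains
    g3 = [rng.expovariate(1.0) for _ in range(num_draws)]  # RS-MS squared gains
    # Term (4.3): the relay must decode (subphase 1) plus the direct component (subphase 2).
    r1 = (tau1 * b1_hz * mean_log2([snr2 * g for g in g2])
          + tau2 * b2_hz * mean_log2([snr12 * g for g in g1]))
    # Terms (4.4)-(4.5): what the MS collects over both subphases.
    r2 = (tau1 * b1_hz * mean_log2([snr11 * g for g in g1])
          + tau2 * b2_hz * mean_log2([snr12 * a + snr3 * c for a, c in zip(g1, g3)]))
    return r1, r2, min(r1, r2)


# Weak direct link (0 dB) and strong relay links (20 dB), equal subphase durations.
r1, r2, r_dl = dl_rate_siso(tau1=0.5, tau2=0.5, b1_hz=1e6, b2_hz=1e6,
                            snr11=1.0, snr12=1.0, snr2=100.0, snr3=100.0)
```

Setting snr3 to zero in the call silences the relay and reduces the second term to the rate of the direct link alone.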
While the success of decoding the relayed message component at the relay indeed impacts
on the success of decoding at the destination, the behavior of the destination is independent of
whether the relay was able to decode or not. The destination will attempt to decode first the
relayed component, perform successive interference cancellation, and go for the direct component
afterwards, no matter what happened to the relay. This makes the relay a transparent network
feature as seen by the MS, as no signalling between them is required whatsoever. In fact, as we
rely on the ergodic capacities to characterize performance (see Section 4.2), it can be assumed
that all the transmissions are reliable as long as their information rates lie below capacity.
Consequently, the performance (4.2) of the strategy [HM05] for the one-way relay channel with
half-duplex relay is such that the transmission rate of the relayed message component always
results in successful decoding at the relay.
It is important to remark that the upper bound (4.2) is only tight for Gaussian codes of infinite blocklength. When practical discrete alphabet codes of finite blocklength are used instead,
decoding errors at the RS and the MS cannot be disregarded at rates below the corresponding
ergodic capacities. However, expression (4.2) can still be used by introducing a penalizing gap $\Gamma$
such that $\mathsf{snr}_{\mathrm{practical}} = \mathsf{snr}/\Gamma$ in (4.3)-(4.5)⁷. We will hence use the gap from now on and omit
the subscript "practical" in snr for simplicity. When UL transmission is considered, an analogous
expression to (4.2) of the form $r_{\mathrm{UL}} \le \min\{r_{\mathrm{UL}}^{(1)}, r_{\mathrm{UL}}^{(2)}\}$ readily follows by exchanging the roles of
the MS and the BS and transposing the matrices $\{\mathbf{H}_j\}_{j=1}^{3}$.

7
The gap $\Gamma$ can be further increased to model the impact of inter-cell interference on final performance via snr
degradation.
In contrast to [Ng07], where relays explicitly switched between amplify-and-forward and
decode-and-forward depending on the achievable rates, the adopted cooperation protocol has
the advantage of comprising other well-known cooperation strategies as particular cases such that
the best one is implicitly selected when the resource allocation is optimized. While it mimics
the philosophy of protocol I of [Nab04] and transmit diversity [Och06], it can also accommodate
the following:
Protocol III [Nab04], simplified transmit diversity [Och06] - Set $p_1$ to be too small to enable
direct BS-MS reliable communication in the relay-receive subphase.
Protocol II [Nab04], receive diversity [Nab04] - Set $p_2 = 0$.
Multihop relaying [Och06, Doh04a, Cal07e] - Set $p_2 = 0$ and $p_1$ to be too small to enable
direct BS-MS reliable communication in the relay-receive subphase.
Direct transmission - Set $p_3 = 0$ and/or $\tau_2 = 0$ and/or $B_2 = 0$.
4.3.2
Universal concave lower bounds on the achievable rates
Transmission over multiple tones with uncorrelated fading makes the ergodic (or average) rates
show up in (4.3)-(4.5). They involve computing three MIMO channel ergodic capacities and
one distributed MIMO channel ergodic capacity (the $\tau_2$ term in (4.5)). After averaging over
the fading distribution, i.e., the distribution of the matrices $\{\mathbf{H}_j\}_{j=1}^{3}$, the resulting expectations
depend only on the link snrs and the $\tau_j B_j$ products, and admit closed form expressions for both
the MIMO [Shi03, Big04] and the distributed MIMO [Kie05] channel in case of Rayleigh fading.
However, analytical expressions cannot be derived for other fading distributions like Ricean,
which are common in the BS-RS link and include line-of-sight (LOS) components. On top of that,
equations (4.3)-(4.5) are not concave functions of the duration of the subphases, the allocated
bandwidths, and the transmit powers. This prevents efficient methods from being applied for rate
allocation in global optimization approaches.
Alternatively, we develop universal, simpler concave lower bounds of $r_{\mathrm{DL}}^{(1)}$ and $r_{\mathrm{DL}}^{(2)}$ that ease
prospective optimization methods and allow for an easy concavity test. Here, by universal
we mean that parametric lower bounds with the same structure can be applied to any fading
distribution by changing the parameter values, and not that the same expression holds for all of
them. Other parametric approaches have been taken to approximate MIMO ergodic capacities
[Doh05] but, contrary to our needs, concavity with respect to durations, bandwidths, and
powers was not guaranteed, parameter values were not systematically found (i.e., curve fitting
was performed), and the distributed MIMO case was not tackled. To start with, consider the
following results upon which our lower bounds are based. Their concavity analysis will be left
to the next section.
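Lemma 4.1 A lower bound to the ergodic capacity of an $n_t \times n_r$ MIMO channel is
$$E\left\{\log_2 \det\left(\mathbf{I}_{n_r} + \frac{\mathsf{snr}}{n_t} \mathbf{H}\mathbf{H}^{\dagger}\right)\right\} \ge \sum_{i=1}^{n_r} \log_2\left(1 + \phi_i(f_{\mathbf{H}})\,\mathsf{snr}/n_t\right), \tag{4.7}$$
where $\phi_i(f_{\mathbf{H}}) \triangleq \exp\left(E\{\log \lambda_i(\mathbf{H}\mathbf{H}^{\dagger})\}\right)$ and $f_{\mathbf{H}}(\cdot)$ denotes the pdf of the channel matrix $\mathbf{H}$⁸.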
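Proof. Proceeding as in [Big04, App. E.1], we start from the expression that relates the ergodic
capacity with the ordered eigenvalues of $\mathbf{H}\mathbf{H}^{\dagger}$ to obtain
$$E\left\{\log_2 \det\left(\mathbf{I}_{n_r} + \frac{\mathsf{snr}}{n_t} \mathbf{H}\mathbf{H}^{\dagger}\right)\right\} = \sum_{i=1}^{n_r} E\left\{\log_2\left(1 + \lambda_i(\mathbf{H}\mathbf{H}^{\dagger})\,\mathsf{snr}/n_t\right)\right\} \tag{4.8}$$
$$= \sum_{i=1}^{n_r} E\left\{\log_2\left(1 + \exp\left(\log \lambda_i(\mathbf{H}\mathbf{H}^{\dagger})\right)\mathsf{snr}/n_t\right)\right\} \tag{4.9}$$
$$\ge \sum_{i=1}^{n_r} \log_2\left(1 + \exp\left(E\{\log \lambda_i(\mathbf{H}\mathbf{H}^{\dagger})\}\right)\mathsf{snr}/n_t\right), \tag{4.10}$$
where (4.10) follows from Jensen's inequality and the convexity of the function $I(x) = \log_2(a +
b \exp(x))$ for all $a, b \ge 0$.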
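The Jensen step of the proof can be checked numerically in a simple case. The sketch below assumes Rayleigh fading with several transmit antennas and a single receive antenna, so that the only nonzero eigenvalue is a Gamma-distributed squared norm; the function name, the sample size, and the seed are illustrative assumptions. Because Jensen's inequality also holds for the empirical distribution of the samples, the bound computed from the samples never exceeds the sample-average capacity.

```python
import math
import random


def jensen_check(n_t, snr, num_draws=50000, seed=1):
    """Return (capacity estimate, lower bound) in bit/s/Hz for an n_t x 1 Rayleigh channel."""
    rng = random.Random(seed)
    # With n_t i.i.d. unit-variance complex Gaussian entries, the squared norm is Gamma(n_t, 1).
    gains = [rng.gammavariate(n_t, 1.0) for _ in range(num_draws)]
    capacity = sum(math.log2(1.0 + g * snr / n_t) for g in gains) / num_draws
    # Coefficient of the lemma: exponential of the average logarithm of the eigenvalue.
    coefficient = math.exp(sum(math.log(g) for g in gains) / num_draws)
    bound = math.log2(1.0 + coefficient * snr / n_t)
    return capacity, bound


results = {snr: jensen_check(n_t=2, snr=snr) for snr in (1.0, 10.0, 100.0)}
```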
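Lemma 4.2 A lower bound to the ergodic capacity of an $n_{t,1} \times n_r$ and $n_{t,2} \times n_r$ distributed
MIMO channel is
$$E\left\{\log_2 \det\left(\mathbf{I}_{n_r} + \frac{\mathsf{snr}_1}{n_{t,1}} \mathbf{H}_1\mathbf{H}_1^{\dagger} + \frac{\mathsf{snr}_2}{n_{t,2}} \mathbf{H}_2\mathbf{H}_2^{\dagger}\right)\right\} \ge \sum_{i=1}^{n_r} \log_2\left(1 + \phi_i(f_{\mathbf{H}_1})\,\mathsf{snr}_1/n_{t,1} + \phi_i(f_{\mathbf{H}_2})\,\mathsf{snr}_2/n_{t,2}\right). \tag{4.11}$$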
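Proof. Since $1 + \lambda_i(\mathbf{H}_1\mathbf{H}_1^{\dagger})\,\mathsf{snr}_1/n_{t,1} + \lambda_i(\mathbf{H}_2\mathbf{H}_2^{\dagger})\,\mathsf{snr}_2/n_{t,2} \ge 0$ for $1 \le i \le n_r$ and both $\mathbf{H}_1\mathbf{H}_1^{\dagger}$
and $\mathbf{H}_2\mathbf{H}_2^{\dagger}$ are Hermitian matrices, it follows from [Fie71] that
$$E\left\{\log_2 \det\left(\mathbf{I}_{n_r} + \frac{\mathsf{snr}_1}{n_{t,1}} \mathbf{H}_1\mathbf{H}_1^{\dagger} + \frac{\mathsf{snr}_2}{n_{t,2}} \mathbf{H}_2\mathbf{H}_2^{\dagger}\right)\right\} \ge \sum_{i=1}^{n_r} E\left\{\log_2\left(1 + \lambda_i(\mathbf{H}_1\mathbf{H}_1^{\dagger})\,\mathsf{snr}_1/n_{t,1} + \lambda_i(\mathbf{H}_2\mathbf{H}_2^{\dagger})\,\mathsf{snr}_2/n_{t,2}\right)\right\}. \tag{4.12}$$
The lemma follows by similarly applying Jensen's inequality, resorting twice to the function $I(x)$.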

Lemmas 4.1 and 4.2 lower bound the MIMO channel capacities with expressions that mimic
equivalent transmissions through virtual parallel AWGN channels of gains $\phi_i(\cdot)$, which depend
on the antenna configuration and the fading distribution, and whose tightness is analyzed in
Figures 4.2 and 4.3. For the MIMO channel, the lower bound of Lemma 4.1 is extremely tight. The
tightness of the lower bound of Lemma 4.2 with respect to the distributed MIMO channel capacity,
however, depends on the snr.
8
Note that since $\mathrm{rank}\{\mathbf{H}\mathbf{H}^{\dagger}\} \le \min\{n_t, n_r\}$, $\phi_i(f_{\mathbf{H}}) = 0$ for $\min\{n_t, n_r\} < i \le n_r$.

[Figure 4.2: plot of spectral efficiency in bit/s/Hz versus snr in dB, with curves for the 4x4, 2x2, and 2x1 antenna configurations.]
Figure 4.2: Exact ergodic capacity (solid lines) and Lemma 4.1 lower bound (dashed lines) vs snr
for different antenna configurations and Rayleigh fading.
[Figure 4.3: plot of spectral efficiency in bit/s/Hz versus $\mathsf{snr}_1$ in dB, with curves for $\mathsf{snr}_2 = 0$ dB, $\mathsf{snr}_2 = 10$ dB, and $\mathsf{snr}_2 = 20$ dB.]
Figure 4.3: Exact ergodic capacity (solid lines) and Lemma 4.2 lower bound (dashed lines) vs $\mathsf{snr}_1$
for different values of $\mathsf{snr}_2$ and Rayleigh fading. The antenna configuration is $n_r = n_{t,1} = n_{t,2} = 2$.
The computation of the channel-dependent coefficients $\phi_i(\cdot)$ can be accurately performed
offline by using Monte Carlo methods. However, for Rayleigh fading and an $n \times 1$ or $1 \times n$
antenna configuration, results on the expectation of the logarithm of a chi-square random
variable [Lap03] can be applied to show that
$$\phi_i(n) = e^{-\gamma + \sum_{j=1}^{n-1} \frac{1}{j}}\; \delta[i - 1], \tag{4.13}$$
where $\gamma \approx 0.577$ is the Euler-Mascheroni constant [Gra00, 4.352-1]. In any case, once the
channel-dependent coefficients are computed, Lemmas 4.1 and 4.2 allow us to state the main result
of this section.
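The two ways of obtaining the channel-dependent coefficients mentioned above can be compared with a short Python sketch for an n x 1 (or 1 x n) Rayleigh channel, where the only nonzero eigenvalue is a Gamma-distributed squared norm. The function names, the sample size, and the seed are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def coeff_closed_form(n):
    """First coefficient of (4.13): exp(-gamma + sum_{j=1}^{n-1} 1/j)."""
    return math.exp(-EULER_GAMMA + sum(1.0 / j for j in range(1, n)))


def coeff_monte_carlo(n, num_draws=100000, seed=2):
    """Offline Monte Carlo estimate: exponential of the average log of the squared norm."""
    rng = random.Random(seed)
    mean_log = sum(math.log(rng.gammavariate(n, 1.0)) for _ in range(num_draws)) / num_draws
    return math.exp(mean_log)


comparison = {n: (coeff_closed_form(n), coeff_monte_carlo(n)) for n in (1, 2, 4)}
```

For other fading distributions, such as Ricean, only the Monte Carlo route is available, with the sampling line replaced by draws from the corresponding eigenvalue distribution.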
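Corollary 4.1 A lower bound on the maximum DL achievable rates of the adopted relay-assisted
transmission protocol is
$$r_{\mathrm{DL}} \le \min\{\underline{r}_{\mathrm{DL}}^{(1)}, \underline{r}_{\mathrm{DL}}^{(2)}\}, \tag{4.14}$$
where
$$\underline{r}_{\mathrm{DL}}^{(1)} = \tau_1 B_1 \sum_{i=1}^{n_{\mathrm{RS}}} \log_2\left(1 + \phi_i(f_{\mathbf{H}_2}) \frac{\mathsf{snr}_2}{n_{\mathrm{BS}}}\right) + \tau_2 B_2 \sum_{i=1}^{n_{\mathrm{MS}}} \log_2\left(1 + \phi_i(f_{\mathbf{H}_1}) \frac{\mathsf{snr}_{1,2}}{n_{\mathrm{BS}}}\right) \le r_{\mathrm{DL}}^{(1)} \tag{4.15}$$
$$\underline{r}_{\mathrm{DL}}^{(2)} = \tau_1 B_1 \sum_{i=1}^{n_{\mathrm{MS}}} \log_2\left(1 + \phi_i(f_{\mathbf{H}_1}) \frac{\mathsf{snr}_{1,1}}{n_{\mathrm{BS}}}\right) + \tau_2 B_2 \sum_{i=1}^{n_{\mathrm{MS}}} \log_2\left(1 + \phi_i(f_{\mathbf{H}_1}) \frac{\mathsf{snr}_{1,2}}{n_{\mathrm{BS}}} + \phi_i(f_{\mathbf{H}_3}) \frac{\mathsf{snr}_3}{n_{\mathrm{RS}}}\right) \tag{4.16}$$
$$\le r_{\mathrm{DL}}^{(2)}. \tag{4.17}$$
A similar lower bound on the maximum UL achievable rates holds by exchanging the roles of the
BS and the MS and transposing $\{\mathbf{H}_j\}_{j=1}^{3}$.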
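Once the channel-dependent coefficients are available, the bound of Corollary 4.1 reduces to sums of logarithms. The following sketch evaluates it; the function signature, the argument names (tau1, tau2 for the subphase durations, coeff_h1, coeff_h2, coeff_h3 for the coefficient lists of the BS-MS, BS-RS, and RS-MS links), and the example values are assumptions made for illustration.

```python
import math


def dl_rate_lower_bound(tau1, tau2, b1_hz, b2_hz, snr11, snr12, snr2, snr3,
                        coeff_h1, coeff_h2, coeff_h3, n_bs, n_rs):
    """Return (bound, term1, term2) following the structure of (4.14)-(4.17).

    coeff_h1 and coeff_h3 hold one coefficient per MS antenna; coeff_h2 one per RS antenna.
    """
    relay_decoding = sum(math.log2(1.0 + c * snr2 / n_bs) for c in coeff_h2)
    direct_sub2 = sum(math.log2(1.0 + c * snr12 / n_bs) for c in coeff_h1)
    direct_sub1 = sum(math.log2(1.0 + c * snr11 / n_bs) for c in coeff_h1)
    combined_sub2 = sum(math.log2(1.0 + c1 * snr12 / n_bs + c3 * snr3 / n_rs)
                        for c1, c3 in zip(coeff_h1, coeff_h3))
    term1 = tau1 * b1_hz * relay_decoding + tau2 * b2_hz * direct_sub2
    term2 = tau1 * b1_hz * direct_sub1 + tau2 * b2_hz * combined_sub2
    return min(term1, term2), term1, term2


# Toy single-antenna example with unit coefficients, unit bandwidths, and unit snrs.
bound, term1, term2 = dl_rate_lower_bound(0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
                                          [1.0], [1.0], [1.0], n_bs=1, n_rs=1)
```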
4.4
Achievable Instantaneous Rates
Given the CSI at the beginning of the $t$-th frame, $\{\boldsymbol{\ell}_j(t)\}_{j=1}^{3}$, the instantaneous performance of
the network is given by the DL and UL achievable rate regions, i.e., the set of all rate vectors
$\mathbf{r}_{\mathrm{DL}}(t), \mathbf{r}_{\mathrm{UL}}(t) \in \mathbb{R}_+^{M \times 1}$ that can be sustained during one frame duration. The achievable rates
depend upon the frame format as described by the vector of fractional durations $\boldsymbol{\tau} \in \mathbb{R}_+^{4}$,
$\mathbf{1}_4^T \boldsymbol{\tau} = 1$, whose components account for the DL subphase 1 ($\tau_1$), DL subphase 2 ($\tau_2$), UL
subphase 1 ($\tau_3$), and UL subphase 2 ($\tau_4$). Each subphase duration shapes and couples the
instantaneous achievable rate regions, denoted by $\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau})$ (DL) and $\mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau})$ (UL), and will
be subject to optimization later on, when rate allocation policies come into play in the next
section. In this section, however, we shall focus on the dependence of the achievable rates on
the disjoint allocations of power among transmitters and bandwidth among BS-RS-MS triplets.
4.4.1
DL instantaneous achievable rate region
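Assuming that the duration of the DL subphases is fixed to $\tau_1 T$ and $\tau_2 T$, the instantaneous achievable rates depend upon the allocation of bandwidth and transmit power among
the $M$ competing flows. Let us describe the DL resource allocation by using the vectors
$\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \mathbf{b}_1, \mathbf{b}_2 \in \mathbb{R}_+^{M \times 1}$, which represent the fractional BS power allocation in subphases 1
and 2, the fractional RSs power allocation in subphase 2 ($p_{3,m}$ is the fraction of power transmitted by the $\mathrm{RS}(m)$-th RS in assisting the $m$-th MS), and the fractional bandwidth allocation in
subphases 1 and 2, respectively. By imposing non-negativity on each fraction and constraining
the sum of resources it follows that
$$\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \mathbf{b}_1, \mathbf{b}_2 \succeq \mathbf{0}_M, \tag{4.18}$$
$$\mathbf{1}_M^T \mathbf{p}_j \le 1, \qquad \mathbf{L}\mathbf{p}_3 \preceq \mathbf{1}_R, \qquad \mathbf{1}_M^T \mathbf{b}_j \le 1, \tag{4.19}$$
where $j = 1, 2$ and $\mathbf{L}$ is the connectivity matrix defined in Section 4.2. Thus, applying Corollary
4.1 the DL achievable rate in [bit/s] of the $m$-th user, $r_{\mathrm{DL},m}(t)$, satisfies
$$r_{\mathrm{DL},m}(t) \le \min\{r_{\mathrm{DL},m}^{(1)}(t), r_{\mathrm{DL},m}^{(2)}(t)\}, \tag{4.20}$$
where
$$r_{\mathrm{DL},m}^{(1)}(t) = B \tau_1 b_{1,m} \sum_{i=1}^{n_{\mathrm{RS}}} \log_2\left(1 + c_{2,m}^{i}(t) \frac{p_{1,m}}{b_{1,m}}\right) + B \tau_2 b_{2,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + c_{1,m}^{i}(t) \frac{p_{2,m}}{b_{2,m}}\right) \tag{4.21}$$
$$r_{\mathrm{DL},m}^{(2)}(t) = B \tau_1 b_{1,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + c_{1,m}^{i}(t) \frac{p_{1,m}}{b_{1,m}}\right) \tag{4.22}$$
$$\qquad\quad + B \tau_2 b_{2,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + \frac{c_{1,m}^{i}(t)\, p_{2,m} + c_{3,m}^{i}(t)\, p_{3,m}}{b_{2,m}}\right) \tag{4.23}$$
condense CSI into the equivalent channel gains
$$c_{1,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{1,m}})\, \ell_{1,m}(t)\, G_{\mathrm{BS}}\, p_{\mathrm{BS}}^{\max}}{n_{\mathrm{BS}}\, \Gamma N_0 F_{\mathrm{MS}} B} \tag{4.24}$$
$$c_{2,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{2,\mathrm{RS}(m)}})\, \ell_{2,\mathrm{RS}(m)}(t)\, G_{\mathrm{BS}}\, p_{\mathrm{BS}}^{\max}}{n_{\mathrm{BS}}\, \Gamma N_0 F_{\mathrm{RS}} B} \tag{4.25}$$
$$c_{3,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{3,m}})\, \ell_{3,m}(t)\, G_{\mathrm{RS}}\, p_{\mathrm{RS}}^{\max}}{n_{\mathrm{RS}}\, \Gamma N_0 F_{\mathrm{MS}} B}. \tag{4.26}$$
Following the notation of Section 4.3, we have used $f_{\mathbf{H}_{1,m}}$ to denote the DL fading distribution
between the BS and the $m$-th MS, $f_{\mathbf{H}_{2,\mathrm{RS}(m)}}$ for the DL fading distribution between the BS and
the serving RS of the $m$-th MS, and $f_{\mathbf{H}_{3,m}}$ for the DL fading distribution between the $m$-th MS
and its serving RS. The DL achievable rate region is hence given by
$$\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau}) = \bigcup \left\{\mathbf{0}_M \preceq \mathbf{r}_{\mathrm{DL}}(t) \preceq \min\{\mathbf{r}_{\mathrm{DL}}^{(1)}(t), \mathbf{r}_{\mathrm{DL}}^{(2)}(t)\}\right\}, \tag{4.27}$$
where the union is taken over the allocations satisfying (4.18)-(4.19).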
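The feasible set (4.18)-(4.19) can be expressed as a simple membership test. The sketch below is illustrative only: the function name, the numerical tolerance, and the three-user, two-relay example are assumptions, and allocations are passed as plain Python lists.

```python
def is_feasible_dl(p1, p2, p3, b1, b2, connectivity, tol=1e-9):
    """Check the DL constraints: non-negativity, BS power and bandwidth budgets per
    subphase, and one power budget per RS (rows of the 0/1 connectivity matrix)."""
    if any(x < -tol for vector in (p1, p2, p3, b1, b2) for x in vector):
        return False
    if sum(p1) > 1.0 + tol or sum(p2) > 1.0 + tol:
        return False
    if sum(b1) > 1.0 + tol or sum(b2) > 1.0 + tol:
        return False
    for row in connectivity:
        # Total power fraction requested from this RS by the MSs attached to it.
        if sum(entry * x for entry, x in zip(row, p3)) > 1.0 + tol:
            return False
    return True


# Hypothetical cell: MSs 0 and 2 share RS 0, MS 1 uses RS 1.
connectivity = [[1, 0, 1], [0, 1, 0]]
feasible = is_feasible_dl([0.5, 0.3, 0.2], [0.2, 0.2, 0.2], [0.6, 1.0, 0.4],
                          [0.25, 0.5, 0.25], [0.5, 0.25, 0.25], connectivity)
overloaded = is_feasible_dl([0.5, 0.3, 0.2], [0.2, 0.2, 0.2], [0.7, 1.0, 0.4],
                            [0.25, 0.5, 0.25], [0.5, 0.25, 0.25], connectivity)
```

In the second call RS 0 is asked for 110% of its power, so the allocation is rejected.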

Lemma 4.3 The DL instantaneous achievable rate region $\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau})$ is convex.
Proof. For fixed $\tau_1$, $\tau_2$, some properties of convex functions [Boy04] can be used to show that
the right hand side of (4.20) is concave: the minimum of concave functions is concave, and the
concavity of (4.21)-(4.23) can be shown resorting to the function $G(x, y) = a x \log(1 + b y / x)$,
which is concave in $x, y \ge 0$ for all $a, b \ge 0$. This implies convexity of $\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau})$ for fixed $\boldsymbol{\tau}$ [Boy04],
which will turn out to be useful in ensuring global optimality in rate allocation problems. This
desirable property, which follows from the use of the universal lower bounds derived in Section 4.3.2,
vanishes when the frame format (here in the form of the relative durations $\tau_1$, $\tau_2$) is subject to
optimization too. However, this can be circumvented with the following variable change
$$\mathbf{q}_j \triangleq \tau_j \mathbf{p}_j, \qquad \mathbf{q}_3 \triangleq \tau_2 \mathbf{p}_3, \qquad \mathbf{w}_j \triangleq \tau_j \mathbf{b}_j, \tag{4.28}$$
where $j = 1, 2$. This variable change gives rise to a new set of allocation variables, in terms of
which (4.21)-(4.23) become
$$r_{\mathrm{DL},m}^{(1)}(t) = B \left[ w_{1,m} \sum_{i=1}^{n_{\mathrm{RS}}} \log_2\left(1 + c_{2,m}^{i}(t) \frac{q_{1,m}}{w_{1,m}}\right) + w_{2,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + c_{1,m}^{i}(t) \frac{q_{2,m}}{w_{2,m}}\right) \right] \tag{4.29}$$
$$r_{\mathrm{DL},m}^{(2)}(t) = B \left[ w_{1,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + c_{1,m}^{i}(t) \frac{q_{1,m}}{w_{1,m}}\right) + w_{2,m} \sum_{i=1}^{n_{\mathrm{MS},m}} \log_2\left(1 + \frac{c_{1,m}^{i}(t)\, q_{2,m} + c_{3,m}^{i}(t)\, q_{3,m}}{w_{2,m}}\right) \right], \tag{4.30}$$
both of them concave functions regardless of $\tau_1$, $\tau_2$. The new set of feasible allocations transforms
accordingly into
$$\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \mathbf{w}_1, \mathbf{w}_2 \succeq \mathbf{0}_M, \tag{4.31}$$
$$\mathbf{1}_M^T \mathbf{q}_j \le \tau_j, \qquad \mathbf{L}\mathbf{q}_3 \preceq \tau_2 \mathbf{1}_R, \qquad \mathbf{1}_M^T \mathbf{w}_j \le \tau_j, \tag{4.32}$$
where $j = 1, 2$ again. Formulated in terms of the new allocation variables, the region $\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau})$
can be equivalently obtained by taking the union in (4.27) over (4.31)-(4.32), where (4.29)-(4.30) are used instead of (4.21)-(4.23). This way, the convexity of $\mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau})$ with respect to
$\boldsymbol{\tau}$ is unveiled: (4.29)-(4.30) are concave and the feasible set (4.31)-(4.32) is the intersection of
halfspaces and hence convex, something that was hidden with the original allocation variables.
Needless to say, the variable change (4.28) can be straightforwardly reversed to obtain the
allocated fractions of bandwidth and power.
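The effect of the variable change can be verified numerically on a single rate term: scaling the power and bandwidth fractions by the subphase duration leaves the rate unchanged and can be undone by a division. The sketch below uses illustrative names (tau for the subphase duration, c for an equivalent channel gain) and arbitrary example values, all of which are assumptions; the zero-bandwidth case uses the usual continuous extension of the term to zero.

```python
import math


def rate_term(prefactor, gain, power, bandwidth):
    """prefactor * bandwidth * log2(1 + gain * power / bandwidth), zero if no bandwidth."""
    if bandwidth == 0.0:
        return 0.0
    return prefactor * bandwidth * math.log2(1.0 + gain * power / bandwidth)


def to_scaled(tau, power_fraction, band_fraction):
    """Forward variable change: duration-weighted power and bandwidth."""
    return tau * power_fraction, tau * band_fraction


def from_scaled(tau, q, w):
    """Reverse variable change (requires a nonzero subphase duration)."""
    return q / tau, w / tau


total_band_hz, tau, p, b, c = 1e6, 0.4, 0.3, 0.25, 50.0
original = rate_term(total_band_hz * tau, c, p, b)   # form with explicit duration
q, w = to_scaled(tau, p, b)
scaled = rate_term(total_band_hz, c, q, w)           # form after the variable change
p_back, b_back = from_scaled(tau, q, w)
```

Both forms give the same rate because the ratio of power to bandwidth is unaffected by the common scaling, while the duration is absorbed into the bandwidth-like variable.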
4.4.2
UL instantaneous achievable rate region
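Proceeding similarly, if the relative duration of the subphases is fixed to $\tau_3 T$ and $\tau_4 T$, the UL
resource allocation can be characterized in terms of the vectors $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \mathbf{b}_1, \mathbf{b}_2 \in \mathbb{R}_+^{M}$. While
the meaning of $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{p}_3$ is identical, $\mathbf{p}_1$ and $\mathbf{p}_2$ refer now to the fractional MSs transmit
power in subphases 1 and 2. Thus, the feasible set of UL resource allocations is
$$\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \mathbf{b}_1, \mathbf{b}_2 \succeq \mathbf{0}_M, \tag{4.33}$$
$$\mathbf{p}_j \preceq \mathbf{1}_M, \qquad \mathbf{L}\mathbf{p}_3 \preceq \mathbf{1}_R, \qquad \mathbf{1}_M^T \mathbf{b}_j \le 1, \tag{4.34}$$
where $j = 1, 2$. The application of Corollary 4.1 to the UL achievable rate in [bit/s] of the $m$-th
user implies that $r_{\mathrm{UL},m}(t)$ satisfies
$$r_{\mathrm{UL},m}(t) \le \min\{r_{\mathrm{UL},m}^{(1)}(t), r_{\mathrm{UL},m}^{(2)}(t)\}, \tag{4.35}$$
where
$$r_{\mathrm{UL},m}^{(1)}(t) = B \tau_3 b_{1,m} \sum_{i=1}^{n_{\mathrm{RS}}} \log_2\left(1 + d_{3,m}^{i}(t) \frac{p_{1,m}}{b_{1,m}}\right) + B \tau_4 b_{2,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + d_{1,m}^{i}(t) \frac{p_{2,m}}{b_{2,m}}\right) \tag{4.36}$$
$$r_{\mathrm{UL},m}^{(2)}(t) = B \tau_3 b_{1,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + d_{1,m}^{i}(t) \frac{p_{1,m}}{b_{1,m}}\right) \tag{4.37}$$
$$\qquad\quad + B \tau_4 b_{2,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + \frac{d_{1,m}^{i}(t)\, p_{2,m} + d_{2,m}^{i}(t)\, p_{3,m}}{b_{2,m}}\right) \tag{4.38}$$
use the equivalent channel gains
$$d_{1,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{1,m}^T})\, \ell_{1,m}(t)\, G_{\mathrm{MS}}\, p_{\mathrm{MS}}^{\max}}{n_{\mathrm{MS}}\, \Gamma N_0 F_{\mathrm{BS}} B} \tag{4.39}$$
$$d_{2,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{2,\mathrm{RS}(m)}^T})\, \ell_{2,\mathrm{RS}(m)}(t)\, G_{\mathrm{RS}}\, p_{\mathrm{RS}}^{\max}}{n_{\mathrm{RS}}\, \Gamma N_0 F_{\mathrm{BS}} B} \tag{4.40}$$
$$d_{3,m}^{i}(t) = \frac{\phi_i(f_{\mathbf{H}_{3,m}^T})\, \ell_{3,m}(t)\, G_{\mathrm{MS}}\, p_{\mathrm{MS}}^{\max}}{n_{\mathrm{MS}}\, \Gamma N_0 F_{\mathrm{RS}} B}. \tag{4.41}$$
Note that by transposing the DL fading state matrices, the fading distributions account for UL
transmission. The UL achievable rate region can finally be expressed as
$$\mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau}) = \bigcup \left\{\mathbf{0}_M \preceq \mathbf{r}_{\mathrm{UL}}(t) \preceq \min\{\mathbf{r}_{\mathrm{UL}}^{(1)}(t), \mathbf{r}_{\mathrm{UL}}^{(2)}(t)\}\right\}, \tag{4.42}$$
where the union is over (4.33)-(4.34).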

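Lemma 4.4 The UL instantaneous achievable rate region $\mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau})$ is convex.
Proof. As happened in the DL, the achievable rate region $\mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau})$ is convex when the frame
format is fixed, but not when it becomes an optimization variable. To avoid this handicap,
the same variable change as in (4.28) is proposed, with the UL subphase durations $\tau_3$ and $\tau_4$ playing the roles of $\tau_1$ and $\tau_2$. This leads to the concave expressions
$$r_{\mathrm{UL},m}^{(1)}(t) = B \left[ w_{1,m} \sum_{i=1}^{n_{\mathrm{RS}}} \log_2\left(1 + d_{3,m}^{i}(t) \frac{q_{1,m}}{w_{1,m}}\right) + w_{2,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + d_{1,m}^{i}(t) \frac{q_{2,m}}{w_{2,m}}\right) \right] \tag{4.43}$$
$$r_{\mathrm{UL},m}^{(2)}(t) = B \left[ w_{1,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + d_{1,m}^{i}(t) \frac{q_{1,m}}{w_{1,m}}\right) + w_{2,m} \sum_{i=1}^{n_{\mathrm{BS}}} \log_2\left(1 + \frac{d_{1,m}^{i}(t)\, q_{2,m} + d_{2,m}^{i}(t)\, q_{3,m}}{w_{2,m}}\right) \right] \tag{4.44}$$
and the new feasible set
$$\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \mathbf{w}_1, \mathbf{w}_2 \succeq \mathbf{0}_M, \tag{4.45}$$
$$\mathbf{q}_j \preceq \tau_{j+2} \mathbf{1}_M, \qquad \mathbf{L}\mathbf{q}_3 \preceq \tau_4 \mathbf{1}_R, \qquad \mathbf{1}_M^T \mathbf{w}_j \le \tau_{j+2}, \tag{4.46}$$
where $j = 1, 2$. Thus, a convex representation of $\mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau})$ follows by replacing (4.36)-(4.38) by
(4.43)-(4.44) in (4.42) and performing the union over the allocations satisfying (4.45)-(4.46).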
4.5
Maximum Network Utility Rate Allocation Policies
In a cellular network handling users of services with different QoS requirements, some users
might exhibit large sensitivity to transmission delays while others might only pay attention to
their experienced long-term throughput. Still other key performance indicators may play a
role, such as transmit buffer overflow probability or energy consumption; even users of the
same service, depending on their contract terms, can ask for differentiated QoS requirements.

This poses a challenging problem: upon collection of CSI at the beginning of the $t$-th frame,
the BS must decide by whom, when, and at which rate any information will be transmitted or received until the next CSI refresh arrives. Assuming that CSI updates are received periodically every $D$ frames, user scheduling and resource allocation need to be jointly optimized for
UL and DL transmission during the frames $\{t, t+1, \ldots, t+D-1\}$. On the one hand, flows of different
nature may require completely different management policies; on the other, the network
operation should seamlessly reconfigure as the scenario (users, services, QoS requirements, etc.)
varies with time.
We address this problem by using utility functions [Kel98, Ng07] that evaluate each user's
satisfaction given the achieved throughput as compared to its requirements. By properly characterizing QoS requirements with the dependency of the utility on the throughput for each service
under operation, the different nature of the served flows is incorporated into the BS arbitration,
which will use the utility function of each user to allocate resources and perform
scheduling decisions.
4.5.1
User utility functions
Utility functions were first used in [Kel98] to introduce the proportional fair criterion in
resource allocation problems, and allow us to describe the satisfaction of one user given
its served throughput. Although many schemes implicitly assume utilities proportional
to throughputs [Yu06, Bae06, Cal07e], we shall adopt here a more general approach as in
[Lee06, Kel98, Lin06, Pal07, Xue06, Ng07]. We define a user utility function $U(R)$ as a concave
function of the (long-term) throughput $R$, which is computed using the exponentially weighted
smoothing
$$R(t + 1) = \beta R(t) + (1 - \beta)\, r(t), \tag{4.47}$$
where $R(t + 1)$ stands for the throughput as seen at the beginning of the $(t + 1)$-th frame, $\beta$ represents the smoothing memory, and $r(t)$ is the instantaneous rate achieved in the $t$-th frame. If
for any reason the user satisfaction profile of a service cannot be described using a concave function (e.g., an S-type curve), finding efficient methods for network utility maximization is
compromised. Fortunately, a plethora of common services comply with the concavity constraint.
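The exponentially weighted smoothing of (4.47) amounts to a one-line update per frame. The sketch below is a minimal illustration; the function names, the memory value 0.9, and the rates are assumptions chosen for the example.

```python
def update_throughput(throughput, inst_rate, memory):
    """One step of the smoothing: weight the past by `memory` and the new rate by 1 - memory."""
    return memory * throughput + (1.0 - memory) * inst_rate


def throughput_after_idle(throughput, memory, idle_frames):
    """Throughput seen after a number of consecutive frames with zero instantaneous rate."""
    for _ in range(idle_frames):
        throughput = update_throughput(throughput, 0.0, memory)
    return throughput


served = update_throughput(1.0e6, 2.0e6, memory=0.9)      # user served at 2 Mbit/s
starved = throughput_after_idle(1.0e6, memory=0.9, idle_frames=3)
```

An idle user sees its throughput decay geometrically with the smoothing memory, a property that the delay-sensitive utility of Section 4.5.1.1 exploits.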
4.5.1.1
QoS-oriented utility functions
We say a utility function is QoS-oriented whenever the QoS requirements of the service appear
explicitly in its expression. Although we are not constrained to it for operational reasons,
we focus on utility functions upper-bounded by 1. This way, we set the same maximum user
satisfaction level as a reference for all the services under operation in the network. Otherwise,
the use of unbounded utilities for the different services might cause the BS to bias its attention
towards services with favored utility scales. The following examples show that this is not a
major impairment to describe the satisfaction profile of services of different nature.
Example 1: Best-effort data service
The user satisfaction of a best-effort data service (e.g., ftp, http) without any data latency
or other QoS constraints than achieving the largest possible throughput can be modeled with
the utility function
$$U(R) = 1 - e^{\log(1 - U_0) \frac{R}{R_0}} = 1 - (1 - U_0)^{R/R_0}. \tag{4.48}$$
This utility is parameterized by the satisfaction level $0 < U_0 < 1$ achieved when the throughput
is $R_0$.
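A direct Python transcription of the best-effort utility (4.48) is given below as a sketch; the function and argument names and the example values (a satisfaction level of 0.9 at 1 Mbit/s) are illustrative assumptions.

```python
def best_effort_utility(throughput, ref_throughput, ref_utility):
    """Bounded utility 1 - (1 - U0)^(R / R0): zero at R = 0, U0 at R = R0, tends to 1."""
    return 1.0 - (1.0 - ref_utility) ** (throughput / ref_throughput)


u_ref = best_effort_utility(1.0e6, 1.0e6, 0.9)      # at the reference throughput
u_double = best_effort_utility(2.0e6, 1.0e6, 0.9)   # twice the reference throughput
u_zero = best_effort_utility(0.0, 1.0e6, 0.9)
```

Doubling the throughput beyond the reference point raises the utility only from 0.9 to 0.99, which reflects the diminishing returns encoded by the concave shape.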
Example 2: Delay-sensitive service
The user of a delay-sensitive service is interested in achieving some target throughput under
the constraint that data latency remains below some critical threshold: in practical applications
(e.g., voice service, video streaming), bits exceeding the maximum allowed delay are dropped.
Since such applications are usually of constant bit rate, allocation of rates larger than the
target throughput is suboptimal. In other words, an overuse of resources brings no
real improvement for this user but compromises the QoS provision to the rest. This can be
alternatively viewed in terms of imposing the instantaneous rate to be as constant as possible,
thus avoiding bursty transmissions yielding the same throughput at the expense of larger idle
periods (and hence delays). One suitable utility function is
$$U(R) = 1 - \left(\frac{R - R_0}{\sigma}\right)^2, \tag{4.49}$$
where $R_0$ is the target throughput and $\sigma$ depends on the maximum allowable delay $W_0$ (in
number of frames). To select $\sigma$, we choose to set the utility of one user that was initially served
$R_0$ but is left idle for $W_0$ frames equal to $U_0 \ll 1$. This forces the satisfaction index of this
user to move from the peak of (4.49), where utility was 1, to some unacceptable value $U_0$. This
way, each frame one such user is idle it is able to warn the BS about its urgency for being
scheduled by decreasing its utility. With this criterion, the appropriate $\sigma$ is
$$\sigma = \frac{(1 - \beta^{W_0})\, R_0}{\sqrt{1 - U_0}}, \tag{4.50}$$
where $\beta$ is the smoothing memory of (4.47). In case users of several delay-sensitive services with different QoS requirements (as specified by
$R_0$ and $W_0$) are present in the network, one has only to adjust $\sigma$ according to (4.50) and use
the resulting utility function (4.49).
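The design rule for the delay-sensitive utility can be checked numerically: a user that starts at the target throughput and stays idle for the maximum allowable number of frames should see its utility drop exactly to the chosen low value. The sketch below assumes the quadratic utility with a scale parameter and the exponential throughput smoothing with a given memory; all names and numbers (64 kbit/s target, memory 0.9, 5 frames, utility floor 0.1) are illustrative assumptions.

```python
import math


def delay_scale(memory, max_delay_frames, target, floor_utility):
    """Scale of the quadratic utility so that max_delay_frames idle frames yield floor_utility."""
    return (1.0 - memory ** max_delay_frames) * target / math.sqrt(1.0 - floor_utility)


def delay_utility(throughput, target, scale):
    """Quadratic utility peaking at 1 when the throughput equals the target."""
    return 1.0 - ((throughput - target) / scale) ** 2


target, memory, max_delay, floor = 64e3, 0.9, 5, 0.1
scale = delay_scale(memory, max_delay, target, floor)
# Throughput after max_delay idle frames decays geometrically from the target.
idle_throughput = (memory ** max_delay) * target
u_peak = delay_utility(target, target, scale)
u_idle = delay_utility(idle_throughput, target, scale)
```

Serving the user above its target is penalized symmetrically, which discourages the allocation of rates larger than the target throughput.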
4.5.1.2
Best-effort utility functions
In situations where the utility function of a service does not depend on the QoS requirements,
we say that utility is best-effort. As we are not able to quantify how far we are from the user
expectations, we rather use utility as a qualitative satisfaction index. Additionally, if there is
only one service under operation in the network, there is no reason to focus on functions upper-bounded by 1. This allows us to consider a wider class of utility functions. A useful example
of a best-effort utility function is the family [Mo00]
\[
U(R) =
\begin{cases}
\log(R) & \text{if } \alpha = 1, \\[4pt]
\dfrac{R^{1-\alpha}}{1-\alpha} & \text{if } \alpha \neq 1,
\end{cases}
\tag{4.51}
\]
where the choice of the parameter α governs the way resources are shared among users, and its
role shall be discussed later at the end of Section 4.6.
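As a quick illustration of the family (4.51), the following sketch evaluates the utility for a few values of the fairness parameter. The function name is chosen here for illustration only.

```python
import math

def alpha_fair_utility(r, alpha):
    """Best-effort utility of the alpha-fair family for a throughput r > 0."""
    if r <= 0:
        raise ValueError("throughput must be positive")
    if alpha == 1:
        return math.log(r)
    return r ** (1 - alpha) / (1 - alpha)

# alpha = 0 returns the throughput itself, alpha = 1 its logarithm.
for alpha in (0, 0.5, 1, 2):
    print(alpha, alpha_fair_utility(2.0, alpha))
```

For every non-negative value of the parameter the function is concave and non-decreasing in the throughput, which is what the network utility framework requires.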
4.5.2
Network utility maximization
To account for services with asymmetric requirements, consider different DL and UL utility functions, denoted by UDL,m and UUL,m, respectively, for the m-th user. Let RDL(t), RUL(t) ∈ R_+^{M×1}
denote the vectors corresponding to the vertical stacking of DL and UL per-user throughputs
at the beginning of the t-th frame, respectively. As user throughput varies with time, so does
user utility. A global snapshot is given by the vectors UDL(t), UUL(t) ∈ R_+^{M×1}, where
\[
[\mathbf{U}_{\mathrm{DL}}(t)]_m = U_{\mathrm{DL},m}\left(R_{\mathrm{DL},m}(t)\right) \tag{4.52}
\]
and a similar expression holds for UL utilities. Using (4.52), we define network utility as any
concave non-decreasing function of the user utilities, NU(t) ≜ NU(UUL(t), UDL(t)). It provides
a cell-wide aggregate indicator rating the goodness of the scheduling and resource allocation
strategy carried out at the BS as far as satisfaction of all the users of the cell is concerned. For
instance, we could take a maxmin approach and set network utility as the minimum among all
the users' satisfaction in either UL or DL directions, i.e.,
\[
\mathrm{NU}(t) = \min_{1 \le m \le M} \Big\{ \min\big\{ [\mathbf{U}_{\mathrm{UL}}(t)]_m, [\mathbf{U}_{\mathrm{DL}}(t)]_m \big\} \Big\}. \tag{4.53}
\]
Thanks to the concavity of each user utility on the throughput and the fact that NU is a concave
non-decreasing function of the utilities, it follows from the convexity properties of composite
functions [Boy04] that (4.53) is a concave function of (RUL (t), RDL (t)). This is an important
property since concavity of network utility is necessary for obtaining global optimal allocation
strategies in polynomial time. Alternatively, if there is no pressure to focus on the utility
achieved by the worst user, we can take a simpler choice and set network utility as the sum of
all the users' utilities,
\[
\mathrm{NU}(t) = \mathbf{1}_M^T \left( \mathbf{U}_{\mathrm{UL}}(t) + \mathbf{U}_{\mathrm{DL}}(t) \right). \tag{4.54}
\]
When (4.54) is used in conjunction with (4.51), the parameter α is said to enable α-fairness
[Mo00]. Fairness is a wide concept which refers to not penalizing some users arbitrarily, and by tuning α from 0 to +∞ the network planner is given a tool to easily switch between
popular fair schemes. While α = 0 yields utilities equal to throughputs and therefore the objective becomes maximizing the cell throughput, α = 1 yields proportional fairness [Kel98], and as
α → +∞ the network operation tends to apply the maxmin criterion to the user throughputs.
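The two network utility choices (4.53) and (4.54) reduce to a few lines when the per-user utilities are available as plain lists. The sketch below uses hypothetical utility values for three users.

```python
def maxmin_network_utility(u_ul, u_dl):
    """Maxmin network utility: the worst utility over all users and both directions."""
    return min(min(ul, dl) for ul, dl in zip(u_ul, u_dl))

def sum_network_utility(u_ul, u_dl):
    """Sum network utility: UL plus DL utilities added over all users."""
    return sum(u_ul) + sum(u_dl)

u_ul = [0.9, 0.5, 0.7]   # hypothetical UL utilities
u_dl = [0.8, 0.6, 0.95]  # hypothetical DL utilities
print(maxmin_network_utility(u_ul, u_dl), sum_network_utility(u_ul, u_dl))
```

Both functions are concave and non-decreasing in the user utilities, so either one fits the definition of network utility used in this section.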

4.5.2.1
Optimal strategy
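For a given CSI, the task of the BS is to maximize network utility until the CSI becomes outdated
and a new one is received (this period spans D frames). Afterwards, the following CSI update
triggers another network utility maximization procedure for the subsequent D frames, and so
on. Expressed succinctly, the optimal strategy for a given CSI is the solution to the following
optimization problem^9
\begin{align}
\underset{\substack{\{\boldsymbol{\tau}_i,\, \mathbf{r}_{\mathrm{UL}}(t+i),\, \mathbf{r}_{\mathrm{DL}}(t+i)\}_{i=0}^{D-1} \\ \{\mathbf{U}_{\mathrm{UL}}(t+i),\, \mathbf{U}_{\mathrm{DL}}(t+i)\}_{i=1}^{D}}}{\text{maximize}} \quad
& \{\mathrm{NU}(t+1), \mathrm{NU}(t+2), \ldots, \mathrm{NU}(t+D)\} \tag{4.55} \\
\text{subject to} \quad
& [\mathbf{U}_{\mathrm{DL}}(t+i)]_m \le U_{\mathrm{DL},m}\Big( \beta^{i} R_{\mathrm{DL},m}(t) + (1-\beta) \sum_{j=0}^{i-1} \beta^{i-1-j}\, r_{\mathrm{DL},m}(t+j) \Big) \tag{4.56} \\
& [\mathbf{U}_{\mathrm{UL}}(t+i)]_m \le U_{\mathrm{UL},m}\Big( \beta^{i} R_{\mathrm{UL},m}(t) + (1-\beta) \sum_{j=0}^{i-1} \beta^{i-1-j}\, r_{\mathrm{UL},m}(t+j) \Big) \tag{4.57} \\
& \mathbf{r}_{\mathrm{UL}}(t+i) \in \mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau}_i) \tag{4.58} \\
& \mathbf{r}_{\mathrm{DL}}(t+i) \in \mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau}_i) \tag{4.59} \\
& \mathbf{1}_4^T \boldsymbol{\tau}_i = 1, \quad \boldsymbol{\tau}_i \succeq \mathbf{0}_4, \tag{4.60}
\end{align}
where (4.56)-(4.57) apply for 1 ≤ m ≤ M and 1 ≤ i ≤ D, and (4.58)-(4.60) for 0 ≤ i ≤ D − 1.
Note that in (4.55)-(4.60) we have made implicit the resource allocation optimization with the
use of the instantaneous achievable rate regions R_UL(t; τ_i) and R_DL(t; τ_i).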
Determining the best rate allocation for the D frames under consideration amounts to solving
the multiobjective optimization problem (4.55)-(4.60). Multiobjective problems do not usually
have unique optimal solutions, and one usually selects one solution from the set of Pareto
optimal solutions^10 according to some prioritization of the conflicting objectives of the problem.
Since network utility represents cell-wide quality of service, our approach will be to provide the
largest possible network utility in each of the frames under optimization, without favoring any of them. Hence,
we will first aim at maximizing the minimum network utility during D frames, then maximize
the second smallest network utility with no penalty to the previous one, and so on. Under this
criterion, one global optimal solution^11 can be iteratively computed using Algorithm 4.1.
Proposition 4.1 The solution computed by Algorithm 4.1 is Pareto optimal.
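The successive maxmin criterion can be illustrated on a toy problem in which the feasible set is a finite list of candidate network utility vectors instead of a convex set. This is only a sketch of the selection logic under that simplifying assumption, not the convex program solved at each iteration; the candidate values and all names are hypothetical.

```python
def successive_maxmin(candidates):
    """Pick a utility vector by maximizing the smallest entry, then the next one, and so on."""
    d = len(candidates[0])
    feasible = list(candidates)
    fixed = {}  # frame index -> utility level already guaranteed
    best = feasible[0]
    while len(fixed) < d:
        free = [i for i in range(d) if i not in fixed]
        # Maximize the minimum utility over the frames not yet fixed.
        best = max(feasible, key=lambda c: min(c[i] for i in free))
        i_min = min(free, key=lambda i: best[i])
        fixed[i_min] = best[i_min]
        # Later rounds must not degrade the frames fixed so far.
        feasible = [c for c in feasible
                    if all(c[i] >= level for i, level in fixed.items())]
    return best

def dominates(a, b):
    """True if a is at least as good as b in every frame and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

candidates = [(0.5, 0.9, 0.6), (0.6, 0.6, 0.6), (0.6, 0.7, 0.65), (0.9, 0.2, 0.9)]
chosen = successive_maxmin(candidates)
print(chosen, any(dominates(c, chosen) for c in candidates))
```

In this toy instance no candidate dominates the selected vector, which is the finite-set analogue of the Pareto optimality stated in Proposition 4.1.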
^9 Note that we have omitted the dependence of network utility on each of the user utilities in (4.55) for the
sake of simplicity.
^10 Some resource allocation achieving {NU(t + 1), NU(t + 2), . . . , NU(t + D)} belongs to the Pareto optimal
set if for any other allocation achieving {NU′(t + 1), NU′(t + 2), . . . , NU′(t + D)} it will never happen that
NU′(t + i) ≥ NU(t + i) for all 1 ≤ i ≤ D and NU′(t + i) > NU(t + i) for some 1 ≤ i ≤ D.
^11 In case more than one global optimal resource allocation solution exists, their achieved network utility values
are permuted versions of some reference {NU*(t + 1), NU*(t + 2), . . . , NU*(t + D)} (see the proof of Proposition
4.1 in Appendix 4.A).
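Algorithm 4.1 Global maximization of network utility

1: Initializations: S = ∅, NUmin(t + i) = −∞ for 1 ≤ i ≤ D.
2: while |S| < D do
3:   Solve
\begin{align}
\underset{\substack{\{\boldsymbol{\tau}_i,\, \mathbf{r}_{\mathrm{UL}}(t+i),\, \mathbf{r}_{\mathrm{DL}}(t+i)\}_{i=0}^{D-1} \\ \{\mathbf{U}_{\mathrm{UL}}(t+i),\, \mathbf{U}_{\mathrm{DL}}(t+i)\}_{i=1}^{D}}}{\text{maximize}} \quad
& \min_{i \in \{1,2,\ldots,D\} \setminus S} \{\mathrm{NU}(t+i)\} \tag{4.61} \\
\text{subject to} \quad
& \text{constraints (4.56)-(4.60)} \tag{4.62} \\
& \mathrm{NU}(t+i) \ge \mathrm{NU}_{\min}(t+i), \quad i \in S. \tag{4.63}
\end{align}
4:   Compute i_min = arg min_{i ∈ {1,2,...,D}\S} {NU*(t + i)} and update S = S ∪ {i_min}, NUmin(t + i_min) = NU*(t + i_min).
5: end while
6: Use the optimal resource allocation to compute the UL and DL exact achievable rates (4.2):
   {r*UL(t + i), r*DL(t + i)}_{i=0}^{D−1}.
7: Update throughputs for 1 ≤ m ≤ M, 1 ≤ i ≤ D:
\begin{align}
R_{\mathrm{UL},m}(t+i) &= \beta^{i} R_{\mathrm{UL},m}(t) + (1-\beta) \sum_{j=0}^{i-1} \beta^{i-1-j}\, r^{\star}_{\mathrm{UL},m}(t+j) \tag{4.64} \\
R_{\mathrm{DL},m}(t+i) &= \beta^{i} R_{\mathrm{DL},m}(t) + (1-\beta) \sum_{j=0}^{i-1} \beta^{i-1-j}\, r^{\star}_{\mathrm{DL},m}(t+j). \tag{4.65}
\end{align}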
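The throughput update in step 7 of Algorithm 4.1 is the unrolled form of a one-frame exponential average. The sketch below compares both forms; the symbol `beta` for the forgetting factor and the rate values are assumptions made for illustration.

```python
def throughput_closed_form(r_prev, rates, beta):
    """Throughput after len(rates) frames, written as a single weighted sum."""
    i = len(rates)
    history = sum(beta ** (i - 1 - j) * rates[j] for j in range(i))
    return beta ** i * r_prev + (1 - beta) * history

def throughput_recursive(r_prev, rates, beta):
    """Same quantity, obtained by applying the one-frame update repeatedly."""
    r = r_prev
    for rate in rates:
        r = beta * r + (1 - beta) * rate
    return r

rates = [12.0, 0.0, 30.0, 5.0]  # hypothetical instantaneous rates over D = 4 frames
print(throughput_closed_form(10.0, rates, 0.95), throughput_recursive(10.0, rates, 0.95))
```

Both forms agree, which also shows that throughput is an affine function of the instantaneous rates, the property invoked when arguing convexity of the subproblems.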
Remark 4.1 Algorithm 4.1 is able to compute one global optimal solution in polynomial time.
To see this, it is sufficient to show that each of the subproblems (4.61)-(4.63) is convex. We first
require that the objective (4.61) is concave, which follows from the concavity of network utility
and the fact that the minimum of concave functions is concave. Then, the left hand side of each
of the inequality constraints in (4.62)-(4.63), when rephrased as a function of some optimization
variables less than or equal to zero, should be convex. This follows from the concavity of the
user utility functions with respect to throughput, the fact that the throughput relates linearly
to the instantaneous rates, and the convexity of the UL and DL achievable rate regions (see
Section 4.4).
Remark 4.2 In each of the problems (4.61)-(4.63) a three-fold optimization in each of the
frames under consideration is performed: first, the frame formats (relative durations of each
relay-assisted transmission subphase for UL and DL) as described by the corresponding four-dimensional vectors {τ_i}_{i=0}^{D−1}; second, the allocated instantaneous rates, which implicitly account
for user scheduling (note that rDL,m (t+i) = 0 implies that the m-th user shall not receive any DL
data in the (t+i)-th frame); and third, the allocated resources (bandwidth and transmit power),
which are implicit in the definition of the DL and UL achievable rate regions (4.58)-(4.59) as
defined in (4.27) and (4.42).
Remark 4.3 In order to pose the maximization of network utility as a series of convex optimization problems, we have resorted to the concave lower bounds on the achievable rates derived
in Section 4.3.2. However, the throughput updates are performed evaluating the exact ergodic
rates (4.2) achieved by the optimal resource allocation, and not their lower bounds (4.14). As
for Rayleigh fading, we resort to [Kie05] for their computation.
4.5.2.2
Reduced-complexity suboptimal strategies
Although Algorithm 4.1 provides the best network strategy from a network utility point of view,
its computational load may become prohibitive in systems either serving a large number of users
per cell (large M) or refreshing the CSI slowly with respect to the frame duration (large D).
To see this, consider the fact that D convex problems of (14M + 4)D variables each (where we
have considered utilities, rates, frame formats, power allocations, and bandwidth allocations)
are involved in each optimization instance. Hence, two directions may be taken to cut down
complexity: either reduce the optimization window D or the number of users M to be scheduled.
How to deal with the first one is immediate: simply replace D by D0 in Algorithm 4.1 such
that D0 divides D (this is required to avoid optimization windows requiring CSI not received
yet). The second direction requires the implementation of a time-domain scheduler on top of Algorithm 4.1 such that only a subset of the M MSs are selected for network utility maximization.
We choose to select the M0 < M users that would have the smallest utilities at the end of the
optimization window if not scheduled. Although this may have an impact on final performance
since scheduling decisions are no longer optimal, time-domain pre-scheduling becomes crucial in
scenarios with a large number of users. On top of the aforementioned complexity issues, in practice, the frame structure needs to be signalled to the MSs, and this represents an additional
overhead. If all users are allowed to transmit and/or receive in the same frame, the average
quantity of the allocated resource per user and frame may go down below practical operational
values while this signalling overhead may increase significantly and thus hamper network utility.
We benchmark the global optimal solution of Algorithm 4.1 against the suboptimal strategy
which takes D0 = 1 and attempts to maximize network utility sequentially on a frame-by-frame
basis. Thus, Algorithm 4.2 is run at the beginning of each frame, which allocates resources
among the subset of M0 users selected by a time-domain pre-scheduler. This way, we are able to
quantify the performance loss of sequential optimization versus global optimization.
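The time-domain pre-scheduler described above can be sketched as follows for a single link direction. The code assumes that the throughput of an unscheduled user decays by a factor `beta` per frame, and it uses a hypothetical saturating utility; both are illustrative choices rather than the utilities used in the simulations.

```python
def preschedule(throughputs, utilities, m0, beta, window=1):
    """Return the indices of the m0 users with the smallest utility if left idle."""
    idle_utility = [u(beta ** window * r) for u, r in zip(utilities, throughputs)]
    ranked = sorted(range(len(idle_utility)), key=lambda m: idle_utility[m])
    return sorted(ranked[:m0])

# Hypothetical utility: fraction of the target throughput, capped at 1.
targets = [30.0, 20.0, 10.0, 10.0]
utilities = [lambda r, t=t: min(r / t, 1.0) for t in targets]
throughputs = [30.0, 5.0, 10.0, 1.0]
print(preschedule(throughputs, utilities, m0=2, beta=0.95))
```

Only the selected users would then enter the per-frame optimization, which reduces the number of optimization variables accordingly.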
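Algorithm 4.2 Sequential maximization of network utility

1: Solve
\begin{align}
\underset{\substack{\boldsymbol{\tau},\, \mathbf{r}_{\mathrm{UL}}(t),\, \mathbf{r}_{\mathrm{DL}}(t), \\ \mathbf{U}_{\mathrm{UL}}(t+1),\, \mathbf{U}_{\mathrm{DL}}(t+1)}}{\text{maximize}} \quad
& \mathrm{NU}(t+1) \tag{4.66} \\
\text{subject to} \quad
& [\mathbf{U}_{\mathrm{DL}}(t+1)]_m \le U_{\mathrm{DL},m}\big( \beta R_{\mathrm{DL},m}(t) + (1-\beta)\, r_{\mathrm{DL},m}(t) \big) \tag{4.67} \\
& [\mathbf{U}_{\mathrm{UL}}(t+1)]_m \le U_{\mathrm{UL},m}\big( \beta R_{\mathrm{UL},m}(t) + (1-\beta)\, r_{\mathrm{UL},m}(t) \big) \tag{4.68} \\
& \mathbf{r}_{\mathrm{UL}}(t) \in \mathcal{R}_{\mathrm{UL}}(t; \boldsymbol{\tau}) \tag{4.69} \\
& \mathbf{r}_{\mathrm{DL}}(t) \in \mathcal{R}_{\mathrm{DL}}(t; \boldsymbol{\tau}) \tag{4.70} \\
& \mathbf{1}_4^T \boldsymbol{\tau} = 1, \quad \boldsymbol{\tau} \succeq \mathbf{0}_4. \tag{4.71}
\end{align}
2: Use the optimal resource allocation to compute the UL and DL exact achievable rates (4.2):
   r*UL(t), r*DL(t).
3: Update throughputs for 1 ≤ m ≤ M:
\begin{align}
R_{\mathrm{UL},m}(t+1) &= \beta R_{\mathrm{UL},m}(t) + (1-\beta)\, r^{\star}_{\mathrm{UL},m}(t) \tag{4.72} \\
R_{\mathrm{DL},m}(t+1) &= \beta R_{\mathrm{DL},m}(t) + (1-\beta)\, r^{\star}_{\mathrm{DL},m}(t). \tag{4.73}
\end{align}
4.6
Simulation Results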
We focus on two different scenarios sharing the same target cell spectral efficiency but having
different user population sizes. In either case, we simulate a circular cell of 500 m radius, with
R = 5 relays uniformly spaced along a circle at 375 m from the BS, which is located in the cell
center. We assume that the BS-RS links are in line-of-sight and have path loss exponent 2.6,
while the rest of links (BS-MS and MS-RS) are in non line-of-sight with path loss exponent 4.05.
All links are hampered by Rayleigh fading. See Table 4.1 for a complete list of values of the rest
of physical parameters involved.
All the users of the cell are mobile. If xm(t) ∈ R_+^2 denotes the position of the m-th MS at
the beginning of the t-th frame in Cartesian coordinates, then
\[
\mathbf{x}_m(t+1) = \mathbf{x}_m(t) + v_m T \begin{bmatrix} \cos(\theta(t)) \\ \sin(\theta(t)) \end{bmatrix}, \tag{4.74}
\]
where vm is its speed (assumed constant) and θ(t) is an AR(1) stochastic process describing its
direction,
\[
\theta(t+1) = 0.9\,\theta(t) + 0.02\,\epsilon_t, \tag{4.75}
\]
where {ε_t} are i.i.d. uniform random variables on [−π, π). Whenever an MS happens to exceed
the limits of the cell, it is forced to bounce on the cell edge by changing its instantaneous
direction in order to keep the total number of users constant. As a simplifying assumption we
set the same speed of v = 3 km/h for all the MSs, and consider a feedback update period of 100
ms, i.e., D = 4. Each time CSI is refreshed, each MS is attached to the RS towards which the
path loss is the smallest. The connectivity matrix L is updated accordingly.

Parameter                             Value
Bandwidth                             10 MHz
Frame duration T                      25 ms
p_BS^max, p_RS^max, and p_MS^max      33, 30, and 24 dBm
G_BS, G_RS, and G_MS                  10.6, 5, and 1 dB
F_BS, F_RS, and F_MS                  4, 4, and 7 dB
n_BS, n_RS, and n_MS                  2, 2, and 1
BS, RS, and MS heights                15, 5, and 0 m
N0                                    −114 dBm/MHz
                                      4 dB
                                      0.95
Table 4.1: Physical layer setup of the simulated scenario
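The mobility model (4.74)-(4.75) can be simulated in a few lines. The text does not specify how a user bounces on the cell edge, so the rule used below (keep the previous position for one frame and reverse the heading) is an assumption, as are the names `theta` for the direction and `rng` for the random source. The cell is taken to be centered at the BS.

```python
import math
import random

def mobility_step(pos, theta, speed, frame, radius, rng):
    """Advance one frame: move along the heading, then update the AR(1) direction."""
    x = pos[0] + speed * frame * math.cos(theta)
    y = pos[1] + speed * frame * math.sin(theta)
    if math.hypot(x, y) > radius:
        # Assumed bounce rule: stay in place this frame and reverse the heading.
        x, y = pos
        theta += math.pi
    theta_next = 0.9 * theta + 0.02 * rng.uniform(-math.pi, math.pi)
    return (x, y), theta_next

rng = random.Random(0)
pos, theta = (499.99, 0.0), 0.0
for _ in range(400):  # 400 frames of 25 ms, i.e. 10 s at 3 km/h
    pos, theta = mobility_step(pos, theta, 3 / 3.6, 0.025, 500.0, rng)
print(pos, theta)
```

With a 25 ms frame and a speed of 3 km/h each step moves a user by about 2 cm, so positions change slowly relative to the CSI update period.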
With this setup, we first simulate a scenario consisting of M = 6 best-effort users (4.48) of
gold, silver, and bronze QoS classes. Gold users experience 0.9 utility when they are given 30
Mbps (DL) and 6 Mbps (UL) throughput. Silver users have the same utility level when served
20 Mbps (DL) and 4 Mbps (UL) throughput. Finally, bronze users require 10 Mbps (DL) and
2 Mbps (UL) throughput for 0.9 utility. There are two users of each QoS class and, under a
maxmin choice (4.53), if a network utility of 0.9 was realized, the cell spectral efficiency would
amount to 14.4 bps/Hz.
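The quoted spectral efficiency follows from adding the UL and DL targets of all users and dividing by the system bandwidth. The sketch below reproduces this arithmetic, assuming a bandwidth of 10 MHz, and shows that the 60-user scenario considered later in this section reaches the same figure.

```python
def cell_spectral_efficiency(targets_mbps, users_per_class, bandwidth_mhz):
    """Aggregate UL plus DL target throughput divided by bandwidth, in bps/Hz."""
    total_mbps = users_per_class * sum(dl + ul for dl, ul in targets_mbps)
    return total_mbps / bandwidth_mhz

# (DL, UL) targets in Mbps at 0.9 utility for gold, silver, and bronze users.
six_users = cell_spectral_efficiency([(30, 6), (20, 4), (10, 2)], 2, 10.0)
sixty_users = cell_spectral_efficiency([(3, 0.6), (2, 0.4), (1, 0.2)], 20, 10.0)
print(six_users, sixty_users)
```

Both scenarios add up to 144 Mbps, i.e. 14.4 bps/Hz over 10 MHz.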
Figure 4.4 shows the deployment layout and compares the network utility achieved by global
and sequential optimization. To quantify separately the performance gains provided by the
presence of relays from those achieved by the optimization approach itself, we benchmark the
global and the sequential optimization algorithms against their counterparts without RSs. That
is, we also simulate resource allocation where the relay-transmit phase is always forced to have
zero duration. To make this comparison fair, we increase the transmit power constraint at the
BS and the MSs such that, frame by frame, the total UL and DL power is equal with and
without RSs in both optimization strategies. Since the number of users is relatively small,
there is no need for a time-domain pre-scheduler.
As a general trend, global optimization dominates sequential optimization in the long term,
although at the beginning the opposite holds. This is because in the first frames, the resource
allocation of sequential optimization benefits from evaluating the actual ergodic rates often
(frame by frame) as compared to global optimization, where this evaluation is carried out every
group of D frames. This has a positive impact on user throughput, although this advantage
vanishes quickly as confirmed also by Figure 4.5, where the per-user throughput achieved by
global optimization becomes the largest in less than 2 s of network operation (80 frames).
Interestingly, sequential and global optimization perform almost indistinguishably without
relays: as the ergodic capacity lower bounds are very tight in this setup (see Figure 4.2), the
previous effect does not apply. In any case, the target spectral efficiency of 14.4 bps/Hz is
achieved only with relaying infrastructure, as the steady-state performance without RSs falls
roughly 25% below QoS targets. As shown in Table 4.2, the price to pay is an increase
in computational complexity.

Figure 4.4: Network utility achieved by global (blue) and sequential (red) optimization, with
(solid) and without (dashed) relaying infrastructure, and deployment layout. [Axes: network utility versus time [s].]

Figure 4.5: Per-user served throughput of global (blue) and sequential (red) optimization, with
(solid) and without (dashed) relaying infrastructure. [Panels: uplink throughput [Mbps] and downlink throughput [Mbps] versus time [s], for bronze, silver, and gold users.]

Seq. Opt. w/o RSs
Global Opt. w/o RSs
Seq. Opt. w/ RSs
Global Opt. w/ RSs
2.6
33
Table 4.2: Relative execution times per transmission frame of the MATLAB implementations
of all the resource allocation strategies.
Next, we focus on a practical scenario with the same target spectral efficiency at 0.9 network
utility as before, adopting the maxmin utility criterion (4.53) again, but now serving 10 times
more users with rate requirements reduced to 10% of the previous ones. Thus, M = 60 best-effort
MSs are present in the cell, split into 20 users per QoS class. Now, at 0.9 user utility gold
users require 3/0.6 Mbps (DL/UL) throughput, silver users 2/0.4 Mbps (DL/UL) throughput,
and bronze users 1/0.2 Mbps (DL/UL) throughput. With this system size, global optimization
becomes infeasible and we restrict our attention to sequential optimization with time-domain
pre-scheduling. In particular, we study the performance degradation as a function of M0, the
maximum number of users per frame. In this respect, Figure 4.6 shows the deployment layout
and the network utility achieved with M0 = 12, 8, and 4. Clearly, the larger M0, the larger
the network utility but also the algorithm complexity and signalling overhead. Assuming that
this last effect is negligible for the range of values of M0 studied^12, the steady-state network utility loss
between M0 = 12 and M0 = 4 is on the order of 0.1.
In Figure 4.7 we show the per-user throughput averaged per QoS class, to show that, although
the general trend is similar for each value of M0, the performance degradation when M0 = 4 is due
to the fact that gold users are served far below their 0.9 target while bronze users are served
more than necessary. As M0 decreases, the average number of idle frames between transmissions
for a given user increases, but the instantaneous rate per user in an active frame increases.
12
For each direction (UL and DL), each user should be signalled about the transmission rate, the fractional
bandwidth allocation, the fractional power allocation, and the duration of the protocol subphases. Assuming that
each of these parameters is quantized to nb bits, the throughput penalty per user and direction is nb (6 + 2/M0 )
which decreases with M0 since the subphase durations are common. Hence, although the global signalling overhead
increases in M0 , the per-user penalty decreases. In particular for a practical value of nb = 5 bits, this penalty is
on the order of 1 kbps and, hence, negligible.
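The order of magnitude quoted in the footnote can be reproduced directly. The sketch assumes that nb(6 + 2/M0) counts bits per frame and that the frame lasts 25 ms; the function name is illustrative.

```python
def signalling_penalty_bps(nb, m0, frame_s=0.025):
    """Per-user, per-direction signalling penalty in bits per second."""
    bits_per_frame = nb * (6 + 2 / m0)
    return bits_per_frame / frame_s

for m0 in (4, 8, 12):
    print(m0, signalling_penalty_bps(5, m0))
```

For nb = 5 bits the penalty stays between roughly 1.2 and 1.3 kbps for the values of M0 studied, which is small compared with the Mbps-level throughput targets.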
Figure 4.6: Network utility achieved by sequential optimization with time-domain pre-scheduling
allowing a maximum of 12 (black), 8 (blue), and 4 (red) users per frame, and deployment layout. [Axes: network utility versus time [s].]

Figure 4.7: Average per-user served throughput per QoS class achieved by sequential optimization with time-domain pre-scheduling allowing a maximum of 12 (black), 8 (blue), and 4 (red)
users per frame. [Panels: average uplink throughput per QoS class [kbps] and average downlink throughput per QoS class [Mbps] versus time [s], for gold, silver, and bronze users.]

Figure 4.8: Maximum user delay (number of frames idle) achieved by sequential optimization
with time-domain pre-scheduling allowing a maximum of 12 (black), 8 (blue), and 4 (red) users
per frame. [Axes: maximum delay [ms] versus time [s].]
The conclusion from Figure 4.7 is that if M0 is too low the global effect is negative. Finally,
Figure 4.8 addresses the maximum per-user delay in each setup, which is shown to be roughly
proportional to 1/M0 on average.
So far, we have only explored maxmin network utility and QoS-oriented utility functions.
To conclude this section, however, we shall modify the previous system with a total of M = 60
users and M0 = 4 users at most per frame and explore the inherent tradeoff between fairness
and throughput when other network utility functions are used. In particular, we now consider
the situation where the UL and DL utility functions of every user are given by (4.51) and
network utility is sum utility (4.54). Thus, by changing the parameter α we have a way of
trading fairness and throughput. We capture this relation in Figure 4.9, where we focus on the
average steady-state per-user and link direction throughput (UL and DL directions are averaged
together as their utilities are the same). It is plotted against Jain's fairness index [Jai84] in each
direction for a given cell deployment. This index rates the degree of fairness incurred in serving
n competing flows with a real number between 1/n (worst case: one user gets it all) and 1 (best
case: resources are equally shared). Figure 4.9 confirms that a simultaneous increase in both
fairness and throughput cannot be achieved and quantifies the explicit tradeoff. Notice that the
time-domain pre-scheduler prevents the fairness index from falling below acceptable levels.
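Jain's fairness index has a simple closed form: the squared sum of the allocations divided by n times the sum of their squares. The following sketch evaluates it for the two extreme cases mentioned above and for a hypothetical intermediate allocation.

```python
def jain_index(allocations):
    """Jain's fairness index of a list of non-negative allocations."""
    n = len(allocations)
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    if squares == 0:
        raise ValueError("at least one allocation must be positive")
    return total * total / (n * squares)

print(jain_index([1.0, 1.0, 1.0, 1.0]))  # equal shares
print(jain_index([5.0, 0.0, 0.0, 0.0]))  # one user gets it all
print(jain_index([3.0, 2.0, 1.0, 1.0]))  # hypothetical intermediate case
```

The index equals 1 for equal shares and 1/n when a single flow receives everything, matching the range stated in the text.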
Figure 4.9: Average steady-state per-user and link-direction throughput versus fairness index.
The corresponding values of α are 0.25, 0.50, 0.75, 1, and 5. [Axes: average throughput per user and link direction [Mbps] versus fairness index. Annotated regimes: maximum throughput, proportional fairness (α = 1), and maxmin fairness (α → +∞).]
4.7
Conclusions
The work of this chapter concerns the performance characterization of a relay-assisted network
deployment under practical constraints. In particular, terminals are half-duplex MIMO and path
loss is the only CSI assumed to be known at the transmitters. Network performance is evaluated
in terms of ergodic achievable rates by the development of novel lower bounds. These bounds are
employed to derive two efficient algorithms for resource allocation optimization under heterogeneous QoS requirements. The first one provides one Pareto optimal solution whereas the second
one performs a simpler frame by frame optimization by means of a sequential algorithm. The
performance of both schemes has been evaluated showing that, whenever global optimization
can be afforded, significant performance gains can be achieved with respect to sequential optimization. When system dimensions are large, however, the complexity of sequential network
optimization can be tuned by using time-domain pre-scheduling.
The proposed network resource optimization schemes could be generalized in at least two
ways. First, by incorporating outage events into system design in those scenarios where the
number of tones is limited and moderate. Second, by considering a multi-cell configuration
allowing the incorporation of a prescribed maximum inter-cell interference as an additional
constraint.
4.A
Appendix: Proof of Proposition 4.1
Let {NU*(t+1), NU*(t+2), . . . , NU*(t+D)} denote the network utility achieved by the solution
of Algorithm 4.1, and define i1, i2, . . . , iD such that NU*(t+i1) ≤ NU*(t+i2) ≤ . . . ≤ NU*(t+iD).
Assume it is not Pareto optimal. Hence there exists at least another allocation achieving {NU(t+
1), NU(t + 2), . . . , NU(t + D)} and integers j1, j2, . . . , jD such that NU(t + j1) ≤ NU(t + j2) ≤
. . . ≤ NU(t + jD),
\begin{align}
\mathrm{NU}(t+i) &\ge \mathrm{NU}^{\star}(t+i), \quad 1 \le i \le D, \tag{4.76} \\
\mathrm{NU}(t+i) &> \mathrm{NU}^{\star}(t+i), \quad \text{for some } i. \tag{4.77}
\end{align}
Then,
\[
\mathrm{NU}^{\star}(t+i_1) \overset{(a)}{\ge} \mathrm{NU}(t+j_1) \overset{(b)}{\ge} \mathrm{NU}^{\star}(t+j_1) \overset{(c)}{\ge} \mathrm{NU}^{\star}(t+i_1), \tag{4.78}
\]
where (a) follows by construction of the solution of Algorithm 4.1, (b) from (4.76), and (c) from
the definition of {in}. Equation (4.78) implies that NU*(t + i1) = NU(t + j1). Two cases arise
now:
Case i1 ≠ j1:
\[
\mathrm{NU}^{\star}(t+j_1) \overset{(a)}{\ge} \mathrm{NU}^{\star}(t+i_1) \overset{(b)}{=} \mathrm{NU}(t+j_1) \overset{(c)}{\ge} \mathrm{NU}^{\star}(t+j_1), \tag{4.79}
\]
where (a) follows from the definition of {in}, (b) is a consequence of (4.78), and (c) follows
from (4.76). Then, NU*(t+j1) = NU(t+j1). Without loss of generality we can set i2 = j1,
j2 = i1 and obtain NU*(t + i2) = NU(t + j2).
Case i1 = j1:
\[
\mathrm{NU}^{\star}(t+i_2) \overset{(a)}{\ge} \mathrm{NU}(t+j_2) \overset{(b)}{\ge} \mathrm{NU}^{\star}(t+j_2) \overset{(c)}{\ge} \mathrm{NU}^{\star}(t+i_2), \tag{4.80}
\]
where (a) follows by construction of the solution of Algorithm 4.1, (b) is derived from
(4.76), and (c) follows from the fact that j2 ≠ j1 = i1 and the definition of {in}. Expression
(4.80) also implies NU*(t + i2) = NU(t + j2).
Proceeding similarly, we can iteratively show that
\[
\mathrm{NU}^{\star}(t+i_n) = \mathrm{NU}(t+j_n), \quad 1 \le n \le D, \tag{4.81}
\]
which implies that the network utility values of both strategies are either equal or related by
an arbitrary permutation. In either case, this contradicts (4.76)-(4.77), hence implying that the
solution to Algorithm 4.1 is Pareto optimal.
Chapter 5
Multiuser Interference and Evaluation of Capacity Regions
The computation of the channel capacity of a discrete memoryless channel is a convex problem
that can be efficiently solved using alternate maximization methods. The presence of multiuser
interference, however, complicates the extension of these methods to the evaluation of the capacity region of multiterminal networks. To start with, the capacity region of a general network
is unknown, and we are left with only the cut-set outer bound [Cov06, Sec. 15.10]. Although
many major breakthroughs in the field have been achieved (see [Cov06, Ch. 15] and [Wei06] and
references therein), there are many open problems on single-letter characterizations of capacity
regions. Second, not every setup in which the capacity region is known admits
a computable characterization. By computable characterization we mean that the number of
degrees of freedom that need to be explored in the evaluation of the achievable rates is finite
(see Chapter 3, Section 3.2, for an example of a non-computable capacity region). Moreover,
even when computable expressions of capacity regions are known, multiuser interference makes
their evaluation depend on the solution to non-convex problems.
Problems that are initially regarded as non-convex need not be difficult to
solve, as some hidden convexity enabling the use of efficient optimization methods can sometimes be found. This chapter is hence devoted to studying the nature of the non-convexities that
arise in the problems associated with the evaluation of the performance limits (capacity regions)
of multiuser channels. The interest in doing so is three-fold:
1. By characterizing the class of non-convexities involved in the capacity region of a network
setup, we can accurately quantify the impact of multiuser interference on the computational complexity of the evaluation of the achievable rates.
2. The study of these optimization problems often gives back some insight on the structure
of the transmission strategies that achieve boundary points of capacity regions.
3. In some cases, upon unveiling the structure of the optimization problem at hand, efficient
methods to compute inner and outer bounds on the capacity regions can be found.
Constrained by the limited amount of results on single-letter characterizations of capacity
regions, our approach will not be general and we shall rather restrict our focus to two specific network configurations: the discrete memoryless multiple access channel (DMAC) and the
degraded discrete memoryless broadcast channel (dDMBC). Neither the general broadcast nor
interference channels shall be tackled due to the lack of capacity results.
5.1
Introduction
The evaluation of the capacity of an arbitrary single-user memoryless channel is a problem that
admits a single-letter representation in the form of a maximization of a concave function over a
convex set, e.g., a probability simplex for the Discrete Memoryless Channel (DMC). This is a
convex problem that can be numerically evaluated efficiently in practice (i.e., with polynomial
time worst-case complexity) [Boy04]. For the continuous Gaussian channel, for example, the
solution admits a simple closed-form characterization as already obtained by Shannon in 1948
[Sha48]. For the DMC there is no closed-form expression but the popular practical Arimoto-Blahut (AB) algorithm [Ari72, Bla72], dating back to the early 1970s, can be efficiently used.
Things are quite different in the multiuser case.
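For reference, the single-user computation mentioned above can be sketched in a few lines. The following is a minimal pure-Python version of the Arimoto-Blahut iteration for a DMC given as a row-stochastic matrix; the function names, the fixed iteration count, and the test channels are illustrative choices.

```python
import math

def _divergences(p, channel):
    """KL divergence (in nats) between each channel row and the induced output law."""
    ny = len(channel[0])
    q = [sum(px * row[y] for px, row in zip(p, channel)) for y in range(ny)]
    return [sum(w * math.log(w / q[y]) for y, w in enumerate(row) if w > 0 and q[y] > 0)
            for row in channel]

def blahut_arimoto(channel, iterations=200):
    """Return (capacity estimate in bits, input distribution); channel[x][y] = P(y|x)."""
    nx = len(channel)
    p = [1.0 / nx] * nx
    for _ in range(iterations):
        d = _divergences(p, channel)
        weights = [px * math.exp(dx) for px, dx in zip(p, d)]
        total = sum(weights)
        p = [w / total for w in weights]
    d = _divergences(p, channel)
    mutual_information = sum(px * dx for px, dx in zip(p, d))
    return mutual_information / math.log(2), p

bsc_capacity, _ = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
z_capacity, z_input = blahut_arimoto([[1.0, 0.0], [0.5, 0.5]])
print(bsc_capacity, z_capacity, z_input)
```

Each iteration reweights the input distribution towards the letters whose conditional output law is far from the current output distribution. For the binary symmetric channel the uniform input is already optimal, while for the Z-channel the iteration moves away from it.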
5.1.1
The DMAC
Fortunately, for the multiple-access channel (MAC) we have a single-letter representation of the
capacity region [Ahl71, Lia72]. However, the characterization is not generally in the form of a
convex optimization problem as happened in the single-user case. Again, for the continuous
Gaussian channel, the characterization simplifies to a convex problem which can always be
numerically evaluated in an efficient way [Yu04]. For the DMAC, however, the characterization
is not in the form of a convex problem and, as a consequence, there is no efficient algorithm to
compute the capacity region in practice. In this context, many authors have recently contributed
toward the computation of the sum-capacity (or total capacity) of an arbitrary DMAC [Wat96,
Wat02], and an algorithm for its exact computation has been found [Rez04].
It was shown in [Wat96] that any two-user DMAC can be decomposed into a finite number
of elementary (two-user binary-input and binary-output) DMACs for which their total capacity
can be computed by applying a necessary and sufficient condition for optimality of the input
probability distributions. In addition, [Wat96] showed that for any 2-user non-binary inputs and
binary output DMAC, the total capacity can be determined by computing the total capacities
of the elementary DMACs of its decomposition, and provided an iterative algorithm. In a later
work, [Wat02] extended the decomposition result for the K-user DMAC (with arbitrary input
and output alphabets), which allowed [Rez04] to propose an algorithm for the computation
of the total capacity based on a generalization of the AB algorithm. Other applications of
generalizations of the AB algorithm can be found in the context of the computation of channel
capacity with side information [Dup04].
Regarding the computation of the whole capacity region of the DMAC, not much work
has been done due to the intractability of the problem because of its non-convexity. As a
consequence of the non-convexity, brute-force algorithms or random search methods seem to be
the only alternative to compute inner bounds on the capacity region with no quantification on
the suboptimality.
In this chapter, we shall show that the key difficulty in computing the capacity region of
an arbitrary DMAC can be identified as a rank-one constraint (a non-convex constraint) in an
otherwise convex optimization problem. Optimization problems with this kind of constraint
arise in areas such as control theory [Apk99, Hen01, Ors04] and signal processing [Jal03, Ma02],
and cannot be solved optimally in polynomial time with state of the art knowledge. Hence,
alternative suboptimal methods must be used to obtain good approximations of the capacity
region. One approach that has reported near optimal performance when dealing with rank-one
constraints in maximum-likelihood single-user [Jal03, Wie05] and multiuser [Ma02] detection
is the use of relaxation methods. Relaxation methods are based on i) replacing the rank-one constrained problem by an approximate (not equivalent) tractable convex problem and
ii) generating a potential solution to the original problem from the solution to the relaxed
problem. This way, the use of computationally demanding algorithms is avoided since efficient
interior point methods can be used to solve the convex approximation of the original problem
in polynomial time.
We propose two efficient methods for the computation of both an inner and an outer bound of
the capacity region of any arbitrary DMAC. The outer bound follows from removing the rank-one
constraint and corresponds to the achievable rates in the situation of full transmit cooperation,
since user codewords can thus be arbitrarily correlated. To generate potential solutions to the
original problem, we first focus on randomization, an approach that has shown near optimal
numerical performance in the previously mentioned areas. In essence, several rank-one input
probability distributions are generated close (in the mean sense) to the optimal solution of the
relaxed problem and the one yielding the largest achievable rates is kept. This can be viewed as
a random search algorithm with guidance on the correlation matrix of the potential solutions.
Pursuing a simpler method allowing for performance analysis, we then study a deterministic
alternative in which the solution to the relaxed problem is projected onto the feasible set via
a minimum divergence criterion. This criterion yields a candidate solution which turns out
to be the marginalization of the relaxed solution, a very simple operation scalable with the
number of users. Regarding analytical results, there exists a class of channels for which this
algorithm is able to compute exactly the capacity region. It comprises the subclass of channels
with identical inner and outer bounds and the subclass of channels with strict outer bound and
tight inner bound. Given a channel, we derive necessary and sufficient conditions for checking
whether it belongs to the first subclass and sufficient conditions for verifying whether it belongs
to the second subclass. These conditions are used to show that for all the two-user binary-input
deterministic DMACs as well as for some non-deterministic channels simple marginalization
of the full cooperation solution achieves capacity. Although we have not been able to fully
characterize analytically the class of channels for which marginalization is optimal, numerical
simulations for various channels show that, in practice, both randomization and marginalization
perform indistinguishably from the optimal solution obtained with a computationally intensive
brute-force full search.
5.1.2 The dDMBC
The two-user discrete memoryless broadcast channel (DMBC) [Cov72] models the situation in
which one sender wishes to send information to two receivers. Although a single-letter characterization of the capacity region of this channel is not known in general, some achievable
rate regions and outer bounds have been derived by Cover and van der Meulen [Meu75, Cov75],
Marton [Mar79], and Nair and El Gamal [Nai07].
Achievable rate regions and outer bounds for this channel are usually expressed in terms
of auxiliary random variables. We say a region is computable whenever the cardinalities of the
support set of all the random variables involved are bounded. In this respect, Nair's outer bound
and Cover-van der Meulen's achievable region in case of binary inputs [Haj79] are computable.
However, apart from showing the feasibility of the evaluation of the regions, little work has been
done towards finding efficient methods for their computation. We shall try to bridge this gap
in the specific setup where the DMBC is degraded and there is no common information to be
transmitted simultaneously to both receivers. Assuming these constraints, the capacity region
of the channel is known [Ber73, Wyn73, Gal74] and computable.
By posing the computation of the capacity region of the two-user degraded DMBC (dDMBC)
as a constrained optimization problem we show that it can be characterized as a difference of convex (DC) optimization problem [Hor99]. Since problems of this kind are not convex in general,
there is no efficient algorithm available to compute the capacity region in practice. Additionally,
the lack of convexity of this problem makes the Karush-Kuhn-Tucker (KKT) conditions [Boy04]
insufficient, and in general not even necessary (some regularity conditions are required), for claiming optimality of an input distribution. In this respect, we are able to show that in this problem
they are necessary optimality conditions and, therefore, by choosing the best among those input
distributions satisfying the KKT conditions the capacity region can be optimally computed.
Moreover, since the dimensionality of the set of distributions satisfying the KKT conditions is in
general substantially lower than the original feasible set, the complexity involved in computing
the capacity region is greatly reduced. These results are applied to evaluate the capacity region
of the BEC-BSC dDMBC.
5.1.3
The study of the efficient evaluation of multiuser capacity regions has led to the following
contributions:
• The problem of the computation of the capacity region of the MAC has been rephrased
in an alternative way that condenses all the non-convexities of the problem in a single
rank-one constraint.
• This alternative formulation, which applies to any number of users, has paved the way
to two efficient relaxation methods that provide inner and outer bounds on the capacity
region: marginalization and randomization.
• The mathematical amenability of marginalization has allowed us to determine analytical conditions under which its bounds are tight. These conditions have been applied to rederive
existing capacity results but also to find the capacity region of new channels.
• The problem of the computation of the capacity region of the BC has been shown to be a
DC problem.
• Necessary conditions for the optimality of an input distribution have led to a way of
reducing the degrees of freedom that need to be explored to compute the capacity region,
as shown for the BEC-BSC dDMBC.
This chapter consists of three sections. First, Section 5.2 is devoted to the DMAC. In
particular, Section 5.2.1 introduces the problem of the computation of the capacity region of
the DMAC and reformulates it as a rank-one constrained optimization problem. Section 5.2.2
describes the proposed relaxation-based methods for the computation of inner and outer bounds
on the capacity region: randomization and marginalization. Analytical optimality conditions
that determine when the marginalization bounds are tight are provided in Section 5.2.3. Then,
the performance of the proposed algorithms across various channels is numerically compared to
that of a random search method in Section 5.2.4.
Second, Section 5.3 concerns the dDMBC. We start again by introducing the problem of
the computation of its capacity region in Section 5.3.1, where we also reformulate it as a DC
optimization problem. Next, Section 5.3.2 describes the necessary optimality conditions and how
they can be applied to obtain capacity-achieving distributions. Then, Section 5.3.3 provides the
capacity region of the BEC-BSC dDMBC by using the results of Section 5.3.2.
Finally, Section 5.4 concludes the chapter by sketching its main contributions.
5.2 The DMAC
5.2.1 The capacity region as a rank-one constrained optimization problem
The computation of the capacity region of an arbitrary DMAC (a convex set) is a non-convex
problem. It can be formulated in a matrix form that reveals all the non-convexity of the problem
as a single rank-one constraint.
5.2.1.1 The problem of the capacity region for two users
The capacity region $\mathcal{C}$ of the two-user DMAC is the convex hull of the set$^1$ of rate pairs $(R_1, R_2)$
satisfying
$$0 \le R_1 \le I(X_1; Y | X_2) \tag{5.1}$$
$$0 \le R_2 \le I(X_2; Y | X_1) \tag{5.2}$$
$$R_1 + R_2 \le I(X_1 X_2; Y) \tag{5.3}$$
for a distribution of the form $P_{X_1 X_2 Y} = P_{X_1} P_{X_2} P_{Y|X_1 X_2}$ on $\mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y}$, where the input
alphabets can be characterized as
$$\mathcal{X}_k = \{x_k^{(1)}, \ldots, x_k^{(|\mathcal{X}_k|)}\}, \quad k = 1, 2. \tag{5.4}$$
$P_{X_k}$ is the input probability distribution of the k-th user ($k = 1, 2$), and $P_{Y|X_1 X_2}$ is the given
conditional distribution that characterizes the channel. It is well known that $\mathcal{C}$ is a convex
set [Cov06, Thm. 14.3.2] and hence, by applying the supporting hyperplane theorem [Boy04, Sec.
2.5.2], the computation of the capacity region can be parameterized for $\theta \in [0, 1]$ as$^2$
$$\underset{R_1, R_2, P_{X_1}, P_{X_2}}{\text{maximize}} \quad \theta R_1 + (1 - \theta) R_2 \tag{5.5}$$
$$\text{subject to} \quad 0 \le R_1 \le \sum_{x_1, x_2, y} P_{X_1}(x_1) P_{X_2}(x_2) P_{Y|X_1 X_2}(y|x_1 x_2) \log \frac{P_{Y|X_1 X_2}(y|x_1 x_2)}{\sum_{x_1'} P_{X_1}(x_1') P_{Y|X_1 X_2}(y|x_1' x_2)} \tag{5.6}$$
$$0 \le R_2 \le \sum_{x_1, x_2, y} P_{X_1}(x_1) P_{X_2}(x_2) P_{Y|X_1 X_2}(y|x_1 x_2) \log \frac{P_{Y|X_1 X_2}(y|x_1 x_2)}{\sum_{x_2'} P_{X_2}(x_2') P_{Y|X_1 X_2}(y|x_1 x_2')} \tag{5.7}$$
$$R_1 + R_2 \le \sum_{x_1, x_2, y} P_{X_1}(x_1) P_{X_2}(x_2) P_{Y|X_1 X_2}(y|x_1 x_2) \log \frac{P_{Y|X_1 X_2}(y|x_1 x_2)}{\sum_{x_1', x_2'} P_{X_1}(x_1') P_{X_2}(x_2') P_{Y|X_1 X_2}(y|x_1' x_2')} \tag{5.8}$$
$$\sum_{x_k} P_{X_k}(x_k) = 1, \quad P_{X_k}(x_k) \ge 0 \;\; \forall x_k \in \mathcal{X}_k, \quad k = 1, 2, \tag{5.9}$$
1 The convex hull is strictly necessary for convexification of $\mathcal{C}$ since otherwise it may not be convex in general
[Bie79].
2 Unless the logarithm basis is indicated, it can be chosen arbitrarily as long as both sides of the equation have
the same units.
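Figure 5.1: The boundary of $\mathcal{C}$ is obtained by solving (5.5)-(5.9) for each $\theta \in [0, 1]$. [Plot of $\mathcal{C}$ in the $(R_1, R_2)$ plane, showing the normal vector $\mathbf{n}_\theta$ and the boundary point $(R_1^\star(\theta), R_2^\star(\theta))$.]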
Here, the expressions (5.6)-(5.8) correspond to (5.1)-(5.3), respectively, instantiated for the
DMAC. Note that the solutions $P^\star_{X_1}(\theta)$ and $P^\star_{X_2}(\theta)$ generally depend on $\theta$. For each $\theta$, the
problem (5.5)-(5.9) computes the intersection between the contour of the capacity region and
a tangent hyperplane with normal vector $\mathbf{n}_\theta = [\theta, 1 - \theta]^T$, as illustrated in Figure 5.1. Hence,
the capacity region is computed when (5.5)-(5.9) is solved for all $\theta \in [0, 1]$ and the convex hull
of $\{R_1^\star(\theta), R_2^\star(\theta)\}_{\theta \in [0,1]}$ is taken; in other words, the solutions $(R_1^\star(\theta), R_2^\star(\theta))$ are samples of the
boundary of $\mathcal{C}$.
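As an illustration of how one boundary sample can be evaluated once the input distributions are fixed, the following minimal sketch computes the three mutual information bounds of (5.1)-(5.3) for a product input distribution and picks the pentagon vertex that maximizes the weighted sum rate. The function names, the array layout W[i, j, y] for the channel law, and the use of natural logarithms are assumptions made for this example only; the search over input distributions, which is the hard part of the problem, is not addressed.

```python
import numpy as np

def mac_rate_bounds(p1, p2, W):
    """Return (I(X1;Y|X2), I(X2;Y|X1), I(X1 X2;Y)) in nats.

    p1, p2: input probability vectors; W[i, j, y] = P(y | x1 = i, x2 = j).
    """
    P = np.outer(p1, p2)                       # product joint input distribution
    joint = P[:, :, None] * W                  # P(x1, x2, y)
    py = joint.sum(axis=(0, 1))                # P(y)
    py_x2 = np.einsum("i,ijy->jy", p1, W)      # P(y | x2)
    py_x1 = np.einsum("j,ijy->iy", p2, W)      # P(y | x1)
    mask = joint > 0                           # only the support contributes

    def avg_log_ratio(den):
        ratio = np.ones_like(W)
        np.divide(W, den, out=ratio, where=mask)
        return float(np.sum(joint[mask] * np.log(ratio[mask])))

    i1 = avg_log_ratio(np.broadcast_to(py_x2[None, :, :], W.shape))
    i2 = avg_log_ratio(np.broadcast_to(py_x1[:, None, :], W.shape))
    i12 = avg_log_ratio(np.broadcast_to(py[None, None, :], W.shape))
    return i1, i2, i12

def best_rate_pair(theta, p1, p2, W):
    """Vertex of the pentagon (5.1)-(5.3) maximizing theta*R1 + (1-theta)*R2."""
    i1, i2, i12 = mac_rate_bounds(p1, p2, W)
    if theta >= 0.5:                           # user 1 weighs more: R1 at its maximum
        return i1, i12 - i1
    return i12 - i2, i2                        # user 2 weighs more: R2 at its maximum

# Toy deterministic channel with three outputs (output index 2 is produced whenever x1 = 0).
W = np.zeros((2, 2, 3))
W[0, :, 2] = 1.0
W[1, 0, 0] = 1.0
W[1, 1, 1] = 1.0
uniform = np.array([0.5, 0.5])
print(best_rate_pair(0.7, uniform, uniform, W))
```

Sweeping theta over a grid and repeating this evaluation for the optimal input distributions of each theta would produce the boundary samples described above.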
5.2.1.2 A rank-one constrained optimization problem
The problem (5.5)-(5.9) of the computation of the capacity region is non-convex because the
constraints (5.6)-(5.8) are not jointly convex in $P_{X_1}$ and $P_{X_2}$. For instance, the right hand side
of the constraint in (5.8) is not concave (note that it should be concave for the problem to be
convex). To see this, observe that even though $x \log(1/x)$ is concave, the composition with a
linear combination of terms of the form $xy$ is not$^3$. Similar reasoning may be applied to the
constraints (5.6) and (5.7) to obtain again that the lack of convexity follows from the presence
of the product terms $P_{X_1}(x_1) P_{X_2}(x_2)$.
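The loss of concavity caused by the product term can be checked numerically. The sketch below evaluates the analytic Hessian of f(x, y) = xy log(xy) (natural logarithm assumed) at the point (1/√2, 1/√2), chosen here for the check, and inspects its eigenvalues; the helper name is hypothetical.

```python
import numpy as np

def hessian_xy_log_xy(x, y):
    """Analytic Hessian of f(x, y) = x*y*log(x*y), natural logarithm."""
    fxx = y / x                      # d^2 f / dx^2
    fyy = x / y                      # d^2 f / dy^2
    fxy = np.log(x * y) + 2.0        # mixed partial derivative
    return np.array([[fxx, fxy], [fxy, fyy]])

point = 2.0 ** -0.5
eigs = np.linalg.eigvalsh(hessian_xy_log_xy(point, point))
print(eigs)  # eigenvalues in ascending order
```

An indefinite Hessian means that neither f nor its negative, xy log(1/(xy)), is concave around that point, which is the obstruction described in the paragraph above.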
Although the problem (5.5)-(5.9) is not jointly convex in $(P_{X_1}, P_{X_2})$, it is separately convex
in each of the input probability distributions. This would allow us to perform an alternate
optimization procedure: $P_{X_1}^{(0)} \to P_{X_2}^{(0)} \to P_{X_1}^{(1)} \to P_{X_2}^{(1)} \to \ldots$, where $P_{X_k}^{(n)}$ denotes the optimal
solution $P_{X_k}$ at the n-th iteration. However, alternate optimization procedures applied to non-convex problems do not generally converge to global maxima of the cost function, and for
this particular problem they do not yield acceptable results (as can be verified by numerical
simulations).
3 It is sufficient to note that the Hessian of $f(x, y) = xy \log(xy)$ has one positive and one negative eigenvalue
at $(x, y) = (1/\sqrt{2}, 1/\sqrt{2})$.
Interestingly, if we allow the variables X1 and X2 to be dependent with distribution PX1 X2 ,
then problem (5.5)-(5.9) becomes convex (recall that x log(1/x) is a concave function).
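We will now reformulate the problem with a matrix-vector notation. Each of the input
probability distributions $P_{X_k}$ admits a vector representation of the form $\mathbf{p}_k$, where $[\mathbf{p}_k]_i = P_{X_k}(x_k^{(i)})$, $1 \le i \le |\mathcal{X}_k|$, $k = 1, 2$, while the joint distribution admits a matrix representation of
the form $\mathbf{P}$, where $[\mathbf{P}]_{i,j} = P_{X_1 X_2}(x_1^{(i)}, x_2^{(j)})$, $1 \le i \le |\mathcal{X}_1|$, $1 \le j \le |\mathcal{X}_2|$. Then, we define $\mathcal{P}_{\mathrm{prod}}$
as the subset containing all the product distributions $(P_{X_1}, P_{X_2})$ of $X_1$ and $X_2$,
$$\mathcal{P}_{\mathrm{prod}} = \left\{ \mathbf{P} \in \mathbb{R}^{|\mathcal{X}_1| \times |\mathcal{X}_2|} \;\middle|\; [\mathbf{P}]_{i,j} = P_{X_1}(x_1^{(i)}) P_{X_2}(x_2^{(j)}) \text{ for some feasible } (P_{X_1}, P_{X_2}) \text{ on } \mathcal{X}_1 \times \mathcal{X}_2 \right\}. \tag{5.10}$$
For any joint probability matrix $\mathbf{P} \in \mathbb{R}^{|\mathcal{X}_1| \times |\mathcal{X}_2|}$, $\mathbf{P} \in \mathcal{P}_{\mathrm{prod}}$ is equivalent to $\mathrm{rank}(\mathbf{P}) = 1$, and
hence the following simpler equivalent description of $\mathcal{P}_{\mathrm{prod}}$ can be given
$$\mathcal{P}_{\mathrm{prod}} = \left\{ \mathbf{P} \in \mathbb{R}^{|\mathcal{X}_1| \times |\mathcal{X}_2|} \;\middle|\; \mathrm{rank}(\mathbf{P}) = 1, \; \mathbf{P} \ge 0, \; \mathbf{1}^T \mathbf{P} \mathbf{1} = 1 \right\}, \tag{5.11}$$
where $\ge$ denotes component-wise as well as scalar inequality indistinctly and $\mathbf{1}$ is an all-one
column vector of appropriate length. The original problem (5.5)-(5.9) can now be expressed in
an equivalent matrix form formulated in terms of $\mathbf{P} \in \mathcal{P}_{\mathrm{prod}}$, the joint distribution of $X_1$ and
$X_2$, and its marginals $\mathbf{p}_1$ and $\mathbf{p}_2$, making use of expression (5.11) and the fact that
$$\mathbf{P} \mathbf{1} = \mathbf{p}_1, \quad \mathbf{P}^T \mathbf{1} = \mathbf{p}_2, \tag{5.12}$$
i.e., that $P_{X_1}$ and $P_{X_2}$ are the marginal distributions of $P_{X_1 X_2}$. The following reformulation
of the problem is the key point of the identification of (5.5)-(5.9) as a rank-one non-convex
optimization problem.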
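Proposition 5.1 The problem (5.5)-(5.9) of the computation of the capacity region of an arbitrary two-user DMAC is equivalent to the following rank-one non-convex optimization problem
$$\underset{R_1, R_2, \mathbf{P}, \mathbf{p}_1, \mathbf{p}_2}{\text{maximize}} \quad \theta R_1 + (1 - \theta) R_2 \tag{5.13}$$
$$\text{subject to} \quad 0 \le R_1 \le f_1(\mathbf{P}, \mathbf{p}_2) \tag{5.14}$$
$$0 \le R_2 \le f_2(\mathbf{P}, \mathbf{p}_1) \tag{5.15}$$
$$R_1 + R_2 \le f_{12}(\mathbf{P}) \tag{5.16}$$
$$\mathbf{P} \mathbf{1} = \mathbf{p}_1, \quad \mathbf{P}^T \mathbf{1} = \mathbf{p}_2 \tag{5.17}$$
$$\mathbf{P} \ge 0, \quad \mathbf{1}^T \mathbf{P} \mathbf{1} = 1 \tag{5.18}$$
$$\mathrm{rank}(\mathbf{P}) = 1, \tag{5.19}$$
where
$$f_1(\mathbf{P}, \mathbf{p}_2) \triangleq \sum_{i,j,y} [\mathbf{P}]_{i,j} P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)}) \log \frac{P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)}) [\mathbf{p}_2]_j}{\sum_{i'} [\mathbf{P}]_{i',j} P_{Y|X_1 X_2}(y|x_1^{(i')} x_2^{(j)})} \tag{5.20}$$
$$f_2(\mathbf{P}, \mathbf{p}_1) \triangleq \sum_{i,j,y} [\mathbf{P}]_{i,j} P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)}) \log \frac{P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)}) [\mathbf{p}_1]_i}{\sum_{j'} [\mathbf{P}]_{i,j'} P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j')})} \tag{5.21}$$
$$f_{12}(\mathbf{P}) \triangleq \sum_{i,j,y} [\mathbf{P}]_{i,j} P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)}) \log \frac{P_{Y|X_1 X_2}(y|x_1^{(i)} x_2^{(j)})}{\sum_{i',j'} [\mathbf{P}]_{i',j'} P_{Y|X_1 X_2}(y|x_1^{(i')} x_2^{(j')})} \tag{5.22}$$
are concave in $(\mathbf{P}, \mathbf{p}_2)$, $(\mathbf{P}, \mathbf{p}_1)$, and $\mathbf{P}$, respectively.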

Observe that if (5.19) were removed, the resulting problem would be convex.
While (5.18) ensures that P is a feasible probability matrix, (5.19) constrains it to Pprod , and
(5.17) relates P with its marginals. The next result shows that, surprisingly, there is no need
to include (5.17) as long as (p1 , p2 ) are feasible probability vectors on X1 and X2 , respectively.
Lemma 5.1 The optimal solution to the optimization problem (5.13)-(5.19) does not change if
(5.17) is replaced by
$$\mathbf{p}_k \ge 0, \quad \mathbf{1}^T \mathbf{p}_k = 1, \quad k = 1, 2, \tag{5.23}$$
where $(\mathbf{p}_1, \mathbf{p}_2)$ are probability vectors associated to the distributions of $X_1$ and $X_2$ that need not
be the marginals induced by $\mathbf{P}$.
Proof. We use $\tilde{\mathbf{p}}_1$, $\tilde{\mathbf{p}}_2$ to denote the marginals induced by $\mathbf{P}$. Since $(\mathbf{p}_1, \mathbf{p}_2)$ only affect the
functions $f_1$ and $f_2$, it suffices to show that $\tilde{\mathbf{p}}_2$ is the vector distribution that maximizes $f_1$:
$$f_1(\mathbf{P}, \mathbf{p}_2) = f_1(\mathbf{P}, \tilde{\mathbf{p}}_2) - D(\tilde{\mathbf{p}}_2 \| \mathbf{p}_2) \le f_1(\mathbf{P}, \tilde{\mathbf{p}}_2), \tag{5.24}$$
where $D(\mathbf{a} \| \mathbf{b})$ is the Kullback-Leibler divergence between two probability distributions denoted
by $\mathbf{a}$ and $\mathbf{b}$. By symmetry we can deduce that $\tilde{\mathbf{p}}_1$ is the maximizing vector distribution of $f_2$.
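The identity behind (5.24) can be checked numerically on a randomly drawn instance. The sketch below is only an illustration: the function names, the array layout W[i, j, y] for the channel law, the random test data, and the use of natural logarithms are assumptions of this example.

```python
import numpy as np

def f1(P, p2, W):
    """f1(P, p2) as in (5.20), in nats; W[i, j, y] = P(y | x1 = i, x2 = j)."""
    den = np.einsum("ij,ijy->jy", P, W)        # sum over i' of [P]_{i',j} W[i',j,y]
    total = 0.0
    for i, j, y in np.ndindex(*W.shape):
        weight = P[i, j] * W[i, j, y]
        if weight > 0:
            total += weight * np.log(W[i, j, y] * p2[j] / den[j, y])
    return total

def kl(a, b):
    """Kullback-Leibler divergence D(a||b), assuming b > 0 wherever a > 0."""
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

rng = np.random.default_rng(0)
P = rng.random((3, 2))
P /= P.sum()                                   # arbitrary joint input distribution
W = rng.random((3, 2, 4))
W /= W.sum(axis=2, keepdims=True)              # arbitrary channel law
p2_marginal = P.sum(axis=0)                    # marginal of X2 induced by P
p2_other = rng.random(2)
p2_other /= p2_other.sum()                     # any other probability vector

gap = f1(P, p2_marginal, W) - f1(P, p2_other, W)
print(gap, kl(p2_marginal, p2_other))
```

The printed gap coincides with the divergence between the induced marginal and the alternative vector, so replacing the marginal by any other probability vector can only decrease f1.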
Remark 5.1 Note that (5.13)-(5.19) can be extended to include cost constraints. Let $E_1$ and
$E_2$ be the average expense requirements of user 1 and 2, respectively. By specifying the vectors
$\mathbf{e}_k \in \mathbb{R}_+^{|\mathcal{X}_k|}$, where $[\mathbf{e}_k]_i$ is the expense of using $x_k^{(i)}$, the i-th letter of user k, the capacity region
at expenses $(E_1, E_2)$, $\mathcal{C}(E_1, E_2)$, can be computed by introducing the constraints
$$\mathbf{e}_1^T \mathbf{p}_1 \le E_1, \quad \mathbf{e}_2^T \mathbf{p}_2 \le E_2. \tag{5.25}$$
5.2.1.3 Extension to K users
The formulation of the computation of the capacity region as a rank-one constrained problem
introduced in Section 5.2.1.2 for two users can be extended to the K-user case. Using similar
equivalences but now involving tensors [RT05] of dimensionality up to K, the rank-one constraint
applies to any number of users.
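The capacity region $\mathcal{C}$ of the K-user DMAC is the convex hull of the set of rate tuples
$(R_1, \ldots, R_K)$ satisfying
$$0 \le R_{\mathcal{S}} < I(X_{(\mathcal{S})}; Y | X_{(\mathcal{S}^c)}), \quad \forall \mathcal{S} \subseteq \{1, 2, \ldots, K\} \tag{5.26}$$
for a distribution of the form $P_{X_1 \ldots X_K Y} = P_{X_1} \cdots P_{X_K} P_{Y|X_1 \ldots X_K}$ on $\mathcal{X}_1 \times \cdots \times \mathcal{X}_K \times \mathcal{Y}$.
We denote by $\mathcal{S}^c = \{1, 2, \ldots, K\} \setminus \mathcal{S}$ the complement set of $\mathcal{S}$, $X_{(\mathcal{S})} = \{X_k : k \in \mathcal{S}\}$, and
$R_{\mathcal{S}} = \sum_{k \in \mathcal{S}} R_k$. By defining $\mathcal{N} \triangleq \{1, 2, \ldots, K\}$, the computation of the capacity region can be
parameterized as
$$\underset{\{R_k\}, \{P_{X_{(\mathcal{S})}}\}_{\mathcal{S} \subseteq \mathcal{N}}}{\text{maximize}} \quad \sum_{k=1}^{K} \theta_k R_k \tag{5.27}$$
$$\text{subject to} \quad 0 \le R_{\mathcal{S}} \le \sum_{x_{(\mathcal{N})}, y} P_{X_{(\mathcal{N})}}(x_{(\mathcal{N})}) P_{Y|X_{(\mathcal{N})}}(y|x_{(\mathcal{N})}) \log \frac{P_{Y|X_{(\mathcal{N})}}(y|x_{(\mathcal{N})})}{\sum_{x'_{(\mathcal{S})}} P_{X_{(\mathcal{S})}}(x'_{(\mathcal{S})}) P_{Y|X_{(\mathcal{N})}}(y|x'_{(\mathcal{S})}, x_{(\mathcal{S}^c)})} \tag{5.28}$$
$$P_{X_{(\mathcal{S})}}(x_{(\mathcal{S})}) = \prod_{k \in \mathcal{S}} P_{X_k}(x_k), \quad \forall \mathcal{S} \subseteq \mathcal{N} \tag{5.29}$$
$$P_{X_k}(x_k) \ge 0 \;\; \forall x_k \in \mathcal{X}_k, \quad \sum_{x_k} P_{X_k}(x_k) = 1, \quad \forall k \in \mathcal{N}. \tag{5.30}$$
The solution to (5.27)-(5.30) for any given $\boldsymbol{\theta} = [\theta_1 \ldots \theta_K]^T$ such that $\boldsymbol{\theta} \ge 0$ and $\mathbf{1}^T \boldsymbol{\theta} = 1$ is a
point $(R_1^\star(\boldsymbol{\theta}), \ldots, R_K^\star(\boldsymbol{\theta}))$ of the boundary of the capacity region $\mathcal{C}$.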
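Similarly to what happened in the two-user case, the problem (5.27)-(5.30) is not convex
because the constraints in (5.29) are not jointly convex in $P_{X_1}, \ldots, P_{X_K}$. However, (5.27)-(5.30)
can also be reformulated as a rank-one non-convex optimization problem if we allow
the variables $X_1, \ldots, X_K$ to be dependent with distribution $P_{X_{(\mathcal{N})}}$. In doing so, it is useful to extend the matrix-vector notation of Section 5.2.1.2 by using the tensor $\mathbf{P}_{(\mathcal{S})}$ to denote
$P_{X_{(\mathcal{S})}}$, where $[\mathbf{P}_{(\mathcal{S})}]_{i_1, i_2, \ldots, i_{|\mathcal{S}|}} = P_{X_{(\mathcal{S})}}\big(x_{k_1}^{(i_1)}, x_{k_2}^{(i_2)}, \ldots, x_{k_{|\mathcal{S}|}}^{(i_{|\mathcal{S}|})}\big)$, $1 \le i_j \le |\mathcal{X}_{k_j}|$, $1 \le j \le |\mathcal{S}|$,
$\mathcal{S} = \{k_1, k_2, \ldots, k_{|\mathcal{S}|}\} \subseteq \mathcal{N}$. $P_{X_{(\mathcal{S})}}$ is the marginalization of $P_{X_{(\mathcal{N})}}$ onto the set of user codewords $X_{(\mathcal{S})}$, i.e.,
$$P_{X_{(\mathcal{S})}}(x_{(\mathcal{S})}) = \sum_{x_{(\mathcal{S}^c)}} P_{X_{(\mathcal{N})}}(x_{(\mathcal{N})}) \tag{5.31}$$
or, equivalently in tensor notation,
$$\sum_{i_{(\mathcal{S}^c)}} \mathbf{P}_{(\mathcal{N})} = \mathbf{P}_{(\mathcal{S})}. \tag{5.32}$$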
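Proposition 5.2 The problem (5.27)-(5.30) of the computation of the capacity region of an arbitrary K-user DMAC is equivalent to the following rank-one$^4$ non-convex optimization problem:
$$\underset{\{R_k\}, \{\mathbf{P}_{(\mathcal{S})}\}_{\mathcal{S} \subseteq \mathcal{N}}}{\text{maximize}} \quad \sum_{k=1}^{K} \theta_k R_k \tag{5.34}$$
$$\text{subject to} \quad 0 \le R_{\mathcal{S}} \le f_{\mathcal{S}}(\mathbf{P}_{(\mathcal{N})}, \mathbf{P}_{(\mathcal{S}^c)}), \quad \forall \mathcal{S} \subseteq \mathcal{N} \tag{5.35}$$
$$\sum_{i_{(\mathcal{S}^c)}} \mathbf{P}_{(\mathcal{N})} = \mathbf{P}_{(\mathcal{S})}, \quad \forall \mathcal{S} \subseteq \mathcal{N} \tag{5.36}$$
$$\mathbf{P}_{(\mathcal{N})} \ge 0, \quad \sum_{i_{(\mathcal{N})}} \mathbf{P}_{(\mathcal{N})} = 1 \tag{5.37}$$
$$\mathrm{rank}(\mathbf{P}_{(\mathcal{N})}) = 1, \tag{5.38}$$
where the functions
$$f_{\mathcal{S}}(\mathbf{P}_{(\mathcal{N})}, \mathbf{P}_{(\mathcal{S}^c)}) = \sum_{i_{(\mathcal{N})}, y} [\mathbf{P}_{(\mathcal{N})}]_{i_{(\mathcal{N})}} P_{Y|X_{(\mathcal{N})}}\big(y|x_{(\mathcal{N})}^{(i_{(\mathcal{N})})}\big) \log \frac{P_{Y|X_{(\mathcal{N})}}\big(y|x_{(\mathcal{N})}^{(i_{(\mathcal{N})})}\big) [\mathbf{P}_{(\mathcal{S}^c)}]_{i_{(\mathcal{S}^c)}}}{\sum_{i'_{(\mathcal{S})}} [\mathbf{P}_{(\mathcal{N})}]_{i'_{(\mathcal{S})}, i_{(\mathcal{S}^c)}} P_{Y|X_{(\mathcal{N})}}\big(y|x_{(\mathcal{S})}^{(i'_{(\mathcal{S})})}, x_{(\mathcal{S}^c)}^{(i_{(\mathcal{S}^c)})}\big)} \tag{5.39}$$
are concave in $(\mathbf{P}_{(\mathcal{N})}, \mathbf{P}_{(\mathcal{S}^c)})$.
Proof. This follows from extending the definition of $\mathcal{P}_{\mathrm{prod}}$ (5.11) to suit the K-dimensional
tensor $\mathbf{P}_{(\mathcal{N})}$ and noticing that the functions $f_{\mathcal{S}}$ are the generalizations of $f_1$ (5.20), $f_2$ (5.21),
and $f_{12}$ (5.22) to the K-user case. Hence, concavity of $f_{\mathcal{S}}$ can be shown by the same arguments
as in the proof of Proposition 5.1.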
5.2.2 Relaxation methods
Proposition 5.1 identified all the non-convexity of the problem of the computation of the capacity
region of an arbitrary DMAC as the rank-one constraint rank(P) = 1 in (5.19). Rank-one constrained optimization problems, even with linear matrix inequalities, are non-convex problems
that cannot be solved optimally in polynomial time. Thus, we first choose to relax (5.13)-(5.19)
by removing the rank-one constraint (5.19) to obtain a tractable, convex problem equivalent to
solving the capacity region of a DMAC with arbitrarily dependent codewords (full transmitter
cooperation). We will denote by Ro the outer bound on the true capacity region obtained with
the relaxed problem (5.13)-(5.18).
If the optimal solution to the relaxed problem happens to be rank one, then it will also be
an optimal solution to the original problem. Otherwise, this solution has to be projected onto
$\mathcal{P}_{\mathrm{prod}}$ to obtain a candidate solution (not necessarily optimal). Many different approaches can
be applied to approximate an arbitrary joint distribution $\mathbf{P}$ by a reduced rank distribution of
the form $\mathbf{q}_1 \mathbf{q}_2^T$; here we shall explore two of them: randomization and marginalization.
4 The K-dimensional tensor $\mathbf{P}_{(\mathcal{N})} \in \mathbb{R}_+^{|\mathcal{X}_1| \times \cdots \times |\mathcal{X}_K|}$ has rank one if and only if it can be written as
$$\mathbf{P}_{(\mathcal{N})} = \mathbf{p}_1 \circ \mathbf{p}_2 \circ \cdots \circ \mathbf{p}_K, \tag{5.33}$$
where $\circ$ denotes outer product and the vectors $\mathbf{p}_k \in \mathbb{R}_+^{|\mathcal{X}_k|}$ admit a similar equivalence to that of Section 5.2.1.2.
Again, $\mathrm{rank}(\mathbf{P}_{(\mathcal{N})}) = 1$ is equivalent to imposing $P_{X_1 \ldots X_K} = P_{X_1} \cdots P_{X_K}$.
5.2.2.1 Randomization
A randomization approach generates random samples of candidate probability vectors $(\mathbf{q}_1, \mathbf{q}_2)$
such that $E\{\mathbf{q}_1 \mathbf{q}_2^T\} = \mathbf{P}$, thus approximating the solution to the relaxed problem in the mean
sense. For each of the generated pairs, the achievable rates (5.5)-(5.9) are evaluated and the pair
of distributions yielding the largest objective value is kept as the solution. This is equivalent to
performing a random search on the original problem with guidance on the correlation matrix of
the distributions under test taken from the relaxed problem.
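The nature of the random pairs $(\mathbf{q}_1, \mathbf{q}_2)$, which are input probability distributions, prevents
us from easily finding statistics for generating them yielding $E\{\mathbf{q}_1 \mathbf{q}_2^T\} = \mathbf{P}$ directly. Instead, we
first choose to approximate $\mathbf{P}$ by the convex combination of n rank-one distributions under a
minimum divergence criterion, i.e.,
$$\mathbf{P} \approx \sum_{i=1}^{n} \lambda_i \mathbf{a}_i \mathbf{b}_i^T, \tag{5.40}$$
where n is fixed, and
$$\{\lambda_i, \mathbf{a}_i, \mathbf{b}_i\} = \arg \min_{\{\lambda_i, \mathbf{a}_i, \mathbf{b}_i\}} \; D\Big(\mathbf{P} \,\Big\|\, \sum_{i=1}^{n} \lambda_i \mathbf{a}_i \mathbf{b}_i^T\Big) \tag{5.41}$$
$$\text{subject to} \quad \sum_{i=1}^{n} \lambda_i = 1, \quad \lambda_i \ge 0, \quad 1 \le i \le n \tag{5.42}$$
$$\mathbf{1}^T \mathbf{a}_i = 1, \quad \mathbf{a}_i \ge 0, \quad 1 \le i \le n \tag{5.43}$$
$$\mathbf{1}^T \mathbf{b}_i = 1, \quad \mathbf{b}_i \ge 0, \quad 1 \le i \le n. \tag{5.44}$$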
The problem (5.41)-(5.44) is not jointly convex in $\{\lambda_i, \mathbf{a}_i, \mathbf{b}_i\}$ but is separately convex in $\{\lambda_i\}$,
$\{\mathbf{a}_i\}$, and $\{\mathbf{b}_i\}$. Thus, a practical approximation as in (5.40) can be obtained through an
alternate optimization $\{\lambda_i^{(0)}\} \to \{\mathbf{a}_i^{(0)}\} \to \{\mathbf{b}_i^{(0)}\} \to \{\lambda_i^{(1)}\} \to \ldots$ until convergence is achieved
($(\cdot)^{(r)}$ denotes value at the r-th iteration).
Second, given that N random samples $(\mathbf{q}_1, \mathbf{q}_2)$ must be generated, the approximation in the
mean sense $E\{\mathbf{q}_1 \mathbf{q}_2^T\} = \mathbf{P}$ can be achieved using (5.40) by drawing, for $i = 1, \ldots, n$, $N \lambda_i$ pairs from
a pair of independent distributions such that $E\{\mathbf{q}_1\} = \mathbf{a}_i$, $E\{\mathbf{q}_2\} = \mathbf{b}_i$ (because of statistical
independence, this will imply $E\{\mathbf{q}_1 \mathbf{q}_2^T\} = \mathbf{a}_i \mathbf{b}_i^T$). To generate a random vector $\mathbf{q}$ with prescribed
average $E\{\mathbf{q}\} = \bar{\mathbf{q}}$ we use a distribution whose support is the largest sphere centered at $\bar{\mathbf{q}}$
such that all its boundary lies within the probability simplex, as illustrated in Figure 5.2 for the
3-dimensional case. While the radius r of such sphere can be analytically determined resorting
to the point-line and point-plane distance formulas, its expression is omitted here for the sake of
brevity. As for the distribution of $\mathbf{q}$, we choose to project a random vector $\mathbf{x}$ drawn uniformly
over $\{\mathbf{x} \in \mathbb{R}^{|\mathcal{X}| \times 1} : \|\mathbf{x} - \bar{\mathbf{q}}\|_2^2 \le r^2\}^5$ onto the probability simplex$^6$, i.e.,
$$\mathbf{q} = \mathbf{x} + \frac{1 - \mathbf{1}^T \mathbf{x}}{|\mathcal{X}|} \mathbf{1}, \tag{5.45}$$
which results in a circularly symmetric distribution with average $\bar{\mathbf{q}}$.
Figure 5.2: The support of the randomly generated probability distributions $\mathbf{q}$ is the largest
circle centered at $E\{\mathbf{q}\} = \bar{\mathbf{q}}$ that fits within the probability simplex. [Sketch of the 3-dimensional probability simplex with axes $[\mathbf{q}]_1$, $[\mathbf{q}]_2$, $[\mathbf{q}]_3$ and a circle of radius r centered at $\bar{\mathbf{q}}$.]
The evaluation of the achievable rates (5.5)-(5.9) yields the randomization inner bound, $\mathcal{R}^i_{\mathrm{rand}}$
(see Algorithm 5.1 for a pseudocode description).
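The sampling step can be sketched as follows: a point is drawn uniformly in a ball around the prescribed mean by rejection from the enclosing hypercube and then shifted along the all-ones direction so that its entries sum to one, as in (5.45). The function names, the example mean vector and the radius value are assumptions of this sketch; in particular, the radius is simply taken small enough for the ball to stay inside the simplex, since the closed-form radius is not reproduced in the text.

```python
import numpy as np

def sample_in_ball(center, r, rng):
    """Uniform sample in the ball of radius r around `center` (rejection from the hypercube)."""
    while True:
        x = center + rng.uniform(-r, r, size=center.shape)
        if np.sum((x - center) ** 2) <= r ** 2:
            return x

def project_to_sum_one(x):
    """Projection (5.45): move x along the all-ones direction until its entries sum to one."""
    return x + (1.0 - x.sum()) / x.size

rng = np.random.default_rng(0)
q_bar = np.array([0.2, 0.3, 0.5])     # prescribed mean, itself a probability vector
r = 0.05                              # assumed radius, small enough to keep all entries positive
samples = np.array(
    [project_to_sum_one(sample_in_ball(q_bar, r, rng)) for _ in range(2000)]
)
print(samples.mean(axis=0))
```

Because the ball is symmetric around the mean and the projection is affine, the empirical average of the generated vectors approaches the prescribed mean as more samples are drawn.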
5.2.2.2 Marginalization
While the numerical performance of randomization is good, it is at the price of generating many
potential solutions at random. Therefore, it is desirable to explore other simpler (deterministic)
methods retaining most of its accuracy but also allowing for performance analysis.
To this end, we adopt a projection criterion based on the Kullback-Leibler divergence $D(\cdot \| \cdot)$.
It has been used in [Cho68] and [Ku69] as the criterion for approximating joint discrete probability distributions given a dependence tree, although here the purpose is different. The use
of the information divergence as the measure that quantifies the quality of the approximation
offers several advantages, the most useful one being that, for some fixed $\mathbf{P}$ (with marginals $\tilde{\mathbf{p}}_1$
and $\tilde{\mathbf{p}}_2$), the pair $(\mathbf{p}_1, \mathbf{p}_2)$ that minimizes $D(\mathbf{P} \| \mathbf{p}_1 \mathbf{p}_2^T)$ follows easily from
$$D(\mathbf{P} \| \mathbf{p}_1 \mathbf{p}_2^T) = D(\mathbf{P} \| \tilde{\mathbf{p}}_1 \tilde{\mathbf{p}}_2^T) + D(\tilde{\mathbf{p}}_1 \| \mathbf{p}_1) + D(\tilde{\mathbf{p}}_2 \| \mathbf{p}_2) \ge D(\mathbf{P} \| \tilde{\mathbf{p}}_1 \tilde{\mathbf{p}}_2^T), \tag{5.46}$$
which shows that $(\mathbf{p}_1^\star, \mathbf{p}_2^\star) = (\tilde{\mathbf{p}}_1, \tilde{\mathbf{p}}_2)$. Therefore, marginalization is the solution to the minimum
divergence criterion$^8$. In order to obtain an approximation of $\mathcal{C}$, it is sufficient to solve (5.13)-(5.18)
(or its equivalent form given in Lemma 5.1), take the marginal distributions of the solution,
plug them into (5.5)-(5.9), and evaluate $(R_1, R_2)$ (the problem (5.5)-(5.9) is convex for fixed
input probability distributions). The solution to (5.5)-(5.9) in terms of $(R_1, R_2)$ defines the
marginalization inner bound $\mathcal{R}^i_{\mathrm{marg}}$ (see Algorithm 5.2).
5 To generate a vector $\mathbf{x}$ with uniform distribution on the sphere, we generate random vectors drawn uniformly
on the hypercube $\big[[\bar{\mathbf{q}}]_1 - r, [\bar{\mathbf{q}}]_1 + r\big] \times \cdots \times \big[[\bar{\mathbf{q}}]_{|\mathcal{X}|} - r, [\bar{\mathbf{q}}]_{|\mathcal{X}|} + r\big]$ until the condition $\|\mathbf{x} - \bar{\mathbf{q}}\|_2^2 \le r^2$ is satisfied.
6 The projection onto a simplex usually has a water-filling form [Pal05], but, since by construction $\mathbf{x}$ belongs
to the non-negative orthant, it reduces to (5.45), where the water level has been analytically found.
Algorithm 5.1 Randomization
1: for each value of $\theta \in [0, 1]$ do
2:   Solve (5.13)-(5.18): $(R_1^o(\theta), R_2^o(\theta)) = (R_1^\star, R_2^\star)$ and $\mathbf{P}(\theta) = \mathbf{P}^\star$.
3:   Approximate $\mathbf{P}(\theta)$ using (5.40) for some specified n: $\{\lambda_i, \mathbf{a}_i, \mathbf{b}_i\}_{i=1}^{n}$.
4:   for $i = 1 \ldots n$ do
5:     for $j = 1 \ldots N \lambda_i$ do
6:       Generate a random pair $(\mathbf{q}_1, \mathbf{q}_2)$ according to (5.45) such that $E\{\mathbf{q}_1\} = \mathbf{a}_i$, $E\{\mathbf{q}_2\} = \mathbf{b}_i$.
7:       Evaluate (5.5)-(5.9) using $(\mathbf{q}_1, \mathbf{q}_2)$: $(R_1^{(i,j)}, R_2^{(i,j)})$.
8:     end for
9:   end for
10:   Choose the best pair: $(R^i_{\mathrm{rand},1}(\theta), R^i_{\mathrm{rand},2}(\theta)) = \arg \max_{i,j} \; \theta R_1^{(i,j)} + (1 - \theta) R_2^{(i,j)}$.
11: end for
12: Randomization inner bound: $\mathcal{R}^i_{\mathrm{rand}} = \mathrm{Co}(\{(R^i_{\mathrm{rand},1}(\theta), R^i_{\mathrm{rand},2}(\theta)), \forall \theta\})$.
13: Outer bound: $\mathcal{R}^o = \mathrm{Co}(\{(R_1^o(\theta), R_2^o(\theta)), \forall \theta\})$.
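Algorithm 5.2 Marginalization
1: for each value of $\theta \in [0, 1]$ do
2:   Solve (5.13)-(5.18): $(R_1^o(\theta), R_2^o(\theta)) = (R_1^\star, R_2^\star)$ and $(\mathbf{p}_1(\theta), \mathbf{p}_2(\theta)) = (\mathbf{p}_1^\star, \mathbf{p}_2^\star)$.
3:   Evaluate (5.5)-(5.9) in terms of $(R_1, R_2)$ for fixed distributions $(\mathbf{p}_1(\theta), \mathbf{p}_2(\theta))$: $(R^i_{\mathrm{marg},1}(\theta), R^i_{\mathrm{marg},2}(\theta)) = (R_1^\star, R_2^\star)$.
4: end for
5: Marginalization inner bound: $\mathcal{R}^i_{\mathrm{marg}} = \mathrm{Co}(\{(R^i_{\mathrm{marg},1}(\theta), R^i_{\mathrm{marg},2}(\theta)), \forall \theta\})$.
6: Outer bound: $\mathcal{R}^o = \mathrm{Co}(\{(R_1^o(\theta), R_2^o(\theta)), \forall \theta\})$.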
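The projection step of marginalization can be illustrated with a short numerical check of the divergence decomposition (5.46). In this sketch a randomly drawn joint probability matrix stands in for the solution of the relaxed problem (solving the relaxed problem itself is not shown), and all function and variable names are assumptions made for the example.

```python
import numpy as np

def kl(a, b):
    """Kullback-Leibler divergence D(a||b) in nats for arrays of equal shape."""
    a, b = np.ravel(a), np.ravel(b)
    mask = a > 0
    return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

def marginalize(P):
    """Return the marginals (P 1, P^T 1) of a joint probability matrix P."""
    return P.sum(axis=1), P.sum(axis=0)

rng = np.random.default_rng(1)
P = rng.random((3, 4))
P /= P.sum()                                   # stand-in for a relaxed (non rank-one) solution
m1, m2 = marginalize(P)                        # candidate input distributions

p1 = rng.random(3)
p1 /= p1.sum()                                 # any competing product distribution
p2 = rng.random(4)
p2 /= p2.sum()

lhs = kl(P, np.outer(p1, p2))
rhs = kl(P, np.outer(m1, m2)) + kl(m1, p1) + kl(m2, p2)
print(lhs, rhs)
```

Both sides agree, and the divergence to the product of marginals is never larger than the divergence to any other product distribution, which is why the marginals are the minimum divergence projection.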
5.2.2.3 Remarks
Remark 5.2 The outer bound $\mathcal{R}^o$ can be further tightened by applying the algorithm of [Rez04]
for the exact computation of the sum-capacity, denoted by $C^{\mathrm{sum}}$. In particular, a tighter outer
bound is $\mathcal{R}^o \cap \mathcal{C}^{\mathrm{sum}}$, where
$$\mathcal{C}^{\mathrm{sum}} = \{(R_1, R_2) \in \mathbb{R}^2 \,|\, R_1 + R_2 \le C^{\mathrm{sum}}\}. \tag{5.47}$$
8 Another deterministic strategy is to use the singular value decomposition (SVD) of $\mathbf{P}$ to choose $\mathbf{q}_1$ and
$\mathbf{q}_2$ as the suitably normalized left and right singular vectors associated to the largest singular value. However,
numerical simulations of this method performed over various channels have shown that it is outperformed by
marginalization.
Remark 5.3 Both methods apply to the cost constrained case by adding the constraints in (5.25)
to (5.13)-(5.18) when solving the relaxed problem. As for randomization, cost constraints have
to be included also in (5.41)-(5.44) and the sphere radius r has to be chosen so that the sphere lies within
the feasible set.
5.2.3 Performance analysis of marginalization
5.2.3.1 Analytical results
There exists a class of channels for which marginalization computes optimally the capacity
region. Although we have not been able to fully characterize this class analytically, we have
been able to show that some specific channels belong to it. For some channels, this can be
proved by showing $\mathcal{R}^i_{\mathrm{marg}} = \mathcal{R}^o = \mathcal{C}$, while for others $\mathcal{R}^i_{\mathrm{marg}} = \mathcal{C} \subset \mathcal{R}^o$. We restrict our attention
to the two-user case for the sake of simplicity of the expressions.
It is worth pointing out that $\mathcal{R}^o$ is also an outer bound of the capacity region of the two-user
DMAC with feedback (see [Oza84, Sec. IV]), and it follows that the class of channels for which
$\mathcal{R}^i_{\mathrm{marg}} = \mathcal{R}^o$ is a subset of the class of DMACs for which feedback does not increase the capacity
region. Although the capacity region of the discrete memoryless MAC with feedback is not
known in general (it is known in the continuous memoryless Gaussian (scalar) case [Oza84]),
there are some achievability results [Bro05, Cov81a] and a class of DMACs for which the achievable region of [Cov81a] is tight (see [Wil82]). Let us start first with some optimality conditions
that will be key for obtaining subsequent results.
Lemma 5.2 A sufficient condition for optimality of the joint probability distribution $P_{X_1 X_2}$,
defined on $\mathcal{X}_1 \times \mathcal{X}_2$, with respect to the relaxed problem (5.13)-(5.18) for any fixed $(\theta_1, \theta_2) = (\theta, 1 - \theta)$, assuming $\theta_2 \ge \theta_1$, is$^9$
$$\theta_1 D(P_{Y|X_1=x_1, X_2=x_2} \| P_Y) + (\theta_2 - \theta_1) D(P_{Y|X_1=x_1, X_2=x_2} \| P_{Y|X_1=x_1}) \begin{cases} = L_o(\theta_1, \theta_2) & \text{if } P_{X_1 X_2}(x_1, x_2) > 0 \\ \le L_o(\theta_1, \theta_2) & \text{if } P_{X_1 X_2}(x_1, x_2) = 0 \end{cases} \tag{5.48}$$
for all $(x_1, x_2) \in (\mathcal{X}_1, \mathcal{X}_2)$ and some $L_o(\theta_1, \theta_2) \ge 0$, and
$$I(X_1; Y | X_2) + I(X_2; Y | X_1) \ge I(X_1 X_2; Y). \tag{5.49}$$
If (5.48) is satisfied by some joint distribution then it becomes also a necessary optimality condition with respect to the relaxed problem (5.13)-(5.18) and, since $L_o(\theta_1, \theta_2)$ is precisely its
objective value, it follows
$$L_o(\theta_1, \theta_2) = R_o^\star(\theta_1, \theta_2) = \max_{(R_1, R_2) \in \mathcal{R}^o} \theta_1 R_1 + \theta_2 R_2. \tag{5.50}$$
Proof. See Appendix 5.B.
9 If $\theta_2 \le \theta_1$ the user indexes of $\theta$ and $X$ in (5.48) or (5.51) must be swapped.
Lemma 5.3 A sufficient condition for optimality of the input probability distributions $P_{X_1}$ and
$P_{X_2}$, defined on $\mathcal{X}_1$ and $\mathcal{X}_2$, with respect to the capacity region for any fixed $(\theta_1, \theta_2) = (\theta, 1 - \theta)$,
assuming $\theta_2 \ge \theta_1$, is$^9$
$$\theta_1 D(P_{Y|X_1=x_1, X_2=x_2} \| P_Y) + (\theta_2 - \theta_1) D(P_{Y|X_1=x_1, X_2=x_2} \| P_{Y|X_1=x_1}) \begin{cases} = L(\theta_1, \theta_2) & \text{if } P_{X_1}(x_1) P_{X_2}(x_2) > 0 \\ \le L(\theta_1, \theta_2) & \text{if } P_{X_1}(x_1) P_{X_2}(x_2) = 0 \end{cases}, \tag{5.51}$$
for all $x_k \in \mathcal{X}_k$, $k = 1, 2$. If (5.51) is satisfied for some pair of input probability distributions then
it becomes also a necessary and sufficient condition for optimality with respect to the capacity
region and to the relaxed problem and, since $L(\theta_1, \theta_2)$ is precisely its objective value, it follows
$$L(\theta_1, \theta_2) = C^\star(\theta_1, \theta_2) = \max_{(R_1, R_2) \in \mathcal{C}} \theta_1 R_1 + \theta_2 R_2. \tag{5.52}$$
Proof. Lemma 5.3 follows from the particularization of Lemma 5.2 to a product distribution of
the form $P_{X_1 X_2} = P_{X_1} P_{X_2}$, which always satisfies (5.49). Since such product distribution is a
solution to the relaxed problem (5.13)-(5.18), it is capacity-achieving and hence optimal.
Corollary 5.1 $\mathcal{R}^o = \mathcal{C}$ if and only if for each $(\theta_1, \theta_2)$ there exists at least one pair of distributions satisfying the conditions in Lemma 5.3.
Proof. The if part is proved by noticing that existence of input distributions satisfying Lemma
5.3 for all $(\theta_1, \theta_2)$ is equivalent to $R_o^\star(\theta_1, \theta_2) = C^\star(\theta_1, \theta_2)$ $\forall (\theta_1, \theta_2)$, which implies $\mathcal{R}^o = \mathcal{C}$, since
both $\mathcal{C}$ and $\mathcal{R}^o$ are convex sets. The only if part follows from the fact that $\mathcal{R}^o = \mathcal{C}$ implies
that for each $(\theta_1, \theta_2)$ there must exist at least one product distribution which is optimal with
respect to the relaxed problem, and hence satisfies Lemma 5.3.
Corollary 5.2 $\mathcal{R}^i_{\mathrm{marg}} = \mathcal{C}$ if for each $(\theta_1, \theta_2)$ there exist one or more joint distributions satisfying Lemma 5.2, all of them with product distributions induced by their marginals satisfying
Lemma 5.3.
Proof. Taking into account the fact that the relaxed problem (5.13)-(5.18) may have multiple
solutions, not all of them product distributions, Corollary 5.2 implies that the inner bound $\mathcal{R}^i_{\mathrm{marg}}$
of the capacity region provided by the relaxation method in Algorithm 5.2 is obtained with
capacity-achieving product distributions for all $(\theta_1, \theta_2)$.
Corollary 5.3 $\mathcal{R}^i_{\mathrm{marg}} = \mathcal{C} = \mathcal{R}^o$ if and only if for each $(\theta_1, \theta_2)$ there exists at least one pair of
distributions satisfying the conditions in Lemma 5.3 and all the joint (non-product) distributions
satisfying Lemma 5.2 (if any such distributions exist) have product distributions induced by their
marginals satisfying Lemma 5.3.
Proof. This follows from Corollaries 5.1 and 5.2.
5.2.3.2 Applications
The previous results describe the conditions under which the bounds provided by the relaxation
method in Algorithm 5.2 are tight$^{10}$, and provide us with a test for quantifying optimality of
the marginalization approach. This test can be performed either analytically through any of the
corollaries, or numerically (by checking if the product distribution that achieves $\mathcal{R}^i_{\mathrm{marg}}$ satisfies
the conditions in Lemma 5.3, for example). In addition, they can also be used as a tool for
deriving new capacity results or for re-deriving existing ones, as will be illustrated with several
examples.
Example 1: The binary switching MAC
The binary switching MAC (BS-MAC) is a
binary-input ternary-output ($\mathcal{Y} = \{0, 1, \infty\}$) deterministic multiple-access channel whose input-output relationship is
$$Y = X_2 / X_1 \triangleq \begin{cases} 0 & \text{if } (X_1, X_2) = (1, 0) \\ 1 & \text{if } (X_1, X_2) = (1, 1) \\ \infty & \text{if } X_1 = 0 \end{cases}. \tag{5.53}$$
Note that, given the channel output $Y$, the value of $X_1$ is always decoded without ambiguity
($H(X_1 | Y) = 0$). On the other hand, the information carried by $X_2$ can only be conveyed when
$X_1 = 1$, since otherwise $Y = \infty$ independently of $X_2$. Therefore, a hierarchy is established
among senders: sender 1 can always convey information to $Y$ through the control of a switch
which is also responsible for allowing sender 2 to effectively transmit information to $Y$.
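10
Note that the inner and outer bounds are always tight for $\theta = 0$ and $\theta = 1$.
Proposition 5.3 (Vanroose [Van86], Vinck [Vin85a]) The capacity region $\mathcal{C}$ of the BS-MAC is given by
\begin{align}
\mathcal{C} = \big\{ (R_1, R_2) \in \mathbb{R}^2 \;\big|\; & \{0 \le R_1 \le \log 2,\ 0 \le R_2 \le 0.5 \log 2\} \tag{5.54} \\
& \cup\ \{R_1 \le h(R_2/\log 2),\ 0.5 \log 2 < R_2 \le \log 2\} \big\}, \tag{5.55}
\end{align}
where $h(x) \triangleq -x \log x - (1-x)\log(1-x)$ is the binary entropy function. The point
\[
(R_1^\star(\theta), R_2^\star(\theta)) = \arg\max_{(R_1,R_2)\in\mathcal{C}} \theta R_1 + (1-\theta) R_2, \tag{5.56}
\]
belonging to the boundary of the capacity region, is achieved by the input probability distributions
\[
P^\star_{X_1}(1;\theta) = p(\theta) = 1 - P^\star_{X_1}(0;\theta), \qquad P^\star_{X_2}(0;\theta) = P^\star_{X_2}(1;\theta) = 1/2, \tag{5.57}
\]
where
\[
p(\theta) \triangleq \left(1 + 2^{(1-1/\theta)}\right)^{-1} \tag{5.58}
\]
and the weight $\theta \in [0, 1]$.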
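The closed form (5.58) can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the weight of $R_1$ is written `theta`, that rates are measured in nats, and that with uniform $X_2$ the BS-MAC rates are $R_1 = h(p)$ and $R_2 = p \log 2$, where $p = P_{X_1}(1)$. The function names are illustrative only.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy h(x) in nats, with the convention 0 log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def p_opt(theta: float) -> float:
    """Closed form (5.58): p(theta) = 1 / (1 + 2^(1 - 1/theta))."""
    if theta <= 0.0:
        return 1.0  # limit of (5.58) as theta -> 0+
    return 1.0 / (1.0 + 2.0 ** (1.0 - 1.0 / theta))


def weighted_rate(p: float, theta: float) -> float:
    """theta*R1 + (1-theta)*R2 with R1 = h(p) and R2 = p*log 2."""
    return theta * binary_entropy(p) + (1.0 - theta) * p * math.log(2.0)


def p_grid(theta: float, n: int = 20001) -> float:
    """Brute-force maximizer of the weighted rate over a uniform grid on [0, 1]."""
    grid = [k / (n - 1) for k in range(n)]
    return max(grid, key=lambda p: weighted_rate(p, theta))
```

Under these assumptions, `p_opt(theta)` and `p_grid(theta)` should agree up to the grid resolution, and `p_opt(1.0)` returns 1/2, the value that maximizes $R_1$ alone.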

Proposition 5.4 Marginalization achieves the capacity region for the BS-MAC and $\mathcal{R}_i^{marg} = \mathcal{C} = \mathcal{R}_o$.
Proof. The proof follows from the application of Corollary 5.3 to this channel. See Appendix
5.C.
Vinck showed in [Vin85b] that the capacity region of the BS-MAC with and without feedback
were identical. Since this result is a necessary condition for allowing the inner and outer bounds
to coincide, it could have been inferred a posteriori from Proposition 5.4.
Not all the channels whose capacity region can be efficiently computed using Algorithm 5.1 have both the inner and the outer bound tight. In particular, for a wider class of channels the capacity region can be computed by showing only that the outer bound is tight through Corollary 5.1. This is the case of the non-deterministic noisy BS-MAC considered next.
Example 2: The noisy binary-switching MAC
We denote by noisy binary-switching MAC (nBS-MAC) the binary-input ternary-output ($\mathcal{Y} = \{0, 1, \infty\}$) multiple-access channel characterized by the transition probability distribution
\[
P^{nBS\text{-}MAC}_{Y|X_1X_2} =
\begin{bmatrix}
\epsilon/2 & \epsilon/2 & 1-\epsilon \\
\epsilon/2 & \epsilon/2 & 1-\epsilon \\
1-\delta & \delta & 0 \\
\delta & 1-\delta & 0
\end{bmatrix}
\tag{5.59}
\]
where the columns represent the different elements of $\mathcal{Y}$ and the rows correspond to the natural ordering of the inputs $(X_1, X_2)$. This channel is a non-deterministic extension of (5.53) that adds two random behaviors to the channel: i) a noisy switch that, with probability $\epsilon$, is not able to keep the circuit open and outputs equally likely bits, and ii) a binary symmetric channel with error probability $\delta$ when the switch is closed. Clearly, when $(\epsilon, \delta) = (0, 0)$ the nBS-MAC reduces to the BS-MAC addressed in Example 1.
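Proposition 5.5 The outer bound is tight for the nBS-MAC, $\mathcal{R}_o = \mathcal{C}$, and one capacity-achieving pair of distributions achieving the boundary point
\[
(R_1^\star(\theta), R_2^\star(\theta)) = \arg\max_{(R_1,R_2)\in\mathcal{C}} \theta R_1 + (1-\theta) R_2 \tag{5.60}
\]
is
\[
P^\star_{X_1}(1;\theta) = p(\epsilon,\delta;\theta) = 1 - P^\star_{X_1}(0;\theta), \qquad P^\star_{X_2}(0;\theta) = P^\star_{X_2}(1;\theta) = 1/2, \tag{5.61}
\]
where
\[
p(\epsilon,\delta;\theta) = \frac{A(\epsilon,\delta;\theta) - \epsilon/(1-\epsilon)}{A(\epsilon,\delta;\theta) + 1} \tag{5.62}
\]
if $0 < \theta \le 1/2$ and
\[
p(\epsilon,\delta;\theta) = \{0 \le p < 1 : B(p,\epsilon,\delta;\theta) = \theta\,(h(\epsilon) - h(\delta) + \log 2)\} \tag{5.63}
\]
otherwise, where
\begin{align}
A(\epsilon,\delta;\theta) &= \left( \exp\!\big(h(\epsilon) + (1 - 1/\theta)\,h(\delta)\big)\, 2^{(1/\theta - 1)} \right)^{\frac{1}{1-\epsilon}} \tag{5.64} \\
B(p,\epsilon,\delta;\theta) &= (1-\theta)(1-\epsilon) \log\frac{\epsilon + (1-\epsilon)p}{(1-\epsilon)(1-p)} \tag{5.65} \\
&\quad + (2\theta - 1)\left[ (1 - \delta - \epsilon/2) \log\frac{\epsilon + (2 - 2\delta - \epsilon)p}{(1-\epsilon)(1-p)} + (\delta - \epsilon/2) \log\frac{\epsilon + (2\delta - \epsilon)p}{(1-\epsilon)(1-p)} \right]. \tag{5.66}
\end{align}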
Proof. The proof follows from the application of Corollary 5.1 to this channel. See Appendix
5.D.
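As an illustration of Proposition 5.5 in the regime $0 < \theta \le 1/2$, the sketch below evaluates the closed form (5.62) with (5.64) and compares it with a brute-force search. It is a hedged example rather than the procedure used in the thesis: the symbols `eps` (switch noise), `delta` (BSC error probability) and `theta` (weight of $R_1$) are assumed names, rates are in nats, $X_2$ is fixed to uniform, and the search covers only $p = P_{X_1}(1)$.

```python
import math


def h(x: float) -> float:
    """Binary entropy in nats, with 0 log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def p_closed_form(eps: float, delta: float, theta: float) -> float:
    """p(eps, delta; theta) from (5.62) and (5.64); covers 0 < theta <= 1/2 only."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("closed form only covers 0 < theta <= 1/2")
    # log A = (h(eps) + (1 - 1/theta) h(delta) + (1/theta - 1) log 2) / (1 - eps)
    log_a = (h(eps) + (1.0 / theta - 1.0) * (math.log(2.0) - h(delta))) / (1.0 - eps)
    a = math.exp(log_a)
    return (a - eps / (1.0 - eps)) / (a + 1.0)


def weighted_rate(p: float, eps: float, delta: float, theta: float) -> float:
    """theta*I(X1;Y) + (1-theta)*I(X2;Y|X1) for uniform X2 and P(X1=1) = p."""
    q = p + eps * (1.0 - p)  # total probability of the outputs 0 and 1
    probs = [q / 2.0, q / 2.0, (1.0 - eps) * (1.0 - p)]
    h_y = -sum(v * math.log(v) for v in probs if v > 0.0)
    h_y_given_x1 = (1.0 - p) * (h(eps) + eps * math.log(2.0)) + p * math.log(2.0)
    r1 = h_y - h_y_given_x1
    r2 = p * (math.log(2.0) - h(delta))
    return theta * r1 + (1.0 - theta) * r2


def p_grid(eps: float, delta: float, theta: float, n: int = 20001) -> float:
    """Grid maximizer of the weighted rate over p in [0, 1]."""
    grid = [k / (n - 1) for k in range(n)]
    return max(grid, key=lambda p: weighted_rate(p, eps, delta, theta))
```

With `eps = delta = 0` the closed form reduces to $p(\theta)$ in (5.58), as expected from the reduction of the nBS-MAC to the BS-MAC.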
In Figure 5.3 we show the capacity region of the nBS-MAC for several values of $\epsilon$ and $\delta$.
Typically, when is small, sender 1 has access to the channel in much better conditions than
sender 2, which is reflected in the shape of the capacity regions. This fact can be qualitatively
appreciated in Figure 5.4 also, where $p(\epsilon,\delta;\theta)$ is plotted versus $\theta$, the weight of $R_1$. When $\theta$ is small, the rate of sender 2 is prioritized and hence sender 1 opens the tap for the transmission of information of $X_2$ by setting a large value for $p(\epsilon,\delta;\theta)$. When $\theta$ increases, the tap is progressively closed towards a value that maximizes $R_1$.
The derived analytical conditions for tightness of the marginalization bounds do not exhaust
all the situations for which the marginalization approach is optimal (Lemma 5.3 is sufficient
but not necessary for a capacity-achieving product distribution). Thus, it can happen for some
channels that none of the previous results applies but marginalization is still optimal: it may
occur that $\mathcal{C} \subset \mathcal{R}_o$ but after marginalization $\mathcal{R}_i^{marg} = \mathcal{C}$ without satisfying the conditions in
Corollary 5.2. A representative example of this subclass of channels is the Binary Adder MAC
(BA-MAC) considered next.
Example 3: The binary adder MAC The binary adder MAC (BA-MAC) (or binary erasure
MAC, as named in [Cov06, Example 14.3.3]), is the binary-input ternary-output (Y = {0, 1, 2})
deterministic multiple-access channel whose input-output relationship is given by

Y = X1 + X2 ,
(5.67)
where the sum is taken over the natural numbers. In this case, none of the users has access to the channel in privileged conditions. There are two input combinations that can be correctly decoded without ambiguity ($(X_1, X_2) \in \{(0, 0), (1, 1)\}$), and two input combinations where the decoder has ambiguity about both users. Therefore, for this channel the capacity region is symmetric with respect to $R_1$ and $R_2$.
Figure 5.3: The capacity region $\mathcal{C}$ of the nBS-MAC, in [bit/ch. use], for different values of $\epsilon$ and $\delta$. Note that $(\epsilon, \delta) = (0, 0)$ corresponds to the BS-MAC. [Plot of $R_2$ versus $R_1$ with one curve for each $(\epsilon, \delta) \in \{(0.05, 0.05), (0, 0.05), (0.05, 0), (0, 0)\}$.]
Figure 5.4: The probability $p(\epsilon,\delta;\theta)$ as a function of $\theta$ for different values of $\epsilon$ and $\delta$. Note that $(\epsilon, \delta) = (0, 0)$ corresponds to $p(\theta)$ (5.58). [Plot of $P^\star_{X_1}(1;\theta) = p(\epsilon,\delta;\theta)$ versus $\theta$ for the same four $(\epsilon, \delta)$ pairs.]
Proposition 5.6 (Liao [Lia72]) The capacity region $\mathcal{C}$ of the BA-MAC is given by the set of rate pairs $(R_1, R_2)$ satisfying
\begin{align}
0 \le R_1 &\le \log 2 \tag{5.68} \\
0 \le R_2 &\le \log 2 \tag{5.69} \\
R_1 + R_2 &\le \tfrac{3}{2} \log 2. \tag{5.70}
\end{align}
Furthermore, all the points of the boundary of the capacity region are achieved by a pair of uniform input probability distributions, i.e.,
\[
P^\star_{X_i}(x_i;\theta) = 1/2, \quad \forall x_i \in \{0, 1\},\ i = 1, 2,\ \forall\theta \in [0, 1]. \tag{5.71}
\]
The BA-MAC, as noiseless multiple-access binary erasure channel, was first studied by Liao
[Lia72], who particularized his result of the capacity region of the general DMAC to this channel:
a result that has been subsequently used in the literature [Kas76, Cha79, Cha81].
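The three constraints (5.68)-(5.70) can be reproduced by evaluating the mutual informations of the BA-MAC under uniform inputs. The sketch below is illustrative only: `mac_rates` and `BA_MAC` are hypothetical names, the channel is stored as `W[x1][x2][y]`, and rates are in nats.

```python
import math


def entropy(dist):
    """Shannon entropy in nats of a probability vector (0 log 0 = 0)."""
    return -sum(q * math.log(q) for q in dist if q > 0.0)


def mac_rates(W, p1, p2):
    """Return (I(X1;Y|X2), I(X2;Y|X1), I(X1X2;Y)) for product inputs p1, p2."""
    n1, n2, ny = len(p1), len(p2), len(W[0][0])
    h_y_x1x2 = sum(p1[a] * p2[b] * entropy(W[a][b])
                   for a in range(n1) for b in range(n2))
    p_y = [sum(p1[a] * p2[b] * W[a][b][y] for a in range(n1) for b in range(n2))
           for y in range(ny)]
    h_y_x1 = sum(p1[a] * entropy([sum(p2[b] * W[a][b][y] for b in range(n2))
                                  for y in range(ny)]) for a in range(n1))
    h_y_x2 = sum(p2[b] * entropy([sum(p1[a] * W[a][b][y] for a in range(n1))
                                  for y in range(ny)]) for b in range(n2))
    return h_y_x2 - h_y_x1x2, h_y_x1 - h_y_x1x2, entropy(p_y) - h_y_x1x2


# Binary adder MAC: Y = X1 + X2 over the natural numbers, Y in {0, 1, 2}.
BA_MAC = [[[1.0 if y == a + b else 0.0 for y in range(3)] for b in range(2)]
          for a in range(2)]
```

Calling `mac_rates(BA_MAC, [0.5, 0.5], [0.5, 0.5])` gives the two individual bounds $\log 2$ and the sum-rate bound $\tfrac{3}{2}\log 2$.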
Proposition 5.7 The achievable rate region of the proposed algorithm equals the capacity region for the BA-MAC. Furthermore, the inner and outer bounds on the capacity region provided by the proposed algorithm do not coincide, i.e., $\mathcal{R}_i^{marg} = \mathcal{C} \subset \mathcal{R}_o$.
Proof. See Appendix 5.E.
Interestingly, the BS-MAC and BA-MAC are the only two non-trivial channels that characterize the rest of binary-input ternary-output11 deterministic DMACs, which can be obtained through isomorphisms of the input and output alphabets in one of these two canonical channels [Vin85a]. Hence, the proposed marginalization approach is tight for all of them. Finally, the following result extends the optimality of the relaxation to a wider class of channels.
Theorem 5.1 Marginalization is tight, in the sense that $\mathcal{R}_i^{marg} = \mathcal{C}$, for all binary-input deterministic DMACs.
Proof. See Appendix 5.F.
11
When we refer to $M$-ary output DMACs we assume that $|\mathcal{Y}| = M$ and all the $M$ values of the output alphabet can be exhausted by at least one input (there are no dummy output letters).
Algorithm 5.3 Random search
1: Set $(N, \sigma^2) = (500, 1/9)$.
2: for each value of $\theta \in [0, 1]$ do
3:   Set $(R_1(\theta), R_2(\theta), f^\star, \mathbf{p}_1(\theta), \mathbf{p}_2(\theta)) = (0, 0, 0, \mathbf{0}, \mathbf{0})$.
4:   Set $(\mathbf{p}_1^{(0)}, \mathbf{p}_2^{(0)}) = (1/|\mathcal{X}_1|, 1/|\mathcal{X}_2|)$.
5:   for $j = 1 \ldots N$ do
6:     Generate $(\mathbf{r}_1^{(j)}, \mathbf{r}_2^{(j)})$ i.i.d. $\mathcal{N}(0, \sigma^2)$ of lengths $|\mathcal{X}_1|$ and $|\mathcal{X}_2|$, respectively.
7:     Update12 $\mathbf{p}_k^{(j)} = [\mathbf{p}_k^{(j-1)} + \mathbf{r}_k^{(j)}]^+$ and normalize $\mathbf{p}_k^{(j)} := \mathbf{p}_k^{(j)} / (\mathbf{1}^T \mathbf{p}_k^{(j)})$, $k = 1, 2$.
8:     Evaluate (5.5)-(5.9) using $(\mathbf{p}_1^{(j)}, \mathbf{p}_2^{(j)})$: $(R_1^\star(\theta), R_2^\star(\theta))$.
9:     if $\theta R_1^\star + (1-\theta) R_2^\star > f^\star$ then
10:      $(R_1(\theta), R_2(\theta), f^\star, \mathbf{p}_1(\theta), \mathbf{p}_2(\theta)) = (R_1^\star, R_2^\star, \theta R_1^\star + (1-\theta) R_2^\star, \mathbf{p}_1^{(j)}, \mathbf{p}_2^{(j)})$.
11:    end if
12:   end for
13: end for
14: $\mathcal{R}_i^{brute} = \mathrm{Co}(\{(R_1(\theta), R_2(\theta)), \forall\theta\})$.
5.2.4
Numerical results
In the following, we analyze the numerical performance of randomization and marginalization over three different binary-input ternary-output non-deterministic two-user DMACs, which we name DMAC1, DMAC2, and DMAC3, characterized respectively by the transition probability distributions
(1)
PY |X1 X2
0.2
0.7
0.5
0.3
0.3 0.5
0.4 0.1 0.5
0.1 0.2
0.3 0.2 0.5

0.3 0.5
0.2 0.1
(2)
(3)
, PY |X1 X2
, PY |X1 X2
0.3 0.4
0.1 0.4
0.5 0.4 0.1
0.4 0.3
0.2 0.8 0
0.8 0.1
0.7
0.2
. (5.72)
0.3
0.1
The columns represent the different elements of Y = {0, 1, 2}, and the rows correspond to the
natural ordering of the inputs. While DMAC1 is the multiple-access channel example used in
[Rez04] to illustrate the behavior of the algorithm for the computation of the sum-rate capacity,
DMAC2 and DMAC3 have been chosen randomly. For each of the channels we compute the
randomization and marginalization bounds described in Section 5.2.2, and use the algorithm
in [Rez04] to compute $C^{sum}$. As for the randomization bound, $N = 500$ randomly generated product distributions have been tested for each $\theta$ using the approximation (5.40) with $n = 4$. Additionally, we consider the achievable region of a random search algorithm, denoted by $\mathcal{R}_i^{rs}$, as a benchmark (see Algorithm 5.3).
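The random search benchmark of Algorithm 5.3 can be sketched for a single weight as follows. Several points are assumptions of this sketch rather than statements from the text: the weight is named `theta`; the evaluation of (5.5)-(5.9) is replaced by picking the best corner of the MAC pentagon associated with the product input; the channel is stored as `W[x1][x2][y]`; and a guard that keeps the previous distribution when every component is clipped to zero has been added to avoid a division by zero.

```python
import math
import random


def entropy(dist):
    return -sum(q * math.log(q) for q in dist if q > 0.0)


def corner_rates(W, p1, p2, theta):
    """Best (R1, R2) in the pentagon of the product input (p1, p2), in nats."""
    n1, n2, ny = len(p1), len(p2), len(W[0][0])
    pairs = [(a, b) for a in range(n1) for b in range(n2)]
    h_cond = sum(p1[a] * p2[b] * entropy(W[a][b]) for a, b in pairs)
    h_y = entropy([sum(p1[a] * p2[b] * W[a][b][y] for a, b in pairs)
                   for y in range(ny)])
    h_y_x1 = sum(p1[a] * entropy([sum(p2[b] * W[a][b][y] for b in range(n2))
                                  for y in range(ny)]) for a in range(n1))
    h_y_x2 = sum(p2[b] * entropy([sum(p1[a] * W[a][b][y] for a in range(n1))
                                  for y in range(ny)]) for b in range(n2))
    i12 = h_y - h_cond
    if theta <= 0.5:                 # user 2 favoured: R2 = I(X2;Y|X1)
        r2 = h_y_x1 - h_cond
        return i12 - r2, r2
    r1 = h_y_x2 - h_cond             # user 1 favoured: R1 = I(X1;Y|X2)
    return r1, i12 - r1


def perturb(p, sigma, rng):
    """Step 7: add Gaussian noise, clip at zero, renormalize."""
    q = [max(x + rng.gauss(0.0, sigma), 0.0) for x in p]
    s = sum(q)
    return [x / s for x in q] if s > 0.0 else list(p)  # guard (assumption)


def random_search(W, theta, n_iter=500, sigma2=1.0 / 9.0, seed=0):
    """Return (R1, R2, f, p1, p2) for the best tested product distribution."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    p1 = [1.0 / len(W)] * len(W)
    p2 = [1.0 / len(W[0])] * len(W[0])
    best = (0.0, 0.0, 0.0, p1, p2)
    for _ in range(n_iter):
        p1, p2 = perturb(p1, sigma, rng), perturb(p2, sigma, rng)
        r1, r2 = corner_rates(W, p1, p2, theta)
        f = theta * r1 + (1.0 - theta) * r2
        if f > best[2]:
            best = (r1, r2, f, p1, p2)
    return best
```

Sweeping `theta` over a grid on $[0, 1]$ and taking the convex hull of the returned rate pairs would give the benchmark region. The sketch performs a random walk from the previous iterate, as in step 7 of the listing, rather than restarting from the best point found so far.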
Figures 5.5, 5.6, and 5.7 show the bounds for DMAC1 , DMAC2 , and DMAC3 , respectively.
DMAC1 is another example of a non-deterministic channel for which the outer bound and the
randomization and marginalization inner bounds coincide, and hence the capacity region can be
effectively computed with the proposed methods. As for DMAC2 and DMAC3 , which represent
a more general situation, the bounds do not coincide and the capacity region cannot be directly
evaluated.
12
$[\mathbf{x}]^+$ denotes the component-wise application of the operator $[x]^+ \triangleq \max\{x, 0\}$.
Figure 5.5: Bounds on the capacity region for DMAC1. Units are [bit/ch. use]. [Plot of $R_2$ versus $R_1$ showing $\mathcal{R}_i^{rs}$, $\mathcal{R}_i^{marg}$, $\mathcal{R}_i^{rand}$, $\mathcal{R}_o$, and $C^{sum}$, with a zoomed inset.]
However, if we combine $\mathcal{R}_o$ with the sum-capacity $C^{sum}$ we are able to obtain a much tighter outer bound. Regardless of the tightness of $\mathcal{R}_o$, the accuracy of the sum-capacity offered by the proposed relaxation methods is remarkable for all the channels. The performance of marginalization and randomization is indistinguishable, as shown by appropriate zooms in Figures 5.5-5.7, while the behavior of the random search is irregular and, in general, much worse. To achieve similar performance, the random search requires testing a number of distributions that is orders of magnitude above that of randomization.
Figure 5.6: Bounds on the capacity region for DMAC2. Units are [bit/ch. use]. [Plot of $R_2$ versus $R_1$ showing $\mathcal{R}_i^{rs}$, $\mathcal{R}_i^{marg}$, $\mathcal{R}_i^{rand}$, $\mathcal{R}_o$, and $C^{sum}$, with a zoomed inset.]
Figure 5.7: Bounds on the capacity region for DMAC3. Units are [bit/ch. use]. [Plot of $R_2$ versus $R_1$ showing $\mathcal{R}_i^{rs}$, $\mathcal{R}_i^{marg}$, $\mathcal{R}_i^{rand}$, $\mathcal{R}_o$, and $C^{sum}$, with a zoomed inset.]
5.3
The Degraded DMBC
5.3.1
The capacity region as a DC optimization problem
The capacity region $\mathcal{C}$ of the two-user dDMBC $X \to Y_1 \to Y_2$ is the convex hull of the set of rate pairs $(R_1, R_2)$ satisfying
\begin{align}
0 \le R_1 &\le I(X; Y_1|U) \tag{5.73} \\
0 \le R_2 &\le I(U; Y_2) \tag{5.74}
\end{align}
for some choice of the distribution $P_{UXY_1Y_2} = P_{UX} P_{Y_1|X} P_{Y_2|X}$ on $\mathcal{U} \times \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2$, where $P_{UX}$ is the joint probability distribution of the auxiliary random variable $U$ and the transmitted codeword $X$, and $P_{Y_1|X}$, $P_{Y_2|X}$ are the given conditional distributions that depend on the nature of the channel. The region $\mathcal{C}$ is computable since it suffices to consider $|\mathcal{U}| = \min\{|\mathcal{X}|, |\mathcal{Y}_1|, |\mathcal{Y}_2|\}$ in the evaluation of (5.73)-(5.74). Relying on the fact that $\mathcal{C}$ is convex by time-sharing arguments, the supporting hyperplane theorem [Boy04, Sec. 2.5.2] allows us to parameterize its computation for some $\theta \in [0, 1]$ as
\begin{align}
\underset{\{R_1, R_2, P_{UX}\}}{\text{maximize}} \quad & \theta R_1 + (1-\theta) R_2 \tag{5.75} \\
\text{subject to} \quad & 0 \le R_1 \le I(X; Y_1|U) \tag{5.76} \\
& 0 \le R_2 \le I(U; Y_2) \tag{5.77} \\
& P_{UX}(u, x) \ge 0 \quad \forall (u, x) \in \mathcal{U} \times \mathcal{X} \tag{5.78} \\
& \sum_{u,x} P_{UX}(u, x) = 1, \tag{5.79}
\end{align}
where the right hand sides of (5.76)-(5.77) amount to
\begin{align}
I(X; Y_1|U) &= \sum_{u,x,y_1} P_{UX}(u, x) P_{Y_1|X}(y_1|x) \log \frac{\left(\sum_{x'} P_{UX}(u, x')\right) P_{Y_1|X}(y_1|x)}{\sum_{x'} P_{UX}(u, x') P_{Y_1|X}(y_1|x')} \tag{5.80} \\
I(U; Y_2) &= \sum_{u,y_2} \left( \sum_{x} P_{UX}(u, x) P_{Y_2|X}(y_2|x) \right) \tag{5.81} \\
&\quad \times \log \frac{\sum_{x'} P_{UX}(u, x') P_{Y_2|X}(y_2|x')}{\left(\sum_{x'} P_{UX}(u, x')\right) \left(\sum_{u',x'} P_{UX}(u', x') P_{Y_2|X}(y_2|x')\right)}. \tag{5.82}
\end{align}
Note that the solutions to (5.75)-(5.79), denoted by $(R_1^\star(\theta), R_2^\star(\theta), P^\star_{UX}(\theta))$, generally depend on $\theta$.
For each $\theta$, the optimal rates $(R_1^\star(\theta), R_2^\star(\theta))$ (which belong to the boundary of $\mathcal{C}$) satisfy the right hand side inequalities of (5.76)-(5.77) with equality. This allows us to rephrase (5.75)-(5.79) as
\begin{align}
\underset{P_{UX}}{\text{maximize}} \quad & \theta I(X; Y_1|U) + (1-\theta) I(U; Y_2) \tag{5.83} \\
\text{subject to} \quad & P_{UX}(u, x) \ge 0 \quad \forall (u, x) \in \mathcal{U} \times \mathcal{X} \tag{5.84} \\
& \sum_{u,x} P_{UX}(u, x) = 1, \tag{5.85}
\end{align}
where $R_1$, $R_2$ have been removed from the optimization since they can be computed by evaluating (5.80)-(5.82) using $P^\star_{UX}(\theta)$, the solution to (5.83)-(5.85).
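The objective (5.83) is straightforward to evaluate for a candidate joint distribution. The following sketch implements (5.80)-(5.82) directly. The layout `P_ux[u][x]`, `W1[x][y1]`, `W2[x][y2]` and the name `bc_objective` are assumptions made for illustration, and rates are in nats.

```python
import math


def bc_objective(P_ux, W1, W2, theta):
    """Return (I(X;Y1|U), I(U;Y2), theta*I(X;Y1|U) + (1-theta)*I(U;Y2))."""
    nu, nx = len(P_ux), len(P_ux[0])
    p_u = [sum(row) for row in P_ux]

    def joint_uy(W):
        # P(u, y) = sum_x P_UX(u, x) W(y | x)
        return [[sum(P_ux[u][x] * W[x][y] for x in range(nx))
                 for y in range(len(W[0]))] for u in range(nu)]

    # Conditional mutual information (5.80).
    p_uy1 = joint_uy(W1)
    i1 = 0.0
    for u in range(nu):
        for x in range(nx):
            for y in range(len(W1[0])):
                joint = P_ux[u][x] * W1[x][y]
                if joint > 0.0:
                    i1 += joint * math.log(p_u[u] * W1[x][y] / p_uy1[u][y])

    # Mutual information (5.81)-(5.82).
    p_uy2 = joint_uy(W2)
    p_y2 = [sum(p_uy2[u][y] for u in range(nu)) for y in range(len(W2[0]))]
    i2 = sum(p_uy2[u][y] * math.log(p_uy2[u][y] / (p_u[u] * p_y2[y]))
             for u in range(nu) for y in range(len(W2[0])) if p_uy2[u][y] > 0.0)

    return i1, i2, theta * i1 + (1.0 - theta) * i2
```

For example, choosing $U = X$ with uniform $X$ gives $I(X;Y_1|U) = 0$ and $I(U;Y_2) = I(X;Y_2)$, while a constant $U$ gives $I(U;Y_2) = 0$ and $I(X;Y_1|U) = I(X;Y_1)$.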
Lemma 5.4 I(X; Y1 |U ) is concave in PU X , whereas I(U ; Y2 ) is a difference of concave functions
of PU X .
Proof. See Appendix G.
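One way to see the structure claimed in Lemma 5.4 is to expand the second mutual information as a difference of entropies. This is a sketch based on standard entropy identities, not a reproduction of the proof in the appendix, and $\theta$ is assumed to denote the weight of $R_1$:

\[
\theta I(X; Y_1|U) + (1-\theta) I(U; Y_2) = \underbrace{\theta I(X; Y_1|U) + (1-\theta) H(Y_2)}_{\text{concave in } P_{UX}} - \underbrace{(1-\theta) H(Y_2|U)}_{\text{concave in } P_{UX}}.
\]

Here $H(Y_2)$ is concave in $P_{Y_2}$, which is linear in $P_{UX}$, and the conditional entropy $H(Y_2|U)$ is concave in the joint distribution $P_{UY_2}$, which is also linear in $P_{UX}$.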
The computation of $\mathcal{C}$ through (5.83)-(5.85) hence amounts to the maximization of the difference of two concave functions of $P_{UX}$ (5.83) over the probability simplex. This falls within the class of DC problems13, a wide class of non-convex optimization problems which can only be solved in cases where an underlying structure can be exploited. In general, their non-convexity makes them intractable and only brute-force or random search methods seem to be available. However, such methods provide no quantification of the suboptimality they incur.
5.3.2
Optimality conditions
Pursuing the computation of C, we explore both local and global optimality conditions in order
to either characterize completely the solutions to (5.83)-(5.85) or determine their structure in

order to reduce the dimensionality of the search space. To that end, consider the following
necessary optimality condition.
13
It can equivalently be mapped to the minimization of the difference of two convex functions by minimizing
the opposite of the objective in (5.83).
Lemma 5.5 A necessary condition for global optimality (and sufficient condition for local optimality) of the joint probability distribution $P_{UX}$ for any fixed $\theta \in [0, 1]$ is
\[
\theta D(P_{Y_1|X=x} \| P_{Y_1|U=u}) + (1-\theta)\, E_{Y_2|X=x} \log \frac{P_{Y_2|U=u}(Y_2)}{P_{Y_2}(Y_2)}
\begin{cases}
= R(\theta) & \text{if } P_{UX}(u, x) > 0 \\
\le R(\theta) & \text{if } P_{UX}(u, x) = 0
\end{cases}
\tag{5.86}
\]
for all $(u, x) \in \mathcal{U} \times \mathcal{X}$ and some non-negative real constant $R(\theta)$. Any distribution $P_{UX}$ satisfying (5.86) yields an objective value in (5.83) of $\theta I(X; Y_1|U) + (1-\theta) I(U; Y_2) = R(\theta)$.
Proof. See Appendix H.
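The left hand side of (5.86) can be tabulated for any strictly positive $P_{UX}$, which gives a simple numerical test of the condition. The sketch below is an illustration under assumptions: the names `kkt_lhs`, `P_ux[u][x]`, `W1[x][y1]` and `W2[x][y2]` are hypothetical, rates are in nats, and every entry of `P_ux` is taken to be positive so that all conditional distributions are well defined.

```python
import math


def kkt_lhs(P_ux, W1, W2, theta):
    """Left hand side of (5.86) for every pair (u, x); assumes P_ux > 0 entrywise."""
    nu, nx = len(P_ux), len(P_ux[0])
    p_u = [sum(row) for row in P_ux]

    def y_given_u(W):
        # P(y | u) = sum_x P_UX(u, x) W(y | x) / P_U(u)
        return [[sum(P_ux[u][x] * W[x][y] for x in range(nx)) / p_u[u]
                 for y in range(len(W[0]))] for u in range(nu)]

    q1, q2 = y_given_u(W1), y_given_u(W2)
    p_y2 = [sum(p_u[u] * q2[u][y] for u in range(nu)) for y in range(len(W2[0]))]
    lhs = [[0.0] * nx for _ in range(nu)]
    for u in range(nu):
        for x in range(nx):
            # Divergence D(P_{Y1|X=x} || P_{Y1|U=u}).
            d1 = sum(W1[x][y] * math.log(W1[x][y] / q1[u][y])
                     for y in range(len(W1[0])) if W1[x][y] > 0.0)
            # Expectation over Y2 given X = x of log(P_{Y2|U=u} / P_{Y2}).
            e2 = sum(W2[x][y] * math.log(q2[u][y] / p_y2[y])
                     for y in range(len(W2[0])) if W2[x][y] > 0.0)
            lhs[u][x] = theta * d1 + (1.0 - theta) * e2
    return lhs
```

If all entries of the returned table coincide, the distribution satisfies (5.86) with equality and the common value is the constant $R(\theta)$. This happens, for instance, for a uniform product distribution on a BEC-BSC pair, where the common value is $\theta$ times the BEC capacity.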
Corollary 5.4 A condition for global optimality of the joint probability distribution $P_{UX}$ for any fixed $\theta \in [0, 1]$ is to satisfy Lemma 5.5 with
\[
R(\theta) = C^\star(\theta) \triangleq \max_{(R_1,R_2)\in\mathcal{C}} \theta R_1 + (1-\theta) R_2, \tag{5.87}
\]
where $C^\star(\theta)$ denotes the optimal value of (5.83)-(5.85) for the given $\theta$.
Proof. The best among all the distributions satisfying a necessary condition for optimality must be optimal.
5.3.3
The BEC-BSC degraded broadcast channel
We shall consider here the application of the necessary conditions of Lemma 5.5 to the BEC-BSC dDMBC, whose channel transition probabilities are shown in a diagram in Figure 5.8. This channel is such that receivers one and two see a Binary Erasure Channel (BEC) and a Binary Symmetric Channel (BSC), respectively. It models the situation in which the sender intends to convey independent information to one receiver equipped with erasure correction capabilities and to another receiver without them. The second receiver copies the output of the first receiver except when there is an erasure (denoted by $e$), in which case the output is chosen uniformly. Hence, if $\epsilon$ denotes the erasure probability of the BEC seen by the first receiver, the error probability of the equivalent BSC seen by the second receiver is $\epsilon/2$.
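The equivalence between the cascade and a BSC can be checked by multiplying transition matrices. In this minimal sketch the erasure probability is called `eps`, the BEC outputs are ordered as $(0, e, 1)$, and the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two row-stochastic matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def bec(eps):
    """BEC from X in {0, 1} to Y1 in (0, e, 1) with erasure probability eps."""
    return [[1.0 - eps, eps, 0.0],
            [0.0, eps, 1.0 - eps]]


def fill_erasures():
    """Map Y1 -> Y2: copy 0 and 1, replace an erasure by a uniform bit."""
    return [[1.0, 0.0],
            [0.5, 0.5],
            [0.0, 1.0]]


def bsc(delta):
    """BSC with crossover probability delta."""
    return [[1.0 - delta, delta],
            [delta, 1.0 - delta]]


def cascade(eps):
    """End-to-end channel X -> Y2 seen by the second receiver."""
    return matmul(bec(eps), fill_erasures())
```

For any `eps` in $[0, 1]$, `cascade(eps)` coincides with `bsc(eps / 2)` up to rounding, which is the statement made in the paragraph above.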
Proposition 5.8 The capacity region of the BEC-BSC degraded broadcast channel, C BECBSC ,

is given by the convex hull of the points (1 ) log 2, 0 , (0, (1 h(0.5)) log 2), and the set of
rate pairs (R1 (), R2 ()), 0 < <
1
1
2 (10.5)2 ,
achieved by the distribution
() ()
(1 ()())(1 + ())
1 ()()
PU X (1, 1; ) =
(1 ()())(1 + ())
PU X (0, 1) = ()PU X (0, 0; )
PU X (0, 0; ) =
PU X (1, 0) = ()PU X (1, 1; ),
(5.88)
(5.89)
(5.90)
(5.91)
Figure 5.8: The BEC-BSC degraded broadcast channel. [Transition diagram: $X \to Y_1$ is a BEC with outputs $\{0, e, 1\}$; $Y_1 \to Y_2$ copies the outputs 0 and 1 and maps the erasure $e$ to 0 or 1 with probability 0.5 each.]

where $h(x) \triangleq -x \log x - (1-x) \log(1-x)$ is the binary entropy function.
The parameters (), (), and () are obtained as described next. Given () > 0,

0.5 exp g((); , )/(1 ) (1 0.5)
,
() =
(5.92)
0.5 (1 0.5) exp g((); , )/(1 )
where
(1 0.5) + 0.5
log ,
(5.93)
1 0.5 + 0.5
and () > 0 is such that g((); , ) = g((); , ) (there are at most three such possible valg(; , ) = (1 ) log
ues of () for each given ()). Finally, () is determined from the following unidimensional
maximization
() = arg
max
I(X; Y1 |U ) + (1 )I(U ; Y2 ),
PU X :>0,
|g(;,)|<(1) log(2/1)
(5.94)
where it has been explicitly denoted that PU X () in (5.88)-(5.91) and hence I(X; Y1 |U ), I(U ; Y2 )
exclusively depend on .
Proof. See Appendix I.
The capacity region of the BEC-BSC degraded broadcast is shown in Figure 5.9. Note that
the application of Lemma 5.5 and Corollary 5.4 allows us to compute C BECBSC performing
the maximization of a one-dimensional function instead of maximizing the achievable rates over
the probability simplex containing the joint distributions on U X , for which three degrees of
freedom would need to be explored for each .
5.4
Conclusions
The computation of the channel capacity of discrete memoryless channels is a convex problem that can be efficiently solved. The evaluation of capacity regions of multiterminal networks is not such a straightforward problem since, when capacity results are known, it gives rise to unavoidable non-convexities. In this context, we have addressed the evaluation of the capacity regions of the DMAC and the dDMBC.
Figure 5.9: The capacity region $\mathcal{C}^{BEC\text{-}BSC}$ of the BEC-BSC degraded broadcast channel, in [bit/ch. use], for different values of $\epsilon$. [Plot of $R_2$ versus $R_1$ with one curve for each $\epsilon \in \{0.01, 0.10, 0.25, 0.50\}$.]
An alternative reformulation of the capacity region of the DMAC condensed all the nonconvexities of the problem into a single rank-one constraint. Since problems with this type of
constraint cannot be solved optimally with the current state of the art, we proposed efficient
methods to compute outer and inner bounds on the capacity region by solving a relaxed version of
the problem and projecting its solution onto the original feasible set. Targeting numerical results,
we adopted a randomization approach which generates random solutions to the original problem
with distribution prescribed by the solution to the relaxed problem. Focusing on analytical
results, we studied projection via minimum divergence, which amounts to the marginalization
of the relaxed solution. In this latter case we derived sufficient conditions and necessary and
sufficient conditions for the bounds to be tight. Furthermore, we were able to show that the class
of channels for which the marginalization bounds matched exactly the capacity region included
all the two-user binary-input deterministic DMACs as well as other non-deterministic channels
(e.g., the nBS-MAC). In general, however, both methods are able to compute very tight bounds
as shown for various examples.
On the other hand, the computation of the capacity region of the two-user dDMBC was
characterized as a DC problem. Since this class of problems cannot be optimally solved in
general, we focused on obtaining local and global optimality conditions. The KKT conditions of
the problem were found to be necessary and sufficient for an input distribution to achieve a local
maximum. An immediate consequence is that, among all the distributions satisfying the KKT conditions, those attaining the largest objective value are capacity-achieving (i.e., globally optimal). These
results enabled the maximization of the achievable rates over a candidate set of potentially much
lower dimensionality than the original feasible set, where the BEC-BSC dDMBC served as an
example.
5.A
First, note that the functions f1 (5.20), f2 (5.21), and f12 (5.22) simplify to the right hand
sides of (5.6), (5.7), and (5.8), respectively, when the vector-matrix formulation is used for
some P Pprod with marginal distributions p1 and p2 . Equivalence between (5.5)-(5.9) and
(5.13)-(5.19) is hence proved thanks to constraint (5.17) and the equivalence of (5.18)-(5.19) and
P Pprod .
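Regarding the concavity of the function $f_1(\mathbf{P}, \mathbf{p}_2)$ we can rewrite (5.20) as
\begin{align}
f_1(\mathbf{P}, \mathbf{p}_2) &= \sum_{i,j} [\mathbf{P}]_{i,j} \underbrace{\left( \sum_{y} P_{Y|X_1X_2}(y|x_1^{(i)} x_2^{(j)}) \log P_{Y|X_1X_2}(y|x_1^{(i)} x_2^{(j)}) \right)}_{-H(Y|X_1 = x_1^{(i)}, X_2 = x_2^{(j)})} \notag \\
&\quad - \sum_{j,y} \underbrace{\left( \sum_{i'} [\mathbf{P}]_{i',j} P_{Y|X_1X_2}(y|x_1^{(i')} x_2^{(j)}) \right)}_{P_{X_2Y}(x_2^{(j)}, y)} \log \frac{\sum_{i'} [\mathbf{P}]_{i',j} P_{Y|X_1X_2}(y|x_1^{(i')} x_2^{(j)})}{[\mathbf{p}_2]_j} \notag \\
&= -H(Y|X_1X_2) - \sum_{j,y} P_{X_2Y}(x_2^{(j)}, y) \log \frac{P_{X_2Y}(x_2^{(j)}, y)}{[\mathbf{p}_2]_j}. \tag{5.95}
\end{align}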
The term H(Y |X1 , X2 ) is linear in P and thus concave. The second term of (5.95) is jointly
concave in (PX2 Y , p2 ) (by the same arguments that ensure the convexity of the divergence
[Cov06, Thm. 2.7.2]), but since PX2 Y is linear in P, it is also concave in (P, p2 ), so is the
function f1 (P, p2 ).
It follows by symmetry of (5.20) and (5.21) that the function f2 (P, p1 ) is jointly concave in
(P, p1 ). Finally, the function f12 (5.22) can be shown to be concave resorting to the concavity
of the mutual information of a DMAC with respect to the input distribution, in this case
represented by P.
5.B
Appendix: Proof of Lemma 5.2
Consider the relaxed problem (5.13)-(5.18). Since it is convex and satisfies Slater's conditions,
the Karush-Kuhn-Tucker (KKT) conditions are necessary and sufficient for optimality of any
(P, p1 , p2 ) [Boy04]. Taking its partial Lagrangian without relaxing the constraints (5.17)-(5.18),
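\begin{align}
\tilde{L}(R_1, R_2, \mathbf{p}_1, \mathbf{p}_2, \mathbf{P}; \boldsymbol{\lambda}) &= \theta R_1 + (1-\theta) R_2 + \lambda_1 (f_1(\mathbf{P}, \mathbf{p}_2) - R_1) \tag{5.96} \\
&\quad + \lambda_2 (f_2(\mathbf{P}, \mathbf{p}_1) - R_2) + \lambda (f_{12}(\mathbf{P}) - R_1 - R_2), \tag{5.97}
\end{align}
and setting its derivatives with respect to $R_1$ and $R_2$ equal to zero we obtain
\[
\lambda_1^\star = \theta - \lambda, \qquad \lambda_2^\star = (1-\theta) - \lambda. \tag{5.98}
\]
Using (5.98), (5.96)-(5.97) admits a simplified form that does not show dependency on $(R_1, R_2)$,
\begin{align}
\tilde{L}(\mathbf{p}_1, \mathbf{p}_2, \mathbf{P}; \lambda) &\triangleq \tilde{L}(R_1, R_2, \mathbf{p}_1, \mathbf{p}_2, \mathbf{P}; [\lambda_1^\star\ \lambda_2^\star\ \lambda]) \tag{5.99} \\
&= \lambda (f_{12}(\mathbf{P}) - f_1(\mathbf{P}, \mathbf{p}_2) - f_2(\mathbf{P}, \mathbf{p}_1)) + \theta f_1(\mathbf{P}, \mathbf{p}_2) + (1-\theta) f_2(\mathbf{P}, \mathbf{p}_1). \tag{5.100}
\end{align}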
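By grouping the primal optimization variables in $\mathbf{y} \triangleq (\mathbf{p}_1, \mathbf{p}_2, \mathbf{P})$, an explicit definition of the feasibility domain $\mathcal{D} \triangleq \{\mathbf{y} \mid \text{(5.17)-(5.18) are satisfied}\}$ simplifies the application of the saddle-point property (strong duality holds) as
\[
\min_{0 \le \lambda \le \min\{\theta, 1-\theta\}} \max_{\mathbf{y} \in \mathcal{D}} \tilde{L}(\mathbf{y}; \lambda) = \max_{\mathbf{y} \in \mathcal{D}} \min_{0 \le \lambda \le \min\{\theta, 1-\theta\}} \tilde{L}(\mathbf{y}; \lambda), \tag{5.101}
\]
where we have used $0 \le \lambda \le \min\{\theta, 1-\theta\}$ according to dual feasibility ($\boldsymbol{\lambda} \ge 0$) and (5.98). The dependence of $\tilde{L}(\mathbf{y}; \lambda)$ on $\lambda$ in the inner minimization of the right hand side of (5.101) is linear (5.100) and hence its optimal value satisfies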

? (y) ? ((y)) =
if (y) > 0
min{, 1 } if (y) < 0
(5.102)
where (y) , f12 (P) f1 (P, p2 ) f2 (P, p1 )14 . Thus, the optimal value of the problem is
e ? (y)) = L(y
e ? ; ? (y? )).
max L(y;
yD
(5.103)
Let us now restrict the proof to the 0 < 1/2 case for the sake of simplicity (similar results are
obtained for the case 1/2 < 1, and are included in the conditions of Lemma 5.2). To compute
the optimal value (5.103) we need to know the sign of (y? ) and adjust ? accordingly. However,
e ? (y)). Therefore, to obtain the
y? depends in turn on ? through the maximization of L(y;
e ? = min{, 1 } = ) subject to (y) 0, then

optimal y? we should first maximize L(y;
e ? = 0) subject to (y) > 0, and select the y yielding the maximum objective
maximize L(y;
value among both hypotheses.
Let us start hypothesizing (y? ) 0, which implies ? = (5.102) and simplifies (5.100) to
e ? = ) = f12 (P) + (1 2)f2 (P, p1 ),
L(y;
(5.104)
which should be maximized under the constraint (y) 0. Instead, we shall perform an
unconstrained maximization of (5.104) and later impose that the solution satisfies non-positivity
of . Thus, we aim at finding a solution to the problem
maximize
p1 ,p2 ,P
f12 (P) + (1 2)f2 (P, p1 )
subject to P1 = p1 , PT 1 = p2
P 0 , 1T P1 = 1,
14
If (y) = 0, ? (y) can take any value, but it is irrelevant since it does not affect the final result.
(5.105)
(5.106)
(5.107)
whose Lagrangian is
L(y; , 1 , 2 , ) = f12 (P) + (1 2)f2 (P, p1 ) + 1T ( P)1
+ T1 (P1 p1 ) + T2 (PT 1 p2 ) + (1T P1 1),
|X ||X2 |
where R+ 1
(5.108)
(5.109)
, k R|Xk | for k = 1, 2, R, and denotes Hadamard (element-wise)
product. Setting the derivatives of (5.108)-(5.109) with respect to p1 and p2 equal to zero we
obtain the optimal values of { k }:
?1 = (1 2)(log p1 + log(e)1), ?2 = 0.
(5.110)
Finally, plugging (5.110) into (5.108)-(5.109) and setting its derivative with respect to [P]i,j
equal to zero we arrive at
X
y

(i) (j)
PY |X1 X2 (y|x1 x2 ) log P
+ (1
(i) (j)
PY |X1 X2 (y|x1 x2 )
(i0 ) (j 0 )
i0 ,j 0 [P]i0 ,j 0 PY |X1 X2 (y|x1 x2 )

(i) (j)
[p1 ]i PY |X1 X2 (y|x1 x2 )
2) log P
(i) (j 0 )
j 0 [P]i,j 0 PY |X1 X2 (y|x1 x2 )
= log(e) []i,j . (5.111)

Now, by complementary slackness, we know that []i,j [P]i,j = 0 which forces
= 0 if [P]i,j > 0
[]i,j
,
0 if [P] = 0
i,j
(5.112)
and allows us to rewrite (5.111) as

D(PY |X1 =x1 ,X2 =x2 ||PY ) + (1 2)D(PY |X1 =x1 ,X2 =x2 ||PY |X1 =x1 )
= log(e) if PX X (x1 , x2 ) > 0

1 2
. (5.113)
log(e) if P
(x , x ) = 0
X1 X2
It can be shown that any P with marginals p1 , p2 satisfying (5.113) for some R has an
associated objective value log(e) . Let us impose that such a solution to the unconstrained
maximization of (5.104) satisfies also (y) 0 (the constraint of the current hypothesis under
test). We now show by contradiction that the objective value of any distribution with (y) > 0
is strictly lower. Suppose that the optimal distribution, with optimal objective value Ro? , is such
that (y? ) > 0. In this case, ? = 0 from (5.102), which results in the objective function
e ? = 0) = f1 (P, p2 ) + (1 )f2 (P, p1 ),
L(y;
(5.114)
which is maximized by y? under the constraint > 0. Then, since

f1 (P, p2 ) = f12 (P) f2 (P, p1 ) (y? ) < f12 (P) f2 (P, p1 )
(5.115)

it follows by assumption that
e ? ; ? = 0) =
Ro? = L(y
max
yD:(y)>0
e ? = 0) < max(f12 (P) + (1 2)f2 (P, p1 )) = log(e) ,

L(y;
yD
(5.116)
which contradicts optimality. Therefore, a probability distribution y with (y) 0 satisfying
(5.113) is optimal, its objective value is log(e) = Ro? , and invalidates the existence of
other optimal solutions with (y) > 0. Thus, provided such distribution exists (5.113) and
(y) 0 becomes necessary for optimality. As a final remark, note that (5.49) and (y) 0
are equivalent statements thanks to the equivalence of the functions f1 , f2 , and f12 and mutual
information.
5.C
Appendix: Proof of Proposition 5.4
It is sufficient to show that the conditions of Corollary 5.3 apply to the BS-MAC. For the sake of brevity we will subsequently restrict to the $0 \le \theta \le 1/2$ case (the same results are obtained for $1/2 < \theta < 1$) and simplify the notation by using the convention $P_{X_1X_2}(x_1, x_2) = P_{x_1x_2}$, which yields
PY
= {P10 , P11 , P00 + P01 }
(5.117)
PY |X1 =0 = {0, 0, 1}
(5.118)
PY |X1 =1 = {P10 /(P10 + P11 ), P11 /(P10 + P11 ), 0}.
(5.119)
The probabilities P10 and P11 cannot be zero simultaneously since that would imply H(Y ) = 0
and hence R1 = R2 = 0, which is clearly suboptimal. We will hypothesize P10 = 0 and P11 > 0
(similar results are obtained under the hypothesis P10 > 0 and P11 = 0).
With respect to $P_{00}$, $P_{01}$, we cannot tell beforehand whether these variables are positive or not. However, when (5.48) is particularized for both $(x_1, x_2) = (0, 0)$ and $(x_1, x_2) = (0, 1)$ the same expression is obtained. This implies that either $P_{00}$, $P_{01}$ are both positive or both zero. By hypothesizing $P_{00} = P_{01} = 0$, the conditions (5.48) of Lemma 5.2 particularize to
\begin{align}
\theta \log \frac{1}{P_{00} + P_{01}} &\le R_o^\star \tag{5.120} \\
\theta \log \frac{1}{P_{10}} + (1 - 2\theta) \log \frac{P_{10} + P_{11}}{P_{10}} &\le R_o^\star \tag{5.121} \\
\theta \log \frac{1}{P_{11}} + (1 - 2\theta) \log \frac{P_{10} + P_{11}}{P_{11}} &= R_o^\star. \tag{5.122}
\end{align}
Let us rewrite (5.120) and (5.121) as
\begin{align}
\theta \log \frac{1}{P_{00} + P_{01}} &= R_o^\star + (1 - 2\theta) \log(\ell) \tag{5.123} \\
\theta \log \frac{1}{P_{10}} + (1 - 2\theta) \log \frac{P_{10} + P_{11}}{P_{10}} &= R_o^\star + (1 - 2\theta) \log(\ell') \tag{5.124}
\end{align}
for some $0 < \ell, \ell' \le 1$. It follows from (5.122) and (5.124) that $P_{11} = P_{10}\, \ell'^{(1-2\theta)/(1-\theta)}$, so that, regardless of $\ell'$, $P_{10} = 0$ forces $P_{11} = 0$. Similarly, $P_{11} = 0$ forces $P_{10} = 0$. Since this is a suboptimal choice, the optimal distribution satisfies $P_{10}, P_{11} > 0$. This fact transforms (5.121) into an equality (or, equivalently, forces $\ell' = 1$ in (5.124)), and implies $P_{11} = P_{10}$. Combining (5.123) with (5.121) (as an equality), we obtain
\begin{align}
P_{10} = P_{11} &= (P_{00} + P_{01})(2\ell)^{1/\theta - 2} \overset{(a)}{=} \left(2 + (2\ell)^{2 - 1/\theta}\right)^{-1} \tag{5.125} \\
P_{00} + P_{01} &= \frac{(2\ell)^{2 - 1/\theta}}{2 + (2\ell)^{2 - 1/\theta}}, \tag{5.126}
\end{align}
where (a) follows from $\sum_{i,j} P_{ij} = 1$. Expressions (5.125)-(5.126) show that $P_{00} + P_{01} > 0$ $\forall \ell \in (0, 1]$, which is inconsistent with the hypothesis $P_{00} = P_{01} = 0$. Since the optimal values of $P_{00}$ and $P_{01}$ must hence be positive, the family of potential optimal distributions can be obtained by setting $\ell = 1$ in (5.125)-(5.126), which results in
\begin{align}
P_{00}^\star &= \alpha, & P_{01}^\star &= 1 - p(\theta) - \alpha \tag{5.127} \\
P_{10}^\star &= p(\theta)/2, & P_{11}^\star &= p(\theta)/2, \tag{5.128}
\end{align}
where $\alpha \in (0, 1 - p(\theta))$, and $p(\theta)$ was defined in (5.58). Among all the distributions satisfying (5.127)-(5.128), some of them may also satisfy (5.49) and be optimal. In particular, $\alpha = (1 - p(\theta))/2$ results in the product distribution (5.57), which satisfies Lemma 5.3 (this implies $\mathcal{R}_i^{marg} = \mathcal{C}$). Since that optimal product distribution is also induced by the marginals of (5.127)-(5.128) regardless of the choice of $\alpha$, it follows that $\mathcal{R}_i^{marg} = \mathcal{C} = \mathcal{R}_o$.
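5.D
Appendix: Proof of Proposition 5.5
It is sufficient to show that there exists a product distribution satisfying Lemma 5.3 for any $0 < \theta < 1$. Let us start with the $0 < \theta \le 1/2$ case and simplify the notation by using $P_{X_k}(1) = p_k = 1 - \bar{p}_k$, $k = 1, 2$, which yields
\begin{align}
P_Y &= \left\{ \tfrac{\epsilon}{2} \bar{p}_1 + p_1 ((1-\delta)\bar{p}_2 + \delta p_2),\ \tfrac{\epsilon}{2} \bar{p}_1 + p_1 (\delta \bar{p}_2 + (1-\delta) p_2),\ (1-\epsilon)\bar{p}_1 \right\} \tag{5.129} \\
P_{Y|X_1=0} &= \{\epsilon/2,\ \epsilon/2,\ 1-\epsilon\} \tag{5.130} \\
P_{Y|X_1=1} &= \{(1-\delta)\bar{p}_2 + \delta p_2,\ \delta \bar{p}_2 + (1-\delta) p_2,\ 0\}. \tag{5.131}
\end{align}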
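We arbitrarily assume that $p_2 = \bar{p}_2 = 1/2$ and $p_1, \bar{p}_1 > 0$ such that the conditions in Lemma 5.3, after some algebraic manipulation, reduce to
\[
\theta\epsilon \log \frac{\epsilon}{p_1 + \epsilon\bar{p}_1} + \theta(1-\epsilon) \log \frac{1-\epsilon}{(1-\epsilon)\bar{p}_1} = C^\star(\theta) \tag{5.132}
\]
for $(x_1, x_2) \in \{(0, 0), (0, 1)\}$ and
\[
\theta\delta \log \frac{2\delta}{p_1 + \epsilon\bar{p}_1} + \theta(1-\delta) \log \frac{2(1-\delta)}{p_1 + \epsilon\bar{p}_1} + (1 - 2\theta)(\log 2 - h(\delta)) = C^\star(\theta) \tag{5.133}
\]
for $(x_1, x_2) \in \{(1, 0), (1, 1)\}$. By setting the left hand sides of (5.132)-(5.133) equal to each other, we arrive at
\[
\theta(1-\epsilon) \log \frac{p_1 + \epsilon\bar{p}_1}{(1-\epsilon)\bar{p}_1} = \theta h(\epsilon) + (1-\theta)(\log 2 - h(\delta)), \tag{5.134}
\]
which is satisfied by $p(\epsilon, \delta; \theta)$ as defined in (5.62).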

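The case $1/2 < \theta < 1$ is more cumbersome since we cannot find the analytical expression of the optimal $p_1$. The conditional distributions
\begin{align}
P_{Y|X_2=0} &= \left\{ \tfrac{\epsilon}{2}\bar{p}_1 + (1-\delta) p_1,\ \tfrac{\epsilon}{2}\bar{p}_1 + \delta p_1,\ (1-\epsilon)\bar{p}_1 \right\} \tag{5.135} \\
P_{Y|X_2=1} &= \left\{ \tfrac{\epsilon}{2}\bar{p}_1 + \delta p_1,\ \tfrac{\epsilon}{2}\bar{p}_1 + (1-\delta) p_1,\ (1-\epsilon)\bar{p}_1 \right\} \tag{5.136}
\end{align}
and the arbitrary setting $p_2 = \bar{p}_2 = 1/2$ allow us to obtain, after some manipulations, the conditions in Lemma 5.3 as
\begin{align}
&(1-\theta)\left[ \epsilon \log \frac{\epsilon}{p_1 + \epsilon\bar{p}_1} + (1-\epsilon) \log \frac{1-\epsilon}{(1-\epsilon)\bar{p}_1} \right] \notag \\
&\quad + (2\theta - 1)\left[ \frac{\epsilon}{2} \log \frac{\epsilon}{\epsilon\bar{p}_1 + 2(1-\delta) p_1} + \frac{\epsilon}{2} \log \frac{\epsilon}{\epsilon\bar{p}_1 + 2\delta p_1} + (1-\epsilon) \log \frac{1-\epsilon}{(1-\epsilon)\bar{p}_1} \right] = C^\star(\theta) \tag{5.137}
\end{align}
for $(x_1, x_2) \in \{(0, 0), (0, 1)\}$ and
\begin{align}
&(1-\theta)\left[ (1-\delta) \log \frac{2(1-\delta)}{p_1 + \epsilon\bar{p}_1} + \delta \log \frac{2\delta}{p_1 + \epsilon\bar{p}_1} \right] \notag \\
&\quad + (2\theta - 1)\left[ (1-\delta) \log \frac{2(1-\delta)}{\epsilon\bar{p}_1 + 2(1-\delta) p_1} + \delta \log \frac{2\delta}{\epsilon\bar{p}_1 + 2\delta p_1} \right] = C^\star(\theta) \tag{5.138}
\end{align}
for $(x_1, x_2) \in \{(1, 0), (1, 1)\}$. By setting the left hand sides of (5.137)-(5.138) equal to each other and using $\bar{p}_1 = 1 - p_1$ we obtain
\[
B(p_1, \epsilon, \delta; \theta) = \theta\,(h(\epsilon) - h(\delta) + \log 2), \tag{5.139}
\]
where $B(p_1, \epsilon, \delta; \theta)$ is defined in (5.66). Since
\[
B(0, \epsilon, \delta; \theta) = \theta(1-\epsilon) \log \frac{\epsilon}{1-\epsilon} \overset{(a)}{\le} \theta h(\epsilon) \le \theta\,(h(\epsilon) - h(\delta) + \log 2), \tag{5.140}
\]
where (a) follows from Gibbs' lemma, and
\[
\lim_{p_1 \to 1} B(p_1, \epsilon, \delta; \theta) = +\infty > \theta\,(h(\epsilon) - h(\delta) + \log 2), \tag{5.141}
\]
it follows by continuity that an optimum $p_1 \in [0, 1)$ satisfying (5.139) exists for any $1/2 < \theta \le 1$. Since we have been able to find product distributions satisfying Lemma 5.3 for all $0 < \theta \le 1$, Corollary 5.1 implies that $\mathcal{R}_o = \mathcal{C}$.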
5.E
Appendix: Proof of Proposition 5.7
Thanks to the symmetry of the problem, we restrict without loss of generality to the $0 < \theta \le 1/2$ case (similar results hold for $1/2 < \theta < 1$) and simplify the notation by using the convention $P_{X_1X_2}(x_1, x_2) = P_{x_1x_2}$, which yields
PY
= {P00 , P01 + P10 , P11 }
(5.142)
PY |X1 =0 = {P00 /(P00 + P01 ), P01 /(P00 + P01 ), 0}
PY |X1 =1 = {0, P10 /(P10 + P11 ), P11 /(P10 + P11 )}.
(5.143)
(5.144)
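Let us hypothesize $P_{01} = 0$. In this case, the rest of probabilities should satisfy $P_{00}, P_{10}, P_{11} > 0$ since both $X_1$ and $X_2$ can be perfectly decoded given $Y$. If we focus on the conditions (5.48) of Lemma 5.2 particularized to $(x_1, x_2) = (0, 0)$ and $(x_1, x_2) = (0, 1)$ we obtain
\begin{align}
\theta \log \frac{1}{P_{00}} + (1 - 2\theta) \log \frac{P_{00} + P_{01}}{P_{00}} &= R_o^\star \tag{5.145} \\
\theta \log \frac{1}{P_{01} + P_{10}} + (1 - 2\theta) \log \frac{P_{00} + P_{01}}{P_{01}} &= R_o^\star + (1 - 2\theta) \log(\ell) \tag{5.146}
\end{align}
respectively. It follows from (5.145) and (5.146) that
\[
P_{00} = (P_{01} + P_{10})^{\frac{\theta}{1-\theta}} (\ell P_{01})^{\frac{1-2\theta}{1-\theta}}, \tag{5.147}
\]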
which implies P00 = 0 regardless of the actual value of P10 . This is clearly suboptimal since
P00 > 0 causes no penalty on R2 (X2 can still be uniquely decoded from Y ) and allows user
1 to communicate at some positive rate. Similarly, if we hypothesize P10 = 0 we arrive at the
suboptimal choice P11 = 0. Hence, the optimal distribution must satisfy P01 , P10 > 0 and, at the
same time, P00 , P11 > 0 (since (x1 , x2 ) = (0, 0) and (x1 , x2 ) = (1, 1) are input values that allow
for perfect decoding of both codewords). In this situation, the conditions (5.48) of Lemma 5.2
for all (x1 , x2 ) particularize here to
P00 + P01
1
+ (1 2) log
P00
P00
P00 + P01
1
+ (1 2) log
log
P01 + P10
P01
1
P10 + P11
log
+ (1 2) log
P01 + P10
P10
1
P10 + P11
log
+ (1 2) log
P11
P11
log
= Ro?
(5.148)
= Ro?
(5.149)
= Ro?
(5.150)
= Ro? .
(5.151)
From (5.148) and (5.149) we obtain

P01 + P10 = P00 (P00 /P01 )1/2 ,
(5.152)
an equivalence that can be plugged into (5.150) and used with (5.148) to obtain
\[ \frac{P_{00}}{P_{01}} = \frac{P_{11}}{P_{10}}. \tag{5.153} \]
5.F. Appendix: Proof of Theorem 5.1

Similarly, from (5.150)-(5.151)
P01 + P10 = P11 (P11 /P10 )1/2 .
(5.154)
Expressions (5.152) and (5.154) imply P00 = P11 , which can be substituted into either (5.152) or
(5.153) to arrive at the potential optimal solution
?
?
?
?
P00
= P11
= p(1 )/2 , P10
= P01
= (1 p(1 ))/2,
(5.155)
which is not a product distribution and hence C Ro (recall that p() was defined in (5.58)). To
verify that (5.155) is indeed the unique solution to the relaxed problem (5.13)-(5.18) it remains
to be checked that it satisfies (5.49). To that end, let us consider
I(X2 ; Y |X1 ) = H(Y |X1 ) = h(p(1 )) = I(X1 ; Y |X2 )
I(X1 X2 ; Y ) = H(Y ) = h(p(1 )) + p(1 ) log 2
(5.156)
(5.157)
to show
(a)
I(X1 ; Y |X2 ) + I(X2 ; Y |X1 ) = 2h(p(1 )) h(p(1 )) + 2(1 p()) log 2
(5.158)
(b)
h(p(1 )) + 2(1 2/3) log 2 = h(p(1 )) + 2/3 log 2 (5.159)
(c)
h(p(1 )) + p(1 ) log 2 = I(X1 X2 ; Y ),
(5.160)
where inequality (a) follows from the linear bound h(x) 2(1 x) log 2 for x 1/2 and the fact
that p(1 ) 1/2 (see Figure 5.4); inequalities (b) and (c) are a consequence of the fact that,
for 0 < 1/2, p(1 ) 2/3.
Finally, the last step is to notice that the product distribution induced by the marginals of
(5.155) are uniform and hence capacity achieving for any 0 < 1/2 (recall Proposition 5.6)),
implying Rimarg = C.
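As a quick numerical cross-check of (5.156)-(5.157), the sketch below evaluates the two mutual informations for the channel implied by (5.142)-(5.144), where the output equals the sum of the two binary inputs. The parameter `q` is a hypothetical stand-in for the value taken by p(·) in (5.155), and its numerical value is chosen arbitrarily for illustration; all function names are assumptions of this sketch, not notation from the thesis.

```python
import math

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def h(q):
    """Binary entropy in nats."""
    return entropy([q, 1.0 - q])

def adder_mac_informations(P):
    """P[(x1, x2)] is the joint input pmf and the output is Y = x1 + x2.

    The channel is deterministic, so I(X1 X2; Y) = H(Y) and
    I(X2; Y | X1) = H(Y | X1).
    """
    p_y = [P[(0, 0)], P[(0, 1)] + P[(1, 0)], P[(1, 1)]]  # as in (5.142)
    h_y_given_x1 = 0.0
    for x1 in (0, 1):
        p_x1 = P[(x1, 0)] + P[(x1, 1)]
        if p_x1 > 0:
            # Given X1 = x1, the output distribution is that of X2 given X1 = x1.
            h_y_given_x1 += p_x1 * entropy([P[(x1, 0)] / p_x1, P[(x1, 1)] / p_x1])
    return entropy(p_y), h_y_given_x1

def symmetric_joint(q):
    """Joint pmf with the symmetric structure of (5.155)."""
    return {(0, 0): q / 2, (1, 1): q / 2, (0, 1): (1 - q) / 2, (1, 0): (1 - q) / 2}

q = 0.6  # arbitrary illustrative value
sum_info, cond_info = adder_mac_informations(symmetric_joint(q))
print(cond_info, h(q))                     # both sides of (5.156)
print(sum_info, h(q) + q * math.log(2))    # both sides of (5.157)
```

The two printed pairs should agree up to floating-point error for any `q` in (0, 1).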
5.F Appendix: Proof of Theorem 5.1
Consider each possible value of the size of the output alphabet |Y|:
Binary output - None of the binary-input binary-output deterministic DMACs has a capacity region dominating the timesharing line joining the points (log 2, 0) and (0, log 2)15 .
Since marginalization is tight for {0, 1} and the bounds are convexified using the
convex hull operation it follows that Rimarg = C.
Ternary output - All the binary-input ternary-output DMACs can be obtained through
isomorphisms of the input and output alphabets with respect to the transition matrices of
the BS-MAC and the BA-MAC [Vin85a]. Hence, marginalization is tight for all of them
as it is for the BS-MAC and the BA-MAC.
15 The capacity region is upper-bounded by R_k ≤ H(Y |X_k ) ≤ H(Y ) ≤ log 2, k = 1, 2, and R_1 + R_2 ≤ H(Y ) ≤
log 2, which defines the triangle joining the points (0, 0), (log 2, 0), and (0, log 2).

Quaternary output - Quaternary output channels allow for perfect decoding of both codewords
given the output. The unique arbitrary distribution maximizing simultaneously
I(X1 ; Y |X2 ), I(X2 ; Y |X1 ), and I(X1 X2 ; Y ) is PX1 X2 (x1 , x2 ) = 1/4 ∀ x1 , x2 ∈ {0, 1}, which
is a product distribution. Hence, Rimarg = Ro = C.
Higher than four-dimensional output - These channels cannot exist since there are only
four possible input combinations.
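The case analysis by output alphabet size can be illustrated numerically. The sketch below, which is only an illustration and not part of the proof, computes the sum rate I(X1 X2 ; Y ) = H(Y ) of a deterministic binary-input channel under uniform independent inputs. The three example maps (XOR, adder, identity) are hypothetical choices made here to obtain a binary, a ternary, and a quaternary output.

```python
import math
from itertools import product

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def output_pmf_uniform(f):
    """Output pmf of the deterministic map y = f(x1, x2) under uniform independent inputs."""
    p_y = {}
    for x1, x2 in product((0, 1), repeat=2):
        y = f(x1, x2)
        p_y[y] = p_y.get(y, 0.0) + 0.25
    return p_y

def sum_rate_uniform(f):
    """I(X1 X2; Y) = H(Y), since H(Y | X1 X2) = 0 for a deterministic channel."""
    return entropy(output_pmf_uniform(f).values())

xor_rate = sum_rate_uniform(lambda a, b: a ^ b)      # binary output
adder_rate = sum_rate_uniform(lambda a, b: a + b)    # ternary output
ident_rate = sum_rate_uniform(lambda a, b: (a, b))   # quaternary output

for name, rate in (("xor", xor_rate), ("adder", adder_rate), ("identity", ident_rate)):
    print(name, rate / math.log(2), "bits")
```

With a binary output the sum rate cannot exceed log 2, whereas the quaternary (invertible) map lets both users transmit at log 2 simultaneously, in line with the bounds stated above.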
5.G
The mutual information ruling the rate of the first link can be decomposed as16
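\[ I(X; Y_1|U) = \sum_{u,x,y_1} P_{UX}(u,x) P_{Y_1|X}(y_1|x) \left[ \log P_{Y_1|X}(y_1|x) + \log \frac{P_U(u)}{\sum_{x'} P_{UX}(u,x') P_{Y_1|X}(y_1|x')} \right] \tag{5.161} \]
\[ = -\sum_{u} \sum_{x} P_{UX}(u,x) H(Y_1|X = x) - D(P_{UY_1} \| Q_{UY_1}), \tag{5.162} \]
where
\[ Q_{UY_1}(u, y_1) = P_U(u) \quad \forall (u, y_1) \in \mathcal{U} \times \mathcal{Y}_1 \tag{5.163} \]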
is a dummy function which is not a probability distribution. While the first term is linear in
PU X and hence concave, the second is concave in (PU Y1 , QU Y1 ) thanks to the convexity of the
divergence, which is based on the log-sum inequality, regardless of the fact that its two arguments
may or may not be probability distributions17 . Since (PU Y1 , QU Y1 ) are linear in PU X , it follows
that I(X; Y1 |U ) is concave in PU X .
Regarding the other mutual information term,
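\[ I(U; Y_2) = \sum_{u} \sum_{x,y_2} P_{UX}(u,x) P_{Y_2|X}(y_2|x) \log \frac{1}{P_U(u)} \tag{5.164} \]
\[ \quad + \sum_{u,y_2} \sum_{x} P_{UX}(u,x) P_{Y_2|X}(y_2|x) \log \frac{\sum_{x'} P_{UX}(u,x') P_{Y_2|X}(y_2|x')}{\sum_{x'} P_X(x') P_{Y_2|X}(y_2|x')} \tag{5.165} \]
\[ = H(U) - \left(-D(P_{UY_2} \| Q_{UY_2})\right), \tag{5.166} \]
where
\[ Q_{UY_2}(u, y_2) = \sum_{x} P_X(x) P_{Y_2|X}(y_2|x) \quad \forall (u, y_2) \in \mathcal{U} \times \mathcal{Y}_2 \tag{5.167} \]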
is another dummy function not satisfying the properties of a probability distribution. By the concavity of the entropy function, the convexity of the divergence, and the fact that (PU , PU Y2 , QU Y2 )
are linear in PU X it follows that I(U ; Y2 ) is a difference of concave functions of PU X .
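The two decompositions can be checked numerically. The following sketch draws a random strictly positive joint pmf P_UX and two random channels, then compares the direct definitions of I(X; Y1 |U ) and I(U ; Y2 ) with the right-hand sides of (5.162) and (5.166). The alphabet sizes, the random seed, and all helper names are assumptions made for illustration; the identities do not rely on the channel being degraded, so arbitrary transition matrices are used.

```python
import math
import random

def random_pmf(n, rng):
    """Strictly positive random pmf of length n."""
    w = [rng.random() + 0.1 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

def divergence(P, Q):
    """Sum of P log(P/Q); Q is allowed to be an unnormalised 'dummy' function."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

rng = random.Random(0)
nu, nx, ny1, ny2 = 2, 3, 4, 3
flat = random_pmf(nu * nx, rng)
P = [flat[u * nx:(u + 1) * nx] for u in range(nu)]      # P[u][x]
W1 = [random_pmf(ny1, rng) for _ in range(nx)]          # W1[x][y1]
W2 = [random_pmf(ny2, rng) for _ in range(nx)]          # W2[x][y2]

p_u = [sum(row) for row in P]
p_x = [sum(P[u][x] for u in range(nu)) for x in range(nx)]
p_uy1 = [[sum(P[u][x] * W1[x][y] for x in range(nx)) for y in range(ny1)] for u in range(nu)]
p_uy2 = [[sum(P[u][x] * W2[x][y] for x in range(nx)) for y in range(ny2)] for u in range(nu)]
p_y2 = [sum(p_x[x] * W2[x][y] for x in range(nx)) for y in range(ny2)]

# Direct definitions of the two mutual informations (in nats).
i_x_y1_given_u = sum(
    P[u][x] * W1[x][y] * math.log(W1[x][y] * p_u[u] / p_uy1[u][y])
    for u in range(nu) for x in range(nx) for y in range(ny1))
i_u_y2 = sum(
    p_uy2[u][y] * math.log(p_uy2[u][y] / (p_u[u] * p_y2[y]))
    for u in range(nu) for y in range(ny2))

# Right-hand side of (5.162): linear term minus a divergence against Q = P_U(u).
h_y1_given_x = [-sum(w * math.log(w) for w in W1[x]) for x in range(nx)]
linear_term = -sum(P[u][x] * h_y1_given_x[x] for u in range(nu) for x in range(nx))
d1 = divergence([p_uy1[u][y] for u in range(nu) for y in range(ny1)],
                [p_u[u] for u in range(nu) for y in range(ny1)])
rhs_162 = linear_term - d1

# Right-hand side of (5.166): H(U) plus a divergence against Q = P_Y2(y2).
h_u = -sum(p * math.log(p) for p in p_u)
d2 = divergence([p_uy2[u][y] for u in range(nu) for y in range(ny2)],
                [p_y2[y] for u in range(nu) for y in range(ny2)])
rhs_166 = h_u + d2

print(i_x_y1_given_u, rhs_162)
print(i_u_y2, rhs_166)
```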
16 When required, we denote the marginals of P_UX by P_U and P_X to shorten notation.
17 Actually, in this case D(·||·) is not a proper divergence since its arguments are not probability distributions,
but this is totally irrelevant for the convexity characterization of the problem.
5.H Appendix: Proof of Lemma 5.5
We will show that satisfaction of the KKT conditions, rephrased as in (5.86) is a necessary
and sufficient condition for ensuring that PU X achieves a local maximum of (5.83)-(5.85) and
therefore (5.86) is a necessary optimality condition. From [Ber95, Sec. 3.4], we know that the
KKT conditions of (5.83)-(5.85), are satisfied by any PU X achieving a local maximum (hence
proving the necessity part). The corresponding Lagrangian is
L(PU X ; U X , ) = I(X; Y1 |U ) + (1 )I(U ; Y2 )

X
X
PU X (u, x) 1 ,
U X (u, x)PU X (u, x) +
+
(5.168)
(5.169)
u,x
u,x
while the derivatives of mutual information are

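\[ \frac{\partial I(X; Y_1|U)}{\partial P_{UX}(u,x)} = D(P_{Y_1|X=x} \| P_{Y_1|U=u}) \tag{5.170} \]
\[ \frac{\partial I(U; Y_2)}{\partial P_{UX}(u,x)} = E_{Y_2|X=x}\left[ \log \frac{P_{Y_2|U=u}(Y_2)}{P_{Y_2}(Y_2)} \right] - \log(e). \tag{5.171} \]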
Setting the derivative of the Lagrangian with respect to PU X (u, x) equal to zero it follows that
at any local maximum the following holds

PY2 |U =u (Y2 )
D(PY1 |X=x ||PY1 |U =u ) + (1 )EY2 |X=x log
= (1 ) log(e) U X (u, x).
PY2 (Y2 )
(5.172)
Expression (5.172) can be rephrased as (5.86) using complementary slackness
(U X (u, x)PU X (u, x) = 0), dual feasibility (U X (u, x) 0), and noticing that an alternative formulation of the mutual informations involved is
\[ I(X; Y_1|U) = \sum_{u,x} P_{UX}(u,x) D(P_{Y_1|X=x} \| P_{Y_1|U=u}) \tag{5.173} \]
\[ I(U; Y_2) = \sum_{u,x} P_{UX}(u,x) E_{Y_2|X=x}\left[ \log \frac{P_{Y_2|U=u}(Y_2)}{P_{Y_2}(Y_2)} \right]. \tag{5.174} \]
To show that any PU X satisfying the KKT conditions is indeed a local maximum of (5.83)-(5.85),
I(U ;Y2 )
1 |U )
let RPU X (u, x) , I(X;Y
PU X (u,x) + (1 ) PU X (u,x) denote a linear combination of (5.170)-(5.171)
evaluated using PU X . If
X
u,x
RPU X (u, x)(QU X (u, x) PU X (u, x)) 0
(5.175)
holds for any arbitrary distribution QU X (u, x) it follows that PU X is a local maximum of (5.83)(5.85) [Ber95, Sec. 2.1]. Since any PU X satisfying the KKT conditions has an associated
RPU X (u, x) = U X (u, x) and satisfies complementary slackness (U X (u, x)PU X (u, x) =
0), (5.175) becomes
X
u,x
U X (u, x)QU X (u, x) 0,
which indeed holds for any QU X thanks to dual feasibility.
(5.176)
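The derivative expressions (5.170)-(5.171) and the alternative formulations (5.173)-(5.174) can be sanity-checked by finite differences. The sketch below is an illustration under stated assumptions: natural logarithms are used (so log(e) = 1), the joint pmf and the two channel matrices are arbitrary strictly positive examples chosen here, and the mutual informations are written directly as functions of the entries of P_UX so that they can be evaluated at slightly perturbed (unnormalised) points.

```python
import math

def info_terms(P, W1, W2):
    """I(X;Y1|U) and I(U;Y2) in nats as explicit functions of the entries P[u][x]."""
    nu, nx = len(P), len(P[0])
    p_u = [sum(row) for row in P]
    i1 = 0.0
    for u in range(nu):
        for y in range(len(W1[0])):
            p_uy = sum(P[u][k] * W1[k][y] for k in range(nx))
            for x in range(nx):
                i1 += P[u][x] * W1[x][y] * math.log(W1[x][y] * p_u[u] / p_uy)
    i2 = 0.0
    for y in range(len(W2[0])):
        p_y = sum(P[u][k] * W2[k][y] for u in range(nu) for k in range(nx))
        for u in range(nu):
            p_uy = sum(P[u][k] * W2[k][y] for k in range(nx))
            i2 += p_uy * math.log(p_uy / (p_u[u] * p_y))
    return i1, i2

def closed_form_gradients(P, W1, W2, u, x):
    """Right-hand sides of (5.170) and (5.171), with log(e) = 1 in nats."""
    nu, nx = len(P), len(P[0])
    p_u = sum(P[u])
    py1_u = [sum(P[u][k] * W1[k][y] for k in range(nx)) / p_u for y in range(len(W1[0]))]
    py2_u = [sum(P[u][k] * W2[k][y] for k in range(nx)) / p_u for y in range(len(W2[0]))]
    py2 = [sum(P[a][k] * W2[k][y] for a in range(nu) for k in range(nx))
           for y in range(len(W2[0]))]
    d1 = sum(W1[x][y] * math.log(W1[x][y] / py1_u[y]) for y in range(len(W1[0])))
    d2 = sum(W2[x][y] * math.log(py2_u[y] / py2[y]) for y in range(len(W2[0]))) - 1.0
    return d1, d2

def finite_difference(P, W1, W2, u, x, step=1e-6):
    """Central-difference approximation of both partial derivatives w.r.t. P[u][x]."""
    up = [row[:] for row in P]
    dn = [row[:] for row in P]
    up[u][x] += step
    dn[u][x] -= step
    f_up, f_dn = info_terms(up, W1, W2), info_terms(dn, W1, W2)
    return tuple((a - b) / (2 * step) for a, b in zip(f_up, f_dn))

# Arbitrary strictly positive example (not taken from the thesis).
P = [[0.1, 0.3], [0.4, 0.2]]
W1 = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
W2 = [[0.8, 0.2], [0.3, 0.7]]

max_gap = max(
    abs(a - b)
    for u in range(2) for x in range(2)
    for a, b in zip(closed_form_gradients(P, W1, W2, u, x),
                    finite_difference(P, W1, W2, u, x)))
print("largest gap between closed form and finite difference:", max_gap)
```

Weighting the closed-form derivatives by P_UX (after adding log(e) back to the second one) recovers the two mutual informations, which is the content of (5.173)-(5.174).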
5.I Appendix: Proof of Proposition 5.8
Let us simplify the notation by using the equivalence PU X (u, x) = Pux and imposing U = {0, 1}
(|U| = 2 suffices). In this case the distributions involved amount to
PY1 |X
1
0
1 0.5
0.5
=
, PY2 |X =
0.5
1 0.5
0
1
(5.177)
for the channel transition matrices,
PY1 |U
(1)P00
P00 +P01
(1)P10
P10 +P11
(1)P01
P00 +P01
(1)P11
P10 +P11
, PY2 |U =
(10.5)P00 +0.5P01
P00 +P01
(10.5)P10 +0.5P11
P10 +P11
0.5P00 +(10.5)P01
P00 +P01
0.5P10 +(10.5)P11
P10 +P11
(5.178)
for the output distributions conditioned on U , and
P Y2 =
(1 0.5)(P00 + P10 ) + 0.5(P01 + P11 )

0.5(P00 + P10 ) + (1 0.5)(P01 + P11 )
(5.179)
for the output distribution of the second receiver. In (5.177)-(5.179) we have used the convention
that the columns of the matrices represent the elements of the output alphabets (Y1 or Y2 ) and
the rows correspond to the natural ordering of the inputs (U or X). Expressions (5.177)-(5.179)
can be used to formulate the conditions of Lemma 5.5 for the BEC-BSC dDMBC as follows
h
(1 0.5)P00 + 0.5P01
P00 + P01
+ (1 ) (1 0.5) log
P00
(P00 + P01 )[(1 0.5)(P00 + P10 ) + 0.5(P01 + P11 )]
i
0.5P00 + (1 0.5)P01
+0.5 log
= R() `00
(P00 + P01 )[0.5(P00 + P10 ) + (1 0.5)(P01 + P11 )]
(5.180)
h
P10 + P11
(1 0.5)P10 + 0.5P11
(1 ) log
+ (1 ) (1 0.5) log
P10
(P10 + P11 )[(1 0.5)(P00 + P10 ) + 0.5(P01 + P11 )]
i
0.5P10 + (1 0.5)P11
+0.5 log
= R() `10
(P10 + P11 )[0.5(P00 + P10 ) + (1 0.5)(P01 + P11 )]
(5.181)
h
P00 + P01
(1 0.5)P00 + 0.5P01
(1 ) log
+ (1 ) 0.5 log
P01
(P00 + P01 )[(1 0.5)(P00 + P10 ) + 0.5(P01 + P11 )]
i
0.5P00 + (1 0.5)P01
+(1 0.5) log
= R() `01
(P00 + P01 )[0.5(P00 + P10 ) + (1 0.5)(P01 + P11 )]
(5.182)
h
P10 + P11
(1 0.5)P10 + 0.5P11
(1 ) log
+ (1 ) 0.5 log
P11
(P10 + P11 )[(1 0.5)(P00 + P10 ) + 0.5(P01 + P11 )]
i
0.5P10 + (1 0.5)P11
+(1 0.5) log
= R() `11 ,
(P10 + P11 )[0.5(P00 + P10 ) + (1 0.5)(P01 + P11 )]
(5.183)
(1 ) log
5.I. Appendix: Proof of Proposition 5.8
where ℓij ≥ 0 and ℓij = 0 if Pij > 0. In order to find an optimal distribution satisfying (5.180)-(5.183)
we need to hypothesize on the number of entries of Pij that are equal to zero. To that
end, consider the following situations:
Hypothesis 1 - PU X has three zero entries. This implies H(U ) = H(X) = 0 and, consequently,
I(X; Y1 |U ) = I(U ; Y2 ) = 0, which is clearly suboptimal.
Hypothesis 2 - PU X has two zero entries. This class of distributions comprises the cases: i)
X = U and X = U , where all the capacity-achieving distributions achieve (0, (1)(1h(0.5)),
ii) U = 0 and U = 1, where all the capacity achieving distributions achieve ((1 ) log 2, 0),
and iii) X = 0 and X = 1, which imply I(X; Y1 |U ) = I(U ; Y2 ) = 0.
Hypothesis 3 - PU X has one zero entry. Consider w.l.o.g. P00 = 0 and Pij > 0 ∀(i, j) ≠ (0, 0).
This implies ℓij = 0 ∀(i, j) ≠ (0, 0) and ℓ00 ≥ 0. The conditions (5.180)-(5.183) cannot be
satisfied simultaneously because the left hand side of (5.180) equals +, while R() C ? () is
bounded and `00 0. Therefore, distributions with one zero entry are never optimal.
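The two non-trivial extreme points named in Hypothesis 2 can be reproduced numerically. The sketch below assumes, as the entries of (5.177) suggest, a binary erasure channel for the first receiver and a binary symmetric channel for the second whose crossover probability is half the erasure probability. The name `eps` for the erasure probability, its numerical value, and the function names are assumptions of this sketch.

```python
import math

def h(p):
    """Binary entropy in nats."""
    return -sum(t * math.log(t) for t in (p, 1.0 - p) if t > 0)

def bec_bsc_rates(P, eps):
    """Return (I(X;Y1|U), I(U;Y2)) in nats for a 2x2 joint pmf P[u][x].

    Assumed channels: Y1 is a BEC(eps) output (middle column = erasure),
    Y2 is a BSC(eps/2) output.
    """
    W1 = [[1 - eps, eps, 0.0], [0.0, eps, 1 - eps]]
    W2 = [[1 - eps / 2, eps / 2], [eps / 2, 1 - eps / 2]]
    p_u = [sum(row) for row in P]
    r1 = 0.0
    for u in range(2):
        for y in range(3):
            p_uy = sum(P[u][x] * W1[x][y] for x in range(2))
            for x in range(2):
                a = P[u][x] * W1[x][y]
                if a > 0:  # zero-probability triples contribute nothing
                    r1 += a * math.log(W1[x][y] * p_u[u] / p_uy)
    r2 = 0.0
    for y in range(2):
        p_y = sum(P[u][x] * W2[x][y] for u in range(2) for x in range(2))
        for u in range(2):
            p_uy = sum(P[u][x] * W2[x][y] for x in range(2))
            if p_uy > 0:
                r2 += p_uy * math.log(p_uy / (p_u[u] * p_y))
    return r1, r2

eps = 0.2  # arbitrary illustrative erasure probability
r_x_equals_u = bec_bsc_rates([[0.5, 0.0], [0.0, 0.5]], eps)  # X = U, uniform
r_u_constant = bec_bsc_rates([[0.5, 0.5], [0.0, 0.0]], eps)  # U = 0, X uniform

print(r_x_equals_u, (0.0, math.log(2) - h(eps / 2)))
print(r_u_constant, ((1 - eps) * math.log(2), 0.0))
```

Under these assumptions, setting X = U gives all the rate to the second receiver, while a constant U gives all the rate to the first one.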
We subsequently focus on distributions PU X with strictly positive entries, which imply ℓij = 0
∀ i, j ∈ {0, 1}. Let us describe such distributions by
p p
,
PU X =
p p
(5.184)
where p , p , , > 0 and (1 + )p + (1 + )p = 1. Setting the left hand sides of (5.180) and
(5.182) equal to each other and using (5.184) it follows
(1 ) log
(10.5)+0.5 0.5(p +p )+(10.5)(p +p )

+ log = 0.
0.5+(10.5) (10.5)(p +p )+0.5(p +p )
(5.185)
Proceeding similarly with the left hand sides of (5.181) and (5.183) we arrive at
(1 ) log
(10.5)+0.5 0.5(p +p )+(10.5)(p +p )

log = 0.
0.5+(10.5) (10.5)(p +p )+0.5(p +p )
(5.186)
Since both (5.185) and (5.186) must hold, we can equal their left hand sides, which imposes
g(; , ) = g(; , ), where g is defined in (5.93). On the other hand, considering that
g(1/; , ) = g(; , ) it follows that = 1/0 with 0 such that g(0 ; , ) = g(; , ).
Rewriting (5.185) as
0.5(p +p ) + (10.5)(p +p )
= g(; , ),
(10.5)(p +p )+0.5(p +p )
(5.187)

0.5 exp g(; , )/(1 ) (1 0.5)
p + p
, .
=
p +
0.5 (1 0.5) exp g(; , )/(1 )
(5.188)
> 0 |g(; , )| < (1 ) log(2/ 1).
(5.189)
(1 ) log
it follows that
Since is the ratio of two strictly positive probabilities, we impose > 0 in (5.188) to obtain
that
An equivalent rephrasing of (5.188) is p /p = ( )/(1 ), which induces a distribution
of the form (5.88)-(5.91). Analyzing the monotony of g(; , ) over the interval of interest
(0, +) it can be shown that

i
2h 1
g(; , )
= sign 2 +
2(1 0.5) 1 ,
sign
(1 0.5)
which shows that for
1
2(10.5)2
(5.190)
the function g(; , ) is strictly decreasing and hence 0 = ,
= 1/. It can be checked that (5.186)-(5.187) imply = = 1 which causes R2 = 0 and the
optimal rate pair to be ((1 ) log 2, 0).

When 0 < <
1
,
2(10.5)2
the function g(; , ) has one local minimum and one local max-
imum, which bounds the number of different values of 0 such that g(0 ; , ) = g(; , ) to a
maximum of three. The maximization of the objective value I(X; Y1 |U ) + (1 )I(U ; Y2 ) over
the distributions of the class (5.88)-(5.91) satisfying (5.189) yields the best distribution satisfying
Lemma 2 with strictly positive probabilities, and its associated rate pair (R1 (), R2 ()). The convex hull of these rate pairs together with the extreme points ((1) log 2, 0), (0, (1h(0.5)) log 2)
is hence C BECBSC since there is no other rate pair achieved by a distribution satisfying Lemma
5.5.
Chapter 6
Conclusions
This dissertation has addressed the problem of multiuser interference in wireless networks from
many different points of view. We started in Chapter 2 with the when and how of partial
interference cancelation, a powerful technique to be used when the receivers of the network have
full statistical knowledge of the interference. By giving special emphasis to the broadcast and
interference channels, we addressed the network scenarios that suffer from multiuser interference
in a more strict sense. The fact that a generalization of the best coding/decoding strategy for
the broadcast channel allowing for simultaneous partial interference cancelation at the receivers
did not achieve larger rates was rather surprising. This, in combination with the realization that
a similar approach makes a big difference in the interference channel gave us the first conclusion
of this thesis: coding and decoding complexity can be traded whenever the interference is under
the control of the same source. Intuitively, appropriate coding techniques exploiting signal
correlation can alleviate the complexity burden of the receivers.
While the study of the broadcast channel did not result in better achievability results, it was
a key stepping stone that enabled the proposal of a novel transmission strategy for the interference channel: superposition coding and aided decoding. Compared to the best long-standing
achievable region for the interference channel, superposition coding required fewer auxiliary random variables and aided decoding allowed some rate inequalities to be relaxed. While these facts
increase the chances of having a potentially larger achievable rate region, recent literature has
shown that the proposed strategy and the best known result yield identical achievable rates. Thus,
the only advantage of the proposed region is simplicity and potentially better performance in
the finite blocklength regime via better error exponents.
In the context of Chapter 2, it is implicitly assumed that the receivers have full knowledge
of the codebooks of the interfering users. However, that is not possible in applications backed
by decentralized wireless networks with uncoordinated nodes. With receivers totally unaware
of interference, partial cancelation becomes infeasible and leaves each sender-destination pair
armed only with point-to-point (single user) strategies. This motivated the study of the totally
asynchronous interference channel with single-user receivers in Chapter 3. Because its capacity
region is rather involved, owing to the need to resort to an information-spectrum formulation, the
evaluation of achievable rates is tackled based on simpler single-letter inner bounds.
The study of the Gaussian case provides the second conclusion of the thesis: contrary
to what happens under frame synchronism, Gaussian-distributed codes do not maximize
the achievable rates for every channel instance. Despite their natural appeal, basically due to
the fact that they always lead to closed-form characterizations of achievable regions, they are
clearly suboptimal whenever the channel happens to be interference-limited. Indeed, this is the
regime where appropriate statistical signal design can have a larger impact on the achievable
rates. Analytical conditions determining the existence of non-Gaussian distributed codes yielding
higher achievable rates are found to be in excellent agreement with the performance of explicit
codes. Besides, the losses associated with the lack of transmission synchronism and the use of
single-user decoders in the low- and high-power and low- and high-interference regimes have
been quantified.
At this point, we adopted in Chapter 4 a rather different approach by studying how to deal
with multiuser interference in a cellular network with an arbitrary number of users under practical constraints. By focusing on half-duplex MIMO terminals, OFDMA transmission, relaying
infrastructure, and partial channel state information, most of the characteristics of upcoming
wireless systems are taken into account for network utility maximization. With terminals unaware of interference, the rate degradation studied in Chapter 3 is mitigated thanks to the
central coordinator role of the base station, which assigns transmission resources disjointly,
thereby facilitating link performance prediction for QoS provision.
Novel concave lower bounds on the ergodic capacities involved in the expression of the
achievable rates in the cell enabled the proposal of two efficient algorithms for global (Pareto)
optimal and sequential optimal resource allocation. These algorithms maximize network utility,
a cell-wide aggregate indicator accounting for overall user satisfaction, and are able to deal with
heterogeneous QoS requirements, diverse wireless equipment quality, and the use of non-ideal
codes. The high-level conclusion of this chapter is that, by describing user satisfaction with a
function depending only on the long-term throughput, the connection between physical layer
operation and higher-layer needs becomes simpler and allows for feasible cross-layer optimization.
The fewer the key performance indicators, the better.
Finally, Chapter 5 explored yet another different facet of multiuser interference: the complexity increase that it poses in the evaluation of fundamental performance limits. While the
capacity evaluation of single-user channels is a problem that can be efficiently solved, the evaluation of capacity regions of multiterminal networks is not such an amenable problem, as it often
leads to unavoidable non-convexities. Conditioned by the lack of computable expressions, we focused on two specific network scenarios: the multiple access channel and the degraded broadcast
channel.
For both channels, the computation of the capacity region implied solving a non-convex
problem. Whereas in the multiple access channel the culprit was a rank-one constraint, in the
degraded broadcast channel the difficulty arose from a difference of convex functions in the
objective. Problems with these types of non-convexities cannot be solved optimally with the
current state of the art. Our focus, hence, was to obtain efficient inner and outer bounds to
their capacity region.
While for the multiple access channel we focused on relaxation methods that not only allowed
us to obtain numerical results but also facilitated some analysis on their optimality, for the
degraded broadcast channel the emphasis was on the characterization of the set of optimal
input probability distributions. The remark of this chapter is that, while in the single-user
case (no interference at all) a homogeneous approach can be taken in the evaluation of channel
capacity, multiuser interference prevents us from doing that in a multiterminal network. Because
interference introduces non-convex constraints of different natures, the set of methods and strategies
used in the evaluation of capacity regions must therefore be diverse.
Bibliography
[20005] IEEE Std. 802.16e 2005, IEEE standard for local and metropolitan area networks, Part 16: Air interface for fixed and mobile broadband wireless access systems, Amendment 2: Physical and medium access control layers for combined fixed and mobile operation in licensed bands, Tech. Rep. IEEE Standards Dept., Dec. 2005.
[36.] 3GPP TS 36.201, Evolved universal terrestrial radio access (E-UTRA), Long term evolution (LTE) physical layer, General description.
[Ahl68] R. Ahlswede, The weak capacity of averaged channels, Z. Wahrscheinlichkeitstheorie und Verw. Gebiet, Vol. 11, pp. 61-73, 1968.
[Ahl71] R. Ahlswede, Multi-way communication channels, Proc. IEEE Intl. Symp. Inform. Theory (ISIT), Tsakhadsor, Armenian SSR, 1971.
[Apk99] P. Apkarian, H. D. Tuan, A sequential SDP/Gauss-Newton algorithm for rank-constrained LMI problems, Proc. IEEE Conf. on Decision and Control, Phoenix, AZ, Dec. 1999.
[Ari72] S. Arimoto, An algorithm for computing the capacity of arbitrary discrete memoryless channels, IEEE Trans. Inform. Theory, Vol. IT-18, pp. 14-20, Jan. 1972.
[Bae06] C. Bae, D.-H. Cho, Adaptive resource allocation based on channel information in multihop OFDM systems, Proc. IEEE VTC Fall, Montreal, Canada, Sep. 2006.
[Ben79] R. Benzel, The capacity region of a class of discrete additive degraded interference channels, IEEE Trans. Inform. Theory, Vol. IT-25, pp. 228-231, March 1979.
[Ber73] P. Bergmans, Random coding theorem for broadcast channels with degraded components, IEEE Trans. Inform. Theory, Vol. IT-19, pp. 197-207, March 1973.
[Ber95] D. P. Bertsekas, Nonlinear programming, Athena Scientific, 1995.
[Bha06] S. R. Bhaskaran, E. Telatar, Kurtosis constraints in communication over fading channels, Proc. IEEE Intl. Conf. on Communications (ICC), Istanbul, Turkey, June 2006.
[Bie79] M. Bierbaum, H.-M. Wallmeier, A note on the capacity region of the multiple-access channel, IEEE Trans. Inform. Theory, Vol. IT-25, pp. 484, July 1979.
[Big04] E. Biglieri, G. Taricco, Transmission and reception with multiple antennas: theoretical foundations, Now Publishers, 2004.
[Big07a] E. Biglieri, R. Calderbank, A. Constantinides, A. Goldsmith, A. Paulraj, H. V. Poor, MIMO Wireless Communications, Cambridge University Press, 2007.
[Big07b] E. Biglieri, M. Lops, Multiuser detection in a dynamic environment - Part I: User identification and data detection, IEEE Trans. Inform. Theory, Vol. 53, pp. 3158-3170, Sep. 2007.
[Bla72] R. E. Blahut, Computation of channel capacity and rate-distortion functions, IEEE Trans. Inform. Theory, Vol. IT-18, pp. 460-473, Jul. 1972.
[Boy04] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[Bro05] S. I. Bross, A. Lapidoth, An improved achievable region for the discrete memoryless two-user multiple-access channel with noiseless feedback, IEEE Trans. Inform. Theory, Vol. 51, pp. 811-833, Mar 2005.
[Cal05] E. Calvo, M. Stojanovic, A coordinate descent algorithm for multichannel multiuser detection in underwater acoustic DS-CDMA systems, Proc. IEEE OCEANS Europe Conf., Brest, France, June 2005.
[Cal07a] E. Calvo, J. R. Fonollosa, Efficient resource allocation for orthogonal transmission in broadcast channels, Proc. IEEE Workshop on Signal Process. Advances for Wireless Commun. (SPAWC), Helsinki, Finland, June 2007.
[Cal07b] E. Calvo, J. R. Fonollosa, J. Vidal, Near-optimal joint power and rate allocation for OFDMA broadcast channels, Proc. IEEE Intl. Conf. on Acoustics, Speech, and Signal Process. (ICASSP), Honolulu, HI, Apr. 2007.
[Cal07c] E. Calvo, D. P. Palomar, J. R. Fonollosa, J. Vidal, The computation of the capacity region of the discrete MAC is a rank-one non-convex optimization problem, Proc. IEEE Intl. Symp. on Inform. Theory (ISIT), Nice, France, June 2007.
[Cal07d] E. Calvo, D. P. Palomar, J. R. Fonollosa, J. Vidal, On the computation of the capacity region of the discrete MAC, submitted to IEEE Trans. Inform. Theory, May 2007.
[Cal07e] E. Calvo, J. Vidal, J. R. Fonollosa, Resource allocation in multihop OFDMA broadcast networks, Proc. IEEE Workshop on Signal Process. Advances for Wireless Commun. (SPAWC), Helsinki, Finland, June 2007.
[Cal08a] E. Calvo, J. R. Fonollosa, J. Vidal, The totally asynchronous interference channel with single-user receivers, submitted to IEEE Trans. Inform. Theory, Sep. 2008.
[Cal08b] E. Calvo, I. Kovács, L. García, J. R. Fonollosa, A reconfigurable downlink air interface: design, simulation methodology, and performance evaluation, Proc. ICT Mobile Summit, Stockholm, Sweden, June 2008.
[Cal08c] E. Calvo, D. P. Palomar, J. R. Fonollosa, J. Vidal, The computation of the capacity region of the discrete degraded BC is a non-convex DC problem, Proc. IEEE Intl. Symp. on Inform. Theory (ISIT), Toronto, Canada, July 2008.
[Cal08d] E. Calvo, M. Stojanovic, Efficient channel estimation-based multi-user detection for underwater CDMA systems, accepted for publication in IEEE Journal of Oceanic Engineering, 2008.
[Cal08e] E. Calvo, J. Vidal, J. R. Fonollosa, Optimal resource allocation in relay-assisted cellular networks with partial CSI, submitted to IEEE Trans. Signal Process., May 2008.
[Car75] A. B. Carleial, A case where interference does not reduce capacity, IEEE Trans. Inform. Theory, Vol. IT-21, pp. 569-570, Sep. 1975.
[Car78] A. B. Carleial, Interference channels, IEEE Trans. Inform. Theory, Vol. IT-24, pp. 60-70, Jan 1978.
[Car83] A. B. Carleial, Outer bounds on the capacity of interference channels, IEEE Trans. Inform. Theory, Vol. IT-29, pp. 602-606, July 1983.
[Cha79] S. Chang, E. J. Weldon Jr., Coding for t-user multiple-access channels, IEEE Trans. Inform. Theory, Vol. IT-25, pp. 684-691, Nov. 1979.
[Cha81] S. Chang, J. K. Wolf, On the t-user m-frequency noiseless multiple-access channel with and without intensity information, IEEE Trans. Inform. Theory, Vol. IT-27, pp. 41-48, Jan. 1981.
[Che93] R. S. Cheng, S. Verdú, On limiting characterizations of memoryless multiuser capacity regions, IEEE Trans. Inform. Theory, Vol. 39, pp. 609-612, Mar. 1993.
[Cho68] C. K. Chow, C. N. Liu, Approximating discrete probability distributions with dependence trees, IEEE Trans. Inform. Theory, Vol. IT-14, pp. 462-467, May 1968.
[Cho06a] H. Chong, M. Motani, H. K. Garg, A comparison of two achievable rate regions for the interference channel, Proc. of the ITA Inaugural Workshop, La Jolla, CA, Feb. 2006.
[Cho06b] H. F. Chong, M. Motani, H. K. Garg, H. El Gamal, On the Han-Kobayashi region for the interference channel, submitted to IEEE Trans. Inform. Theory, Aug. 2006.
[Clo07] P. Closas, E. Calvo, J. Fernandez, A. Pagès, Coupling noise effect in self-synchronizing wireless sensor networks, Proc. IEEE Workshop on Signal Process. Advances for Wireless Commun. (SPAWC), Helsinki, Finland, June 2007.
[Cos85] M. H. M. Costa, On the Gaussian interference channel, IEEE Trans. Inform. Theory, Vol. IT-31, pp. 607-615, Sep. 1985.
[Cos87] M. H. M. Costa, A. El Gamal, The capacity region of the discrete memoryless interference channel with strong interference, IEEE Trans. Inform. Theory, Vol. IT-33, pp. 710-711, Sep. 1987.
[Cov72] T. M. Cover, Broadcast channels, IEEE Trans. Inform. Theory, Vol. IT-18, pp. 2-14, Jan. 1972.
[Cov75] T. M. Cover, An achievable rate region for the broadcast channel, IEEE Trans.
[Cov79] T. M. Cover, A. A. El Gamal, Capacity theorems for the relay channel, IEEE Trans. Inform. Theory, Vol. 25, pp. 572-584, Sep. 1979.
[Cov81a] T. M. Cover, C. S. K. Leung, An achievable rate region for the multiple-access channel with feedback, IEEE Trans. Inform. Theory, Vol. IT-27, pp. 292-298, May 1981.
[Cov81b] T. M. Cover, R. J. McEliece, E. C. Posner, Asynchronous multiple-access channel capacity, IEEE Trans. Inform. Theory, Vol. IT-27, pp. 409-413, July 1981.
[Cov98] T. M. Cover, Comments on broadcast channels, IEEE Trans. Inform. Theory, Vol. 44, pp. 2524-2530, Oct. 1998.
[Cov06] T. M. Cover, J. A. Thomas, Elements of Information Theory, John Wiley & Sons, 2006.
[Dan97] G. B. Dantzig, M. N. Thapa, Linear programming. 1: Introduction, Springer, 1997.
[Dig01] S. N. Diggavi, T. M. Cover, The worst additive noise under a covariance constraint, IEEE Trans. Inform. Theory, Vol. 47, pp. 3072-3081, Nov. 2001.
[Doh04a] M. Dohler, A. Glekias, H. Aghvami, Resource allocation for FDMA-based regenerative multihop links, IEEE Trans. Wireless Commun., Vol. 3, pp. 1989-1993, Nov. 2004.
[Doh04b] M. Dohler, A. Glekias, H. Aghvami, A resource allocation strategy for distributed MIMO multi-hop communication systems, IEEE Commun. Letters, Vol. 8, pp. 99-101, Feb. 2004.
[Doh05] M. Dohler, H. Aghvami, On the approximation of MIMO capacity, IEEE Trans. Wireless Commun., Vol. 4, pp. 30-34, Jan. 2005.
[Dup04] F. Dupuis, W. Yu, F. M. J. Willems, Blahut-Arimoto algorithms for computing channel capacity and rate-distortion with side information, Proc. IEEE Intl. Symp. Inform. Theory (ISIT), Chicago, IL, June/July 2004.
[Eff08] M. Effros, A. Goldsmith, Y. Liang, Capacity definitions for general channels with receiver side information, submitted to IEEE Trans. Inform. Theory, Apr. 2008.
[Etk07] R. H. Etkin, D. N. C. Tse, H. Wang, Gaussian interference channel capacity to within one bit, submitted to IEEE Trans. Inform. Theory, Feb. 2007.
[Fie71] M. Fiedler, Bounds for the determinant of the sum of Hermitian matrices, Proc. American Math. Society, Vol. 30, pp. 27-31, Sep. 1971.
[Gal74] R. G. Gallager, Capacity and coding for degraded broadcast channels, Problemy Peredaci Informaccii, Vol. 10, pp. 3-14, Oct. 1974.
[Gam81] A. El Gamal, E. C. van der Meulen, A proof of Marton's coding theorem for the discrete memoryless broadcast channel, IEEE Trans. Inform. Theory, Vol. IT-27, pp. 120-122, Jan. 1981.
[Gam82] A. El Gamal, M. H. M. Costa, The capacity region of a class of deterministic interference channels, IEEE Trans. Inform. Theory, Vol. IT-28, pp. 343-346, March 1982.
[Gra00] I. S. Gradshteyn, I. M. Ryzhik, Table of integrals, series, and products, Academic Press, 6th Ed., 2000.
[Gro06] IEEE 802.16 Relay Task Group, Multi-hop relay system evaluation methodology (channel model and performance metric), IEEE 802.16j-06/013r2, Nov. 2006.
[Gro07] IEEE 802.16 Broadband Wireless Access Group, IEEE 802.16m system requirements, IEEE 802.16m-07/002r4, Oct. 2007.
[Gup00] P. Gupta, P. R. Kumar, The capacity of wireless networks, IEEE Trans. Inform. Theory, Vol. 46, pp. 388-404, March 2000.
[Haj79] B. E. Hajek, M. B. Pursley, Evaluation of an achievable rate region for the broadcast channel, IEEE Trans. Inform. Theory, Vol. IT-25, pp. 36-46, Jan. 1979.
[Han81]
T. S. Han, K. Kobayashi, A new achievable rate region for the interference channel,
IEEE Trans. Inform. Theory, Vol. IT-27, pp. 4960, Jan. 1981.
[Hel95]
C. W. Helstrom, Elements of signal detection and estimation, Prentice Hall, 1995.
[Hen01]
D. Henrion, G. Meinsma, Rank-one LMIs and Lyapunovs inequality, IEEE Trans.

Autom. Control , Vol. 46, pp. 12851288, Aug. 2001.
[HM05]
A. Hst-Madsen, J. Zhang, Capacity bounds and power allocation for wireless relay
channels, IEEE Trans. Inform. Theory, Vol. 51, pp. 20202040, June 2005.
[Hor99]
R. Horst, N. V. Thoai, DC programming: an overview, Jrnl. of Optim. Theory and

Applications, Vol. 103, pp. 143, Oct. 1999.
[Hui85]
J. Y. N. Hui, P. A. Humblet, The capacity region of the totally asynchronous

multiple-access channel, IEEE Trans. Inform. Theory., Vol. IT-31, pp. 207216, Mar.
1985.
[Iha78]
S. Ihara, On the capacity of channels with additive non-Gaussian noise, Info. Ctrl.,
Vol. 37, pp. 3439, Apr. 1978.
[Jai84] R. Jain, D. Chiu, W. Hawe, A quantitative measure of fairness and discrimination
for resource allocation in shared systems, DEC Research Report TR-301, 1984.
[Jal03] J. Jalden, C. Martin, B. Ottersten, Semidefinite programming for detection in linear
systems - optimality conditions and space-time decoding, Proc. IEEE Intl. Conf. on
Acoustics, Speech, and Signal Processing (ICASSP), Hong Kong, Apr. 2003.
[Kas76] T. Kasami, S. Lin, Coding for a multiple-access channel, IEEE Trans. Inform.
Theory, Vol. IT-22, pp. 129-136, March 1976.
[Kel98] F. P. Kelly, A. Maulloo, D. Tan, Rate control for communication networks: shadow
prices, proportional fairness, and stability, Jrnl. Operations Research Society, Vol. 49,
pp. 237-252, March 1998.
[Kie05] M. Kießling, Unifying analysis of ergodic MIMO capacity in correlated Rayleigh
fading environments, Europ. Trans. Telecomm., Vol. 16, pp. 17-35, Jan. 2005.
[Kra04] G. Kramer, Outer bounds on the capacity of Gaussian interference channels, IEEE
Trans. Inform. Theory, Vol. 50, pp. 581-586, March 2004.
[Ku69] H. Ku, S. Kullback, Approximating discrete probability distributions, IEEE Trans.
[Lap03] A. Lapidoth, S. Moser, Capacity bounds via duality with applications to
multiple-antenna systems on flat fading channels, IEEE Trans. Inform. Theory, Vol. 49,
pp. 2426-2467, Oct. 2003.
[Lee06] K.-D. Lee, V. C. M. Leung, Fair allocation of subcarrier and power in an OFDMA
wireless mesh network, IEEE Jrnl. Select. Areas Commun., Vol. 24, pp. 2051-2060,
Nov. 2006.
[Les06] A. Leshem, E. Zehavi, Bargaining over the interference channel, Proc. IEEE Intl.
Symp. on Inform. Theory (ISIT), Seattle, WA, July 2006.
[Lia72] H. J. Liao, Multiple access channels, Ph.D. dissertation, Dep. Elec. Eng., Univ. of
Hawaii, 1972.
[Lia07] Y. Liang, V. V. Veeravalli, H. V. Poor, Resource allocation for wireless fading relay
channels: max-min solution, IEEE Trans. Inform. Theory, Vol. 53, pp. 3432-3453,
Oct. 2007.
[Lin06] X. Lin, N. B. Shroff, R. Srikant, A tutorial on cross-layer optimization in wireless
networks, IEEE Jrnl. Select. Areas Commun., Vol. 24, pp. 1452-1463, Aug. 2006.
[Liu07] T. Liu, P. Viswanath, An extremal inequality motivated by multiterminal
information theoretic problems, IEEE Trans. Inform. Theory, Vol. 53, pp. 1839-1851, May
2007.
[Ma02] W. Ma, T. N. Davidson, K. M. Wong, Z. Luo, P. Ching, Quasi-maximum-likelihood
multiuser detection using semi-definite relaxation with application to synchronous
CDMA, IEEE Trans. Signal Processing, Vol. 50, pp. 912-922, Apr. 2002.
[Mar79] K. Marton, A coding theorem for the discrete memoryless broadcast channel, IEEE
Trans. Inform. Theory, Vol. IT-25, pp. 306-311, May 1979.
[Meu75] E. van der Meulen, Random coding theorems for the general discrete memoryless
broadcast channel, IEEE Trans. Inform. Theory, Vol. IT-21, pp. 180-190, March
1975.
[Meu77] E. C. van der Meulen, A survey of multi-way channels in information theory:
1961-1976, IEEE Trans. Inform. Theory, Vol. IT-23, pp. 1-37, Jan. 1977.
[Mo00] J. Mo, J. Walrand, Fair end-to-end window-based congestion control, IEEE/ACM
Trans. Networking, Vol. 8, pp. 556-567, Oct. 2000.
[Muñ07] O. Muñoz, J. Vidal, A. Agustín, E. Calvo, A. Alcon, Resource management for
relaying-enhanced WiMAX: OFDM and OFDMA, Workshop Trends in Radio Resource Manag.,
Barcelona, Spain, Nov. 2007.
[Nab04] R. Nabar, H. Bölcskei, F. Kneubühler, Fading relay channels: performance limits and
space-time signal design, IEEE Jrnl. Select. Areas Commun., Vol. 22, pp. 1099-1109,
Aug. 2004.
[Nai07] C. Nair, A. El Gamal, An outer bound to the capacity region of the broadcast
channel, IEEE Trans. Inform. Theory, Vol. 53, pp. 350-355, Jan. 2007.
[Ng07] T. C.-Y. Ng, W. Yu, Joint optimization of relay strategies and resource allocations in
cooperative cellular networks, IEEE Jrnl. Select. Areas Commun., Vol. 25, pp. 328-339,
Feb. 2007.
[Nik93] C. L. Nikias, P. A. Petropulu, Higher order spectral analysis: A nonlinear signal
processing framework, Englewood Cliffs, New Jersey: Prentice Hall, 1993.
[Och06] H. Ochiai, P. Mitran, V. Tarokh, Variable rate two phase collaborative
communication protocols for wireless networks, IEEE Trans. Inform. Theory, Vol. 52,
pp. 4299-4313, Sep. 2006.
[Ors04] R. Orsi, U. Helmke, J. B. Moore, A Newton-like method for solving rank constrained
linear matrix inequalities, Proc. IEEE Conf. on Decision and Control, Paradise Island,
Bahamas, Dec. 2004.
[Oza84] L. H. Ozarow, The capacity region of the white Gaussian multiple access channel
with feedback, IEEE Trans. Inform. Theory, Vol. IT-30, pp. 623-629, July 1984.
[Pal05] D. P. Palomar, Convex primal decomposition for multicarrier linear MIMO
transceivers, IEEE Trans. Signal Processing, Vol. 53, pp. 4661-4674, Dec. 2005.
[Pal07] D. P. Palomar, M. Chiang, Alternative distributed algorithms for network utility
maximization: framework and applications, IEEE Trans. Automatic Control, Vol. 52,
pp. 2254-2269, Dec. 2007.
[Pan08] J. Pang, G. Scutari, F. Facchinei, C. Wang, Distributed power allocation with rate
constraints in parallel interference channels, IEEE Trans. Inform. Theory, Vol. 54,
pp. 3471-3489, Aug. 2008.
[Pap84] A. Papoulis, Probability, random variables, and stochastic processes, McGraw-Hill,
2nd ed., 1984.
[Pea16] K. Pearson, Mathematical contributions to the theory of evolution. XIX. Second
supplement to a memoir on skew variation, Philosophical Trans. of the Royal Society
of London, Series A, Vol. 216, pp. 429-457, 1916.
[Pin64] M. S. Pinsker, Information and information stability of random variables and
processes, San Francisco: Holden-Day, 1964.
[Rez04] M. Rezaeian, A. Grant, Computation of total capacity for discrete memoryless
multiple-access channels, IEEE Trans. Inform. Theory, Vol. 50, pp. 2779-2784, Nov.
2004.
[RT05] J. R. Ruiz-Tolosa, E. Castillo, From vectors to tensors, Springer, 2005.
[Sas04] I. Sason, On achievable rate regions for the Gaussian interference channel, IEEE
Trans. Inform. Theory, Vol. 50, pp. 1345-1356, June 2004.
[Sat78] H. Sato, On the capacity region of a discrete two-user channel for strong interference,
IEEE Trans. Inform. Theory, Vol. IT-24, pp. 377-379, May 1978.
[Sat81] H. Sato, The capacity of the Gaussian interference channel under strong
interference, IEEE Trans. Inform. Theory, Vol. IT-27, pp. 786-788, Nov. 1981.
[SB06] A. Somekh-Baruch, S. Verdú, General relayless networks: representation of the
capacity region, Proc. IEEE Intl. Symp. on Inform. Theory (ISIT), Seattle, WA,
July 2006.
[Scu08a] G. Scutari, D. P. Palomar, S. Barbarossa, Optimal linear precoding strategies for
wideband non-cooperative systems based on game theory - Part I: Nash equilibria,
IEEE Trans. Signal Process., Vol. 56, pp. 1230-1249, March 2008.
[Scu08b] G. Scutari, D. P. Palomar, S. Barbarossa, Optimal linear precoding strategies for
wideband non-cooperative systems based on game theory - Part II: Algorithms, IEEE
Trans. Signal Process., Vol. 56, pp. 1250-1267, March 2008.
[Sha48] C. E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J.,
Vol. 27, pp. 379-423, 1948.
[Sha07] X. Shang, G. Kramer, B. Chen, A new outer bound and the noisy-interference
sum-rate capacity for Gaussian interference channels, submitted to IEEE Trans. Inform.
Theory, Dec. 2007.
[Shi03] H. Shin, J. H. Lee, Capacity of multiple-antenna fading channels: spatial fading
correlation, double scattering, and keyhole, IEEE Trans. Inform. Theory, Vol. 49,
pp. 2636-2647, Oct. 2003.
[Sle73] D. Slepian, J. K. Wolf, Noiseless coding of correlated information sources, IEEE
Trans. Inform. Theory, Vol. IT-19, pp. 471-480, July 1973.
[Teo07] K. H. Teo, Z. Tao, J. Zhang, The mobile broadband WiMAX standard, IEEE Signal
Process. Mag., Vol. 24, pp. 144-148, Sep. 2007.
[Tse05] D. Tse, P. Viswanath, Fundamentals of wireless communication, Cambridge
University Press, 2005.
[Val03] M. C. Valenti, B. Zhao, Distributed turbo codes: Towards the capacity of the relay
channel, Proc. IEEE VTC Fall, Orlando, FL, Oct. 2003.
[Van86] P. Vanroose, E. C. van der Meulen, Coding for the binary switching multiple access
channel, Proc. Symp. Inform. Theory in the Benelux, Noordwijkerhout, 1986.
[Ver89] S. Verdú, Multiple-access channels with memory with and without frame
synchronism, IEEE Trans. Inform. Theory, Vol. 35, pp. 605-619, May 1989.
[Ver94] S. Verdú, T. S. Han, A general formula for channel capacity, IEEE Trans. Inform.
Theory, Vol. 40, pp. 1147-1157, July 1994.
[Vin85a] A. J. Vinck, On the multiple access channel, Proc. 2nd Joint Swedish-Soviet Intl.
Workshop Inform. Theory, Gränna, Sweden, 1985.
[Vin85b] A. J. Vinck, W. Hoeks, K. Post, On the capacity of the 2-user m-ary multiple access
channel with feedback, IEEE Trans. Inform. Theory, Vol. IT-31, pp. 540-543, July
1985.
[Wan05] B. Wang, J. Zhang, A. Høst-Madsen, On the capacity of MIMO relay channels,
IEEE Trans. Inform. Theory, Vol. 51, pp. 29-43, Jan. 2005.
[Wat96] Y. Watanabe, The total capacity of two-user multiple-access channel with binary
output, IEEE Trans. Inform. Theory, Vol. 42, pp. 1453-1465, Sep. 1996.
[Wat02] Y. Watanabe, The total capacity of multiple-access channel, Proc. IEEE Intl. Symp.
Inform. Theory (ISIT), Lausanne, Switzerland, June/July 2002.
[Wei06] H. Weingarten, Y. Steinberg, S. Shamai, The capacity region of the Gaussian
multiple-input multiple-output broadcast channel, IEEE Trans. Inform. Theory,
Vol. 52, pp. 3936-3964, Sep. 2006.
[Wie05] A. Wiesel, Y. C. Eldar, S. Shamai, Semidefinite relaxation for detection of 16-QAM
signaling in MIMO channels, IEEE Letters on Signal Processing, Vol. 12, pp. 653-656,
Sep. 2005.
[Wil82] F. M. J. Willems, The feedback capacity region of a class of discrete memoryless
multiple access channels, IEEE Trans. Inform. Theory, Vol. IT-28, pp. 93-95, Jan.
1982.
[Won99] C. Y. Wong, R. S. Cheng, K. B. Letaief, R. D. Murch, Multiuser OFDM with adaptive
subcarrier, bit, and power allocation, IEEE Jrnl. Select. Areas Commun., Vol. 17,
pp. 1747-1758, Oct. 1999.
[Wyn73] A. D. Wyner, A theorem on the entropy of certain binary sequences and applications:
Part II, IEEE Trans. Inform. Theory, Vol. IT-19, pp. 772-777, Nov. 1973.
[Xie04] L. Xie, P. R. Kumar, A network information theory for wireless communication:
scaling laws and optimal operation, IEEE Trans. Inform. Theory, Vol. 50, pp. 748-767,
May 2004.
[Xue06] Y. Xue, B. Li, K. Nahrstedt, Optimal resource allocation in wireless ad hoc networks:
a price-based approach, IEEE Trans. Mobile Computing, Vol. 5, pp. 347-364, Apr.
2006.
[YJC07] Y.-J. Chang, F.-T. Chien, C.-C. J. Kuo, Cross-layer QoS analysis of opportunistic
OFDM-TDMA and OFDMA networks, IEEE Jrnl. Select. Areas Commun., Vol. 25,
pp. 657-666, May 2007.
[Yu04] W. Yu, W. Rhee, S. Boyd, J. M. Cioffi, Iterative water-filling for Gaussian vector
multiple-access channels, IEEE Trans. Inform. Theory, Vol. 50, pp. 145-152, Jan.
2004.
[Yu05] M. Yu, J. Li, Is amplify-and-forward practically better than decode-and-forward
or vice versa?, Proc. IEEE Intl. Conf. Acoustics, Speech, and Signal Processing
(ICASSP), Philadelphia, PA, March 2005.
[Yu06] W. Yu, R. Lui, Dual methods for non-convex spectrum optimization of multicarrier
systems, IEEE Trans. Commun., Vol. 54, pp. 1310-1322, July 2006.
