
Low-Complexity Precoding for Sum Rate Maximization in Downlink Massive MIMO Systems

Abstract

We propose a precoding scheme to improve the downlink sum rate for a multicell massive multiple-input multiple-output (MIMO) system. We first present a low-complexity approach based on dirty paper coding and zero-forcing that combines a reduced form of QR decomposition and an orthogonal projection. We formulate a downlink sum rate optimization problem that takes both intracell and intercell interference into account, and then use the convex conjugate to transform the problem into an unconstrained dual problem, whose optimal solution is found by a quasi-Newton algorithm with low complexity per iteration. We prove that the proposed algorithm converges faster than other methods, and the numerical results verify that the proposed precoding design outperforms conventional precoding methods in multicell massive MIMO systems.
Hieu V. Nguyen, Van-Dinh Nguyen, and Oh-Soon Shin
Index Terms: Dirty paper coding, massive MIMO, optimization, precoding, zero-forcing.
I. INTRODUCTION
Massive multiple-input multiple-output (MIMO) has recently been considered a promising technology for next-generation wireless systems [1], [2]. In particular, [2] showed that the achievable rate of a massive MIMO system can nearly reach the channel capacity even without intercell cooperation, provided that a sufficiently large number of antennas is employed at the base station (BS). As in traditional MIMO systems, precoding at the BS plays an important role in determining the downlink achievable rate in massive MIMO systems. Most previous works have adopted linear precoding schemes for downlink transmission, including maximum ratio combining/maximum ratio transmission (MRC/MRT), zero-forcing (ZF), and minimum mean square error (MMSE) precoding [3], [4], because linear precoding can be implemented with low computational complexity while still performing well when the BS has a very large number of antennas. In practice, however, the number of antennas is limited, so linear precoding cannot fully exploit the capability of massive antenna arrays.
To improve the throughput of a MIMO system, previous works applied a combination of ZF and dirty paper coding (DPC), referred to as successive zero-forcing dirty paper coding (SZF-DPC) and originally proposed in [5], instead of the nonlinear DPC in [6], which has very high complexity. Accordingly, recently published works optimize the downlink transmission [7], [8]. In particular, [8] designed a precoder based on SZF-DPC to maximize the sum rate using an iterative Newton's method, which has a computational complexity of $O(KM^3)$, where $K$ is the number of users and $M$ is the number of antennas at the BS. This complexity might be acceptable in a traditional MIMO system with only a few antennas. However, complexity that grows with the cube of $M$ is likely a barrier to realizing massive MIMO systems with hundreds or thousands of antennas. Moreover, previous works have only considered a single-cell system and ignored intercell interference. This motivates us to develop a low-complexity precoding scheme suitable for multicell massive MIMO systems.
The authors are with the School of Electronic Engineering, Soongsil University, Seoul 06978, Korea (email: {hieuvnguyen, nguyenvandinh, osshin}@ssu.ac.kr).
In this letter, we design a low-complexity precoding method for downlink transmission in multicell massive MIMO systems. The method is based on the SZF-DPC approach, and we first formulate an achievable sum rate maximization problem with a sum power constraint. In particular, we exploit economy-size QR decomposition and orthogonal projection to eliminate both the intracell and intercell interference. As a result, the computational complexity can be as low as $O(KM^2)$, in contrast with the $O(KM^3)$ complexity of the scheme in [8]. We then derive a dual problem by applying the convex conjugate to the original problem, which significantly reduces the number of variables. The optimal solution of the dual problem is found using a modified version of an iterative quasi-Newton method known as the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. Only two iterations are required for the BFGS algorithm, no matter how the parameters change. Interestingly, the complexity per iteration is $O(1)$ when the BFGS algorithm is used, since it only evaluates a scalar expression, whereas it is $O(K)$ when the typical water-filling method is used to solve a log-sum problem [9].
Notation: $(\cdot)^T$ and $(\cdot)^H$ respectively denote the transpose and the Hermitian transpose of a vector or matrix. $(\cdot)^{-1}$ and $\mathrm{tr}(\cdot)$ are respectively the inverse and the trace of a matrix. $\mathrm{diag}(\mathbf{x})$ denotes a diagonal matrix whose main-diagonal entries are given by the vector $\mathbf{x}$. $[\mathbf{x}]_i$ stands for the $i$-th element of vector $\mathbf{x}$, and $\mathbf{x} \succ \mathbf{y}$, where $\mathbf{x}$ and $\mathbf{y}$ are vectors of the same size, denotes $[\mathbf{x}]_i > [\mathbf{y}]_i$, $\forall i$. $\mathbf{1}^T$ and $\mathbf{1}$ respectively denote a row vector and a column vector with all elements equal to 1. $\mathbf{0}^T$ and $\mathbf{0}$ are similarly defined. The terms $\inf$ and $\sup$ represent the infimum and the supremum, respectively.
II. SYSTEM MODEL
We consider a multicell system comprising $L$ cells, where each BS is equipped with $M$ antennas and serves $K$ single-antenna users. The channel between the BS in the $\ell$-th cell and the $k$-th user in the $j$-th cell is modeled as $\mathbf{g}_{\ell,jk} = \sqrt{\beta_{\ell,jk}}\,\mathbf{h}_{\ell,jk}$, where $\beta_{\ell,jk}$ stands for the path loss and shadowing, and $\mathbf{h}_{\ell,jk} \in \mathbb{C}^{1\times M}$ represents the channel vector between the $M$ antennas at the BS in the $\ell$-th cell and the $k$-th user in the $j$-th cell. The entries of $\mathbf{h}_{\ell,jk}$ are assumed to be independent and identically distributed (i.i.d.) $\mathcal{CN}(0,1)$. The BS of the $\ell$-th cell broadcasts the information symbol vector $\mathbf{s}_\ell$ to all $K$ users in the cell. Let $\mathbf{s}_\ell = [s_{\ell 1}\ s_{\ell 2}\ \ldots\ s_{\ell K}]^T$ and assume that $\mathbb{E}\{\mathbf{s}_\ell\} = \mathbf{0}$ and $\mathbb{E}\{\mathbf{s}_\ell\mathbf{s}_\ell^H\} = \mathbf{I}_K$.
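Before transmitting the signal to the $k$-th user in the $\ell$-th cell, the BS multiplies the information symbol by a precoding vector $\mathbf{w}_{\ell k} \in \mathbb{C}^{M\times 1}$. The signal received at the $k$-th user in the $\ell$-th cell is corrupted by intercell interference from the BS of the $j$-th cell ($j \neq \ell$). Therefore, the signal received at the $k$-th user in the $\ell$-th cell can be written as
$$x_{\ell k} = \sqrt{p_d}\,\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell k}s_{\ell k} + \sqrt{p_d}\sum_{j=1,\, j\neq\ell}^{L}\sum_{i=1}^{K}\mathbf{g}_{j,\ell k}\mathbf{w}_{ji}s_{ji} + z_{\ell k} + \sqrt{p_d}\sum_{i=1,\, i < k}^{K}\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell i}s_{\ell i} + \sqrt{p_d}\sum_{i=1,\, i > k}^{K}\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell i}s_{\ell i}, \tag{1}$$
where $p_d$ denotes the transmit power at the BS and the additive noise $z_{\ell k}$ is assumed to be i.i.d. $\mathcal{CN}(0,1)$. The precoding vectors satisfy the power constraint $\mathrm{tr}(\mathbf{W}_\ell^H\mathbf{W}_\ell) = 1$, where $\mathbf{W}_\ell \triangleq [\mathbf{w}_{\ell 1}\ \mathbf{w}_{\ell 2}\ \cdots\ \mathbf{w}_{\ell K}] \in \mathbb{C}^{M\times K}$.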
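As discussed in [7], [8], the intracell interference component $\sqrt{p_d}\sum_{i=1,\, i < k}^{K}\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell i}s_{\ell i}$ is assumed to be noncausally known at the BS when DPC is used, and thus it is completely removed from the received signal at the $k$-th user in the $\ell$-th cell. Accordingly, the downlink achievable sum rate over all cells in the system is given as
$$R_{\mathrm{DPC}} = \sum_{\ell=1}^{L}\sum_{k=1}^{K}\log_2\!\left(1 + \frac{p_d|\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell k}|^2}{I^{\mathrm{inter}}_{\ell k} + I^{\mathrm{intra}}_{\ell k} + 1}\right), \tag{2}$$
where the intercell and the intracell interference terms are defined as $I^{\mathrm{inter}}_{\ell k} \triangleq p_d\sum_{j=1,\, j\neq\ell}^{L}\sum_{i=1}^{K}|\mathbf{g}_{j,\ell k}\mathbf{w}_{ji}|^2$ and $I^{\mathrm{intra}}_{\ell k} \triangleq p_d\sum_{i=1,\, i > k}^{K}|\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell i}|^2$, respectively.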
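As an illustrative sketch (not part of the original letter), the sum rate in (2) can be evaluated numerically as follows. The array layout (`g[j, l, k]` for the row channel from BS `j` to user `k` in cell `l`, `w[l, k]` for the precoders) and the function name `dpc_sum_rate` are assumptions made for this example only.

```python
import numpy as np

def dpc_sum_rate(g, w, p_d):
    """Evaluate the sum rate (2).

    g : complex array of shape (L, L, K, M); g[j, l, k] is the channel
        from BS j to user k in cell l (hypothetical layout).
    w : complex array of shape (L, K, M); w[l, k] is the precoder of
        user k in cell l.
    """
    L, K, _ = w.shape
    total = 0.0
    for l in range(L):
        for k in range(K):
            signal = p_d * abs(g[l, l, k] @ w[l, k]) ** 2
            # Intercell interference: all users of all other cells.
            inter = p_d * sum(abs(g[j, l, k] @ w[j, i]) ** 2
                              for j in range(L) if j != l
                              for i in range(K))
            # Intracell interference: only users encoded after user k (i > k),
            # since DPC removes the contribution of users i < k.
            intra = p_d * sum(abs(g[l, l, k] @ w[l, i]) ** 2
                              for i in range(k + 1, K))
            total += np.log2(1.0 + signal / (inter + intra + 1.0))
    return total

# Smallest possible case: one cell, one user, one antenna.
g_demo = np.ones((1, 1, 1, 1), dtype=complex)
w_demo = np.ones((1, 1, 1), dtype=complex)
rate_demo = dpc_sum_rate(g_demo, w_demo, p_d=3.0)  # log2(1 + 3) = 2 bits/s/Hz
```

The triple loop is written for readability rather than speed; a vectorized version would be preferable for large $L$ and $K$.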
III. PROPOSED PRECODING DESIGN
The sum rate in (2) is nonconvex with respect to the precoding vectors $\mathbf{w}_{\ell k}$, and thus the sum rate maximization problem is not easy to solve. To deal with this, the precoding matrix in the $\ell$-th cell is designed to contain a component orthogonal to the channel vectors from the BS in the $\ell$-th cell to all users in the $j$-th cell ($j \neq \ell$), so that the signal from the $\ell$-th cell is eliminated at the users in the $j$-th cell. The orthogonal component is also incorporated into the precoding design based on SZF-DPC to remove the intracell interference as well. As a result, the sum rate maximization problem for precoding design under the zero-forcing constraints and a sum power constraint is derived from (2) as
$$\underset{\mathbf{w}_{\ell k}}{\text{maximize}}\ \sum_{k=1}^{K}\log_2\!\left(1 + p_d|\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell k}|^2\right), \tag{3a}$$
$$\text{subject to}\ \mathbf{g}_{\ell,ji}\mathbf{w}_{\ell k} = 0,\ \forall j \in \mathcal{I}_L\setminus\{\ell\},\ \text{and}\ \forall i \in \mathcal{I}_K, \tag{3b}$$
$$\mathbf{g}_{\ell,\ell i}\mathbf{w}_{\ell k} = 0,\ \forall i < k, \tag{3c}$$
$$\mathrm{tr}(\mathbf{W}_\ell^H\mathbf{W}_\ell) \le 1, \tag{3d}$$
where $\mathcal{I}_N$ denotes an index set defined as $\mathcal{I}_N \triangleq \{1, 2, \cdots, N\}$, with $N$ an arbitrary natural number. Constraints (3b) and (3c) respectively eliminate the intercell and intracell interference, while (3d) is the sum power constraint at the $\ell$-th cell. Compared with (2), the roles of the subscripts $i$ and $k$ in (3c) are interchanged for convenience in expressing and solving the problem. Note that (3) is concave and thus can be solved using standard convex optimization packages. However, the computational complexity may be very high since general-purpose convex tools cannot take the special structure of the optimization problem into account. This motivates us to develop an efficient, low-complexity algorithm to solve (3) while preserving the optimality of the solution.
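Let $\mathbf{G}_{\ell,j} = [\mathbf{g}_{\ell,j1}^H\ \cdots\ \mathbf{g}_{\ell,jK}^H]^H \in \mathbb{C}^{K\times M}$, $\forall j \in \mathcal{I}_L$. Constraint (3b) can then be written in matrix form as $\mathbf{G}_\ell^H\mathbf{W}_\ell = \mathbf{0}$, where $\mathbf{G}_\ell \in \mathbb{C}^{M\times(L-1)K}$ is defined as $\mathbf{G}_\ell \triangleq [\mathbf{G}_{\ell,1}^H\ \cdots\ \mathbf{G}_{\ell,(\ell-1)}^H\ \mathbf{G}_{\ell,(\ell+1)}^H\ \ldots\ \mathbf{G}_{\ell,L}^H]$. Constraint (3b) can be removed by constructing $\mathbf{W}_\ell = \mathbf{P}_{\mathbf{G}_\ell}\bar{\mathbf{W}}_\ell$, where $\mathbf{P}_{\mathbf{G}_\ell}$ is the orthogonal projection onto the orthogonal complement of the column space of $\mathbf{G}_\ell$, computed as $\mathbf{P}_{\mathbf{G}_\ell} = \mathbf{I}_M - \mathbf{G}_\ell(\mathbf{G}_\ell^H\mathbf{G}_\ell)^{-1}\mathbf{G}_\ell^H$. As a result, constraint (3b) is always satisfied no matter how $\bar{\mathbf{W}}_\ell$ is chosen, since $\mathbf{G}_\ell^H\mathbf{W}_\ell = \mathbf{G}_\ell^H\mathbf{P}_{\mathbf{G}_\ell}\bar{\mathbf{W}}_\ell = \mathbf{0}$. Note that each column of the matrix $\mathbf{G}_\ell$ and each column of the product $\mathbf{P}_{\mathbf{G}_\ell}\bar{\mathbf{W}}_\ell$ correspond to $\mathbf{g}_{\ell,ji}$ and $\mathbf{w}_{\ell k}$ in (3b), respectively.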
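The projection step can be checked numerically with a minimal NumPy sketch. The sizes, the random channels, and the helper name `null_projection` are assumptions chosen for illustration; the code only demonstrates that any matrix premultiplied by the projection satisfies the zero-forcing constraint (3b).

```python
import numpy as np

def null_projection(G):
    """Return I - G (G^H G)^{-1} G^H for a tall, full-column-rank G (M x N)."""
    M = G.shape[0]
    # Solve (G^H G) X = G^H rather than forming the inverse explicitly.
    X = np.linalg.solve(G.conj().T @ G, G.conj().T)
    return np.eye(M) - G @ X

rng = np.random.default_rng(0)
M, K, L = 32, 4, 3  # illustrative sizes (assumptions)
N = (L - 1) * K     # number of users in the other cells

# Columns of G play the role of the Hermitian-transposed other-cell channels.
G = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
P = null_projection(G)

# Any W_bar gives a precoder W = P W_bar with no intercell leakage.
W_bar = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
W = P @ W_bar
leakage = np.linalg.norm(G.conj().T @ W)  # numerically zero
```

The sketch also lets one confirm the properties $\mathbf{P}^H = \mathbf{P} = \mathbf{P}^2$ that are used later for the power constraint.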
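Now we define $\bar{\mathbf{G}}_{\ell,\ell} \triangleq \mathbf{G}_{\ell,\ell}\mathbf{P}_{\mathbf{G}_\ell}$ and express the QR decomposition of $\bar{\mathbf{G}}_{\ell,\ell}$ as
$$\bar{\mathbf{G}}_{\ell,\ell} = \mathbf{Y}_\ell\mathbf{B}_\ell. \tag{4}$$
By applying the Gram-Schmidt procedure to the rows of $\bar{\mathbf{G}}_{\ell,\ell}$, $\mathbf{Y}_\ell \in \mathbb{C}^{K\times K}$ becomes a lower triangular matrix, and $\mathbf{B}_\ell \in \mathbb{C}^{K\times M}$ has $K$ orthonormal rows. This implies that $\mathbf{B}_\ell\mathbf{B}_\ell^H = \mathbf{I}_K$ but $\mathbf{B}_\ell^H\mathbf{B}_\ell \neq \mathbf{I}_M$. We construct $\bar{\mathbf{W}}_\ell = \mathbf{B}_\ell^H\mathbf{\Omega}_\ell$, where $\mathbf{\Omega}_\ell = \mathrm{diag}([\omega_{\ell 1}\ \cdots\ \omega_{\ell K}])$ and $\omega_{\ell k}$ is an optimization variable. Accordingly, the precoding matrix is expressed as
$$\mathbf{W}_\ell = \mathbf{P}_{\mathbf{G}_\ell}\mathbf{B}_\ell^H\mathbf{\Omega}_\ell. \tag{5}$$
As a result, constraint (3c) is always satisfied since $\mathbf{G}_{\ell,\ell}\mathbf{W}_\ell = \bar{\mathbf{G}}_{\ell,\ell}\mathbf{B}_\ell^H\mathbf{\Omega}_\ell = \mathbf{Y}_\ell\mathbf{B}_\ell\mathbf{B}_\ell^H\mathbf{\Omega}_\ell = \mathbf{Y}_\ell\mathbf{\Omega}_\ell$. The product $\mathbf{Y}_\ell\mathbf{\Omega}_\ell$ is a lower triangular matrix, and each of its entries above the main diagonal corresponds to $\mathbf{g}_{\ell,\ell i}\mathbf{w}_{\ell k} = 0$ for $i < k$. Meanwhile, $\mathbf{g}_{\ell,\ell k}\mathbf{w}_{\ell k} = Y_{\ell,kk}\,\omega_{\ell k}$, where $Y_{\ell,kk}$ is the entry in the $k$-th row and $k$-th column of $\mathbf{Y}_\ell$.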
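The construction in (4) and (5) can be sketched in NumPy as follows. This is an illustration under assumed sizes and random channels, not the authors' implementation; in particular, the row-wise factorization is obtained here by applying `numpy.linalg.qr` to the Hermitian transpose, and the values in `omega` are placeholders rather than optimized powers.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, N = 32, 4, 8  # N stands for (L-1)K; illustrative sizes (assumptions)

def crandn(*shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

G_other = crandn(M, N)  # stacked other-cell channels (as columns)
G_own = crandn(K, M)    # own-cell channels (as rows)

# Projection that nulls the other-cell channels.
P = np.eye(M) - G_other @ np.linalg.solve(G_other.conj().T @ G_other,
                                          G_other.conj().T)
G_bar = G_own @ P

# Economy-size QR of G_bar^H (M x K): G_bar^H = Q R, hence G_bar = R^H Q^H.
Q, R = np.linalg.qr(G_bar.conj().T, mode="reduced")
Y = R.conj().T  # K x K, lower triangular
B = Q.conj().T  # K x M, orthonormal rows

omega = rng.uniform(0.5, 1.5, size=K)  # placeholder amplitudes
W = P @ B.conj().T @ np.diag(omega)    # precoder of eq. (5)

E = G_own @ W  # effective channel, expected to equal Y diag(omega)
```

Because `E` is lower triangular, user $k$ only sees contributions from users $i \le k$, which is the structure that DPC exploits.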
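We define a Hermitian matrix as $\mathbf{T}_\ell \triangleq \mathbf{B}_\ell\mathbf{P}_{\mathbf{G}_\ell}\mathbf{B}_\ell^H$. From (5) and the relationship $\mathbf{P}_{\mathbf{G}_\ell}^H = \mathbf{P}_{\mathbf{G}_\ell} = \mathbf{P}_{\mathbf{G}_\ell}^2$, the left-hand side of constraint (3d) can be written as $\mathrm{tr}(\mathbf{W}_\ell^H\mathbf{W}_\ell) = \mathrm{tr}(\mathbf{T}_\ell\mathbf{\Omega}_\ell\mathbf{\Omega}_\ell^H)$. Let the vector $\mathbf{c}_\ell \in \mathbb{C}^{1\times K}$ comprise the main diagonal entries of $\mathbf{T}_\ell$, and let $\bar{\boldsymbol{\omega}}_\ell \triangleq [\omega_{\ell 1}^2\ \ldots\ \omega_{\ell k}^2\ \ldots\ \omega_{\ell K}^2]^T$ be a vector of optimization variables. We also define $\bar{\mathbf{y}}_\ell \triangleq p_d[Y_{\ell,11}^2\ \ldots\ Y_{\ell,kk}^2\ \ldots\ Y_{\ell,KK}^2]$. Then, the optimization problem (3) becomes equivalent to
$$\underset{\bar{\boldsymbol{\omega}}_\ell \succ \mathbf{0}}{\text{maximize}}\ \sum_{k=1}^{K}\log_2\!\left(1 + [\bar{\mathbf{y}}_\ell]_k[\bar{\boldsymbol{\omega}}_\ell]_k\right) \tag{6a}$$
$$\text{subject to}\ \mathbf{c}_\ell\bar{\boldsymbol{\omega}}_\ell = 1. \tag{6b}$$
Note that (6) can be easily solved for the $K$-length vector variable $\bar{\boldsymbol{\omega}}_\ell$ using a water-filling algorithm. However, the best water-filling implementation takes $O(K)$ per iteration [9]. Therefore, we propose an approach that reduces the complexity per iteration and requires fewer iterations, starting from the following proposition.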
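For reference, a water-filling baseline for a problem of the form (6) can be sketched as below. This is a plain bisection on the water level, assuming real positive gains `y` and weights `c` and allowing zero allocations; it is not the linear-complexity algorithm of [9], and the function name and test values are hypothetical.

```python
def water_filling(y, c, iters=100):
    """Maximize sum(log2(1 + y[k] * w[k])) subject to sum(c[k] * w[k]) = 1, w >= 0.

    The optimality conditions give w[k] = max(0, mu - c[k] / y[k]) / c[k],
    where the water level mu is fixed by the weighted power constraint.
    """
    floors = [ck / yk for ck, yk in zip(c, y)]
    lo, hi = min(floors), max(floors) + 1.0  # the budget is 0 at lo and >= 1 at hi
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        used = sum(max(0.0, mu - f) for f in floors)
        if used > 1.0:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return [max(0.0, mu - f) / ck for f, ck in zip(floors, c)]

# Three users with unit weights: the weakest user receives no power.
w_demo = water_filling(y=[4.0, 1.0, 0.25], c=[1.0, 1.0, 1.0])
budget_demo = sum(w_demo)
```

Each bisection step touches all $K$ users, which illustrates why the per-iteration cost of this family of methods scales with $K$.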
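Proposition 1: The Lagrangian $\mathcal{L}$ associated with problem (6) can be defined as
$$\mathcal{L}(\Upsilon_\ell, \nu_\ell) = f(\Upsilon_\ell - \mathbf{1}) + \nu_\ell(\bar{\mathbf{c}}_\ell\Upsilon_\ell - \rho_\ell),$$
where $f(\Upsilon_\ell) \triangleq -\sum_{k=1}^{K}\log_2(\upsilon_{\ell k})$ and $\nu_\ell$ is the dual variable associated with problem (6). An alternative variable $\upsilon_{\ell k} \triangleq 1 + [\bar{\mathbf{y}}_\ell]_k[\bar{\boldsymbol{\omega}}_\ell]_k > 1$ is introduced to define $\Upsilon_\ell \triangleq [\upsilon_{\ell 1}\ \upsilon_{\ell 2}\ \ldots\ \upsilon_{\ell K}]^T \succ \mathbf{1}$. Let $\bar{\mathbf{Y}}_\ell = \mathrm{diag}(\bar{\mathbf{y}}_\ell)^{-1}$, $\bar{\mathbf{c}}_\ell \triangleq \mathbf{c}_\ell\bar{\mathbf{Y}}_\ell$, and $\rho_\ell \triangleq \mathbf{c}_\ell\bar{\mathbf{Y}}_\ell\mathbf{1} + 1$. Then the Lagrange dual problem of (6) can be given as
$$\max_{\nu_\ell}\ g(\nu_\ell) = -\rho_\ell\nu_\ell + \nu_\ell\bar{\mathbf{c}}_\ell\mathbf{1} + K + \sum_{k=1}^{K}\log_2[\nu_\ell\bar{\mathbf{c}}_\ell^T]_k. \tag{7}$$
Finally, the solution of (6) can be found from the optimal dual variable $\nu_\ell^\star$ of (7) as
$$\bar{\boldsymbol{\omega}}_\ell = \frac{1}{\ln 2}\,\bar{\mathbf{Y}}_\ell\,\mathrm{diag}(\nu_\ell^\star\bar{\mathbf{c}}_\ell^T)^{-1}\mathbf{1}. \tag{8}$$
Proof: See Appendix A.
Algorithm 1: Proposed algorithm to solve (7)
Input: Initial point $\nu_\ell = 1$, tolerance $\epsilon = 10^{-5}$, and step size $t = 1$. Let $\phi = -\nabla_{\nu_\ell}\tilde{g}(\nu_\ell)$, $\Theta = 1$, and $\phi' = 0$.
Repeat:
1: Calculate $\phi = \phi - \phi'$ and $\Delta\nu_\ell = \Theta\phi$.
2: Exit the loop if $\|\phi\| \le \epsilon$.
3: Update $\nu_\ell$, $\phi'$, and $\Theta$ in the following order: $\nu_\ell \leftarrow \nu_\ell + t\Delta\nu_\ell$; $\phi' \leftarrow \phi + \nabla_{\nu_\ell}\tilde{g}(\nu_\ell)$; and $\Theta \leftarrow$ result in (10).
Output: The dual optimal value $\nu_\ell^\star$.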
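The recovery of the primal solution in Proposition 1 can be checked numerically with the short sketch below, which applies (14) and (11) from Appendix A elementwise. The input values and the function name `allocation_from_dual` are made up for illustration, and the dual optimum is taken as $\nu_\ell^\star = K/\ln 2$, the root of the gradient (9) when $\rho_\ell - \bar{\mathbf{c}}_\ell\mathbf{1} = 1$.

```python
import math

def allocation_from_dual(y_bar, c, nu_star):
    """Recover omega_bar from the dual variable via (14) and then (11)."""
    omega_bar = []
    for yk, ck in zip(y_bar, c):
        c_bar_k = ck / yk  # k-th entry of c_bar = c diag(y_bar)^{-1}
        upsilon_k = 1.0 / (math.log(2.0) * nu_star * c_bar_k) + 1.0  # eq. (14)
        omega_bar.append((upsilon_k - 1.0) / yk)                     # eq. (11)
    return omega_bar

y_bar = [4.0, 1.0, 0.25]  # hypothetical effective gains
c = [1.0, 2.0, 4.0]       # hypothetical diagonal entries of T
K = len(c)
nu_star = K / math.log(2.0)

omega_bar = allocation_from_dual(y_bar, c, nu_star)
power_used = sum(ck * wk for ck, wk in zip(c, omega_bar))  # left-hand side of (6b)
```

With these inputs the constraint (6b) is met with equality, and each entry reduces to $1/(K[\mathbf{c}_\ell]_k)$.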
The maximization problem in (7) is concave, so an iterative algorithm can be applied to find the solution. We adopt the BFGS algorithm, which is a quasi-Newton method, for the following reasons. (i) The BFGS algorithm approximates the inverse of the Hessian matrix to lower the complexity per iteration, and thus it is suitable for real-time implementations [10]. (ii) By setting a special initial point for the dual variable, the BFGS algorithm can find the optimal solution within a limited number of iterations.
Let $\Theta$ be the approximation of the inverse of the Hessian matrix. The gradient of the objective function $\tilde{g}(\nu_\ell) = -g(\nu_\ell)$ is given as
$$\nabla_{\nu_\ell}\tilde{g}(\nu_\ell) = \rho_\ell - \mathbf{1}^T\bar{\mathbf{c}}_\ell^T - \frac{K}{\nu_\ell\ln 2}. \tag{9}$$
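In the $i$-th iteration, the algorithm works as follows.
1) Calculate $\Delta\nu_\ell^{(i)} = -\Theta^{(i)}\nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(i)})$.
2) Stop if $\|\nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(i)})\| \le \epsilon$.
3) Update the dual variable as $\nu_\ell^{(i+1)} = \nu_\ell^{(i)} + \Delta\nu_\ell^{(i)}$.
4) Calculate $\phi' = \nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(i+1)}) - \nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(i)})$.
5) Calculate $\Theta^{(i+1)}$ as given in [10]:
$$\Theta^{(i+1)} = \Theta^{(i)} - \frac{\Theta^{(i)}\phi'(\Delta\nu_\ell^{(i)})^T + \Delta\nu_\ell^{(i)}(\phi')^T\Theta^{(i)}}{(\Delta\nu_\ell^{(i)})^T\phi'} + \frac{\left((\Delta\nu_\ell^{(i)})^T\phi' + (\phi')^T\Theta^{(i)}\phi'\right)\Delta\nu_\ell^{(i)}(\Delta\nu_\ell^{(i)})^T}{\left((\Delta\nu_\ell^{(i)})^T\phi'\right)^2}. \tag{10}$$
6) Go back to Step 1).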
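The steps above can be sketched for the scalar dual variable as follows. This is a minimal illustration rather than the authors' code: the names `grad_dual` and `bfgs_scalar` are hypothetical, `xi` stands for $\rho_\ell - \mathbf{1}^T\bar{\mathbf{c}}_\ell^T$, and the update (10) is replaced by its scalar form, which collapses to the ratio $\Delta\nu/\phi'$.

```python
import math

def grad_dual(nu, xi, K):
    """Gradient (9) of g_tilde, with xi = rho - 1^T c_bar^T."""
    return xi - K / (nu * math.log(2.0))

def bfgs_scalar(xi, K, nu=1.0, theta=1.0, tol=1e-5, max_iter=50):
    """Scalar BFGS loop; returns the dual variable and the number of updates."""
    g = grad_dual(nu, xi, K)
    for i in range(max_iter):
        d_nu = -theta * g                   # step 1: search direction
        if abs(g) <= tol:                   # step 2: stopping test
            return nu, i
        nu_next = nu + d_nu                 # step 3: unit step size
        g_next = grad_dual(nu_next, xi, K)
        phi = g_next - g                    # step 4: gradient difference
        theta = d_nu / phi                  # step 5: scalar form of (10)
        nu, g = nu_next, g_next
    return nu, max_iter

nu_opt, updates = bfgs_scalar(xi=1.0, K=10)
```

With `xi = 1` and the initial point `nu = 1`, the first update already lands on $K/\ln 2$ and the loop exits at the stopping test of the second pass.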
The above loop still requires the computation of gradients for different values of $\nu_\ell$. To overcome this limitation, we present a modified version of the original BFGS algorithm in Algorithm 1. At the beginning, we set $\phi = -\nabla_{\nu_\ell}\tilde{g}(\nu_\ell)$, $\Theta = 1$, and $\phi' = 0$. The update step evaluates the gradient only once, and the first step of the next iteration reuses the stored values of $\phi$ and $\phi'$ to compute $-\nabla_{\nu_\ell}\tilde{g}(\nu_\ell)$. This algorithm is also applicable to the general case of multiple dual variables.
The dual problem in (7) reduces the optimization variable to a scalar $\nu_\ell$ instead of the $K$-length vector $\bar{\boldsymbol{\omega}}_\ell$ in (6). As a result, the complexity per iteration of the iterative algorithm is only $O(1)$. Furthermore, the proposed method obtains an optimal solution within two iterations, no matter how the other parameters change. The proof is given in Appendix B. The total complexity of the proposed method is determined by computing $\mathbf{P}_{\mathbf{G}_\ell}$, the QR decomposition in (4), and the precoding matrix in (5), which takes $(L+3)KM^2 + ((3(L-1)^2+1)K^2 + K + 1)M + (L-1)^3K^3$ flops. Thus, it is approximated as $O(KM^2)$, which is far less than the $O(KM^3)$ of single-cell precoding based on Newton's method in [8], especially for large $M$.
Fig. 1. Comparison of the performance of the proposed design with those of conventional schemes (MRT, ZF, MMSE, proposed, and DPC). (a) Sum rate per cell (sum throughput in Gbps) versus the number of antennas at the BS, $M$. (b) The cumulative distribution of the per-user rate (bps/Hz) with $p_d = 15$ dBm. (c) The cumulative distribution of the per-user rate (bps/Hz) with $p_d = 30$ dBm.
Fig. 2. Convergence behavior (error tolerance versus number of iterations) for $K = 10$ and $K = 20$, comparing CG, water-filling, Newton's method, and the proposed algorithm.
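To give a feel for the flop count stated in Section III, the expression can be evaluated directly. The function name `proposed_flops` is hypothetical, and the comparison with $KM^3$ is only an order-of-magnitude reference, since the constant factors of the scheme in [8] are not given here.

```python
def proposed_flops(L, K, M):
    """Flop count of the proposed precoder as stated in Section III."""
    return ((L + 3) * K * M**2
            + ((3 * (L - 1)**2 + 1) * K**2 + K + 1) * M
            + (L - 1)**3 * K**3)

# Parameters matching the simulation setup: L = 7 cells, K = 10 users, M = 500 antennas.
flops = proposed_flops(L=7, K=10, M=500)  # 30,671,500 flops
cubic_reference = 10 * 500**3             # K * M^3 = 1.25e9
```

For this setting the leading term $(L+3)KM^2$ accounts for 25,000,000 of the 30,671,500 flops, so the $O(KM^2)$ approximation captures most of the cost.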
IV. NUMERICAL RESULTS
We consider a hexagonal-cell system in which a center cell is surrounded by 6 neighboring cells, i.e., $L = 7$. Each BS is assumed to be located at the center of its cell, and the bandwidth is set to 20 MHz with the power spectral density of the additive noise assumed to be $-174$ dBm/Hz. We consider two values of the transmit power $p_d$, 15 dBm and 30 dBm, in accordance with 3GPP TR 36.942 v.9.0.1, with a 46-dBm maximum transmit power for the 20 MHz bandwidth. We first examine the case where the BS in each cell serves $K = 10$ users. Fig. 1(a) depicts the sum rate per cell versus the number of antennas $M$. The proposed design is shown to provide better performance than conventional precoding schemes, i.e., MRT, ZF, and MMSE (see footnote 1). Compared with DPC in [6], the rate loss of our method decreases significantly as $M$ increases, which was proved in [11, Theorem 2]. Figs. 1(b) and 1(c) show the cumulative distribution of the per-user rate with $M = 500$ when $p_d = 15$ dBm and 30 dBm, respectively. The proposed design is seen to outperform the others, and the gain increases with $p_d$.
Footnote 1: Only linear precoding was considered in previous works in the context of multicell massive MIMO systems. This is why we compare the sum throughput of the proposed scheme only with that of linear precoding schemes.
TABLE I
COMPARISON OF CONVERGENCE AND COMPLEXITY
Method | Number of Iterations | Complexity per Iteration
Water-filling | 8 | $O(K)$
Conjugate gradient | $K$ | $O(1)$
Newton's | 7 | $O(1)$
BFGS | 2 | $O(1)$
Fig. 2 compares the convergence behavior of the proposed algorithm with that of three other algorithms, namely water-filling [9], conjugate gradient (CG) [12], and Newton's method [8], for different values of $K$. We show the most typical cases out of ten thousand random runs. Overall, the number of iterations required by CG grows linearly with the number of users, while that of the others does not change with the number of users. The proposed algorithm is shown to provide excellent convergence behavior; it converges in two iterations, as discussed in Section III. In particular, although the complexity per iteration of the proposed approach is comparable to that of Newton's method and CG, it converges in far fewer iterations, as summarized in Table I.
V. CONCLUSIONS
This letter presents a precoding design based on SZF-DPC for a multicell massive MIMO system. We have derived a dual problem associated with the sum rate maximization problem by using the QR decomposition and the convex conjugate. We developed a modified BFGS algorithm to find the optimal solution with reduced complexity, and numerical results verify that the proposed precoding approach outperforms previously proposed schemes while also providing low complexity and fast convergence.
APPENDIX A
THE LAGRANGE DUAL PROBLEM
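First, let $\upsilon_{\ell k} \triangleq 1 + [\bar{\mathbf{y}}_\ell]_k[\bar{\boldsymbol{\omega}}_\ell]_k$ be an alternative variable; then $\Upsilon_\ell \triangleq [\upsilon_{\ell 1}\ \upsilon_{\ell 2}\ \cdots\ \upsilon_{\ell K}]^T \succ \mathbf{1}$ since $\bar{\boldsymbol{\omega}}_\ell \succ \mathbf{0}$. Therefore, $\bar{\boldsymbol{\omega}}_\ell$ and $\Upsilon_\ell$ are related as
$$\bar{\boldsymbol{\omega}}_\ell = \bar{\mathbf{Y}}_\ell(\Upsilon_\ell - \mathbf{1}), \tag{11}$$
where $\bar{\mathbf{Y}}_\ell = \mathrm{diag}(\bar{\mathbf{y}}_\ell)^{-1}$. By substituting (11) into (6), the optimization problem is written as
$$\underset{\Upsilon_\ell \succ \mathbf{1}}{\text{minimize}}\ f(\Upsilon_\ell) = -\sum_{k=1}^{K}\log_2\upsilon_{\ell k} \tag{12a}$$
$$\text{subject to}\ \bar{\mathbf{c}}_\ell\Upsilon_\ell = \rho_\ell. \tag{12b}$$
To derive a dual problem, we consider the Lagrangian associated with (12), $\mathcal{L}(\Upsilon_\ell, \nu_\ell) = f(\Upsilon_\ell - \mathbf{1}) + \nu_\ell(\bar{\mathbf{c}}_\ell\Upsilon_\ell - \rho_\ell)$, where $f(\Upsilon_\ell - \mathbf{1})$ is used instead of $f(\Upsilon_\ell)$ so that the constraint $\Upsilon_\ell \succ \mathbf{1}$ is included in the Lagrangian $\mathcal{L}(\Upsilon_\ell, \nu_\ell)$. The Lagrange dual function associated with the optimization problem (6) is expressed as
$$g(\nu_\ell) = \inf_{\Upsilon_\ell}\mathcal{L}(\Upsilon_\ell, \nu_\ell) = \inf_{\Upsilon_\ell}\left(f(\Upsilon_\ell - \mathbf{1}) + \nu_\ell(\bar{\mathbf{c}}_\ell\Upsilon_\ell - \rho_\ell)\right) = -\rho_\ell\nu_\ell - f^*(-\nu_\ell\bar{\mathbf{c}}_\ell^T) + \nu_\ell\bar{\mathbf{c}}_\ell\mathbf{1}. \tag{13}$$
Note that the convex conjugate of the shifted function $f(\Upsilon_\ell - \mathbf{1})$ is given by $f^*(-\nu_\ell\bar{\mathbf{c}}_\ell^T) - \nu_\ell\bar{\mathbf{c}}_\ell\mathbf{1}$. As in [13], the Legendre transform of $-\log(x)$ is given by $-(1 + \log(-x^*))$, where $x^*$ belongs to the dual space of the real vector space that contains $x$. Accordingly, the convex conjugate $f^*(-\nu_\ell\bar{\mathbf{c}}_\ell^T)$ is expressed as $f^*(-\nu_\ell\bar{\mathbf{c}}_\ell^T) = -K - \sum_{k=1}^{K}\log_2[\nu_\ell\bar{\mathbf{c}}_\ell^T]_k$. By applying this to (13), we obtain the dual problem given in (7). By solving the dual problem, we get the optimal dual variable $\nu_\ell^\star$. Then, the optimal solution of problem (6) is computed from the stationarity condition among the Karush-Kuhn-Tucker (KKT) conditions for $\mathcal{L}(\Upsilon_\ell, \nu_\ell)$, which is given as $\nabla_{\Upsilon_\ell}f(\Upsilon_\ell - \mathbf{1}) + \nu_\ell^\star\bar{\mathbf{c}}_\ell^T = \mathbf{0}$, where $\nabla_{\Upsilon_\ell}f(\Upsilon_\ell)$ is the gradient of $f(\Upsilon_\ell)$. The equivalent equation is given as
$$\upsilon_{\ell k} = \frac{1}{\ln 2\,[\nu_\ell^\star\bar{\mathbf{c}}_\ell^T]_k} + 1. \tag{14}$$
By substituting (14) into (11), the optimal solution $\bar{\boldsymbol{\omega}}_\ell$ of (6) is derived as (8).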
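The substitution of (14) into (11) can be written out elementwise as a short check. It uses only the definitions $\bar{\mathbf{Y}}_\ell = \mathrm{diag}(\bar{\mathbf{y}}_\ell)^{-1}$ and $\bar{\mathbf{c}}_\ell = \mathbf{c}_\ell\bar{\mathbf{Y}}_\ell$ from Proposition 1; the second line is an added consistency check against the constraint (6b), not a step taken from the original proof.

```latex
\begin{align*}
[\bar{\boldsymbol{\omega}}_\ell]_k
  &= \frac{\upsilon_{\ell k} - 1}{[\bar{\mathbf{y}}_\ell]_k}
   = \frac{1}{\ln 2\,\nu_\ell^\star\,[\bar{\mathbf{c}}_\ell]_k\,[\bar{\mathbf{y}}_\ell]_k}
   = \frac{1}{\ln 2\,\nu_\ell^\star\,[\mathbf{c}_\ell]_k},
  \qquad \text{since } [\bar{\mathbf{c}}_\ell]_k = \frac{[\mathbf{c}_\ell]_k}{[\bar{\mathbf{y}}_\ell]_k}, \\
\mathbf{c}_\ell\bar{\boldsymbol{\omega}}_\ell
  &= \sum_{k=1}^{K}\frac{[\mathbf{c}_\ell]_k}{\ln 2\,\nu_\ell^\star\,[\mathbf{c}_\ell]_k}
   = \frac{K}{\nu_\ell^\star\ln 2}.
\end{align*}
```

The constraint (6b) therefore holds when $\nu_\ell^\star = K/\ln 2$, which is also the value at which the gradient (9) vanishes when $\rho_\ell - \mathbf{1}^T\bar{\mathbf{c}}_\ell^T = 1$.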
APPENDIX B
CONVERGENCE OF THE PROPOSED ALGORITHM
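With a scalar dual variable, the approximated inverse of the Hessian $\Theta$ is initially set to 1, which makes the first iteration of the BFGS algorithm equivalent to a gradient descent step. The scalar $\nu_\ell$ is simply initialized as $\nu_\ell^{(0)} = 1$. From (9), the gradient of $\tilde{g}$ at $\nu_\ell^{(0)}$ is $\nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(0)}) = \xi_\ell - \frac{K}{\ln 2}$, where $\xi_\ell = \rho_\ell - \mathbf{1}^T\bar{\mathbf{c}}_\ell^T$. In the next iteration, the value of $\nu_\ell^{(1)}$ is updated as $\nu_\ell^{(1)} = \nu_\ell^{(0)} - \nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(0)}) = 1 - \xi_\ell + \frac{K}{\ln 2}$. Therefore, the gradient in the second iteration becomes
$$\nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(1)}) = \frac{(\xi_\ell - 1)(K - \xi_\ell\ln 2)}{\ln 2 - \xi_\ell\ln 2 + K}. \tag{15}$$
With $\rho_\ell$ and $\bar{\mathbf{c}}_\ell$ defined in Proposition 1, we have $\rho_\ell = \bar{\mathbf{c}}_\ell\mathbf{1} + 1$, and hence $\xi_\ell = 1$ always holds for the optimization problem. Consequently, $\nabla_{\nu_\ell}\tilde{g}(\nu_\ell^{(1)})$ in (15) is equal to 0, which is the exit condition of the loop.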
REFERENCES
[1] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, Jan. 2013.
[2] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
[3] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilot contamination and precoding in multi-cell TDD systems,” IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2640–2651, Aug. 2011.
[4] J. Choi, “Massive MIMO with joint power control,” IEEE Wireless Commun. Lett., vol. 3, no. 4, pp. 329–332, Aug. 2014.
[5] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inform. Theory, vol. 49, no. 7, pp. 1691–1706, July 2003.
[6] N. Jindal, W. Rhee, S. Vishwanath, S. A. Jafar, and A. Goldsmith, “Sum power iterative water-filling for multi-antenna Gaussian broadcast channels,” IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1570–1580, Apr. 2005.
[7] A. D. Dabbagh and D. J. Love, “Precoding for multiple antenna Gaussian broadcast channels with successive zero-forcing,” IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3837–3850, July 2007.
[8] L.-N. Tran, M. Juntti, M. Bengtsson, and B. Ottersten, “Beamformer designs for MISO broadcast channels with zero-forcing dirty paper coding,” IEEE Trans. Wireless Commun., vol. 12, no. 3, pp. 1173–1185, Mar. 2013.
[9] S. Khakurel, C. Leung, and T. Le-Ngoc, “A generalized water-filling algorithm with linear complexity and finite convergence time,” IEEE Wireless Commun. Lett., vol. 3, no. 2, pp. 225–228, Apr. 2014.
[10] A. M. Nezhad, R. A. Shandiz, and A. E. Jahromi, “A particle swarm-BFGS algorithm for nonlinear programming problems,” Comput. Oper. Res., vol. 40, no. 4, pp. 963–972, Apr. 2013.
[11] J. Lee and N. Jindal, “High SNR analysis for MIMO broadcast channels: Dirty paper coding versus linear precoding,” IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4787–4792, Dec. 2007.
[12] S. D. Gray, “Multipath reduction using constant modulus conjugate gradient techniques,” IEEE J. Select. Areas Commun., vol. 10, no. 8, pp. 1300–1305, Oct. 1992.
[13] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge Univ. Press, 2004.
... , x k−1 can be properly eliminated at the receiver of user k. The reader is referred to [5], [6], [16], [29] for further understanding of DPC. ...
... Update q ⋆ ← q ⋆ − α u 25: end while 26: end if a: The first component (steps [1][2][3][4][5][6][7][8] This component checks efficiently whether the optimal rate vector R ⋆ belongs to one of the K! corner points of the (K − 1)-dimensional polytope in roughly K 2 /2 trials. It is based on the observation that if there exists a permutation π such that R ⋆ π(K) , . . . ...
Article
Full-text available
Transmission schemes taking both sum rate and fairness into account for dirty paper coding (DPC) based MIMO downlink communications are investigated in this paper. In contrast to existing works which have mostly focused on maximizing the sum rate, we first investigate the problem of finding the maximal sum rate achieved by DPC when the qualitative notions of fairness such as max-min fairness and proportional fairness are employed. This corresponds to a nonconvex problem and cannot be solved by usual weighted sum rate techniques. Several efficient methods for finding the optimal solutions are presented in this paper when the order of users is adjustable during DPC encoding. Simulation results show surprisingly and impact greatly on the design of practical systems that it is often possible to achieve the sum rate capacity with absolute fairness, i.e., an equal rate for each user, when multiple encoding orders of users are used during transmission. When sum rate capacity and absolute fairness cannot be achieved at the same time, the optimal tradeoff between sum rate and fairness is also provided for a general class of quantitative fairness measures.
... However, the precoding scheme was also important for the performance enhancement of the sum-rate. In [25], a precoding scheme was proposed for the downlink sum-rate optimization in a multi-cell massive MIMO system, which combined a reduced form of the orthogonal projection and QR decomposition. Reference [26] proposed a precoder scheme based on perantenna power constraints to increase the sum-rate in MIMO systems. ...
Article
Full-text available
Abstract Sum‐rate optimization problem is a key issue in a time‐varying multi‐user multi‐input multi‐output (MU‐MIMO) distributed antenna system. Channel precoding matrix is the key technology in the achievable sum‐rate optimization problem. In this paper, two algorithms are proposed to solve the achievable sum‐rate optimization problem for a time‐varying MU‐MIMO distributed antenna system, namely one‐dimensional search algorithm (OSA) and cyclic MMSE (Minimum Mean Square Error) search algorithm (CMSA). In order to design the channel precoding matrix for the achievable sum‐rate optimization, a one‐dimensional search algorithm is proposed in the time‐varying MU‐MIMO distributed antenna system. For the problem with a large amount of computation in OSA, a cyclic MMSE search algorithm is proposed in the time‐varying MU‐MIMO distributed antenna system. Simulation results show that the proposed algorithms can effectively improve the sum‐rate compared to other algorithms. Simulation results also show that OSA is superior to CMSA in the sum‐rate, and the channel state information (CSI) coefficient has little effect on OSA and CMSA.
... TheDL NOMA transmission requires to be distinguished from the DL transmission scheme based on dirty paper coding (DPC), where the BS sends the encoded signals (called dirty messages) to the DL UEs. In fact, the messages sent to DL UEs are jointly encoded in order for the cumulative signal received at a certain DL UE to become a purely desired message of the UE [12][13][14]. The key idea of the DPC is based on the encoding technique at the BS, while the DL NOMA uses the SIC technique at the DL UEs. ...
Article
Full-text available
Non-orthogonal multiple access (NOMA) is a promising technology for next-generation wireless networks with emerging demands on low latency, high throughput, and massive connectivity. Unlike orthogonal multiple access, NOMA allows multiple users to share the same radio resources, which significantly improves spectral efficiency (SE). To achieve green wireless communications for numerous networked devices, NOMA helps reduce energy consumption while satisfying rate fairness and quality-of-experience requirements. The goal of this paper is to introduce the innovative approaches for NOMA in terms of the SE and energy efficiency, and discuss emerging technologies involved with NOMA. Further, its challenges and future research directions are highlighted.
Article
In this paper, rapidly converging low-complexity iterative transmit precoding (TPC) techniques are proposed for the massive multiple-input multiple-output (MIMO) downlink. First of all, the proposed random block-based iterative TPC (RBI-TPC) algorithm performs its iterations by updating multiple rather than a single component at each instant, where the updating order of each block containing multiple components relies on the samples randomly sampled from a discrete distribution. Based on the analytically derived convergence rate, we demonstrate that improved convergence is achieved by the block-based update mechanism conceived since the correlation between multiple components can be beneficially exploited. Then, the random sampling that determines the updating order is studied. By applying conditional random sampling, the updating order is optimized based on the latest updates for attaining more rapid convergence. We also demonstrate that the associated updating order may become deterministic under specific conditions so that a fixed but optimized updating order can be used for facilitating the practical implementations, which paves the way for conceiving the ordered block-based iterative TPC (OBI-TPC) algorithm. Finally, the concept of successive over-relaxation (SOR) is adopted for further convergence improvement and simulations are presented to illustrate the performance improvements of the proposed RBI and OBI TPC algorithms compared to the existing low-complexity iterative TPC schemes.
Article
In this paper, the random iterative method is introduced to massive multiple-input multiple-output (MIMO) systems for the efficient downlink linear precoding. By adopting the random sampling into the traditional iterative methods, the matrix inversion within the linear precoding schemes can be approximated statistically, which not only achieves a faster exponential convergence with low complexity but also experiences a global convergence without suffering from the various convergence requirements. Specifically, based on the random iterative method, the randomized iterative precoding algorithm (RIPA) is firstly proposed and we show its approximation error decays exponentially and globally along with the number of iterations. Then, with respect to the derived convergence rate, the concept of conditional sampling is introduced, so that further optimization and enhancement are carried out to improve both the convergence and the efficiency of the randomized iterations. After that, based on the equivalent iteration transformation, the modified randomized iterative precoding algorithm (MRIPA) is presented, which achieves a better precoding performance with low-complexity for various scenarios of massive MIMO. Finally, simulation results based on downlink precoding in massive MIMO systems are given to show the system gains of RIPA and MRIPA in terms of performance and complexity.
Preprint
We study joint unicast and multigroup multicast transmission in single-cell massive multiple-input-multiple-output (MIMO) systems, under maximum ratio transmission. For the unicast transmission, the objective is to maximize the weighted sum spectral efficiency (SE) of the unicast user terminals (UTs) and for the multicast transmission the objective is to maximize the minimum SE of the multicast UTs. These two problems are coupled to each other in a conflicting manner, due to their shared power resource and interference. To address this, we formulate a multiobjective optimization problem (MOOP). We derive the Pareto boundary of the MOOP analytically and determine the values of the system parameters to achieve any desired Pareto optimal point. Moreover, we prove that the Pareto region is convex, hence the system should serve the unicast and multicast UTs at the same time-frequency resource.
Article
Integrating massive multiple input multiple output (MIMO) architecture and non-orthogonal multiple access (NOMA) technology is deemed as a promising solution to the gridlock of ever increasing demand in data rate via extending communication frequency to millimeter wave (mmWave) band and improving system spectrum efficiency. In this paper, we consider the downlink operation of MIMO-NOMA systems, where the base station (BS) is equipped with massive antenna array that uses hybrid precoding to reduce the number of required radio-frequency (RF) chains without performance loss with less hardware cost. A clustering strategy and a joint power allocation scheme of such systems are proposed to maximize the sum-rate. Specifically, we adopt the concept of chordal distance in the clustering strategy, i.e., the selection of cluster head users and assignment of other users in each beam. Finally, we consider a power allocation scheme for each user that takes care of inter-beam and intra-beam interference. Upon the formulation of the power allocation problem while considering all relevant constraints, it is transformed to a dual problem via the Lagrange duality theorem, and then the optimal power allocation is derived by the optimal dual value. Simulation results show that the proposed scheme can achieve better performance in terms of spectrum efficiency compared to the existing methods, and the mmWave-based massive MIMO-NOMA system outperforms the MIMO-OMA system.
Preprint
We study the joint unicast and multi-group multicast transmission in massive multiple-input-multiple-output (MIMO) systems. We consider a system model that accounts for channel estimation and pilot contamination, and derive achievable spectral efficiencies (SEs) for unicast and multicast user terminals (UTs), under maximum ratio transmission and zero-forcing precoding. For unicast transmission, our objective is to maximize the weighted sum SE of the unicast UTs, and for the multicast transmission, our objective is to maximize the minimum SE of the multicast UTs. These two objectives are coupled in a conflicting manner, due to their shared power resource. Therefore, we formulate a multiobjective optimization problem (MOOP) for the two conflicting objectives. We derive the Pareto boundary of the MOOP analytically. As each Pareto optimal point describes a particular efficient trade-off between the two objectives of the system, we determine the values of the system parameters (uplink training powers, downlink transmission powers, etc.) to achieve any desired Pareto optimal point. Moreover, we prove that the Pareto region is convex, hence the system should serve the unicast and multicast UTs at the same time-frequency resource. Finally, we validate our results using numerical simulations.
Article
In this paper, we study the problem of downlink (DL) sum rate maximization in codebook based multiuser (MU) multiple input multiple output (MIMO) systems. The user equipments (UEs) estimate the DL channels using pilot symbols sent by the access point (AP) and feed back the estimates to the AP over a control channel. We present a closed form expression for the achievable sum rate of the MU-MIMO broadcast system with codebook constrained precoding based on the estimated channels, where multiple data streams are simultaneously transmitted to all users. Next, we present novel, computationally efficient, minorization-maximization (MM) based algorithms to determine the selection of beamforming vectors and power allocation to each beam that maximizes the achievable sum rate. Our solution involves multiple uses of MM in a nested fashion. Based on this approach, we propose and contrast two algorithms, which we call the square-root-MM (SMM) and inverse-MM (IMM) algorithms. The algorithms are iterative and converge to a locally optimal beamforming vector selection and power allocation solution from any initialization. We evaluate the performance and complexity of the algorithms for various values of the system parameters, compare them with existing solutions, and provide further insights into how they can be used in system design.
Article
To support cell-edge users with certain quality-of-service (QoS), base stations (BSs) may need to exchange their channel state information (CSI), which would not be practical in a massive multiple-input multiple-output (MIMO) system. In this paper, joint power control is considered for QoS when noncooperative beamforming is used, which results in a low overhead to exchange limited CSI between BSs. For joint power control, we derive bounds on signal-to-interference-plus-noise ratio (SINR) in terms of slow-fading coefficients. Due to joint power control, a better performance is achieved.
Article
We consider the beamformer design for multiple-input single-output (MISO) broadcast channels (MISO BCs) using zero-forcing dirty paper coding (ZF-DPC). Assuming a sum power constraint (SPC), most previously proposed beamformer designs are based on the QR decomposition (QRD), which is a natural choice to satisfy the ZF constraints. However, the optimality of the QRD-based design for ZF-DPC has remained unknown. In this paper, first, we analytically establish that the QRD-based design is indeed optimal for any performance measure under an SPC. Then, we propose an optimal beamformer design method for ZF-DPC with per-antenna power constraints (PAPCs), using a convex optimization framework. The beamformer design is first formulated as a rank-1-constrained optimization problem. Exploiting the special structure of the ZF-DPC scheme, we prove that the rank constraint can be relaxed and still provide the same solution. In addition, we propose a fast-converging algorithm for the beamformer design problem, under the duality framework between the BCs and multiple access channels (MACs). More specifically, we show that a BC with ZF-DPC has the dual MAC with ZF-based successive interference cancellation (ZF-SIC). In this way, the beamformer design for ZF-DPC is transformed into a power allocation problem for ZF-SIC, which can be solved more efficiently.
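The QRD-based idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the cited paper's algorithm: it assumes a toy 2-user, 3-antenna channel `H` (made-up values), the signal model y = Hx, and a Gram-Schmidt orthonormalization of the rows of `H`, which is equivalent to a reduced QR decomposition of the conjugate transpose of `H`. The function names `orthonormalize_rows` and `effective_channel` are hypothetical. With the resulting beamformers, the effective channel becomes lower triangular, so user k sees interference only from users encoded before it, which is the part dirty paper coding is meant to pre-cancel.

```python
def orthonormalize_rows(H):
    """Modified Gram-Schmidt on the rows of H (K x N, assumed full row rank)."""
    basis = []
    for row in H:
        v = list(row)
        for q in basis:
            # Hermitian inner product <q, v> = sum(conj(q_i) * v_i)
            c = sum(qi.conjugate() * vi for qi, vi in zip(q, v))
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = sum(abs(vi) ** 2 for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis


def effective_channel(H, basis):
    """Entry (k, j) is the gain user k sees from stream j when the
    beamformer of stream j is the complex conjugate of basis vector j."""
    return [
        [sum(h * q.conjugate() for h, q in zip(h_row, q_row)) for q_row in basis]
        for h_row in H
    ]


# Toy channel: 2 single-antenna users (rows), 3 transmit antennas (columns).
H = [
    [1 + 1j, 2 + 0j, 0.5j],
    [0.3 + 0j, 1 - 1j, 2 + 0.5j],
]
Q = orthonormalize_rows(H)
G = effective_channel(H, Q)

# The entry above the diagonal is numerically zero: user 0 receives
# nothing from the stream intended for user 1.
print(abs(G[0][1]))
```

The remaining nonzero entry below the diagonal, `G[1][0]`, is the known interference that the transmitter would pre-cancel with DPC in this scheme.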
Article
This paper surveys recent advances in the area of very large MIMO systems. With very large MIMO, we think of systems that use antenna arrays with an order of magnitude more elements than in systems being built today, say a hundred antennas or more. Very large MIMO entails an unprecedented number of antennas simultaneously serving a much smaller number of terminals. The disparity in number emerges as a desirable operating condition and a practical one as well. The number of terminals that can be simultaneously served is limited, not by the number of antennas, but rather by our inability to acquire channel-state information for an unlimited number of terminals. Larger numbers of terminals can always be accommodated by combining very large MIMO technology with conventional time- and frequency-division multiplexing via OFDM. Very large MIMO arrays are a new research field in communication theory, propagation, and electronics, and represent a paradigm shift in the way of thinking with regard to theory, systems, and implementation. The ultimate vision of very large MIMO systems is that the antenna array would consist of small active antenna units, plugged into an (optical) fieldbus.
Article
This paper considers a multi-cell multiple antenna system with precoding used at the base stations for downlink transmission. For precoding at the base stations, channel state information (CSI) is essential at the base stations. A popular technique for obtaining this CSI in time division duplex (TDD) systems is uplink training by utilizing the reciprocity of the wireless medium. This paper mathematically characterizes the impact that uplink training has on the performance of such multi-cell multiple antenna systems. When non-orthogonal training sequences are used for uplink training, the paper shows that the precoding matrix used by the base station in one cell becomes corrupted by the channel between that base station and the users in other cells in an undesirable manner. This paper analyzes this fundamental problem of pilot contamination in multi-cell systems. Furthermore, it develops a new multi-cell MMSE-based precoding method that mitigates this problem. In addition to being a linear precoding method, this precoding method has a simple closed-form expression that results from an intuitive optimization problem formulation. Numerical results show significant performance gains compared to certain popular single-cell precoding methods.
Article
This letter presents an algorithm with linear complexity and finite convergence time for solving the generalized water-filling (WF) problem. The WF problem is generalized by using a weighted-sum-rate, weighted-sum-power, and peak power constraints. The proposed algorithm solves the optimization problems with concave (power and rate) or quasi-concave (energy-efficiency) objective functions. Additionally, it can simultaneously use maximum-power and minimum-rate constraints and give a priority to one of the constraints in the event they generate an infeasible region. Through this generalization, the algorithm can be applied to many WF-based methods proposed in the literature. Moreover, this letter shows multiple ways to further reduce the computational complexity and, via simulation, illustrates the effectiveness of the proposed algorithm.
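For orientation, the classical water-filling problem that this letter generalizes can be sketched as follows. This is the textbook version only (maximize the sum of log2(1 + p_i g_i) under a total power budget), not the letter's linear-complexity algorithm, and it handles none of the weighted or peak-power constraints described above. The function name `water_filling` and the example gains are assumptions made for illustration.

```python
def water_filling(gains, total_power):
    """Classical water-filling: p_i = max(0, level - 1/g_i) with sum(p_i) = total_power.

    gains: positive channel gains g_i (already normalized by noise power).
    """
    if total_power <= 0 or not gains or min(gains) <= 0:
        raise ValueError("need a positive power budget and positive gains")

    # Sort inverse gains so the weakest channels sit at the end.
    inv = sorted(1.0 / g for g in gains)
    total_inv = sum(inv)
    active = len(inv)
    level = (total_power + total_inv) / active

    # Drop the weakest active channel while the water level does not reach it.
    while active > 1 and level <= inv[active - 1]:
        total_inv -= inv[active - 1]
        active -= 1
        level = (total_power + total_inv) / active

    return [max(0.0, level - 1.0 / g) for g in gains]


# Example with three channels and a unit power budget (illustrative values).
powers = water_filling([2.0, 1.0, 0.1], 1.0)
print(powers)
```

In this example the weakest channel (gain 0.1) lies above the water level and receives no power, while the budget is split between the two stronger channels.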
Article
A cellular base station serves a multiplicity of single-antenna terminals over the same time-frequency interval. Time-division duplex operation combined with reverse-link pilots enables the base station to estimate the reciprocal forward- and reverse-link channels. The conjugate-transpose of the channel estimates is used as a linear precoder and combiner, respectively, on the forward and reverse links. Propagation, unknown to both terminals and base station, comprises fast fading, log-normal shadow fading, and geometric attenuation. In the limit of an infinite number of antennas, a complete multi-cellular analysis, which accounts for inter-cellular interference and the overhead and errors associated with channel-state information, yields a number of mathematically exact conclusions and points to a desirable direction towards which cellular wireless could evolve. In particular, the effects of uncorrelated noise and fast fading vanish, throughput and the number of terminals are independent of the size of the cells, spectral efficiency is independent of bandwidth, and the required transmitted energy per bit vanishes. The only remaining impairment is inter-cellular interference caused by re-use of the pilot sequences in other cells (pilot contamination), which does not vanish with an unlimited number of antennas.
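The vanishing of fast-fading effects under conjugate-transpose precoding can be illustrated numerically with a rough sketch. It rests on simplifying assumptions that are not part of the abstract: a single cell, perfect channel knowledge (so no pilot contamination), and i.i.d. Rayleigh fading with unit variance. The helper names `rayleigh_channel`, `normalized_gram`, and `worst_leakage` are hypothetical. With the precoder set to the conjugate transpose of the channel `H`, the effective channel is the Gram matrix of `H`; after dividing by the number of antennas M, its off-diagonal (inter-user) entries shrink roughly as 1/sqrt(M) while the diagonal stays near 1.

```python
import random


def rayleigh_channel(num_users, num_antennas, rng):
    """K x M matrix of i.i.d. complex Gaussian entries with unit variance."""
    s = 0.5 ** 0.5
    return [
        [complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(num_antennas)]
        for _ in range(num_users)
    ]


def normalized_gram(H):
    """Effective channel under conjugate precoding, scaled by 1/M."""
    M = len(H[0])
    return [
        [sum(a * b.conjugate() for a, b in zip(hk, hj)) / M for hj in H]
        for hk in H
    ]


def worst_leakage(num_antennas, num_users=3, seed=0):
    """Largest normalized inter-user coupling for one random channel draw."""
    rng = random.Random(seed)
    G = normalized_gram(rayleigh_channel(num_users, num_antennas, rng))
    return max(
        abs(G[k][j])
        for k in range(num_users)
        for j in range(num_users)
        if j != k
    )


# Compare a small array with a large one (single random draw each).
for M in (4, 400):
    print(M, worst_leakage(M))
```

The printed values depend on the random draw, so only the trend (weaker coupling for the larger array) should be read from them.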
Article
In this correspondence, we consider the problem of maximizing sum rate of a multiple-antenna Gaussian broadcast channel (BC). It was recently found that dirty-paper coding is capacity achieving for this channel. In order to achieve capacity, the optimal transmission policy (i.e., the optimal transmit covariance structure) given the channel conditions and power constraint must be found. However, obtaining the optimal transmission policy when employing dirty-paper coding is a computationally complex nonconvex problem. We use duality to transform this problem into a well-structured convex multiple-access channel (MAC) problem. We exploit the structure of this problem and derive simple and fast iterative algorithms that provide the optimum transmission policies for the MAC, which can easily be mapped to the optimal BC policies.
Article
In this paper, we consider the multiuser Gaussian broadcast channel with multiple transmit antennas at the base station and multiple receive antennas at each user. Assuming full knowledge of the channel state information at the transmitter and the different receivers, a new transmission scheme that employs partial interference cancellation at the transmitter with dirty-paper encoding and decoding is proposed. The maximal achievable throughput of this system is characterized, and it is shown that given any ordered set of users the proposed scheme is asymptotically optimal in the high signal-to-noise ratio (SNR) regime. In addition, with optimal user ordering, the proposed scheme is shown to be optimal in the low-SNR regime. We also consider a linear transmission scheme which employs only partial interuser interference cancellation at the base station without dirty-paper coding. Given a transmit power constraint at the base station, the sum-rate capacity of this scheme is characterized and a suboptimal precoding algorithm is proposed. In several cases, it is shown that, for all values of the SNR, the achievable throughput of this scheme is strictly larger than a system which employs full interference cancellation at the base station (Spencer et al., 2004). In addition, it is shown that, in some cases, the linear transmission scheme can support simultaneously an increased number of users while achieving a larger system throughput.
Article
A conjugate-gradient (CG) constant-modulus adaptive processor is proposed. For the generalized sidelobe canceler (GSC) signal processing configuration, this algorithm, CG-GSC, exhibits improved convergence over previous methods. Theoretical expressions are presented for convergence and weight update of a linearly constrained constant modulus generalized sidelobe canceler. Theoretical expressions are then derived for the conjugate direction vectors. These vectors are used to update the filter weights for a conjugate gradient adaptation rule. A simulation study of the conjugate adaptation rule reveals the increase in convergence rate for the generalized sidelobe canceler. Performance comparisons of the CG-GSC and a first-order gradient GSC for a BPSK signal with multipath and white noise interference indicate that the CG-GSC adaptation rule not only increases convergence by a factor of five compared to the first-order gradient GSC, but in some instances also improves the bit error rate of the demodulated BPSK signal.