SpringerOpen Newsletter

Receive periodic news and updates relating to SpringerOpen.

This article is part of the series Recent advances in optimization techniques in wireless communication networks.

Open Access Highly Accessed Review

Optimal resource allocation in wireless communication and networking

Alejandro Ribeiro

Author Affiliations

Department of Electrical and Systems Engineering, University of Pennsylvania, 200 S. 33rd St., Philadelphia, PA, 19096, USA

EURASIP Journal on Wireless Communications and Networking 2012, 2012:272  doi:10.1186/1687-1499-2012-272

The electronic version of this article is the complete one and can be found online at: http://jwcn.eurasipjournals.com/content/2012/1/272

Received:20 February 2012
Accepted:18 July 2012
Published:23 August 2012

© 2012 Ribeiro; licensee Springer.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Optimal design of wireless systems in the presence of fading involves the instantaneous allocation of resources such as power and frequency with the ultimate goal of maximizing long term system properties such as ergodic capacities and average power consumptions. This yields a distinctive problem structure where long term average variables are determined by the expectation of a not necessarily concave functional of the resource allocation functions. Despite their lack of concavity it can be proven that these problems have null duality gap under mild conditions permitting their solution in the dual domain. This affords a significant reduction in complexity due to the simpler structure of the dual function. The article discusses the problem simplifications that arise by working in the dual domain and reviews algorithms that can determine optimal operating points with relatively lightweight computations. Throughout the article concepts are illustrated with the optimal design of a frequency division broadcast channel.


Operating variables of a wireless system can be separated in two types. Resource allocation variables p(h) determine instantaneous allocation of resources like frequencies and transmitted powers as a function of the fading coefficient h. Average variables x capture system’s performance over a large period of time and are related to instantaneous resource allocations via ergodic averages. A generic representation of the relationship between instantaneous and average variables is

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M1','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M1">View MathML</a>


where f1(h,p(h)) is a vector function that maps channel h and resource allocation p(h) to instantaneous performance f1(h,p(h)). The system’s design goal is to select resource allocations p(h) to maximize ergodic variables x in some sense.

An example of a relationship having the form in (1) is a code division multiple access channel in which case h denotes the vector of channel coefficients, p(h) the instantaneous transmitted power, f1(hp(h)) the instantaneous communication rate determined by the signal to interference plus noise ratio, and x the ergodic rates determined by the expectation of the instantaneous rates. The design goal is to allocate instantaneous power p(h) subject to a power constraint so as to maximize a utility of the ergodic rate vector x. This interplay of instantaneous actions to optimize long term performance is pervasive in wireless systems. A brief list of examples includes optimization of orthogonal frequency division multiplexing [1], beamforming [2,3], cognitive radio [4,5], random access [6,7], communication with imperfect channel state information (CSI) [8,9], and various flavors of wireless network optimization [10-18].

In many cases of interest the functions f1(h,p(h)) are nonconcave and as a consequence finding the resource allocation distribution p(h) that maximizes x requires solution of a nonconvex optimization problem. This is further complicated by the fact that since fading channels h take on a continuum of values there is an infinite number of p(h) variables to be determined. A simple escape to this problem is to allow for time sharing in order to make the range of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M2','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M2">View MathML</a> convex and permit solution in the dual domain without loss of optimality. While the nonconcave function f1(h,p(h)) still complicates matters, working in the dual domain makes solution, if not necessarily simple, at least substantially simpler. However, time sharing is not easy to implement in fading channels.

In this article, we review a general methodology that can be used to solve optimal resource allocation problems in wireless communications and networking without resorting to time sharing [19,20]. The fundamental observation is that the range of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M3','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M3">View MathML</a> is convex if the probability distribution of the channel h contains no points of positive probability (Section “Duality in wireless systems optimization”). This observation can be leveraged to show lack of duality gap of general optimal resource allocation problems (Theorem 1) making primal and dual problems equivalent. The dual problem is simpler to solve and its solution can be used to recover primal variables (Section “Recovery of optimal primal variables”) with reduced computational complexity due to the inherently separable structure of the problem Lagrangians (Section “Separability”). We emphasize that this reduction in complexity, as in the case of time sharing, just means that the problem becomes simpler to solve. In many cases it also becomes simple to solve, but this is not necessarily the case.

We also discuss a stochastic optimization algorithm to determine optimal dual variables that can operate without knowledge of the channel probability distribution (Section “Dual descent algorithms”). This algorithm is known to almost surely converge to optimal operating points in an ergodic sense (Theorem 5). Throughout the article concepts are illustrated with the optimal design of a frequency division broadcast channel (Section “Frequency division broadcast channel” in “Optimal wireless system design”, Section “Frequency division broadcast channel” in “Recovery of optimal primal variables”, and Section “Frequency division broadcast channel” in “Dual descent algorithms”).

One of the best known resource allocation problems in wireless communications concerns the distribution of power on a block fading channel using capacity-achieving codes. The solution to this problem is easy to derive and is well known to reduce to waterfilling across the fading gain, e.g., [21, p. 245]. Since this article can be considered as an attempt to generalize this solution methodology to general wireless communication and networking problems it is instructive to close this introduction by reviewing the derivation of the waterfilling solution. This is pursued in the following section.

Power allocation in a point-to-point channel

Consider a transmitter having access to perfect CSI h that it uses to select a transmitted power p(h) to convey information to a receiver. Using a capacity achieving code the instantaneous channel rate for fading realization h is <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M4','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M4">View MathML</a> where N0denotes the noise power at the receiver end. A common goal is to maximize the average rate <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M5','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M5">View MathML</a> with respect to the probability distribution mh(h) of the channel gain h—which is an accurate approximation of the long term average rate—subject to an average power constraint P0. We can formulate this problem as the optimization program

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M6','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M6">View MathML</a>


In most cases the fading channel h takes on a continuum of values. Therefore, solving (2) requires the determination of a power allocation function <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M7','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M7">View MathML</a> that maps nonnegative fading coefficients to nonnegative power allocations. This means that (2) is an infinite dimensional optimization problem which in principle could be difficult to solve. Nevertheless, the solution to this program is easy to derive and given by waterfilling as we already mentioned. The widespread knowledge of the waterfilling solution masks the fact that is is rather remarkable that (2) is easy to solve and begs the question of what are the properties that make it so. Let us then review the derivation of the waterfilling solution in order to pinpoint these properties.

To solve (2) we work in the dual domain. To work in the dual domain we need to introduce the Lagrangian, the dual function, and the dual problem. Introduce then the nonnegative dual variable <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M8','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M8">View MathML</a> and define the Lagrangian associated with the optimization problem in (2) as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M9','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M9">View MathML</a>


The dual function is defined as the maximum value of the Lagrangian across all functions <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M10','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M10">View MathML</a>, which upon defining <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M11','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M11">View MathML</a> as the set of nonnegative functions can be defined as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M12','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M12">View MathML</a>


The dual problem corresponds to the minimization of the dual function with respect to all nonnegative multipliers λ,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M13','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M13">View MathML</a>


Since the objective in (2) is concave with respect to variables p(h) and the constraint is linear in p(h) the optimization problem in (2) is convex and as such it has null duality gap in the sense that P=D.

An entity that is important for the upcoming discussion is the primal Lagrangian maximizer function <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M14','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M14">View MathML</a> whose values for given h are denoted as p(h,λ). This function is defined as the one that maximizes the Lagrangian for given dual variable λ

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M15','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M15">View MathML</a>


Using the definition of the Lagrangian maximizer function we can write the dual function as <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M16','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M16">View MathML</a>.

Computing values p(h,λ) of the Lagrangian maximizer function p(λ) is easy. To see that this is true rewrite the Lagrangian in (3) so that there is only one expectation

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M17','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M17">View MathML</a>


With the Lagrangian written in this form we can see that the maximization of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M18','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M18">View MathML</a> required by (6) can be decomposed in maximizations for each individual channel realization,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M19','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M19">View MathML</a>


That the equality in (8) is true is a consequence of the fact that the expectation operator is linear and that there are no constraints coupling the selection of values p(h1) and p(h2) for different channel realizations h1h2. Functional values p(h) in both sides of (8) are required to be nonnegative but other than that we can select p(h1) and p(h2) independently of each other as indicated in the right hand side of (8).

Since the right hand side of (8) states that to maximize the Lagrangian we can select functional values p(h) independently of each other, values p(h,λ) of the Lagrangian maximizer function p(λ) defined in (6) are given by

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M20','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M20">View MathML</a>


The similarity between (6) and (9) is deceiving as the latter is a much easier problem to solve that involves a single variable. To find the Lagrangian maximizer value p(h,λ) operating from (9) it suffices to solve for the null of the derivative with respect to p(h). Doing this yields the Lagrangian maximizer

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M21','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M21">View MathML</a>


where the operator [x] + :=max(x,0) denotes projection onto nonnegative reals, which is needed because of the constraint p(h)≥0.

Of particular interest is the Lagrangian maximizer function p(λ) corresponding to the optimal Lagrange multiplier <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M22','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M22">View MathML</a>. Returning to the definition of the dual function in (4) we can bound D=g(λ) as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M23','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M23">View MathML</a>


Indeed, since D is given by a Lagrangian maximization it must equal or exceed the values of the Lagrangian for any function p, and for the optimal function pin particular. Considering the explicit Lagrangian expression in (3) we can write <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M24','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M24">View MathML</a> as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M25','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M25">View MathML</a>


Observe that since pis the optimal power allocation function it must satisfy the power constraint implying that it must be <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M26','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M26">View MathML</a>. Since the optimal dual variable satisfies λ≥0 their product is also nonnegative allowing us to transform (12) into the bound

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M27','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M27">View MathML</a>


where the equality is true because P and pare the otpimal value and arguments of the primal optimization problem (2). Combining the bounds in (11) and (13) yields

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M28','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M28">View MathML</a>


Using the equivalence P=Dof primal and dual optimum values it follows that the inequalities in (13) must hold as equalities. The equality <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M29','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M29">View MathML</a>, in particular, implies that the function p is the Lagrangian maximizing function corresponding to λ=λ, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M30','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M30">View MathML</a>


The important consequence of (15) follows from the fact that the Lagrangian maximizer function p(λ) is easy to compute using the decomposition in (9). By extension, if the optimal multiplier λ is available, computation of the optimal power allocation function palso becomes easy. Indeed, making λ=λin (9) we can determine values p(h) of p as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M31','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M31">View MathML</a>


which are explicitly given by (10) with λ=λ.

To complete the problem solution we still need to determine the optimal multiplier λ. We show a method for doing so in Section “Dual descent algorithms”, but it is important to recognize that this cannot be a difficult problem because the dual function is single-dimensional and convex—dual functions of maximization problems are always convex. This has to be contrasted with the infinite dimensionality of the primal problem. By working in the dual domain we reduce the problem of determining the infinite dimensional optimal power allocation function to the determination of the one dimensional optimal Lagrange multiplier λ.

Recapping the derivation of the optimal power allocation p, we see that there are three conditions that make (18) simple:

(C1) Since the optimization problem is convex it is equivalent to its Lagrangian dual implying that optimal primal variables can be determined as Lagrangian maximizers associated with the optimal multiplier λ[cf. (11)–(15)].

(C2) Due to the separable structure of the Lagrangian, determination of the optimal power allocation function is carried out through the solution of per-fading state subproblems [cf. (6)–(9)].

(C3) Because there is a finite number of constraints the dual function is finite dimensional even though there is an infinite number of primal variables.

Most optimization problems in wireless systems are separable in the sense of (C2) and have a finite number of constraints as [cf. (C3)] but are typically not convex [cf. (C1)].

To illustrate this latter point consider a simple variation of (2) where instead of using capacity achieving codes we use adaptive modulation and coding (AMC) relying on a set of L communication modes. The l-th mode supports a rate αland is used when the signal to noise ratio (SNR) at the receiver end is between βl−1 and βl. Letting γ be the received SNR and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M32','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M32">View MathML</a> denote the indicator function of the event <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M33','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M33">View MathML</a>, the communication rate function C(γ) for AMC can be written as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M34','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M34">View MathML</a>


The corresponding optimal power allocation problem subject to average power constraint q0 can now be formulated as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M35','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M35">View MathML</a>


Similar to (2) the dual function of the optimization problem in (18) is one dimensional and its Lagrangian is separable in the sense that we can find Lagrangian maximizers by performing per-fading state maximizations. Alas, the problem is not convex because the AMC rate function C(γ) is not concave, in fact, it is not even continuous.

Since it does not satisfy (C1), solving (18) in the dual domain is not possible in principle. Nevertheless the condition that allows determination of p(h) as the Lagrangian maximizer p(h,λ) [cf. (16)] is not the convexity of the primal problem but the lack of duality gap. We’ll see in Section “Duality in wireless systems optimization” that this problem does have null duality gap as long as the probability distribution of the channel h contains no points of strictly positive probability. Thus, the solution methodology in (3)-(16) can be applied to solve (18) despite the discontinuous AMC rate function C(γ). This is actually a quite generic property that holds true for optimization problems where nonconcave functions appear inside expectations. We introduce this generic problem formulation in the next section.

Optimal wireless system design

Let us return to the relationship in (1) where h denotes the random fading state that ranges on a continuous space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M36','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M36">View MathML</a>, p(h) the instantaneous resource allocation, f1(h,p(h)) a vector function mapping h and p(h) to instantaneous system performance, and x an ergodic average. The expectation in (1) is with respect to the joint probability distribution mh(h) of the vector channel h.

It is convenient to think of f1(h,p(h)) as a family of functions indexed by the parameter h with p(h) denoting the corresponding variables. Notice that there is one vector p(h) per fading state h, which translates into an infinite number of resource allocation variables if h takes on an infinite number of values. Consequently, it is adequate to refer to the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M37','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M37">View MathML</a> of all resource allocations as the resource allocation function. The number of ergodic limits x of interest, on the other hand, is assumed finite.

Instantaneous resource allocations p(h) are further constrained to a given bounded set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M38','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M38">View MathML</a>. These restrictions define a set of admissible resource allocation functions that we denote as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M39','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M39">View MathML</a>


Variables p(h) determine system performance over short periods of times. As such, they are of interest to the system designer but transparent to the end user except to the extent that they determine the value of ergodic variables x. Therefore, we adopt as our design objective the maximization of a concave utility function f0(x) of the ergodic average x. Putting these preliminaries together we write the following program as an abstract formulation of optimal resource allocation problems in wireless systems

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M40','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M40">View MathML</a>


where we added further constraints on the set of ergodic averages x. These constraints are in the form of a bounded convex set inclusion <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M41','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M41">View MathML</a> and a concave function inequality f2(x)≥0. In the problem formulation in (20) the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M42','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M42">View MathML</a> is convex and the functions f0(x) and f2(x) are concave. The family of functions f1(h,p(h)) is not necessarily concave with respect to p(h) and the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M43','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M43">View MathML</a> is not necessarily convex. The sets <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M44','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M44">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M45','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M45">View MathML</a> are assumed compact to guarantee that x and p(h) are finite. For the expectation in (20) to exist we need to have f1(h,p(h)) integrable with respect to the probability distribution mh(h) of the vector channel h. This imposes a (mild) restriction on the functions f1(h,p(h)) and the power allocation function p. Integrability is weaker than continuity.

For future reference define xand p as the arguments that solve (20)

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M46','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M46">View MathML</a>


The configuration pair (x,p) attains the optimum value P=f0(x) and satisfies the constraints <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M47','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M47">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M48','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M48">View MathML</a> as well as the set constraints <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M49','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M49">View MathML</a>, and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M50','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M50">View MathML</a>. Observe that the pair (x,p) need not be unique. It may be, and it is actually a common occurrence in practice, that more than one configuration is optimal. Thus, (21) does not define a pair of variables but a set of pairs of optimal configurations. As it does not lead to confusion we use (x,p) to represent both, the set of optimal configurations and an arbitrary element of this set.

To write the Lagrangian we introduce a nonnegative Lagrange multiplier <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M51','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M51">View MathML</a> where λ10 is associated with the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M52','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M52">View MathML</a> and λ20 with the constraint f2(x). The Lagrangian of the primal optimization problem in (20) is then defined as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M53','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M53">View MathML</a>


The corresponding dual function is the maximum of the Lagrangian with respect to primal variables <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M54','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M54">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M55','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M55">View MathML</a>

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M56','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M56">View MathML</a>


The dual problem is defined as the minimum value of the dual function over all nonnegative dual variables

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M57','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M57">View MathML</a>


and the optimal dual variables are defined as the arguments that achieve the minimum in (24)

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M58','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M58">View MathML</a>


Notice that the optimal dual argument Λis a set as in the case of the primal optimal arguments because there may be more than one vector that achieves the minimum in (24). As we do with the optimal primal variables (x,p), we use Λ to denote the set of optimal dual variables and an arbitrary element of this set. A particular example of this generic problem formulation is presented next.

Frequency division broadcast channel

A common access point (AP) administers a power budget q0to communicate with a group of J nodes. The physical layer uses frequency division so that at most one terminal can be active at any given point in time. The goal is to design an algorithm that allocates power and frequency to maximize a given ergodic rate utility metric while ensuring that rates are at least rmin and not more than rmax.

Denote as hi the channel to terminal i and define the vector h:=[h1…,hJ]Tgrouping all channel realizations. In any time slot the AP observes the realization h of the fading channel and decides on suitable power allocations pi(h) and frequency assignments αi(h). Frequency assignments αi(h) are indicator variables αi(h)∈{0,1} that take value αi(h)=1 when information is transmitted to node i and αi(h)=0 otherwise. If αi(h)=1 communication towards i ensues at power pi(h) resulting in a communication rate C(hipi(h)/N0) determined by the SNR hipi(h)/N0. The specific form of the function C(hipi(h)/N0) mapping channels and powers to transmission rates depends on the type of modulation and codes used. One possibility is to select capacity achieving codes leading to <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M59','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M59">View MathML</a>. A more practical choice is to use AMC in which case C(hipi(h)/N0)=CAMC(hipi(h)/N0) with CAMC(hipi(h)/N0) as given in (17).

Regardless of the specific form of C(hipi(h)/N0) we can write the ergodic rate of terminal i as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M60','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M60">View MathML</a>


The factor C(hipi(h)/N0) is the instantaneous rate achieved if information is conveyed. The factor αi(h) indicates wether this information is indeed conveyed or not. The expectation weights the instantaneous capacity across fading states and is equivalent to the consideration of an infinite horizon time average.

Similarly, pi(h) denotes the power allocated for communication with node i, but this communication is attempted only if αi(h)=1. Thus, the instantaneous power utilized to communicate with i for channel realization h is αi(h)pi(h). The total instantaneous power is the sum of these products for all i and the long term average power consumption can be approximated as the expectation

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M61','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M61">View MathML</a>


that according to the problem statement cannot exceed the budget q0.

To avoid collisions between communication attempts the indicator variables αi(h) are restricted so that at most one of them is 1. Define the vector α(h):=[α1(h),…,αJ(h)]T corresponding to values of the function <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M62','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M62">View MathML</a> and introduce the set of vector functions

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M63','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M63">View MathML</a>


We can now express the frequency exclusion constraints as <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M64','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M64">View MathML</a>.

We still need to model the restriction that the achieved capacity ri needs to be between rmin and rmax but this is easily modeled as the constraint rminrirmax. Defining the vector r=[r1,…,rJ]T this constraint can be written as <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M65','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M65">View MathML</a> with the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M66','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M66">View MathML</a> defined as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M67','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M67">View MathML</a>


We finally introduce a monotonic nondecreasing utility function U(ri) to measure the value of rate riand formally state our design goal as the maximization of the aggregate utility <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M68','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M68">View MathML</a>. Using the definitions in (26)–(29) the operating point that maximizes this aggregate utility for a frequency division broadcast channel follows from the solution of the optimization problem

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M69','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M69">View MathML</a>


where we relaxed the rate expression in (26) to an inequality constraint, which we can do without loss of optimality because the utility U(ri) is monotonic nondecreasing.

The problem formulation in (30) is of the form in (20). The ergodic rates r in (30) are represented by the ergodic variables x in (20) whereas the power and frequency allocation functions p and α of (30) correspond to the resource allocation function pof (20). The set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M70','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M70">View MathML</a> maps to the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M71','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M71">View MathML</a> and the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M72','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M72">View MathML</a> to the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M73','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M73">View MathML</a>. There are no functions in (30) taking the place of the function f2(x) of (20). The function f1(h,p(h)) in (20) is a placeholder for the stacking of the functions αi(h)C(hipi(h)/N0) for different i and the negative of the power consumptions <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M74','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M74">View MathML</a> in (30). The power constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M75','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M75">View MathML</a> is not exactly of the form in (1) because q0is a constant not a variable but this doesn’t alter the fundamentals of the problem. The functions αi(h)C(hipi(h)/N0) are not concave and the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M76','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M76">View MathML</a> is not convex. This makes the program in (30) nonconvex but is consistent with the restrictions imposed in (20).

To write the Lagrangian corresponding to this optimization problem introduce multipliers λ:=[λ1,…,λJ]T associated with the capacity constraints and μassociated with the power constraint. The Lagrangian is then given by

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M77','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M77">View MathML</a>


The dual function, dual problem, and optimal dual arguments are defined as in (22)–(25) by replacing <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M78','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M78">View MathML</a> for <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M79','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M79">View MathML</a>. Since (30) is a nonconvex program we do not know if the dual problem is equivalent to the primal problem. We explore this issue in the following section.

Duality in wireless systems optimization

For any optimization problem the dual minimum D provides an upper bound for the primal optimum value P. This is easy to see by combining the definitions of the dual function in (23) and the Lagrangian in (22) to write

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M80','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M80">View MathML</a>


Because the dual function value g(Λ) is obtained by maximizing the right hand side of (32), evaluating this expression for arbitrary primal variables yields an upper bound on g(Λ). Using a pair (x,p) of optimal primal arguments as this arbitrary selection yields the inequality

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M81','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M81">View MathML</a>


Since the pair x,p is optimal, it is feasible, which means that we must have <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M82','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M82">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M83','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M83">View MathML</a>. Lagrange multipliers are also nonnegative according to definition. Therefore, the last two summands in the right hand side of (33) are nonnegative from which it follows that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M84','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M84">View MathML</a>


The inequality in (34) is true for any Λand therefore true for Λ=Λin particular. It then follows that the dual optimum D upper bounds the primal optimum P,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M85','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M85">View MathML</a>


as we had claimed. The difference D-P is called the duality gap and provides an indication of the loss of optimality incurred by working in the dual domain.

For the problem in (20) the duality gap is null as long as the channel probability distribution mh(h) contains no point of positive probability as we claim in the following theorem which is a simple generalization of a similar result in [20].

Theorem 1

Let P denote the optimum value of the primal problem (20) and D that of its dual in (24) and assume there exists a strictly feasible point (x0,p0) that satisfies the constraints in (20) with strict inequality. If the channel probability distribution mh(h) contains no point of positive probability the duality gap is null, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M86','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M86">View MathML</a>


The condition on the channel distribution not having points of positive probability is a mild requirement satisfied by practical fading channel models including Rayleigh, Rice, and Nakagami. The existence of a strictly feasible point (x0,p0) is a standard constraint qualification requirement which is also not stringent in practice.

In order to prove Theorem 1 we take a detour in Section “Lyapunov’s convexity theorem” to define atomic and nonatomic measures along with the presentation of Lyapunov’s Convexity Theorem. The proof itself is presented in Section “Proof of Theorem 1”. The implications of Theorem 1 are discussed in Sections “Recovery of optimal primal variables” and “Dual descent algorithms”.

Lyapunov’s convexity theorem

The proof of Theorem 1 uses a theorem by Lyapunov concerning the range of nonatomic measures [22]. Measures assign strictly positive values to sets of a Borel field. When all points have zero measure the measure is called nonatomic as we formally define next.

Definition 1 (Nonatomic measure)

Let w be a measure defined on the Borel field <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M87','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M87">View MathML</a> of subsets of a space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M88','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M88">View MathML</a>. Measure w is nonatomic if for any measurable set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M89','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M89">View MathML</a> with w(E0)>0, there exist a subset E of E0; i.e., <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M90','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M90">View MathML</a>, such that w(E0)>w(E)>0.

Familiar measures are probability related, e.g., the probability of a set for a given channel distribution. To build intuition on the notion of nonatomic measure consider a random variable X taking values in [0,1] and [2,3]. The probability of landing in each of these intervals is 1/2 and X is uniformly distributed inside each of them; see Figure 1. The space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M91','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M91">View MathML</a> is the real line, and the Borel field <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M92','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M92">View MathML</a> comprises all subsets of real numbers. For every subset <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M93','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M93">View MathML</a> define the measure of E as twice the integral of x, weighted by the probability distribution of X on the set E, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M94','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M94">View MathML</a>


thumbnailFigure 1. Nonatomic measure. The random variable X is uniformly distributed in <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M95','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M95">View MathML</a>. The measure <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M96','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M96">View MathML</a> is nonatomic because all sets of nonzero probability include a smaller set of nonzero probability. Lyapunov’s convexity theorem (Theorem 2) states that the measure range <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M97','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M97">View MathML</a> is convex. The range of wXis the, indeed convex, interval [0,3].

Note that, except for the factor 2, the value of wX(E) represents the contribution of the set E to the expected value of X and that when E is the whole space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M98','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M98">View MathML</a>, it holds <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M99','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M99">View MathML</a>. According to Definition 1, wX(E) is a nonatomic measure of elements of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M100','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M100">View MathML</a>. Every subset E0 with wX(E0)>0 includes at least an interval (a,b). The measure of the set E:=E0−((a + b)/2,b) formed by removing the upper half of (a,b) from E0 is wX(E)=wX(E0)−(ba)/2. The measure of E satisfies wX(E)>0 as required for wX(E) to be nonatomic.

To contrast this with an example of an atomic measure consider a random variable Y landing equiprobably in [0,1] or 5/2; see Figure 2. In this case, the measure <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M101','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M101">View MathML</a> is atomic because the set E0={5/2} has positive measure wY(E)=1. The only set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M102','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M102">View MathML</a> is the empty set whose measure is null.

thumbnailFigure 2. Atomic measure. The random variable Y lands with equal probability in Y=5/2 and uniformly in the interval [0,1]. The measure <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M103','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M103">View MathML</a> is atomic because the set {1} has strictly positive probability and no set other than the empty set is strictly included in {1}. Theorem 2 does not apply. The range of wY(E) is the nonconvex union of the intervals [0,1/2] and [5/2,3].

Theorem 2 (Lyapunov’s convexity theorem [[22]]).

Consider nonatomic measures w1,…,wnon the Borel field <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M104','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M104">View MathML</a> of subsets of a space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M105','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M105">View MathML</a> and define the vector measure w(E):=[w1(E),…,wn(E)]T. The range <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M106','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M106">View MathML</a> of the measure wis convex. I.e., if w(E1)=w1and w(E2)=w2, then for any α∈[0,1] there exists <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M107','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M107">View MathML</a> such that w(E0)=αw1 + (1−α)w2.

The difference between the distributions of X and Y is that Y contains a point of strictly positive probability, i.e., an atom. This implies presence of delta functions in the probability density function of Y . Or, in a rather cleaner statement the cumulative distribution function (cdf) of X is continuous whereas the cdf of Y is not.

Lyapunov’s convexity theorem introduced next refers to the range of values taken by (vector) nonatomic measures.

Returning to the probability measures defined in terms of the probability distributions of the random variables X and Y , Theorem 2 asserts that the range of wX(E), i.e., the set of all possible values taken by wX is convex. In fact, it is not difficult to verify that the range of wXis the convex interval [0,3] as shown in Figure 1. Theorem 2 does not claim anything about wY. In this case, it is easy to see that the range of wY is the (non-convex) union of the intervals [0,1/2] and [5/2,3]; see Figure 2.

Proof of Theorem 1

To establish zero duality gap we will consider a perturbed version of (20) obtained by perturbing the constraints used to define the Lagrangian in (22). The perturbation function P(Δ) assigns to each (perturbation) parameter set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M108','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M108">View MathML</a> the solution of the (perturbed) optimization problem

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M109','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M109">View MathML</a>


The perturbed problem in (38) can be interpreted as a modified version of (20), where we allow the constraints to be violated by Δamounts. To prove that the duality gap is zero, it suffices to show that P(Δ) is a concave function of Δ; see, e.g., [23].

Let <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M110','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M110">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M111','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M111">View MathML</a> be an arbitrary given pair of perturbations. Let (x,p) be a pair of ergodic limits and resource allocation variables achieving the optimum value P(Δ) corresponding to perturbation Δ. Likewise, denote as (x,p) a pair that achieves the optimum value P(Δ) corresponding to perturbation Δ. For arbitrary α∈[0,1], we are interested in the solution of (38) under perturbation Δα:=αΔ + (1−α)Δ. In particular, to show that the perturbation function P(Δ) is concave we need to establish that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M112','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M112">View MathML</a>


The roadblock to establish concavity of the perturbation functions is the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M113','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M113">View MathML</a>. More specifically, the difficulty with this constraint is the ergodic limit <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M114','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M114">View MathML</a>. Let us then isolate the challenge by defining the ergodic limit span

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M115','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M115">View MathML</a>


The set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M116','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M116">View MathML</a> contains all the possible values that the expectation <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M117','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M117">View MathML</a> can take as the resource allocation function pvaries over the admissible set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M118','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M118">View MathML</a>. When the channel distribution mh(h) contains no points of positive probability, the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M119','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M119">View MathML</a> is convex as we claim in the following theorem.

Theorem 3.

Let <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M120','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M120">View MathML</a> be ergodic limit span set in (40). If the channel probability distribution mh(h) contains no point of positive probability the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M121','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M121">View MathML</a> is convex.

Before proving Theorem 3 let us apply it to complete the proof of Theorem 1. For doing that consider two arbitrary points <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M122','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M122">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M123','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M123">View MathML</a>. If these points belong to <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M124','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M124">View MathML</a> there exist respective resource allocation functions <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M125','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M125">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M126','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M126">View MathML</a> such that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M127','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M127">View MathML</a>


If <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M128','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M128">View MathML</a> is a convex set as claimed by Theorem 3, we must have that for any given α the point yα:=αy + (1−α)y also belongs to <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M129','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M129">View MathML</a>. In turn, this means there exists a resource allocation function <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M130','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M130">View MathML</a> for which

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M131','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M131">View MathML</a>


Further define the ergodic limit convex combination xα:=αx + (1−α)x. We first show that the pair (xα,pα) is feasible for the problem for perturbation Δα.

The convex combination satisfies the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M132','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M132">View MathML</a> because the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M133','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M133">View MathML</a> is convex. We also have f2(xα)≥δ2,α because the function f2(x) is concave. To see that this is true simply notice that concavity of f2(x) implies that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M134','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M134">View MathML</a>. But f2(x)≥δ2 and f2(x)≥δ2 because x and x are feasible in (38) with perturbations Δand Δ. Substituting these latter two inequalities into the previous one yields <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M135','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M135">View MathML</a> according to the definition of Δα.

We are left to show that the pair (xα,pα) satisfies the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M136','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M136">View MathML</a>. For doing so recall that since (x,p) is feasible for perturbation Δand (x,p) is feasible for perturbation Δ′we must have

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M137','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M137">View MathML</a>


Perform a convex combination of the inequalities in (43) and use the definitions of xα:=αx + (1−α)x and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M138','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M138">View MathML</a> to write

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M139','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M139">View MathML</a>


Combining (42) and (44) it follows that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M140','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M140">View MathML</a> completing the proof that the pair (xα,pα) is feasible for the problem for perturbation Δα.

The utility yield of this feasible pair is f0(xδ) which we can bound as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M141','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M141">View MathML</a>


Since (x,p) is optimal for perturbation Δ we have f0(x)=P(Δ) and, likewise, f0(x)=P(Δ). Further noting that the optimal yield P(Δα) for perturbation Δα must exceed the yield f0(xα) of feasible point xαwe conclude that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M142','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M142">View MathML</a>


The expression in (46) implies that P(Δ) is a concave function of the perturbation vector Δ. The duality gap is therefore null as stated in (36).

We proceed now to the proof of Theorem 3.


(Theorem 3) Consider two arbitrary points <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M143','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M143">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M144','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M144">View MathML</a>. If these points belong tho <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M145','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M145">View MathML</a> there exist respective resource allocation functions <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M146','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M146">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M147','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M147">View MathML</a> such that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M148','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M148">View MathML</a>


To prove that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M149','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M149">View MathML</a> is a convex set we need to show that for any given αthe point yα:=αy + (1−α)y also belongs to <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M150','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M150">View MathML</a>. In turn, this means we need to find a resource allocation function <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M151','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M151">View MathML</a> for which

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M152','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M152">View MathML</a>


For this we will use Theorem 2 (Lyapunov’s convexity theorem). Consider the space of all possible channel realizations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M153','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M153">View MathML</a>, and the Borel field <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M154','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M154">View MathML</a> of all possible subsets of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M155','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M155">View MathML</a>. For every s et <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M156','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M156">View MathML</a> define the vector measure

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M157','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M157">View MathML</a>


where the integrals are over the set E with respect to the channel distribution mh(h). A vector of channel realizations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M158','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M158">View MathML</a> is a point in the space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M159','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M159">View MathML</a>. The set E is a collection of vectors h. Each of these sets is assigned vector measure w(E) defined in terms of the power distributions p(h) and p(h). The entries of w(E) represent the contribution of realizations hEto the ergodic limits in (48). The first group of entries measure such contributions when the resource allocation is p(h). Likewise, the second group of entries of w(E) denote the contributions to the ergodic limits of the resource distribution p(h).

Two particular sets that are important for this proof are the empty set, E=, and the entire space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M160','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M160">View MathML</a>. For <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M161','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M161">View MathML</a>, the integrals <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M162','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M162">View MathML</a> in (49) coincide with the expected value operators <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M163','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M163">View MathML</a> in (47). We write this explicitly as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M164','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M164">View MathML</a>


For E=, or any other zero-measure set for that matter, we have w()=0.

The measure w(E) is nonatomic. This follows from the fact that the channel distribution contains no points of positive probability combined with the requirement that the resource allocation values p(h) belong to the bounded sets <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M165','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M165">View MathML</a>. Combining these two properties it is easy to see that that there are no channel realizations with positive measure, i.e., w(h)=0 for all <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M166','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M166">View MathML</a>. This is sufficient to ensure that w(E) is a nonatomic measure.

Being w(E) nonatomic it follows from Theorem 2 that the range of wis convex. Hence, the vector

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M167','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M167">View MathML</a>


belongs to the range of possible measures. Therefore, there must exist a set E0 such that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M168','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M168">View MathML</a>. Focusing on the entries of w(E0) that correspond to the resource allocation function pit follows that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M169','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M169">View MathML</a>


The analogous relation holds for the entries of w(E0) corresponding to p, i.e., (52) is valid if p(h) is replaced by p(h) but this fact is inconsequential for this proof.

Consider now the complement set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M170','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M170">View MathML</a> defined as the set for which <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M171','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M171">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M172','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M172">View MathML</a>. Given this definition and the additivity property of measures, we arrive at <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M173','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M173">View MathML</a>. Combining the latter with (51), yields

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M174','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M174">View MathML</a>


We mimick the reasoning leading from (51) to (52), but now we restrict the focus of (53) to the second entries of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M175','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M175">View MathML</a>. It therefore follows that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M176','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M176">View MathML</a>


Define now power distributions pα(h) coinciding with p(h) for channel realization hE0and with p(h) when <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M177','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M177">View MathML</a>, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M178','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M178">View MathML</a>


The resource distribution pαsatisfies the set constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M179','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M179">View MathML</a> in (19). Indeed, to see that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M180','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M180">View MathML</a> for all <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M181','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M181">View MathML</a> note that p(h) and p(h) are feasible in their respective problems and as such <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M182','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M182">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M183','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M183">View MathML</a> for all channels <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M184','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M184">View MathML</a>. Because for given channel realization h it holds that either pα(h)=p(h) when hE0 or pα(h)=p(h) when <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M185','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M185">View MathML</a> it follows that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M186','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M186">View MathML</a> for all channel realizations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M187','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M187">View MathML</a>. According to the definition in (19) this implies that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M188','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M188">View MathML</a>.

Let us now ponder the ergodic limit <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M189','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M189">View MathML</a> associated with power allocation pα(h).

Using (52) and (54), the average link capacities for power allocation pα(h) can be expressed in terms of p(h), p(h) as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M190','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M190">View MathML</a>


The first equality in (56) holds becauset the space <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M191','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M191">View MathML</a> is divided into E0 and its complement <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M192','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M192">View MathML</a>. The second equality is true because when restricted to E0, pα(h)=p(h); and when restricted to <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M193','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M193">View MathML</a>, pα(h)=p(h). The third equality follows from (52) and (54).

Comparing (56) with (48) we see that the power allocation pα yields ergodic limit yα. Therefore <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M194','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M194">View MathML</a> implying convexity of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M195','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M195">View MathML</a> as we wanted to show. □

Recovery of optimal primal variables

Having null duality gap means that we can work in the dual domain without loss of optimality. In particular, instead of finding the primal maximum P we can find the dual minimum D, which we know are the same according to Theorem 1. A not so simple matter is how to recover an optimal primal pair (x,p) given an optimal dual vector Λ. Observe that recovering optimal variables is more important than knowing the maximum yield because optimal variables are needed for system implementation. In this section we study the recovery of optimum primal variables (x,p) from a given optimal multiplier Λ.

Start with an arbitrary, not necessarily optimal, multiplier Λ and define the primal Lagrangian maximizer set as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M196','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M196">View MathML</a>


The elements of (x(Λ),p(Λ)) yield the maximum possible Lagrangian value for given dual variable. Comparing the definition of the dual function in (23) and that of the Lagrangian maximizers in (57) it follows that we can write the dual function g(Λ) as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M197','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M197">View MathML</a>


Particular important pairs of Lagrangian maximizers are those associated with a given optimal dual variable Λ. As we show in the following theorem, these variables are related with the optimal primal variables (x,p).

Theorem 4.

For an optimization problem of the form in (20) let <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M198','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M198">View MathML</a> denote the Lagrangian maximizer set as defined in (57) corresponding to a given optimal multiplier Λ. The optimal argument set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M199','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M199">View MathML</a> is included in this set of Lagrangian maximizers, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M200','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M200">View MathML</a>



Reinterpret (x,p) as a particular pair of optimal primal arguments. Start by noting that the value of the dual function g(Λ) can be upper bounded by the Lagrangian evaluated at (x,p)=(x,p)

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M201','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M201">View MathML</a>


Indeed, the inequality would be true for any <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M202','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M202">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M203','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M203">View MathML</a> because that is what being the maximum means.

Consider now the Lagrangian <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M204','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M204">View MathML</a> that according to (22) we can explicitly write as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M205','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M205">View MathML</a>


Since the pair (x,p) is an optimal argument of (20) we must have <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M206','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M206">View MathML</a> and f2(x)≥0. The multipliers also satisfy <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M207','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M207">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M208','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M208">View MathML</a> because they are required to be nonnegative. Thus, the last two summands in (61) are nonnegative from where it follows that

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M209','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M209">View MathML</a>


Combining (60) and (62) and using the definitions g(Λ) and P=f0(x) yields

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M210','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M210">View MathML</a>


But since according to Theorem 1 the duality gap is null D=P and the inequalities must hold hold as equalities. In particular, we have

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M211','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M211">View MathML</a>


which according to (58) means the pair (p,Λ) is a Lagrangian maximizer associated with Λ=Λ. Since this is true for any pair of optimal variables (p,Λ) it follows that the set (p,Λ) is included in the set of Lagrangian maximizers <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M212','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M212">View MathML</a> as stated in (59). □

According to (59) optimal arguments <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M213','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M213">View MathML</a> can be recovered from Lagrangian maximizers <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M214','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M214">View MathML</a> associated with optimal multipliers. One has to take care to interpret this set inclusion properly. Equation (59) does not mean that we can always compute <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M215','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M215">View MathML</a> by finding Lagrangian maximizers and as such it may or may not be a useful result. If the Lagrangian maximizer pair is unique then the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M216','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M216">View MathML</a> is a singleton and by extension so is the (included) set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M217','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M217">View MathML</a> of optimal primal variables. In this case Lagrangian maximizers can be used as proxies for optimal arguments. When the set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M218','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M218">View MathML</a> is not a singleton this is not possible and recovering optimal primal variables from optimal multipliers Λ is somewhat more difficult.

In general, problems in optimal wireless networking are such that the Lagrangian maximizer resource allocation functions p(Λ) are unique to within a set of zero measure. The ergodic limit Lagrangian maximizers x(Λ), however, are not unique in many cases. This is more a nuisance than a fundamental problem. Algorithms that find primal optimal operating points regardless of the characteristics of the set of Lagrangian maximizers are studied in Section “Dual descent algorithms”.


To determine the primal Lagrangian maximizers in (57) it is convenient to reorder terms in the definition in (22) to write

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M219','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M219">View MathML</a>


We say that in (22) the Lagrangian is grouped by dual variables because each dual variable appears in only one summand. The expression in (65) is grouped by primal variables because each summand contains a single primal variable if we interpret <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M220','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M220">View MathML</a> as a single term and the expectation as a weighted sum—which is not true, but close enough for an intuitive interpretation.

Writing the Lagrangian as in (65) simplifies computation of Lagrangian maximizers. Since there are no constraints coupling the selection of optimal x and p in (65) we can separate the determination of the pair (x(Λ),p(Λ)) in (57) into the determination of the ergodic limits

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M221','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M221">View MathML</a>


and the resource allocation function

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M222','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M222">View MathML</a>


The computation of the resource allocation function in (67) can be further separated. The set <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M223','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M223">View MathML</a> as defined in (19) constrains separate values p(h) of the function pbut doesn’t couple the selection of p(h1) and p(h2) for different channel realizations h1h2. Further observing that expectation is a linear operation we can separate (67) into per-fading state subproblems of the form

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M224','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M224">View MathML</a>


The absence of coupling constraints in the Lagrangian maximization, which permits separation into the ergodic limit maximization in (66) and the per-fading maximizations in (68), is the fundamental difference between (57) and the primal problem in (20). In the latter, the selection of optimal x and p as well as the selection of p(h1) and p(h2) for different channel realizations h1h2 are coupled by the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M225','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M225">View MathML</a>.

The decomposition in (68) is akin to the decomposition in (16) for the particular case of a point to point channel with capacity achieving codes. It implies that to determine the optimal power allocation p it is convenient to first determine the optimal multiplier Λ. We then proceed to exploit the lack of duality gap and the separable structure of the Lagrangian to compute values p(h) of the optimal resource allocation independently of each other. The computation of optimal ergodic averages is also separated as stated in (66). This separation reduces computational complexity because of the reduced dimensionality of (66) and (68) with respect to that of (20).

For the separation in (66) and (68) to be possible we just need to have a nonatomic channel distribution and ensure existence of a strictly feasible point as stated in the hypotheses of Theorem 1. These two properties are true for most optimal wireless communication and networking problems. A particular example is discussed in the following section.

Frequency division broadcast channel

Consider the optimal frequency division broadcast channel problem in (30) whose Lagrangian <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M226','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M226">View MathML</a> is given by the expression in (31). Terms in <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M227','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M227">View MathML</a> can be rearranged to uncover the separable structure of the Lagrangian

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M228','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M228">View MathML</a>


This rearrangement is equivalent to the generic transformation leading from (22) to (65). As we observed after (65) the computation of Lagrangian maximizing ergodic limits and resource allocation functions can be separated as can the computation of resource allocation values corresponding to different fading channel realizations. This is the case in this particular example. In fact, there is more separability to be exploited in (69).

With regards to primal ergodic variables rnotice that each Lagrangian maximizing rate ri(λ,μ) depends only on λi and that we can compute each ri(λ,μ)=ri(λi) separately as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M229','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M229">View MathML</a>


This can be easily computed as U(ri) is a one dimensional concave function. As a particular case consider the identity utility U(ri)=ri. Since the Lagrangian becomes a linear function of ri, the maximum occurs at either rmaxor rmin depending on the sign of 1−λi. When λi=1 the Lagrangian becomes independent of ci. In this case any value in the interval [rmin,rmax] is a Lagrangian maximizer. Putting these observations together we have

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M230','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M230">View MathML</a>


Notice that the Lagrangian maximizer ri(λi) is not unique if λi=1. Therefore, if <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M231','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M231">View MathML</a> it is not possible to recover the optimal rate <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M232','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M232">View MathML</a> from the Lagrangian maximizer <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M233','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M233">View MathML</a> corresponding to the optimal multiplier. In fact, an optimal multiplier <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M234','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M234">View MathML</a> is uninformative with regards to the optimal ergodic rate as it just implies that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M235','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M235">View MathML</a> which we know is true because this is the feasible range of ri. If you think this is an unlikely scenario because it is too much of a coincidence to have <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M236','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M236">View MathML</a>, think again. Having <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M237','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M237">View MathML</a> is the most likely situation. If <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M238','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M238">View MathML</a> the optimal rate <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M239','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M239">View MathML</a> is either <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M240','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M240">View MathML</a> or <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M241','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M241">View MathML</a>. However, the capacity bounds rminand rmaxare selected independently of the remaining system parameters. It is quite unlikely—indeed, not true in most cases—that the optimal power and frequency allocation yields a rate <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M242','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M242">View MathML</a> determined by these arbitrarily selected parameters.

As we observed in going from (67) to (68) determination of the optimal power and frequency allocation functions requires maximization of the terms inside the expectation. These implies solution of the optimization problems

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M243','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M243">View MathML</a>


where we opened up the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M244','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M244">View MathML</a> into its per-fading state components.

The maximization in (72) can be further simplified. Begin by noting that irrespectively of the value of α(h) the best possible power allocation pi(h,λ,μ) of terminal i is the one that maximizes its potential contribution to the sum in (72), i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M245','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M245">View MathML</a>


If αi(h)=1 this contribution is added to the sum in (72). If it is multiplied by αi(h)=0 it is not added to the sum in (72). Either way pi(h,λ,μ) as given by (73) is the optimal power allocation for terminal i.

To determine the frequency allocation α(h,λ,μ) define the discriminants

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M246','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M246">View MathML</a>


which we can use to rewrite (72) as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M247','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M247">View MathML</a>


Since at most one αi(h)=1 in (75), the best we can do is to select the terminal with the largest discriminant when that discriminant is positive. If all discriminants are negative the best we can do is to make αi(h)=0 for all i.

The Lagrangian maximizers pi(h,λ,μ) and α(h,λ,μ) in (73) and (75) are almost surely unique for all values of Λand μ. In particular, optimal allocations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M248','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M248">View MathML</a> and α(h) can be obtained by making Λ=Λand μ=μ in (73)–(75).

Dual descent algorithms

Determining optimal dual variables Λ is easier than determining optimal primal pairs (p,x) because there are a finite number of multipliers and the dual function is convex. If the dual function is convex descent algorithms are guaranteed to converge towards the optimum value, which implies we just need to determine descent directions for the dual function.

Descent directions for the dual function can be constructed from the constraint slacks associated with the Lagrangian maximizers. To do so, consider a given Λ and use the definition of the Lagrangian maximizer pair (x(Λ),p(Λ)) in (57) to write the dual function as

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M249','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M249">View MathML</a>


Further consider the Lagrangian <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M250','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M250">View MathML</a> evaluated at an arbitrary multiplier <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M251','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M251">View MathML</a> and primal Lagrangian multipliers corresponding to the given Λ. This Lagrangian lower bounds the dual function value <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M252','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M252">View MathML</a> which allows us to write

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M253','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M253">View MathML</a>


Subtracting (76) from (77) yields

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M254','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M254">View MathML</a>


Defining the vector <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M255','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M255">View MathML</a> with components

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M256','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M256">View MathML</a>


and recalling the multiplier definitions <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M257','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M257">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M258','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M258">View MathML</a> we can write

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M259','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M259">View MathML</a>


If the dual function g(Λ) is differentiable, the expression in (80) implies that <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M260','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M260">View MathML</a> is its gradient. If the dual function is nondifferentiable <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M261','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M261">View MathML</a> defines a subgradient of the dual function. In either case <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M262','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M262">View MathML</a> is a descent direction of the dual function. This can be verified by substituting M=Λ in (80) to conclude that for any ΛΛit must be

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M263','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M263">View MathML</a>


Since the inner product of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M264','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M264">View MathML</a> and (ΛΛ) is negative the vectors <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M265','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M265">View MathML</a> and (ΛΛ) form an angle smaller than Π/2. This can be interpreted as meaning that standing at Λ, the vector <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M266','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M266">View MathML</a> points in the direction of Λ.

Having a descent direction available we can introduce a time index t and a stepsize εtto define the dual subgradient descent algorithm as the one with iterates λ(t) obtained through recursive application of

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M267','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M267">View MathML</a>


This algorithm is known to converge to optimal dual variables if the stepsize vanishes at a nonsummable rate and to approach Λ if the stepsize is constant; see e.g., [20], Section 6.

A problem in implementing (82) is that computing the subgradient component <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M268','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M268">View MathML</a> in (79) is costly. To compute <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M269','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M269">View MathML</a>, we need to evaluate the expectation <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M270','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M270">View MathML</a> where each of the resource allocations p(h,Λ) follows from the solution of the optimization problem in (68). Therefore, to approximate the expectation <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M271','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M271">View MathML</a> we need to determine p(h,Λ) for a grid of channel values, which gets impractical if h has large dimension. A Montecarlo approximation of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M272','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M272">View MathML</a> could be computed but that is also costly. Furthermore, to compute <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M273','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M273">View MathML</a> we need to know the probability distribution mh(h) which needs to be estimated from channel observations. To overcome these difficulties we replace the gradient <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M274','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M274">View MathML</a> in (82) by a stochastic subgradient as we discuss in the following section.

Stochastic subgradient descent

Consider a given channel realization h and given multiplier Λ and define the vector

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M275','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M275">View MathML</a>


This definition is made such that the expected value of s1(h,Λ) with respect to the channel distribution is the subgradient component <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M276','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M276">View MathML</a> in (79). Thus, if we define the vector <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M277','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M277">View MathML</a> with s1(h,Λ) as in (83) and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M278','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M278">View MathML</a> we have

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M279','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M279">View MathML</a>


Formally, (84) implies that s(h,Λ) is a stochastic subgradient of the dual function. Intutively, (84) implies that s(h,Λ) is an average descent direction of the dual function because its expectation is a descent direction. Thus, if we draw independent channel realizations h(t) and replace <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M280','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M280">View MathML</a> for s(h(t),Λ(t)) in (82) we expect to observe some sort of convergence towards optimum multipliers.

The advantage of this substitution is that to compute the stochastic subgradient s(h,Λ), we do not need to evaluate an expectation as in the case of the subgradient <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M281','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M281">View MathML</a>. As a consequence, using stochastic subgradients as descent directions results in an algorithm that is computationally lighter. Perhaps more important, we can operate without knowledge of the channel probability distribution if we use the current channel realization h(t) as our channel sample. These observations motivate the introduction of the following dual stochastic subgradient descent algorithm. (S1) Primal iteration. Given multipliers λ(t) observe current channel realization h(t) and determine primal Lagrangian maximizers [cf. (66) and (68)]

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M282','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M282">View MathML</a>


(S2) Dual stochastic subgradient. With the Lagrangian maximizers determined by (85) compute the stochastic subgradient <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M283','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M283">View MathML</a> of the dual function with components [cf. (83) and (79)]

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M284','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M284">View MathML</a>


(S3) Dual iteration. With stochastic subgradients as in (86) and given step size ε descend in the dual domain along the direction −s(t) [cf. (82)]

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M285','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M285">View MathML</a>


The core of the dual stochastic subgradient descent algorithm is the dual iteration (S3). The purpose of the primal iteration (S1) is to compute the stochastic subgradients in (S2) that are needed to implement the dual descent update in (S3). We can think of the primal variables x(Λ(t)) and p(h(t),Λ(t)) as a byproduct of the descent implementation.

Convergence properties depend on whether constant or time varying step sizes are used. If the stepsizes εt form a nonsummable but square summable series, i.e., <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M286','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M286">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M287','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M287">View MathML</a>, then using a simple supermartingale argument it can be shown that λ(t) converges to Λ almost surely [24]. If constant stepsizes εt=ε for all t are used, λ(t) does not converge to Λ but it can be shown that λ(t) visits a neighborhood of the optimal multiplier set Λ[19, Appendix]. Excursions away from this set are possible, but the set is visited infinitely often. The suboptimality of this set is controlled by the step size ε.

If λ(t) approaches or converges to Λit follows as a consequence of Theorem 4 that an optimal primal pair (x,p) can be computed from the Lagrangian maximizers if the latter are unique. Observe that this does not require a separate computation because the Lagrangian maximizers are computed in the primal iteration (S1). One may question that at time t we do not compute the Lagrangian maximizer function p(Λ(t)) but just the single value p(h(t),Λ(t)). However, h(t) is the channel realization at time t which means that p(h(t),Λ(t)) is the value we need to compute to adapt to the current channel realization.

This permits reinterpretation of (S1)–(S3) as a policy to determine wireless systems’ operating points. At time t we observe current channel realization h(t) and determine resource allocation p(h(t),Λ(t)) which we proceed to implement in the current time slot. In this case the core of the algorithm is the primal iteration (S1) and the dual variable λ(t) is an internal state that determines the operating point. Steps (S2) and (S3) are implemented to update this internal state so that as time progresses λ(t) approaches Λand the policy becomes optimal because it chooses the best possible resource allocation adapted to the current channel realization h(t).

The reinterpretation of (S1)–(S3) as a policy to determine resource allocations p(t)=p(h(t),Λ(t)) associated with observed channel realizations h(t) motivates a redefinition of the concept of solution to the wireless optimization problem in (20). In principle, solving (20) entails finding the optimal resource allocation p and the optimal ergodic average x such that the problem constraints are satisfied and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M288','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M288">View MathML</a> [cf. (21)]. Heeding the interpretation of dual stochastic subgradient descent as a policy, we are interested in the optimality of the sequences of power allocations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M289','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M289">View MathML</a> and average variables <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M290','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M290">View MathML</a> generated by (S1)–(S3). Further notice that since (S1)–(S3) is a stochastic algorithm the sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M291','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M291">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M292','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M292">View MathML</a> generated in a particular run are instantiations of respective random processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M293','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M293">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M294','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M294">View MathML</a>. We are therefore interested in the optimality of the processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M295','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M295">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M296','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M296">View MathML</a>.

To be more specific consider the channel stochastic process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M297','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M297">View MathML</a> whose instances are sequences of channel realizations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M298','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M298">View MathML</a> drawn independently from the channel probability distribution mh(h). Suppose we are also given a sequence of variables <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M299','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M299">View MathML</a> drawn from a stochastic process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M300','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M300">View MathML</a> and a resource allocation function p(h) that dictates allocation of resources p(h(t)) for channel realization h(t). Assuming that the process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M301','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M301">View MathML</a> is ergodic, the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M302','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M302">View MathML</a> is equivalent to

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M303','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M303">View MathML</a>


Indeed, since <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M304','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M304">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M305','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M305">View MathML</a> are ergodic processes the limit <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M306','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M306">View MathML</a> is a constant that we could denote by x and the limit <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M307','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M307">View MathML</a> is equivalent to the expectation <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M308','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M308">View MathML</a>.

Writing the constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M309','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M309">View MathML</a> in the more cumbersome form shown in (88) has the advantage that the latter can be generalized to cases in which we are given stochastic processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M310','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M310">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M311','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M311">View MathML</a> in which <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M312','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M312">View MathML</a> is not necessarily ergodic and realizations p(t) of the process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M313','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M313">View MathML</a> are more general than just functions of the channel state h(t). This concept of solution is formally defined next.

Definition 2

Consider the channel stochastic process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M314','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M314">View MathML</a> whose instances are sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M315','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M315">View MathML</a> drawn independently from the channel probability distribution mh(h). We say the stochastic processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M316','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M316">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M317','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M317">View MathML</a> with realizations <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M318','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M318">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M319','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M319">View MathML</a> problem in (20) if (i) Instantaneous feasibility. Sequence values satisfy the set constraints <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M320','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M320">View MathML</a>, <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M321','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M321">View MathML</a> for all times t. (ii) Almost sure average feasibility. Ergodic limits of sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M322','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M322">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M323','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M323">View MathML</a> are feasible with probability 1, i.e.,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M324','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M324">View MathML</a>


<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M325','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M325">View MathML</a>


(iii) Almost sure optimality. The yield of the ergodic limit of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M326','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M326">View MathML</a> is almost surely optimal, i.e,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M327','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M327">View MathML</a>


If the stochastic process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M328','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M328">View MathML</a> is ergodic and the process <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M329','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M329">View MathML</a> is such that realizations p(t)=p(h(t)) are functions of current channel states, Definition 2 is equivalent to (21) with <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M330','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M330">View MathML</a> and p(h(t))=p(h(t)). Definition 2 is more general because it allows correlation between values of <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M331','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M331">View MathML</a> and lets p(t) be more complex than just a function of the current channel realization h(t). This added generality is needed because processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M332','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M332">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M333','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M333">View MathML</a> defined as per (S1)–(S3) yield correlated processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M334','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M334">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M335','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M335">View MathML</a> in which p(t) is a function of the current channel realization h(t) and the current Lagrange multiplier λ(t). Processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M336','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M336">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M337','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M337">View MathML</a> are close to optimal in the sense of Definition 2 as we describe in the following theorem.

Theorem 5 (Ergodic stochastic optimization [[19]])

Consider the optimization problem in (20) as well as processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M338','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M338">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M339','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M339">View MathML</a> generated by the stochastic dual descent algorithm (S1)–(S3). Let <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M340','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M340">View MathML</a> be a bound on the second moment of the norm of the stochastic subgradients s(h,Λ) and assume the same hypotheses of Theorem 1. Sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M341','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M341">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M342','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M342">View MathML</a> are such that: (i) Feasibility. Items (i) and (ii) of Definition 2 hold true. (ii) Near optimality. The ergodic average of x(t) almost surely converges to a value with optimality gap smaller than <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M343','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M343">View MathML</a>, i.e,

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M344','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M344">View MathML</a>


The sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M345','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M345">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M346','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M346">View MathML</a> satisfy the constraints in (89) and (90) almost surely and the objective function evaluated at the ergodic limit <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M347','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M347">View MathML</a> is within <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M348','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M348">View MathML</a> of optimal. Since <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M349','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M349">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M350','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M350">View MathML</a> are compact sets it follows that the bound <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M351','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M351">View MathML</a> is finite. Therefore, reducing ε it is possible to make <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M352','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M352">View MathML</a> arbitrarily close to P and as a consequence the sequences <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M353','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M353">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M354','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M354">View MathML</a> arbitrarily close to optimal. It follows that the processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M355','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M355">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M356','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M356">View MathML</a> generated by (S1)–(S3) are arbitrarily close to processes <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M357','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M357">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M358','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M358">View MathML</a> that are optimal in the sense of Definition 2.

Variables p and x optimal in the sense of (21) are not computed by (S1)–(S3). Rather, (89) implies that, asymptotically, (S1)–(S3) is drawing resource allocation realizations p(t)=p(h(t),Λ(t)) and variables x(t):=x(Λ(t)) that are close to optimal as per Definition 2. The important point here is that having a procedure to generate stochastic processes close to optimal in the sense of Definition 2 is sufficient for practical implementation.

An example application of the dual stochastic subgradient descent algorithm (S1)–(S3) is discussed in the next section.

Frequency division broadcast channel

To implement dual stochastic descent for the frequency division broadcast channel we need to specify the primal iteration (S1) and the dual iteration (S2). To specify the primal iteration (S1) we need to compute Lagrangian maximizers for which it suffices to recall the expressions in Section “Frequency division broadcast channel” of “Recovery of optimal primal variables”. For the ergodic rate ri we make λi=λi(t) in (70) to conclude that the primal iterate ri(t)=ri(λi(t)) is

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M359','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M359">View MathML</a>


For the power allocations pi(t)=pi(h(t),λ(t),μ(t)) and the frequency assignments α(t)=α(h(t),λ(t),μ(t)) we need to set the multipliers to λ=λ(t) and μ=μ(t) and also set the value of the channel to its current state h=h(t). This substitution in (73) yields the power allocation

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M360','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M360">View MathML</a>


To determine the frequency assignments α(t) we first substitute λ=λ(t), μ=μ(t), and h=h(t) in (74) to compute the discriminants di(t)=di(h(t),λ(t),μ(t))

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M361','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M361">View MathML</a>


from where we conclude that the frequency assignment α(t)=α(h(t),λ(t),μ(t)) is given by the solution of [cf. (75)]

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M362','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M362">View MathML</a>


Recall that since at most one αi(h)=1 in (96), the optimal frequency allocation is to make αi(h)=1 for the terminal with the largest discriminant when that discriminant is positive. If all discriminants are negative we make αi(h)=0 for all i.

The ESO algorithm for optimal resource allocation in broadcast channels is completed with an iteration in the dual domain [cf. (83) and (87)]

<a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M363','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M363">View MathML</a>


As per Theorem 5 iterative application of (93)–(97) yields sequences ri(t), αi(t) and pi(t) such that: (i) The sum utility for the ergodic limits of ri(t) is almost surely within a small constant of optimal; (ii) The power constraint in (27) and the rate constraints in (26) are almost surely satisfied in an ergodic sense. This result is true despite the presence of the non-convex integer constraint <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M364','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M364">View MathML</a>, the non-concave function C(hipi(t)/N0), the lack of access to the channel’s probability distribution, and the infinite dimensionality of the optimization problem.

Numerical results

The dual stochastic subgradient descent algorithm for optimal resource allocation in frequency division broadcast channels defined by (93)–(97) is simulated for a system with J=16 nodes. Three AMC modes corresponding to capacities 1, 2 and 3 bits/s/Hz are used with transitions at SINR 1, 3 and 7. Fading channels are generated as i.i.d. Rayleigh with average powers 1 for the first four nodes, i.e., j=1,…,4, and 2, 3 and 4 for subsequent groups of 4 nodes. Noise power is N0=1 and average power available is q0=3. Rate of packet acceptance is constrained to be 0≤ri(t)≤2 bits/s/Hz. The optimality criteria is proportional fair scheduling, i.e., <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M365','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M365">View MathML</a> for all i. Steps size is ε=0.1.

Figure 3 shows evolution of dual variables λi(t) and corresponding rates ri(t) for representative nodes i=1 with average channels <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M366','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M366">View MathML</a> and i=9 with <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M367','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M367">View MathML</a>. The time average rate <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M368','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M368">View MathML</a> is also shown. Neither multipliers λi(t) nor rates ri(t) converge, but ergodic rates <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M369','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M369">View MathML</a> do converge. Multiplier λ1(t) associated with node 1 is larger than multiplier λ9(t) of node 9. This improves fairness of resource allocation by increasing the chances of allocating user 1 even when the channel h1(t) is smaller than h9(t)—recall that channel h9(t) is stronger on average. Convergence of the algorithm is ratified by Figures 4 and 5. Figure 4 shows evolution of the objective <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M370','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M370">View MathML</a> and the dual function value g(t):=g(λ(t),μ(t)). Notice that the objective value is decreasing towards the maximum objective. This is not a contradiction, because variables <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M371','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M371">View MathML</a> are infeasible but approach feasibility as t grows. The dual function’s value is an upper bound on the maximum utility and it can be observed to approach the objective as t grows. Eventually, the objective value becomes smaller than the dual value as expected. Figure 5 corroborates satisfaction of the power constraint in (27) and the rate constraints in (26). The amount by which the power constraint (27) is violated is shown in the top. In the bottom we show the corresponding figure for the rate constraint in (26). Since there are J of these constraints we show the minimum and maximum violation. All constraints are satisfied as t grows. Resulting power allocations appear in Figure 6 for a channel with average power <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M372','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M372">View MathML</a> and for a channel with <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M373','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M373">View MathML</a>. Power allocation is opportunistic in that power is allocated only when channel realizations are above average.

thumbnailFigure 3. Primal and dual iterates in dual stochastic gradient descent. Evolution of dual variables λi(t) and rates ri(t) for representative nodes with average channels <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M374','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M374">View MathML</a> and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M375','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M375">View MathML</a> for the algorithm in (93)–(97) are shown. Multipliers λi(t) and capacities ci(t) do not converge, but ergodic rates <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M376','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M376">View MathML</a> do.

thumbnailFigure 4. Optimal frequency division broadcast channel. Objective value <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M377','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M377">View MathML</a> and dual function’s value g(t):=g(λ(t),μ(t)) for the algorithm in (93)–(97) are shown along with lines marking optimal utility and 90% of optimal yield. Utility yield becomes optimal as time grows.

thumbnailFigure 5. Power and capacity constraints. Feasibility as time grows is corroborated for the power constraint in (27) (top) and rate constraints in (26) (bottom). For the rate constraint we show the maximum and minimum value of constraint violation.

thumbnailFigure 6. Power allocations. Power allocated as a function of channel realization is shown for channels with average power <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M378','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M378">View MathML</a> (top) and <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M379','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M379">View MathML</a> (bottom). The resulting power allocation is opportunistic in that power is allocated only when channel realizations are above average.


This article reviews recent results which state that problems of the form in (20) in which nonconcave functions appear inside expectations have null duality gap as long as the probability distribution of the fading coefficient h contains no points of strictly positive probability. Lack of duality gap permits solution in the dual domain leading to a substantial reduction in the computational cost of determining optimal operating points of the wireless system. Working in the dual domain leads to a solution methodology that can be interpreted as a generalization of the derivation of the waterfilling power allocation in point to point channels reviewed in Section “Power allocation in a point-to-point channel”.

Specifically, the problem of determining the optimal resource allocation function p in (20) is challenging due to its infinite dimensionality and lack of convexity. However, in the dual domain we need to determine the optimal multiplier Λthat minimizes the dual function in (23). This is simpler because the dual function is convex and finite dimensional. Once we have found an optimal dual variable we can determine optimal operating points as Lagrangian maximizers. In doing so we can exploit the separable structure of the Lagrangian to decompose the optimization problem into the per fading state subproblems in (68). We emphasize that solving the optimization programs in (68) is not necessarily easy if the dimensionality of h is large. Nevertheless solving (68) is always simpler than solving (20) and in some cases plain simple. Lack of duality gap and Lagrangian separability are further exploited to propose the dual stochastic subgradient descent algorithm (S1)–(S3) which converges to an optimal operating point with probability 1 in an ergodic sense.

There are three key points that permit the development of the solution methodology outlined in the previous paragraph: Nonatomic fading distribution. A nonatomic fading distribution leads to the lack of duality gap. The fact that P=D, i.e., that primal and dual optimal values are the same, is what allows us to work in the dual domain without loss of optimality. In formal terms, lack of duality gap is the tool that we used to recover the optimal primal variables (x,p) from the optimal dual variable Λby determining the primal Lagrangian maximizers for <a onClick="popup('http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M380','MathML',630,470);return false;" target="_blank" href="http://jwcn.eurasipjournals.com/content/2012/1/272/mathml/M380">View MathML</a> [cf. Theorem 4]. It is important to distinguish between convexity of the optimization problem and lack of duality gap. Null duality gap may follow from convexity, but convexity is rare in wireless communications systems. Lack of duality gap can also follow from a nonatomic fading distribution, which is a common occurrence in wireless systems. Lagrangian Separability. According to Theorem 4 null duality gap permits computation of the optimal pair (x,p) as the Lagrangian maximizers (x(Λ),p(Λ)). This is not a simplification per se but leads to a simplification because the computation of the Lagrangian maximizer function p(Λ) can be separated into per fading state problems whose solution determines values p(h,Λ) of this function [cf. (66)–(68)]. The Lagrangian is separable in this sense because neither the constraints nor the objective function involve a nonlinear function coupling the selection of values p(h1) and p(h2) for different channel realizations h1h2. Whenever p(h1) and p(h2) appear as part of the same constraint they appear as different terms of an expectation operation. This absence of coupling is what permits exchanging the order of maximization and expectation in going from (67) to (68). Finite number of constraints. Working in the dual domain is simpler than working in the primal domain because the dual function is finite dimensional whereas the primal problem is infinite dimensional. We have a finite dimensional dual function as long as the original optimization problem has a finite number of constraints.

Nonatomic fading distributions, Lagrangian separability, and having a finite number of constraints are properties that appear in many, indeed most, problems in optimal design of wireless systems. In such cases the methodology described in this article can be applied to their solution.

Further reading

The use of dual problems as a shortcut to solve optimization problems in communications has a rich history [25-28]; see also [29] for a comprehensive treatment. Lack of duality gap in non-convex optimization problems has also been observed in the context of asymmetric digital subscriber lines [30,31]. In network optimization lack of duality gap leads to the optimality of layered architectures which renders the complexity of wireless networking essentially identical to the complexity of physical layer optimization [32-35]. For the use of techniques discussed here in the solution of specific problems we refer the reader to [36-42]. For further details on dual stochastic sub gradient descent, the literature on convergence of subgradient descent algorithms [43-45], and stochastic subgradient descent [46-49] is of interest.

Competing interests

The author declares that he has no competing interests.


Work in this article is supported by the Army Research Office grant W911NF-10-1-0388 and the National Science Foundation award CAREER CCF-0952867. Part of the results in this article were derived while the author was at the University of Minnesota. The work presented here has benefited from discussions with Yichuan Hu, Dr. Nikolaos Gatsis, and Prof. Georgios B. Giannakis. The Associate Editor, Dr. Deniz Gunduz, provided valuable corrections to a draft version of this article.


  1. X Wang, GB Giannakis, Resource allocation for wireless multiuser OFDM networks. IEEE Trans. Inf. Theory 57(7), 4359–4372 (2011)

  2. V Ntranos, N Sidiropoulos, L Tassiulas, On multicast beamforming for minimum outage. IEEE Trans. Wirel. Commun 8(6), 3172–3181 (2009)

  3. ND Sidiropoulos, TN Davidson, ZQ Luo, Transmit beamforming for physical-layer multicasting. IEEE Trans. Signal Process 54(6), 2239–2251 (2006)

  4. JA Bazerque, GB Giannakis, Distributed scheduling and resource allocation for cognitive OFDMA radios. Mobile Nets. Apps 13(5), 452–462 (2008). Publisher Full Text OpenURL

  5. Z Quan, S Cui, AH Sayed, Optimal linear cooperation for spectrum sensing in cognitive radio networks. IEEE J. Sel. Topics Signal Process 2, 28–40 (2008)

  6. Y Hu, A Ribeiro, Optimal wireless networks based on local channel state information. IEEE Trans. Signal Process 60(9), 4913–4929 (September 2012) OpenURL

  7. Y Hu, A Ribeiro, Adaptive distributed algorithms for optimal random access channels. IEEE Trans. Wirel. Commun 10(8), 2703–2715 (2011)

  8. Y Hu, A Ribeiro, A Optimal wireless multiuser channels with imperfect channel state information. Proc. Int. Conf. Acoustics Speech Signal Process, vol. 1, ((Kyoto Japan, 2012), pp), . 1–4 OpenURL

  9. Y Hu, A Ribeiro, Optimal transmission over a fading channel with imperfect channel state information. Global Telecommun. Conf., vol. 1, ((Houston, TX, 2011), pp), . 1–5 OpenURL

  10. L Chen, SH Low, M Chiang, JC Doyle, Cross-layer congestion control, routing and scheduling design in ad hoc wireless networks. Proc. IEEE INFOCOM, vol. 1, ((Barcelona, Spain, 23–29 April 2005), pp), . 1–13 OpenURL

  11. M Chiang, SH Low, RA Calderbank, JC Doyle, Layering as optimization decomposition. Proc IEEE 95, 255–312 (2007)

  12. A Eryilmaz, R Srikant, Joint congestion control, routing, and MAC for stability and fairness in wireless networks. IEEE J. Sel. Areas Commun 24(8), 1514–1524 (2006)

  13. L Georgiadis, MJ Neely, Resource allocation and cross-layer control in wireless networks. Found Trends Netw 1, 1–144 (2006)

  14. JW Lee, RR Mazumdar, NB Shroff, Opportunistic power scheduling for dynamic multi-server wireless systems. IEEE Trans. Wirel. Commun 5(6), 1506–1515 (2006)

  15. X Lin, NB Shroff, R Srikant, A tutorial on cross-layer optimization in wireless networks. IEEE J. Sel. Areas Commun 24(8), 1452–1463 (2006)

  16. MJ Neely, E Modiano, CE Rohrs, Dynamic power allocation and routing for time-varying wireless networks, IEEE J. Sel. Areas Commun 23, 89–103 (2005)

  17. X Wang, K Kar, Cross-layer rate optimization for proportional fairness in multihop wireless networks with random access. IEEE J. Sel. Areas Commun 24(8), 1548–1559 (2006)

  18. Y Yi, S Shakkottai, Hop-by-hop congestion control over a wireless multi-hop network. IEEE/ACM Trans. Netw. 15(133–144), 1548–1559 (2007)

  19. A Ribeiro, Ergodic stochastic optimization algorithms for wireless communication and networking. IEEE Trans. Signal Process 58(12), 6369–6386 (2010)

  20. A Ribeiro, G Giannakis, Separation principles in wireless networking. IEEE Trans. Inf. Theory 56(9), 4488–4505 (2010)

  21. S Boyd, L Vandenberghe, in Convex Optimization (Cambridge University Press, Cambridge, 2004) OpenURL

  22. AA Lyapunov, Complètement, Sur les, Fonctions-vecteur, URSS, Additives. Bull. Acad. Sci. Sèr. Math 4, 465–478 (1940)

  23. RT Rockafellar, in Convex Analysis (Princeton University Press, Princeton, NJ, 1970) OpenURL

  24. NZ Shor, in Minimization Methods for Non-Differentiable Functions (Springer, Berlin, 1985) OpenURL

  25. FP Kelly, A Maulloo, D Tan, Rate control for communication networks: shadow prices, proportional fairness and stability. J. Oper. Res. Soc 49(3), 237–252 (1998)

  26. SH Low, DE Lapsley, Optimization flow control, I: basic algorithm and convergence. IEEE/ACM Trans. Netw 7(6), 861–874 (1998)

  27. SH Low, A duality model of TCP and queue management algorithms. IEEE/ACM Trans. Netw 11(4), 525–536 (2003). Publisher Full Text OpenURL

  28. SH Low, F Paganini, JC Doyle, Internet congestion control. IEEE Control Syst. Mag 22, 28–43 (2002)

  29. R Srikant, in The Mathematics of Internet Congestion Control (Birkhauser, 2004) OpenURL

  30. ZQ Luo, S Zhang, Dynamic spectrum management: complexity and duality. IEEE J. Sel. Topics Signal Process 1(2), 57–73 (2008)

  31. W Yu, R Lui, Dual methods for nonconvex spectrum optimization of multicarrier systems. IEEE Trans. Commun 54(7), 1310–1322 (2006)

  32. RA Berry, EM Yeh, Cross-layer wireless resource allocation. IEEE Signal Process. Mag 21(5), 59–68 (2004). Publisher Full Text OpenURL

  33. MJ Neely, Energy optimal control for time-varying wireless networks. IEEE Trans. Inf. Theory 52(7), 2915–2934 (2006)

  34. MJ Neely, E Modiano, CP Li, Fairness and optimal stochastic control for heterogeneous networks. Proc. IEEE INFOCOM, vol 3 ((Miami, FL 13–17, March 2005), pp), . 1723–1734 OpenURL

  35. A Ribeiro, G Giannakis, Layer separability of wireless networks. Proc. Conf. on Info. Sciences and Systems, vol. 1 ((Princeton Univ), . Princeton, NJ, 2008), pp. 821–826 OpenURL

  36. N Gatsis, A Ribeiro, G Giannakis, A class of convergent algorithms for resource allocation in wireless fading networks. IEEE Trans. Wirel. Commun 9(5), 1808–1823 (2010)

  37. N Gatsis, A Ribeiro, G Giannakis, Cross-layer optimization of wireless fading ad-hoc networks. Proc. Int. Conf. Acoustics Speech Signal Process, vol. 1 ((Taipei, Taiwan, 2009), pp), . 2353–2356 OpenURL

  38. Y Hu, A Ribeiro, Optimal wireless networks based on local channel state information. Proc. Int. Conf. Acoustics Speech Signal Process, vol. 1 ((Prague Czech Republic, 2011), pp), . 3124–3127 OpenURL

  39. Y Hu, A Ribeiro, Adaptive distributed algorithms for optimal random access channels. Proc. Allerton Conf. on Commun. Control Computing, vol. 1 ((Monticello, 2010), pp), . 1474–1481 OpenURL

  40. A Ribeiro, G Giannakis, Optimal FDMA over wireless fading mobile ad-hoc networks. Proc. Int. Conf. Acoustics Speech Signal Process, vol. 1 ((Las Vegas, NV, 2008), pp), . 2765–2768 OpenURL

  41. A Ribeiro, T Luo, N Sidiropoulos, G Giannakis, Modelling and optimization of stochastic routing for wireless multihop networks. Proc. IEEE Int. Conf. on Computer Commun, vol. 1 ((Anchorage, AK, 2007), pp), . 1748–1756 OpenURL

  42. A Ribeiro, N Sidiropoulos, G Giannakis, Optimal distributed stochastic routing algorithms for wireless multihop networks. IEEE Trans. Wirel. Commun 7(11), 4261–4272 (2008)

  43. A Juditsky, G Lan, A Nemirovski, A Shapiro, Stochastic approximation approach to stochastic programming. SIAM J. Optim 19(4), 1574–1609 (2009). Publisher Full Text OpenURL

  44. T Larsson, M Patriksson, A Str omberg, Ergodic primal convergence in dual subgradient schemes for convex programming. Math. Progr 86(2), 283–312 (1999). Publisher Full Text OpenURL

  45. A Nedic, A Ozdaglar, Approximate primal solutions and rate analysis for dual subgradient methods. SIAM J. Optim 19(4), 1757–1780 (2009). Publisher Full Text OpenURL

  46. BT Polyak, Newstochasticapproximationtypeprocedures Autom, Remote Control 51, 937–946 (1990)

  47. BT Polyak, AB Juditsky, Acceleration of stochastic approximation by averaging. SIAM J. Control Optim 30(4), 838–855 (1992). Publisher Full Text OpenURL

  48. A Ribeiro, Stochastic learning algorithms for optimal design of wireless fading networks. Proc. IEEE Workshop on Signal Process. Advances in Wireless Commun vol. 1 ((Marakech, Morocco, 2010), pp), . 1–5 OpenURL

  49. A Ribeiro, Ergodic stochastic optimization algorithms for wireless communication and networking. Proc. Int. Conf. Acoustics Speech Signal Process, vol. 1, ((Dallas, TX, 2010), pp), . 3326–3329 OpenURL