Abstract
In this paper, we present distributed cooperative and regretmatchingbased learning schemes for joint transmit power and beamforming selection for multiple antenna wireless ad hoc networks operating in a multiuser interference environment. Under the total network power minimization criterion, a joint iterative approach is proposed to reduce the mutual interference at each node while ensuring a constant received signaltointerference and noise ratio at each receiver. In cooperative and regretmatchingbased power minimization algorithms, transmit beamformers are selected from a predefined codebook to minimize the total power. By selecting transmit beamformers judiciously and performing power adaptation, the cooperative algorithm is shown to converge to a pure strategy Nash equilibrium with high probability in the interference impaired network. The proposed cooperative and regretmatchingbased distributed algorithms are also compared with centralized solutions through simulation results.
Keywords:
MIMO; ad hoc networks; game theory; beamforming1 Introduction
Multipleinput multipleoutput (MIMO) communication techniques have been shown to boost the capacity and spectral efficiency of wireless communication systems [1,2]. MIMO wireless systems can sustain more simultaneous transmissions in a reduced area through interference management [3]. When transmission parameters such as transmit power, beamformer selection, frequency, modulation, transmission rate are modified to adapt to the interference environment, MIMO systems gain an additional advantage. Adaptive wireless systems can achieve system efficiency, lower computational complexity, and overhead compared to a centralized system.
Transmit beamforming has been the focus of extensive research in the literature [411] and designing optimum signaling at the transmitter can lead to significant improvements for systems operating in varying interference [4,6,1216]. In spatial transmit beamforming, each communicating node's symbol stream is multiplied by a preselected transmit beamforming weight vector for transmission through multiple antennas such that the overall interference due to other multiple nodes is minimized. Adaptive optimizing of transmitter beamforming improves efficiency by steering the beam toward the intended receiver, while placing nulls toward the unintended receivers in order to avoid causing excessive interference to them. Transmitters may adapt their signals using a lowrate feedback from the receiver [17]. A power control mechanism can also be combined with limited rate feedback from the receiver in order to satisfy certain QualityofService (QoS) requirements at the receiver [1820].
In general, MIMO beamforming techniques in communication systems are addressed in three different systems: pointtopoint, cellular, and ad hoc networks. The great potential of MIMO in pointtopoint communication is shown in [1,4,6,21] and linear precoders (eigencoders) and beamformers have been designed for pointtopoint MIMO links in [5,7]. In cellular networks, beamforming algorithms minimize the total power and enhance capacity for arrayequipped base stations and single antenna mobile transmitters [811]. In ad hoc networks, without a central controller, distributed beamforming techniques increase system throughput and lower energy consumption [12,2224]. However, optimization solutions designed for ad hoc networks need careful study, because the environment is interference limited and the performance of MIMO techniques depends significantly on the overhead introduced by the proposed algorithms.
Distributed spatial beamforming algorithms are proposed for multiuser ad hoc MIMO networks in [23,24] under channel reciprocity conditions. Channel reciprocity holds when the channel matrix at the receiver is the transpose of the channel matrix at the transmitter, this is usually assumed in timedivision duplex (TDD) systems [24]. Bromberg [24] consider the capacity maximization problem and propose a locally enabled global optimization (LEGO) algorithm for distributed beamforming update under Gaussian otheruser interference. Iltis et al. [23] formulate the problem as a noncooperative game for overall power minimization of the network under a constant QoS constraint (i.e., target signaltointerference plus noise ratio (SINR)). The proposed iterative minimum meansquare error (IMMSE) algorithm solves an optimization problem by computing transmit/receive beamformer pairs and transmit powers in a distributed manner [23]. In the IMMSE algorithm, the receive beamformer is the conjugate of the transmit beamformer and the algorithm relies on the channel reciprocity condition. Hence, the IMMSE algorithm does not demand explicit feedback schemes for channel state information (CSI) at the receiver. However, during the updating procedure of the IMMSE algorithm, transmission overhead of training sequences and power control commands are incurred. The amount of overhead increases with iterations, since the algorithm performs transmit/receive beamformer and power updates iteratively. Moreover, if the transmitter and receiver use different channels or frequencies for transmission and reception, i.e., when channel reciprocity is not valid, CSI must be fed back to the transmitter, which necessitates overhead.
In order to lower the communication overhead between transmitter and receiver when channel reciprocity does not hold, a scheme to limit feedback by quantizing the transmit beamformer in single user MIMO systems is proposed in [21]. The concept is based on selecting a codeword in a predetermined codebook that is known to both transmitter and receiver. Selecting the transmit beamformer from a predefined codebook reduces overhead in nonreciprocal channels. Moreover, latency is reduced in highly mobile and unstable communication networks and user participation is minimized. In this scenario, the receiver only feeds back the index of the selected transmit beamformer to the transmitter. When there is no channel reciprocity between transmitters and receivers, an iterative limited feedback beamforming algorithm using a predetermined codebook is proposed in [25]. The algorithm maximizes the transmission rate in MIMO multiuser ad hoc networks using sequential discrete transmit beamformer selection updates. In each iteration, each node formulates its best response strategy, which maximizes the received SINR. However, the convergence of the algorithm has not been investigated.
Game theory has enabled efficiency and convergence proofs of some of the important problems in wireless communications such as distributed power control algorithm design [26], joint codedivision multiple access (CDMA) waveform, and power control design [19,20,27] and optimum transmission signaling strategies [28,29]. The application of game theory to distributed beamforming is problematic [23]. Lacatus and Popescu [20] and Popescu et al. [19] study joint CDMA codeword (or sequence) and power adaptation as a noncooperative game. The problem is formulated as a separable game using noncooperative convex games, with corresponding subgames: power control and codeword control game. However, in contrast to our joint optimization problem, the joint optimization of powers and CDMA codewords is investigated only over convex games (i.e., the set of action space is nonempty, compact, and convex [26,30]), and therefore the decision variables (i.e., the powers and codeword sequences) are continuous, not discrete in these games.
Optimum transmit signaling for rate maximization in MIMO interference systems has been studied using game theory [1216]. In these papers, the system is modeled as a noncooperative game where every MIMO link is a player and computes against the others by choosing the transmit covariance matrix to maximize its own rate. Liang and Dandekar [13] investigate rate maximization for MIMO ad hoc networks by performing power control. The existence of a Nash equilibrium (NE) solution is shown using concave game analysis. The convergence of the proposed algorithms is not studied. Arslan et al. [14] show that individual mutual information maximization is a concave game [31] in MIMO interference channels, which implies the existence of a NE for arbitrary channel matrices. The equilibrium is provably unique when multiuser interference (MUI) is sufficiently small. Decentralized algorithms using local information provide update strategies to determine the link parameters. As an extension of their work and for more general conditions, the uniqueness of the NE solution is provided in [15]. Scutari et al. [15] provide a unified framework for the noncooperative mutual information maximization problem for MIMO interference systems. A unified set of sufficient conditions guaranteeing the uniqueness of the NE and the convergence of asynchronous waterfilling algorithm is provided for square nonsingular channel matrices. The analysis is based on interpreting the MIMO waterfilling operator as a matrix projection onto the convex and closed set of covariance matrices. In [16], same authors extend their results for arbitrary channel matrices. However, these papers do not address the selection of discrete optimized signaling. The existence (or uniqueness) of the NE solution that is proven in [13,14] is valid either for convex or concave games or for positive definite covariance matrices that are well defined as a convex and closed set [15,16]. Cooperative and noncooperative algorithms for joint channel and power allocation chosen from the "discrete" strategy space are studied in [32] in the context of wireless mesh networks. However, the proposed noncooperative algorithm is suboptimal and one of the adaptation parameters (i.e., channel adaptation) is not followed after the first iteration.
Power minimization using distributed algorithms with transmit beamformer selection is challenging especially in ad hoc networks. Unlike power control games, in beamforming games there is no natural ordering of the actions [23]. In MIMO ad hoc networks operating in MUI environments, the interference at each user depends on the transmission parameters of the other users. The beamforming decision of each user reshapes the interference emitted to other links, in ways that may be difficult to predict. Changing the beamforming vector may reduce interference on some links, while other links may suffer from higher interference. The affected nodes will then change their own beamforming vectors, setting off an cascade of changes in the network. Moreover, if the node pairs belong to different regulation entities, the noncooperative node pairs may only want to minimize their own transmit power rather than the overall power.
The analysis for the selection of actions from the "discrete" codebook set and convergence analysis is still missing for joint transmit beamforming and power adaptation in the literature. To the best of authors' knowledge, the problem of joint discrete transmit beamforming and power adaptation has not been formalized in multiuser MIMO ad hoc networks. In this paper, we study a decentralized approach for optimizing the transmit beamformer and power levels using local information and reasonable computational burden. We consider total power minimization under a constant received target SINR constraint. Our contributions in this paper are twofold: First, we study an efficient cooperative beamforming algorithm for global power minimization problem with convergence analysis. For the cooperative algorithm, the amount of information to be exchanged between nodes will grow with the number of iterations. Second, we study a noncooperative regretmatching learning algorithm which jointly selects transmit beamformer and power to minimize the total power consumed by the network. The noncooperative solution reduces the amount of overhead by using only local information. We compare the performances of our proposed algorithms with the optimal global solution which is found by exhaustively searching the entire feasible strategy space.
The rest of this paper is organized as follows. Section 2 outlines the system model used in the paper. The optimization problem and its game theoretical interpretation are presented in Section 3. The cooperative wireless ad hoc network and noncooperative counterpart are investigated Sections 4 and 5, respectively. The performance evaluation of the proposed algorithms is provided in Section 6. Finally, Section 7 concludes the paper.
2 System model and concepts
In this paper, we consider a wireless ad hoc network consisting of multiple transmit and receive antenna node pairs as shown in Figure 1. All nodes are assumed to be using the same channel. The interference comes from the other node pairs which operate simultaneously on the same channels. In this ad hoc network model, there are N node pairs and each node pair m ∈ {1, 2, ..., N} consists of one transmitter node and one receiver node. Each transmitter and receiver node is equipped with T antennas. Each node has a unitnorm receive/transmit beamformer pair (w_{m}, t_{m}) with . The complex symbol stream transmitted is with E{b_{m}^{2}} = 1. The received symbol stream is .
Figure 1. Multiuser power control and limited feedback transmit beamforming scheme for MIMO ad hoc networks. (t_{k})_{m }represents the mth row of the kth user's transmitter vector t_{k}.
The received signal vector at the mth receiving node is given by
where H_{m,i }denotes the T × T MIMO channel between the ith transmitting node and the mth receiving node and is assumed to be quasistatic and P_{m }is the power of the mth transmitting node. The additive white Gaussian noise terms have identical covariance matrices σ^{2}I_{T }where σ^{2 }is the noise power and I_{T }is the T × T identity matrix. We note that different covariance matrices for noise will not affect the performance of the proposed algorithms. Note that the first term of the righthand side of (1) is the desired signal, whereas the second term is the interference from the other transmitting nodes.
As we are interested in the minimum achievable power, we consider the worst case where all node pairs always have some packets to transfer and all nodes in the network can transmit simultaneously. The network is assumed to be synchronous. The set of available codebook beamformers for the mth transmitting and receiving node pair is denoted by with cardinality ϒ. In a limited feedback beamforming system, the receiving node selects a transmit beamformer from the codebook and feeds back the index of the selected beamformer. Each node can select between ϒ transmit beamformer vectors. Let t_{m }∈ Δ_{m }be the selected transmit beamformer for the mth transmitting and receiving node pair. Denote Θ = [t_{1}, t_{2}, ..., t_{N}]^{T }and P = [P_{1}, P_{2}, ..., P_{N }]^{T }as the transmit beamformer selection and transmission power vectors for N nodes, respectively. The T × T the interference plus noise covariance matrix at the mth receiving node is
where Θ_{m }and P_{m }are the transmit beamformers and powers of nodes other than m.
An antenna beam pattern that adjusts the antenna gains to form nulls toward the directions of the interferers while keeping a constant gain toward the directions of the multipath of the intended receiver can be designed using receive antenna arrays. The minimum variance distortionless response beamformer [23,33] can adjust the array weights properly such that the sum of interference and noise is minimized. The normalized receive beamformer at mth receiving node is
where . The resulting received SINR at the mth receiving node due to desired transmitter of mth node pair is
where w_{m}^{2 }= t_{m}^{2 }= 1 for all m.
The proposed distributed algorithms attempt to achieve a target SINR by adjusting transmit powers. To construct a distributed iterative limited feedback beamforming scheme, let us first consider the case when there is only one node pair in the wireless network. The receiver selects the transmit beamformer from the codebook Δ_{1 }as
where is the optimal transmit beamformer selection for one node pair. Then, the receiver returns the index of the beamformer for transmit beamformer selection and the received "normalized" SINR, , through the lowrate feedback channel where since there is no interference in a single user system. The transmitter selects the transmitter beamformer in order to minimize its own transmission power P_{1}, where P_{1 }is updated as
where γ_{0 }is the target SINR value.
Consider now the case where N node pairs coexist in the wireless network. Note that for each node pair m, the value of received SINR, i.e., Γ_{m}, is a function of (Θ, P). Therefore, the transmit power of one node pair depends not only on the transmit beamformer it selects, but also on the transmit power and beamformer selection of other nodes in the network. Furthermore, in beamforming, if user i ≠ m changes its transmit beamformer t_{i }to increase its own SINR Γ_{i}, it can either increase or decrease Γ_{m}, the SINR of link m, depending on the relative positions of the nodes. Therefore, designing an optimal distributed algorithm which converges to a set of beamformers to minimize the overall transmit power while meeting target SINRs for all node pairs is not a straightforward task.
3 Optimization problem and game theoretical interpretation
The goal is to minimize the transmit power of all nodes m ∈ {1, 2, ..., N} under constant target SINR γ_{0}. The optimization problem can be defined as,
subject to Γ_{m }≥ γ_{0}, w_{m} = t_{m} = 1,
P_{min }< P_{m }≤ P_{max}, ∀m ∈ {1, 2, ..., N}, where P_{min }and P_{max }are the minimum and maximum transmit powers, respectively. We consider the above problem as a normal form game ∏ which can be mathematically defined by the triplet where is the finite set of players of the game, represents the set of all available actions for all the players and is the set of utility functions that the players associate with their strategies. The actions c_{m }∈ C_{m }for a player m are the selection of transmit powers P_{m }∈ [P_{min}, P_{max}] and the transmit beamformer t_{m }∈ Δ_{m}.
Players select actions to maximize their utility functions. One of the questions that arise is if there exists a convergence point, a set of strategies, in our case a set of beamforming selections Θ = [t_{1}, ..., t_{m}, ..., t_{N}]^{T }and power allocations P = [P_{1}, P_{2}, ..., P_{N}]^{T }from which no player would deviate. In game theory such a set of strategies is called a Nash equilibrium (NE). A NE for a game is a set of strategy profiles c = [c_{1}, c_{2}, ..., c_{N}] from which no player can increase his utility by unilateral deviations. A strategy profile (c_{m}, c_{m}) is a NE if and only if
where refers to the strategy profile in which the action of user m is changed from c_{m }to while the actions of all the other players in the game remain the same. In the following sections, we will discuss the scenarios where the node pairs are cooperative and noncooperative respectively in order to search for the best results and provide convergence guarantees.
4 Cooperative and noncooperative beamforming for MIMO ad hoc networks
4.1 Optimal (centralized) solution
In a wireless ad hoc network with a centralized agent, the transmit beamformers and the corresponding transmit powers can be jointly selected to minimize the total transmit power of all transmitting antennas as,
where and are the optimal transmit beamformer and power solutions, respectively. The transmit power P_{m }of mth node pair is defined as
where R_{m }is a function of (Θ_{m}, P_{m}) as shown in (2). A naive approach for solving the problem is to investigate all strategy profiles Θ = [t_{1}, ..., t_{m}, ...,t_{N }]^{T }exhaustively (note that for a given fixed strategy profile Θ, the corresponding power profile P can be computed using (10) for each individual node pair m). In order to compute (9), the centralized agent evaluates the total network power for possible beamforming vector combinations. For example, for a network size with 10 node pairs where each user has to select from a codebook of size beamformers, the search space is 16^{10 }strategy profiles. Consequently, finding the centralized transmit beamformer is cumbersome in largescale wireless ad hoc network. To alleviate the complexity problem, while maintaining good performance results, we propose two decentralized power minimization algorithms using cooperative and noncooperative techniques.
4.2 Cooperative power minimization using beamforming
In this section, we consider scenarios where all node pairs in the wireless network are cooperating. In a cooperative game, nodes in the network are able to coordinate and select the transmit beamformers accordingly. We want to find the optimal transmit beamformer and power assignments such that the total power by all the nodes in the network is minimized. The objective function can be written as
We assume that each user's utility function is (11). That is,
In other words, we model the game as an identical interest game which is a special case of potential games [34]. It is easy to verify that all identical interest games have at least one pure NE, which will represent any action profile that maximizes U_{network}(Θ, P) [14,32]. We analyze a cooperative power minimization algorithm (COPMA) which can converge to the optimal NE with arbitrarily high probability. This method is analogous to the decentralized negotiation method called adaptive play [14]. The key characteristic of COPMA is the randomness deliberately introduced into the decisionmaking process to avoid reaching a local solution. In COPMA, the choices of players (in our case transmit beamformer selections) lead the system to the optimal NE solution with arbitrarily high probability.
Motivated by Song et al. [32], COPMA can be implemented distributively as follows: Assume that each node pair m in the network has an unique ID_{m }and maintains two variables and which are the transmit power of the mth node pair prior to and after the change of transmit beamformer, respectively. The node pairs can be chosen randomly or in a roundrobin order for updating of the transmit beamformers. Whenever a node pair changes its strategy, it broadcasts a vector via a backbone network. After that, all the other node pairs will set , recalculate as the new transmit power and send the vector to the updating node pair m. Finally, the mth node pair will decide whether the new transmit beamformer should be kept or changed with some probability which depends on p_{current }and p_{updated }which are the total transmit power in the network prior to and after the random change of the transmit beamformer, respectively. Note that since p_{current }and p_{updated }are calculated by each node pair independently, the unique ID of each node provides a checklist to accurately add up transmit powers. For this paper, we assume that unique node IDs are built into each node and in network timing synchronization is perfect so that power updates are always received in the correct round. The detailed description of COPMA is provided as follows:
Initialization: For each transmitting and receiving pair m, the initial index of transmit beamformers for all node pairs is selected as one and the initial transmit powers are set as .
Repeat: Randomly choose a node pair m as the updating pair with probability 1/N. Denote t_{m}(n) ∈ Δ_{m }as the current transmit beamformer of the mth node pair at iteration n.
1. Set t_{m}(n) = t_{m}(n  1), . Calculate as in (10) .
2. To update node pair m, randomly choose a transmit beamformer, and calculate the transmit power required when the updated transmit beamformer is used, as in (10). Then, broadcast a data vector to all other node pairs .
3. After receiving the data vector, for each i,
 If P_{i }changes (due to change in interference perceived at the ith receiver), every other node pair sets and calculates its new transmit power from (10) and sets it to .
 If P_{i }does not change, and remain unchanged.
After and are updated for every other node pair in the network, send back the vector to node pair m.
4. Node pair m computes the current total network power as and updated total network power as with based on the received power values from all other node pairs .
5. For a smoothing factor τ > 0, set for the mth node pair with probability
i.e., the updating node pair m selects with probability (13).
6. The mth node pair broadcasts a notifying signal that contains the decision about whether the new transmit beamformer is kept. If not kept, every other node pair keeps
Until : Predefined number of iteration steps n = κ.
Note that step5 of the updating rule implies that if yields a better performance, i.e., (P_{updated } P_{current}) < 0, the mth node pair will change to updated beamformer with high probability. Otherwise, it will keep the current transmit beamformer with high probability. Note also that the tradeoff between COPMA's performance and convergence speed is controlled by the parameter. τ. Large τ represents extensive space search with slow convergence, whereas small τ represents restrained space search with fast convergence. The smoothing factor τ is selected to be a function of the number of iterations n such that as n increases, τ ↓ 0. For example, we chose τ inversely proportional to n^{2 }in our simulations. The longterm behavior of COPMA is characterized in the following theorem.
Theorem 1 : Assume that the objective of each node pair is defined as the sum power minimization in the network as defined in (9). Let Θ (k) = [t_{1}(k), t_{2}(k), ..., t_{N}(k)] denote the profile of choices at step (or iteration) k in COPMA and the optimal profile. COPMA converges to the optimal NE with arbitrarily high probability. In other words,
Proof The proof of Theorem 1 follows similar arguments as presented in [14,32,35].
Notice that the transmit beamformer selection with N players, each with codebook size, generates an Ndimensional Markovian chain on a finite state space with states or different profiles. We study the analysis for twoplayer games, i.e., N = 2 dimensional case as shown in Figure 2. The analysis can be easily extended for multiplayer games, i.e., for an Ndimensional Markovian chain.
Figure 2. Two players markov chain for COPMA.
Let t_{m }∈ Δ_{m }and t_{k }∈ Δ_{k }be the choices of two players say m and k, and without loss of generality assume that . The players m and k can choose a transmit beamformer from Δ. Let Θ_{i j }denote the state where the mth user selects the ith transmit beamformer and the nth user selects the jth transmit beamformer . At an arbitrary time instant, for any state of the Markovian chain, only one of the players can update their transmit beamformer. Therefore, for example in Figure 2, state can only transit into a state either in the same row or the same column. For any fixed τ > 0, the transition probability from state to state is given by
where Θ_{ij }and Θ_{lp }differ in exactly one transmit beamformer selection, i.e., Θ_{ij }≠ Θ_{lp }for i = l or j = p, τ is the smoothing factor of COPMA and P(Θ_{ij}) is the minimum total network power required to reach target SINR γ_{0 }for both users at state Θ_{ij }calculated using (10) for each user. If Θ_{ij }and Θ_{lp }are different in more than one position, then ℙ_{τ }(Θ_{lp} Θ_{ij}) = 0. In addition, ℙ_{τ }(Θ_{ij} Θ_{ij}) > 0 is always true. Therefore, for any fixed τ > 0, the induced Markov chain is irreducible and aperiodic.
The stationary distribution for each state can be obtained from the following balance equations (using the arrows in Figure 2):
for all and . Substituting (15) into (16) gives
Then, the stationary distribution of the induced Markov chain at step k is obtained as
for arbitrary state . Hence, from irreducibility and aperiodicity of the Markovian chain, we have
where . The result validates that COPMA converges to the optimal state with arbitrarily high probability for twoplayer (N = 2) case and the analysis can easily be extended for general multiplayer (N > 2) cases as well. ■
With the above theorem, the transmit beamformer and power level selections are shown to reach the optimal NE solution with arbitrarily high probability.
One disadvantage of cooperativebased algorithms is that the communication overhead incurred to calculate the total network power increases with the number of iterations. In the next section, we study a noncooperative learning algorithm using local information with less computations.
5 Regretmatchingbased joint transmit beamformer and power selection game (RMSG)
In this section, our goal is to obtain a distributed learning algorithm for joint transmit beamformer and power selection scheme in MIMO ad hoc networks that requires only local information for updates. We will use a utility function for noncooperative users. Note that the interaction among N "selfish" node pairs can be defined as noncooperative power minimization game where each node pair is attempting to find their own transmit beamformers to minimize their corresponding transmit powers. In the noncooperative joint iterative beamforming and power adaptation, the N node pairs care only about their own power minimizations exclusively, rather than accounting for the overall network power. Each player's utility function depends on the choice of the transmit beamformer and its own power, as well as on the other users' selections for transmit powers and beamformers via the perceived interference. Note that the noncooperative distributed beamforming algorithms for multiuser MIMO ad hoc networks lack the quality of "strategic complementarities" [36] that is found in power controlonly games [26]. It is thus not clear how to design an ordered set of actions for noncooperative beamforming games. Instead, we study a noncooperative learning algorithm called the regretmatching adaptive algorithm from [37], in which the players choose their actions based on their regret for not choosing particular actions in the past. The steadystate solution of the regretmatchingbased learning algorithm exhibits "no regret" and the probability of choosing a strategy is proportional to the player's "regret" for not having chosen other strategies.
Let denote the vector of all strategies or actions for user m, i.e., and t_{m}(i) denote the transmit beamformer vector selected by the mth user in iteration i. Define the average regret vector of user m for an action vector at iteration (or time) k as
In the regretmatchingbased joint transmit beamformer and power selection game (RMSG), each user m computes for every action t_{m }∈ Δ_{m }in all past steps when all other player's actions remain unchanged. Each player m updates its regret for every set of actions based on the following recursion formula:
At every step k > 1, each user m updates its own average regret vector for every strategy in .
In regret matching, after computing the average regret vector, , each user m chooses an action or strategy t_{m}(k), k > 1, according to probability distribution defined as
where [x]^{+ }equals x when x is positive and zero otherwise. Notice that in the regretmatching game, each user m chooses a strategy t_{m }∈ Δ_{m }at any step with a probability proportional to the average regret for not choosing that strategy t_{m }∈ Δ_{m }in the past steps. The detailed summary of RMSG using a GaussSeidel updating scheme [15] is given in Table 1 where κ is the predefined number of iterations.
Table 1. RegretMatchingbased joint transmit beamformer and power selection game (RMSG) algorithm
Every finite strategy game has a mixed strategy Nash equilibrium [30]. Using a good learning algorithm, any finite game can be shown to converge to a mixed strategy Nash equilibrium. Regretmatchingbased selection is distributed and requires limited information exchange between the users if the utility function is properly selected. The timeaveraged behavior of regretmatching game converges almost surely (with probability one) to the set of coarsecorrelated equilibrium [34,38]. Therefore, the joint transmit beamformer and power selections converges to a mixed strategy equilibrium solution. In fact, in our joint transmit beamformer and power selection game, the average regret of a user using regret matching becomes asymptotically zero, which is confirmed by our simulations.
The utility function of noncooperative or "selfish" users for the transmit beamformer and power selection game at iteration k is
Note that by using the above utility function, each user selects the transmit beamformer t_{m }∈ Δ_{m }to maximize its own "normalized" SINR, . The average regret in the recursion formula (21) can be updated locally as the best transmit beamformer is being selected.
6 Simulation results
In this section, we investigate the performance results of centralized optimization, COPMA, and noncooperative regretmatching (RMSG). We assume that the wireless ad hoc network has N homogeneous pairs where each pair has one transmitter node and one receiving node. Each entry in the channel matrix H_{m,k }∀_{m}, is assumed to be independent identically distributed complex Gaussian distribution with zero mean and unit variance. We consider a radio propagation channel with pathloss exponent ν = 4. This implies that the fading power is attenuated by where d_{m }is the distance between transmitter and receiver for mth node pair. The target SINR γ_{0 }is selected to be 10 dB. We assume that channels do not vary during the iterations. If channel conditions vary during an iteration, this will change the optimization problem and the proposed algorithms' performance degrades. However, depending on the network configuration and the parameters of the algorithm, like the smoothing factor τ for COPMA, the network optimizer can set the convergence steps to be as small as possible while trading against performance degradation in timevarying channels. The Grassmannian codebook of [21] is used for the simulation results. The codebook size is selected to be with T = 3 antennas for all users. P_{max }= 100 mW (20 dBm) and P_{min }= 1 mW (0 dBm) in our simulations. We assume six different transmit power levels: 1, 5, 20, 30, 50, and 100 mW motivated by the IEEE 802.11b standard in [39]. Note that the transmit powers are selected from this discrete power level set which corresponds to ceiling function of (10). The selected network topologies are assumed to be feasible for the given power levels [40]. The noise power is σ^{2 }= 3.16 × 10^{13 }W (95 dBm) which corresponds to approximate thermal noise power for a bandwidth of 20 MHz.
6.1 Comparison of centralized optimization, COPMA, and RMSG for N = 4 node pairs
We first consider a small wireless ad hoc network with 4 users, i.e., N = 4. All transmitting and receiving nodes are randomly located in a square of 30 m × 30 m area. We choose τ = 0.1/n^{2 }in our simulations, where n denotes the iteration step. The global optimum solution obtained by enumerating all feasible strategies, i.e., 16^{4 }profiles, is the performance benchmark. The maximum number of iterations is κ = 120 for COPMA and RMSG. The total power consumed by the network is shown in Figure 3. The global minimum power solution obtained by centralized optimization is the lower bound on the overall power consumed by the network. We observe that COPMA's performance improves over time and settles at the global optimum combination after 92 iterations. Note that 68 and 76% of the gain from using COPMA algorithm is realized within the first 59 and 83 iterations, respectively.
Figure 3. Total transmit power versus iteration with N = 4, T = 3, and .
RMSG algorithm discussed in Section 5 minimizes the total transmit power in the network defined by (9) using the utility function (23) in a noncooperative manner. Figure 3 also shows how the total power in the network varies over 120 iterations using RMSG. Note that RMSG yields inferior performance compared to COPMA in terms of the achieved overall power. However, the updating procedure is noncooperative and requires less overhead as the iterations continue. The total network power converges to a value of 135 mW on the 68th iteration whereas the centralized algorithm's solution requires 65 mW total network power. Steady state is reached when all the users select a transmit beamformer index with probability one.
Figures 4 and 5 depict the trajectories of transmit beamformer selection indices and power trajectories in COPMA for each user in the network topology. At the initialization step, each user starts with maximum power levels and first index of transmit beamformer selections. Then, each user updates iteratively following COPMA algorithm, until the optimum Nash equilibrium is achieved. Note that when the transmit beamformers Θ and power level vectors P converge in Figures 4 and 5, the corresponding overall transmit power obtained by COPMA is shown in Figure 3. The existence of NE and the convergence toward NE in COPMA are illustrated by the curves in Figures 4 and 5.
Figure 4. Transmit beamformer indexes versus iteration in COPMA with N = 4, T = 3, and .
Figure 5. Transmit powers versus iteration in COPMA with N = 4, T = 3, and .
Probability mass function (p.m.f): In this subsection, we take a look at the probability mass function (p.m.f) of the RMSG algorithm calculated in (22).
Figure 6 represents the change in the p.m.f after 1, 12, 50, and 100 iterations for one user. Initially, users choose the strategies, i.e., transmit beamformers, with equal probability. The strategies are represented by the indices 1 to in the xaxis and the probabilities of selecting these indices are given on the yaxis. After 12 iterations, the probability of choosing transmit beamformer index 9 is higher than that for any other transmit beamformer index, although the other probabilities for indices 3, 4, and 12 are not totally eliminated. After 25 iterations, all other probabilities except those of 4 and 9 are eliminated. A stationary point is reached when user 1 chooses transmit beamformer index 9 in the 100th iteration.
Figure 6. The probability distribution of RMSG for one of the users when N = 4.
6.2 Comparison of COPMA and RMSG for N = 10 node pairs
We now consider a larger wireless ad hoc network with N = 10 node pairs randomly located on a 100 m × 100 m area. The smoothing factor for COPMA is selected as τ = 200/n^{2 }in order to search more efficiently in this large strategy space. All other simulation parameters remain the same. An optimization problem with feasible points exceeding 2^{N }when N > 30 is very difficult to find [41]. The centralized approach is no longer feasible in this scenario due to the enormous strategy space of 16^{10 }profiles. Figure 7 shows the network topology and transmit beampatterns of COPMA with N = 10 users.
Figure 7. Node configuration and transmit beampatterns of COPMA with N = 10 users.
We again investigate both cooperative and regretmatching learning algorithms represented by COPMA and RMSG curves, where the maximum number of iterations is set to κ = 400 for COPMA and κ = 1500 for RMSG. COPMA's performance was found to be optimal solution for the 4 link network, so it provides a good benchmark to test the performance of RMSG. Figure 8 shows the total network power versus the number of iterations for COPMA and RMSG. This figure shows that RMSG's performance is within 75.56% of the COPMA value at the end of iterations. Furthermore, RMSG needs a larger amount of iterations compared to COPMA for the convergence. However, note that RMSG performs noncooperative updates for transmit beamformer and powers at each iteration and thus the amount of overhead is minimal.
Figure 8. Total transmit power versus iteration with N = 10, T = 3, and .
For the RMSG algorithm, the total power converges to a total network power of 0.2250 W from the 1,296th iteration. The joint selection of transmit beamformer indices and transmit powers reaches steady state when no user in the network deviates from its chosen strategy. The majority of users reach a steady state within 115 iterations. However, one user takes longer than 1,000 iterations to reach steady state.
Probability mass function (p.m.f): A similar figure for the p.m.f of RMSG for the user that takes longer convergence time than others is shown in Figure 9 for the large network size with N = 10. As can be seen in Figure 9, the probability of choosing index 16 is higher than other indices at iteration 500, but the probability of choosing index 5 and 12 is not totally eliminated even after 1,000 iterations. Since the network size is large, the learning process is slower to converge (around 1,332 iterations) to steadystate transmit beamformer indices, compared to the smaller network with N = 4 node pairs.
Figure 9. The probability distribution for one user when N = 10 in RMSG.
7 Conclusion
In this paper, we have considered both cooperative and noncooperative joint power control and beamforming in MIMO ad hoc networks using a game theoretic approach. Under constant SINR requirements, the joint transmit beamformer and power selection algorithms were studied in the context of total network power minimization. We first considered a cooperative case where all users collaborate with each other in order to minimize the overall power of the network. The game was formulated as an identical interest game, and a decentralized algorithm COPMA with high probability of convergence was proposed and analyzed. To reduce the required overhead incurred by the cooperative algorithm, we have also proposed a noncooperative solution which requires only local information. For our proposed noncooperative algorithm, users update their probabilities of choosing a transmit beamformer and power based on the "regret" of not choosing the other strategies. Numerical results illustrate the convergence properties of the proposed algorithms and their performance in terms of overall power minimization in the network.
Competing interests
The authors declare that they have no competing interests.
Note
This paper is presented in part at the IEEE Global Communications Conference 2010 (Globecom'10), Miami, FL, December 610, 2010.
References

IE TelatarL, Capacity of multiantenna Gaussian channel. Eur Trans Telecommun 10(6), 589–595 (1999)

AJ Paulraj, DA Gore, RU Nabar, H Bolcskei, An overview of MIMO communicationsa key to gigabit wireless. Proc IEEE 92(2), 198–218 (2004). Publisher Full Text

D Gesbert, S Hanly, H Huang, SS Shitz, O Simeone, W Yu, Multicell MIMO cooperative networks: a new look at interference. IEEE J Sel Areas Commun 28(9), 1380–1408 (2010)

H Sampath, P Stoica, A Paulraj, Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion. IEEE Trans Commun 49(12), 2198–2206 (2001). Publisher Full Text

DP Palomar, JM Cioffi, MA Lagunas, Joint TxRx beamforming design for multicarrier MIMO channels: a unified framework for convex optimization. IEEE Trans Signal Process 51(9), 2381–2401 (2003). Publisher Full Text

DP Palomar, JM Cioffi, MA Lagunas, Uniform power allocation in MIMO channels: a game theoretic approach. IEEE Trans Inf Theory 49(7), 1707–1727 (2003). Publisher Full Text

DP Palomar, MA Lagunas, JM Cioffi, Optimum linear joint transmitreceive processing for MIMO channels with QoS constraints. IEEE Trans Signal Process 52(5), 1179–1197 (2004). Publisher Full Text

F RashidFarrokhi, KJR Liu, L Tassiulas, Transmit beamforming and power control for cellular wireless systems. IEEE J Sel Areas Commun 16(10), 1437–1450 (1998)

M Schubert, H Boche, Solution of the multiuser downlink beamforming problem with individual SINR constraints. IEEE Trans Veh Technol 53(1), 18–28 (2004). Publisher Full Text

K Wong, RD Murch, KB Letaief, Performance enhancement of a multiuser MIMO wireless communication system. IEEE Trans Commun 50(12), 1960–1970 (2002). Publisher Full Text

F Farrokhi, L Tassiulas, KR Liu, Joint optimal power control and beamforming in wireless networks using antenna arrays. IEEE Trans Commun 46(10), 1313–1324 (1998). Publisher Full Text

S Ye, RS Blum, Optimized signaling for MIMO interference systems with feedback. IEEE Trans Signal Process 51(11), 2839–2848 (2003). Publisher Full Text

C Liang, KP Dandekar, Power management in MIMO adhoc networks: A gametheoretic approach. IEEE Trans Wirel Commun 6(4), 1164–1170 (2007)

G Arslan, MF Demirkol, Y Song, Equilibrium efficiency improvement in MIMO interference systems: a decentralized stream control approach. IEEE Trans Wirel Commun 6(8), 2984–2993 (2007)

G Scutari, DP Palomar, S Barbarossa, Competitive design of multiuser MIMO systems based on game theory: a unified view. IEEE J Sel Areas Commun 26(7), 1089–1103 (2008)

G Scutari, DP Palomar, S Barbarossa, The MIMO iterative waterfilling algorithm. IEEE Trans Signal Process 57(5), 1917–1935 (2009)

DJ Love, RW Heath, D Gesbert, BD Rao, M Andrews, An overview of limited feedback in wireless communication systems. IEEE J Sel Areas Commun 26(8), 1341–1365 (2008)

E Zeydan, DK Tureli, U Tureli, Iterative beamforming and power control for MIMO adhoc networks, in Proceedings of IEEE GLOBECOM'10

DC Popescu, DB Rawat, O Popescu, M Saquib, Gametheoretic approach to joint transmitter adaptation and power control in wireless systems. IEEE Trans Syst Man Cybern Part B: Cybern 40(3), 675–682 (2010)

C Lacatus, DC Popescu, Adaptive interference avoidance for dynamic wireless systems: a game theoretic approach. IEEE J Sel Topics Signal Process 1(1), 189–202 (2007)

DJ Love, RW Heath, Grassmannian beamforming for multipleinput multipleoutput wireless systems. IEEE Trans Inf Theory 49(10), 2735–2747 (2003). Publisher Full Text

MC Bromberg, BG Agee, Optimization of spatially adaptive reciprocal multipoint communication networks. IEEE Trans Commun 51(8), 1254–1257 (2003). Publisher Full Text

R Iltis, S Kim, D Hoang, Noncooperative iterative MMSE beamforming algorithms for adhoc networks. IEEE Trans Commun 54(4), 748–759 (2006)

MC Bromberg, Optimizing MIMO multipoint wireless networks assuming Gaussian otheruser interference. IEEE Trans Inf Theory 49(10), 2352–2362 (2003). Publisher Full Text

J Lee, YG Li, Iterative limited feedback beamforming for MIMO adhoc networks, Proceedings of IEEE GLOBECOM'09

CU Saraydar, NB Mandayam, DJ Goodman, Efficient power control via pricing in wireless data networks. IEEE Trans Commun 50(2), 291–303 (2002). Publisher Full Text

S Buzzi, HV Poor, Joint receiver and transmitter optimization for energyefficient CDMA communications. IEEE J Sel Areas Commun 26(3), 459–472 (2008)

G Scutari, DP Palomar, S Barbarossa, Optimal linear precoding strategies for wideband noncooperative systems based on game theory part I: Nash equilibria. IEEE Trans Signal Process 56(3), 1230–1249 (2008)

G Scutari, DP Palomar, S Barbarossa, Optimal linear precoding strategies for wideband noncooperative systems based on game theory part II: algorithms. IEEE Trans Signal Process 56(3), 1250–1267 (2008)

D Fudenberg, J Tirole, Game Theory (MIT Press, Cambridge, MA, 1991)

JB Rosen, Existence and uniqueness of equilibrium points for concave nperson games. Econometrica 33, 529–534 (1965)

Y Song, C Zheng, Y Fang, Joint channel and power allocation in wireless mesh networks: a game theoretical perspective. IEEE J Sel Areas Commun 26(7), 1149–1159 (2008)

J Chang, L Tassiulas, F RashidFarrokhi, Joint transmitter receiver diversity for efficient space division multiaccess. IEEE Trans Wirel Commun 1(1), 16–27 (2002). Publisher Full Text

JR Marden, G Arslan, JS Shamma, in Regret based dynamics: convergence in weakly acyclic games, ed. by . AAMAS '07: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (ACM, New York, NY, USA, 2007), pp. 1–8

HP Young, Individual Strategy and Social Structure (Princeton University Press, Princeton, NJ, 1998)

P Milgrom, J Roberts, Rationalizability, learning and equilibrium in games with strategic complementarities. Econometrica 58(6), 1255–1277 (1990). Publisher Full Text

S Hart, A MasColell, A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000). Publisher Full Text

HP Young, Strategic Learning and its Limits (Oxford University Press, Oxford, UK, 2004)

M Gruteser, A Jain, J Deng, F Zhao, D Grunwald, Exploiting physical layer power control mechanisms in IEEE 802.11b network interfaces. Technical report CUCS92401, Department of Computer Science, University of CO Boulder (2001)

Z Han, FR Farrokhi, KJR Liu, Joint power control and blind beamforming over wireless networks: a cross layer approach. EURASIP J Appl Signal Process 5, 751–761 (2004)

S Boyd, L Vandenberghe, Convex Optimization (Cambridge University Press, Cambridge, 2004)