Accepted Manuscript Some new results on the eigenvalues of complex non-central Wishart matrices with a rank-1 mean Prathapasinghe Dharmawansa PII: S0047-259X(16)00084-1 DOI: http://dx.doi.org/10.1016/j.jmva.2016.03.003 Reference: YJMVA 4101 To appear in: Journal of Multivariate Analysis Received date: 10 September 2014 Please cite this article as: P. Dharmawansa, Some new results on the eigenvalues of complex non-central Wishart matrices with a rank-1 mean, Journal of Multivariate Analysis (2016), http://dx.doi.org/10.1016/j.jmva.2016.03.003 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. Some New Results on the Eigenvalues of Complex Non-central Wishart Matrices with a Rank-1 Mean Prathapasinghe Dharmawansa1 Department of Electronic and Telecommunication Engineering, University of Moratuwa, Moratuwa, Sri Lanka Abstract LetW be an n×n complex non-central Wishart matrix with m (≥ n) degrees of freedom and a rank-1 mean. In this paper, we consider three problems related to the eigenvalues of W. To be specific, we derive a new expression for the cumulative distribution function (c.d.f.) of the minimum eigenvalue (λmin) of W. The c.d.f. is expressed as the determinant of a square matrix, the size of which depends only on the difference m−n. This further facilitates the analysis of the microscopic limit of the minimum eigenvalue. The microscopic limit takes the form of the determinant of a square matrix with its entries expressed in terms of the modified Bessel functions of the first kind. We also develop a moment generating function based approach to derive the probability density function of the random variable tr(W)/λmin, where tr(·) denotes the trace of a square matrix. Moreover, we establish that, as m,n → ∞ with m − n fixed, tr(W)/λmin scales like n3. Finally, we find the average of the reciprocal of the characteristic polynomial det[zIn+W], | arg z| < π, where In and det[·] denote the identity matrix of size n and the determinant, respectively. Keywords: Demmel condition number, Eigenvalues, Hypergeometric function of two matrix arguments, Non-central Wishart distribution, Random matrix 2010 MSC: 60B20, 62H10, 33C15 Email address: prathapa@uom.lk (Prathapasinghe Dharmawansa) 1This work was conducted while the author was with the Department of Statistics, Sequoia Hall, 390 Serra Mall, Stanford University, Stanford, CA 94305. Preprint submitted to Journal of Multivariate Analysis March 5, 2016 *Manuscript Click here to download Manuscript: Manuscript_R4.pdf Click here to view linked References 1. Introduction The eigenvalues of random matrices are known to have far-reaching impli- cations in various scientific disciplines. Finite dimensional properties of the eigenvalues of Wishart type random matrices are of paramount importance in classical multivariate analysis (Anderson [3], Muirhead [49]), whereas recent5 multivariate statistical investigations have focused on establishing the asymp- totic properties of the eigenvalues (Johnstone [39], Onatski et al. [52]). Various links between the eigenvalues of random matrices and statistical physics, combi- natorics and integrable systems have been established over the last few decades (see e.g., Forrester [26], Mehta [47] and references therein). Apart from these10 areas, random matrices, especially matrices with complex Gaussian elements, have also found new applications in signal processing and wireless communica- tions (Chiani et al. [16], Heath Jr. and Paulraj [33], Jin et al. [38], Kang and Alouini [40, 41], Maaref and Aı¨ssa [45], Narasimhan [50], Oestges et al. [51], Or- donez et al. [53], Palomar et al. [54], Telatar [59], Tulino and Verdu´ [60], Zanella15 et al. [65]). The majority of those studies focus on random matrix ensembles derived from zero mean Gaussian matrices. However, random matrices derived from non-zero mean Gaussian matrices have been traditionally an area of interest in multivariate analysis (Anderson [3], Constantine [17], James [37], Muirhead20 [49]). Moreover, mathematical objects such as zonal polynomials (Hua [36], James [37]) and hypergeometric functions of matrix arguments (Herz [34], Kha- tri [43]) have been introduced in multivariate analysis literature to facilitate further analysis of such non-central random matrices. Interestingly, these non- central matrices have also been referred to as random matrices with external25 sources in the literature of physics (Bre´zin and Hikami [11, 12, 13], Zinn-Justin [66]). In this respect, the classical orthogonal polynomial based characterization of the eigenvalues of random matrices (Mehta [47]) has been further extended to encompass multiple orthogonal polynomials in Bleher and Kuijlaars [8, 9]. Alter- natively, capitalizing on a contour integral approach due to Kazakov (Kazakov30 2 [42]), the authors in Ben Arous and Pe´che´ [6], Bre´zin and Hikami [11, 12] have introduced a double contour integral representation for the correlation kernel of the eigenvalue point process of non-central random matrices. Some recent contributions on this matter include Bassler et al. [5], Forrester [27]. One of the salient features common to those latter studies is that they exclu-35 sively focus either on spiked correlation or mean model. It is noteworthy that these two models are mathematically related to each other (Bleher and Kuijlaars [7]). As we are well aware of, the characterization of the joint eigenvalue distri- bution of non-central random matrices2 involves the hypergeometric function of two matrix arguments (James [37]). It turns out that one of the argument ma-40 trices becomes reduced-rank in the presence of a spiked mean/correlation model. Specifically, when the spike is of rank one, an alternative representation of the hypergeometric function of two matrix arguments has recently been discovered independently by Mo (Mo [48]), Wang (Wang [62]) and Onatski (Onatski et al. [52]). The key contribution there amounts to the representation of the hyper-45 geometric function of two matrix arguments with a rank-1 argument matrix in terms of an infinite series involving a single contour integral. This repre- sentation has been subsequently used to further characterize the asymptotic behaviors of the eigenvalues of non-central random matrices (Mo [48], Wang [62]). Further generalization of the contour integral representation given in Mo50 [48], Wang [62], Onatski et al. [52] to an arbitrary hypergeometric function with two matrix arguments having a rank-1 argument matrix has been reported in Dharmawansa and Johnstone [22]. In this paper, we analyze three problems pertaining to the eigenvalues of a finite dimensional complex non-central Wishart matrix with a rank-1 mean55 matrix3. Let 0 < λ1 ≤ · · · ≤ λn be the ordered eigenvalues of an n× n complex 2Here the term “non-central random matrices” refers to non-central Gaussian and Wishart matrices. 3This is also known as the shifted mean chiral Gaussian ensemble with β = 2 (i.e., the complex case) (Forrester [27]). 3 non-central Wishart matrix W with m degrees of freedom and a rank-1 mean. We are interested in the following three problems. 1. The characterization of the cumulative distribution function (c.d.f.) of the minimum eigenvalue ofW as the determinant of a square matrix, the size60 of which depends on the difference of the number of degrees of freedom and n (i.e., m− n). 2. The statistical characterization of the random quantity tr(W)/λ1 with tr(·) denoting the trace of a square matrix. 3. The statistical average of the reciprocal of the characteristic polynomial65 det[zIn+W], | arg z| < π, with det[·] and In denoting the determinant of a square matrix and the n× n identity matrix, respectively. The above quantities have found many applications in contemporary wireless communications systems. In particular, complex non-central Wishart matrices with a rank-one mean arise in multiple-input multiple-output (MIMO) chan-70 nels characterized by strong line-of-sight components (i.e., Rician fading) (Kang and Alouini [40, 41]). Therefore, the functionals of the eigenvalues of complex non-central Wishart matrices with a rank one mean have been instrumental in MIMO signal processing (Jin et al. [38], Kang and Alouini [40, 41], Maaref and Aı¨ssa [45], Zanella et al. [65]). For example, the distribution of the minimum75 eigenvalue is important in characterizing the performance of MIMO multichan- nel beamforming (MB) systems4 (Narasimhan [50], Ordonez et al. [53], Palomar et al. [54], Jin et al. [38]). Specifically, the global performance of an MB system is dominated by the performance of the weakest link (i.e., the link corresponding to the minimum eigenmode transmission). The quantity tr(W)/λ1 is of paramount80 importance in the designing of adaptive multiantenna transmission techniques (Heath Jr. and Paulraj [33]) and the modeling of physical multiantenna trans- mission channels (Oestges et al. [51]). Moreover, condition numbers of this form 4MIMO MB systems are also known as spatial multiplexing MIMO systems with CSI (channel state information) in the literature (Ordonez et al. [53]). 4 have been introduced to mutiple antenna spectrum sensing in cognitive radio (see e.g., Debbah and Couillet [19] and references therein). More recent appli-85 cations of the eigenvalues of Wishart matrices include the design and analysis of massive MIMO systems (see e.g., Hoydis et al. [35] and references therein). The first problem has a straightforward solution in the form of the deter- minant of a square matrix of size n × n (Jin et al. [38], Maaref and Aı¨ssa [45], McKay [46], Zanella et al. [65]). This stems from the determinant repre-90 sentation of the hypergeometric function of two matrix arguments due to Khatri (Khatri [43]) (see also Gross and Richards [32]). However, in certain cases, it is convenient to have an expression with the determinant of a square matrix of size m − n. Therefore, in this work, by leveraging the knowledge of classical orthogonal polynomials, we derive an alternative expression for the c.d.f. of the95 minimum eigenvalue which involves the determinant of a square matrix of size m − n + 1. This new form is highly desirable when the difference between m and n is small irrespective of their individual magnitudes. In such a situation, this new expression circumvents the analytical complexities associated with the above straightforward solution which requires the evaluation of the determinant100 of an n× n square matrix. This key representation, in turn, facilitates the fur- ther analysis of the so-called microscopic limit of the minimum eigenvalue (i.e., the limit when m,n → ∞ such that m − n is fixed) which is known to have a Fredholm determinant representation (Ben Arous and Pe´che´ [6]). Our results reveal that this microscopic limit coincides with the corresponding limit in the105 central Wishart case. The random quantity of our interest in the second problem is commonly known as the Demmel condition number in the literature of numerical analysis (Demmel [20]). As opposed to the case corresponding to the central Wishart ma- trices (Muirhead [49]), tr(W) and λ1/tr(W) are no longer statistically indepen-110 dent. Furthermore, a direct Laplace transform relationship between λ1/tr(W) and the probability density of the minimum eigenvalue of W has not been re- ported in the literature. However, such a relationship among these random quantities in the case of central Wishart matrices has been reported in Dhar- 5 mawansa et al. [24], Krishnaiah and Schuurmann [44]. Therefore, we introduce115 a moment generating function (m.g.f.) based framework to solve the second problem. In particular, using a classical orthogonal polynomial approach, we derive the m.g.f. of the random variable of our interest in terms of a single inte- gral involving the determinant of a square matrix of size m−n+1. Upon taking the direct Laplace inversion of the m.g.f. we then obtain an exact expression120 for the probability density function (p.d.f.). The remarkable fact of having the determinant of a square matrix of size m − n + 1 makes it suitable to be used when the relative difference between m and n is small. For instance, in the special case of m = n, the p.d.f. simplifies to an expression involving a single infinite summation. Moreover, we have determined the asymptotic scaled limit125 of tr(W)/λ1 as m,n→∞ with m−n fixed. It turns out that, under the above asymptotic setting, the random quantity tr(W)/λ1 scales like n3. Statistical averages akin to the third problem are closely related to some problems of the classical number theory (see e.g., Borodin and Strahov [10] for a comprehensive list of references in this respect). A generalized framework130 based on the duality between certain matrix ensembles has been proposed in Desrosiers [21] to solve certain problems involving the averages of the reciprocals of characteristic polynomials pertaining to non-central Wishart matrices. How- ever, the third problem of our interest does not seem to be consistent with that framework, since the specific parameters associated with our problem do not sat-135 isfy the requirements in [21]. This particular problem has not been addressed in a more recent work by Forrester (Bassler et al. [5]) on the averages of char- acteristic polynomials for shifted mean chiral Gaussian ensembles. Therefore, again following the classical orthogonal polynomial approach, here we derive a new expression for this particular average. The resultant expression turns out140 to have a single infinite series. This is not surprising, since in the case of a cen- tral Wishart matrix the corresponding answer depends only on the number of characteristic polynomials rather than the size of the random matrix (Borodin and Strahov [10], Desrosiers [21], Fyodorov [29], Fyodorov and Strahov [30]). The rest of this paper is organized as follows. We begin Section 2 by deriving145 6 a new p.d.f. for the eigenvalues of a complex non-central Wishart matrix with a rank-1 mean. In Section 3, we use the new joint eigenvalue p.d.f. to derive the c.d.f. of the minimum eigenvalue in terms of the determinant of a square matrix of size m−n+1. We also establish the microscopic limit of the minimum eigenvalue. Section 4 addresses the problem of statistical characterization of150 the random quantity tr(W)/λ1 by deriving the corresponding m.g.f. and p.d.f. expressions. Moreover, we show that, as m,n → ∞ with m − n fixed, the random variable tr(W)/λ1 scales like n3. Section 5 derives the average of the reciprocal of the characteristic polynomial det[zIn +W], | arg z| < π. 2. Preliminaries155 Let us first present some results related to the p.d.f. of a complex non-central Wishart matrix. In what follows, we use (·)∗ to denote the conjugate transpose of a matrix and ||A||2F to represent tr(A∗A) where A ∈ Cm×n. Moreover, we use EA (·) to denote the mathematical expectation with respect to A. Theorem 1. Let X ∈ Cm×n be distributed as CNn,m (M, Im ⊗ In) where M ∈ Cn×n with m ≥ n. Then W = X∗X has a complex non-central Wishart distri- bution Wn (m, In,M∗M) with p.d.f. (James [37]) fW(W) = e−tr(M ∗M) Γ˜n(m) (det [W])m−ne−tr(W)0F˜1 (m;M∗MW) (1) where Γ˜n(m) = π m(m−1) 2 n∏ i=1 Γ(m− i + 1) and 0F˜1 (·; ·) denotes the complex hy- pergeometric function of one matrix argument. In particular, for a Hermitian positive definite n× n matrix A, we have (James [37]) 0F˜1 (p;A) = ∞∑ k=0 1 k! ∑ κ Cκ(A) [p]κ where Cκ(·) is the complex zonal polynomial5, κ = (k1, . . . , kn), with ki’s being160 non-negative integers, is a partition of k such that k1 ≥ · · · ≥ kn ≥ 0 and 5The zonal polynomial Cκ(A) is a symmetric, homogeneous polynomial of degree k in the eigenvalues of A. However, the specific definition of the zonal polynomial is not given here 7 ∑n i=1 ki = k. Also [n]κ = ∏n i=1(n− i+ 1)ki with (a)n = a(a+ 1) · · · (a+ n− 1) denoting the Pochhammer symbol. The following theorem is due to James (James [37]). Theorem 2. The joint density of the ordered eigenvalues 0 < λ1 ≤ · · · ≤ λn of W is given by James [37] fΛ (λ1, . . . , λn) = Km,ne−tr(M ∗M)∆2n(λ) n∏ i=1 λm−ni e −λi 0F˜1 (m;Λ,M∗M) (2) where Km,n = 1∏n i=1 Γ(m− i + 1)Γ(n− i+ 1) , Λ = diag(λ) with λ = (λ1, . . . , λn) and ∆n(λ) = ∏ 1≤i −1, the generalized Laguerre polynomial of degree M , L (ρ) M (z), is given by Szego¨ [57] L (ρ) M (z) = (ρ+ 1)M M ! M∑ j=0 (−M)j (ρ+ 1)j zj j! , (7) with the kth derivative satisfying dk dzk L (ρ) M (z) = (−1)kL(ρ+k)M−k (z). (8) Also L(ρ)M (z) satisfies the following contiguous relationship L (ρ−1) M (z) = L (ρ) M (z)− L(ρ)M−1(z). (9) Definition 4. For a negative integer −M , we have the following relation (−M)j =  (−1)j M !(M−j)! for j ≤M0 for j > M. (10) Lemma 3. Following Gradshteyn and Ryzhik [31, Eq. 7.414.7] and Andrews et al. [4, Corollary 2.2.3], for j, k ∈ {0, 1, 2, . . .}, we can establish∫ ∞ 0 xje−xL(k)M (x)dx = j! M ! (k − j)M . 10 The following compact notation has been used to represent the determinant of an M ×M block matrix: det [ai,1 bi,j−1]i=1,...,M j=2,...,M = ∣∣∣∣∣∣∣∣∣∣∣∣ a1,1 b1,1 b1,2 · · · b1,M−1 a2,1 b2,1 b2,2 · · · b2,M−1 ... ... ... . . . ... aM,1 bM,1 bM,2 · · · bM,M−1 ∣∣∣∣∣∣∣∣∣∣∣∣ . 3. Cumulative Distribution of the Minimum Eigenvalue Here we derive a new expression for the c.d.f. of the minimum eigenvalue λmin of W with a rank-1 mean. By definition, the c.d.f. of λmin is given by Fλmin(x) = Pr (λ1 < x) = 1− Pr (λ1 ≥ x) (11) where Pr (λ1 ≥ x) = ∫ x≤λ1≤···≤λn<∞ fΛ (λ1, . . . , λn) dλ1 · · ·dλn. (12) The following theorem gives the c.d.f. of λmin. Theorem 4. Let W ∼ Wn (m, In,M∗M), where M is rank-1 and tr(M∗M) = µ. Then the c.d.f. of the minimum eigenvalue of W is given by Fλmin(x) = 1− (n+ α− 1)! e−nx det [ (−µ)i−1ψi(µ, x) L(j−2)n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 (13) where α = m− n, ψi(µ, x) = 1 (α + i+ n− 2)! ∞∑ k=0 (xµ)k1F1 (α+ k;α+ n+ i+ k − 1;−µ) k!(α+ i+ n− 1)k , and 1F1(a; c; z) is the confluent hypergeometric function of the first kind.185 Proof. See Appendix A. Remark 5. Alternatively, we can express ψi(µ, x) as ψi(µ, x) = e−µ (α+ i+ n− 2)!Φ3 (n+ i− 1, n+ α+ i− 1;µ, xµ) (14) 11 where Φ3(a, c;x, y) = ∑∞ i=0 ∑∞ j=0(a)ix iyj/(c)i+ji!j! is the confluent hypergeo- metric function of two variables (Erde´lyi [25, Eq. 5.7.1.23]). In the special case of α = 0 (i.e., m = n), (13) admits the following simple form Fλmin(x) = 1− e−nx ∞∑ k=0 (xµ)k k!(n)k 1F1(k;n+ k;−µ) = 1− e−µ−nxΦ3 (n, n;µ, xµ) (15) which coincides with what we have derived in Dharmawansa and McKay [23, Eq. 32/39] purely based on a matrix integral approach7.190 In addition, it is not difficult to show that, for µ = 0, (13) simplifies to Forrester and Hughes [28, Eq. 3.19] Fλmin(x) = 1− e−nx det [ L (j−1) n+i−j(−x) ] i,j=1,...,α . (16) It is worth noting that (13) provides an efficient way of evaluating the c.d.f. of λmin particularly for small values of α. Moreover, since the algebraic complexity depends only on n and the difference of m and n (i.e., m − n), this result is instrumental in evaluating the microscopic limit of λmin. Alternative expressions8 for the c.d.f. of λmin have been reported in Jin et al.195 [38, Eq. 15], Maaref and Aı¨ssa [45, Eq. 18]. and Zanella et al. [65, Eq. 11]. The key difference between our result and those alternative expressions is that the latter involve the determinants of n × n matrices. Therefore, they are not amenable to asymptotic analysis as m and n grow large, but their difference does not.200 Figure 1 compares the analytical c.d.f. result for the minimum eigenvalue of non-central Wishart matrix with simulated data. Analytical curves are com- puted based on Theorem 4. As can be seen from the figure, our analytical 7Since the results given in Dharmawansa and McKay [23] are valid for an arbitrary co- variance matrix with α = 0, one has to assume the identity covariance to obtain the above results. 8These expressions are valid for an arbitrary rank mean matrix. 12 0 0.1 0.2 0.3 0.4 0.5 0 0.2 0.4 0.6 0.8 1 x F λ m in(x ) Simulation Analytical α=1 n=50 α=2 n=40 α=3 n=30 Figure 1: Comparison of simulated data points and the analytical c.d.f.s of λmin for various values of n and α with µ = 10. results match with the simulated data thus validating our theorem. We note that the infinite summation in (13) has been truncated to 5 terms in each of the205 calculation. This fact in turn demonstrates the fast convergence of the given infinite series. Figures 2, 3 and 4 further demonstrate the effects of n, α and µ, respectively, on the c.d.f. of λmin for different parameter configurations. For certain applications (Wang and Giannakis [63]), the behavior of the210 c.d.f. of λmin at the origin is of paramount importance. Therefore, fig. 5 further demonstrates that particular behavior. Here the infinite series (13) has been truncated to 10 terms. Remark 6. The dynamics of fig. 4 suggests that the increasing µ increases the λmin. However, it is noteworthy that for rank-one MIMO Rician channel, the215 λmin decreases with the increasing K (Jin et al. [38], Kang and Alouini [40, 41]). This inconsistency between the two models can be explained using the relations (5) and (6). 13 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 x F λ m in(x ) Simulation Analytical n=10 n=20 n=30 n=40 Figure 2: Illustration of the behavior of the c.d.f. of λmin for various values of n. The results are shown for µ = 10 and α = 2. 3.1. Asymptotic Characterization (Microscopic Limit) of the Smallest Eigen- value220 Here we investigate the so called microscopic limit of the c.d.f. of λmin. To be precise, we would consider the c.d.f. of suitably scaled λmin for fixed α when m,n→∞. The following corollary gives the microscopic limit. Corollary 5. As m and n tend to ∞ such that α = m − n is fixed, the scaled minimum eigenvalue nλ1 converges in distribution to a random variable X with the c.d.f. FX(x). In particular, we have lim n→∞Fnλ1 (x) = FX(x) = 1− e −x det [ Ii−j(2 √ x) ] i,j=1,...,α . (17) Proof. See Appendix B . Clearly, the above asymptotic expression does not depend on µ. Moreover,225 this limiting c.d.f. coincides with the limiting c.d.f. corresponding to central Wishart case in Forrester and Hughes [28, Eq. 3.33]. The advantage of the asymptotic formula given in corollary 5 is that it provides an easy to use expression which compares favorably with finite n results. 14 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 x F λ m in(x ) Simulation Analytical α=3 α=2 α=1 α=0 Figure 3: Illustration of the behavior of the c.d.f. of λmin for various values of α. The results are shown for µ = 10 and n = 20. 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 x F λ m in(x ) Simulation Analytical µ=0 Analytical µ=5 Analytical µ=30 n=10 n=5 Figure 4: Illustration of the behavior of the c.d.f. of λmin for various values of µ. The results are shown for n = 5 and n = 10 with α = 2. 15 0 0.02 0.04 0.06 0.08 0.1 0 0.002 0.004 0.006 0.008 0.01 x F λ m in(x ) Simulation Analytical n=10 n=20 n=30 Figure 5: Illustration of the behavior of the c.d.f. of λmin at the origin for various values of n. The results are shown for µ = 10 and α = 2. To further highlight this fact, in Fig. 6, we compare the analytical asymptotic230 p.d.f. derived in corollary 5 with simulated data points. Having analyzed the behavior of the minimum eigenvalue of W, let us now move on to determine the distribution of the random variable tr(W)/λ1. 4. The Distribution of tr(W)/λ1 Here we study the distribution of the quantity V = tr(W) λ1 = ∑n j=1 λj λ1 . (18) It turns out that this quantity is intimately related to the distribution of the minimum eigenvalue of W given the constraint tr(W) = 1 (i.e., fixed trace) (Chen et al. [15]). To be precise, the latter is distributed as 1/V . Apart from that, the most notable application of the distribution of V is the so- called “smoothed analysis of condition numbers” (Spielman and Teng [56]). For a given function g : Cm×n → R+ (e.g., the 2−norm condition number), A ∼ CNm,n(M, σ2Im ⊗ In) with 0 < σ ≤ 1 and M ∈ Cm×n being arbitrary such that either tr (M∗M) = 1 or ||M||2 ≤ √n is satisfied, under the smoothed 16 0 2 4 6 8 10 0 0.2 0.4 0.6 0.8 1 x F X (x) Simulation Analytical Asymptotic m= 33, n=30 m = 75, n=70 m = 22, n=20 Figure 6: Comparison of the analytical asymptotic c.d.f. of λmin and simulated data points for µ = 10. analysis framework, a typical problem is to study the behavior of (Wschebor [64], Sankar et al. [55], Cucker et al. [18], Bu¨rgisser et al. [14]) sup M EA (g(A)) (19) where || · ||2 denote the 2-norm. For mathematical tractability, sometimes it is assumed that the matrix M is of rank one [64]. Bounds on the quantity (19) have been derived in the literature when g(A) defines various condition numbers (see, e.g., Sankar et al. [55], Cucker et al. [18], Bu¨rgisser et al. [14] and references therein). Among those condition numbers, the one introduced by James Demmel (Demmel [20]) plays an important role in understanding the behaviors of other condition numbers arising in different contexts. For a rectangular matrix X ∈ Cm×n, the function g defined by [18] g(X) = ||X||F ||X†||2 (20) 17 with X† denoting the Moore-Penrose inverse, gives the Demmel condition num- ber9. In particular, for the matrix of our interest X ∼ CNm,n(M, Im⊗ In) with m ≥ n, (20) specializes to Dharmawansa et al. [24] g(X) = √∑n j=1 λj λ1 = √ V where λ1 ≤ · · · ≤ λn are the ordered eigenvalues ofW = X∗X. In light of these235 developments we can clearly see that the distribution of V is of great importance in performing the smoothed analysis on the Demmel condition number. Having understood the importance of the variable V in (18), we now focus on deriving its p.d.f. when the matrix W has a rank-1 mean. For this purpose, here we adopt an approach based on the m.g.f. of V . We have the following key240 result. Theorem 6. Let W ∼ Wn (m, In,M∗M), where M is rank-1 and tr(M∗M) = µ. Then the p.d.f. of V is given by f (α) V (v) =(n− 1)! e−µ vn(n+α) L−1 { e−ns s(n−1)(n+α+1) R(s, v, µ) } (21) where R(s, v, µ) = det [( − µ sv )i−1 φi(µ, s, v) L (j) n+i−1−j(−s) ] i=1,...,α+1 j=2,...,α+1 φi(µ, s, v) = ∞∑ k=0 ai(k) k! ( µ sv )k 1F1 ( n2 + nα+ k + i− 1;n+ i+ k + α− 1; µ v ) ai(k) = (n+ i− 1)(n 2 + nα+ i− 2)! (n+ i+ α− 2)! (n+ i)k(n+ i− 2)k(n2 + nα+ i− 1)k (n+ i− 1)k(n+ i+ α− 1)k and L−1(·) denotes the inverse Laplace transform. Proof. See Appendix C. 9This generalizes the condition number definition given in Demmel [20] tom×n rectangular matrices. 18 Although further simplification of (21) seems intractable for general matrix dimensions m and n, we can obtain a relatively simple expression in the im-245 portant case of square matrices (i.e., m = n), which is given in the following corollary. Corollary 7. For α = 0, (21) becomes f (0) V (v) = n(n 2 − 1)e−µ(v − n)n2−2v−n2 × ∞∑ k=0 (n2)k (n)kk! (µ v )k 3F3 ( n+ 1, n− 1, n2 + k;n, n+ k, n2 − 1;µ ( 1− n v )) ×H(v − n) (22) where H(z) denotes the unit step function and 3F3(a1, a2, a3; c1, c2, c3; z) is the generalized hypergeometric function (Erde´lyi [25]). We also note that, for µ = 0 (i.e., when W is a central Wishart matrix), (21) simplifies to f (α) V (v) = n!(n2 + nα− 1)! (n+ α− 1)!vn(n+α) × L−1 { e−ns s(n−1)(n+α+1) det [ L (j+1) n+i−j−1(−s) ] i,j=1,...,α } (23) which coincides with the corresponding result given in Dharmawansa et al. [24,250 Corollary 3.2]. Figure 7 compares the analytical p.d.f. f (0)V derived in Corollary 7 with simulated data points corresponding to n = 5, 7 and 10. The agreement between the analytical and simulation results is clearly evident from the figure. Note that we have used as few as 4 terms in the infinite summation (22) in each255 calculation. Moreover, figures 8 and 9 demonstrate the effects of µ on the p.d.f. of V corresponding to α = 0 case. Although we have a wide range of µ values, we have used only 4 terms in the infinite summation in each calculation. Remark 7. Since the random quantity V is scale invariant and for fixed m,n, µ ∝ K, it is expected that the behavior of V for different K with respect to260 rank-one MIMO Rician model is consistent with figs. 8 and 9. 19 0 100 200 300 400 0 0.5 1 1.5 2 2.5 3 3.5 4 x 10 −3 v f V(0 ) (v ) Simulation Analytical n=5 n=7 n=10 Figure 7: Comparison of simulated data points and the analytical p.d.f f (0) V (v) (i.e., α = 0) for different matrix dimensions with µ = 10. The above exact finite dimensional results pertaining to V have inherent algebraic complexity. For instance, obtaining an exact finite dimensional p.d.f. of V for a general value of α seems a formidable task. To circumvent this difficulty, it is natural to study the asymptotic behavior of the random variable265 V . The following corollary gives the asymptotic behavior of V for fixed α when m and n tend to ∞. Corollary 8. The scaled random variable V/n3 converges in distribution to the random variable 1/X as m and n tend to ∞ with α fixed. More specifically, for fixed α, we have lim n→∞FV/n 3(x) = F1/X(x) = e− 1 x det [ Ij−i ( 2√ x )] i,j=1,...,α . (24) Proof. We are aware that, as m,n → ∞ with α fixed, nλ1 converges in dis- tribution to X and tr(W)/n2 converges in probability to 110. Therefore, we can invoke the Slutsky’s lemma (Van Der Vaart [61]) to establish the fact that270 10The latter statement can easily be verified by considering the limiting m.g.f. of tr(W)/n2 as n,m→∞ with α fixed. 20 0 100 200 300 400 500 0 0.5 1 1.5 2 2.5 3 3.5 4 x 10 −3 v f V(0 ) (v ) Simulation Analytical µ=10 µ=30 µ=50 µ=100 Figure 8: Illustration of the behavior of f (0) V (v) for various values of µ with n = 5. 0 200 400 600 800 1000 0 1 2 3 4 5 x 10 −4 v f V(0 ) (v ) Simulation Analytical µ=30 µ=50 µ=100 µ=150 Figure 9: Illustration of the behavior of f (0) V (v) for various values of µ with n = 10. 21 n3/V converges in distribution to X . Now the final result follows by using the continuous mapping theorem and Corollary 5. It is noteworthy that the limiting c.d.f. of V/n3 does not depend on µ. Therefore, V/n3 should have the same asymptotic limiting c.d.f. in the case corresponding to central Wishart matrices as well (see Akemann and Vivo [2,275 Eq. 4.34] for the result corresponding to central Wishart case). 5. The Average of the Reciprocal of a Certain Characteristic Poly- nomial Here we consider the problem of determining the average of the reciprocal of a certain characteristic polynomial with respect to a complex non-central280 Wishart density with a rank-one mean. This particular problem corresponding to complex central Wishart matrices has been solved in Mehta [47], Fyodorov and Strahov [30]. A general framework to derive such averages based on duality relations has been proposed in Desrosiers [21]. However, the duality relation given in Desrosiers [21, Proposition 8] does not seem to apply here, since the285 stringent technical requirements for the validity of that formula are not satisfied by the parameters in our model of interest. Moreover, this particular case has not been addressed in a recent detailed analysis on the averages of characteristic polynomials by Forrester [27]. Therefore, in what follows, we derive the average of one of the basic forms of the reciprocal of the characteristic polynomial. The290 most general form, however, is not investigated here. Let us consider the following average EW ( 1 det[zIn +W] ) = EΛ  n∏ j=1 1 z + λj  , | arg z| < π, (25) the value of which is given in the following theorem. Theorem 9. Let W ∼ Wn (m, In,M∗M), where M is rank-1 and tr(M∗M) = µ. Then the average in (25) is given by EW ( 1 det[zIn +W] ) = zα ∞∑ k=0 (−µ)kΨ(k + n+ α;α + 1; z), | arg z| < π (26) 22 where Ψ(a; c; z) is the confluent hypergeometric function of the second kind. Proof. See Appendix D. Figures 10, 11 and 12 demonstrate the effect of each parameter (i.e., n, α295 and µ) on the average EW (1/ det[zIn +W]). The agreement between the an- alytical and simulation results is clearly evident from the figure. This verifies the accuracy of our Theorem 9. Remark 8. The behavior of EW (1/ det[zIn +W]) for different K with the rank-one MIMO Rician model is expected to be inconsistent with that of fig.300 12 due to (5) and (6). 6. Conclusions Here we have addressed three problems related to the eigenvalues of a com- plex non-central Wishart matrix W with a rank-one mean. In particular, new expressions have been derived for the c.d.f. of the minimum eigenvalue, the305 p.d.f. of tr(W)/λ1, and the statistical average EW (1/ det[zIn +W]). We have used a classical orthogonal polynomial based approach to derive the main re- sults in this paper. One of the key advantages of this approach is that the dimensionality of the main results depends on α = m− n. This α dependency is pivotal in deriving the microscopic limit of the minimum eigenvalue as well310 as establishing the stochastic convergence of the Demmel condition number. It turns out that, in the limit, the above two random quantities (once suitably scaled) do not depend on the rank-one mean, whereas the corresponding finite dimensional results depend on the rank-one mean through µ = tr(M∗M). This rank-one mean property also helps us express the average EW (1/ det[zIn +W])315 in terms of a single infinite summation. 7. Acknowledgments The author would like to thank the anonymous reviewers for their valuable comments. The author would also like to thank Iain Johnstone, Yang Chen, 23 2 4 6 8 10 10−15 10−10 10−5 z E W ( 1 /d et[ z I n+ W ] ) Simulation Analytical n=5 n=7 n=10 Figure 10: Comparison of simulated data points and the analytical average EW (1/det[zIn +W]) for different matrix dimensions with α = 2 and µ = 10. 3 4 5 6 7 8 9 10 0 0.5 1 1.5 2 2.5 3 3.5 4 x 10 −6 z E W ( 1 /d et[ z I n+ W ] ) Simulation Analytical α=5 α=10 α=15 Figure 11: Illustration of the behavior of EW (1/ det[zIn +W]) for various values of α with n = 5 and µ = 10. 24 4 5 6 7 8 0.5 1 1.5 2 2.5 x 10 −6 z E W ( 1 /d et[ z I n+ W ] ) Simulation Analytical µ=10 µ=20 µ=30 Figure 12: Illustration of the behavior of EW (1/det[zIn +W]) for various values of µ with n = 5 and α = 5. Matthew McKay, and Minhua Ding for helpful discussions. This work was320 supported in part by NIH grant R01 EB 001988 and the Simons foundation Math + X program. Appendix A. Proof of Theorem 4 Since the joint p.d.f. is symmetric in λ1, . . . , λn, we can write (12) as Pr (λ1 ≥ x) = 1 n! ∫ [0,∞)n fΛ (λ1 + x, . . . , λn + x) dλ1 · · · dλn. Now it is convenient to use (3) to arrive at Pr (λ1 ≥ x) = Kn,α n! e−µ−nx µn−1 n∑ k=1 ∫ [0,∞)n 0F1 (α+ 1;µ(λk + x))∏n i=1 i6=k (λk − λi) × n∏ i=1 (λi + x)αe−λi∆2n(λ)dλ1 · · · dλn. (A.1) 25 Since each term in the above summation contributes the same amount, we may write (A.1) as Pr (λ1 ≥ x) = Kn,α(n− 1)! e−µ−nx µn−1 ∫ [0,∞)n 0F1 (α+ 1;µ(λ1 + x))∏n i=2 (λ1 − λi) × n∏ i=1 (λi + x)αe−λi∆2n(λ)dλ1 · · ·dλn. Using the decomposition ∆2n(λ) = ∏n i=2(λ1 − λi)2∆2n−1(λ), we can rewrite the above multiple integral after relabeling the variables, λ1 = λ, λi = yi−1, i = 2, . . . , n, as Pr (λ1 ≥ x) = Kn,α(n− 1)! e−µ−nx µn−1 ∫ [0,∞) 0F1 (α+ 1;µ(λ+ x)) (λ+ x)αe−λ × (−1)(n−1)αQn−1 (λ,−x, α) dλ (A.2) where Qn (a, b, α) := ∫ [0,∞)n n∏ i=1 (a− yi)(b− yi)αe−yi∆2n(y)dy1 · · ·dyn. (A.3) As shown in Appendix E, we can solve the above multiple integral in closed-form to yield Qn (a, b, α) = Kn,α (b− a)α det [ L (0) n+i−1(a) L (j−2) n+i+1−j(b) ] i=1,...,α+1 j=2,...,α+1 (A.4) where Kn,α = (−1)n+α(n+α) ∏α+1 i=1 (n+ i− 1)! ∏n−1 i=0 i!(i+ 1)!∏α−1 i=1 i! . Therefore, using (A.4) in (A.2) with some algebraic manipulation, we obtain Pr (λ1 ≥ x) = (−1)n+1 (n+ α− 1)! α! e−µ−nx µn−1 det [ ζi(x) L (j−2) n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 (A.5) where ζi(x) = ∫ ∞ 0 0F1 (α+ 1;µ(λ+ x)) e−λL (0) n+i−2(λ)dλ. 26 The remaining task is to evaluate the above integral, which does not seem to have a simple closed-form solution. Therefore, we expand the hypergeometric function with its equivalent power series and use Corollary 3 to arrive at ζi(x) = ∞∑ l=0 ∞∑ k=0 µl+kxk k!l!(α+ 1)l+k ∫ ∞ 0 λle−λL(0)n+i−2(λ)dλ = ∞∑ l=0 ∞∑ k=0 µl+kxk k!(α+ 1)l+k (−l)n+i−2 (1)n+i−2 . (A.6) Following (10), we observe that the quantity (−l)n+i−2 is non-zero only when l ≥ n + i − 2. Therefore, we shift the summation index l with some algebraic manipulation to yield ζi(x) = (−1)n+iµn+i−2α! (n+ i+ α− 2)! ∞∑ k=0 (xµ)k k!(n+ i+ α− 1)k × 1F1(n+ i− 1;n+ α+ i+ k − 1;µ) (A.7) where we have used the relation (α + 1)k+i+n+l−2 = (α+ i+ n− 2)! α! (α+ i+ n+ k − 1)l(α+ i+ n− 1)k. Substituting (A.7) back into (A.5) with some algebra then gives Pr (λ1 ≥ x) = (n+ α− 1)!e−nx det [ (−µ)i−1ψi(µ, x) L(j−2)n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 (A.8) where we have used the Kummer relation (Erde´lyi [25]), 1F1(a; c; z) = ez1F1(c− a; c,−z). Finally, using (A.8) in (11) gives the c.d.f. of the minimum eigenvalue,325 which concludes the proof. Appendix B. Proof of Corollary 5 It is convenient to start with (A.5) which we rewrite with the help of (A.6) as Pr (λ1 ≥ x) = (−1)n+1 e −µ−nx α!µn−1 ∞∑ k=0 ∞∑ l=0 µk+lxl(−k)n−1 l!(α+ 1)k+l × det [ (n)α (n)i−1 (−k)n+i−2 (−k)n−1 L (j−2) n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 . 27 We may further rewrite it as Pr (λ1 ≥ x) = e −µ−nx α!µn−1 ∞∑ k=n−1 ∞∑ l=0 µk+lxl k! l!(α+ 1)k+l(k − n+ 1)! × det [ (n)α (n)i−1 (−k)n+i−2 (−k)n−1 L (j−2) n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 . (B.1) Now we use the decomposition (−k)n+i−2 (−k)n−1 = (−1) i−1 i−2∏ p=0 (k − n+ 1− p) (B.2) to obtain Pr (λ1 ≥ x) = e −µ−nx α!µn−1 ∞∑ k=n−1 ∞∑ l=0 µk+lxlk! l!(α+ 1)k+l(k − n+ 1)! × det [ (−1)i−1 (n)α (n)i−1 i−2∏ p=0 (k − n+ 1− p) L (j−2) n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 . (B.3) Since the first infinite summation begins with k = n− 1, we may reorganize the sum with respect to k to yield Pr (λ1 ≥ x) = (n)αe−µ−nx ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+k × det [ (−1)i−1 (n)i−1 i−2∏ p=0 (k − p) L(j−2)n+i−j(−x) ] i=1,...,α+1 j=2,...,α+1 . (B.4) 28 At this juncture, we focus on further simplification of the columns involving the generalized Laguerre polynomials. To this end, we use (7) to obtain Pr (λ1 ≥ x) = (n)αe−µ−nx∏α j=1(j − 1)! ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+k × det [ (−1)i−1 (n)i−1 i−2∏ p=0 (k − p) (n+ i− 2)! (n+ i− j)! n+i−j∑ kj=0 (−n− i+ j)kj (j − 1)kj (−x)kj (kj)!  i=1,...,α+1 j=2,...,α+1 . (B.5) Further manipulation in this form is highly undesirable due to the i, j dependent summations upper limits in the 2, . . . , α + 1 columns of the determinant. To circumvent this problem, we use the factorization (−n− i+ j)kj = (−n− i+ j)kj (−n− α− 1 + j)kj (−n− α− 1 + j)kj = (n+ i− j)! (n+ α+ 1− j)! (−n− α− 1 + j)kj α−i∏ p=0 (cj − p), (B.6) where cj = n + α + 1 − j − kj , in (B.5) with some algebraic manipulation to yield Pr (λ1 ≥ x) = (n)α e−µ−nx∏α j=1(j − 1)!(n+ α− j)! ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+k × det [ (−1)i−1 (n)i−1 i−2∏ p=0 (k − p) (n+ i− 2)! n+α+1−j∑ kj=0 (−n− α− 1 + j)kj (j − 1)kj (−x)kj (kj)!  i=1,...,α+1 j=2,...,α+1 (B.7) 29 from which we obtain Pr (λ1 ≥ x) = (n)α e −µ−nx∏α j=1(j − 1)! ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+k × n+α−1∑ k1=0 n+α−2∑ k2=0 · · · n∑ kα=0 α∏ j=1 (−n− α+ j)kj (j)kj (−x)kj (kj)! × det [ (−1)i i−1∏ p=0 (k − p) (n+ p)2 α−i−1∏ p=0 (c˜j − p) ] i=0,...,α j=1,...,α (B.8) where c˜j = n + α − j − kj and in the second equality, we have translated the indices i = 1, . . . , α+1 to i = 0, . . . , α and j = 2, . . . , α+1 to j = 1, . . . , α. Note that, under the current index assignment, an empty product is interpreted as unity. Keeping in mind that we are interested in the limit as n → ∞, we may further simplify the above determinant with the help of Dharmawansa et al. [24, Lemma A.1] to yield Pr (λ1 ≥ x) = (n)α e −µ−nx∏α j=1(j − 1)! ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+k × n+α−1∑ k1=0 n+α−2∑ k2=0 · · · n∑ kα=0 α∏ j=1 (−n− α+ j)kj (j)kj (−x)kj (kj)! Ξ(n, k, c˜) (B.9) where Ξ(n, k, c˜) = ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ 1 + o(1) c˜α1 c˜ α 2 · · · c˜αα −k n2 + o ( 1 n2 ) c˜α−11 c˜ α−1 2 · · · c˜α−1α k(k−1) n2(n+1)2 + o ( 1 n4 ) c˜α−21 c˜ α−2 2 · · · c˜α−2α ... ... ... . . . ... (−1)α−2∏α−3p=0 (k−p)(n+p)2 + o ( 1n2(α−2) ) c˜21 c˜22 · · · c˜2α (−1)α−1∏α−2p=0 (k−p)(n+p)2 c˜1 c˜2 · · · c˜α (−1)α∏α−1p=0 (k−p)(n+p)2 1 1 · · · 1 ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ (B.10) with c˜ = (c˜1, . . . , c˜α) and for non-zero g(x), f(x) = o(g(x)) is equivalent to lim x→∞ f(x)/g(x) = 0. 30 Now let us consider the expression Pr (λ1 ≥ x/n), which can be written as Pr ( λ1 ≥ x n ) = (n)α e−µ−x∏α j=1(j − 1)! ∞∑ k=0 ∞∑ l=0 µk+lxl(n)k k!l!(n)α+l+knl × n+α−1∑ k1=0 n+α−2∑ k2=0 · · · n∑ kα=0 α∏ j=1 (−n− α+ j)kj (j)kj (− xn )kj (kj)! Ξ(n, k, c˜). (B.11) Assuming that we have the freedom to take the limits term by term and observ- ing that only the terms corresponding to l = 0 contribute to a non-zero limit as n→∞, we take the limit of (B.11) as n→∞ to yield lim n→∞Pr ( λ1 ≥ x n ) = e−µ−x∏α j=1(j − 1)! ∞∑ k=0 µk k! ∞∑ k1=0 ∞∑ k2=0 · · · ∞∑ kα=0 α∏ j=1 xkj (j)kjkj ! lim n→∞Ξ(n, k, c˜). (B.12) We now focus on the limiting value of the remaining determinant Ξ(n, k, c˜). Since each of the terms c˜j, j = 1, . . . , α, contains n, a straightforward term-by- term limit will give an indeterminate form. To circumvent this problem, we try to simplify the determinant as much as possible before taking the limits. To this end, we expand the determinant using its first column as Ξ(n, k, c˜) = α+1∑ i=1 (−1)i+1Ξ1,i(n, k, c˜)M1,i (B.13) where M1,i is the minor corresponding to the ith element of the first column. As such, we can represent the M1,i as the determinant of an α× α matrix as M1,i = ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ c˜α1 c˜ α 2 · · · c˜αα c˜α−11 c˜ α−1 2 · · · c˜α−1α ... ... . . . ... c˜α−i+21 c˜ α−i+2 2 · · · c˜α−i+2α c˜α−i1 c˜ α−i 2 · · · c˜α−iα ... ... . . . ... 1 1 · · · 1 ∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣ . (B.14) 31 A few observations on the evaluation of this determinant are in order now. Clearly, by invoking the factor theorem, it can be easily seen that there exists α(α − 1)/2 factors of the form c˜i − c˜j with 1 ≤ i < j ≤ α. Most importantly, each of such factors is free of n (i.e., because c˜i − c˜j = j + kj − i − ki). Since the original minor consists of monomials of degree ν1 given by ν1 = 1 + · · ·+ α− (α− i+ 1) = 12(α 2 − α+ 2i− 2), (B.15) the other factor term should be of degree ν2, which is given by ν2 = ν1 − 12α(α − 1) = i− 1. (B.16) Therefore, we can write M1,i =  ∏ 1≤i