Chapter 18: Random Number Generation

CANCOR

Given an input array of deviate values, generates a canonical correlation array.

Required Arguments

DEVT — Array of size m by n containing m sequence elements for each of n variables.   (Input)

CORR — Array of size n by n containing canonical correlation array.   (Output)

FORTRAN 90 Interface

Generic:                              CANCOR(DEVT, CORR)

Specific:                             The specific interface names are S_CANCOR and D_CANCOR.

Description

CANCOR generates a canonical correlation matrix from an arbitrarily distributed multivariate deviate sequence DEVT with n deviate variables, m elements in each deviate sequence, and a Gaussian Copula dependence structure.

Subroutine CANCOR first maps each of the j = 1, …, n input deviate sequences DEVT(k = 1, …, m, j) into a corresponding sequence of variates, say v_kj, where a variate is the value of the empirical cumulative distribution function, CDF(x), defined as the probability that the random deviate variable X ≤ x.  Each variate v_kj is then mapped to a standard normal N(0,1) deviate z_kj using the inverse standard normal CDF, z_kj = ANORIN(v_kj).  Finally, the standard covariance estimator

$$C_{ij} = \frac{1}{m} \sum_{k=1}^{m} z_{ki}\, z_{kj}$$

is used to calculate the canonical correlation matrix CORR, where C_ij = CORR(I,J).
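
The three steps described above (empirical CDF, inverse normal CDF, sum of products) can be sketched in a few lines of Python. This is only an illustration of the idea, not the CANCOR implementation: the names normal_scores and canonical_correlation are hypothetical, the plotting position rank/(m+1) is an assumed way of keeping the empirical CDF strictly inside (0,1), the covariance is assumed to be the mean of products, ties in the input are not treated specially, and the final rescaling to a unit diagonal is an added convenience.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # N(0,1); inv_cdf plays the role of ANORIN


def normal_scores(column):
    """Map one deviate sequence to N(0,1) scores via its empirical CDF."""
    m = len(column)
    order = sorted(range(m), key=lambda k: column[k])
    scores = [0.0] * m
    for rank, k in enumerate(order, start=1):
        # Assumption: empirical CDF value taken as rank/(m+1), so 0 < v < 1.
        scores[k] = _STD_NORMAL.inv_cdf(rank / (m + 1))
    return scores


def canonical_correlation(devt):
    """devt holds m rows of n values; returns an n-by-n nested list."""
    m, n = len(devt), len(devt[0])
    z = [normal_scores([row[j] for row in devt]) for j in range(n)]
    # Mean of products of the N(0,1) scores.
    cov = [[sum(a * b for a, b in zip(z[i], z[j])) / m for j in range(n)]
           for i in range(n)]
    # Rescale so that the diagonal is exactly 1 (added convenience).
    return [[cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5 for j in range(n)]
            for i in range(n)]


# Two perfectly monotone columns: the off-diagonal element is 1 up to rounding.
devt = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
corr = canonical_correlation(devt)
```

Because only the ranks of each column enter the calculation, any strictly increasing change of scale in a column leaves the result unchanged.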

If a multivariate distribution has Gaussian marginal distributions, then the standard “empirical” correlation matrix given above is “unbiased”, i.e. an accurate measure of dependence among the variables.  When the marginal distributions depart significantly from Gaussian, i.e. are skewed or flattened, the empirical correlation may become biased.  One way to remove this bias is to map the non-Gaussian marginal deviates to N(0,1) deviates (first to empirically derived marginal CDF variate values, then through the inverse standard normal CDF, as described above) and to calculate the standard empirical correlation matrix from these N(0,1) deviates.  The resulting “canonical correlation” matrix thereby avoids the bias that would occur if the empirical correlation matrix were extracted from the non-Gaussian marginal deviates directly.
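
The bias argument can be illustrated with a small experiment, sketched here in Python under stated assumptions. Correlated Gaussian pairs with a chosen correlation rho are distorted by strictly increasing maps (an exponential and a cube, both arbitrary choices for illustration), which makes the marginals strongly non-Gaussian. The helper names pearson and normal_scores are hypothetical, and rank/(m+1) is an assumed form of the empirical CDF.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def pearson(x, y):
    """Standard empirical correlation coefficient of two sequences."""
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def normal_scores(x):
    """Empirical CDF (assumed rank/(m+1)) followed by the inverse normal CDF."""
    m = len(x)
    order = sorted(range(m), key=lambda k: x[k])
    z = [0.0] * m
    for rank, k in enumerate(order, start=1):
        z[k] = _STD_NORMAL.inv_cdf(rank / (m + 1))
    return z


rng = random.Random(12345)
rho = 0.8
x, y = [], []
for _ in range(5000):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x.append(a)
    y.append(rho * a + math.sqrt(1.0 - rho ** 2) * b)

# Strictly increasing distortions: the dependence is unchanged, the marginals are not.
skew_x = [math.exp(2.0 * v) for v in x]
skew_y = [v ** 3 for v in y]

raw = pearson(skew_x, skew_y)  # empirical correlation of the distorted data
canon_gauss = pearson(normal_scores(x), normal_scores(y))
canon_skew = pearson(normal_scores(skew_x), normal_scores(skew_y))
print(raw, canon_gauss, canon_skew)
```

In this sketch the correlation computed from normal scores is the same for the Gaussian and the distorted data, since the distortions preserve ranks, and it stays close to rho, whereas the empirical correlation of the distorted data falls well below rho.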

The canonical correlation matrix may be of value in such applications as Markowitz portfolio optimization, where an unbiased measure of dependence is required to evaluate portfolio risk, defined in terms of the portfolio variance which is in turn defined in terms of the correlation among the component portfolio instruments.
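
To make the link between correlation and portfolio risk concrete, the following Python sketch evaluates the usual portfolio variance, the sum over i and j of w_i w_j s_i s_j C_ij, where w are the instrument weights, s their standard deviations, and C a correlation matrix such as the one returned in CORR. The helper portfolio_variance and all numbers are made up for illustration and are not part of the library.

```python
def portfolio_variance(weights, vols, corr):
    """Sum of w_i * w_j * s_i * s_j * C_ij over all instrument pairs."""
    n = len(weights)
    return sum(weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
               for i in range(n) for j in range(n))


weights = [0.5, 0.5]          # hypothetical portfolio weights
vols = [0.2, 0.1]             # hypothetical standard deviations
corr = [[1.0, -0.5],
        [-0.5, 1.0]]          # hypothetical correlation matrix

variance = portfolio_variance(weights, vols, corr)

# With perfect positive correlation the variance is the square of the weighted volatility.
perfect = portfolio_variance(weights, vols, [[1.0, 1.0], [1.0, 1.0]])
```

A biased correlation estimate enters this sum directly, which is why the choice of dependence measure affects the computed risk.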

The utility of the canonical correlation derives from the observation that a “copula” multivariate distribution with uniformly distributed deviates (corresponding to the CDF probabilities associated with the marginal deviates) may be mapped to arbitrarily distributed marginals, so that an unbiased dependence estimator derived from one set of marginals (N(0,1)-distributed marginals) can be used to represent the dependence associated with arbitrarily distributed marginals. The “Gaussian Copula” (whose variate arguments are derived from N(0,1) marginal deviates) is a particularly useful structure for representing multivariate dependence.
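
The mapping from a Gaussian Copula to arbitrary marginals can be sketched as follows in Python. This is an illustration under assumptions, not the library's generator: cholesky and gaussian_copula_sample are hypothetical helpers, the 2 by 2 correlation matrix is made up, and the two marginals (exponential and uniform) are chosen only because their inverse CDFs are one-line formulas.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def cholesky(a):
    """Lower-triangular factor low with low * low^T = a (a positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def gaussian_copula_sample(low, rng):
    """One vector of dependent uniform variates in (0,1)."""
    n = len(low)
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]           # independent N(0,1)
    z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
    return [_STD_NORMAL.cdf(v) for v in z]                # correlated N(0,1) to CDF values


corr_in = [[1.0, 0.6], [0.6, 1.0]]    # hypothetical input correlation
low = cholesky(corr_in)
rng = random.Random(2024)
variates = [gaussian_copula_sample(low, rng) for _ in range(2000)]

# Invert the variates through the desired marginal CDFs.
expo = [-math.log(1.0 - u[0]) for u in variates]   # Exponential(1) deviates
unif = [10.0 * u[1] for u in variates]             # Uniform(0, 10) deviates
```

The dependence is fixed entirely by the correlated normal step; the choice of inverse CDF in the last two lines changes only the marginal distributions.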

Example: Using Copulas to Imprint and Extract Correlation Information

This example uses subroutine RNMVGC to generate a multivariate sequence gcdevt whose marginal distributions are user-defined and which is imprinted with a user-specified input correlation matrix corrin. It then uses subroutine CANCOR to extract an output canonical correlation matrix corrout from this multivariate random sequence.

This example illustrates two useful copula-related procedures. The first procedure generates a random multivariate sequence with arbitrary user-defined marginal deviates whose dependence is specified by a user-defined correlation matrix. The second procedure is the inverse of the first: an arbitrary multivariate deviate input sequence is first mapped to a corresponding sequence of empirically derived variates, i.e. cumulative distribution function values representing the probability that each random variable has a value less than or equal to the input deviate. The variates are then inverted, using the inverse standard normal CDF, to N(0,1) deviates. Finally, a canonical correlation matrix is extracted from the multivariate N(0,1) sequence using the standard sum of products.

This example demonstrates that subroutine RNMVGC correctly embeds the user-defined correlation information into sequences with arbitrary marginal distributions: the canonical correlation matrix extracted from these sequences differs from the original correlation matrix by a small relative error, which generally decreases as the number of multivariate sequence vectors increases.

 

      use rnmvgc_int
      use cancor_int
      use anorin_int
      use chiin_int
      use fin_int
      use amach_int
      use rnopt_int
      use rnset_int
      use umach_int
      use chfac_int
      implicit none

      integer, parameter :: lmax=15000, nvar=3
      real corrin(nvar,nvar), tol, chol(nvar,nvar), gcvart(nvar), &
         gcdevt(lmax,nvar), corrout(nvar,nvar), relerr
      integer irank, k, kmax, kk, i, j, nout

      data corrin /&
        1.0, -0.9486832, 0.8164965, &
        -0.9486832, 1.0, -0.6454972, &
        0.8164965, -0.6454972,  1.0/

      call umach (2, nout)

      write(nout,*)
      write(nout,*) "Off-diagonal Elements of Input " // &
         "Correlation Matrix: "
      write(nout,*)

      do i = 2, nvar
         do j = 1, i-1
            write(nout,'(" CorrIn(",i2,",",i2,") = ", f10.6)') &
               i, j, corrin(i,j)
         end do
      end do

      write(nout,*)
      write(nout,*) "Off-diagonal Elements of Output Correlation " // &
         "Matrices calculated from"
      write(nout,*) "Gaussian Copula imprinted multivariate sequence:"

!     Compute the Cholesky factorization of CORRIN.
      tol=amach(4)
      tol=100.0*tol
      call chfac (corrin, irank, chol, tol=tol)

      kmax = lmax/100
      do kk = 1, 3
         write (nout, '(/" # vectors in multivariate sequence:  ", '// &
             'i7/)') kmax

         call rnopt(1)
         call rnset (123457)

         do k = 1, kmax

!     Generate an array of Gaussian Copula random numbers.
            call rnmvgc (chol, gcvart)
            do j = 1, nvar
!     Invert Gaussian Copula probabilities to deviates.
               if (j .eq. 1) then
!     ChiSquare(df=10) deviates:
                  gcdevt(k, j) = chiin(gcvart(j), 10.e0)
               else if (j .eq. 2) then
!     F(dfn=15,dfd=10) deviates:
                  gcdevt(k, j) = fin(gcvart(j), 15.e0, 10.e0)
               else
!     Normal(mean=0,variance=1) deviates:
                  gcdevt(k, j) = anorin(gcvart(j))
               end if
            end do
         end do

!     Extract Canonical Correlation matrix.
         call cancor (gcdevt(:kmax,:), corrout)

         do i = 2, nvar
            do j = 1, i-1
               relerr = abs(1.0 - (corrout(i,j) / corrin(i,j)))
               write(nout,'(" CorrOut(",i2,",",i2,") = ", '// &
                 'f10.6, "; relerr = ", f10.6)') &
                 i, j, corrout(i,j), relerr
            end do
         end do
         kmax = kmax*10
      end do
      end

Output

 

 Off-diagonal Elements of Input Correlation Matrix:

 CorrIn( 2, 1) =  -0.948683
 CorrIn( 3, 1) =   0.816496
 CorrIn( 3, 2) =  -0.645497

 Off-diagonal Elements of Output Correlation Matrices calculated from
 Gaussian Copula imprinted multivariate sequence:

 # vectors in multivariate sequence:      150

 CorrOut( 2, 1) =  -0.940215; relerr =   0.008926
 CorrOut( 3, 1) =   0.794511; relerr =   0.026927
 CorrOut( 3, 2) =  -0.616082; relerr =   0.045569

 # vectors in multivariate sequence:     1500

 CorrOut( 2, 1) =  -0.947443; relerr =   0.001308
 CorrOut( 3, 1) =   0.808307; relerr =   0.010031
 CorrOut( 3, 2) =  -0.635650; relerr =   0.015256

 # vectors in multivariate sequence:    15000

 CorrOut( 2, 1) =  -0.948267; relerr =   0.000439
 CorrOut( 3, 1) =   0.817261; relerr =   0.000936
 CorrOut( 3, 2) =  -0.646208; relerr =   0.001101


