Chapter 2: Regression

GCSCP

Generates centered variables, squares, and crossproducts.

Required Arguments

X — NRX by NVAR matrix containing the data.   (Input)

XMEAN — Vector of length NVAR containing the means of the variables.   (Input)

CSCP — NRX by NVAR * (NVAR + 3)/2 matrix containing the centered variables, squares, and crossproducts.   (Output)

Columns                                  Description

1 to NVAR                                Centered variables

NVAR + 1 to 2 * NVAR                     Squared variables

2 * NVAR + 1 to NVAR * (NVAR + 3)/2      Crossproducts

If X is not needed, X and the first NVAR columns of CSCP may occupy the same storage locations.

Optional Arguments

IDO — Processing option.   (Input)
Default: IDO = 0.

IDO    Action

0      This is the only invocation of GCSCP for this data set, and all the data are input at once.

1      This is the first invocation, and additional calls to GCSCP will be made. Initialization and updating for the data in X are performed.

2      This is an intermediate or final invocation of GCSCP, and updating for the data in X is performed.

NRX — Number of rows of data in X.   (Input)
Default: NRX = size (X,1).

NVAR — Number of variables.   (Input)
Default: NVAR = size (X,2).

LDX — Leading dimension of X exactly as specified in the dimension statement in the calling program.   (Input)
Default: LDX = size (X,1).

ICEN — Centering option.   (Input)
If IDO = 1 or IDO = 2, ICEN must equal 0.
Default: ICEN = 0.

ICEN   Action

0      CSCP contains the centered variables in columns 1 through NVAR. Square and crossproduct variables are generated from these centered variables in the remaining columns of CSCP.

1      First, the action taken when ICEN = 0 is performed. Next, the means of the square and crossproduct variables are subtracted from the square and crossproduct variables.

SCPM — Vector of length NVAR * (NVAR + 1)/2 containing the means of the generated square and crossproduct variables.   (Output, if IDO = 0 or 1; input/output, if IDO = 2)

Elements                                 Description

1 to NVAR                                Squared variable means

NVAR + 1 to NVAR * (NVAR + 1)/2          Crossproduct variable means

LDCSCP — Leading dimension of CSCP exactly as specified in the dimension statement in the calling program.   (Input)
Default: LDCSCP = size (CSCP,1).

NRMISS — Number of rows of data encountered in calls to GCSCP that contain any missing values for the variables.   (Output, if IDO = 0 or 1; input/output, if IDO = 2)
NaN (not a number) is used as the missing value code.
Default: NRMISS = 0.

NVOBS — Number of valid observations.   (Output, if IDO = 0 or 1; input/output, if IDO = 2)
Number of rows of data encountered in calls to GCSCP that do not contain any missing values for the variables.
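
The relationship between NRMISS and NVOBS can be illustrated with a small stand-alone program. This is only a sketch of the counting rule described above, not the routine's actual implementation: it does not call GCSCP, the program name count_missing_rows is hypothetical, and the missing value placed in row 2 is invented for illustration. It assumes a compiler that supports the Fortran 2003 ieee_arithmetic module.

```fortran
program count_missing_rows
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, &
                                           ieee_quiet_nan
  implicit none
  integer, parameter :: nrx = 4, nvar = 3
  real    :: x(nrx, nvar)
  integer :: i, nrmiss, nvobs

  ! Data from Example 1, filled column by column.
  x(:, 1) = [10.0,  5.0, 3.0, 6.0]
  x(:, 2) = [ 8.0, 15.0, 2.0, 3.0]
  x(:, 3) = [11.0,  1.0, 4.0, 4.0]

  ! Hypothetical: mark one value as missing (NaN) for illustration.
  x(2, 3) = ieee_value(x(2, 3), ieee_quiet_nan)

  nrmiss = 0
  nvobs  = 0
  do i = 1, nrx
     ! A row counts as missing if any of its variables is NaN.
     if (any(ieee_is_nan(x(i, :)))) then
        nrmiss = nrmiss + 1
     else
        nvobs = nvobs + 1
     end if
  end do

  print '(a,i0,a,i0)', 'NRMISS = ', nrmiss, ', NVOBS = ', nvobs
end program count_missing_rows
```

With one NaN in row 2, the program reports one row with missing values and three valid observations, so NRMISS + NVOBS equals the number of rows processed.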

FORTRAN 90 Interface

Generic:     CALL GCSCP (X, XMEAN, CSCP [,…])

Specific:    The specific interface names are S_GCSCP and D_GCSCP.

FORTRAN 77 Interface

Single:      CALL GCSCP (IDO, NRX, NVAR, X, LDX, ICEN, XMEAN, SCPM, CSCP, LDCSCP, NRMISS, NVOBS)

Double:      The double precision name is DGCSCP.

Comments

Crossproduct variables are ordered as follows: (1, 2), (1, 3), …, (1, NVAR), (2, 3), (2, 4), …, (2, NVAR), …, (NVAR − 1, NVAR).
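
The ordering can be enumerated with two nested loops. The following stand-alone sketch (the program name crossproduct_order is hypothetical, and NVAR = 3 is chosen to match the examples) prints each pair together with the CSCP column it would occupy, assuming the column layout given in the description of CSCP.

```fortran
program crossproduct_order
  implicit none
  integer, parameter :: nvar = 3
  integer :: j, k, col

  ! Crossproduct columns of CSCP follow the NVAR centered variables
  ! and the NVAR squared variables, so they start at 2*NVAR + 1.
  col = 2*nvar
  do j = 1, nvar - 1
     do k = j + 1, nvar
        col = col + 1
        print '(a,i0,a,i0,a,i0)', '(', j, ', ', k, ') -> CSCP column ', col
     end do
  end do
end program crossproduct_order
```

For NVAR = 3 this lists (1, 2), (1, 3), (2, 3) in columns 7, 8, and 9.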

Programming Notes

Routine GCSCP centers a data set consisting of independent variable settings and generates (using the centered variables) the settings for all possible squared and crossproduct variables in standard order. The routine GCSCP is designed so that you can partition a large data set into submatrices (requiring less space) and make multiple calls to GCSCP (with IDO = 1, 2, 2, …, 2). Alternatively, one invocation of GCSCP (with IDO = 0) can be made with the entire data set contained in X.

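Let n be the number of rows in the entire data set, and let m (stored in NVAR) be the number of variables. Let $x_{ij}$ be the i-th setting of the j-th variable (i = 1, 2, …, n; j = 1, 2, …, m). Denote the means (stored in XMEAN) by

\[ \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m \]

The settings of the j-th centered variable (stored in the j-th column of CSCP) are given by

\[ z_{ij} = x_{ij} - \bar{x}_j \]

The settings of the j-th squared variable (stored in the (m + j)-th column of CSCP) are given by

\[ z_{i,m+j} = \begin{cases} z_{ij}^2 & \text{if ICEN} = 0 \\ z_{ij}^2 - \bar{z}_{m+j} & \text{if ICEN} = 1 \end{cases} \]

where $\bar{z}_{m+j}$ (stored in the j-th element of SCPM) is the mean of the j-th squared variable. For j < k, let

\[ c = j(2m - j - 1)/2 + k \]

The settings of the jk-th crossproduct variable (stored in the (m + c)-th column of CSCP) are given by

\[ z_{i,m+c} = \begin{cases} z_{ij} z_{ik} & \text{if ICEN} = 0 \\ z_{ij} z_{ik} - \bar{z}_{m+c} & \text{if ICEN} = 1 \end{cases} \]

where $\bar{z}_{m+c}$ (stored in the c-th location of SCPM) is the mean of the jk-th (j < k) crossproduct variable.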
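
The definitions in this section can be sketched in plain Fortran without calling the library. The program below is an illustration under simplifying assumptions, not the routine's implementation: it handles a single block of data, ignores missing values, and uses the hypothetical names gcscp_sketch and center_products (the latter standing in for ICEN = 1). The data and means are those of Example 1.

```fortran
program gcscp_sketch
  implicit none
  integer, parameter :: n = 4, m = 3
  integer, parameter :: ncol = m*(m + 3)/2, nmean = m*(m + 1)/2
  logical, parameter :: center_products = .true.  ! stands in for ICEN = 1
  real    :: x(n, m), xmean(m), cscp(n, ncol), scpm(nmean)
  integer :: i, j, k, col

  ! Data and means from Example 1, filled column by column.
  x(:, 1) = [10.0,  5.0, 3.0, 6.0]
  x(:, 2) = [ 8.0, 15.0, 2.0, 3.0]
  x(:, 3) = [11.0,  1.0, 4.0, 4.0]
  xmean   = [6.0, 7.0, 5.0]

  ! Columns 1..m: centered variables.
  do j = 1, m
     cscp(:, j) = x(:, j) - xmean(j)
  end do

  ! Columns m+1..2m: squares of the centered variables.
  do j = 1, m
     cscp(:, m + j) = cscp(:, j)**2
  end do

  ! Columns 2m+1..m(m+3)/2: crossproducts in the order
  ! (1,2), (1,3), ..., (m-1,m).
  col = 2*m
  do j = 1, m - 1
     do k = j + 1, m
        col = col + 1
        cscp(:, col) = cscp(:, j)*cscp(:, k)
     end do
  end do

  ! Means of the square and crossproduct variables.
  do j = 1, nmean
     scpm(j) = sum(cscp(:, m + j))/real(n)
  end do

  ! Optionally subtract those means (the ICEN = 1 behavior).
  if (center_products) then
     do j = 1, nmean
        cscp(:, m + j) = cscp(:, m + j) - scpm(j)
     end do
  end if

  print '(a,6f8.2)', 'SCPM:', scpm
  do i = 1, n
     print '(9f8.2)', cscp(i, :)
  end do
end program gcscp_sketch
```

Run on the Example 1 data, this sketch gives the same SCPM and CSCP values as the output listed for that example. Setting center_products to .false. leaves the squares and crossproducts uncentered, as with ICEN = 0.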

Example 1

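With data containing 4 rows and 3 variables, GCSCP is used to center the variables and to generate (using the centered variables) the square and crossproduct variables. The data are input in one invocation (IDO = 0), and the generated squared and crossproduct variables are centered (ICEN = 1). Writing $z_j$ for the j-th centered variable, on output SCPM contains the means in standard order, i.e.,

\[ \overline{z_1^2},\ \overline{z_2^2},\ \overline{z_3^2},\ \overline{z_1 z_2},\ \overline{z_1 z_3},\ \overline{z_2 z_3} \]

Also, CSCP contains the variables in standard order, i.e.,

\[ z_1,\ z_2,\ z_3,\ z_1^2,\ z_2^2,\ z_3^2,\ z_1 z_2,\ z_1 z_3,\ z_2 z_3 \]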

 

      USE GCSCP_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
      INTEGER    LDCSCP, LDX, NRX, NVAR, J, ICEN
      PARAMETER  (NRX=4, NVAR=3, LDCSCP=NRX, LDX=NRX)
!
      INTEGER    NOUT, NRMISS, NVOBS
      REAL       CSCP(LDCSCP,NVAR*(NVAR+3)/2), SCPM(NVAR*(NVAR+1)/2), &
                 X(LDX,NVAR), XMEAN(NVAR)
!                                 Data, one row per DATA statement
      DATA (X(1,J),J=1,NVAR)/10.0,  8.0, 11.0/
      DATA (X(2,J),J=1,NVAR)/ 5.0, 15.0,  1.0/
      DATA (X(3,J),J=1,NVAR)/ 3.0,  2.0,  4.0/
      DATA (X(4,J),J=1,NVAR)/ 6.0,  3.0,  4.0/
      DATA XMEAN/6.0, 7.0, 5.0/
!                                 Center the generated squares and
!                                 crossproducts (IDO defaults to 0)
      ICEN = 1
      CALL GCSCP (X, XMEAN, CSCP, ICEN=ICEN, scpm=scpm, nrmiss=nrmiss, &
                  nvobs=nvobs)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' NRMISS = ', NRMISS
      CALL WRRRN ('SCPM', SCPM, 1, NVAR*(NVAR+1)/2, 1)
      CALL WRRRN ('CSCP', CSCP)
      END

Output

 

NRMISS =   0

                      SCPM
    1       2       3       4       5       6
 6.50   26.50   13.50    2.75    7.75   -4.25

                                   CSCP
        1       2       3       4       5       6       7       8       9
1    4.00    1.00    6.00    9.50  -25.50   22.50    1.25   16.25   10.25
2   -1.00    8.00   -4.00   -5.50   37.50    2.50  -10.75   -3.75  -27.75
3   -3.00   -5.00   -1.00    2.50   -1.50  -12.50   12.25   -4.75    9.25
4    0.00   -4.00   -1.00   -6.50  -10.50  -12.50   -2.75   -7.75    8.25
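
As a quick check of this output, the first row of CSCP can be reproduced by hand from the first data row, XMEAN, and the printed SCPM. The symbol $z_j$ for the j-th centered variable is notation assumed here for illustration.

```latex
\begin{aligned}
(z_1, z_2, z_3) &= (10 - 6,\ 8 - 7,\ 11 - 5) = (4,\ 1,\ 6) \\
z_1^2 - \mathrm{SCPM}(1) &= 16 - 6.50 = 9.50 \\
z_2^2 - \mathrm{SCPM}(2) &= 1 - 26.50 = -25.50 \\
z_3^2 - \mathrm{SCPM}(3) &= 36 - 13.50 = 22.50 \\
z_1 z_2 - \mathrm{SCPM}(4) &= 4 - 2.75 = 1.25 \\
z_1 z_3 - \mathrm{SCPM}(5) &= 24 - 7.75 = 16.25 \\
z_2 z_3 - \mathrm{SCPM}(6) &= 6 - (-4.25) = 10.25
\end{aligned}
```

These nine values agree with row 1 of the printed CSCP.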

Additional Example

Example 2

With data containing 4 rows and 3 variables, GCSCP is used to center the variables and to generate (using the centered variables) the square and crossproduct variables. The data are input one row at a time in multiple invocations (IDO = 1, 2, 2, 2). Because ICEN must equal 0 when IDO = 1 or IDO = 2, the square and crossproduct variables, generated using the centered variables, cannot be centered here (ICEN = 0).

 

      USE GCSCP_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
      INTEGER    LDCSCP, LDX, NRX, NVAR, J
      PARAMETER  (LDX=4, NRX=1, NVAR=3, LDCSCP=NRX)
!
      INTEGER    I, IDO, MISS, NOUT, NRMISS, NVOBS
      REAL       CSCP(LDCSCP,NVAR*(NVAR+3)/2), SCPM(NVAR*(NVAR+1)/2), &
                 X(LDX,NVAR), XMEAN(NVAR)
!                                 Data, one row per DATA statement
      DATA (X(1,J),J=1,NVAR)/10.0,  8.0, 11.0/
      DATA (X(2,J),J=1,NVAR)/ 5.0, 15.0,  1.0/
      DATA (X(3,J),J=1,NVAR)/ 3.0,  2.0,  4.0/
      DATA (X(4,J),J=1,NVAR)/ 6.0,  3.0,  4.0/
      DATA XMEAN/6.0, 7.0, 5.0/
!
      CALL UMACH (2, NOUT)
      MISS = 0
!                                 Process the data one row per call
      DO 10  I=1, 4
!                                 IDO = 1 on the first call,
!                                 IDO = 2 on all later calls
         IF (I .EQ. 1) THEN
            IDO = 1
         ELSE
            IDO = 2
         END IF
         CALL GCSCP (X(I:,1:), XMEAN, CSCP, IDO=IDO, NRX=NRX, scpm=scpm,  &
                     nrmiss=nrmiss, nvobs=nvobs)
         MISS = MISS + NRMISS
!                                 Print the row just generated
         CALL WRRRN ('CSCP', CSCP)
   10 CONTINUE
!                                 Print the final means
      CALL WRRRN ('SCPM', SCPM, 1, NVAR*(NVAR+1)/2, 1)
      WRITE (NOUT,*) ' MISS = ', MISS
      END

Output

 

                                   CSCP
    1       2       3       4       5       6       7       8       9
 4.00    1.00    6.00   16.00    1.00   36.00    4.00   24.00    6.00

                                  CSCP
    1       2       3       4       5       6       7       8       9
-1.00    8.00   -4.00    1.00   64.00   16.00   -8.00    4.00  -32.00

                                  CSCP
    1       2       3       4       5       6       7       8       9
-3.00   -5.00   -1.00    9.00   25.00    1.00   15.00    3.00    5.00

                                  CSCP
    1       2       3       4       5       6       7       8       9
 0.00   -4.00   -1.00    0.00   16.00    1.00    0.00    0.00    4.00

                      SCPM
    1       2       3       4       5       6
 6.50   26.50   13.50    2.75    7.75   -4.25
MISS =   0
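
Because ICEN = 0 in this example, each one-row CSCP printed above holds uncentered squares and crossproducts, and the final SCPM equals their average over the four rows. A hand check of two entries, using only the printed values:

```latex
\begin{aligned}
\mathrm{SCPM}(1) &= \frac{16.00 + 1.00 + 9.00 + 0.00}{4} = 6.50 \\
\mathrm{SCPM}(4) &= \frac{4.00 + (-8.00) + 15.00 + 0.00}{4} = 2.75
\end{aligned}
```

Subtracting SCPM from columns 4 through 9 of these rows gives the corresponding rows of CSCP in Example 1, where ICEN = 1.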


