Computes case statistics for a polynomial regression model given the fit based on orthogonal polynomials.
X NOBS by NCOL matrix containing the data. (Input)
IRSP Column number IRSP of X contains the data for the response (dependent) variable. (Input)
IND Column number IND of X contains the data for the independent (explanatory) variable. (Input)
NDEG Degree of the polynomial regression. (Input)
SMULTC Multiplicative constant used to compute a scaled version of x on the interval −2 to 2, inclusive. (Input)
SADDC Additive constant used to compute a scaled version of x on the interval −2 to 2, inclusive. (Input)
A Vector of length NDEG containing constants used to generate orthogonal polynomials. (Input)
B Vector of length NDEG containing constants used to generate orthogonal polynomials. (Input)
SCOEF Vector of length NDEG + 1 containing the regression coefficients
of the fitted model using the scaled version of x(z). (Input)
is the estimated intercept and equals the response mean.
contains the estimated coefficient for the i-th order orthogonal polynomial using the scaled version of x(z).
D Vector of length NDEG + 1 containing the diagonal elements of the (diagonal) sums of squares and crossproducts matrix. (Input)
SSE Sum of squares for error. (Input)
DFE Degrees of freedom for error. (Input)
CASE NOBS
by 12 matrix containing the case statistics. (Output)
Columns 1
through 12 contain the following:
Col. Description
1 Observed response
2 Predicted response
3 Residual
4 Leverage
5 Standardized residual
6 Jackknife residual
7 Cooks distance
8 DFFITS
9, 10 Confidence interval on the mean
11, 12 Prediction interval
NOBS Number of
observations. (Input)
Default: NOBS = size (X,1).
NCOL Number of
columns in X.
(Input)
Default: NCOL = size (X,2).
LDX Leading
dimension of X
exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDX = size (X,1).
IWT Weighting
option. (Input)
IWT = 0 means that all
weights are 1.0. For positive IWT, column number IWT of X contains the
weights, and the computed prediction interval uses
SSE/(DFE * X(i, IWT)) for the
estimated variance of a future response.
Default: IWT = 0.
IPRED
Prediction interval option. (Input)
IPRED = 0 means that
prediction intervals are desired for a single future response. For positive
IPRED, column number IPRED of X contains the number
of future responses for which a prediction interval is desired on the average of
the future responses.
Default: IPRED = 0.
CONPCM
Confidence level for two-sided interval estimates on the mean, in
percent. (Input)
Default: CONPCM = 95.0.
CONPCP
Confidence level for two-sided prediction intervals, in percent.
(Input)
Default: CONPCP = 95.0.
PRINT Printing
option. (Input)
Default: PRINT = N.
PRINT is a character
string indicating what is to be printed. The PRINT string is
composed of one-character print codes to control printing. These print codes are
given as follows:
PRINT(I:I) Printing that Occurs
A All
N None
1 Observed response
2 Predicted response
3 Residual
4 Leverage
5 Standardized residual
6 Jackknife residual
7 Cooks distance
8 DFFITS
M Confidence interval on the mean
P Prediction interval
X Influential cases (unusual x-value)
Y Outlier cases (unusual y-value)
The concatenated print codes A, N, 1, , P that comprise the PRINT string give the combination of statistics to be printed. Concatenation of these codes with print codes X or Y restricts printing to cases determined to be influential or outliers. Here are a few examples:
PRINT Printing Action
A All.
N None.
46 Leverage and jackknife residual for all cases.
AXY All statistics are printed for cases that are highly influential or are outliers.
46XY Leverage and jackknife residual are printed for cases that are highly influential or are outliers.
LDCASE Leading
dimension of CASE exactly as
specified in the dimension statement in the calling program.
(Input)
Default: LDCASE = size (CASE,1).
NRMISS Number of rows of CASE containing NaN (not a number). (Output)
Generic: CALL RCASP (X, IRSP, IND, NDEG, SMULTC, SADDC, A, B, SCOEF, D, SSE, DFE, CASE [, ])
Specific: The specific interface names are S_RCASP and D_RCASP.
Single: CALL RCASP (NOBS, NCOL, X, LDX, IRSP, IND, IWT, IPRED, CONPCM, CONPCP, NDEG, SMULTC, SADDC, A, B, SCOEF, D, SSE, DFE, PRINT, CASE, LDCASE, NRMISS)
Double: The double precision name is DRCASP.
Routine RCASP assumes a polynomial model
where the observed values of the yis constitute the response, the xis are the settings of the independent variable, the βjs are the regression coefficients and the ɛ is are the errors that are independently distributed normal with mean 0 and variance σ2/wi. Given the results of a polynomial regression, fitted using orthogonal polynomials and weights wi, routine RCASP produces predicted values, residuals, confidence intervals, prediction intervals, and diagnostics for outliers and influential cases.
Often a predicted value and confidence interval are desired for a setting of the independent variable not used in computing the regression fit. This can be accomplished by including the independent variable setting as part of the data matrix and by setting the response equal to NaN (not a number). NaN can be retrieved by AMACH(6).
Results from routine RFORP, which produces the fit using orthogonal polynomials, are used for input. The fitted model from RFORP is
where the zis are settings of the independent variable x scaled to the interval [−2, 2] and where the pj(z)s are the orthogonal polynomials. The XT X matrix for this model is a diagonal matrix with elements dj (stored in D). The case statistics are easily computed from this model and are equal to those from the original polynomial model with the βjs as the regression coefficients.
The leverage is computed as
The estimated variance of
is given by his2/wi. The computation of the remainder of the case statistics follows easily from their definitions. See the Diagnostics for Individual Cases section in the chapter introduction for definitions of the case diagnostics .
1. Workspace may be explicitly provided, if desired, by use of R2ASP/DR2ASP. The reference is:
CALL R2ASP (NOBS, NCOL, X, LDX, IRSP, IND, IWT, IPRED, CONPCM, CONPCP, NDEG, SMULTC, SADDC, A, B, SCOEF, D, SSE, DFE, PRINT, CASE, LDCASE, NRMISS, WK)
The additional argument is:
WK Work vector of length NDEG + 1.
2. Informational errors
Type Code
4 1 A weight is negative. Weights must be nonnegative.
4 8 The number of future observations for a prediction interval must be positive.
3 9 A leverage much greater than one is computed. It is set to one.
3 10 A deleted residual mean square much less than zero is computed. It is set to 0.0.
A polynomial model is fitted to data discussed by Neter and Wasserman (1974, pages 279−285). The data set contains the response variable y measuring coffee sales (in hundred gallons) and the number of self-service coffee dispensers. Responses for fourteen similar cafeterias are in the data set.
USE RCASP_INT
USE RFORP_INT
IMPLICIT NONE
INTEGER LDCASE, LDCOEF, LDSQSS, LDTLOF, LDX, MAXDEG, NCOL, &
NOBS, J
PARAMETER (MAXDEG=2, NCOL=2, NOBS=14, LDCASE=NOBS, &
LDCOEF=MAXDEG+1, LDSQSS=MAXDEG, LDTLOF=MAXDEG, &
LDX=NOBS)
!
INTEGER ICRIT, IFRQ, IND, IPRED, IRSP, IWT, LOF, NDEG, NRMISS
REAL A(MAXDEG), B(MAXDEG), CASE(LDCASE,12), CONPCM, &
CONPCP, CRIT, D(MAXDEG+1), DFE, DFPE, SADDC, &
SCOEF(MAXDEG+1), SMULTC, SSE, SSPE, X(LDX,NCOL)
CHARACTER PRINT*1
!
DATA (X(1,J),J=1,2) /0.0, 508.1/
DATA (X(2,J),J=1,2) /5.0, 787.6/
DATA (X(3,J),J=1,2) /0.0, 498.4/
DATA (X(4,J),J=1,2) /1.0, 568.2/
DATA (X(5,J),J=1,2) /2.0, 651.7/
DATA (X(6,J),J=1,2) /7.0, 854.7/
DATA (X(7,J),J=1,2) /2.0, 657.0/
DATA (X(8,J),J=1,2) /4.0, 755.3/
DATA (X(9,J),J=1,2) /6.0, 831.8/
DATA (X(10,J),J=1,2) /4.0, 758.9/
DATA (X(11,J),J=1,2) /5.0, 792.1/
DATA (X(12,J),J=1,2) /6.0, 841.4/
DATA (X(13,J),J=1,2) /7.0, 871.4/
DATA (X(14,J),J=1,2) /1.0, 577.3/
!
IRSP = 2
IND = 1
LOF = 1
CALL RFORP (X, IRSP, IND, MAXDEG, NDEG, A, B, SCOEF, D, LOF=LOF, &
SMULTC=SMULTC, SADDC=SADDC, DFE=DFE, SSE=SSE)
!
PRINT = 'A'
CALL RCASP (X, IRSP, IND, NDEG, SMULTC, SADDC, A, B, SCOEF, D, SSE, &
DFE, CASE, PRINT=PRINT)
!
END
* * * Case Analysis * * *
Obs. Observed
Predicted Residual Leverage Std. Res. Jack.
Res
Cooks
D DFFITS 95.0% CI 95.0%
CI 95.0% PI 95.0% PI
1
508.1000 503.3459
4.7541 0.3554
0.7367
0.7204
0.0997 0.5349 492.8003
513.8916 482.7510 523.9409
2
787.6000 798.8150 -11.2150
0.1429 -1.5072
-1.6132
0.1262 -0.6586 792.1288
805.5012 779.9034 817.7266
3
498.4000 503.3459 -4.9460
0.3554 -0.7664
-0.7511
0.1079 -0.5577 492.8003
513.8916 482.7510 523.9409
4
568.2000 578.3177 -10.1177
0.1507 -1.3660
-1.4293
0.1104 -0.6021 571.4498
585.1857 559.3412 597.2943
5
651.7000 645.3505
6.3495 0.1535
0.8586
0.8476
0.0446 0.3609 638.4200
652.2810 626.3513 664.3498
6
854.7000 861.4297 -6.7297
0.3650 -1.0508
-1.0563
0.2116 -0.8008 850.7420
872.1175 840.7617 882.0978
7
657.0000 645.3505 11.6495
0.1535 1.5753
1.7069
0.1500 0.7268 638.4200
652.2810 626.3513 664.3498
8
755.3000 755.5992 -0.2992
0.1897 -0.0414
-0.0394
0.0001 -0.0191 747.8945
763.3038 736.3040 774.8943
9
831.8000 834.0919 -2.2919
0.1429 -0.3080
-0.2949
0.0053 -0.1204 827.4056
840.7782 815.1804 853.0035
10
758.9000 755.5992 3.3008
0.1897 0.4562
0.4392
0.0162 0.2125 747.8945
763.3038 736.3040 774.8943
11
792.1000 798.8150 -6.7150
0.1429 -0.9024
-0.8942
0.0452 -0.3650 792.1288
805.5012 779.9034 817.7266
12
841.4000 834.0919
7.3081 0.1429
0.9821
0.9804
0.0536 0.4002 827.4056
840.7782 815.1804 853.0035
13
871.4000 861.4297
9.9703 0.3650
1.5567
1.6809
0.4643 1.2745 850.7420
872.1175 840.7617 882.0978
14
577.3000 578.3177 -1.0178
0.1507 -0.1374
-0.1311
0.0011 -0.0552 571.4498
585.1857 559.3412 597.2943
Figure 2- 9 Second Degree Polynomial Fit With 95% One-at-a-Time Prediction Intervals
PHONE: 713.784.3131 FAX:713.781.9260 |