Computes partial association statistics for log-linear models in a multidimensional contingency table.
NCLVAL Vector of length NCLVAR containing, in its i-th element, the number of levels or categories of the i-th classification variable. (Input)
TABLE Vector of
length NCLVAL(1)
* NCLVAL(2) *
* NCLVAL(NCLVAR) containing the
entries in the cells of the table to be fit. (Input)
See Comment 3 for
comments on the ordering of the elements of TABLE.
ZERO Vector of
length NCLVAL(1)
* NCLVAL(2) *
* NCLVAL(NCLVAR) indicating
structural zeros in TABLE.
(Input)
ZERO
has the same structure as TABLE.
Structural zeros in the TABLE
are specified by setting the corresponding element of ZERO to 0.0. All other
elements of zero must be positive. If structural zeros do not exist in TABLE,
TABLE
and ZERO can
share the same storage locations. See Comment 3 for the
ordering of the elements of ZERO.
ASSOC 2NCLVARNCLVAR − 1 by 4 matrix containing the partial association statistics for each effect in the model. (Output)
Col. |
Statistic |
1 |
Likelihood ratio partial association chi-squared for testing that all parameters in the effect are zero against a model containing all interactions of the same order |
2 |
Degrees of freedom in chi-squared in columns 1 and 4 |
3 |
p-value for the chi-squared statistic in column 1 |
4 |
Number of zeros (structural and sampling) in the marginal table of the effect |
The rows of ASSOC are ordered with main effects first, followed by two-way interactions, followed by the three-way interactions, etc., until the last row, which contains the single NCLVAR-way interaction. Thus, if there are 3 classification variables, there would be 8 rows in ASSOC and the rows would contain the A, B, C, AB, AC, BC, and the ABC effects where A represents the first (in INDCL) classification variable, B represents the second classification variable, etc.
CHIHI NCLVAR by 5 matrix
containing chi-squared statistics testing that all k and higher
interactions are zero where k = 1, 2,
, NCLVAR.
(Output)
In the following, k is the row number of the statistic where
the row numbers are
1, 2,
, NCLVAR.
Col. |
Statistic |
1 |
Likelihood ratio chi-squared statistic for testing that all interactions higher than k are zero against a model including all interactions of order k |
2 |
p-value for the chi-squared value in column 1 |
3 |
Degrees of freedom for chi-squared in columns 1 and 4 |
4 |
Pearson chi-squared corresponding to column 1 |
5 |
p-value for the chi-squared value in column 4 |
CHISIM NCLVAR by 5 matrix
containing chi-squared statistics for testing that all k-factor
interactions are simultaneously zero where k = 1,
, NCLVAR.
(Output)
In the following, k is the row number of the statistic where
the row numbers are
1, 2,
, NCLVAR.
Col. |
Statistic |
1 |
Likelihood ratio chi-squared statistic for testing that all k-factor interactions are all simultaneously zero given the model in which all k-way interactions are present |
2 |
p-value for the chi-squared value in column 1 |
3 |
Degrees of freedom for chi-squared in columns 1 and 4 |
4 |
Pearson chi-squared corresponding to column 1 |
5 |
p-value for the chi-squared value in column 4 |
NCLVAR Number
of classification variables. (Input)
A variable specifying a
margin in the table is a classification variable. The first classification
variable is named A, the
second classification variable is named B, etc.
Default: NCLVAR = size (NCLVAL,1).
EPS Convergence
criterion. (Input)
Convergence is assumed when the maximum
deviation between an observed and a fitted marginal total is less than EPS. EPS = 0.10 is a
typical value.
Default: EPS = 0.10.
MAXIT Maximum
number of iterations. (Input)
MAXIT = 15 is a
typical value. When there are structural zeros a larger value, say MAXIT = 100,
should be used.
Default: MAXIT = 100.
IPRINT Printing
option. (Input)
Default: IPRINT = 0.
IPRINT |
Action |
0 |
No printing is performed. |
1 |
Printing of ASSOC, CHIHI, and CHISIM is performed. |
2 |
ASSOC, CHIHI, CHISIM, and TABLE are printed. |
LDASSO Leading
dimension of ASSOC exactly as
specified in the dimension statement in the calling program.
(Input)
Default: LDASSO = size (ASSOC,1).
LDCHIH Leading
dimension of CHIHI exactly as
specified in the dimension statement in the calling program.
(Input)
Default: LDCHIH = size (CHIHI,1).
LDCHIS Leading
dimension of CHISIM exactly as
specified in the dimension statement in the calling program.
(Input)
Default: LDCHIS = size (CHISIM,1).
Generic: CALL CTASC (NCLVAL, TABLE, ZERO, ASSOC, CHIHI, CHISIM [, ])
Specific: The specific interface names are S_CTASC and D_CTASC.
Single: CALL CTASC (NCLVAR, NCLVAL, TABLE, ZERO, EPS, MAXIT, IPRINT, ASSOC, LDASSO, CHIHI, LDCHIH, CHISIM, LDCHIS)
Double: The double precision name is DCTASC.
Routine CTASC
computes likelihood-ratio and Pearson Χ2 tests of
partial-association for each effect in a hierarchical log-linear model. Also
computed are likelihood ratio and Pearson
chi-squared tests that all
interactions above a given level are simultaneously zero. All of these tests are
asymptotic in nature. All models are hierarchical so that all lower order
interactions that may be composed from a higher order effect in the model are
automatically included in the model. All models are fit via the iterative
proportional fitting algorithm, which is implemented in routine PRPFT. The algorithm proceeds as follows:
1.
The hierarchical model including all k-factor interactions is fit with
k = 0,
, m and
m = NCLVAR. The k =
0 model corresponds to a constant probability in each cell in the table while
the k = m model is the full model. For each value of k, the
likelihood ratio chi-squared statistic for testing that all interactions not
included in the fitted model are all simultaneously zero (against the
alternative that this is not the case) is computed as
where
oi is
the observed count in the i-th cell, fi is the fitted
count for the given model, and the summation is over all cells in the table.
Also computed (for comparison, the two statistics are asymptotically equivalent)
is the usual Pearson
chi-squared statistic,
2. Let gi = NCLVAL(i), and let
and assume that there are no structural zeros in the table. Then, the number of degrees of freedom in the chi-squared statistic for testing that all k-order interactions are simultaneously zero is the sum over all k-th order interaction effects of the degrees of freedom for the effect. In the no structural zero case, the degrees of freedom for an effect may be computed as
where j indexes the factors in the effect. Denote the sum of these degrees of freedom at level k by sk, and let s0 = 1. Then, the degrees of freedom in the k-th test is given by sk.
When more than one structural zero is present, the degrees of freedom in the
chi-squared statistics are computed by fitting a least-squares model for the
full full hierarchical model in which all interactions are included. Routine
RGIVN (see Chapter 2, Regression;) is used in fitting the model. Cells with
sampling (as opposed to structural) zeros are included (but only when degrees of
freedom are computed) by using a cell count of 0.5. Observations corresponding
to structural zeros are not included. (Note that a structural zero is a model
restriction that requires that the estimated count for a cell be zero. A
sampling zero occurs by chance.) The degrees of freedom for each effect are
found by summing over the estimated parameters for the effect. Parameters that
are linearly related to previous parameters in the model (as determined through
RGIVN via input
argument TOL where TOL is 100 * AMACH(4) are not estimated. When there is only
one structural zero, degrees of freedom are computed as if there were no
structural zeros except for the highest level interaction term, which is given
one fewer degree of freedom.
Chi-squared statistics for testing that all effects at a given level k
are simultaneously zero (given a hierarchical model in which all effects above
level k are absent) are computed as the difference between the
chi-squared statistics testing that all k and higher interactions are
zero and that of k + 1. That is, for J = 1 and 4, and
I = 1, 2,
, NCLVAR − 1, then CHISIM(I, J) = CHIHI(I, J) CHIHI(I + 1, J), and
CHISIM(NCLVAR, J) = CHIHI(NCLVAR, J).
3. For each effect, a partial association likelihood ratio chi-squared statistic may be used to test the hypothesis that all parameters in the effect are simultaneously zero, given a model in which all interactions at the same level (or lower) are present, and all higher level interactions are absent. The degrees of freedom for the effect are computed as in Step 2.
1. Workspace may be explicitly provided, if desired, by use of C2ASC/DC2ASC. The reference is:
CALL C2ASC (NCLVAR, NCLVAL, TABLE, ZERO, EPS, MAXIT, IPRINT, ASSOC, LDASSO, CHIHI, LDCHIH, CHISIM, LDCHIS, FITWK, NCVEF, IXEF, AMAR, INDX, WK, IWK, COVWK)
The additional arguments are as follows:
FITWK Work vector of length 3 * NCLVAL(1) * * NCLVAL(NCLVAR).
NCVEF Work vector of length 2NCLVARNCLVAR − 1.
IXEF Work vector of length NCLVAR * 2(NCLVAR-1)
AMAR Work vector of length n. In defining n, let q(k) be the sum of the sizes of all possible marginal tables with k effects. For example, q(2) is the sum over all possible two-way interactions I and J of NCLVAL(I) * NCLVAL(J) and q(NCLVAR) is the product NCLVAL(1) * * NCLVAL(NCLVAR). Then, n = max(q(k)), k = 1, , NCLVAR.
INDX Work vector of length m where m is the maximum number of interactions at any level. That is, m = max(BINOM(NCLVAR, I)), I = 1, , NCLVAR, where BINOM(NCLVAR, I) is the binomial coefficient (see routine BINOM (IMSL MATH/LIBRARY Special Functions)).
WK
Work vector of length 3 * NCLVAL(1) *
* NCLVAL(NCLVAR) if there
exists more than one structural zero in TABLE, and of length
NCLVAL(1) *
* NCLVAL(NCLVAR)
otherwise.
IWK Work vector of length 2 * NCLVAR.
COVWK Work vector of length (NCLVAL(1) * * NCLVAL(NCLVAR))2 if there exists more than one structural zero in TABLE. Otherwise, COVWK is not referenced and can be dimensioned of length one in the calling program. On output, COVWK contains the upper triangular matrix containing the R matrix from a QR decomposition of the matrix of regressors for the full log-linear model.
2. Informational errors
Type Code
3 1 The optimization algorithm did not converge to the desired accuracy, EPS, within MAXIT iterations.
3 5 The label for one or more of the tables exceeds the buffer limit.
3 11 The label for one or more effects exceeds the buffer limit.
3. The cells
of the vectors TABLE
and ZERO are
sequenced so that the first variable cycles from 1 to NCLVAL(1) the slowest,
the second variable cycles from 1 to NCLVAL(2) the next
slowest, etc., up to the NCLVAR-th variable,
which cycles from 1 to NCLVAL(NCLVAR) the fastest.
Example: For NCLVAR = 3, NCLVAL(1) = 2, NCLVAL(2) = 3, and
NCLVAL(3) = 2,
the cells of table X(I, J, K)
are entered into TABLE(1)
through TABLE(12)
in the following order:
X(1,
1, 1), X(1,
1, 2), X(1,
2, 1), X(1,
2, 2), X(1,
3, 1), X(1,
3, 2), X(2, 1, 1),
X(2, 1,
2), X(2,
2, 1), X(2,
2, 2), X(2,
3, 1), X(2,
3, 2). The elements of FIT are similarly
sequenced.
1. When sampling zeros are present, the likelihood ratio test statistics may not follow the appropriate chi-squared distribution closely. A common (but not necessarily the best) practice in this case is to add a small positive constant, often 0.5, to each cell in the table. This addition is easily accomplished via routine SADD (IMSL MATH/LIBRARY). The addition of such a constant should not effect the computed degrees of freedom.
2. When marginal totals of zero are obtained, the optimization algorithm may be slow to converge. In this case, increase the value of argument MAXIT.
The following example illustrates the use of CTASC
for model building in a four-way table involving brand preference. The first
three factors each have 2 levels, while the fourth factor has 3 levels. The data
are originally from Lee (1977) and are printed in the output. A model with
two-way interaction effects AD, BC, and BD looks promising.
USE CTASC_INT
IMPLICIT NONE
INTEGER IPRINT, LDASSO, LDCHIH, LDCHIS, LTAB
REAL EPS
PARAMETER (EPS=0.01, IPRINT=2, LDASSO=15, LDCHIH=4, LDCHIS=4, &
LTAB=24)
!
INTEGER NCLVAL(4)
REAL ASSOC(LDASSO,4), CHIHI(LDCHIH,5), CHISIM(LDCHIS,5), &
TABLE(LTAB)
!
DATA TABLE/19, 57, 29, 63, 29, 49, 27, 53, 23, 47, 33, 66, 47, &
55, 23, 50, 24, 37, 42, 68, 43, 52, 30, 42/
DATA NCLVAL/3, 2, 2, 2/
!
CALL CTASC (NCLVAL, TABLE, TABLE, ASSOC, CHIHI, CHISIM, &
EPS=EPS, IPRINT=IPRINT)
!
END
Variable Number of Levels
1
A
3
2
B
2
3
C
2
4
D
2
----------
Table 1: B = 1 C = 1
D (row) by
A (column)
1 2
3
1 19.00 23.00
24.00
2 57.00 47.00
37.00
----------
Table 2: B = 1 C = 2
D (row) by
A (column)
1 2
3
1 29.00 33.00
42.00
2 63.00 66.00
68.00
----------
Table 3: B = 2 C = 1
D (row) by
A (column)
1 2
3
1 29.00 47.00
43.00
2 49.00 55.00
52.00
----------
Table 4: B = 2 C = 2
D (row) by
A (column)
1 2
3
1 27.00 23.00
30.00
2 53.00 50.00
42.00
Partial Association Statistics
Omitted
Degrees
of
Marginal
Effect
Chi-Square Freedom
P-value
Zeros
A
0.50
2.0
0.7782
0.0
B
0.06
1.0
0.8010 0.0
C
1.92
1.0
0.1657
0.0
D
73.21
1.0
0.0000
0.0
A*B
0.22
2.0
0.8978
0.0
A*C
1.01
2.0
0.6050
0.0
A*D
6.10
2.0
0.0475
0.0
B*C
19.89
1.0
0.0000
0.0
B*D
3.74
1.0
0.0532
0.0
C*D
0.74
1.0
0.3898
0.0
A*B*C
4.57
2.0
0.1017
0.0
A*B*D
0.16
2.0
0.9223
0.0
A*C*D
1.38
2.0
0.5022
0.0
B*C*D
2.22
1.0 0.1361
0.0
A*B*C*D
0.74
2.0
0.6917 0.0
Chi-square statistics for
testing that all k and higher interactions
are
zero.
Likelihood
Degrees of
k Ratio
P-Value Freedom
Pearson
P-Value
1 118.63
0.0000
23.0
115.71
0.0000
2
42.93
0.0008
18.0
43.90
0.0006
3
9.85
0.3631
9.0
9.87
0.3611
4
0.74
0.6917
2.0
0.74 0.6915
Chi-square
statistics for testing that all k-factor interactions
are
simultaneously zero.
Likelihood
Degrees of
k
Ratio P-Value
Freedom Pearson
P-Value
1
75.70
0.0000
5.0
71.81 0.0000
2
33.08
0.0001
9.0
34.03 0.0001
3
9.11
0.2449
7.0
9.13 0.2433
4
0.74
0.6917
2.0
0.74 0.6915
PHONE: 713.784.3131 FAX:713.781.9260 |