Computes the biserial correlation coefficient for a dichotomous variable and a classification variable.
A — 2 by K matrix containing the frequencies. (Input)
The first row of A contains the frequencies for the classification variable when the dichotomous variable takes on one of its values, and the second row of A contains the frequencies when the dichotomous variable takes on its other value. No ordering is assumed for the values of the classification variable. The elements of A must be nonnegative.
STAT — Vector of length 5 containing various statistics. (Output)
I    STAT(I)
1    Total count for the first value of the dichotomous variable (the sum of the first row of A)
2    Total count for the second value of the dichotomous variable (the sum of the second row of A)
3    Total count (the sum of STAT(1) and STAT(2))
4    Absolute value of the biserial correlation coefficient
5    Square of the biserial correlation coefficient
K — Number of classes for the classification variable. (Input)
Default: K = size (A,2).
LDA — Leading dimension of A exactly as specified in the dimension statement in the calling program. (Input)
Default: LDA = size (A,1).
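The first three elements of STAT follow directly from the rows of A. The following is a minimal sketch in standard Fortran that does not call the library; the program name stat_counts is hypothetical, and the frequencies are the ones used in the example in this section. The nonnegativity check mirrors the stated requirement on A, not the library's actual error handling, which is not described here.

```fortran
program stat_counts
   implicit none
   integer, parameter :: k = 6
   real :: a(2,k), stat(3)

   ! Frequencies stored column by column, as in a DATA statement.
   a = reshape([50., 43., 88., 62., 155., 110., &
                379., 300., 18., 14., 63., 144.], [2, k])

   ! The elements of A must be nonnegative.
   if (any(a < 0.0)) stop 'A must not contain negative frequencies'

   stat(1) = sum(a(1,:))        ! total for the first dichotomous value
   stat(2) = sum(a(2,:))        ! total for the second dichotomous value
   stat(3) = stat(1) + stat(2)  ! overall total

   print '(3f10.2)', stat
end program stat_counts
```

For these data the program prints 753.00, 673.00, and 1426.00.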
Generic: CALL BSCAT (A, STAT [,…])
Specific: The specific interface names are S_BSCAT and D_BSCAT.
Single: CALL BSCAT (K, A, LDA, STAT)
Double: The double precision name is DBSCAT.
Routine BSCAT computes the biserial correlation coefficient for a dichotomous variable and a classification variable. The data are input in a 2 × k array, A, where the row indicates the value of the dichotomous variable, and the column indicates the value of the classification variable. In BSCAT, the column scores are computed as $x_i = \Phi^{-1}(a_{1i}/(a_{1i} + a_{2i}))$ for $i = 1, \ldots, k$, and the row score is computed as $y = \Phi^{-1}(a_{\cdot 1}/(a_{\cdot 1} + a_{\cdot 2}))$, where $a_{\cdot 1}$ is the sum of the counts in row 1, $a_{\cdot 2}$ is the sum of the counts in row 2, and $\Phi$ denotes the standard normal cumulative distribution function. Let N denote the total number of observations (the sum of the elements of A). Then, the biserial correlation is computed as
An underlying bivariate normal distribution is assumed. The validity of the estimate depends heavily upon this assumption.
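The column scores and the row score described above can be sketched in standard Fortran as follows. This is an illustration under stated assumptions, not the library's implementation: the helper functions norm_cdf and norm_inv are hypothetical names, the inverse normal distribution function is obtained by simple bisection on the ERFC intrinsic (Fortran 2008), and columns in which either count is zero, which would give an infinite score, are not handled. The sketch stops at the scores and does not compute the correlation itself.

```fortran
program bscat_scores
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: k = 6
   real(dp) :: a(2,k), x(k), y
   integer :: i

   ! Frequencies from the example in this section, column by column.
   a = reshape([50._dp, 43._dp, 88._dp, 62._dp, 155._dp, 110._dp, &
                379._dp, 300._dp, 18._dp, 14._dp, 63._dp, 144._dp], [2, k])

   ! Column scores: inverse normal of the row-1 proportion in each column.
   do i = 1, k
      x(i) = norm_inv(a(1,i) / (a(1,i) + a(2,i)))
   end do

   ! Row score: inverse normal of the overall row-1 proportion.
   y = norm_inv(sum(a(1,:)) / sum(a))

   print '(a,6f9.4)', 'x =', x
   print '(a,f9.4)',  'y =', y

contains

   ! Standard normal cumulative distribution function.
   pure function norm_cdf(z) result(p)
      real(dp), intent(in) :: z
      real(dp) :: p
      p = 0.5_dp * erfc(-z / sqrt(2.0_dp))
   end function norm_cdf

   ! Inverse of norm_cdf by bisection on [-10, 10]; assumes 0 < p < 1.
   pure function norm_inv(p) result(z)
      real(dp), intent(in) :: p
      real(dp) :: z, lo, hi
      integer :: it
      lo = -10.0_dp
      hi = 10.0_dp
      do it = 1, 100
         z = 0.5_dp * (lo + hi)
         if (norm_cdf(z) < p) then
            lo = z
         else
            hi = z
         end if
      end do
      z = 0.5_dp * (lo + hi)
   end function norm_inv

end program bscat_scores
```

Bisection is used only to keep the sketch self-contained; a production code would more likely call a library routine for the inverse normal distribution function.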
The example is taken from Kendall and Stuart (1979, page 327). The data involve the classification of criminals as alcoholic (first row) or nonalcoholic (second row) for each level of a crime-type classification. The severity of the crime decreases with increasing column number. The absolute value of the biserial correlation is 0.23.
USE WRRRN_INT
USE BSCAT_INT
USE WRRRL_INT
IMPLICIT NONE
INTEGER K, LDA
PARAMETER (K=6, LDA=2)
!
REAL A(LDA,K), STAT(5)
CHARACTER CLABEL(2)*10, RLABEL(5)*10
!
DATA A/50, 43, 88, 62, 155, 110, 379, 300, 18, 14, 63, 144/
DATA RLABEL/'Count-1', 'Count-2', 'Count', 'r-b', '(r-b)**2'/
DATA CLABEL/'Statistic', ' '/
!
CALL WRRRN ('A', A)
!
CALL BSCAT (A, STAT)
!
CALL WRRRL (' ', STAT, RLABEL, CLABEL, FMT='(W12.6)')
END
                      A
        1       2       3       4       5       6
1    50.0    88.0   155.0   379.0    18.0    63.0
2    43.0    62.0   110.0   300.0    14.0   144.0

           Statistic
Count-1       753.00
Count-2       673.00
Count        1426.00
r-b             0.23
(r-b)**2        0.05