**********************************************************************
White's test with SPSS:
======================

* First of all, read "Heteroscedasticity: testing and correcting in SPSS", by Gwilym Pryce
    http://pages.infinit.net/rlevesqu/spss.htm#Heteroscedasticity
    This macro is based on his paper.
*   SPSS Code by Marta Garcia-Granero 2002/04/04.

* Steps:

* Create several new variables:
  * The square of the unstandardized residuals.
  * The square of every predictor variable in the model you want to test.
  * The cross-product of all the predictors.

* Run a regression model to predict the squared residuals with
	the predictors, their squares and cross-products.

* Multiply the model R-square (unadjusted) by the sample size (n*R-square).
* This is White's statistic. Its significance is tested by comparing
	it with the critical value of the Chi-square distribution with "p" degrees
	of freedom, where "p" is the total number of regressors in the last
	regression model (original+squares+cross-products).
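The arithmetic of this last step can be sketched outside SPSS. The Python snippet below is an illustrative sketch only, not part of the macro; `chi2_sf` is a hand-rolled chi-square survival function based on the regularized incomplete-gamma series, adequate for moderate values of the statistic:

```python
import math

def chi2_sf(x, df):
    """Chi-square survival function P(X > x), via the series for the
    regularized lower incomplete gamma (fine for moderate x and df)."""
    a, z = df / 2.0, x / 2.0
    if z == 0.0:
        return 1.0
    term = 1.0 / math.gamma(a + 1.0)  # k = 0 term of sum z^k / Gamma(a+k+1)
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    cdf = (z ** a) * math.exp(-z) * total
    return 1.0 - cdf

def white_test(n, r_squared, p):
    """White's statistic n*R^2 and its significance against chi-square(p)."""
    stat = n * r_squared
    return stat, chi2_sf(stat, p)
```

For instance, `white_test(100, 0.02, 2)` gives the statistic 2.0 with significance exp(-1) ≈ 0.368.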


* IMPORTANT:
* If any of the original predictors is binary (a dummy variable), then its square
will be identical to the original, and the two will correlate perfectly.
* In this case, the regression model will drop one of them (the original or its square),
and "p" has to be decreased by 1 for each binary predictor in the model.

* WHITE'S TEST MACRO *

* The MACRO needs 5 arguments:
*   a) the number of predictors,
*   b) the number of cross-products that will be created:
*      "predictors*(predictors-1)/2"
*      [I could not find another way of making VECTOR accept the number],
*   c) "P" (predictors+squares+cross-products), corrected for binary predictors,
*   d) the name of the dependent variable and
*   e) the list of predictors in the form 'first predictor TO last predictor'
*      (ordered and consecutive in the database).
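As a quick cross-check of arguments a)-c), here is a hypothetical Python helper (`macro_args` is my own name, not part of the macro) that reproduces the numbers used in the sample `whitest` calls at the end of this file:

```python
def macro_args(n_predictors, n_binary=0):
    """Arguments a) predictors, b) cross-products, c) "P" for the whitest
    macro; n_binary is the number of binary (dummy) predictors."""
    cross = n_predictors * (n_predictors - 1) // 2
    # P = predictors + squares + cross-products, minus one per binary
    # predictor (its square duplicates it and gets dropped).
    p = 2 * n_predictors + cross - n_binary
    return n_predictors, cross, p
```

Here `macro_args(4)` returns `(4, 6, 14)` and `macro_args(4, 1)` returns `(4, 6, 13)`, matching the two sample calls.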

* MACRO definition.

DEFINE whitest(!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1)
              /!POSITIONAL !TOKENS(1) /!POSITIONAL !TOKENS(1)
              /!POSITIONAL !CMDEND).

* >>>> 1st regression model to get the residuals <<<< *.
REGRESSION
  /STATISTICS R ANOVA
  /DEPENDENT !4
  /METHOD=ENTER !5
  /SCATTERPLOT=(*ZRESID,*ZPRED)
  /SAVE RESID(residual) .

* >>>> New variables <<<< *.
* New dependent variable.
COMPUTE sq_res=residual**2.
* Getting rid of superfluous variables (dependent and residuals).
SAVE OUTFILE='c:\windows\temp\tempdat_.sav'
   /keep=sq_res !5.
GET FILE='c:\windows\temp\tempdat_.sav'.
EXECUTE.
* Vectors for all new predictor variables.
VECTOR v=!5 /sq(!1) /cp(!2).
* Squares of all predictors.
LOOP #i=1 to !1.
COMPUTE sq(#i)=v(#i)**2.
END LOOP.
* Cross-products of all predictors.
* Modification of a routine by Ray Levesque.
COMPUTE #idx=1.
LOOP #cnt1=1 TO !1-1.
LOOP #cnt2=#cnt1+1 TO !1.
COMPUTE cp(#idx)=v(#cnt1)*v(#cnt2).
COMPUTE #idx=#idx+1.
END LOOP.
END LOOP.
EXECUTE.

* >>>> White's test <<<< *.
* Regression of sq_res on all predictors.
REGRESSION /VARIABLES=ALL
  /STATISTICS R
  /DEPENDENT sq_res
  /METHOD= ENTER
  /SAVE RESID(residual) .
* Final report.
* Routine by Gwilym Pryce (slightly modified).
matrix.
compute p=!3.
get sq_res /variables=sq_res.
get residual /variables=residual.
compute sq_res2=residual&**2.
compute n=nrow(sq_res).
compute rss=msum(sq_res2).
compute ii_1=make(n,n,1).
compute i=ident(n).
compute m0=i-((1/n)*ii_1).
compute tss=transpos(sq_res)*m0*sq_res.
compute regss=tss-rss.
print regss
 /format="f8.4"
 /title="Regression SS".
print rss
 /format="f8.4"
 /title="Residual SS".
print tss
 /format="f8.4"
 /title="Total SS".
compute r_sq=1-(rss/tss).
print r_sq
 /format="f8.4"
 /title="R-squared".
print n
 /format="f4.0"
 /title="Sample size (N)".
print p
 /format="f4.0"
 /title="Number of predictors (P)".
compute wh_test=n*r_sq.
print wh_test
 /format="f8.3"
 /title="White's General Test for Heteroscedasticity"
+ " (CHI-SQUARE df=P)".
compute sig=1-chicdf(wh_test,p).
print sig
 /format="f8.4"
 /title="Significance level of Chi-square df=P (H0:"
+ "homoscedasticity)".
end matrix.

!ENDDEFINE.
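For reference, the R-square computed in the MATRIX block is 1 - RSS/TSS, where TSS = y'M0y with the centering matrix M0 = I - (1/n)11' is just the sum of squared deviations of y from its mean. A minimal Python sketch (illustrative only, plain lists):

```python
def r_squared(y, resid):
    """1 - RSS/TSS as in the MATRIX block: y is the dependent variable of
    the auxiliary regression (the squared residuals), resid its residuals."""
    n = len(y)
    rss = sum(e * e for e in resid)           # msum(sq_res2)
    mean = sum(y) / n
    tss = sum((v - mean) ** 2 for v in y)     # transpos(sq_res)*m0*sq_res
    return 1.0 - rss / tss
```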

* Sample data Nr. 1: continuous predictors *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
-  LOOP #J = 1 TO 5.
-   COMPUTE x(#J) = NORMAL(1).
-  END LOOP.
-  END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.
* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.
* MACRO call: there are 4 predictors and therefore 6 cross-products and 14 regressors.
whitest 4 6 14 y x2 TO x5.

* Sample data Nr. 2: one binary predictor *.
INPUT PROGRAM.
- VECTOR x(5).
- LOOP #I = 1 TO 100.
-  LOOP #J = 1 TO 5.
-   COMPUTE x(#J) = NORMAL(1).
-  END LOOP.
-  END CASE.
- END LOOP.
- END FILE.
END INPUT PROGRAM.
execute.
RECODE x2  (Lowest thru 0=0)  (0 thru Highest=1)  .
EXECUTE .

* x1 is the dependent and x2 TO x5 the predictors.
rename variables x1=y.
execute.
* MACRO call: as before, 4 predictors and 6 cross-products, but ONLY 13 regressors.
whitest 4 6 13 y x2 TO x5.

* As you can see from the output, X2 is not included in the second model: being binary, it is perfectly collinear with its own square.