Do All-Subsets regressions
* (Q) How can I do an all-subsets regression using SPSS? Whereas a stepwise
* regression yields one final equation, the goal of all-subsets regression is
* to run a regression for every possible combination of the predictors and
* then let the user (rather than the stepwise procedure) choose the "best"
* equation.
* So, if one had 5 independent variables, the all-subsets regression would
* first perform 5 regressions, one of y on each predictor, and then work up
* towards one final regression with all the predictors. The output can be any
* number of things, such as the r^2 for each equation, but I would rather use
* the adjusted predicted variables that SPSS can already create.
* (A) by rlevesque@videotron.ca 2001/08/30; SPSS Dedicated web site
* http://pages.infinit.net/rlevesqu/index.htm.
* Note: This is a complex syntax but it is very easy to use. See the example
* at the end.
* Two macros are defined by the syntax (!combine and !regres). The first of
* these macros (!combine) writes a syntax file which subsequently writes a
* text file. The text file is read into the data editor and used to write a
* syntax file consisting of a series of macro calls; each macro call does one
* of the regressions.

SET MPRINT=no.

*/////////////////////////////.
DEFINE !combine (n=!TOKENS(1) /m=!TOKENS(1) /dep=!TOKENS(1) /indepv=!CMDEND).
/* Find all combinations of n items out of m */
/* August 30,2001 rlevesque@videotron.ca */
!DO !thisn=1 !TO !n
NEW FILE.
INPUT PROGRAM.
LOOP i=1 TO !thisn.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
!LET !list=!NULL
!DO !cnt=1 !TO !thisn
!LET !list=!CONCAT(!list," ","j",!cnt)
!DOEND
COMPUTE n=!thisn.
* Calculate variable names for LOOP of the next WRITE command *.
STRING cntname cntbeg(A8).
COMPUTE cntname=CONCAT('j',LTRIM(STRING(i,F8.0))).
* Calculate first parameter for the LOOP of the next WRITE command *.
DO IF i=1.
COMPUTE cntbeg="1".
ELSE.
COMPUTE cntbeg=CONCAT('j',LTRIM(STRING(i-1,F8.0))," + 1").
END IF.
* Calculate second parameter for the LOOP of the next WRITE command *.
COMPUTE k=!m - !thisn + i.
FORMATS i k n(F8.0).
STRING quote(A1) strlist(A255).
COMPUTE quote='"'.
COMPUTE strlist=!QUOTE(!list).
* Write the syntax file which will store all the combinations in the list.txt file *.
WRITE OUTFILE "c:\\temp\\macro.sps" /"LOOP "cntname"="cntbeg" TO "k".".
DO IF i=!thisn.
+ WRITE OUTFILE "c:\\temp\\macro.sps" /"WRITE OUTFILE "quote"c:\\temp\\list.txt"quote "/" strlist "." .
+ LOOP cnt=1 TO !thisn.
+ WRITE OUTFILE "c:\\temp\\macro.sps" /"END LOOP.".
+ END LOOP.
+ WRITE OUTFILE "c:\\temp\\macro.sps" /"EXECUTE.".
END IF.
EXECUTE.
INCLUDE FILE="c:\\temp\\macro.sps".
/* Convert data from list.txt to the corresponding sav file */.
DATA LIST FILE='c:\\temp\\list.txt' LIST /!list.
SAVE OUTFILE=!QUOTE(!CONCAT('c:\\temp\\list',!thisn,'.sav')).
!DOEND
/* Combine all the sav files */.
GET FILE='c:\\temp\\list1.sav'.
!DO !nb=2 !TO !n.
ADD FILES FILE=* /FILE=!QUOTE(!CONCAT('c:\\temp\\list',!nb,'.sav')).
!DOEND
/* Eliminate duplicates */.
SORT CASES BY ALL.
MATCH FILES FILE=* /BY=ALL /FIRST=first.
SELECT IF first.
SAVE OUTFILE='c:\\temp\\all combinations.sav'.
/* Find name of last variable */
!DO !var !IN (!indepv)
!LET !lastone=!var
!DOEND
VECTOR vnames(!m A8).
!LET !cnt=!BLANK(1)
/* Create variables containing the names of the indep variables */
!DO !var !IN (!indepv)
COMPUTE vnames(!LEN(!cnt))=!QUOTE(!var).
!LET !cnt=!CONCAT(!cnt,!BLANK(1))
!DOEND
/* Construct the string containing the list of indep var of each regression */.
STRING dep (A8) indepv(A255).
COMPUTE dep=!QUOTE(!dep).
VECTOR j=j1 TO !CONCAT('j',!n) /ind=vnames1 TO !CONCAT('vnames',!m).
COMPUTE nvar=NVALID(j1 TO !CONCAT('j',!n)).
LOOP cnt=1 TO nvar.
COMPUTE indepv=CONCAT(RTRIM(indepv)," ",vnames(j(cnt))).
END LOOP.
* Write the syntax file which will run all regressions */.
WRITE OUTFILE='c:\\temp\\syntax to do all regressions.sps' /"!regres dep=" dep "indepv=" indepv ".".
EXECUTE.
!ENDDEFINE.
*////////////////////////////.

*////////////////////////////.
DEFINE !regres (dep=!TOKENS(1) /indepv=!CMDEND)
REGRESSION
 /MISSING LISTWISE
 /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10)
 /NOORIGIN
 /DEPENDENT !dep
 /METHOD=ENTER !indepv
 /SAVE ZPRED .
!ENDDEFINE.
*////////////////////////////.

*************************.
* EXAMPLE OF USE.
*************************.

* You have to make any changes you require in the above regression macro *.
SET MPRINT=yes.

****** Run the following macro to do the preparatory work.
!combine n=5 m=5 dep=salary indepv=educ jobcat jobtime prevexp minority.

****** Load your data file and do the regressions.
GET FILE='c:\\Program Files\\SPSS\\Employee data.sav'.
INCLUDE FILE='c:\\temp\\syntax to do all regressions.sps'.
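
As a cross-check outside SPSS, the set of macro calls that !combine is meant to generate can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original syntax: the function name regres_calls and its max_size argument are hypothetical, with max_size standing in for the n argument of !combine (largest subset size) and the length of the predictor list standing in for m. The order and spacing of the calls that SPSS actually writes may differ.

```python
from itertools import combinations


def regres_calls(dep, predictors, max_size=None):
    """Return one '!regres' macro call per non-empty predictor subset.

    Hypothetical helper: max_size mirrors the n argument of !combine
    (largest subset size); len(predictors) mirrors m.
    """
    if max_size is None:
        max_size = len(predictors)
    calls = []
    # Subsets of size 1, then 2, ... up to max_size, as in the !DO !thisn loop.
    for size in range(1, max_size + 1):
        for subset in combinations(predictors, size):
            calls.append(f"!regres dep={dep} indepv={' '.join(subset)}.")
    return calls


predictors = ["educ", "jobcat", "jobtime", "prevexp", "minority"]
calls = regres_calls("salary", predictors)
print(len(calls))   # 31, i.e. 2**5 - 1 regressions for 5 predictors
print(calls[0])     # !regres dep=salary indepv=educ.
print(calls[-1])    # !regres dep=salary indepv=educ jobcat jobtime prevexp minority.
```

With 5 predictors the full run therefore performs 31 regressions; limiting the subset size to 2 (the analogue of n=2 m=5) would give 5 + 10 = 15.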