Remove unused variables from many files
* QUESTION: Given files (year1972, year1973, year1974) with unused variables, how can I delete all unused variables?
* Note: A variable with a constant value for all cases is considered an unused variable.

* SOLUTION.
* (was tested with version 10.07).
* Raynald Levesque 2000/09/23.

* Assumptions: the number of variables is the same in all files.
* Variables must all be numeric.

* Comments: This syntax is not completely general.
* You will have to replace all paths, then.
* enter the total number of variables in the file (see *### (note 2) below).
* enter the names of the first and last variables in the file (see *### (note 3) below).

*****************************************.
* 3 data files are created for illustration purposes.
*****************************************.

* The first data file contains a missing value (for variable d) to show that this variable is also deleted by the syntax.
* The missing value generates a warning message which is to be ignored.
DATA LIST LIST /a b c d z.
BEGIN DATA
1 2 2 -1 2
1 1 2 . 1
1 2 2 -1 3
1 1 2 -1 5
END DATA.
LIST.
* We will want to keep only variables b and z.
SAVE OUTFILE='c:\\data\\year1972.sav'.

DATA LIST LIST /a b c d z.
BEGIN DATA
2 2 2 4 2
2 1 2 5 2
2 2 2 -1 2
2 1 2 -1 2
END DATA.
LIST.
* We will want to keep only variables b and d.
SAVE OUTFILE='c:\\data\\year1973.sav'.

DATA LIST LIST /a b c d z.
BEGIN DATA
3 2 2 4 5
3 2 2 4 2
3 2 2 -1 2
3 2 2 -1 2
END DATA.
LIST.
* We will want to keep only variables d and z.
SAVE OUTFILE='c:\\data\\year1974.sav'.

*****************************************.
* This is the end of preparatory work.
*****************************************.

SET MPRINT=yes.

* Define a macro to do one file (year) at a time.
*////////////////////////////////////////////////.
DEFINE !delete(year=!TOKENS(1)).
GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).

* Save the variable names in a file.
N OF CASES 1.
FLIP.
COMPUTE varnum=$casenum.
SAVE OUTFILE=!QUOTE(!CONCAT('c:\\data\\vn',!year,'.sav')).
CACHE.
EXE.

GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).
*### (note 2) replace 5 by number of variables in original file.
*### (note 3) replace a and z by name of first and last variable in file.
RENAME VARIABLE (a TO z = var1 TO var5).
COMPUTE dum=1.

* If the SD of a variable is zero, it means all cases have the same value.
* Since the variables have been renamed with consecutive names (var1, var2, etc.),
  the AGGREGATE command does not require us to list each variable name.
*### (note 2) replace 5 by number of variables in original file.
AGGREGATE /OUTFILE=* /BREAK=dum
  /var1 TO var5 = SD(var1 TO var5).
FLIP.

* Get rid of the dummy case.
SELECT IF SUBSTR(case_lbl,1,3)<>"DUM".

* Retrieve the original variable names.
MATCH FILES FILE=* /DROP=case_lbl.
COMPUTE varnum=$casenum.
EXECUTE.

* Keep only the names of unused variables.
SELECT IF var001=0.
MATCH FILES /FILE=*
  /TABLE=!QUOTE(!CONCAT('c:\\data\\vn',!year,'.sav'))
  /RENAME (var001 = d0)
  /BY varnum
  /DROP= d0.

* Write a syntax file to delete these variables.
WRITE OUTFILE='c:\\data\\delete vars.sps' /"MATCH FILES FILE=* /DROP=" case_lbl ".".
EXE.

* Apply syntax to original file.
GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).
INCLUDE 'c:\\data\\delete vars.sps'.
SAVE OUTFILE=!QUOTE(!CONCAT('c:\\data\\y',!year,'.sav')).
!ENDDEFINE.
*////////////////////////////////////////////////.

*### (note 4) in next macro, replace 1974 by the last year you have.
*////////////////////////////////////////////////.
DEFINE !main().
!DO !y=1972 !TO 1974.
!delete year=!y.
!DOEND.
!ENDDEFINE.
*////////////////////////////////////////////////.

* Call main macro to do the entire job.
!main.
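
Notes 2 and 3 in the syntax above ask you to edit the macro body by hand whenever the variable count or the first and last variable names change. One possible way around this is to pass those values as macro arguments. The following is an untested sketch, not part of the original solution: the macro name !delete2 and the arguments nbvar, vfirst and vlast are hypothetical names introduced here for illustration, and the paths are the same ones used above.

```spss
* Sketch only: !delete2 and its arguments nbvar, vfirst and vlast are hypothetical names.
DEFINE !delete2(year=!TOKENS(1) /nbvar=!TOKENS(1) /vfirst=!TOKENS(1) /vlast=!TOKENS(1)).
GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).

* Save the variable names in a file.
N OF CASES 1.
FLIP.
COMPUTE varnum=$casenum.
SAVE OUTFILE=!QUOTE(!CONCAT('c:\\data\\vn',!year,'.sav')).

GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).
* Rename to consecutive names, with the last index taken from the nbvar argument.
RENAME VARIABLES (!vfirst TO !vlast = var1 TO !CONCAT('var',!nbvar)).
COMPUTE dum=1.

* A standard deviation of zero flags a constant (unused) variable.
AGGREGATE /OUTFILE=* /BREAK=dum
  /var1 TO !CONCAT('var',!nbvar) = SD(var1 TO !CONCAT('var',!nbvar)).
FLIP.
SELECT IF SUBSTR(case_lbl,1,3)<>"DUM".
MATCH FILES FILE=* /DROP=case_lbl.
COMPUTE varnum=$casenum.
EXECUTE.

* Keep only the unused variables and look up their original names.
SELECT IF var001=0.
MATCH FILES /FILE=*
  /TABLE=!QUOTE(!CONCAT('c:\\data\\vn',!year,'.sav'))
  /RENAME (var001 = d0)
  /BY varnum
  /DROP= d0.
WRITE OUTFILE='c:\\data\\delete vars.sps' /"MATCH FILES FILE=* /DROP=" case_lbl ".".
EXECUTE.

* Apply the generated syntax to the original file.
GET FILE=!QUOTE(!CONCAT('c:\\data\\year',!year,'.sav')).
INCLUDE 'c:\\data\\delete vars.sps'.
SAVE OUTFILE=!QUOTE(!CONCAT('c:\\data\\y',!year,'.sav')).
!ENDDEFINE.

* Example call for one of the illustration files (5 variables, a through z).
!delete2 year=1972 nbvar=5 vfirst=a vlast=z.
```

With this variant, only the call line changes from one set of files to the next. The stated assumptions still apply: all variables must be numeric, and the variables between vfirst and vlast must number exactly nbvar.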