Replace missing values by the median within each case
************************************************.
* Syntax to replace missing values in a case by the median of the non-missing values.
* Raynald Levesque.
************************************************.

* Warnings are reported when a case has no missing values (because the RMV command cannot do its job).
* These warnings can be ignored.

SET MPRINT=on.

* Define dummy data file for illustration purposes.
NEW FILE.
INPUT PROGRAM.
SET SEED=98675423.
VECTOR q(8F8.2).
LOOP id=1 TO 25.
LEAVE id.
LOOP #cnt=1 TO 8.
* say about 10% of answers should have a missing value.
IF UNIFORM(1)<.9 q(#cnt)=1+TRUNC(UNIFORM(9)).
END LOOP.
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
LIST.

*assume without loss of generality that all data are numeric.
*if this is not the case:
* save the Numeric Columns (NC) you are interested in to a file NC along with the numeric case id.
* delete the NC from your main file and save the resulting file as FILEB.
* apply the following code to the file containing the NC.
* merge the resulting file with FILEB.

*Construct the variable names to be used after the FLIP.
STRING nnames(A8).
COMPUTE cn=$casenum.
COMPUTE nnames=CONCAT('v',LTRIM(STRING(cn,F6.0))).
EXECUTE.

* Define a macro which will replace the missing values.
*/////////////////////////////////////////////////////.
DEFINE !doit (nbvar=!TOKENS(1)).
FLIP /NEWNAMES=nnames.
VECTOR !CONCAT('r(',!nbvar,'F8)').
SUMMARIZE
  /TABLES=v1 TO !CONCAT(v,!nbvar)
  /FORMAT=NOLIST TOTAL
  /TITLE='Print Medians for checking purposes'
  /MISSING=VARIABLE
  /CELLS=MEDIAN .
!DO !cnt=1 !TO !nbvar
RMV !CONCAT(r,!cnt) = MEDIAN(!CONCAT(v,!cnt,',ALL)').
!DOEND
MATCH FILES FILE=* /KEEP=case_lbl r1 TO !CONCAT(r,!nbvar).
FLIP.
*### in next line 8 is the number of variables in initial file.
MATCH FILES FILE=* /KEEP=q1 TO q8 id.
!ENDDEFINE.
*/////////////////////////////////////////////////////.

*### in next line 25 is the number of cases in the original file.
!doit nbvar=25.
LIST.