Replace Values Higher than n (Updated)
This is a version of the syntax "Replace values higher than n by the mean of the other values", adapted to the capabilities of modern SPSS versions (tested with IBM SPSS Statistics 24). The code is much simpler because we no longer need to store aggregated values in an external file and match them with the original data: the AGGREGATE command simply saves the aggregates to the original file as a new variable. Only four steps remain: copy the 'valid' values (those not exceeding the threshold) into a helper variable, aggregate over them, apply the condition with the replacement value, and (optionally) delete the extra variables.
* Encoding: UTF-8.

* Topic: replacing values greater than a specified value by the mean of the remaining cases.

* Data example.
DATA LIST LIST /hrt clientid linenum.
BEGIN DATA
485 1002 1
280 1002 2
100 1002 3
420 1002 4
410 1002 5
510 1002 6
END DATA.
LIST.

* Solution.
IF hrt<=480 hrt_norm = hrt.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /meanhrt = MEAN(hrt_norm).
IF hrt>480 hrt=meanhrt.
EXECUTE.

* Delete temporary variables.
DELETE VARIABLES hrt_norm meanhrt.
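As a quick sanity check on the example data: four values do not exceed 480 (280, 100, 420 and 410), so their mean is (280 + 100 + 420 + 410) / 4 = 302.5, and this is the value that should replace 485 and 510. A minimal way to confirm it, assuming the syntax above has just been run on the example data:

```spss
* After the replacement, cases 1 and 6 should show hrt = 302.5.
LIST hrt clientid linenum.
```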
You may notice that there is no BREAK subcommand in AGGREGATE: in modern SPSS versions this subcommand can be omitted when you aggregate over the whole file, and there is no need to create a dummy nobreak variable. You may also notice the relatively new DELETE VARIABLES command for clearing temporary variables. ADD FILES with the KEEP subcommand (as in the original syntax) is still preferable in general, because it lets you not only keep the needed variables but also reorder them at the same time.
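The two remarks above can be combined into one variant of the solution. This is only a sketch under assumptions that are not part of the original syntax: the mean is computed separately for each clientid (hence an explicit BREAK), and the cleanup uses ADD FILES with KEEP instead of DELETE VARIABLES. It assumes the example data are the active dataset and hrt has not yet been modified.

```spss
* Sketch: per-client replacement, cleanup with ADD FILES instead of DELETE VARIABLES.
* Assumption: means are wanted within each clientid rather than over the whole file.
IF hrt<=480 hrt_norm = hrt.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=clientid
  /meanhrt = MEAN(hrt_norm).
IF hrt>480 hrt=meanhrt.
* KEEP drops hrt_norm and meanhrt and puts the identifier variables first.
ADD FILES /FILE=* /KEEP=clientid linenum hrt.
EXECUTE.
```

Because the example data contain a single clientid (1002), the result here is the same as for the whole-file version. The difference would only show up with several clients, and the variable order in the file changes to clientid, linenum, hrt.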