Count outliers
* Question was: How can I show the number of outliers in a box plot (when many points are equal).
* Answer by rlevesque@videotron.ca.

INPUT PROGRAM.
SET SEED=98765413.
LOOP cnt=1 to 200.
COMPUTE var1=RV.NORMAL(100,10).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

* Define 4 outliers having similar values.
IF $CASENUM<5 var1=150+$CASENUM/5.

* Round the values of var1 to the nearest 2.5.
* Rounding is necessary because if outliers are close to each other, the labels indicating
* the number of values will overlap and will not be readable. Rounding will vary with
* the size of the variables; for large variables rounding might be to the nearest 100 or 1,000.
COMPUTE var1=RND(var1/2.5)*2.5.

AUTORECODE var1 /INTO var2.
SORT CASES BY var2 (A) .

DO IF $casenum =1.
COMPUTE varlabel=1.
ELSE IF var2=lag(var2).
COMPUTE varlabel=lag(varlabel)+1.
ELSE.
COMPUTE varlabel=1.
END IF.

SORT CASES BY var2 (D) varlabel(D) .

DO IF $casenum =1.
COMPUTE varlab=varlabel.
ELSE IF (var2<>lag(var2)) & (varlabel>1).
COMPUTE varlab=varlabel.
END IF.
FORMATS varlab(F8.0).

EXAMINE
  VARIABLES=var1
  /ID= varlab
  /PLOT BOXPLOT STEMLEAF
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
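Before drawing the plot, it can help to look at which cases actually carry a count label. The lines below are a minimal sketch and not part of the original answer; they assume the syntax above has already been run, so that var1 and varlab exist. Because of the rounding to the nearest 2.5, the four artificial outliers (150.2, 150.4, 150.6 and 150.8) all become 150, so they are counted as one group of tied values.

```spss
* Hypothetical check, not in the original answer. It lists only the cases that carry a count label.
TEMPORARY.
SELECT IF (NOT MISSING(varlab)).
LIST VARIABLES=var1 varlab.
```

TEMPORARY limits the selection to the LIST command, so the working file is unchanged when EXAMINE runs afterwards.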