Hide Cell With Less Than N Persons
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | * For confidentiality purposes, it is often required that certain information not be shown in * Pivot Tables when the number of cases within a given cell is less than 5 for example. * Manually (or programmatically) deleting these cells from the Pivot Table itself is not trivial because * it is NOT sufficient to simply delete cells with less than 5 cases. * Example of the problem: * say for a given age band we have 3 cases for males and 7 cases for females and assume * results for the both genders combined are also included in the PT. Simply deleting the cell with 3 cases * does NOT provide confidentiality because the original content of that cell CAN be derived from * results in the female and total cells. * The solution adopted here is to * 1) determine which cases fall in cells with less than 5 cases and * 2) set the relevant variable to SYSMIS. * This way, in the above example, the information for both gender would equal the information for females * and it would not be possible to determine the original content of the male cell using results from the PT. GET FILE='c:\program files\spss\employee data.sav'. COMPUTE age = DATEDIFF( DATE.DMY(1,1,1990), bdate, "days") / 365.25 . RECODE age (LO THRU 20=1) (LO THRU 30=2)(LO THRU 40=3)(LO THRU 50=4)(LO THRU 60=5)(LO THRU HI=6) INTO ageband. * Custom Tables. CTABLES /VLABELS VARIABLES=salary ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES CAPTION="All salaries are shown". * Want to hide salary info if there are less than 5 persons in a cell. SORT CASES BY ageband gender. AGGREGATE /BREAK=ageband gender /nmissing=NMISS(salary) /totaln=N. COMPUTE nbValid = totaln - nmissing. COMPUTE salary2=salary. APPLY DICTIONARY FROM * /SOURCE VARIABLES = salary /TARGET VARIABLES = salary2 /VARINFO ALL . IF nbValid<5 salary2=$SYSMIS. * Custom Tables. CTABLES /VLABELS VARIABLES=salary2 ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary2 [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES CAPTION="Salary of cells with less than 5 cases are not considered (for confidentiality purposes)". **************************. * Version using a macro. **************************. *///////////////////////. DEFINE !hide(minnb=!TOKENS(1) /var=!TOKENS(1) /newvar=!TOKENS(1) /cell=!CMDEND) SORT CASES BY !cell . AGGREGATE /PRESORTED /BREAK=!cell /nmissing=NMISS(!var) /totaln=N. COMPUTE nbValid = totaln - nmissing. COMPUTE !newvar=!var. APPLY DICTIONARY FROM * /SOURCE VARIABLES = !var /TARGET VARIABLES = !newvar /VARINFO ALL . IF nbValid< !minnb !newvar=$SYSMIS. !ENDDEFINE. *///////////////////////. GET FILE='c:\program files\spss\employee data.sav'. COMPUTE age= RND(DATEDIFF(DATE.DMY(1,1,1990), bdate, "days") / 365.25) . RECODE age (LOW THRU 20=1) (LOW THRU 30=2)(LOW THRU 40=3)(LOW THRU 50=4)(LOW THRU 60=5)(LOW THRU HI=6) INTO ageband. * Custom Tables. CTABLES /VLABELS VARIABLES=salary ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES TITLE="Average salaries by ageband and gender" CAPTION="All salaries are shown". !hide minnb=5 var=salary newvar=salary2 cell=ageband gender. * Custom Tables. CTABLES /VLABELS VARIABLES=salary2 ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary2 [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES TITLE="Average salaries by ageband and gender" CAPTION="Salary of cells with less than 5 cases are not considered (for confidentiality purposes)". |
Related pages
...