* For confidentiality purposes, it is often required that certain information not be shown in * Pivot Tables when the number of cases within a given cell is less than 5 for example. * Manually (or programmatically) deleting these cells from the Pivot Table itself is not trivial because * it is NOT sufficient to simply delete cells with less than 5 cases. * Example of the problem: * say for a given age band we have 3 cases for males and 7 cases for females and assume * results for the both genders combined are also included in the PT. Simply deleting the cell with 3 cases * does NOT provide confidentiality because the original content of that cell CAN be derived from * results in the female and total cells. * The solution adopted here is to * 1) determine which cases fall in cells with less than 5 cases and * 2) set the relevant variable to SYSMIS. * This way, in the above example, the information for both gender would equal the information for females * and it would not be possible to determine the original content of the male cell using results from the PT. GET FILE='c:\program files\spss\employee data.sav'. COMPUTE age = DATEDIFF( DATE.DMY(1,1,1990), bdate, "days") / 365.25 . RECODE age (LO THRU 20=1) (LO THRU 30=2)(LO THRU 40=3)(LO THRU 50=4)(LO THRU 60=5)(LO THRU HI=6) INTO ageband. * Custom Tables. CTABLES /VLABELS VARIABLES=salary ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES CAPTION="All salaries are shown". * Want to hide salary info if there are less than 5 persons in a cell. SORT CASES BY ageband gender. AGGREGATE /BREAK=ageband gender /nmissing=NMISS(salary) /totaln=N. COMPUTE nbValid = totaln - nmissing. COMPUTE salary2=salary. APPLY DICTIONARY FROM * /SOURCE VARIABLES = salary /TARGET VARIABLES = salary2 /VARINFO ALL . IF nbValid<5 salary2=\$SYSMIS. * Custom Tables. CTABLES /VLABELS VARIABLES=salary2 ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary2 [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES CAPTION="Salary of cells with less than 5 cases are not considered (for confidentiality purposes)". **************************. * Version using a macro. **************************. *///////////////////////. DEFINE !hide(minnb=!TOKENS(1) /var=!TOKENS(1) /newvar=!TOKENS(1) /cell=!CMDEND) SORT CASES BY !cell . AGGREGATE /PRESORTED /BREAK=!cell /nmissing=NMISS(!var) /totaln=N. COMPUTE nbValid = totaln - nmissing. COMPUTE !newvar=!var. APPLY DICTIONARY FROM * /SOURCE VARIABLES = !var /TARGET VARIABLES = !newvar /VARINFO ALL . IF nbValid< !minnb !newvar=\$SYSMIS. !ENDDEFINE. *///////////////////////. GET FILE='c:\program files\spss\employee data.sav'. COMPUTE age= RND(DATEDIFF(DATE.DMY(1,1,1990), bdate, "days") / 365.25) . RECODE age (LOW THRU 20=1) (LOW THRU 30=2)(LOW THRU 40=3)(LOW THRU 50=4)(LOW THRU 60=5)(LOW THRU HI=6) INTO ageband. * Custom Tables. CTABLES /VLABELS VARIABLES=salary ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES TITLE="Average salaries by ageband and gender" CAPTION="All salaries are shown". !hide minnb=5 var=salary newvar=salary2 cell=ageband gender. * Custom Tables. CTABLES /VLABELS VARIABLES=salary2 ageband gender DISPLAY=DEFAULT /TABLE ageband [C] BY gender > salary2 [COUNT VALIDN F40.0, MEAN] /CATEGORIES VARIABLES=ageband gender ORDER=A KEY=VALUE EMPTY=EXCLUDE TOTAL=YES /TITLES TITLE="Average salaries by ageband and gender" CAPTION="Salary of cells with less than 5 cases are not considered (for confidentiality purposes)".