Count unique occurrences of a multiple response
Chris,

Here are a few ways of approaching the problem. They are fundamentally different in their approaches. I am sure there are other ways as well, but these come immediately to mind.

Regards,
David Marso
SPSS Consulting Services
--------
**************************************************
* A method which compares each variable to each
* preceding variable, clobbers the duplicates and
* then tallies the surviving instances.
**************************************************.
data list free/ id prog1 prog2 prog3 prog4 prog5.
begin data
001 345 345 876 509 345
002 . 220 220 . 350
end data.

* Copy the array * .
VECTOR P=PROG1 TO PROG5 / #TMP(5).
LOOP #I=1 to 5.
+ compute #TMP(#I)=P(#I).
END LOOP.

* Compare to preceding variables *.
LOOP #I=2 to 5.
+ LOOP #J=1 TO #I-1.
+ IF #TMP(#I)=#TMP(#J) #TMP(#I)=$SYSMIS.
+ END LOOP IF MISSING(#TMP(#I)).
END LOOP.
COMPUTE N=NVALID(#TMP1 TO #TMP5).
EXECUTE.

**************************************************
* A method which restructures the data file into *
* multiple cases per record and then aggregates *.
**************************************************.
data list free/ id prog1 prog2 prog3 prog4 prog5.
begin data
001 345 345 876 509 345
002 . 220 220 . 350
end data.

*save file for later merge *.
SAVE OUTFILE 'TMP'.
VECTOR PROG=PROG1 TO PROG5.
loop P=1 to 5.
compute program=PROG(P).
DO IF NOT (MISSING(PROGRAM)).
XSAVE OUTFILE 'PROG' / KEEP ID PROGRAM.
END IF.
END LOOP.
EXECUTE.
GET FILE 'PROG' .
AGGREGATE OUTFILE * / BREAK ID PROGRAM / N=N.
AGGREGATE OUTFILE * / BREAK ID / N=N.
MATCH FILES FILE 'TMP' / FILE * / BY ID.
EXECUTE.

Chris Conway wrote:
>
> I'm posting this for a colleague:
>
> I'm working on a student registration data file. Each unique student
> record has information on up to five different registrations. For each
> of the five registrations, I have a variable indicating what program the
> student was registered in.
>
> The record layout therefore is:
>
> ID prog1 prog2 prog3 prog4 prog5
>
> ID and Prog1 - prog5 are numeric. There are missing data fields. What I
> want to do is determine how many unique programs the student has
> registered in?
>
> For example, for the following cases, the answers are "3 unique
> programs" and "2 unique programs" respectively:
>
>   001     345     345     876     509     345
>   002             220     220             350
>  (id)  (prog1) (prog2) (prog3) (prog4) (prog5)
>
> Any help would be appreciated.
>
> Thanks, Chris Conway
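
For readers on a later SPSS release, the restructure-and-aggregate method in the reply can probably be written without the temporary files. The following is only a sketch, not part of the original reply. It assumes a release that provides the VARSTOCASES command, and the variable names program, copies and n are made up for illustration.

```spss
* Sketch only: assumes VARSTOCASES is available in your release.
* The names program, copies and n are illustrative, not from the original post.
DATA LIST FREE / id prog1 prog2 prog3 prog4 prog5.
BEGIN DATA
001 345 345 876 509 345
002 . 220 220 . 350
END DATA.

* One case per non-missing registration; NULL=DROP discards the missing ones.
VARSTOCASES
  /MAKE program FROM prog1 prog2 prog3 prog4 prog5
  /KEEP=id
  /NULL=DROP.

* Collapse repeated programs within a student, then count what is left.
AGGREGATE /OUTFILE=* /BREAK=id program /copies=N.
AGGREGATE /OUTFILE=* /BREAK=id /n=N.
LIST.
```

As with the XSAVE version, a student whose five program fields are all missing drops out of the aggregated file entirely. To keep every student, the result would still need to be merged back onto the original file with MATCH FILES.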