* --------------------------------------------------------------------------- .
* File:    double entry check.sps .
* Author:  Bruce Weaver .
* Date:    19-Dec-2001 .
* Notes:   Use MATCH FILES to check for double entry .
* --------------------------------------------------------------------------- .

* First create a file with data for ID numbers 6-12 and save it.

DATA LIST LIST /ID (f2.0) a (f2.0) b (f2.0) c (f2.0) Y (f2.0) .
BEGIN DATA.
6 2 1 1 13
6 2 1 2 12
6 2 2 1 8
6 2 2 2 11
6 2 3 1 9
6 2 3 2 8
7 3 1 1 16
7 3 1 2 17
7 3 2 1 12
7 3 2 2 14
7 3 3 1 11
7 3 3 2 12
8 3 1 1 16
8 3 1 2 13
8 3 2 1 10
8 3 2 2 15
8 3 3 1 12
8 3 3 2 16
9 3 1 1 18
9 3 1 2 21
9 3 2 1 17
9 3 2 2 18
9 3 3 1 22
9 3 3 2 23
10 4 1 1 19
10 4 1 2 17
10 4 2 1 20
10 4 2 2 17
10 4 3 1 14
10 4 3 2 16
11 4 1 1 10
11 4 1 2 11
11 4 2 1 14
11 4 2 2 12
11 4 3 1 13
11 4 3 2 16
12 4 1 1 21
12 4 1 2 18
12 4 2 1 22
12 4 2 2 12
12 4 3 1 15
12 4 3 2 17
END DATA.

compute fnum=2.
exe.
formats fnum (f2.0).
var lab fnum 'File number'.
save outfile = 'c:\file2.sav' /compressed.

* Now create another file with data for ID numbers 1-7 .
* Note that both files will have data for IDs 6 and 7 .

DATA LIST LIST /ID (f2.0) a (f2.0) b (f2.0) c (f2.0) Y (f2.0) .
BEGIN DATA.
1 1 1 1 14
1 1 1 2 12
1 1 2 1 17
1 1 2 2 20
1 1 3 1 12
1 1 3 2 10
2 1 1 1 11
2 1 1 2 13
2 1 2 1 12
2 1 2 2 16
2 1 3 1 11
2 1 3 2 13
3 1 1 1 12
3 1 1 2 13
3 1 2 1 15
3 1 2 2 18
3 1 3 1 15
3 1 3 2 13
4 2 1 1 11
4 2 1 2 16
4 2 2 1 13
4 2 2 2 14
4 2 3 1 15
4 2 3 2 17
5 2 1 1 9
5 2 1 2 11
5 2 2 1 13
5 2 2 2 14
5 2 3 1 12
5 2 3 2 8
6 2 1 1 13
6 2 1 2 12
6 2 2 1 8
6 2 2 2 11
6 2 3 1 9
6 2 3 2 8
7 3 1 1 16
7 3 1 2 17
7 3 2 1 12
7 3 2 2 14
7 3 3 1 11
7 3 3 2 12
END DATA.

compute fnum=1.
exe.
formats fnum (f2.0).
var lab fnum 'File number'.

* Stack the two files .

ADD FILES /FILE=* /FILE='c:\file2.sav'.
EXECUTE.

* Check for duplicate records by using MATCH FILES .
* First need to sort by ID etc.

sort cases by id a b c y .
MATCH FILES file=* /BY id a b c y /FIRST = unique .
EXEC.
freq unique.

* There are 12 records that are not unique.
* Data for IDs 6 and 7 were in both files, so list these.

use all.
compute f = any(id,6,7).
filter by f.
exe.
list id to unique.

* Delete the duplicates.

use all.
filter off.
select if unique.
exe.
freq unique.

* Now all records are unique.
* --------------------------------------------------------------------------- .
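
* --------------------------------------------------------------------------- .
* Optional cross-check (a sketch, not part of the original syntax above):     .
* after the duplicates have been deleted, the same test can be repeated       .
* with LAG() instead of MATCH FILES.  A case is flagged when it matches the   .
* previous case on all BY variables.  The variable name "dup" is made up      .
* for this example; every remaining case should come out with dup = 0.        .
sort cases by id a b c y .
compute dup = 0.
if (id = lag(id) and a = lag(a) and b = lag(b)
    and c = lag(c) and y = lag(y)) dup = 1.
exe.
freq dup.
* --------------------------------------------------------------------------- .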