Find duplicates
* How to find or delete duplicates.
* rlevesque@videotron.ca.

* Define dummy data for illustration purposes.
DATA LIST LIST /id(F8) var1(F8) var2(F8).
BEGIN DATA.
2 1 1
2 1 1
5 3 6
4 4 7
3 4 8
3 4 8
3 4 8
END DATA.
SORT CASES BY id.

* Find duplicates (variable n contains the number of times each record occurs).
* Apart from the variable n, temp.sav holds one copy of each distinct record.
AGGREGATE OUTFILE='temp.sav'
  /BREAK=ALL
  /N=n.

* Add the number of occurrences to each record (if that's what you need).
MATCH FILES /FILE=*
  /TABLE='temp.sav'
  /RENAME (var1 var2 = d0 d1)
  /BY id
  /DROP= d0 d1.
EXECUTE.

* If you want to simply exclude duplicates, do the following (starting from raw data).
* The file must already be sorted by id, as done above.
MATCH FILES /FILE=* /FIRST = top /BY id.
SELECT IF top.
EXECUTE.
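The exclusion step above treats two cases as duplicates whenever they share the same id. If duplicates should instead mean identical values on every variable, one possible variation is to sort and match on all of the variables. This is a minimal sketch rather than part of the original listing; it assumes the id, var1 and var2 variables from the dummy data, and the flag name first_rec is purely illustrative.

```spss
* Sketch: keep one copy of each fully identical record.
* Assumes the dummy data defined above. first_rec is a hypothetical flag name.
SORT CASES BY id var1 var2.
MATCH FILES /FILE=* /FIRST = first_rec /BY id var1 var2.
SELECT IF first_rec.
EXECUTE.
```

With the dummy data both approaches keep the same four cases, because every repeated id also repeats var1 and var2. They would differ only if two cases shared an id but had different values elsewhere.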