Proportional sampling without replacement
* Proportionate sampling without replacement.
* Actually, sampling is done with replacement but a block is retained only the first time it is selected.
* City blocks have different populations.
* The objective is to select a random sample of blocks where the probability of selecting a given block is proportional to its population.
* The 3 files 'draw file1.sav', 'draw file2.sav' and 'draw file3.sav' contain intermediary results.
* The sole purpose of these files is to help understand and verify what the syntax is doing.
* Raynald Levesque raynald@spsstools.net .

*************************************************************.
*Generate a file of 1000 cases for illustration purposes.
*************************************************************.
NEW FILE.
INPUT PROGRAM.
LOOP block=1 TO 1000.
LEAVE block.
COMPUTE pop=RND(UNIFORM(10000)).
FORMATS block pop (F8.0).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.

SET MPRINT=ON.
*/////////////////////////// BEG OF MACRO//////////////////////////////.
DEFINE !SAMPLE (draw=!TOKENS(1) /keep=!TOKENS(1))
* t_weight is the running total of pop and c_weight is the reverse running total.
COMPUTE case# =$casenum.
CREATE t_weight=CSUM(pop).
SORT CASES BY case#(D).
CREATE c_weight= CSUM(pop).
SORT CASES BY case#.
* On the first case c_weight equals the total population.
IF $casenum=1 #tot_pop=c_weight.
* rva holds the random draws and rvb flags the block whose population interval contains each draw.
VECTOR rva rvb (!draw F8.0).
!DO !cnt = 1 !TO !draw
DO IF $casenum=1.
COMPUTE !CONCAT(rva,!cnt)=UNIFORM(#tot_pop).
LEAVE !CONCAT(rva,!cnt).
COMPUTE !CONCAT(rvb,!cnt)=!CONCAT(rva,!cnt)<t_weight.
ELSE.
COMPUTE !CONCAT(rvb,!cnt)=(!CONCAT(rva,!cnt)<t_weight) AND (!CONCAT(rva,!cnt)>lag(t_weight)).
END IF.
!DOEND
COMPUTE flag=MAX(rvb1 TO !CONCAT('rvb',!draw)).
SAVE OUTFILE='draw file1.sav' /COMPRESSED.
SELECT IF (flag=1).
EXECUTE.
SAVE OUTFILE='draw file2.sav' /COMPRESSED.
VECTOR rvb=rvb1 TO !CONCAT(rvb,!draw).
* The same block may have been selected more than once but only the first occurrence is kept.
COMPUTE done=0.
LOOP cnt=1 TO !draw.
DO IF rvb(cnt)=1 AND done=0.
XSAVE OUTFILE='draw file3.sav' /KEEP block cnt.
COMPUTE done=1.
END IF.
END LOOP.
EXECUTE.
GET FILE='draw file3.sav'.
SORT CASES BY cnt.
SELECT IF ($casenum<=!keep).
EXECUTE.
!ENDDEFINE.
*/////////////////////////// END OF MACRO//////////////////////////////.

* In this example, we want 120 blocks.
* We select 150 blocks (with replacement) and delete the blocks which were selected more than once.
* We then keep the first 120 remaining blocks.
!SAMPLE draw=150 keep=120.
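For readers who want to cross-check the logic outside SPSS, the same idea can be restated as a minimal Python sketch. This is not part of the original syntax: the function name `sample_blocks` and its parameters are hypothetical, and the sketch assumes the populations are non-negative with a positive total. It draws `draw` times with replacement against the cumulative population (the role played by `t_weight` and `#tot_pop` in the macro), retains a block only the first time it is hit, and keeps at most `keep` blocks in order of first selection.

```python
import bisect
import itertools
import random


def sample_blocks(pops, draw, keep, rng=None):
    """Return up to `keep` distinct 0-based block indices, in order of first selection.

    Hypothetical helper: `draw` selections are made with replacement, each with
    probability proportional to the block's population; repeats are dropped.
    """
    rng = rng or random.Random()
    cum = list(itertools.accumulate(pops))  # running total, like t_weight
    total = cum[-1]                         # total population, like #tot_pop
    chosen, seen = [], set()
    for _ in range(draw):
        r = rng.random() * total
        # First block whose cumulative population exceeds r;
        # blocks with zero population are never selected.
        idx = min(bisect.bisect_right(cum, r), len(pops) - 1)
        if idx not in seen:
            seen.add(idx)
            chosen.append(idx)
    return chosen[:keep]


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, for repeatability only
    pops = [rng.randint(0, 10000) for _ in range(1000)]
    picked = sample_blocks(pops, draw=150, keep=120, rng=rng)
    print(len(picked), picked[:5])
```

As in the macro, if the draws yield fewer distinct blocks than `keep`, the result is simply shorter, so `draw` should be chosen comfortably larger than `keep`.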