Get random sample of x% of each stratum
*(Q) How can I select x% of cases in each stratum.
* I do not know the number of cases within each stratum nor the total number of cases.
*(A) Emailed to the person who asked the question in Dec 2001.
* by rlevesque@videotron.ca.

* Construct a dummy data file for illustration purposes.
NEW FILE.
INPUT PROGRAM.
LOOP H=1 TO 10.
* assume the minimum number of cases in a stratum (hn) is 15.
COMPUTE hn=15 + TRUNC(UNIFORM(25)).
LEAVE h hn.
LOOP id=1 TO hn.
+ COMPUTE val=UNIFORM(100).
+ END CASE.
END LOOP.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
FREQ VAR=hn.
SET MPRINT=no.

* Start the job.
* (You would replace the above code by your own data using a GET FILE command).

* Define a macro which will do the job.
*////////////////////////.
DEFINE !sample (hvar=!TOKENS(1) /frac=!TOKENS(1))
/* Calculate the number of cases per stratum*/.
/* (I assume the variable hn2 does not exist)*/.
COMPUTE nobreak=1.
RANK VARIABLES=nobreak BY !hvar /N INTO hn2.

* Calculate the population size (N).
RANK VARIABLES=!hvar BY nobreak /N INTO n.

* Calculate required sample size (ssize) in each stratum (rounded to nearest integer).
COMPUTE ssize=RND(!frac*hn2).
COMPUTE draw=UNIFORM(1).
RANK VARIABLES=draw BY !hvar /RANK INTO rdraw.
EXECUTE.

* Select the sample.
SELECT IF (rdraw <= ssize).
!ENDDEFINE.
*////////////////////////.

* Use the macro.
* Here strata variable is called h and 20% of cases are required.
SET MPRINT=yes.
!sample hvar=h frac=.2.

* Next line is to facilitate checking that results are ok.
SORT CASES BY h rdraw.
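
The final SORT CASES only orders the file for eyeballing. As a rough sketch of a more explicit check (not part of the original job), the syntax below counts the selected cases in each stratum with the same RANK /N idiom the macro uses and compares the count with ssize. The names nsel and ok are hypothetical and introduced here for illustration. The sketch assumes it runs right after the macro call above, so that nobreak, hn2, ssize and rdraw still exist, and that the stratum variable is h.

```spss
* Sketch only: nsel and ok are illustrative names, not part of the original macro.
* Count the cases that survived the SELECT IF in each stratum.
RANK VARIABLES=nobreak BY h /N INTO nsel.

* ok is 1 when the number of selected cases equals the required sample size.
COMPUTE ok=(nsel = ssize).
FREQUENCIES VARIABLES=ok.

* Show stratum size, target size and achieved size side by side.
LIST VARIABLES=h hn2 ssize nsel rdraw.
```

If the uniform draws contain no ties within a stratum, ok should equal 1 for every case. A tie at the cut-off rank could make nsel differ from ssize by a case.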