Match cases on basis of propensity scores
*(Q) When comparing two groups (treated and untreated) it is useful to adjust for
* confounding differences between the groups. For instance, one treatment may
* receive "harder patients" than the other. One way of doing so is to create what
* are called "propensity scores." Essentially the idea is that we compare those who
* are similar to each other (= have similar propensity scores). One way of creating
* these propensity scores is to use logistic regression. I have done all this.
* The three key columns are then:
* A: A column which says whether a patient has received the treatment (0 or 1).
* B: A column with a propensity score (which says how likely it is that a person
*    was in the group receiving treatment given certain other values: sex, gender,
*    history, i.e. the values used in the logistic regression).
* C: A column with the result of the treatment (e.g. absolute or percentage
*    improvement).
* Now, the question is not about the theory or about statistics, it is simply this:
* I want to create a fourth column of "control cases." For each row with a treated
* person, the value in this fourth column should be the improvement for the
* untreated person who has the closest propensity score (is most similar) to that
* treated person.

*(A) Posted to SPSSX-L by rlevesque@videotron.ca on 2001/11/07.
* http://www.spsstools.net

* The solution assumes that the number of cases receiving the treatment is known.
* This restriction could be removed if necessary.

* Create a data file for illustration purposes.
INPUT PROGRAM.
SET SEED=2365847.
LOOP caseid=1 TO 20.
COMPUTE treatm=TRUNC(UNIFORM(1)+.5).
COMPUTE propen=UNIFORM(100).
COMPUTE improv=UNIFORM(100).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

* Put the treated cases first, so that idx=1 to 7 identifies them.
SORT CASES BY treatm(D) propen.
COMPUTE idx=$CASENUM.
SAVE OUTFILE='c:\temp\mydata.sav'.

* Erase the previous temporary result file, if any.
ERASE FILE='c:\temp\results.sav'.
COMPUTE key=1.
SELECT IF (1=0).
* Create an empty data file to receive results.
SAVE OUTFILE='c:\temp\results.sav'.

********************************************.
* Define a macro which will do the job.
********************************************.

SET MPRINT=no.
*////////////////////////////////.
DEFINE !match (nbtreat=!TOKENS(1))
!DO !cnt=1 !TO !nbtreat
GET FILE='c:\temp\mydata.sav'.
SELECT IF idx=!cnt OR treatm=0.
DO IF $CASENUM=1.
COMPUTE #target=propen.
ELSE.
COMPUTE delta=propen-#target.
END IF.
EXECUTE.
SELECT IF ~MISSING(delta).
IF (delta<0) delta=-delta.
SORT CASES BY delta.
SELECT IF $CASENUM=1.
COMPUTE key=!cnt.
ADD FILES FILE=* /FILE='c:\temp\results.sav'.
SAVE OUTFILE='c:\temp\results.sav'.
!DOEND
!ENDDEFINE.
*////////////////////////////////.
SET MPRINT=yes.

**************************.
* Call macro (we know that there are 7 treatment cases).
**************************.
!match nbtreat=7.

* Sort results file to allow matching.
GET FILE='c:\temp\results.sav'.
SORT CASES BY key.
SAVE OUTFILE='c:\temp\results.sav'.

* Match each treatment case with the most similar non-treatment case.
GET FILE='c:\temp\mydata.sav'.
MATCH FILES /FILE=*
  /FILE='C:\Temp\results.sav'
  /RENAME (idx = d0) caseid=caseid2 improv=improv2 propen=propen2
    treatm=treatm2 key=idx
  /BY idx
  /DROP= d0.
EXECUTE.

* That's it!.
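The answer notes that the requirement to know the number of treated cases could be removed. The following is only a sketch of one way that might be done, not part of the original posting: it counts the treated cases with AGGREGATE, writes the macro call to a small syntax file, and then runs that file. The file name `callmatch.sps` is a hypothetical name chosen for illustration, and the sketch assumes that `mydata.sav` and the `!match` macro have already been created as in the listing.

```spss
* Sketch only: would replace the hard-coded call "!match nbtreat=7.".
* Assumes mydata.sav exists and the !match macro is already defined.
* The file name callmatch.sps is hypothetical.
GET FILE='c:\temp\mydata.sav'.
SELECT IF treatm=1.
* Collapse the treated cases to a single case holding their count.
AGGREGATE OUTFILE=* /BREAK=treatm /nbtreat=N.
FORMATS nbtreat (F8.0).
* Write a one-line syntax file containing the macro call.
WRITE OUTFILE='c:\temp\callmatch.sps' /'!match nbtreat=' nbtreat '.'.
EXECUTE.
* Run the generated macro call.
INCLUDE FILE='c:\temp\callmatch.sps'.
```

Writing the call to a file and including it is used here because the count is only known once the data have been read, whereas the macro argument has to be a literal token when the call is expanded.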