Do T-Test with only means, SDs and Ns
* Here is some syntax to perform a T test for independent samples from summary data.
* Although you already have a solution for that (with matrix data input in ANOVA),
* this method is more complete:
* - A test for equality of variances is performed first (Hartley's F test).
* - Both the standard T test (assuming equal variances) and the Welch test
*   (not assuming equal variances) are calculated.
* - Asymptotic 95%CI for the difference of means are calculated for both tests
*   (only if both sample sizes are 30 or more).
* - Non-asymptotic 95%CI are also given, for use when sample sizes are low.
* I also provide the original data used to get the means, SDs and sample sizes
* (analysed with the "normal" T test) for comparison purposes.
* Best regards,
* Marta.

* SYNTAX.

* Only one set of data (one row) can be processed at a time (see the original data below).
data list list /mean1(f8.3) sd1(F8.3) n1(F8.0) mean2(f8.3) sd2(F8.3) n2(F8.0).
begin data
187.643 38.098 14 235.929 54.286 14
end data.

* T-test.
* The F test puts the larger variance in the numerator and uses n-1 degrees
* of freedom for each sample; the significance reported is one-tailed.
matrix.
PRINT /TITLE "T TEST FOR INDEPENDENT SAMPLES FROM SUMMARY DATA".
GET DATA /FILE=* /names=vecnam.
get mean1 /var=mean1.
get sd1 /var=sd1.
get n1 /var=n1.
get mean2 /var=mean2.
get sd2 /var=sd2.
get n2 /var=n2.
compute sem1=sd1/sqrt(n1).
compute sem2=sd2/sqrt(n2).
print {n1,mean1,sd1,sem1;n2,mean2,sd2,sem2}
 /title='Input data'
 /clabels='N','Mean','sd','sem'
 /rlabels='Sample 1','Sample 2'
 /format='f8.2'.
compute diff=mean1-mean2.
compute var1=sd1**2.
compute var2=sd2**2.
do if var1 ge var2.
compute ftest=var1/var2.
compute fsig=1-fcdf(ftest,n1-1,n2-1).
else if var1 lt var2.
compute ftest=var2/var1.
compute fsig=1-fcdf(ftest,n2-1,n1-1).
end if.
print {ftest,fsig}
 /title='Hartley test for equality of variances'
 /clabels='F','Sig.'
 /format='f8.3'.
compute n=n1+n2.
compute poolvar=((n1-1)&*(var1)+(n2-1)&*(var2))/(n-2).
compute eedif1=sqrt(poolvar*(1/n1+1/n2)).
compute t1=diff/eedif1.
compute df1=n-2.
compute t1sig=2*(1-tcdf(abs(t1),df1)).
compute eedif2=sqrt(var1/n1+var2/n2).
compute t2=diff/eedif2.
compute df2=((var1/n1+var2/n2)**2)/(((var1/n1)**2)/(n1-1)+((var2/n2)**2)/(n2-1)).
compute t2sig=2*(1-tcdf(abs(t2),df2)).
print {diff,eedif1,t1,df1,t1sig;diff,eedif2,t2,df2,t2sig}
 /title='T test for independent means with equal or unequal variances'
 /clabels='Diff.','SE(dif)','t','df','2-Sig.'
 /rlabels='Equal','Unequal'
 /format='f8.3'.
do if (n1 ge 30) and (n2 ge 30).
compute low1=diff-1.96*eedif1.
compute upp1=diff+1.96*eedif1.
compute low2=diff-1.96*eedif2.
compute upp2=diff+1.96*eedif2.
print {low1,upp1;low2,upp2}
 /title='Approximate 95%CI for diff (asymptotic)'
 /clabels='Lower','Upper'
 /rlabels='Equal','Unequal'
 /format='f8.3'.
end if.
compute data={data,diff,eedif1,df1,eedif2,df2}.
compute vecnam={vecnam,"diff","eedif1","df1","eedif2","df2"}.
save data /outfile=* /names=vecnam.
end matrix.

* Computation of exact (non-asymptotic) 95%CI for diff.
COMPUTE low1 = diff - eedif1*IDF.T(0.975,df1).
COMPUTE upp1 = diff + eedif1*IDF.T(0.975,df1).
COMPUTE low2 = diff - eedif2*IDF.T(0.975,df2).
COMPUTE upp2 = diff + eedif2*IDF.T(0.975,df2).
EXECUTE.

REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
 /VARIABLES=low1 upp1
 /TITLE "95%CI for diff assuming equal variances".
REPORT FORMAT=LIST AUTOMATIC ALIGN(CENTER)
 /VARIABLES=low2 upp2
 /TITLE "95%CI for diff not assuming equal variances".

* Original data (for comparison purposes).
data list free /group(F8.0) wgain(F8.0).
begin data
1 175
1 149
1 132
1 187
1 218
1 123
1 151
1 248
1 200
1 206
1 219
1 179
1 234
1 206
2 142
2 214
2 311
2 249
2 337
2 176
2 262
2 211
2 302
2 216
2 195
2 236
2 253
2 199
end data.
var label group 'Diet' /wgain 'Weight gain (lb.)'.
value labels group 1 'Control' 2 'A Vitamin'.

T-TEST GROUPS=group(1 2)
 /VARIABLES=wgain
 /CRITERIA=CIN(.95).

* The only difference between the two methods is the use of Hartley's F test
* instead of Levene's F test (Levene's test evaluates residuals, so it requires
* the original, non-aggregated data).
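For readers without SPSS at hand, the pooled and Welch statistics computed in the matrix block can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: the function name `summary_t_tests` and the dictionary keys are made up for illustration and are not part of the original syntax, and p-values and confidence intervals are left out because they need a t distribution that the standard library does not provide.

```python
import math


def summary_t_tests(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled and Welch t statistics from summary data (hypothetical helper)."""
    diff = mean1 - mean2
    var1, var2 = sd1 ** 2, sd2 ** 2

    # Equal-variance (pooled) test: same formulas as poolvar, eedif1, t1, df1.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Welch test: same formulas as eedif2, t2, df2 (Satterthwaite df).
    se_welch = math.sqrt(var1 / n1 + var2 / n2)
    df_welch = (var1 / n1 + var2 / n2) ** 2 / (
        (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
    )

    return {
        "diff": diff,
        "se_pooled": se_pooled,
        "t_pooled": diff / se_pooled,
        "df_pooled": df_pooled,
        "se_welch": se_welch,
        "t_welch": diff / se_welch,
        "df_welch": df_welch,
    }


# Summary row taken from the data list in the syntax above.
result = summary_t_tests(187.643, 38.098, 14, 235.929, 54.286, 14)
for name, value in result.items():
    print(f"{name:10s} {value:10.3f}")
```

With the summary row used above, both tests give t of about -2.72; the pooled test has 26 degrees of freedom and the Welch test about 23.3. The two standard errors coincide here because the sample sizes are equal.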