USING PROPENSITY SCORES IN QUASI-EXPERIMENTAL DESIGNS
William M. Holmes
SPSS COMMANDS FOR PROPENSITY USE
Many uses of propensity scores are possible with SPSS commands, and some of these commands are presented below. There may be other ways of accomplishing the same result. Some uses of propensity scores are not directly possible with SPSS commands. However, with the add-on R extender for SPSS, any procedure not available within SPSS can be executed through R. An overview of the R interface for SPSS, by Felix Thoemmes, can be found at http://arxiv.org/ftp/arxiv/papers/1201/1201.6385.pdf
NORMALITY TESTS
Testing whether a distribution is normal or has some other shape can be done with the one-sample Kolmogorov-Smirnov test within NPTESTS.
NPTESTS /ONESAMPLE TEST (confounder1 confounder2)
KOLMOGOROV_SMIRNOV(NORMAL=SAMPLE EXPONENTIAL=SAMPLE
POISSON=SAMPLE)
/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE.
IMBALANCE ASSESSMENT PROCEDURES
Imbalance tests with SPSS can be done with the MEANS program (for ANOVA statistics) or T-TEST.
MEANS TABLES=confounder1, confounder2, confounder3, confounder4
BY treatment /CELLS MEAN STDDEV VARIANCE COUNT
SUM / STAT ANOVA.
T-TEST GROUPS=treatment(0 1) /MISSING=ANALYSIS
/VARIABLES= confounder1, confounder2, confounder3, confounder4
/CRITERIA=CI(.95).
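The quantity behind these imbalance tests can be illustrated outside SPSS. The following is a minimal Python sketch, not SPSS syntax, that computes the difference in group means for one confounder and the unequal-variance (Welch) t statistic. The function name and the data values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def imbalance(treated, control):
    """Return (mean difference, Welch t statistic) for one confounder."""
    m1, m0 = mean(treated), mean(control)
    v1, v0 = variance(treated), variance(control)  # sample variances
    # Standard error of the mean difference without assuming equal variances.
    se = sqrt(v1 / len(treated) + v0 / len(control))
    return m1 - m0, (m1 - m0) / se

# Hypothetical confounder values for the two groups.
treated = [34, 41, 29, 38, 45]
control = [30, 33, 27, 36, 31]
diff, t = imbalance(treated, control)
print(diff, t)
```

A mean difference near zero and a small t statistic suggest the confounder is balanced across the groups.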
PROPENSITY ESTIMATION
Propensity scores can be estimated using a REGRESSION program, LOGISTIC REGRESSION, GLM, or DISCRIMINANT.
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT treatment
/METHOD=ENTER confounder1, confounder2, confounder3, confounder4
/SAVE PRED (propen).
LOGISTIC REGRESSION VARIABLES treatment /METHOD=ENTER confounder1,
confounder2, confounder3, confounder4 /SAVE=PRED (propen)
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
GLM
DISCRIMINANT /GROUPS=treatment(0 1) /VARIABLES=confounder1
confounder2 confounder3, confounder4 /ANALYSIS ALL
/SAVE=PROBS=propen /PRIORS EQUAL /STATISTICS=MEAN STDDEV
UNIVF COEFF /CLASSIFY=NONMISSING POOLED.
MATCHING
Most matching with SPSS has to be done by using the R Extender add-on to execute R programs that do the matching. The two exceptions are a version of exact or coarsened exact matching and a version of greedy matching.
Exact and Coarsened Exact
Break the file into a treatment group file and a comparison group file, each containing propensity scores. Aggregate each file by propensity score. Merge the aggregated comparison file into the disaggregated treatment file, using the propensity scores as IDs, trimmed to as many significant digits as desired. Add the cases of the disaggregated comparison file to the treatment file. Create a flag for cases that have merge matches, then select the cases that meet the flag. This produces a 1-many match. For a 1-1 match, purge comparison cases that have duplicate subject identifiers (the subject IDs, not the propensities used as case IDs).
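The logic of these steps can be sketched outside SPSS. The following minimal Python illustration is not the SPSS file procedure itself: it uses the rounded propensity score as the match key, builds the 1-many match, and then reduces it to 1-1 by letting each comparison subject be used only once. The function names and the data are hypothetical.

```python
from collections import defaultdict

def exact_match(cases, digits=2):
    """cases: (subject_id, treatment, propen) tuples.
    Returns {treated_id: [comparison ids with the same rounded score]}."""
    comparison = defaultdict(list)
    for sid, treat, p in cases:
        if treat == 0:
            comparison[round(p, digits)].append(sid)
    matches = {}
    for sid, treat, p in cases:
        key = round(p, digits)
        if treat == 1 and key in comparison:
            matches[sid] = list(comparison[key])
    return matches

def one_to_one(matches):
    """Keep one comparison subject per treated subject, never reusing one."""
    used, pairs = set(), {}
    for tid, cids in matches.items():
        for cid in cids:
            if cid not in used:
                used.add(cid)
                pairs[tid] = cid
                break
    return pairs

# Hypothetical cases: (subject ID, treatment flag, propensity score).
cases = [(1, 1, 0.312), (2, 1, 0.448), (3, 0, 0.309),
         (4, 0, 0.314), (5, 0, 0.700), (6, 1, 0.306)]
many = exact_match(cases)
pairs = one_to_one(many)
print(many, pairs)
```

Rounding to fewer digits coarsens the match and retains more cases; rounding to more digits makes it closer to exact.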
Nearest Neighbor and Caliper
Painter (2004) has created an SPSS Macro to do Nearest Neighbor matching. It has been extended by Clark (2012). The Painter macro is available from: http://www.unc.edu/~painter/SPSSsyntax/propen.txt. It requires an input file containing a propensity score named propen and an intervention variable named treatm. You must also specify the number of cases. Instructions are contained in the file propen.txt. The Clark version and instructions are posted at http://faculty.umb.edu/william_holmes/clarkmacro.htm.
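For readers who want to see the idea behind greedy nearest neighbor matching with a caliper, the following is a minimal Python sketch. It is not the Painter or Clark macro, and the function name, the caliper value, and the data are hypothetical. Each treated case, in the order given, takes the closest comparison case still available, provided the scores differ by no more than the caliper.

```python
def greedy_match(treated, comparison, caliper=0.05):
    """treated, comparison: dicts mapping subject ID to propensity score.
    Returns {treated_id: comparison_id}, matching without replacement."""
    available = dict(comparison)
    pairs = {}
    for tid, p in treated.items():
        if not available:
            break
        # Closest remaining comparison case on the propensity score.
        cid = min(available, key=lambda c: abs(available[c] - p))
        if abs(available[cid] - p) <= caliper:
            pairs[tid] = cid
            del available[cid]  # each comparison case is used at most once
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.32, "t3": 0.80}
comparison = {"c1": 0.31, "c2": 0.36, "c3": 0.60}
pairs = greedy_match(treated, comparison)
print(pairs)
```

Because the match is greedy, the result can depend on the order of the treated cases; here t3 stays unmatched because no comparison case lies within the caliper.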
1-Many
This is currently available through R programs via the R Extender Add-on.
Optimized, Full, and Genetic
Optimized, full, and genetic matching is not currently possible with SPSS commands alone. These matching programs can be executed within SPSS using the R extension add-on.
STRATIFYING
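Stratifying on propensity scores is achieved in SPSS by recoding the propensity score into 5 groups that span equal ranges of values. The groups of the new variable are the strata. The example below assumes the scores run from .20 to .70. Adjoining ranges share an endpoint so that no score falls into a gap; RECODE assigns a shared endpoint to the first range that lists it.
RECODE propen (.20 THRU .30=1)(.30 THRU .40=2)(.40 THRU .50=3)
(.50 THRU .60=4)(.60 THRU .70=5) INTO propenstrata.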
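The equal-range rule can also be written so that the cut points come from the observed scores rather than being typed in by hand. The following is a minimal Python sketch of that calculation, not SPSS syntax. The function name and the scores are hypothetical, and the scores are assumed to take more than one distinct value.

```python
def equal_width_strata(scores, n_strata=5):
    """Assign each score to one of n_strata intervals of equal width
    spanning the observed minimum to the observed maximum."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_strata
    strata = []
    for p in scores:
        k = int((p - lo) / width) + 1
        strata.append(min(k, n_strata))  # the maximum goes in the top stratum
    return strata

# Hypothetical propensity scores between .20 and .70.
scores = [0.20, 0.25, 0.35, 0.45, 0.55, 0.70]
strata = equal_width_strata(scores)
print(strata)
```

Equal-width strata can hold very different numbers of cases, so it is worth checking the count of treatment and comparison cases in each stratum.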
REGRESSION AND THE GENERAL LINEAR MODEL
These procedures can be accomplished in SPSS with the GLM program or the REGRESSION program.
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT treatment
/METHOD=ENTER confounder1 confounder2 confounder3 confounder4
/SAVE PRED (propen).
GLM income BY treatment /EMMEANS TABLES(treatment)
/PRINT DESCRIPTIVES PARAMETER.
TWO-STAGE LEAST SQUARES
Two-Stage Least Squares may be done either with 2SLS or with WLS procedures.
2SLS income WITH treatment /INSTRUMENTS age /CONSTANT
/SAVE PRED RESID.
WLS income WITH treatment /INSTRUMENTS age /CONSTANT
/SAVE PRED RESID.
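The two stages can be made explicit with a small worked example. The following is a minimal Python sketch, not SPSS syntax, for the simplest case of one predictor and one instrument: the first stage regresses treatment on the instrument (age), and the second stage regresses income on the first-stage predicted values. The function names and the data are hypothetical, and the data were constructed so that income = 2 + 3 * treatment.

```python
from statistics import mean

def ols(x, y):
    """Simple least squares of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(y, x, z):
    """Stage 1: regress x on instrument z. Stage 2: regress y on fitted x."""
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * v for v in z]
    return ols(x_hat, y)

# Hypothetical data.
age = [20, 30, 40, 50, 60]
treatment = [0, 0, 1, 0, 1]
income = [2, 2, 5, 2, 5]
intercept, effect = two_stage_least_squares(income, treatment, age)
print(intercept, effect)
```

The coefficients from this hand calculation match what a two-stage procedure estimates, but the standard errors from an ordinary second-stage regression would not be correct, which is one reason to use a dedicated procedure.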
SAMPLE WEIGHTING
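Each case is weighted by the inverse of the estimated probability of the group it is actually in. With propen as the estimated probability of treatment (treatment=1), treated cases receive 1/propen and comparison cases receive 1/(1-propen).
COMPUTE ipw=1/propen.
IF (treatment EQ 0) ipw=1/(1-propen).
WEIGHT BY ipw.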
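The weighting arithmetic can be checked by hand with a small example. The following is a minimal Python sketch, not SPSS syntax, which assumes that propen is the estimated probability of treatment=1, so that each case is weighted by the reciprocal of the probability of the group it is actually in. The function name and the data are hypothetical.

```python
def ipw_weights(treatment, propen):
    """Inverse probability weights, assuming propen = P(treatment = 1).
    Treated cases get 1/propen; comparison cases get 1/(1 - propen)."""
    weights = []
    for t, p in zip(treatment, propen):
        weights.append(1 / p if t == 1 else 1 / (1 - p))
    return weights

# Hypothetical treatment flags and propensity scores.
treatment = [1, 1, 0, 0]
propen = [0.8, 0.5, 0.2, 0.5]
ipw = ipw_weights(treatment, propen)
print(ipw)
```

Cases whose group membership was unlikely given their confounders receive the largest weights, so scores very close to 0 or 1 can produce extreme weights.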
WEIGHTED LEAST SQUARES
This may be done either with the WLS procedure or the GLM procedure.
GLM income BY treatment /EMMEANS TABLES(treatment)
/REGWGT ipw /PRINT DESCRIPTIVES PARAMETER.
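What the weighting does to the estimate can be seen in a small worked example. The following is a minimal Python sketch, not SPSS syntax, of weighted least squares with a single predictor. With a 0/1 treatment variable, the weighted slope equals the difference between the weighted group means of the outcome. The function name and the data are hypothetical.

```python
def weighted_ols(x, y, w):
    """Weighted least squares of y on x; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: treatment flag, income, and a weight for each case.
treatment = [1, 1, 0, 0]
income = [50, 40, 30, 20]
weights = [1.25, 2.0, 1.25, 2.0]
intercept, effect = weighted_ols(treatment, income, weights)
print(intercept, effect)
```

Here the weighted mean income is 142.5/3.25 for treated cases and 77.5/3.25 for comparison cases, so the estimated effect is their difference, 20.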
GENERALIZED LINEAR MODEL
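A generalized linear model (GZLM) is fit with the GENLIN program. The following produces logit-predicted propensity scores. REFERENCE=FIRST makes the lowest category (treatment=0) the reference, so the saved values are the predicted probability of treatment=1.
GENLIN
treatment (REFERENCE=FIRST) BY confounder1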
confounder2 (ORDER=ASCENDING) /MODEL confounder1
confounder2 INTERCEPT=YES DISTRIBUTION=BINOMIAL
LINK=LOGIT /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
MAXITERATIONS=100 MAXSTEPHALVING=5 PCONVERGE=.001
(ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD) CILEVEL=95
CITYPE=WALD LIKELIHOOD=FULL /MISSING CLASSMISSING=EXCLUDE
/PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
/SAVE MEANPRED (propen) .
MISSING DATA ANALYSIS
Missing value analysis is done in SPSS with the MVA command, which also allows EM estimation of missing values that can be saved to an output file. Missing data patterns can also be summarized with the MULTIPLE IMPUTATION command.
MVA VARIABLES=educ_1 boyorgrl_1 agekdbrn_1 finrela_1 /TPATTERN
PERCENT=1 DESCRIBE=agekdbrn_1 /EM (TOLERANCE=0.001
CONVERGENCE=0.0001 ITERATIONS=25
OUTFILE='c:\spssdata\gss\emdata.sav') .
MULTIPLE IMPUTATION confounder1 confounder2 confounder3
confounder4 /IMPUTE METHOD=NONE
/MISSINGSUMMARIES VARIABLES (MINPCTMISSING=.001) .
IMPUTATION OF MISSING DATA
Multiple imputation in SPSS is done with the MULTIPLE IMPUTATION command. A new data file is saved containing the imputed values. Running a command on this file, split by imputation, calculates the results for each imputed data set along with results pooled across the imputations.
MULTIPLE IMPUTATION confounder1, confounder2, confounder3, confounder4
/IMPUTE METHOD=FCS /CONSTRAINTS confounder1
(RND=1 MIN=1) /CONSTRAINTS confounder1 (MAX=20)
/OUTFILE IMPUTATIONS =
IMPUTEDDATA.
Revised 1/31/2012