USING PROPENSITY SCORES IN QUASI-EXPERIMENTAL DESIGNS

William M. Holmes

SPSS COMMANDS FOR PROPENSITY USE

Many uses of propensity scores are possible with SPSS commands, and the following sections present some of these commands. There may be other ways of accomplishing the same result. Some uses of propensity scores are not possible directly with SPSS commands. However, with the R Extender add-on for SPSS, any procedure not available within SPSS can be executed through R. An overview of the R interface for SPSS by Felix Thoemmes can be found at http://arxiv.org/ftp/arxiv/papers/1201/1201.6385.pdf

NORMALITY TESTS

Testing whether a distribution is normal or has some other shape can be done with the one-sample Kolmogorov-Smirnov test within NPTESTS.

NPTESTS
  /ONESAMPLE TEST (confounder1 confounder2)
    KOLMOGOROV_SMIRNOV(NORMAL=SAMPLE EXPONENTIAL=SAMPLE POISSON=SAMPLE)
  /MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE.

IMBALANCE ASSESSMENT PROCEDURES

Imbalance tests with SPSS can be done with the MEANS program (for ANOVA statistics) or T-TEST.

MEANS TABLES=confounder1 confounder2 confounder3 confounder4 BY treatment
  /CELLS MEAN STDDEV VARIANCE COUNT SUM
  /STATISTICS ANOVA.

T-TEST GROUPS=treatment(0 1)
  /MISSING=ANALYSIS
  /VARIABLES=confounder1 confounder2 confounder3 confounder4
  /CRITERIA=CI(.95).

PROPENSITY ESTIMATION

Propensity scores can be estimated using a REGRESSION program, LOGISTIC REGRESSION, GLM, or DISCRIMINANT.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT treatment
  /METHOD=ENTER confounder1 confounder2 confounder3 confounder4
  /SAVE PRED (propen).

LOGISTIC REGRESSION VARIABLES treatment
  /METHOD=ENTER confounder1 confounder2 confounder3 confounder4
  /SAVE=PRED (propen)
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).

GLM
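
No GLM syntax is listed under this heading. A minimal sketch, assuming the same treatment and confounder variables as the commands above, might look like the following. It fits a linear probability model, much like the REGRESSION example, so predicted values can fall outside the 0 to 1 range. The variable name propen_glm is a hypothetical choice made to avoid clashing with propen.

```spss
* Sketch only: linear probability model of treatment fitted with GLM.
* propen_glm is a hypothetical name for the saved predicted values.
GLM treatment WITH confounder1 confounder2 confounder3 confounder4
  /PRINT=PARAMETER
  /SAVE=PRED(propen_glm)
  /DESIGN=confounder1 confounder2 confounder3 confounder4.
```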

* PROBS=propen saves the group membership probabilities as propen1 and propen2.
DISCRIMINANT
  /GROUPS=treatment(0 1)
  /VARIABLES=confounder1 confounder2 confounder3 confounder4
  /ANALYSIS ALL
  /SAVE=PROBS=propen
  /PRIORS EQUAL
  /STATISTICS=MEAN STDDEV UNIVF COEFF
  /CLASSIFY=NONMISSING POOLED.

MATCHING

Most matching with SPSS has to be done by using the R Extender add-on to execute R programs that do matching. The two exceptions are a version of exact or coarsened exact matching and a version of greedy matching.

Exact and Coarsened Exact

Break the file into a treatment group file and a comparison group file, each containing propensity scores. Aggregate each file. Merge the aggregated comparison file into the disaggregated treatment file, using the propensity scores, trimmed to as many significant digits as desired, as IDs. Add the cases of the disaggregated comparison file to the treatment file. Create a flag for cases having merge matches, then use Select Cases to keep the cases meeting the flag. This produces a 1-many match. For a 1-1 match, purge comparison cases having duplicate subject identifiers (the subject IDs, not the propensities used as case IDs).
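
A comparable 1-many match can also be sketched within a single file, without splitting and merging. This is a shortcut under stated assumptions rather than the procedure described above: it assumes treatment is coded 0/1 with no missing values and that propen holds the propensity score. The names pkey, n_treat, n_total, and matched are hypothetical.

```spss
* Coarsen the propensity score to two decimal places to form the matching key.
COMPUTE pkey = RND(propen * 100).
* Add to every case the number of treated cases and of all cases sharing its key.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=pkey
  /n_treat=SUM(treatment)
  /n_total=N.
* Flag cases whose key occurs in both the treatment and comparison groups.
COMPUTE matched = (n_treat GT 0 AND n_total GT n_treat).
FILTER BY matched.
```

Changing the multiplier from 100 to 10 or 1000 coarsens or tightens the match.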

Nearest Neighbor and Caliper

Painter (2004) has created an SPSS Macro to do Nearest Neighbor matching. It has been extended by Clark (2012). The Painter macro is available from: http://www.unc.edu/~painter/SPSSsyntax/propen.txt. It requires an input file containing a propensity score named propen and an intervention variable named treatm. You must also specify the number of cases. Instructions are contained in the file propen.txt. The Clark version and instructions are posted at http://faculty.umb.edu/william_holmes/clarkmacro.htm.
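
Because the macro expects particular variable names, the input file may need some preparation. A minimal sketch, assuming the data use the names treatment and propen as in the commands above, is shown here; the output file name is hypothetical, and the instructions in propen.txt remain the authority on what the macro requires.

```spss
* Rename the intervention variable to the name the macro expects.
RENAME VARIABLES (treatment=treatm).
* Display the number of cases in the active file, which the macro asks for.
SHOW N.
* Save the prepared input file under a hypothetical name.
SAVE OUTFILE='propen_input.sav'.
```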

1-Many

This is currently available through R programs via the R Extender Add-on.

Optimized, Full, and Genetic

Optimized, full, and genetic matching are not currently possible with SPSS commands alone. Optimized matching programs can be executed within SPSS using the R Extender add-on.

STRATIFYING

Stratifying on propensity scores is achieved in SPSS by recoding the propensity score into 5 groups whose ranges of values are equal. The groups of the new variable are the strata.

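* Ranges share their boundaries so that no score between .20 and .70 is left unassigned.
* The first matching range applies, so a boundary value falls in the higher stratum.
RECODE propen (.60 THRU .70=5) (.50 THRU .60=4) (.40 THRU .50=3)
  (.30 THRU .40=2) (.20 THRU .30=1) INTO propenstrata.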
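
Strata are also often formed as quintiles, so that each stratum holds about the same number of cases rather than spanning the same range of scores. A minimal sketch of that alternative, assuming propen holds the propensity score and using the hypothetical variable name propenquint, is:

```spss
* Rank cases on the propensity score and cut the ranks into 5 equal-sized groups.
RANK VARIABLES=propen (A)
  /NTILES(5) INTO propenquint
  /PRINT=NO
  /TIES=MEAN.
* Check how treatment and comparison cases are spread across the strata.
CROSSTABS /TABLES=propenquint BY treatment.
```

A stratum with very few treatment or comparison cases in the crosstabulation suggests poor overlap in that range of scores.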

REGRESSION AND THE GENERAL LINEAR MODEL

These procedures can be accomplished in SPSS with the GLM program or the REGRESSION program.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT treatment
  /METHOD=ENTER confounder1 confounder2 confounder3 confounder4
  /SAVE PRED (propen).

GLM income BY treatment
  /EMMEANS=TABLES(treatment)
  /PRINT=DESCRIPTIVE PARAMETER.

TWO-STAGE LEAST SQUARES

Two-Stage Least Squares may be done either with 2SLS or with WLS procedures.

2SLS income WITH treatment
  /INSTRUMENTS age
  /CONSTANT
  /SAVE PRED RESID.

WLS income WITH treatment
  /INSTRUMENTS age
  /CONSTANT
  /SAVE PRED RESID.

SAMPLE WEIGHTING

* propen is assumed to be the predicted probability of treatment = 1.
COMPUTE ipw = 1/propen.
IF (treatment EQ 0) ipw = 1/(1 - propen).
WEIGHT BY ipw.
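
Once the weights are applied, balance on the confounders can be rechecked in the weighted sample. A minimal sketch, reusing the variable names above, is:

```spss
* Compare weighted confounder means across treatment groups.
WEIGHT BY ipw.
MEANS TABLES=confounder1 confounder2 confounder3 confounder4 BY treatment
  /CELLS MEAN STDDEV.
* Turn weighting off again before running unweighted procedures.
WEIGHT OFF.
```

Only means and standard deviations are requested here, because significance tests computed under WEIGHT BY treat the weights as case counts.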

WEIGHTED LEAST SQUARES

This may be done either with the WLS procedure or the GLM procedure.

GLM income BY treatment
  /REGWGT=ipw
  /EMMEANS=TABLES(treatment)
  /PRINT=DESCRIPTIVE PARAMETER.
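
The WLS procedure mentioned above can be sketched in a similar way. This is an illustration under the assumption that ipw has already been computed as in the sample weighting section and is supplied directly as the weight variable; case weighting is switched off first so the weights are not applied twice.

```spss
* Make sure WEIGHT BY is not also in effect.
WEIGHT OFF.
* Weighted least squares of income on treatment, with ipw as the weight variable.
WLS VARIABLES=income WITH treatment
  /WEIGHT=ipw
  /CONSTANT.
```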

GENERALIZED LINEAR MODEL

Generalized linear models (GZLM) are fitted with the GENLIN procedure. The following produces logit predicted propensity scores. REFERENCE=FIRST makes treatment = 0 the reference category, so the saved mean predicted value is the probability of treatment = 1.

GENLIN treatment (REFERENCE=FIRST) BY confounder1 confounder2 (ORDER=ASCENDING)
  /MODEL confounder1 confounder2 INTERCEPT=YES
    DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL MAXITERATIONS=100
    MAXSTEPHALVING=5 PCONVERGE=.001(ABSOLUTE) SINGULAR=1E-012
    ANALYSISTYPE=3(WALD) CILEVEL=95 CITYPE=WALD LIKELIHOOD=FULL
  /MISSING CLASSMISSING=EXCLUDE
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
  /SAVE MEANPRED (propen).

MISSING DATA ANALYSIS

Missing value analysis is done in SPSS with the MVA command, which also allows EM estimation of missing values that can be saved to an output file. Missing value analysis can also be done with the MULTIPLE IMPUTATION command.

MVA VARIABLES=educ_1 boyorgrl_1 agekdbrn_1 finrela_1
  /TPATTERN PERCENT=1 DESCRIBE=agekdbrn_1
  /EM(TOLERANCE=0.001 CONVERGENCE=0.0001 ITERATIONS=25
    OUTFILE='c:\spssdata\gss\emdata.sav').

MULTIPLE IMPUTATION confounder1 confounder2 confounder3 confounder4
  /IMPUTE METHOD=NONE
  /MISSINGSUMMARIES VARIABLES (MINPCTMISSING=.001).

IMPUTATION OF MISSING DATA

Multiple imputation in SPSS is done with the MULTIPLE IMPUTATION command. A new data file is saved containing the imputed values. Running a command on this file, split by imputation number, calculates the results for each imputation and the pooled results across imputations.

DATASET DECLARE IMPUTEDDATA.
MULTIPLE IMPUTATION confounder1 confounder2 confounder3 confounder4
  /IMPUTE METHOD=FCS
  /CONSTRAINTS confounder1 (MIN=1 MAX=20 RND=1)
  /OUTFILE IMPUTATIONS=IMPUTEDDATA.
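
Running an analysis on the imputed file can be sketched as follows. This assumes the imputed dataset is named IMPUTEDDATA as above and reuses the logistic propensity model as the example analysis; Imputation_ is the variable SPSS adds to identify each imputation.

```spss
* Switch to the dataset that holds the imputed values.
DATASET ACTIVATE IMPUTEDDATA.
* Split by imputation number so results are reported per imputation and pooled.
SORT CASES BY Imputation_.
SPLIT FILE LAYERED BY Imputation_.
LOGISTIC REGRESSION VARIABLES treatment
  /METHOD=ENTER confounder1 confounder2 confounder3 confounder4.
SPLIT FILE OFF.
```

Pooled output appears only for procedures that support multiply imputed data.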

Revised 1/31/2012