USING PROPENSITY SCORES IN QUASI-EXPERIMENTAL DESIGNS
William M. Holmes
R COMMANDS FOR PROPENSITY USE
There are many ways to use R to perform propensity analysis. The commands below illustrate some of them; alternatives are possible. R is case sensitive, and commands and variable names are usually written in lower case.
NORMALITY TESTS
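Normality of a distribution can be assessed by comparing the data with a normal reference sample generated by the rnorm function, or by applying the Kolmogorov-Smirnov test (ks.test) directly. In rnorm, the first number is the sample size.
# Reference sample from a normal distribution with the mean and sd of confounder1
refnorm <- rnorm(500, mean(confounder1), sd(confounder1))
# Two-sample Kolmogorov-Smirnov test of confounder1 against the reference sample
ks.test(confounder1, refnorm)
# One-sample test of a simulated variable against a normal distribution with mean .5 and sd .2
z <- rnorm(500)
ks.test(z, "pnorm", .5, .2)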
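The Shapiro-Wilk test in base R is another common check. The sketch below applies it, together with the one-sample ks test, to a simulated variable. The data are generated for illustration only, and the name confounder1 stands in for a real variable.

```r
# Simulated stand-in for a confounder; real data would replace this
set.seed(123)
confounder1 <- rnorm(500, mean = 50, sd = 10)

# Shapiro-Wilk test: a small p-value suggests departure from normality
sw <- shapiro.test(confounder1)

# One-sample Kolmogorov-Smirnov test against a normal distribution with
# the sample mean and sd (the p-value is only approximate because the
# parameters are estimated from the same data)
ks <- ks.test(confounder1, "pnorm", mean(confounder1), sd(confounder1))

print(sw$p.value)
print(ks$p.value)
```

Both tests return an object whose p.value element can be inspected or stored.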
IMBALANCE ASSESSMENT
Imbalance assessment can be done with the oneway.test function or with the MatchBalance function in the Matching package.
oneway.test(confounder1 ~ treatment)
library(Matching)
mb <- MatchBalance(treatment ~ confounder1 + confounder2 + confounder3)
summary(mb)
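Imbalance is also often summarized as a standardized mean difference. The sketch below computes one in base R on simulated data. The helper std_diff is hypothetical (it is not part of any package), and pooling the two group variances equally is one convention among several.

```r
# Hypothetical helper: standardized mean difference for one covariate,
# using the square root of the average of the two group variances
std_diff <- function(x, treat) {
  m1 <- mean(x[treat == 1])
  m0 <- mean(x[treat == 0])
  pooled_sd <- sqrt((var(x[treat == 1]) + var(x[treat == 0])) / 2)
  (m1 - m0) / pooled_sd
}

# Simulated data: treatment is more likely when confounder1 is high,
# which creates imbalance between the groups
set.seed(1)
n <- 400
confounder1 <- rnorm(n)
treatment <- rbinom(n, 1, plogis(confounder1))

smd <- std_diff(confounder1, treatment)
print(smd)
```

Values near zero indicate balance; the cutoff treated as acceptable is a matter of judgment.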
PROPENSITY ESTIMATION
Propensity estimation may be done either with the multinom function (nnet package) or the lda function (MASS package). Those familiar with boosted regression may use the ps function in the twang package. Propensity estimation with logistic regression and with discriminant analysis is as follows:
library(nnet)
result <- multinom(treatment ~ confounder1 + confounder2, abstol = 1.0e-20)
library(MASS)
fit <- lda(treatment ~ confounder1 + confounder2, data = mydata, na.action = na.omit)
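When there are only two groups, ordinary logistic regression with glm is a common alternative to the functions above. The sketch below swaps in glm and shows how the fitted probabilities can be saved as a propensity variable. The data are simulated, and the name propen is chosen to match the variable used in the matching commands.

```r
# Simulated data for illustration only
set.seed(42)
n <- 300
mydata <- data.frame(confounder1 = rnorm(n), confounder2 = rnorm(n))
mydata$treatment <- rbinom(
  n, 1, plogis(0.8 * mydata$confounder1 - 0.5 * mydata$confounder2)
)

# Logistic regression of treatment on the confounders
psmodel <- glm(treatment ~ confounder1 + confounder2,
               data = mydata, family = binomial)

# The fitted probabilities of treatment serve as the propensity scores
mydata$propen <- fitted(psmodel)
summary(mydata$propen)
```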
MATCHING
The Match function (Matching package) and the matchit function (MatchIt package) perform a variety of matching procedures.
Exact and Coarsened Exact
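Exact matching is available in the Match function through its exact argument, and exact and coarsened exact matching are both available in the matchit function. An example of 1-1 matching with Match is given first, followed by exact and coarsened exact matching with matchit.
rr <- Match(Y=Y, Tr=Tr, X=X, M=1)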
summary(rr)
m.out <- matchit(treatment ~ propen, data = mydata, method = "exact")
summary (m.out)
m.out <- matchit(treatment ~ propen, data = mydata, method = "cem")
summary (m.out)
Nearest Neighbor and Caliper
Several R packages perform nearest neighbor matching. One of them is MatchIt, through its matchit function, which also accepts a caliper argument to limit how far apart matched cases may be.
m.out <- matchit(treatment ~ propen, data = mydata, method = "nearest")
summary (m.out)
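The logic of nearest neighbor matching with a caliper can be sketched in base R. The helper greedy_match below is hypothetical and much simpler than what matchit does: it takes treated cases in order, pairs each with the closest unused comparison case, and skips the pair when the distance exceeds the caliper. The propensity values are made up for illustration.

```r
# Hypothetical helper: greedy 1-1 nearest neighbor matching on the
# propensity score, without replacement, within a caliper
greedy_match <- function(propen, treat, caliper) {
  treated <- which(treat == 1)
  available <- which(treat == 0)
  t_idx <- integer(0)
  c_idx <- integer(0)
  for (i in treated) {
    if (length(available) == 0) break
    d <- abs(propen[available] - propen[i])
    j <- which.min(d)
    if (d[j] <= caliper) {
      t_idx <- c(t_idx, i)
      c_idx <- c(c_idx, available[j])
      available <- available[-j]   # each comparison case is used once
    }
  }
  data.frame(treated = t_idx, control = c_idx)
}

# Cases 1-3 are treated, cases 4-7 are comparison cases
propen <- c(0.80, 0.60, 0.30, 0.78, 0.56, 0.45, 0.10)
treat  <- c(1, 1, 1, 0, 0, 0, 0)

matched <- greedy_match(propen, treat, caliper = 0.05)
print(matched)
```

In this example the third treated case has no comparison case within the caliper, so it is left unmatched.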
1-Many
The Matching package does a variety of 1-many matches, as well as 1-1 matching and genetic matching. It uses the Match function (see above). Set the parameter M to the number of matches desired. In the matchit function the corresponding parameter is ratio.
rr <- Match(Y=Y, Tr=Tr, X=X, M=2)
summary(rr)
m.out <- matchit(treatment ~ propen, data = mydata, method = "nearest", ratio = 2)
summary(m.out)
Optimized, Full, and Genetic Matching
The matchit function performs optimized, full, and genetic matching; it calls the optmatch package for optimal and full matching. The GenMatch function in the Matching package also performs genetic matching.
m.out <- matchit(treatment ~ propen, data = mydata, method = "optimal", ratio=2)
summary (m.out)
m.out <- matchit(treatment ~ propen, data = mydata, method = "full")
summary (m.out)
STRATIFYING
A stratifying variable can be computed with a series of conditional assignment statements.
attach(mydata)
mydata$propenstrata[propen > .80] <- ".80+"
mydata$propenstrata[propen > .60 & propen <= .80] <- ".60-.80"
mydata$propenstrata[propen > .40 & propen <= .60] <- ".40-.60"
mydata$propenstrata[propen > .20 & propen <= .40] <- ".20-.40"
mydata$propenstrata[propen <= .20] <- "0-.20"
detach(mydata)
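The same strata can be produced in one step with the cut function in base R. The sketch below uses a short made-up vector of propensity scores. Intervals are closed on the right, so a score of exactly .20 falls in the lowest stratum, as in the conditional assignments.

```r
# Made-up propensity scores for illustration
propen <- c(0.05, 0.20, 0.35, 0.60, 0.61, 0.95)

# Five strata; right-closed intervals, with 0 included in the first
propenstrata <- cut(propen,
                    breaks = c(0, .20, .40, .60, .80, 1),
                    labels = c("0-.20", ".20-.40", ".40-.60",
                               ".60-.80", ".80+"),
                    include.lowest = TRUE)

table(propenstrata)
```

The result is a factor, which can be used directly as a grouping variable in later analyses.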
REGRESSION AND THE GENERAL LINEAR MODEL
Within R, regression and the general linear model can be fitted with the lm function.
lm.r <- lm(treatment ~ confounder1 + confounder2 + confounder3)
TWO-STAGE LEAST SQUARES
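The systemfit package can be used for 2SLS in R. The first argument must be a model formula; the example assumes the outcome is regressed on treatment.
library(systemfit)
fit2sls <- systemfit(outcome ~ treatment, method = "2SLS",
inst = ~ instrument1 + instrument2 + instrument3)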
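The two stages can also be carried out by hand with lm, which makes the logic visible. The sketch below uses simulated data with one instrument, a continuous treatment, and a true effect of 2; all of these are assumptions made for illustration. The second-stage point estimate matches 2SLS, but its standard errors are not correct, so a dedicated 2SLS routine is preferable for inference.

```r
# Simulated data: u is an unobserved confounder of treatment and outcome
set.seed(7)
n <- 2000
instrument1 <- rnorm(n)
u <- rnorm(n)
treatment <- 0.8 * instrument1 + u + rnorm(n)
outcome <- 2 * treatment + 3 * u + rnorm(n)   # true effect is 2

# Stage 1: predict treatment from the instrument
stage1 <- lm(treatment ~ instrument1)
treatment_hat <- fitted(stage1)

# Stage 2: regress the outcome on the predicted treatment
stage2 <- lm(outcome ~ treatment_hat)

est_ols  <- unname(coef(lm(outcome ~ treatment))["treatment"])  # biased by u
est_2sls <- unname(coef(stage2)["treatment_hat"])

print(c(ols = est_ols, two_stage = est_2sls))
```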
SAMPLE WEIGHTING
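Sample weights can be calculated using conditional assignment. Sample weighting is done within the survey package. The weights below assume that propen is the estimated probability of receiving treatment (treatment = 1).
mydata$ipw[mydata$treatment == 1] <- 1 / mydata$propen[mydata$treatment == 1]
mydata$ipw[mydata$treatment == 0] <- 1 / (1 - mydata$propen[mydata$treatment == 0])
library(survey)
mydataweighted <- svydesign(id = ~1, weights = ~ipw, data = mydata)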
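The effect of inverse probability weights on balance can be checked in base R. The sketch below uses simulated data and assumes that propen is the estimated probability of treatment = 1, so treated cases receive 1/propen and comparison cases receive 1/(1 - propen). It then compares the group difference in a confounder before and after weighting.

```r
# Simulated data for illustration only
set.seed(11)
n <- 1000
mydata <- data.frame(confounder1 = rnorm(n))
mydata$treatment <- rbinom(n, 1, plogis(mydata$confounder1))
mydata$propen <- fitted(glm(treatment ~ confounder1,
                            data = mydata, family = binomial))

# Inverse probability weights (assumes propen = P(treatment = 1))
mydata$ipw <- ifelse(mydata$treatment == 1,
                     1 / mydata$propen,
                     1 / (1 - mydata$propen))

# Group difference in the confounder, unweighted and weighted
raw_gap <- with(mydata,
  mean(confounder1[treatment == 1]) - mean(confounder1[treatment == 0]))
wtd_gap <- with(mydata,
  weighted.mean(confounder1[treatment == 1], ipw[treatment == 1]) -
  weighted.mean(confounder1[treatment == 0], ipw[treatment == 0]))

print(c(raw = raw_gap, weighted = wtd_gap))
```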
WEIGHTED LEAST SQUARES
Many analysis functions accept a weights argument. The linear modeling function, lm, is an example.
outcomepred <- lm(outcome ~ treatment, weights = ipw, data = mydata)
summary(outcomepred)
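With a single two-group treatment variable, the weighted least squares coefficient for treatment equals the difference between the weighted group means of the outcome. The sketch below checks this on simulated data; the weights are arbitrary positive numbers used only for illustration.

```r
# Simulated data; ipw here is an arbitrary set of positive weights
set.seed(5)
n <- 200
treatment <- rbinom(n, 1, 0.5)
outcome <- 1 + 2 * treatment + rnorm(n)
ipw <- runif(n, 1, 3)

# Weighted least squares with lm
fit <- lm(outcome ~ treatment, weights = ipw)
wls_slope <- unname(coef(fit)["treatment"])

# Difference between weighted group means of the outcome
gap <- weighted.mean(outcome[treatment == 1], ipw[treatment == 1]) -
       weighted.mean(outcome[treatment == 0], ipw[treatment == 0])

print(all.equal(wls_slope, gap))
```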
GENERALIZED LINEAR MODEL
Generalized linear model estimation in R is done with the glm function.
outcomepred <- glm(formula = outcome ~ treatment, family = gaussian)
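The family argument determines the type of model. The sketch below, on simulated data, shows that the gaussian family reproduces the lm coefficients and that a two-category outcome would use the binomial family instead. The variable binary_outcome is constructed only for this illustration.

```r
# Simulated data for illustration only
set.seed(9)
n <- 150
treatment <- rbinom(n, 1, 0.5)
outcome <- 0.5 + 1.5 * treatment + rnorm(n)

# Gaussian family with the default identity link matches lm
fit_glm <- glm(outcome ~ treatment, family = gaussian)
fit_lm  <- lm(outcome ~ treatment)
print(all.equal(coef(fit_glm), coef(fit_lm)))

# A two-category outcome would use the binomial family (logistic regression)
binary_outcome <- as.integer(outcome > 1)
fit_logit <- glm(binary_outcome ~ treatment, family = binomial)
print(coef(fit_logit))
```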
MISSING DATA ANALYSIS
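Analysis of missing data can begin with the summary function, which reports the number of missing values (NA's) for each variable, and the missingness map function, missmap, in the Amelia package. See the creation of a.out under Imputation of Missing Data.
summary(mydata[, c("confounder1", "confounder2", "confounder3")])
missmap(a.out)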
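Counts of missing values can also be obtained directly in base R. The sketch below uses a small made-up data frame and reports the number and proportion missing on each variable, along with the number of complete cases.

```r
# Made-up data frame with some missing values
mydata <- data.frame(
  confounder1 = c(1.2, NA, 3.1, 4.0, NA),
  confounder2 = c(10, 12, NA, 9, 11),
  confounder3 = c(0, 1, 1, 0, 1)
)

# Number and proportion of missing values in each variable
nmiss <- colSums(is.na(mydata))
pmiss <- colMeans(is.na(mydata))

# Number of cases with no missing values on any variable
ncomplete <- sum(complete.cases(mydata))

print(nmiss)
print(pmiss)
print(ncomplete)
```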
IMPUTATION OF MISSING DATA
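Imputation of data can be done with the Amelia package. The amelia function takes a data frame or matrix rather than a single variable; m is the number of imputed data sets.
library(Amelia)
a.out <- amelia(mydata[, c("confounder1", "confounder2", "confounder3")], m = 5)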
Revised 1/31/2013