USING PROPENSITY SCORES IN QUASI-EXPERIMENTAL DESIGNS

William M. Holmes

R COMMANDS FOR PROPENSITY USE

R offers many ways to perform propensity analysis. The commands below are some of them, and alternatives exist. R is case sensitive: most commands and variable names are written in lower case, although some package functions, such as Match and MatchBalance, are capitalized.

NORMALITY TESTS

Normality of a distribution can be checked by comparing a variable with a reference sample drawn by the rnorm function, or with the Kolmogorov-Smirnov test (ks.test) against a normal distribution with the same mean and standard deviation. In rnorm, the first number is the sample size.

refnorm <- rnorm(500, mean(confounder1), sd(confounder1))

ks.test(confounder1, refnorm)

ks.test(confounder1, "pnorm", mean(confounder1), sd(confounder1))
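
As a minimal sketch of the normality checks described above, the following uses only base R on simulated data. The values of confounder1 are generated for illustration (an assumption, not data from the text), and shapiro.test is shown as an additional check that the text does not mention.

```r
# Simulated stand-in for a confounder (hypothetical data for illustration)
set.seed(123)
confounder1 <- rnorm(500, mean = 50, sd = 10)

# Kolmogorov-Smirnov test against a normal distribution with the sample
# mean and sd; estimating these from the data makes the p-value approximate
ks <- ks.test(confounder1, "pnorm", mean(confounder1), sd(confounder1))

# Shapiro-Wilk test of normality (requires between 3 and 5000 cases)
sw <- shapiro.test(confounder1)

print(ks)
print(sw)
```

In both tests a small p-value is evidence against normality.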

IMBALANCE ASSESSMENT

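Imbalance assessment can be done with the oneway.test function or with the MatchBalance function in the Matching package. MatchBalance prints its balance statistics when it is run.

oneway.test(confounder1 ~ treatment)

library(Matching)

mb <- MatchBalance(treatment ~ confounder1 + confounder2 + confounder3)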
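
Imbalance is often summarized as a standardized mean difference. The sketch below computes one in base R on simulated data. The helper name smd and the data are hypothetical, introduced only for illustration, and the pooled standard deviation used here is one of several conventions.

```r
# Simulated data: the confounder is imbalanced by construction
set.seed(1)
n <- 200
treatment <- rbinom(n, 1, 0.4)
confounder1 <- rnorm(n, mean = 0.5 * treatment)

# Standardized mean difference: difference in group means divided by
# the pooled standard deviation of the two groups
smd <- function(x, tr) {
  m1 <- mean(x[tr == 1])
  m0 <- mean(x[tr == 0])
  s_pooled <- sqrt((var(x[tr == 1]) + var(x[tr == 0])) / 2)
  (m1 - m0) / s_pooled
}

smd(confounder1, treatment)

# The same comparison as a significance test
oneway.test(confounder1 ~ treatment)
```

Absolute standardized differences above roughly 0.1 are commonly read as a sign of imbalance, although that cutoff is a rule of thumb.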

PROPENSITY ESTIMATION

Propensity estimation may be done either with the multinom function in the nnet package or with the lda function in the MASS package. Those familiar with boosted regression may use the ps function in the twang package. Propensity estimation by logistic regression and by discriminant analysis is as follows:

library(nnet)

result <- multinom(treatment ~ confounder1 + confounder2, abstol = 1.0e-20)

library(MASS)

fit <- lda(treatment ~ confounder1 + confounder2, data = mydata, na.action = na.omit)
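
The later commands refer to a variable named propen. As a minimal sketch of how such a variable could be obtained, the following fits a logistic regression and stores the fitted probabilities. It uses glm with family = binomial rather than multinom (a substitution for illustration, suitable when treatment has two groups), and the data are simulated.

```r
# Simulated data (hypothetical): treatment depends on two confounders
set.seed(42)
n <- 300
mydata <- data.frame(confounder1 = rnorm(n), confounder2 = rnorm(n))
p_true <- plogis(-0.5 + 0.8 * mydata$confounder1 - 0.5 * mydata$confounder2)
mydata$treatment <- rbinom(n, 1, p_true)

# Logistic regression of treatment on the confounders
psmodel <- glm(treatment ~ confounder1 + confounder2,
               data = mydata, family = binomial)

# Fitted values are the estimated probabilities of receiving treatment
mydata$propen <- fitted(psmodel)

summary(mydata$propen)
```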

MATCHING

The Match function in the Matching package and the matchit function in the MatchIt package perform a variety of matching procedures.

Exact and Coarsened Exact

Exact matching is available in the Match function through the exact argument. An example with 1-1 matching is given below. Exact and coarsened exact matching can also be done with the matchit function.

rr <- Match(Y = Y, Tr = Tr, X = X, M = 1, exact = TRUE)

summary(rr)

m.out <- matchit(treatment ~ propen, data = mydata, method = "exact")

summary (m.out)

m.out <- matchit(treatment ~ propen, data = mydata, method = "cem")

summary (m.out)

Nearest Neighbor and Caliper

Several R packages perform nearest neighbor matching, among them MatchIt, through the matchit function. A caliper can be imposed with the caliper argument of matchit.

m.out <- matchit(treatment ~ propen, data = mydata, method = "nearest")

summary (m.out)
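
To make the logic of nearest neighbor matching with a caliper concrete, here is a minimal base R sketch of greedy 1-1 matching without replacement. It is an illustration under stated assumptions, not the algorithm matchit uses internally: the data are simulated, and the caliper of 0.2 standard deviations of the propensity score is an arbitrary illustrative choice.

```r
# Simulated propensity scores (hypothetical data)
set.seed(7)
n <- 100
treatment <- rbinom(n, 1, 0.3)
propen <- plogis(rnorm(n, mean = 0.5 * treatment))

caliper  <- 0.2 * sd(propen)       # maximum allowed distance
treated  <- which(treatment == 1)
controls <- which(treatment == 0)  # pool of unused controls
match_id <- rep(NA_integer_, length(treated))

for (i in seq_along(treated)) {
  if (length(controls) == 0) break
  # Distance from this treated case to every remaining control
  d <- abs(propen[controls] - propen[treated[i]])
  j <- which.min(d)
  # Accept the nearest control only if it lies within the caliper
  if (d[j] <= caliper) {
    match_id[i] <- controls[j]
    controls <- controls[-j]       # match without replacement
  }
}

matched <- data.frame(treated = treated, control = match_id)
head(matched)
```

Treated cases with no control inside the caliper are left unmatched (NA).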

1-Many

The Matching package performs a variety of 1-many matches, as well as 1-1 matching and genetic matching. It uses the Match function (see above), where the parameter M sets the number of matches desired. In matchit, the corresponding parameter is ratio.

m.out <- matchit(treatment ~ propen, data = mydata, method = "nearest", ratio = 2)

summary (m.out)

Optimized, Full, and Genetic Matching

The matchit function, which relies on the optmatch package, performs optimal and full matching. Genetic matching is performed by the GenMatch function in the Matching package.

m.out <- matchit(treatment ~ propen, data = mydata, method = "optimal", ratio=2)

summary (m.out)

m.out <- matchit(treatment ~ propen, data = mydata, method = "full")

summary (m.out)

STRATIFYING

A stratifying variable can be computed with a series of conditional assignment statements.

attach(mydata)

mydata$propenstrata[propen > .80] <- ".80+"

mydata$propenstrata[propen > .60 & propen <= .80] <- ".60-.80"

mydata$propenstrata[propen > .40 & propen <= .60] <- ".40-.60"

mydata$propenstrata[propen > .20 & propen <= .40] <- ".20-.40"

mydata$propenstrata[propen <= .20] <- "0-.20"

detach(mydata)
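
The same five strata can also be produced in a single step with the base R cut function. This is an alternative sketch on simulated scores, not the approach given above. With its default settings cut uses intervals that are open on the left and closed on the right, which agrees with the conditions in the assignment statements.

```r
# Simulated propensity scores (hypothetical data)
set.seed(11)
mydata <- data.frame(propen = runif(50))

# Intervals are (0,.20], (.20,.40], ... ; include.lowest also admits 0
mydata$propenstrata <- cut(mydata$propen,
                           breaks = c(0, .20, .40, .60, .80, 1),
                           labels = c("0-.20", ".20-.40", ".40-.60",
                                      ".60-.80", ".80+"),
                           include.lowest = TRUE)

table(mydata$propenstrata)
```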

REGRESSION AND THE GENERAL LINEAR MODEL

In R, regression and the general linear model can be fit with the lm function.

lm.r <- lm(treatment ~ confounder1 + confounder2 + confounder3)

TWO-STAGE LEAST SQUARES

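The systemfit package can be used for 2SLS in R. The first argument must be a model formula, assumed here to be the outcome regressed on treatment, and the inst argument lists the instruments.

library(systemfit)

fit2sls <- systemfit(outcome ~ treatment, method = "2SLS", inst = ~ instrument1 + instrument2 + instrument3)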
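
The two stages behind 2SLS can be sketched by hand with lm. The example below uses simulated data with a single instrument and a continuous treatment variable, both assumptions made for illustration. It reproduces only the point estimate. The standard errors printed by the second lm call are not valid 2SLS standard errors, which is one reason to use a dedicated function.

```r
# Simulated data (hypothetical): u is an unobserved confounder
set.seed(9)
n <- 500
instrument1 <- rnorm(n)
u <- rnorm(n)
treatment <- 0.8 * instrument1 + u + rnorm(n)
outcome <- 1.5 * treatment - u + rnorm(n)

# Stage 1: predict treatment from the instrument
stage1 <- lm(treatment ~ instrument1)
treatment_hat <- fitted(stage1)

# Stage 2: regress the outcome on the predicted treatment
stage2 <- lm(outcome ~ treatment_hat)
coef(stage2)

# With a single instrument the slope equals this ratio of covariances
iv_ratio <- cov(outcome, instrument1) / cov(treatment, instrument1)
iv_ratio
```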

SAMPLE WEIGHTING

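Sample weights can be calculated using conditional assignment. With propen as the estimated probability of receiving treatment, treated cases are weighted by 1/propen and untreated cases by 1/(1 - propen). Sample weighting is done within the survey package.

mydata$ipw[treatment == 1] <- 1/propen[treatment == 1]

mydata$ipw[treatment == 0] <- 1/(1 - propen[treatment == 0])

library(survey)

mydataweighted <- svydesign(id = ~1, weights = ~ipw, data = mydata)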
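
As a minimal base R sketch of inverse probability weighting, the following builds the weights in one step with ifelse and compares the raw and weighted group means of a confounder. The data are simulated, and propen is taken to be the estimated probability of receiving treatment; both are assumptions made for illustration.

```r
# Simulated data (hypothetical): treatment depends on the confounder
set.seed(3)
n <- 400
confounder1 <- rnorm(n)
treatment <- rbinom(n, 1, plogis(0.7 * confounder1))
propen <- fitted(glm(treatment ~ confounder1, family = binomial))

# Treated cases get 1/p, untreated cases get 1/(1 - p)
ipw <- ifelse(treatment == 1, 1 / propen, 1 / (1 - propen))

# Difference in confounder means between groups, before and after weighting
raw_diff <- mean(confounder1[treatment == 1]) -
  mean(confounder1[treatment == 0])
wtd_diff <- weighted.mean(confounder1[treatment == 1], ipw[treatment == 1]) -
  weighted.mean(confounder1[treatment == 0], ipw[treatment == 0])

c(raw = raw_diff, weighted = wtd_diff)
```

If the propensity model is adequate, the weighted difference should be much closer to zero than the raw difference.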

WEIGHTED LEAST SQUARES

Many analysis functions accept a weights parameter. The linear modeling function, lm, gives an example.

outcomepred <- lm(outcome ~ treatment, data = mydata, weights = ipw)

summary(outcomepred)
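
To show what the weighted regression estimates, the sketch below fits it on simulated data and checks that the treatment coefficient equals the difference between the weighted outcome means of the two groups. All data and the propensity model are hypothetical.

```r
# Simulated data (hypothetical)
set.seed(5)
n <- 300
confounder1 <- rnorm(n)
treatment <- rbinom(n, 1, plogis(0.6 * confounder1))
outcome <- 2 * treatment + confounder1 + rnorm(n)
mydata <- data.frame(outcome, treatment, confounder1)

# Propensity scores and inverse probability weights
mydata$propen <- fitted(glm(treatment ~ confounder1,
                            family = binomial, data = mydata))
mydata$ipw <- with(mydata,
                   ifelse(treatment == 1, 1 / propen, 1 / (1 - propen)))

# Weighted least squares of the outcome on treatment
outcomepred <- lm(outcome ~ treatment, data = mydata, weights = ipw)
coef(outcomepred)["treatment"]

# The same quantity as a difference in weighted means
wm1 <- with(mydata[mydata$treatment == 1, ], weighted.mean(outcome, ipw))
wm0 <- with(mydata[mydata$treatment == 0, ], weighted.mean(outcome, ipw))
wm1 - wm0
```

The standard errors reported by lm treat the weights as fixed and known, so they should be read with caution when the weights are estimated.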

GENERALIZED LINEAR MODEL

Generalized linear model estimation in R is done with the glm function.

outcomepred <- glm(formula = outcome ~ treatment, family = gaussian)

MISSING DATA ANALYSIS

Analysis of missing data uses the summary function and the missingness map function, missmap, within the Amelia package. See the creation of a.out under imputation.

summary(data.frame(confounder1, confounder2, confounder3))

missmap(a.out)
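
Missingness can also be tabulated directly in base R. The following sketch counts missing values per variable and the number of complete cases in a small made-up data set (the values are hypothetical).

```r
# Small illustrative data set with missing values (hypothetical)
mydata <- data.frame(
  confounder1 = c(1.2, NA, 3.1, 4.0, NA),
  confounder2 = c(10, 12, NA, 9, 11),
  confounder3 = c(0, 1, 1, 0, 1)
)

# Number and percentage of missing values in each variable
n_missing <- colSums(is.na(mydata))
pct_missing <- 100 * colMeans(is.na(mydata))

# Number of cases with no missing values on any variable
complete <- sum(complete.cases(mydata))

n_missing
pct_missing
complete
```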

IMPUTATION OF MISSING DATA

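Imputation of data can be done with the amelia function in the Amelia package. It takes a data set with missing values, and the parameter m sets the number of imputed data sets.

library(Amelia)

a.out <- amelia(mydata, m = 5)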

Revised 1/31/2013