Package 'stratamatch' reference manual

Title:	Stratification and Matching for Large Observational Data Sets
Description:	A pilot matching design to automatically stratify and match large datasets. The manual_stratify() function allows users to manually stratify a dataset based on categorical variables of interest, while the auto_stratify() function does automatically by allocating a held-aside (pilot) data set, fitting a prognostic score (see Hansen (2008) <doi:10.1093/biomet/asn004>) on the pilot set, and stratifying the data set based on prognostic score quantiles. The strata_match() function then does optimal matching of the data set in parallel within strata.
Authors:	Rachael C. Aikens [aut, cre], Joseph Rigdon [aut], Justin Lee [aut], Michael Baiocchi [aut], Jonathan Chen [aut]
Maintainer:	Rachael C. Aikens <[email protected]>
License:	GPL-3
Version:	0.1.9
Built:	2025-03-13 05:33:06 UTC
Source:	https://github.com/cran/stratamatch

Auto Stratify

Description

Automatically creates strata for matching based on a prognostic score formula or a vector of prognostic scores already estimated by the user. Creates a auto_strata object, which can be passed to strata_match for stratified matching or unpacked by the user to be matched by some other means.

Usage

auto_stratify(
  data,
  treat,
  prognosis,
  outcome = NULL,
  size = 2500,
  pilot_fraction = 0.1,
  pilot_size = NULL,
  pilot_sample = NULL,
  group_by_covariates = NULL
)
auto_stratify(
  data,
  treat,
  prognosis,
  outcome = NULL,
  size = 2500,
  pilot_fraction = 0.1,
  pilot_size = NULL,
  pilot_sample = NULL,
  group_by_covariates = NULL
)

Arguments

`data`	`data.frame` with observations as rows, features as columns
`treat`	string giving the name of column designating treatment assignment
`prognosis`	information on how to build prognostic scores. Three different input types are allowed: vector of prognostic scores for all individuals in the data set. Should be in the same order as the rows of `data`. a `formula` for fitting a prognostic model an already-fit prognostic score model
`outcome`	string giving the name of column with outcome information. Required if prognostic_scores is specified. Otherwise it will be inferred from prog_formula
`size`	numeric, desired size of strata (default = 2500)
`pilot_fraction`	numeric between 0 and 1 giving the proportion of controls to be allotted for building the prognostic score (default = 0.1)
`pilot_size`	alternative to pilot_fraction. Approximate number of observations to be used in pilot set. Note that the actual pilot set size returned may not be exactly `pilot_size` if `group_by_covariates` is specified because balancing by covariates may result in deviations from desired size. If `pilot_size` is specified, `pilot_fraction` is ignored.
`pilot_sample`	a data.frame of held aside samples for building prognostic score model. If `pilot_sample` is specified, `pilot_size` and `pilot_fraction` are both ignored.
`group_by_covariates`	character vector giving the names of covariates to be grouped by (optional). If specified, the pilot set will be sampled in a stratified manner, so that the composition of the pilot set reflects the composition of the whole data set in terms of these covariates. The specified covariates must be categorical.

Details

Stratifying by prognostic score quantiles can be more effective than manually stratifying a data set because the prognostic score is continuous, thus the strata produced tend to be of equal size with similar prognosis.

Automatic stratification requires information on how the prognostic scores should be derived. This is primarily determined by the specifciation of the prognosis argument. Three main forms of input for prognosis are allowed:

A vector of prognostic scores. This vector should be the same length and order of the rows in the data set. If this method is used, the outcome argument must also be specified; this is simply a string giving the name of the column which contains outcome information.
A formula for prognosis (e.g. outcome ~ X1 + X2). If this method is used, auto_stratify will automatically split the data set into a pilot_set and an analysis_set. The pilot set will be used to fit a logistic regression model for outcome in the absence of treatment, and this model will be used to estimate prognostic scores on the analysis set. The analysis set will then be stratified based on the estimated prognostic scores. In this case the outcome argument need not be specified since it can be inferred from the input formula.
A model for prognosis (e.g. a glm object). If this method is used, the outcome argument must also be specified

Value

Returns an auto_strata object. This contains:

outcome - a string giving the name of the column where outcome information is stored
treat - a string giving the name of the column encoding treatment assignment
analysis_set - the data set with strata assignments
call - the call to auto_stratify used to generate this object
issue_table - a table of each stratum and potential issues of size and treat:control balance. In small or imbalanced strata, it may be difficult or infeasible to find high-quality matches, while very large strata may be computationally intensive to match.
strata_table - a table of each stratum and the prognostic score quantile bin to which it corresponds
prognostic_scores - a vector of prognostic scores.
prognostic_model - a model for prognosis fit on a pilot data set. Will be NULL if a vector of prognostic scores was provided as the prognosis argument to auto_stratify rather than a model or formula.
pilot_set - the set of controls used to fit the prognostic model. These are excluded from subsequent analysis so that the prognostic score is not overfit to the data used to estimate the treatment effect. Will be NULL if a pre-fit model or a vector of prognostic scores was provided as the prognosis argument to auto_stratify rather than formula.

Troubleshooting

This section suggests fixes for common errors that appear while fitting the prognostic score or using it to estimate prognostic scores on the analysis set.

Encountered an error while fitting the prognostic model... numeric probabilities 0 or 1 produced. This error means that the prognostic model can perfectly separate positive from negative outcomes. Estimating a treatment effect in this case is unwise since an individual's baseline characteristics perfectly determine their outcome, regardless of whether they recieve the treatment. This error may also appear on rare occaisions when your pilot set is very small (number of observations approximately <= number of covariates in the prognostic model), so that perfect separation happens by chance.
Encountered an error while estimating prognostic scores ... factor X has new levels ... This may indicate that some value(s) of one or more categorical variables appear in the analysis set which were not seen in the pilot set. This means that when we try to obtain prognostic scores for our analysis set, we run into some new value that our prognostic model was not prepared to handle. There are a few options we have to troubleshoot this problem:
- Rejection sampling. Run auto_stratify again with the same arguments until this error does not occur (i.e. until some observations with the missing value are randomly selected into the pilot set)
- Eliminate this covariate from the prognostic formula.
- Remove observations with the rare covariate value from the entire data set. Consider carefully how this exclusion might affect your results.

Other errors or warnings can occur if the pilot set is too small and the prognostic formula is too complicated. Always make sure that the number of observations in the pilot set is large enough that you can confidently fit a prognostic model with the number of covariates you want.

Examples

# make sample data set
set.seed(111)
dat <- make_sample_data(n = 75)

# construct a pilot set, build a prognostic score for `outcome` based on X2
# and stratify the data set based on the scores into sets of about 25
# observations
a.strat_formula <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# stratify the data set based on a model for prognosis
pilot_data <- make_sample_data(n = 30)
prognostic_model <- glm(outcome ~ X2, pilot_data, family = "binomial")
a.strat_model <- auto_stratify(dat, "treat", prognostic_model,
  outcome = "outcome", size = 25
)

# stratify the data set based on a vector of prognostic scores
prognostic_scores <- predict(prognostic_model,
  newdata = dat,
  type = "response"
)
a.strat_scores <- auto_stratify(dat, "treat", prognostic_scores,
  outcome = "outcome", size = 25
)

# diagnostic plots
plot(a.strat_formula)
plot(a.strat_formula, type = "AC", propensity = treat ~ X1, stratum = 1)
plot(a.strat_formula, type = "hist", propensity = treat ~ X1, stratum = 1)
plot(a.strat_formula, type = "residual")
# make sample data set
set.seed(111)
dat <- make_sample_data(n = 75)

# construct a pilot set, build a prognostic score for `outcome` based on X2
# and stratify the data set based on the scores into sets of about 25
# observations
a.strat_formula <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# stratify the data set based on a model for prognosis
pilot_data <- make_sample_data(n = 30)
prognostic_model <- glm(outcome ~ X2, pilot_data, family = "binomial")
a.strat_model <- auto_stratify(dat, "treat", prognostic_model,
  outcome = "outcome", size = 25
)

# stratify the data set based on a vector of prognostic scores
prognostic_scores <- predict(prognostic_model,
  newdata = dat,
  type = "response"
)
a.strat_scores <- auto_stratify(dat, "treat", prognostic_scores,
  outcome = "outcome", size = 25
)

# diagnostic plots
plot(a.strat_formula)
plot(a.strat_formula, type = "AC", propensity = treat ~ X1, stratum = 1)
plot(a.strat_formula, type = "hist", propensity = treat ~ X1, stratum = 1)
plot(a.strat_formula, type = "residual")

Extract cutoffs between strata

Description

By default, returns only the internal cut points. Cutoffs at 0 and 1 are implied.

Usage

extract_cut_points(x)
extract_cut_points(x)

Arguments

`x`	an autostrata object

Value

a vector of the score values delineating cutoffs between strata

Examples

dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
cutoffs <- extract_cut_points(a.strat)
dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
cutoffs <- extract_cut_points(a.strat)

Extract cutoffs between strata

Description

Extract cutoffs between strata

Usage

## S3 method for class 'auto_strata'
extract_cut_points(x)
## S3 method for class 'auto_strata'
extract_cut_points(x)

Arguments

`x`	an autostrata object

Value

a vector of the score values delineating cutoffs between strata

Demographics and comorbidities of 10,157 ICU patients

Description

An deidentified data set containing the demographics, comorbidities, DNR code status, and surgical team assignment of 10,157 patients in the Stanford University Hospital Intensive Care Unit (ICU). This data was extracted from the electronic record system, deidentified, and made publically available by Chavez et al (2018) <doi:10.1371/journal.pone.0190569>. It was reprocessed for use in the stratamatch package as a sample data set. For more details on the data extraction and inclusion criteria, see Chavez et al.

Usage

ICU_data
ICU_data

Format

A data frame with 10157 rows and 29 variables:

patid: patient id, numeric
Birth.preTimeDays: age of patient at time of admission to the ICU in days, numeric
Female.pre: whether the patient was documented to be female prior to ICU visit, binary
RaceAsian.pre: whether the patient's race/ethnicity was documented as Asian prior to ICU visit, binary
RaceUnknown.pre: whether the patient's race/ethnicity was unknown prior to ICU visit, binary
RaceOther.pre: whether the patient's race/ethnicity was documented as Other" prior to ICU visit, binary
RaceBlack.pre: whether the patient's race/ethnicity was documented as Black/African American prior to ICU visit, binary
RacePacificIslander.pre: whether the patient's race/ethnicity was documented as PacificIslander prior to ICU visit, binary
RaceNativeAmerican.pre: whether the patient's race/ethnicity was documented as Native American prior to ICU visit, binary
self_pay: whether the patient was "self pay" (i.e. uninsured), binary
all_latinos: whether the patient was documented to be latino prior to ICU visit, binary
DNR: whether the patient had code status set to any DNR "Do not resuscitate" order at any point during their ICU stay, binary
surgicalTeam: whether the patient was assigned to a surgical team at any point during their ICU stay, binary

Details

License information for this data is as follows:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Source

https://simtk.org/frs/download_confirm.php/latestzip/1969/ICUDNR-latest.zip?group_id=892

Checks `auto_strata` class

Description

Checks if the target object is an auto_strata object.

Usage

is.auto_strata(object)
is.auto_strata(object)

Arguments

object

any R object

Value

Returns TRUE if its argument has auto_strata among its classes and FALSE otherwise.

Examples

dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
is.auto_strata(a.strat) # returns TRUE
dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
is.auto_strata(a.strat) # returns TRUE

Checks `manual_strata` class

Description

Checks if the target object is a manual_strata object.

Usage

is.manual_strata(object)
is.manual_strata(object)

Arguments

object

any R object

Value

Returns TRUE if its argument has manual_strata among its classes and FALSE otherwise.

Examples

dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
is.manual_strata(m.strat) # returns TRUE
dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
is.manual_strata(m.strat) # returns TRUE

Checks `strata` class

Description

Checks if the target object is a strata object.

Usage

is.strata(object)
is.strata(object)

Arguments

object

any R object

Value

Returns TRUE if its argument has strata among its classes and FALSE otherwise.

Examples

dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
is.strata(m.strat) # returns TRUE
dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
is.strata(m.strat) # returns TRUE

Make match distances within strata

Description

Makes the match distance with strata specifications for strata_match. This function is largely unecessary to call outside of stratamatch, but it is exported for the benefit of the user to aid in debugging. Note that this function requires that the R package optmatch is installed.

Usage

make_match_distances(object, model, method)
make_match_distances(object, model, method)

Arguments

`object`	a strata object
`model`	(optional) formula for matching. If left blank, all columns of the analysis set in `object` will be used as covariates in the propensity model or mahalanobis match (except outcome, treatment and stratum)
`method`	either "prop" for propensity score matching based on a glm fit with model `model`, or "mahal" for mahalanobis distance matching by the covariates in `model`.

Value

a match distance matrix for optmatch

Examples


dat <- make_sample_data(n = 75)

# stratify with auto_stratify
a.strat <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# make match distances.  Requires optmatch package to be installed.

md <- make_match_distances(a.strat, treat ~ X1 + X2, method = "mahal")

dat <- make_sample_data(n = 75)

# stratify with auto_stratify
a.strat <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# make match distances.  Requires optmatch package to be installed.

md <- make_match_distances(a.strat, treat ~ X1 + X2, method = "mahal")

Make sample data

Description

Makes a simple data frame with treat (binary), outcome (binary), and five covariates: X1 (continuous), X2 (continuous), B1 (binary), B2 (binary), and C1 (categorical). Probability outcome = 1 is sigmoid(treat + X1). Probability treatment = 1 is sigmoid(- 0.2 * X1 + X2 - B1 + 2 * B2)

Usage

make_sample_data(n = 100)
make_sample_data(n = 100)

Arguments

`n`	the size of the desired data set

Examples

# make sample data set of 30 observations
dat <- make_sample_data(n = 30)
# make sample data set of 30 observations
dat <- make_sample_data(n = 30)

Manual Stratify

Description

Stratifies a data set based on a set of blocking covariates specified by the user. Creates a manual_strata object, which can be passed to strata_match for stratified matching or unpacked by the user to be matched by some other means.

Usage

manual_stratify(data, strata_formula, force = FALSE)
manual_stratify(data, strata_formula, force = FALSE)

Arguments

`data`	data.frame with observations as rows, features as columns
`strata_formula`	the formula to be used for stratification. (e.g. `treat ~ X1`) the variable on the left is taken to be the name of the treatment assignment column, and the variables on the left are taken to be the variables by which the data should be stratified
`force`	a boolean. If true, run even if a variable appears continuous. (default = FALSE)

Value

Returns a manual_strata object. This contains:

treat - a string giving the name of the column encoding treatment assignment
covariates - a character vector with the names of the categorical columns on which the data were stratified
analysis_set - the data set with strata assignments
call - the call to manual_stratify used to generate this object
issue_table - a table of each stratum and potential issues of size and treat:control balance. In small or imbalanced strata, it may be difficult or infeasible to find high-quality matches, while very large strata may be computationally intensive to match.
strata_table - a table of each stratum and the covariate bin to which it corresponds

Examples

# make sample data set
dat <- make_sample_data(n = 75)

# stratify based on B1 and B2
m.strat <- manual_stratify(dat, treat ~ B1 + B2)

# diagnostic plot
plot(m.strat)
# make sample data set
dat <- make_sample_data(n = 75)

# stratify based on B1 and B2
m.strat <- manual_stratify(dat, treat ~ B1 + B2)

# diagnostic plot
plot(m.strat)

Plot method for `auto_strata` object

Description

Generates diagnostic plots for the product of a stratification by auto_stratify. There are four plot types:

"SR" (default) - produces a scatter plot of strata by size and treat:control ratio
"hist" - produces a histogram of propensity scores within a stratum
"AC" - produces a Assignment-Control plot of individuals within a stratum
"residual" - produces a residual plot for the prognostic model

Usage

## S3 method for class 'auto_strata'
plot(
  x,
  type = "SR",
  label = FALSE,
  stratum = "all",
  strata_lines = TRUE,
  jitter_prognosis,
  jitter_propensity,
  propensity,
  ...
)
## S3 method for class 'auto_strata'
plot(
  x,
  type = "SR",
  label = FALSE,
  stratum = "all",
  strata_lines = TRUE,
  jitter_prognosis,
  jitter_propensity,
  propensity,
  ...
)

Arguments

`x`	an `auto_strata` object returned by `auto_stratify`
`type`	string giving the plot type (default = `"SR"`). Other options are `"hist"`, `"AC"` and `"residual"`
`label`	ignored unless `type = "SR"`. If `TRUE`, a clickable plot is produced. The user may click on any number of strata and press finish to have those strata labeled. Note: uses `identify`, which may not be supported on some devices
`stratum`	ignored unless `type = "hist"` or `type = "AC"`. A number specifying which stratum to plot.
`strata_lines`	default = `TRUE`. Ignored unless `type = "AC"`. If TRUE, lines on the plot indicate strata cut points.
`jitter_prognosis`	ignored unless `type = "AC"`. Amount of uniform random noise to add to prognostic scores in plot.
`jitter_propensity`	ignored unless `type = "AC"`. Amount of uniform random noise to add to propensity scores in plot.
`propensity`	ignored unless `type = "hist"` or `type = "AC"`. Specifies propensity score information for plots where this is required. Accepts either a vector of propensity scores, a `glm` model for propensity scores, or a formula for fitting a propensity score model.
`...`	other arguments

Examples

dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
plot(a.strat) # makes size-ratio scatter plot
plot(a.strat, type = "hist", propensity = treat ~ X1, stratum = 1)
plot(a.strat, type = "AC", propensity = treat ~ X1, stratum = 1)
plot(a.strat, type = "residual")
dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
plot(a.strat) # makes size-ratio scatter plot
plot(a.strat, type = "hist", propensity = treat ~ X1, stratum = 1)
plot(a.strat, type = "AC", propensity = treat ~ X1, stratum = 1)
plot(a.strat, type = "residual")

Plot method for `manual_strata` object

Description

Generates diagnostic plots for the product of a stratification by manual_stratify. There are two plot types:

"SR" (default) - produces a scatter plot of strata by size and treat:control ratio
"hist" - produces a histogram of propensity scores within a stratum.

Note that residual plots and AC plots are not supported for manual_strata objects because no prognostic model is fit.

Usage

## S3 method for class 'manual_strata'
plot(x, type = "SR", label = FALSE, stratum = "all", propensity, ...)
## S3 method for class 'manual_strata'
plot(x, type = "SR", label = FALSE, stratum = "all", propensity, ...)

Arguments

`x`	a `manual_strata` object returned by `manual_stratify`
`type`	string giving the plot type (default = `"SR"`). Other option is `"hist"`
`label`	ignored unless `type = "SR"`. If `TRUE`, a clickable plot is produced. The user may click on any number of strata and press finish to have those strata labeled. Note: uses `identify`, which may not be supported on some devices
`stratum`	ignored unless `type = "hist"`. A number specifying which stratum to plot.
`propensity`	ignored unless `type = "hist"`. Specifies propensity score information for plots where this is required. Accepts either a vector of propensity scores, a `glm` model for propensity scores, or a formula for fitting a propensity score model.
`...`	other arguments

Examples

dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
plot(m.strat) # makes size-ratio scatter plot
plot(m.strat, type = "hist", propensity = treat ~ X1, stratum = 1)
dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
plot(m.strat) # makes size-ratio scatter plot
plot(m.strat, type = "hist", propensity = treat ~ X1, stratum = 1)

Print Auto Strata

Description

Print method for auto_strata object

Usage

## S3 method for class 'auto_strata'
print(x, ...)
## S3 method for class 'auto_strata'
print(x, ...)

Arguments

`x`	an `auto_strata` object
`...`	other arguments

Examples

dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
print(a.strat) # prints information about a.strat
dat <- make_sample_data()
a.strat <- auto_stratify(dat, "treat", outcome ~ X1 + X2)
print(a.strat) # prints information about a.strat

Print Manual Strata

Description

Print method for manual_strata object

Usage

## S3 method for class 'manual_strata'
print(x, ...)
## S3 method for class 'manual_strata'
print(x, ...)

Arguments

`x`	a `manual_strata` object
`...`	other arguments

Examples

dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
print(m.strat) # prints information about m.strat
dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
print(m.strat) # prints information about m.strat

Split data into pilot and analysis sets

Description

Given a data set and some parameters about how to split the data, this function partitions the data accordingly and returns the partitioned data as a list containing the analysis_set and pilot_set.

Usage

split_pilot_set(
  data,
  treat,
  pilot_fraction = 0.1,
  pilot_size = NULL,
  group_by_covariates = NULL
)
split_pilot_set(
  data,
  treat,
  pilot_fraction = 0.1,
  pilot_size = NULL,
  group_by_covariates = NULL
)

Arguments

`data`	`data.frame` with observations as rows, features as columns
`treat`	string giving the name of column designating treatment assignment
`pilot_fraction`	numeric between 0 and 1 giving the proportion of controls to be allotted for building the prognostic score (default = 0.1)
`pilot_size`	alternative to pilot_fraction. Approximate number of observations to be used in pilot set. Note that the actual pilot set size returned may not be exactly `pilot_size` if `group_by_covariates` is specified because balancing by covariates may result in deviations from desired size. If `pilot_size` is specified, `pilot_fraction` is ignored.
`group_by_covariates`	character vector giving the names of covariates to be grouped by (optional). If specified, the pilot set will be sampled in a stratified manner, so that the composition of the pilot set reflects the composition of the whole data set in terms of these covariates. The specified covariates must be categorical.

Value

a list with analaysis_set and pilot_set

Examples

dat <- make_sample_data()
splt <- split_pilot_set(dat, "treat", 0.2)
# can be passed into auto_stratify if desired
a.strat <- auto_stratify(splt$analysis_set, "treat", outcome ~ X1,
  pilot_sample = splt$pilot_set
)
dat <- make_sample_data()
splt <- split_pilot_set(dat, "treat", 0.2)
# can be passed into auto_stratify if desired
a.strat <- auto_stratify(splt$analysis_set, "treat", outcome ~ X1,
  pilot_sample = splt$pilot_set
)

Strata Match

Description

Match within strata in series using optmatch. Note that this function requires that the R package optmatch is installed.

Usage

strata_match(object, model = NULL, method = "prop", k = 1)
strata_match(object, model = NULL, method = "prop", k = 1)

Arguments

`object`	a strata object
`model`	(optional) formula for matching. If left blank, all columns of the analysis set in `object` will be used as covariates in the propensity model or mahalanobis match (except outcome, treatment and stratum)
`method`	either "prop" for propensity score matching based on a glm fit with model `model`, or "mahal" for mahalanobis distance matching by the covariates in `model`.
`k`	the number of control individuals to be matched to each treated individual. If `"k = full"` is used, fullmatching is done instead of pairmatching

Value

a named factor with matching assignments

Examples

# make a sample data set
set.seed(1)
dat <- make_sample_data(n = 75)

# stratify with auto_stratify
a.strat <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# 1:1 match based on propensity formula: treat ~ X1 + X2
# Requires optmatch package to be installed.

strata_match(a.strat, model = treat ~ X1 + X2, k = 1)


# full match within strata based on mahalanobis distance.
# Requires optmatch package to be installed.

strata_match(a.strat, model = treat ~ X1 + X2, method = "mahal", k = 1)

# make a sample data set
set.seed(1)
dat <- make_sample_data(n = 75)

# stratify with auto_stratify
a.strat <- auto_stratify(dat, "treat", outcome ~ X2, size = 25)

# 1:1 match based on propensity formula: treat ~ X1 + X2
# Requires optmatch package to be installed.

strata_match(a.strat, model = treat ~ X1 + X2, k = 1)


# full match within strata based on mahalanobis distance.
# Requires optmatch package to be installed.

strata_match(a.strat, model = treat ~ X1 + X2, method = "mahal", k = 1)

stratamatch: stratify and match large data sets

Description

This package employs a pilot matching design to automatically stratify and match large datasets. The manual_stratify function allows users to manually stratify a dataset based on categorical variables of interest, while the auto_stratify function does automatically by allocating a held-aside (pilot) data set, fitting a prognostic score (see Hansen (2008) <doi:10.1093/biomet/asn004>) on the pilot set, and stratifying the data set based on prognostic score quantiles. The strata_match function then does optimal matching of the data set within strata.

Summary for strata object

Description

Summarize number and sizes of strata in a strata object. Also prints number of strata with potential issues.

Usage

## S3 method for class 'strata'
summary(object, ...)
## S3 method for class 'strata'
summary(object, ...)

Arguments

`object`	a `strata` object
`...`	other arguments

Details

For more information, access the issue table for your strata object with mystrata$issue_table.

Examples

dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
summary(m.strat) # Summarizes strata in m.strat
dat <- make_sample_data()
m.strat <- manual_stratify(dat, treat ~ C1)
summary(m.strat) # Summarizes strata in m.strat

Package 'stratamatch'

Help Index

Auto Stratify

Description

Usage

Arguments

Details

Value

Troubleshooting

See Also

Examples

Extract cutoffs between strata

Description

Usage

Arguments

Value

Examples

Extract cutoffs between strata

Description

Usage

Arguments

Value

Demographics and comorbidities of 10,157 ICU patients

Description

Usage

Format

Details

Source

Checks auto_strata class

Description

Usage

Arguments

Value

Examples

Checks manual_strata class

Description

Usage

Arguments

Value

Examples

Checks strata class

Description

Usage

Arguments

Value

Examples

Make match distances within strata

Description

Usage

Arguments

Value

See Also

Examples

Make sample data

Description

Usage

Arguments

Examples

Manual Stratify

Description

Usage

Arguments

Value

See Also

Examples

Plot method for auto_strata object

Description

Usage

Arguments

See Also

Examples

Plot method for manual_strata object

Description

Usage

Arguments

Examples

Print Auto Strata

Description

Usage

Arguments

Checks `auto_strata` class

Checks `manual_strata` class

Checks `strata` class

Plot method for `auto_strata` object

Plot method for `manual_strata` object