Title: | Penalized Meta-Analysis |
---|---|
Description: | Conduct penalized meta-analysis, see Van Lissa, Van Erp, & Clapper (2023) <doi:10.31234/osf.io/6phs5>. In meta-analysis, there are often between-study differences. These can be coded as moderator variables, and controlled for using meta-regression. However, if the number of moderators is large relative to the number of studies, such an analysis may be overfit. Penalized meta-regression is useful in these cases, because it shrinks the regression slopes of irrelevant moderators towards zero. |
Authors: | Caspar J van Lissa [aut, cre] , Sara J van Erp [aut] |
Maintainer: | Caspar J van Lissa <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.3 |
Built: | 2024-10-26 05:52:22 UTC |
Source: | https://github.com/cjvanlissa/pema |
Penalized meta-regression shrinks the regression slopes of irrelevant moderators towards zero (Van Lissa, Van Erp, & Clapper, 2023).
Van Lissa, C. J., van Erp, S., & Clapper, E. B. (2023). Selecting relevant moderators with Bayesian regularized meta-regression. Research Synthesis Methods. doi:10.31234/osf.io/6phs5
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.26.2. https://mc-stan.org
Create a stanfit object from an object for which a method exists, so that all methods for stanfit objects can be used.
as.stan(x, ...)
x | An object for which a method exists. |
... | Arguments passed to or from other methods. |
An object of class stanfit, as documented in rstan::stan.
stanfit <- "a"
class(stanfit) <- "stanfit"
converted <- as.stan(stanfit)
This meta-analysis of rodent studies examined whether early life adversity (ELA) alters cognitive performance in several domains. The data include over 400 independent experiments, involving approximately 8600 animals.
data(bonapersona)
A data.frame with 734 rows and 65 columns.
Bonapersona, V., Kentrop, J., Van Lissa, C. J., van der Veen, R., Joels, M., & Sarabdjitsingh, R. A. (2019). The behavioral phenotype of early life adversity: A 3-level meta-analysis of rodent studies. Neuroscience & Biobehavioral Reviews, 102, 299–307. doi:10.1016/j.neubiorev.2019.04.021
This function conducts Bayesian regularized meta-regression (Van Lissa, Van Erp, & Clapper, 2023). It uses the rstan function rstan::sampling to fit the model. A lasso or horseshoe prior is used to shrink the regression coefficients of irrelevant moderators towards zero. See Details.
brma(x, ...)

## S3 method for class 'formula'
brma(
  formula,
  data,
  vi = "vi",
  study = NULL,
  method = "hs",
  standardize = TRUE,
  prior = switch(method,
    lasso = c(df = 1, scale = 1),
    hs = c(df = 1, df_global = 1, df_slab = 4, scale_global = 1, scale_slab = 2, relevant_pars = NULL)),
  mute_stan = TRUE,
  ...
)

## Default S3 method:
brma(
  x,
  y,
  vi,
  study = NULL,
  method = "hs",
  standardize,
  prior,
  mute_stan = TRUE,
  intercept,
  ...
)
x | A k x m numeric matrix, where k is the number of effect sizes and m is the number of moderators. |
... | Additional arguments passed on to |
formula | An object of class |
data | Either a |
vi | Character. Name of the column in the |
study | Character. Name of the column in the |
method | Character, indicating the type of regularizing prior to use. Supports one of |
standardize | Either a logical argument or a list. If |
prior | Numeric vector, specifying the prior to use. Note that the different |
mute_stan | Logical, indicating whether to mute all 'Stan' output. |
y | A numeric vector of k effect sizes. |
intercept | Logical, indicating whether or not an intercept should be included in the model. |
The Bayesian regularized meta-analysis algorithm (Van Lissa, Van Erp, & Clapper, 2023) penalizes meta-regression coefficients either via the lasso prior (Park & Casella, 2008) or the regularized horseshoe prior (Piironen & Vehtari, 2017).
The Bayesian equivalent of the lasso penalty is obtained when placing independent Laplace (i.e., double exponential) priors on the regression coefficients centered around zero. The scale of the Laplace priors is determined by a global scale parameter scale, which defaults to 1, and an inverse-tuning parameter which is given a chi-square prior governed by a degrees of freedom parameter df (defaults to 1). If standardize = TRUE, shrinkage will affect all coefficients equally and it is not necessary to adapt the scale parameter. Increasing the df parameter will allow larger values for the inverse-tuning parameter, leading to less shrinkage.
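The lasso hierarchy described above can be illustrated by drawing from the implied prior in base R. This is a minimal sketch, not the package's implementation: the helper names `rlaplace_sketch` and `sample_lasso_prior_sketch` are hypothetical, and the parametrization (Laplace scale equal to `scale` times a chi-square distributed inverse-tuning parameter) is an assumption based on the description.

```r
# Hypothetical helper: draw from a Laplace (double exponential) distribution.
# The difference of two independent unit-rate exponentials is Laplace(0, 1).
rlaplace_sketch <- function(n, location = 0, scale = 1) {
  location + scale * (rexp(n) - rexp(n))
}

# Hypothetical helper: draw coefficients from the assumed lasso prior.
sample_lasso_prior_sketch <- function(n, df = 1, scale = 1) {
  # Inverse-tuning parameter with a chi-square prior (assumed parametrization)
  inv_lambda <- rchisq(n, df = df)
  # Laplace prior on the coefficients, centered at zero
  rlaplace_sketch(n, location = 0, scale = scale * inv_lambda)
}

set.seed(1)
lasso_df1 <- sample_lasso_prior_sketch(1e4, df = 1)
lasso_df3 <- sample_lasso_prior_sketch(1e4, df = 3)

# A larger df should yield a wider prior, i.e., less shrinkage
c(df1 = median(abs(lasso_df1)), df3 = median(abs(lasso_df3)))
```

Comparing the median absolute draw under `df = 1` and `df = 3` gives a rough impression of how much less the prior concentrates around zero as `df` increases.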
One issue with the lasso prior is that it has relatively light tails. As a result, the lasso not only has the desirable behavior of pulling small coefficients to zero, but also shrinks large coefficients too much. An alternative prior that improves upon this shrinkage pattern is the horseshoe prior (Carvalho, Polson & Scott, 2010). The horseshoe prior has an infinitely large spike at zero, thereby pulling small coefficients toward zero, but also has fat tails, which allow substantial coefficients to escape the shrinkage. The regularized horseshoe is an extension of the horseshoe prior that allows the inclusion of prior information regarding the number of relevant predictors and can be more numerically stable in certain cases (Piironen & Vehtari, 2017).
The regularized horseshoe has a global shrinkage parameter that influences all coefficients similarly and local shrinkage parameters that enable flexible shrinkage patterns for each coefficient separately. The local shrinkage parameters are given a Student's t prior with a default df parameter of 1. Larger values for df result in lighter tails and a prior that is no longer strictly a horseshoe prior. However, increasing df slightly might be necessary to avoid divergent transitions in Stan (see also https://mc-stan.org/misc/warnings.html). Similarly, the degrees of freedom for the Student's t prior on the global shrinkage parameter, df_global, can be increased from the default of 1 to, for example, 3 if divergent transitions occur, although the resulting prior is then strictly no longer a horseshoe. The scale for the Student's t prior on the global shrinkage parameter, scale_global, defaults to 1 and can be decreased to achieve more shrinkage. Moreover, if prior information regarding the number of relevant moderators is available, it is recommended to include this information via the relevant_pars argument by setting it to the expected number of relevant moderators. When relevant_pars is specified, scale_global is ignored and the global scale is instead based on the available prior information. Contrary to the horseshoe prior, the regularized horseshoe applies additional regularization on large coefficients, which is governed by a Student's t prior with a scale_slab defaulting to 2 and df_slab defaulting to 4. This additional regularization ensures at least some shrinkage of large coefficients to avoid any sampling problems.
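To get a feel for how the horseshoe hyperparameters interact, the prior can be sampled directly in base R. The sketch below follows the regularized horseshoe formulation of Piironen & Vehtari (2017) as commonly written (half-t local and global scales, inverse-gamma slab); the function name `sample_hs_prior_sketch` is hypothetical, and whether the package uses exactly this parametrization is an assumption.

```r
# Hypothetical helper: draw coefficients from a regularized horseshoe prior.
sample_hs_prior_sketch <- function(n, df = 1, df_global = 1, df_slab = 4,
                                   scale_global = 1, scale_slab = 2) {
  # Local shrinkage parameters: half-Student-t with df degrees of freedom
  lambda <- abs(rt(n, df = df))
  # Global shrinkage parameter: scaled half-Student-t
  tau <- scale_global * abs(rt(n, df = df_global))
  # Slab variance c^2: inverse gamma, which implies a Student-t slab
  c2 <- 1 / rgamma(n, shape = df_slab / 2, rate = df_slab * scale_slab^2 / 2)
  # Regularized local scale: bounded by the slab for large lambda
  lambda_tilde <- sqrt(c2 * lambda^2 / (c2 + tau^2 * lambda^2))
  rnorm(n, mean = 0, sd = tau * lambda_tilde)
}

set.seed(2)
hs_default <- sample_hs_prior_sketch(1e4)
hs_tight   <- sample_hs_prior_sketch(1e4, scale_global = 0.1)

# A smaller scale_global should concentrate more prior mass near zero
c(default = median(abs(hs_default)), tight = median(abs(hs_tight)))
```

Decreasing `scale_global` in this sketch pulls the bulk of the draws closer to zero, while the slab parameters only limit how far the largest draws can escape.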
A list object of class brma, with the following structure:
list(
  fit          # An object of class stanfit, for compatibility with rstan
  coefficients # A numeric matrix with parameter estimates; these are
               # interpreted as regression coefficients, except tau2 and tau,
               # which are interpreted as the residual variance and standard
               # deviation, respectively.
  formula      # The formula used to estimate the model
  terms        # The predictor terms in the formula
  X            # Numeric matrix of moderator variables
  Y            # Numeric vector with effect sizes
  vi           # Numeric vector with effect size variances
  tau2         # Numeric, estimated tau2
  R2           # Numeric, estimated heterogeneity explained by the moderators
  k            # Numeric, number of effect sizes
  study        # Numeric vector with study id numbers
)
Van Lissa, C. J., van Erp, S., & Clapper, E. B. (2023). Selecting relevant moderators with Bayesian regularized meta-regression. Research Synthesis Methods. doi:10.31234/osf.io/6phs5
Park, T., & Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103(482), 681–686. doi:10.1198/016214508000000337
Carvalho, C. M., Polson, N. G., & Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika, 97(2), 465–480. doi:10.1093/biomet/asq017
Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2). https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-11/issue-2/Sparsity-information-and-regularization-in-the-horseshoe-and-other-shrinkage/10.1214/17-EJS1337SI.pdf
data("curry")
df <- curry[c(1:5, 50:55), c("d", "vi", "sex", "age", "donorcode")]
suppressWarnings({res <- brma(d~., data = df, iter = 10)})
A systematic review and meta-analysis of the effects of performing acts of kindness on the well-being of the actor.
data(curry)
A data.frame with 56 rows and 18 columns.
Column | Type | Description |
---|---|---|
study_id | factor | Unique identifier of the study |
effect_id | integer | Unique identifier of the effect size |
d | numeric | Standardized mean difference between the control group and intervention group |
vi | numeric | Variance of the effect size |
n1i | numeric | Number of participants in the intervention group |
n1c | numeric | Number of participants in the control group |
sex | numeric | Percentage of male participants |
age | numeric | Mean age of participants |
location | character | Geographical location of the study |
donor | character | From what population did the donors (helpers) originate? |
donorcode | factor | From what population did the donors (helpers) originate? Dichotomized to Anxious or Typical |
interventioniv | character | Description of the intervention / independent variable |
interventioncode | factor | Description of the intervention / independent variable, categorized to Acts of Kindness, Prosocial Spending, or Other |
control | character | Description of the control condition |
controlcode | factor | Description of the control condition, categorized to Neutral Activity, Nothing, or Self Help (performing a kind act for oneself) |
recipients | character | Who were the recipients of the act of kindness? |
outcomedv | character | What was the outcome, or dependent variable, of the study? |
outcomecode | factor | What was the outcome, or dependent variable, of the study? Categorized into Happiness, Life Satisfaction, PN Affect (positive or negative), and Other |
Curry, O. S., Rowland, L. A., Van Lissa, C. J., Zlotowitz, S., McAlaney, J., & Whitehouse, H. (2018). Happy to help? A systematic review and meta-analysis of the effects of performing acts of kindness on the well-being of the actor. Journal of Experimental Social Psychology, 76, 320–329. doi:10.1016/j.jesp.2018.02.014
I2 represents the amount of heterogeneity relative to the total amount of variance in the observed effect sizes (Higgins & Thompson, 2002). For three-level meta-analyses, it is additionally broken down into I2_w (amount of within-cluster heterogeneity) and I2_b (amount of between-cluster heterogeneity).
I2(x, ...)
x | An object for which a method exists. |
... | Arguments passed to other functions. |
Numeric matrix, with rows corresponding to I2 (total heterogeneity), and optionally I2_w and I2_b (within- and between-cluster heterogeneity).
I2(matrix(1:20, ncol = 1))
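For a conventional two-level meta-analysis, the definition of I2 as heterogeneity relative to total variance can be worked through by hand. The following base R sketch uses the DerSimonian-Laird estimate of tau2 and the "typical" sampling variance of Higgins & Thompson (2002); the function name `i2_sketch` and the toy data are hypothetical, and this is not necessarily how the package's methods compute I2 (in particular, they operate on model objects or posterior samples, not raw effect sizes).

```r
# Hypothetical helper: I2 (in percent) from effect sizes yi and variances vi.
i2_sketch <- function(yi, vi) {
  k <- length(yi)
  w <- 1 / vi                               # inverse-variance weights
  mu_fixed <- sum(w * yi) / sum(w)          # fixed-effect pooled estimate
  Q <- sum(w * (yi - mu_fixed)^2)           # Cochran's Q statistic
  C <- sum(w) - sum(w^2) / sum(w)
  tau2 <- max(0, (Q - (k - 1)) / C)         # DerSimonian-Laird estimate
  s2 <- (k - 1) * sum(w) / (sum(w)^2 - sum(w^2))  # typical sampling variance
  100 * tau2 / (tau2 + s2)                  # share of total variance
}

# Toy data: four effect sizes with equal sampling variances
yi <- c(0.1, 0.5, 0.9, 0.3)
vi <- rep(0.01, 4)
i2_heterogeneous <- i2_sketch(yi, vi)

# Identical effect sizes imply no heterogeneity
i2_homogeneous <- i2_sketch(rep(0.4, 4), vi)
```

When the estimated tau2 is positive, this quantity equals (Q - df) / Q with df = k - 1, which is the familiar closed form of I2.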
Find the parameter estimate with the highest posterior probability density given a vector of samples.
maxap(x, dens = NULL, ...)
x | Numeric vector. |
dens | Optional object of class |
... | Arguments passed to |
Atomic numeric vector with the maximum a-posteriori estimate of vector x.
maxap(c(1,2,3,4,5))
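One common way to obtain such an estimate from posterior samples is to take the mode of a kernel density estimate. The sketch below does this with stats::density; the helper name `map_sketch` is hypothetical, and it is an assumption that the package's function works along these lines.

```r
# Hypothetical helper: mode of a kernel density estimate of the samples.
map_sketch <- function(x, ...) {
  d <- density(x, ...)        # kernel density estimate over a grid
  d$x[which.max(d$y)]         # grid point with the highest estimated density
}

set.seed(42)
posterior_samples <- rnorm(5000, mean = 2, sd = 0.5)
map_estimate <- map_sketch(posterior_samples)
```

For a symmetric, unimodal posterior such as this one, the estimate should land close to the mean and median; for skewed posteriors it can differ noticeably from both.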
To perform a rudimentary sensitivity analysis, plot the posterior distributions of multiple BRMA models and compare them visually.
plot_sensitivity(..., parameters = NULL, model_names = NULL)
... | Objects of class |
parameters | Optional character vector with the names of parameters that exist in the models in |
model_names | Optional character vector with the names used to label the models in |
An object of class ggplot
plot_sensitivity(
  samples = list(
    data.frame(Parameter = "b", Value = rnorm(10), Model = "M1"),
    data.frame(Parameter = "b", Value = rnorm(10, mean = 2), Model = "M2")),
  parameters = "b")
Samples from a prior distribution with parameters defined in prior. The result can be plotted using the plot function.
sample_prior(
  method = c("hs", "lasso"),
  prior = switch(method,
    lasso = c(df = 1, scale = 1),
    hs = c(df = 1, df_global = 1, df_slab = 4, scale_global = 1, scale_slab = 2, par_ratio = NULL)),
  iter = 1000
)
method | Character string, indicating which prior to sample from. Default: first element of |
prior | Numeric vector, specifying the prior to use. See brma for more details. |
iter | A positive integer specifying the number of iterations to sample. Default: 1000 |
NULL, function is called for its side-effect of plotting to the graphics device.
sample_prior("lasso", iter = 10)
Launches a Shiny app that allows interactive comparison of different priors for brma.
shiny_prior()
NULL, function is called for its side-effect of launching a Shiny app.
## Not run:
shiny_prior()
## End(Not run)
This function simulates a meta-analytic dataset based on the random-effects model. The simulated effect size is Hedges' G, an estimator of the Standardized Mean Difference (Hedges, 1981; Li, Dusseldorp, & Meulman, 2017). The functional form of the model can be specified, and moderators can be either normally distributed or Bernoulli-distributed. See Van Lissa, in preparation, for a detailed explanation of the simulation procedure.
simulate_smd(
  k_train = 20,
  k_test = 100,
  mean_n = 40,
  es = 0.5,
  tau2 = 0.04,
  alpha = 0,
  moderators = 5,
  distribution = "normal",
  model = "es * x[, 1]"
)
k_train | Atomic integer. The number of studies in the training dataset. Defaults to 20. |
k_test | Atomic integer. The number of studies in the testing dataset. Defaults to 100. |
mean_n | Atomic integer. The mean sample size of each simulated study in the meta-analytic dataset. Defaults to |
es | Atomic numeric vector. The effect size, also known as beta, used in the model statement. Defaults to |
tau2 | Atomic numeric vector. The residual heterogeneity. For a range of realistic values encountered in psychological research, see Van Erp, Verhagen, Grasman, & Wagenmakers, 2017. Defaults to |
alpha | Vector of slant parameters, passed to sn::rsn. |
moderators | Atomic integer. The number of moderators to simulate for each study. Make sure that the number of moderators to be simulated is at least as large as the number of moderators referred to in the model parameter. Internally, the matrix of moderators is referred to as |
distribution | Atomic character. The distribution of the moderators. Can be set to either |
model | Expression. An expression to specify the model from which to simulate the mean true effect size, mu. This formula may use the terms |
List of length 4. The "training" element of this list is a data.frame with k_train rows. The columns are the variance of the effect size, vi; the effect size, yi, and the moderators, X. The "testing" element of this list is a data.frame with k_test rows. The columns are the effect size, yi, and the moderators, X. The "housekeeping" element of this list is a data.frame with k_train + k_test rows. The columns are n, the sample size n for each simulated study; mu_i, the mean true effect size for each simulated study; and theta_i, the true effect size for each simulated study.
set.seed(8)
simulate_smd()
simulate_smd(k_train = 50, distribution = "bernoulli")
simulate_smd(distribution = "bernoulli", model = "es * x[ ,1] * x[ ,2]")
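The data-generating process behind such a simulation can be sketched in base R: a mean true effect mu_i is computed from the moderators, a study-specific true effect theta_i is drawn around it with variance tau2, and two groups are then simulated to obtain Hedges' g and its sampling variance. Everything below is an illustrative assumption rather than the package's code: the function name `simulate_smd_sketch`, the way sample sizes are drawn, the equal group sizes, and the omission of skewed (`alpha`) or Bernoulli moderators.

```r
# Hypothetical helper: simulate k studies under a random-effects model
# with a linear effect of the first moderator.
simulate_smd_sketch <- function(k = 20, mean_n = 40, es = 0.5,
                                tau2 = 0.04, moderators = 5) {
  x <- matrix(rnorm(k * moderators), nrow = k)   # normal moderators
  mu_i <- es * x[, 1]                            # mean true effect size
  theta_i <- mu_i + rnorm(k, 0, sqrt(tau2))      # study-specific true effect
  # Total sample size per study (assumed distribution), split into two groups
  n <- pmax(8, round(rnorm(k, mean_n, mean_n / 3)))
  n_g <- n %/% 2
  es_and_var <- t(mapply(function(theta, ng) {
    treat <- rnorm(ng, mean = theta, sd = 1)
    ctrl  <- rnorm(ng, mean = 0, sd = 1)
    sd_pooled <- sqrt(((ng - 1) * var(treat) + (ng - 1) * var(ctrl)) /
                        (2 * ng - 2))
    d <- (mean(treat) - mean(ctrl)) / sd_pooled
    J <- 1 - 3 / (4 * (2 * ng - 2) - 1)          # small-sample correction
    g <- J * d                                   # Hedges' g
    v <- J^2 * ((2 * ng) / ng^2 + d^2 / (4 * ng))  # sampling variance of g
    c(yi = g, vi = v)
  }, theta_i, n_g))
  data.frame(es_and_var, x)
}

set.seed(8)
sim_sketch <- simulate_smd_sketch()
head(sim_sketch)
```

The resulting data.frame has one row per simulated study, with the effect size `yi`, its variance `vi`, and the moderator columns `X1` to `X5`.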