Title: | Prediction Intervals |
---|---|
Description: | An implementation of prediction intervals for overdispersed count data, for overdispersed binomial data and for linear random effects models. |
Authors: | Max Menssen [aut, cre] |
Maintainer: | Max Menssen <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.2.1.9000 |
Built: | 2025-02-16 05:16:50 UTC |
Source: | https://github.com/maxmenssen/predint |
This data set contains artificial historical control data that was sampled in order to mimic the number of revertant colonies based on two or three petri dishes.
ames_HCD
ames_HCD
A data.frame
with 2 rows and 10 columns:
no. of revertant colonies
no. of petri dishes in the control group
data.frame
Get the prediction intervals or limits of an object of class predint
and
save them as a data.frame
.
## S3 method for class 'predint' as.data.frame(x, ...)
## S3 method for class 'predint' as.data.frame(x, ...)
x |
object of class |
... |
additional arguments to be passed to |
This function returns the prediction intervals or limits stored in an
object of class "predint"
as a data.frame
### PI for quasi-Poisson data pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE) # Return the prediction intervals as a data.frame as.data.frame(pred_int) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
### PI for quasi-Poisson data pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE) # Return the prediction intervals as a data.frame as.data.frame(pred_int) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
This data set contains sampled beta-binomial data from 10 clusters
each of size 50. The data set was sampled with rbbinom(n=10, size=50, prob=0.1, rho=0.06)
.
bb_dat1
bb_dat1
A data.frame
with 10 rows and 2 columns:
number of successes
number of failures
This data set contains sampled beta-binomial data from 3 clusters each of
different size. The data set was sampled with rbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06)
.
bb_dat2
bb_dat2
A data.frame
with 3 rows and 2 columns:
number of successes
number of failures
bb_pi()
is a helper function that is internally called by beta_bin_pi()
. It
calculates simple uncalibrated prediction intervals for binary
data with overdispersion changing between the clusters (beta-binomial).
bb_pi( newsize, histsize, pi, rho, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
bb_pi( newsize, histsize, pi, rho, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
newsize |
number of experimental units in the historical clusters |
histsize |
number of experimental units in the future clusters |
pi |
binomial proportion |
rho |
intra class correlation |
q |
quantile used for interval calculation |
alternative |
either "both", "upper" or "lower"
|
newdat |
additional argument to specify the current data set |
histdat |
additional argument to specify the historical data set |
algorithm |
used to define the algorithm for calibration if called via
|
This function returns a simple uncalibrated prediction interval
with as the number of experimental units in the
future clusters,
as the estimate for the binomial proportion obtained from the
historical data,
as the estimate for the intra class correlation
and
as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life data
is not recommended. Please use beta_bin_pi()
for real life applications.
bb_pi()
returns an object of class c("predint", "betaBinomialPI")
with prediction intervals or limits in the first entry ($prediction
).
# Pointwise uncalibrated PI bb_pred <- bb_pi(newsize=c(50), pi=0.3, rho=0.05, histsize=rep(50, 20), q=qnorm(1-0.05/2)) summary(bb_pred)
# Pointwise uncalibrated PI bb_pred <- bb_pi(newsize=c(50), pi=0.3, rho=0.05, histsize=rep(50, 20), q=qnorm(1-0.05/2)) summary(bb_pred)
beta_bin_pi()
calculates bootstrap calibrated prediction intervals for
beta-binomial data
beta_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
beta_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
histdat |
a |
newdat |
a |
newsize |
a vector containing the future cluster sizes |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1- |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
where
with as the number of experimental units in the future clusters,
as the estimate for the binomial proportion obtained from the
historical data,
as the bootstrap-calibrated coefficient,
as the estimate for the intra class correlation (Lui et al. 2000)
and
as the number of experimental units per historical cluster.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only prediction limits are of interest.
beta_bin_pi
returns an object of class c("predint", "betaBinomialPI")
with prediction intervals or limits in the first entry ($prediction
).
Lui et al. (2000): Confidence intervals for the risk ratio
under cluster sampling based on the beta-binomial model. Statistics in Medicine.
doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica. doi:10.1111/stan.12260
# Prediction interval pred_int <- beta_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100) summary(pred_int) # Upper prediction bound pred_u <- beta_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# Prediction interval pred_int <- beta_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100) summary(pred_int) # Upper prediction bound pred_u <- beta_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
This helper function returns a bootstrap calibrated coefficient for the calculation of prediction intervals (and limits).
bisection( y_star_hat, pred_se, y_star, alternative, quant_min, quant_max, n_bisec, tol, alpha, traceplot = TRUE )
bisection( y_star_hat, pred_se, y_star, alternative, quant_min, quant_max, n_bisec, tol, alpha, traceplot = TRUE )
y_star_hat |
a list of length |
pred_se |
a list of length |
y_star |
a list of length |
alternative |
either "both", "upper" or "lower".
|
quant_min |
lower start value for bisection |
quant_max |
upper start value for bisection |
n_bisec |
maximal number of bisection steps |
tol |
tolerance for the coverage probability in the bisection |
alpha |
defines the level of confidence ( |
traceplot |
if |
This function is an implementation of the bisection algorithm of Menssen
and Schaarschmidt 2022. It returns a calibrated coefficient for the
calculation of pointwise and simultaneous prediction intervals
lower prediction limits
or upper prediction limits
that cover all of future observations.
In this notation, are the expected future observations for each of
the
future clusters,
is the
calibrated coefficient and
are the standard errors of the prediction.
This function returns in the equation above.
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future
observations based on linear random effects models. Statistica Neerlandica.
doi:10.1111/stan.12260
boot_predint()
is a helper function to bootstrap new data from the simple
uncalibrated prediction intervals implemented in predint.
boot_predint(pred_int, nboot, adjust = "within")
boot_predint(pred_int, nboot, adjust = "within")
pred_int |
object of class |
nboot |
number of bootstraps |
This function only works for binomial and Poisson type data. For the sampling
of new data from random effects models see lmer_bs
.
boot_predint
returns an object of class c("predint", "bootstrap")
which is a list with two entries: One for bootstrapped historical observations
and one for bootstrapped future observations.
# Simple quasi-Poisson PI test_pi <- qp_pi(histoffset=c(3,3,3,4,5), newoffset=3, lambda=10, phi=3, q=1.96) # Draw 5 bootstrap samples test_boot <- boot_predint(pred_int = test_pi, nboot=50) str(test_boot) summary(test_boot) # Please note that the low number of bootstrap samples was chosen in order to # decrease computing time. For valid analysis draw at least 10000 bootstrap samples.
# Simple quasi-Poisson PI test_pi <- qp_pi(histoffset=c(3,3,3,4,5), newoffset=3, lambda=10, phi=3, q=1.96) # Draw 5 bootstrap samples test_boot <- boot_predint(pred_int = test_pi, nboot=50) str(test_boot) summary(test_boot) # Please note that the low number of bootstrap samples was chosen in order to # decrease computing time. For valid analysis draw at least 10000 bootstrap samples.
c2_dat1 contains data that is sampled from a balanced cross-classified design.
This data set is used in order to demonstrate the functionality of the lmer_pi_...()
functions.
c2_dat1
c2_dat1
A data.frame
with 27 rows and 3 columns:
observations
treatment a
treatment b
c2_dat2 contains data that was sampled from an unbalanced cross-classified design.
This data set is used in order to demonstrate the functionality of the lmer_pi_...()
functions.
c2_dat2
c2_dat2
A data.frame
with 21 rows and 3 columns:
observations
treatment a
treatment b
c2_dat3 contains data that was sampled from a balanced cross-classified design.
This data set is used in order to demonstrate the functionality of the lmer_pi_...()
functions.
c2_dat3
c2_dat3
A data.frame
with 8 rows and 3 columns:
observations
treatment a
treatment b
c2_dat4 contains data that was sampled from an unbalanced cross-classified design.
This data set is used in order to demonstrate the functionality of the lmer_pi_...()
functions.
c2_dat4
c2_dat4
A data.frame
with 6 rows and 3 columns:
observations
treatment a
treatment b
lmer_bs()
draws bootstrap samples based on the estimates for the mean
and the variance components drawn from a random effects model fit with lme4::lmer()
.
Contrary to lme4::bootMer()
, the number of observations for each random factor
can vary between the original data set and the bootstrapped data. Random effects
in model
have to be specified as (1|random effect)
.
lmer_bs(model, newdat = NULL, futmat_list = NULL, nboot)
lmer_bs(model, newdat = NULL, futmat_list = NULL, nboot)
model |
a random effects model of class lmerMod |
newdat |
a |
futmat_list |
a list that contains design matrices for each random factor |
nboot |
number of bootstrap samples |
The data sampling is based on a list of design matrices (one for each random factor)
that can be obtained if newdat
and the model formula are provided to
lme4::lFormula()
. Hence, each random factor that is part of the initial
model must have at least two replicates in newdat
.
If a random factor in the future data set does not have any replicate, a list
that contains design matrices (one for each random factor) can be provided via
futmat_list
.
A list of length nboot
containing the bootstrapped observations.
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Using c2_dat2 as newdat c2_dat2 lmer_bs(model=fit, newdat=c2_dat2, nboot=100) #---------------------------------------------------------------------------- ### Using futmat_list # c2_dat4 has no replication for b. Hence the list of design matrices can not be # generated by lme4::lFormula() and have to be provided by hand via futmat_list. c2_dat4 # Build a list containing the design matrices fml <- vector(length=4, "list") names(fml) <- c("a:b", "b", "a", "Residual") fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1)) fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["Residual"]] <- diag(6) fml lmer_bs(model=fit, futmat_list=fml, nboot=100)
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Using c2_dat2 as newdat c2_dat2 lmer_bs(model=fit, newdat=c2_dat2, nboot=100) #---------------------------------------------------------------------------- ### Using futmat_list # c2_dat4 has no replication for b. Hence the list of design matrices can not be # generated by lme4::lFormula() and have to be provided by hand via futmat_list. c2_dat4 # Build a list containing the design matrices fml <- vector(length=4, "list") names(fml) <- c("a:b", "b", "a", "Residual") fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1)) fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["Residual"]] <- diag(6) fml lmer_bs(model=fit, futmat_list=fml, nboot=100)
This function is deprecated. Please use lmer_pi_unstruc()
,
lmer_pi_futvec()
or lmer_pi_futmat()
.
lmer_pi( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, lambda_min = 0.01, lambda_max = 10, traceplot = TRUE, n_bisec = 30 )
lmer_pi( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, lambda_min = 0.01, lambda_max = 10, traceplot = TRUE, n_bisec = 30 )
model |
a random effects model of class |
newdat |
a |
m |
number of future observations |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1- |
nboot |
number of bootstraps |
lambda_min |
lower start value for bisection |
lambda_max |
upper start value for bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
This function returns a bootstrap calibrated prediction interval
with as the predicted future observation,
as the observed future observations,
as the prediction standard error and
as the bootstrap calibrated coefficient that
approximates a quantile of the multivariate t-distribution.
Please note that this function relies on linear random effects models that are
fitted with lmer()
from the lme4 package. Random effects have to be specified as
(1|random_effect)
.
If newdat
is specified: A data.frame
that contains the future data,
the historical mean (hist_mean), the calibrated coefficient (quant_calib),
the prediction standard error (pred_se), the prediction interval (lower and upper)
and a statement if the prediction interval covers the future observation (cover).
If m
is specified: A data.frame
that contains the number of future observations (m)
the historical mean (hist_mean), the calibrated coefficient (quant_calib),
the prediction standard error (pred_se) and the prediction interval (lower and upper).
If alternative
is set to "lower": Lower prediction limits are computed instead
of a prediction interval.
If alternative
is set to "upper": Upper prediction limits are computed instead
of a prediction interval.
If traceplot=TRUE
, a graphical overview about the bisection process is given.
# This function is deprecated. # Please use lmer_pi_unstruc() if you want exactly the same functionality. # Please use lmer_pi_futmat() or lmer_pi_futvec() if you want to take care # of the future experimental design
# This function is deprecated. # Please use lmer_pi_unstruc() if you want exactly the same functionality. # Please use lmer_pi_futmat() or lmer_pi_futvec() if you want to take care # of the future experimental design
lmer_pi_futmat()
calculates a bootstrap calibrated prediction interval for one or more
future observation(s) based on linear random effects models. With this approach,
the experimental design of the future data is taken into account (see below).
lmer_pi_futmat( model, newdat = NULL, futmat_list = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
lmer_pi_futmat( model, newdat = NULL, futmat_list = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
model |
a random effects model of class |
newdat |
either 1 or a |
futmat_list |
a list that contains design matrices for each random factor |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1- |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
with as the expected future observation (historical mean) and
as the
variance components and
as the residual variance obtained from the random
effects model fitted with
lme4::lmer()
and as the as the bootstrap-calibrated
coefficient used for interval calculation.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only
prediction limits are of interest.
If newdat
is defined, the bootstrapped future observations used for the calibration
process mimic the structure of the data set provided via newdat
. The
data sampling is based on a list of design matrices (one for each random factor)
that can be obtained if newdat
and the model formula are provided to
lme4::lFormula()
. Hence, each random factor that is part of the initial
model must have at least two replicates in newdat
.
If a random factor in the future data set does not have any replicate, a list
that contains design matrices (one for each random factor) can be provided via futmat_list
.
This function is an implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4, except, that the bootstrap calibration values are drawn from bootstrap samples that mimic the future data as described above.
lmer_pi_futmat()
returns an object of class c("predint", "normalPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Using newdat # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) # Upper prediction limit for m=1 future observations pred_u <- lmer_pi_futmat(model=fit, newdat=1, alternative="upper", nboot=100) summary(pred_u) #---------------------------------------------------------------------------- ### Using futmat_list # c2_dat4 has no replication for b. Hence the list of design matrices can not be # generated by lme4::lFormula() and have to be provided by hand via futmat_list. c2_dat4 # Build a list containing the design matrices fml <- vector(length=4, "list") names(fml) <- c("a:b", "b", "a", "Residual") fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1)) fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["Residual"]] <- diag(6) fml # Please note, that the design matrix for the interaction term a:b is also # provided even there is no replication for b, since it is assumed that # both, the historical and the future data descent from the same data generating # process. # Calculate the PI pred_fml <- lmer_pi_futmat(model=fit, futmat_list=fml, alternative="both", nboot=100) summary(pred_fml) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Using newdat # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) # Upper prediction limit for m=1 future observations pred_u <- lmer_pi_futmat(model=fit, newdat=1, alternative="upper", nboot=100) summary(pred_u) #---------------------------------------------------------------------------- ### Using futmat_list # c2_dat4 has no replication for b. Hence the list of design matrices can not be # generated by lme4::lFormula() and have to be provided by hand via futmat_list. c2_dat4 # Build a list containing the design matrices fml <- vector(length=4, "list") names(fml) <- c("a:b", "b", "a", "Residual") fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1)) fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1)) fml[["Residual"]] <- diag(6) fml # Please note, that the design matrix for the interaction term a:b is also # provided even there is no replication for b, since it is assumed that # both, the historical and the future data descent from the same data generating # process. # Calculate the PI pred_fml <- lmer_pi_futmat(model=fit, futmat_list=fml, alternative="both", nboot=100) summary(pred_fml) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
lmer_pi_futvec()
calculates a bootstrap calibrated prediction interval for one or more
future observation(s) based on linear random effects models. With this approach,
the experimental design of the future data is taken into account (see below).
lmer_pi_futvec( model, futvec, newdat = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
lmer_pi_futvec( model, futvec, newdat = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
model |
a random effects model of class lmerMod |
futvec |
an integer vector that defines the structure of the future data based on the
row numbers of the historical data. If |
newdat |
a |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1- |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
with as the expected future observation (historical mean) and
as the
variance components and
as the residual variance obtained from the random
effects model fitted with
lme4::lmer()
and as the as the bootstrap-calibrated
coefficient used for interval calculation.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only
prediction limits are of interest.
Be aware that the sampling structure of the historical data must contain the structure of the future data. This means that the observations per random factor must be less or equal in the future data compared to the historical data.
This function is an implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4 except that the bootstrap calibration values are drawn from bootstrap samples that mimic the future data.
lmer_pi_futvec()
returns an object of class c("predint", "normalPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Prediction interval using c2_dat3 as future data # without printing c2_dat3 in the output # Row numbers of the historical data c2_dat1 that define the structure of # the future data c2_dat3 futvec <- c(1, 2, 4, 5, 10, 11, 13, 14) # Calculating the PI pred_int <- lmer_pi_futvec(model=fit, futvec=futvec, nboot=100) summary(pred_int) #---------------------------------------------------------------------------- ### Calculating the PI with c2_dat3 printed in the output pred_int_new <- lmer_pi_futvec(model=fit, futvec=futvec, newdat=c2_dat3, nboot=100) summary(pred_int_new) #---------------------------------------------------------------------------- ### Upper prediction limit for m=1 future observation pred_u <- lmer_pi_futvec(model=fit, futvec=1, alternative="upper", nboot=100) summary(pred_u) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) #---------------------------------------------------------------------------- ### Prediction interval using c2_dat3 as future data # without printing c2_dat3 in the output # Row numbers of the historical data c2_dat1 that define the structure of # the future data c2_dat3 futvec <- c(1, 2, 4, 5, 10, 11, 13, 14) # Calculating the PI pred_int <- lmer_pi_futvec(model=fit, futvec=futvec, nboot=100) summary(pred_int) #---------------------------------------------------------------------------- ### Calculating the PI with c2_dat3 printed in the output pred_int_new <- lmer_pi_futvec(model=fit, futvec=futvec, newdat=c2_dat3, nboot=100) summary(pred_int_new) #---------------------------------------------------------------------------- ### Upper prediction limit for m=1 future observation pred_u <- lmer_pi_futvec(model=fit, futvec=1, alternative="upper", nboot=100) summary(pred_u) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
lmer_pi_unstruc()
calculates a bootstrap calibrated prediction interval for one or more
future observation(s) based on linear random effects models as described in section
3.2.4. of Menssen and Schaarschmidt (2022).
Please note, that the bootstrap calibration used here does not consider the sampling
structure of the future data, since the calibration values are drawn randomly from
bootstrap data sets that have the same structure as the historical data.
lmer_pi_unstruc( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
lmer_pi_unstruc( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22" )
model |
a random effects model of class lmerMod |
newdat |
a |
m |
number of future observations |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1- |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
with as the expected future observation (historical mean) and
as the
variance components and
as the residual variance obtained from the random
effects model fitted with
lme4::lmer()
and as the as the bootstrap-calibrated
coefficient used for interval calculation.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only
prediction limits are of interest.
This function is an direct implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4.
lmer_pi_futvec()
returns an object of class c("predint", "normalPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_unstruc(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) # Upper prediction limit for m=3 future observations pred_u <- lmer_pi_unstruc(model=fit, m=3, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# loading lme4 library(lme4) # Fitting a random effects model based on c2_dat1 fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) summary(fit) # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_unstruc(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) # Upper prediction limit for m=3 future observations pred_u <- lmer_pi_unstruc(model=fit, m=3, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
This data set contains historical control data about the mortality of male B6C3F1-mice obtained in long term carcinogenicity studies at the National Toxicology Program presented in NTP Historical Control Reports from 2013 to 2016. It was used in Menssen and Schaarschmidt 2019 as a real life example.
mortality_HCD
mortality_HCD
A data.frame
with 2 rows and 10 columns:
no. of dead mice
no. of living mice
Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomial
data with application to historical controls. Statistics in Medicine.
doi:10.1002/sim.8124
NTP Historical Control Reports: https://ntp.niehs.nih.gov/data/controls
nb_pi()
is a helper function that is internally called by neg_bin_pi()
. It
calculates simple uncalibrated prediction intervals for negative-binomial data
with offsets.
nb_pi( newoffset, histoffset, lambda, kappa, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
nb_pi( newoffset, histoffset, lambda, kappa, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
newoffset |
number of experimental units in the future clusters |
histoffset |
number of experimental units in the historical clusters |
lambda |
overall Poisson mean |
kappa |
dispersion parameter |
q |
quantile used for interval calculation |
alternative |
either "both", "upper" or "lower".
|
newdat |
additional argument to specify the current data set |
histdat |
additional argument to specify the historical data set |
algorithm |
used to define the algorithm for calibration if called via
|
This function returns a simple uncalibrated prediction interval
with as the number of experimental units in
future clusters,
as the estimate for the Poisson mean obtained from the
historical data,
as the estimate for the dispersion parameter,
as the number of experimental units per historical cluster and
.
The direct application of this uncalibrated prediction interval to real life data
is not recommended. Please use the neg_bin_pi()
function for real life applications.
np_pi
returns an object of class c("predint", "negativeBinomialPI")
.
# Prediction interval nb_pred <- nb_pi(newoffset=3, lambda=3, kappa=0.04, histoffset=1:9, q=qnorm(1-0.05/2)) summary(nb_pred)
# Prediction interval nb_pred <- nb_pi(newoffset=3, lambda=3, kappa=0.04, histoffset=1:9, q=qnorm(1-0.05/2)) summary(nb_pred)
neg_bin_pi()
calculates bootstrap calibrated prediction intervals for
negative-binomial data.
neg_bin_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
neg_bin_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
histdat |
a |
newdat |
|
newoffset |
vector with future number of experimental units per historical study. |
alternative |
either "both", "upper" or "lower".
|
adjust |
specifies if simultaneous prediction should be done for several
control groups of different studies ( |
alpha |
defines the level of confidence ( |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
with as the number of experimental units in the future clusters,
as the estimate for the Poisson mean obtained from the
historical data,
as the estimate for the dispersion parameter,
as the number of experimental units per historical cluster and
.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only prediction limits are of interest.
neg_bin_pi()
returns an object of class c("predint", "negativeBinomialPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
# HCD from the Ames test ames_HCD # Pointwise prediction interval for one future number of revertant colonies # obtained in three petridishes pred_int <- neg_bin_pi(histdat=ames_HCD, newoffset=3, nboot=100) summary(pred_int) # Simultaneous prediction interval for the numbers of revertant colonies obtained in # the control and three treatment groups of a future trial pred_int_w <- neg_bin_pi(histdat=ames_HCD, newoffset=c(3, 3, 3, 3), adjust="within", nboot=100) summary(pred_intw) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# HCD from the Ames test ames_HCD # Pointwise prediction interval for one future number of revertant colonies # obtained in three petridishes pred_int <- neg_bin_pi(histdat=ames_HCD, newoffset=3, nboot=100) summary(pred_int) # Simultaneous prediction interval for the numbers of revertant colonies obtained in # the control and three treatment groups of a future trial pred_int_w <- neg_bin_pi(histdat=ames_HCD, newoffset=c(3, 3, 3, 3), adjust="within", nboot=100) summary(pred_intw) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
normal_pi()
is a helper function that is internally called by the lmer_pi_...()
functions.
It calculates simple uncalibrated prediction intervals for normal distributed
observations.
normal_pi( mu, pred_se, m = 1, q = qnorm(1 - 0.05/2), alternative = "both", futmat_list = NULL, futvec = NULL, newdat = NULL, histdat = NULL, algorithm = NULL )
normal_pi( mu, pred_se, m = 1, q = qnorm(1 - 0.05/2), alternative = "both", futmat_list = NULL, futvec = NULL, newdat = NULL, histdat = NULL, algorithm = NULL )
mu |
overall mean |
pred_se |
standard error of the prediction |
m |
number of future observations |
q |
quantile used for interval calculation |
alternative |
either "both", "upper" or "lower"
|
futmat_list |
used to add the list of future design matrices to the output
if called via |
futvec |
used to add the vector of the historical row numbers that define
the future experimental design to the output if called via |
newdat |
additional argument to specify the current data set |
histdat |
additional argument to specify the historical data set |
algorithm |
used to define the algorithm for calibration if called via
|
This function returns a simple uncalibrated prediction interval as given in Menssen and Schaarschmidt 2022
with as the expected future observation (historical mean) and
as the
variance components and
as the residual variance and
as the quantile used for interval calculation.
The direct application of this uncalibrated prediction interval to real life data
is not recommended. Please use the lmer_pi_...()
functions for real life applications.
normal_pi()
returns an object of class c("predint", "normalPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
# simple PI norm_pred <- normal_pi(mu=10, pred_se=3, m=1) summary(norm_pred)
# simple PI norm_pred <- normal_pi(mu=10, pred_se=3, m=1) summary(norm_pred)
pi_rho_est()
estimates the overall binomial proportion and the intra
class correlation
of data that is assumed to follow the beta-binomial
distribution. The estimation of
and
is done following
the approach of Lui et al. 2000.
pi_rho_est(dat)
pi_rho_est(dat)
dat |
a |
a vector containing estimates for and
Lui, K.-J., Mayer, J.A. and Eckhardt, L: Confidence intervals for the risk ratio under cluster sampling based on the beta-binomial model. Statistics in Medicine.2000;19:2933-2942. doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q
# Estimates for bb_dat1 pi_rho_est(bb_dat1)
# Estimates for bb_dat1 pi_rho_est(bb_dat1)
predint
objectsThis function provides methodology for plotting the prediction intervals or limits that are calculated using the functionality of the predint package.
## S3 method for class 'predint' plot(x, ..., size = 4, width = 0.05, alpha = 0.5)
## S3 method for class 'predint' plot(x, ..., size = 4, width = 0.05, alpha = 0.5)
x |
object of class |
... |
arguments handed over to |
size |
size of the dots |
width |
margin of jittering |
alpha |
opacity of dot colors |
Since plot.predint()
is based on ggplot2::ggplot
, it returns
an object of class c("gg", "ggplot")
.
### PI for quasi-Poisson data pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE) ### Plot the PI plot(pred_int) ### Since plot.predint is based on ggplot, the grafic can be altered using # the methodology provided via ggplot2 plot(pred_int)+ theme_classic()
### PI for quasi-Poisson data pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE) ### Plot the PI plot(pred_int) ### Since plot.predint is based on ggplot, the grafic can be altered using # the methodology provided via ggplot2 plot(pred_int)+ theme_classic()
predint
Print objects of class predint
## S3 method for class 'predint' print(x, ...)
## S3 method for class 'predint' print(x, ...)
x |
an object of class |
... |
additional arguments passed over to |
prints output to the console
This data set contains sampled quasi-binomial data from 10 clusters
each of size 50. The data set was sampled with rqbinom(n=10, size=50, prob=0.1, phi=3)
.
qb_dat1
qb_dat1
A data.frame
with 3 rows and 2 columns:
numbers of success
numbers of failures
This data set contains sampled quasi binomial data from 3 clusters with
different size.The data set was sampled with rqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3)
.
qb_dat2
qb_dat2
A data.frame
with 3 rows and 2 columns:
numbers of success
numbers of failures
qb_pi()
is a helper function that is internally called by quasi_bin_pi()
. It
calculates simple uncalibrated prediction intervals for binary
data with constant overdispersion (quasi-binomial assumption).
qb_pi( newsize, histsize, pi, phi, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
qb_pi( newsize, histsize, pi, phi, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL )
newsize |
number of experimental units in the historical clusters. |
histsize |
number of experimental units in the future clusters. |
pi |
binomial proportion |
phi |
dispersion parameter |
q |
quantile used for interval calculation |
alternative |
either "both", "upper" or "lower"
|
newdat |
additional argument to specify the current data set |
histdat |
additional argument to specify the historical data set |
algorithm |
used to define the algorithm for calibration if called via
|
This function returns a simple uncalibrated prediction interval
with as the number of experimental units in the
future clusters,
as the estimate for the binomial proportion obtained from the
historical data,
as the estimate for the dispersion parameter
and
as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life data
is not recommended. Please use the beta_bin_pi()
functions for real life applications.
qb_pi
returns an object of class c("predint", "quasiBinomailPI")
.
qb_pred <- qb_pi(newsize=50, pi=0.3, phi=3, histsize=c(50, 50, 30), q=qnorm(1-0.05/2)) summary(qb_pred)
qb_pred <- qb_pi(newsize=50, pi=0.3, phi=3, histsize=c(50, 50, 30), q=qnorm(1-0.05/2)) summary(qb_pred)
This data set contains sampled quasi-Poisson data for 10 clusters.
The data set was sampled with rqpois(n=10, lambda=50, phi=3)
.
qp_dat1
qp_dat1
A data.frame with two columns
numbers of eventzs
size of experimental units
This data set contains sampled quasi-Poisson data for 3 clusters.
The data set was sampled with rqpois(n=3, lambda=50, phi=3)
.
qp_dat2
qp_dat2
A data.frame with two columns
numbers of eventzs
size of experimental units
qp_pi()
is a helper function that is internally called by quasi_pois_pi()
. It
calculates simple uncalibrated prediction intervals for Poisson
data with constant overdispersion (quasi-Poisson assumption).
qp_pi( newoffset, histoffset, lambda, phi, q = qnorm(1 - 0.05/2), alternative = "both", adjust = NULL, newdat = NULL, histdat = NULL, algorithm = NULL )
qp_pi( newoffset, histoffset, lambda, phi, q = qnorm(1 - 0.05/2), alternative = "both", adjust = NULL, newdat = NULL, histdat = NULL, algorithm = NULL )
newoffset |
number of experimental units in the future clusters |
histoffset |
number of experimental units in the historical clusters |
lambda |
overall Poisson mean |
phi |
dispersion parameter |
q |
quantile used for interval calculation |
alternative |
either "both", "upper" or "lower"
|
adjust |
only important if called via |
newdat |
additional argument to specify the current data set |
histdat |
additional argument to specify the historical data set |
algorithm |
used to define the algorithm for calibration if called via
|
This function returns a simple uncalibrated prediction interval
with as the number of experimental units in the
future clusters,
as the estimate for the Poisson mean obtained from the
historical data,
as the estimate for the dispersion parameter
and
as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life data
is not recommended. Please use the quasi_pois_pi_pi()
functions for real life applications.
qp_pi
returns an object of class c("predint", "quasiPoissonPI")
.
# Prediction interval qp_pred <- qp_pi(newoffset=3, lambda=3, phi=3, histoffset=1:9, q=qnorm(1-0.05/2)) summary(qp_pred)
# Prediction interval qp_pred <- qp_pi(newoffset=3, lambda=3, phi=3, histoffset=1:9, q=qnorm(1-0.05/2)) summary(qp_pred)
quasi_bin_pi()
calculates bootstrap calibrated prediction intervals for binomial
data with constant overdispersion (quasi-binomial assumption).
quasi_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
quasi_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
histdat |
a |
newdat |
a |
newsize |
a vector containing the future cluster sizes |
alternative |
either "both", "upper" or "lower". |
alpha |
defines the level of confidence (1-alpha) |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
where
with as the number of experimental units in the future clusters,
as the estimate for the binomial proportion obtained from the
historical data,
as the bootstrap-calibrated coefficient,
as the estimate for the dispersion parameter
and
as the number of experimental units per historical cluster.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only prediction limits are of interest.
quasi_bin_pi
returns an object of class c("predint", "quasiBinomialPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomial
data with application to historical controls. Statistics in Medicine.
doi:10.1002/sim.8124
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future
observations based on linear random effects models. Statistica Neerlandica,
doi:10.1111/stan.12260
# Pointwise prediction interval pred_int <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100) summary(pred_int) # Pointwise upper prediction limit pred_u <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# Pointwise prediction interval pred_int <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100) summary(pred_int) # Pointwise upper prediction limit pred_u <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
quasi_pois_pi()
calculates bootstrap calibrated prediction intervals for Poisson
data with constant overdispersion (quasi-Poisson).
quasi_pois_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
quasi_pois_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod" )
histdat |
a |
newdat |
a |
newoffset |
vector with future number of experimental units per historical study. |
alternative |
either "both", "upper" or "lower".
|
adjust |
specifies if simultaneous prediction should be done for several
control groups of different studies ( |
alpha |
defines the level of confidence ( |
nboot |
number of bootstraps |
delta_min |
lower start value for bisection |
delta_max |
upper start value for bisection |
tolerance |
tolerance for the coverage probability in the bisection |
traceplot |
if |
n_bisec |
maximal number of bisection steps |
algorithm |
either "MS22" or "MS22mod" (see details) |
This function returns bootstrap-calibrated prediction intervals as well as lower or upper prediction limits.
If algorithm
is set to "MS22", both limits of the prediction interval
are calibrated simultaneously using the algorithm described in Menssen and
Schaarschmidt (2022), section 3.2.4. The calibrated prediction interval is given
as
with as the number of experimental units in the future clusters,
as the estimate for the Poisson mean obtained from the
historical data,
as the bootstrap-calibrated coefficient,
as the estimate for the dispersion parameter
and
as the number of experimental units per historical cluster.
If algorithm
is set to "MS22mod", both limits of the prediction interval
are calibrated independently from each other. The resulting prediction interval
is given by
Please note, that this modification does not affect the calibration procedure, if only prediction limits are of interest.
quasi_pois_pi
returns an object of class c("predint", "quasiPoissonPI")
with prediction intervals or limits in the first entry ($prediction
).
Menssen and Schaarschmidt (2022): Prediction intervals for all of M future observations based on linear random effects models. Statistica Neerlandica, doi:10.1111/stan.12260
#' # Historical data qp_dat1 # Future data qp_dat2 # Pointwise prediction interval pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100) summary(pred_int) # Pointwise upper prediction pred_u <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
#' # Historical data qp_dat1 # Future data qp_dat2 # Pointwise prediction interval pred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100) summary(pred_int) # Pointwise upper prediction pred_u <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, alternative="upper", nboot=100) summary(pred_u) # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
rbbinom()
samples beta-binomial data according to Menssen and Schaarschmidt (2019).
rbbinom(n, size, prob, rho)
rbbinom(n, size, prob, rho)
n |
defines the number of clusters ( |
size |
integer vector defining the number of trials per cluster ( |
prob |
probability of success on each trial ( |
rho |
intra class correlation ( |
For beta binomial data with clusters, the variance is
with as the intra class correlation coefficient
For the sampling is defined as
where and
. Then, the binomial proportions
for each cluster are sampled from the beta distribution
and the number of successes for each cluster are sampled to be
In this parametrization and
.
Please note, that
is a constant if all cluster sizes are
the same and hence, in this special case, also the quasi-binomial assumption is
fulfilled.
a data.frame
with two columns (succ, fail)
Menssen M, Schaarschmidt F.: Prediction intervals for overdispersed binomial data with application to historical controls. Statistics in Medicine. 2019;38:2652-2663. doi:10.1002/sim.8124
# Sampling of example data set.seed(234) bb_dat1 <- rbbinom(n=10, size=50, prob=0.1, rho=0.06) bb_dat1 set.seed(234) bb_dat2 <- rbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06) bb_dat2
# Sampling of example data set.seed(234) bb_dat1 <- rbbinom(n=10, size=50, prob=0.1, rho=0.06) bb_dat1 set.seed(234) bb_dat2 <- rbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06) bb_dat2
rnbinom()
samples negative-binomial data.
The following description of the sampling process is based on the parametrization
used by Gsteiger et al. 2013.
rnbinom(n, lambda, kappa, offset = NULL)
rnbinom(n, lambda, kappa, offset = NULL)
n |
defines the number of clusters ( |
lambda |
defines the overall Poisson mean ( |
kappa |
dispersion parameter ( |
offset |
defines the number of experimental units per cluster ( |
The variance of the negative-binomial distribution is
Negative-biomial observations can be sampled based on predefined values of ,
and
:
Define the parameters of the gamma distribution as and
. Then, sample the Poisson means for each cluster
Finally, the observations are sampled from the Poisson distribution
rnbinom()
returns a data.frame
with two columns:
y
as the observations and offset
as the number of offsets per
observation.
Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013): Using historical control information for the design and analysis of clinical trials with overdispersed count data. Statistics in Medicine, 32: 3609-3622. doi:10.1002/sim.5851
# Sampling of negative-binomial observations # with different offsets set.seed(123) rnbinom(n=5, lambda=5, kappa=0.13, offset=c(3,3,2,3,2))
# Sampling of negative-binomial observations # with different offsets set.seed(123) rnbinom(n=5, lambda=5, kappa=0.13, offset=c(3,3,2,3,2))
rqbinom samples overdispersed binomial data with constant overdispersion from the beta-binomial distribution such that the quasi-binomial assumption is fulfilled.
rqbinom(n, size, prob, phi)
rqbinom(n, size, prob, phi)
n |
defines the number of clusters ( |
size |
integer vector defining the number of trials per cluster ( |
prob |
probability of success on each trial ( |
phi |
dispersion parameter ( |
It is assumed that the dispersion parameter ()
is constant for all
clusters, such that the variance becomes
For the sampling is defined as
where and
. Then, the binomial proportions
for each cluster are sampled from the beta distribution
and the numbers of success for each cluster are sampled to be
In this parametrization and
.
Please note, the quasi-binomial assumption is not in contradiction with
the beta-binomial distribution if all cluster sizes are the same.
a data.frame
with two columns (succ, fail)
# Sampling of example data set.seed(456) qb_dat1 <- rqbinom(n=10, size=50, prob=0.1, phi=3) qb_dat1 set.seed(456) qb_dat2 <- rqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3) qb_dat2
# Sampling of example data set.seed(456) qb_dat1 <- rqbinom(n=10, size=50, prob=0.1, phi=3) qb_dat1 set.seed(456) qb_dat2 <- rqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3) qb_dat2
rqpois()
samples overdispersed Poisson data with constant overdispersion from
the negative-binomial distribution such that the quasi-Poisson assumption is fulfilled.
The following description of the sampling process is based on the parametrization
used by Gsteiger et al. 2013.
rqpois(n, lambda, phi, offset = NULL)
rqpois(n, lambda, phi, offset = NULL)
n |
defines the number of clusters ( |
lambda |
defines the overall Poisson mean ( |
phi |
dispersion parameter ( |
offset |
defines the number of experimental units per cluster ( |
It is assumed that the dispersion parameter ()
is constant for all
clusters, such that the variance becomes
For the sampling is defined as
where and
. Then, the Poisson means
for each cluster are sampled from the gamma distribution
and the observations per cluster are sampled to be
Please note, that the quasi-Poisson assumption is not in contradiction with the
negative-binomial distribution, if the data structure is defined by the number
of clusters only (which is the case here) and the offsets are all the same
.
a data.frame containing the sampled observations and the offsets
Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013): Using historical control information for the design and analysis of clinical trials with overdispersed count data. Statistics in Medicine, 32: 3609-3622. doi:10.1002/sim.5851
# set.seed(123) qp_dat1 <- rqpois(n=10, lambda=50, phi=3) qp_dat1 # set.seed(123) qp_dat2 <- rqpois(n=3, lambda=50, phi=3) qp_dat2
# set.seed(123) qp_dat1 <- rqpois(n=10, lambda=50, phi=3) qp_dat1 # set.seed(123) qp_dat2 <- rqpois(n=3, lambda=50, phi=3) qp_dat2
predint
This function gives a summary about the prediction intervals (and limits) computed with predint.
## S3 method for class 'predint' summary(object, ...)
## S3 method for class 'predint' summary(object, ...)
object |
object of class |
... |
further arguments passed over to |
A data.frame
containing the current data (if provided via newdat
),
the prediction interval (or limit), the expected value for the future observation,
the bootstrap calibrated coefficient(s), the prediction standard error and
a statement about the coverage for each future observation, if new observations
were provided via newdat
.
# Fitting a random effects model based on c2_dat1 fit <- lme4::lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.
# Fitting a random effects model based on c2_dat1 fit <- lme4::lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1) # Prediction interval using c2_dat2 as future data pred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100) summary(pred_int) #---------------------------------------------------------------------------- # Please note that nboot was set to 100 in order to decrease computing time # of the example. For a valid analysis set nboot=10000.