This document gives a brief description of the estimation methods for population-type data that can be used with NONMEM Version V. These include, in particular, a few methods that are new with this version: the centered and hybrid methods. The more important changes from the earlier edition published in 1992, though not all changes, are highlighted with vertical bars in the right margin. This document contains no information about how to communicate with the NONMEM program.
To read this document it may be helpful to have some familiarity with the notation used with the representation of statistical models for the NONMEM program. See discussions of models in NONMEM Users Guide - Part I, but if one’s interest is only in using NONMEM with PREDPP, see discussions of models in NONMEM Users Guides - Parts V and VI. Particular notation used in this Guide VII is given next.
The jth observation from the ith individual is denoted $y_{ij}$. Each individual may have a different number of observations. Each observation may be measured on a different scale: continuous, categorical, ordered categorical, discrete-ordinal.†
----------
† This document
provides a description of estimation methods that can be
used with observations of the same or different type.
However, essentially, it neither contains any specific
information about how to analyze observations of particular
types, nor any information about how to communicate with
NONMEM in order to do this.
----------
An individual can have multivariate observations, and each multivariate observation may have a different length. However, the multivariate nature of an observation is suppressed, as this is not relevant to the descriptions given in this document, and so the separate (scalar-valued) observations comprising the multivariate observations are all separately indexed by j. The vector of all the observations from the ith individual is denoted $y_i$.
It is assumed that there exists a separate statistical model for each $y_i$. This model is called the intraindividual model or the individual model for the ith individual. It is parameterized by $\tau$, a (vector-valued) parameter common to all the separate intraindividual models, and $\eta_i$, a (vector-valued) parameter specific to the intraindividual model for $y_i$. Under this model, the likelihood of $\tau$ for the data $y_i$ (conditional on $\eta_i$) is denoted by $l_i(\eta_i; \tau)$, the dependence on $y_i$ being suppressed in the notation. This likelihood is called here the conditional likelihood of $y_i$.
When all the elements of $y_i$ are measured on a continuous scale, an often-used intraindividual model is given by the multivariate normal model with mean $m_i(\tau, \eta_i)$ and variance-covariance matrix $C_i(\tau, \eta_i)$ (usually, $\tau$ is comprised of parameters $\theta$ which are the only ones affecting $m_i$, and other parameters which, along with $\theta$, affect $C_i$).††
----------
†† Here and
elsewhere in this section an explicit assumption concerning
the normal probability distribution is made. This is done
primarily to keep the discussion simple. To various degrees
in different situations the normality assumption does not
play as important a role as our formally making the
assumption might indicate.
----------
This type of model shall be referred to as the mean-variance model. It is usually expressed in terms of a multivariate normal vector $\epsilon$ with mean 0 and variance-covariance matrix $\Sigma$. In the notation used here, the parameter $\tau$ includes $\Sigma$ (ignoring the matrix structure of $\Sigma$). For example,
$$y_{ij} = f_{ij}(\theta, \eta_i) + f_{ij}(\theta, \eta_i)\,\epsilon_{ij}$$
where $\epsilon_{ij}$ is an instance of a univariate normal variable $\epsilon$ with variance $\sigma^2$. (When $\epsilon$ is multivariate, the observation $y_{ij}$ is modeled in terms of a single instance of this multivariate random vector. A few other observations as well may be modeled in terms of this same instance, and thus under the model, all such observations are correlated and comprise a multivariate observation.) In this example, $m_{ij}$ is $f_{ij}(\theta, \eta_i)$ (the mean of $y_{ij}$), and $C_{ijj}$ is $f_{ij}(\theta, \eta_i)^2 \sigma^2$ (the variance of $y_{ij}$). Since the ratio of the standard deviation of $y_{ij}$ to the mean of $y_{ij}$ is the constant $\sigma$, this particular model is called the constant coefficient of variation model.
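The defining property of the constant coefficient of variation model can be checked by simulation. The following is a minimal sketch, not NONMEM code: the prediction values `f`, the value of `sigma`, and the function names are all hypothetical, chosen only for illustration.

```python
import math
import random

def simulate_constant_cv(f, sigma, n, seed=0):
    """Draw n observations y = f + f * eps with eps ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)
    return [f + f * rng.gauss(0.0, sigma) for _ in range(n)]

def coefficient_of_variation(values):
    """Sample standard deviation divided by sample mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean

sigma = 0.2  # hypothetical intraindividual standard deviation of eps
for f in (1.0, 10.0, 100.0):  # hypothetical prediction values
    ys = simulate_constant_cv(f, sigma, n=100_000)
    print(f"f={f:6.1f}  cv={coefficient_of_variation(ys):.3f}")
```

Whatever the size of the prediction, the sample coefficient of variation should stay close to `sigma`, while the standard deviation itself grows in proportion to the prediction.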
The dependence of $C_i$ on $\eta_i$ is often a consequence of the intraindividual variance depending on the mean function, as with the above example, which in turn depends on $\eta_i$. This dependence represents an interaction between $\eta_i$ and $\epsilon$. With the (homoscedastic) model expressed by
$$y_{ij} = f_{ij}(\theta, \eta_i) + \epsilon_{ij}$$
there is no such interaction; $C_{ijj}$ is just $\sigma^2$. There are two variants of the first-order conditional estimation method described in chapter II, one that takes this interaction into account and another that ignores it.
When an intraindividual model involving $\epsilon$ is presented to NM-TRAN (the "front-end" of the NONMEM system), the model is automatically transformed. A linearization of the right side of the equation is used: a first-order approximation in $\epsilon$ about 0, the mean value of $\epsilon$. Since the approximate model is linear in $\epsilon$, it is a mean-variance model. Clearly, if the given model is itself a mean-variance model, the transformed model is identical to the given model. Consider, for example, an intraindividual model where the elements of $y_i$ are regarded as lognormally distributed (because the normally distributed $\epsilon_{ij}$ appear as logarithms):
$$y_{ij} = f_{ij}(\theta, \eta_i)\,\exp(\epsilon_{ij})$$
In this case the transformed model is the constant cv model given above. (Therefore, no matter whether the given intraindividual model or the constant cv model is presented to NM-TRAN, the results of the analysis will be the same.)
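The effect of the linearization can be seen numerically. The sketch below is only an illustration under assumed values: the prediction `f = 10.0` and the function names are hypothetical and are not part of NM-TRAN. It compares a multiplicative exponential error model with its first-order expansion about 0, which has the constant cv form.

```python
import math

def lognormal_model(f, eps):
    """Given model: y = f * exp(eps)."""
    return f * math.exp(eps)

def linearized_model(f, eps):
    """First-order Taylor expansion in eps about 0: y ~ f + f * eps."""
    value_at_zero = f * math.exp(0.0)   # model value at eps = 0
    slope_at_zero = f * math.exp(0.0)   # derivative with respect to eps at eps = 0
    return value_at_zero + slope_at_zero * eps

f = 10.0  # hypothetical prediction
for eps in (0.01, 0.1, 0.5):
    exact = lognormal_model(f, eps)
    approx = linearized_model(f, eps)
    print(f"eps={eps:4.2f}  exact={exact:8.4f}  linearized={approx:8.4f}")
```

The two forms agree closely for small deviations and drift apart as the deviation grows, which is why the approximation matters more when the intraindividual variance is large.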
Alternatively, the user might be able to transform the data so that a mean-variance model applies to the transformed data, which can then be presented directly to NM-TRAN. With the above example, and using the log transformation on the data $y_{ij}$, an appropriate mean-variance model to present to NM-TRAN would be
$$\log y_{ij} = \log f_{ij}(\theta, \eta_i) + \epsilon_{ij}$$
(Actually, NM-TRAN allows one to explicitly accomplish the log transformation of both the data and the $f_{ij}$.) The results of the analysis differ depending on whether or not the log transformation is used. Without the log transformation, the values of the $f_{ij}$ are regarded as arithmetic means (under the approximate model obtained by linearizing), and with the log transformation, these values are regarded as geometric means. Use of the log transformation (when this can be done, that is, when there are no $y_{ij}$ or $f_{ij}$ with value 0) can often lead to a better analysis.
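The distinction between arithmetic and geometric means can be illustrated by simulation. The sketch below assumes data that truly follow the multiplicative exponential error model; the values of `f` and `sigma` and the function name are hypothetical. For such data the geometric mean recovers the prediction, while the arithmetic mean is larger by the factor exp(sigma^2 / 2).

```python
import math
import random

def simulate_means(f, sigma, n, seed=0):
    """Simulate y = f * exp(eps), eps ~ Normal(0, sigma^2); return both means."""
    rng = random.Random(seed)
    ys = [f * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]
    arithmetic = sum(ys) / n
    geometric = math.exp(sum(math.log(y) for y in ys) / n)
    return arithmetic, geometric

f, sigma = 10.0, 0.5  # hypothetical prediction and standard deviation of eps
arithmetic, geometric = simulate_means(f, sigma, n=200_000)
print("arithmetic mean:", arithmetic, " theory:", f * math.exp(sigma ** 2 / 2.0))
print("geometric mean: ", geometric, " theory:", f)
```

With a small `sigma` the two means nearly coincide, so the choice of transformation matters most when intraindividual variability is large.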
It is also assumed that as individuals are sampled randomly from the population, the $\eta_i$ are also being sampled randomly (and statistically independently), although these values are not observable. The value $\eta_i$ is called the random interindividual effect for $y_i$. It is assumed that the $\eta_i$ are instances of the random vector $\eta$, normally distributed with mean 0 and variance-covariance matrix $\Omega$. The density function of this distribution (at $\eta$) is denoted by $h(\eta; \Omega)$.
Often, some quantity P (viewed as a function of values of the covariates and the $\eta_i$) is common to different intraindividual models. For example, a clearance parameter may be common to different intraindividual models, but its value differs between different intraindividual models because the values of the covariates and the $\eta_i$ differ. The randomness of the $\eta_i$ in the population induces randomness in P. The quantity P is said to be a randomly dispersed parameter. When speaking of its distribution, we are imagining that the values of the covariates are fixed, so that indeed, there is a unique distribution in question.
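From the above assumptions, the (marginal) likelihood of $\tau$ and $\Omega$ for the data $y_i$ is given by
$$L_i(\tau, \Omega) = \int l_i(\eta; \tau)\, h(\eta; \Omega)\, d\eta \tag{1}$$
In general, this integral is difficult to compute exactly. The likelihood for all the data is given by
$$L(\tau, \Omega) = \prod_i L_i(\tau, \Omega) \tag{2}$$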
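For a model with a single scalar interindividual effect, the marginal likelihood integral can be evaluated by brute-force quadrature. The sketch below is an illustration only and is not how NONMEM proceeds. It assumes a hypothetical toy model, y_ij = theta + eta_i + eps_ij with known variances, chosen because the integral then has a closed form against which the quadrature can be checked. All names and data values are assumptions.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def conditional_likelihood(eta, y, theta, sigma2):
    """Product of the normal densities of one individual's observations given eta."""
    like = 1.0
    for obs in y:
        like *= normal_pdf(obs, theta + eta, sigma2)
    return like

def marginal_likelihood(y, theta, sigma2, omega2, half_width=8.0, steps=4000):
    """Trapezoidal approximation to the integral over eta of
    conditional likelihood times the Normal(0, omega2) density."""
    sd = math.sqrt(omega2)
    lo, hi = -half_width * sd, half_width * sd
    step = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        eta = lo + k * step
        weight = 0.5 if k in (0, steps) else 1.0
        integrand = conditional_likelihood(eta, y, theta, sigma2) * normal_pdf(eta, 0.0, omega2)
        total += weight * integrand
    return total * step

def closed_form(y, theta, sigma2, omega2):
    """Exact marginal density for this linear toy model: the observations are
    jointly normal with covariance sigma2 * I + omega2 * (matrix of ones)."""
    n = len(y)
    resid = [obs - theta for obs in y]
    det = sigma2 ** (n - 1) * (sigma2 + n * omega2)
    quad = (sum(r * r for r in resid)
            - omega2 / (sigma2 + n * omega2) * sum(resid) ** 2) / sigma2
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** n * det)

y = [1.2, 0.8, 1.5]  # hypothetical observations from one individual
print(marginal_likelihood(y, theta=1.0, sigma2=0.25, omega2=1.0))
print(closed_form(y, theta=1.0, sigma2=0.25, omega2=1.0))
```

Quadrature of this kind becomes impractical as the dimension of the interindividual effect grows or the model becomes nonlinear, which is one way to see why approximations to the integral are used instead.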
The first-order estimation method was the first population estimation method available with NONMEM. This method produces estimates of the population parameters $\tau$ and $\Omega$, but it does not produce estimates of the random interindividual effects. An estimate of $\eta_i$ is nonetheless obtainable, conditional on the first-order estimates for $\tau$ and $\Omega$ (or on any other values for these parameters), by maximizing the empirical Bayes posterior density of $\eta$, given $y_i$: $l_i(\eta; \tau)\, h(\eta; \Omega) / L_i(\tau, \Omega)$, with respect to $\eta$. In other words, the estimate is the mode of the posterior distribution. Since this estimate is obtained after values for $\tau$ and $\Omega$ are obtained, it is called the posthoc estimate. When a mean-variance model is used, and a request is put to NONMEM to compute a posthoc estimate, by default this estimate is computed using $C_i(\tau, 0)$. In other words, the intraindividual variance-covariance matrix is assumed to be the same as that for the mean individual (the hypothetical individual having the mean interindividual effect, 0, and sharing the same values of the covariates as has the ith individual). However, it is also possible to obtain the posterior mode without this assumption.
The posterior density can be maximized using any given values for $\tau$ and $\Omega$. Since the resulting estimate for $\eta_i$ is obtained conditionally on these values, it is sometimes called a conditional estimate at these values, to emphasize its conditional nature.
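A conditional estimate of this kind can be sketched for a scalar interindividual effect as a one-dimensional maximization. The code below is an assumption-laden illustration, not NONMEM's algorithm: it uses a hypothetical homoscedastic model with known variances, a golden-section search (valid only if the posterior is unimodal on the search bracket), and invented function names and data.

```python
import math

def neg_log_posterior(eta, y, predict, sigma2, omega2):
    """Minus the log of (conditional likelihood * density of eta),
    dropping terms that do not depend on eta."""
    ss = sum((obs - predict(eta, j)) ** 2 for j, obs in enumerate(y))
    return 0.5 * ss / sigma2 + 0.5 * eta * eta / omega2

def posterior_mode(y, predict, sigma2, omega2, tol=1e-10):
    """Golden-section search for the mode; assumes a unimodal posterior."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = -10.0 * math.sqrt(omega2), 10.0 * math.sqrt(omega2)
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if neg_log_posterior(a, y, predict, sigma2, omega2) < \
                neg_log_posterior(b, y, predict, sigma2, omega2):
            hi = b   # minimum lies in [lo, b]
        else:
            lo = a   # minimum lies in [a, hi]
    return 0.5 * (lo + hi)

theta = 1.0                      # hypothetical population value, held fixed
y = [1.2, 0.8, 1.5]              # hypothetical observations from one individual

def predict(eta, j):
    """Hypothetical prediction for observation j: theta + eta."""
    return theta + eta

eta_hat = posterior_mode(y, predict, sigma2=0.25, omega2=1.0)
mean_residual = sum(obs - theta for obs in y) / len(y)
print("conditional estimate:", eta_hat)
print("mean residual:       ", mean_residual)
```

For this toy model the estimate lies between 0 and the individual's mean residual, reflecting the pull of the population distribution on the individual estimate.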
In contrast with the first-order method, the conditional estimation methods to be described produce estimates of the population parameters and, simultaneously, estimates of the random interindividual effects. With each different method, a different approximation to the likelihood function (1) is used, and (2) is maximized with respect to $\tau$ and $\Omega$. The approximation to (1) at the values $\tau$ and $\Omega$ depends on an estimate $\hat\eta_i$, and as this estimate itself depends on the values $\tau$ and $\Omega$, the approximation gives rise to a further dependence of $L_i$ on the values of $\tau$ and $\Omega$, one not expressed in (1). Consequently, as different values $\tau$ and $\Omega$ are tried, different estimates $\hat\eta_i$ are obtained as a part of the maximization of (2). The estimates $\hat\eta_i$ at the values $\hat\tau$ and $\hat\Omega$ that maximize (2) constitute the estimates of the random interindividual effects produced by the method (except for the hybrid method†). The estimate $\hat\eta_i$ also depends on $y_i$, and so, the approximation gives rise to a further dependence of $L_i$ on $y_i$, one also not expressed in (1).
----------
† After obtaining the population parameter estimates with the hybrid method (see chapter II), NONMEM ignores the estimates of the $\eta_i$ that have been produced simultaneously with the population parameter estimates, and as with the first-order method, the posthoc estimates (described above) are the ones reported as the estimates of the random interindividual effects.
----------
In contrast with the first-order method, a conditional estimation method involves multiple maximizations within a maximization. The estimate $\hat\eta_i$ is the value of $\eta$ that maximizes the posterior distribution of $\eta$ given $y_i$ (except for the hybrid method††). For each different value of $\tau$ and $\Omega$ that is tried by the maximization algorithm used to maximize (2), first a value $\hat\eta_1$ is found that maximizes the posterior distribution given $y_1$, then a value $\hat\eta_2$ is found that maximizes the posterior distribution given $y_2$, etc. Therefore, maximizing (2) is a very difficult and CPU-intensive task. The numerical methods by which this is accomplished are not described in this document.
----------
†† With the
hybrid method, a constrained maximum is computed.
----------
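The "maximizations within a maximization" structure can be sketched as a pair of nested loops. What follows is a deliberately crude illustration and makes no claim to resemble NONMEM's numerical methods: it assumes a hypothetical toy model y_ij = theta + eta_i + eps_ij with the intraindividual variance fixed, a Laplace-type approximation to the marginal likelihood of each individual, a finite-difference Newton iteration for the inner maximization, and a grid search for the outer one. All names and data are invented.

```python
import math

SIGMA2 = 0.25  # intraindividual variance, held fixed to keep the sketch short

def log_joint(eta, y, theta, omega2):
    """log of (conditional likelihood * density of eta) for the toy model."""
    ss = sum((obs - theta - eta) ** 2 for obs in y)
    log_l = -0.5 * len(y) * math.log(2.0 * math.pi * SIGMA2) - 0.5 * ss / SIGMA2
    log_h = -0.5 * math.log(2.0 * math.pi * omega2) - 0.5 * eta * eta / omega2
    return log_l + log_h

def derivatives(eta, y, theta, omega2, h=1e-4):
    """Central finite-difference first and second derivatives in eta."""
    f0 = log_joint(eta, y, theta, omega2)
    fp = log_joint(eta + h, y, theta, omega2)
    fm = log_joint(eta - h, y, theta, omega2)
    return (fp - fm) / (2.0 * h), (fp - 2.0 * f0 + fm) / (h * h)

def inner_mode(y, theta, omega2, iterations=20):
    """Inner maximization: Newton steps for one individual's eta.
    Returns the mode and the curvature (minus the second derivative) there."""
    eta = 0.0
    for _ in range(iterations):
        grad, hess = derivatives(eta, y, theta, omega2)
        eta -= grad / hess
    _, hess = derivatives(eta, y, theta, omega2)
    return eta, -hess

def objective(data, theta, omega2):
    """-2 log of the approximated likelihood, summed over individuals."""
    total = 0.0
    for y in data:  # one inner maximization per individual
        eta_hat, curvature = inner_mode(y, theta, omega2)
        log_li = (log_joint(eta_hat, y, theta, omega2)
                  + 0.5 * math.log(2.0 * math.pi / curvature))
        total -= 2.0 * log_li
    return total

def outer_grid_search(data, thetas, omega2s):
    """Outer maximization, here a crude grid search over (theta, omega2)."""
    best_value, best_theta, best_omega2 = min(
        (objective(data, t, w), t, w) for t in thetas for w in omega2s
    )
    etas = [inner_mode(y, best_theta, best_omega2)[0] for y in data]
    return best_value, best_theta, best_omega2, etas

data = [[1.2, 0.8, 1.5], [2.1, 2.4], [0.3, 0.1, 0.4, 0.2]]  # hypothetical
thetas = [0.5 + 0.05 * k for k in range(31)]   # candidate theta values
omega2s = [0.1 + 0.1 * k for k in range(30)]   # candidate omega2 values
value, theta_hat, omega2_hat, eta_hats = outer_grid_search(data, thetas, omega2s)
print(theta_hat, omega2_hat, eta_hats, value)
```

Every candidate pair of population values triggers a fresh inner maximization for every individual, which is the source of the computational cost. The estimates of the interindividual effects fall out as a by-product at the best population values found.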
Fortunately, it often suffices to use the first-order method; a conditional estimation method is not needed, or if it is, sometimes it is needed only minimally during the course of a data analysis. Some guidance is given in chapter III. Briefly, the need for a conditional estimation method increases with the degree to which the intraindividual models are nonlinear in the $\eta_i$. Population pharmacokinetic models are often actually rather linear in this respect, although the degree of nonlinearity increases with the degree of multiple dosing. Population pharmacodynamic models are more nonlinear. The potential for a conditional estimation method to produce results different from those obtained with the first-order estimation method decreases as the amount of data per individual decreases, since a conditional estimation method uses conditional estimates of the $\eta_i$, which are all shrunken to 0, and the shrinkage is greater the smaller the amount of data per individual. Many population analyses involve small amounts of data per individual.
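The relation between shrinkage and the amount of data per individual can be made concrete in a simple case. The sketch below assumes a hypothetical linear toy model, y_ij = theta + eta_i + eps_ij with intraindividual variance `sigma2` and interindividual variance `omega2`. For that model the conditional estimate equals the individual's mean residual multiplied by the factor computed here. The variance values are invented.

```python
def shrinkage_factor(n_obs, sigma2, omega2):
    """Multiplier applied to the mean residual in the conditional estimate of
    eta for the toy model; 1 means no shrinkage, 0 means complete shrinkage."""
    return n_obs * omega2 / (sigma2 + n_obs * omega2)

sigma2, omega2 = 1.0, 0.25  # hypothetical variances
for n_obs in (1, 2, 5, 20, 100):
    print(n_obs, round(shrinkage_factor(n_obs, sigma2, omega2), 3))
```

As the number of observations per individual falls, the factor moves toward 0 and the conditional estimates cluster near the population mean.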
The conditional estimation methods that are available with NONMEM and which are described in chapter II are: the first-order conditional estimation method (with and without interaction when mean-variance models are used, and with or without centering), the Laplacian method (with and without centering), and the hybrid method (a hybrid between the first-order and first-order conditional estimation methods). For purposes of description here and in other NONMEM Users Guides, the term conditional estimation methods refers not only to these population estimation methods, but also to methods for obtaining conditional estimates themselves.
To summarize, each of the (population) conditional estimation methods involves maximizing (2), but each uses a different approximation to (1). Actually, $-2 \log L$ is minimized with respect to $\tau$ and $\Omega$. This is called the objective function. Its minimum value serves as a useful statistic for comparing models. Standard errors for the estimates (indeed, an estimated asymptotic variance-covariance matrix for all the estimates) are obtained by computing derivatives of the objective function.
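The roles of the objective function can be illustrated with a one-parameter example. The sketch below is not NONMEM's computation. It assumes a hypothetical model with no interindividual effects (observations normally distributed about `theta` with known variance), and it assumes that the asymptotic variance is taken as 2 divided by the second derivative of the objective function at its minimum, which is only one of several possible derivative-based forms. The data are invented.

```python
import math

def objective(theta, y, s2):
    """-2 log likelihood for y_j ~ Normal(theta, s2), with s2 known."""
    return sum(math.log(2.0 * math.pi * s2) + (obs - theta) ** 2 / s2 for obs in y)

def second_derivative(func, x, h=1e-3):
    """Central finite-difference second derivative of func at x."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

y = [4.8, 5.3, 5.1, 4.6, 5.2]  # hypothetical data
s2 = 0.09                      # hypothetical known variance

theta_hat = sum(y) / len(y)    # minimizes the objective for this simple model
curvature = second_derivative(lambda t: objective(t, y, s2), theta_hat)
standard_error = math.sqrt(2.0 / curvature)  # assumed inverse-curvature form
print("estimate:", theta_hat, " standard error:", standard_error)

# Comparing models: a restricted model fixing theta at a hypothetical value
# versus the model in which theta is estimated.
restricted_value = objective(5.5, y, s2)
full_value = objective(theta_hat, y, s2)
print("difference in minimum objective function:", restricted_value - full_value)
```

For this model the standard error obtained from the curvature agrees with the familiar square root of variance over sample size, and the difference in minimum objective function values measures how much worse the restricted model fits.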