Chapter 3 - Models for Individual Data
In this chapter, the notation and definitions we will use to discuss models for individual data will be presented. The relationship of these models to data will be discussed, and a distinction between pharmacokinetic structural models (that describe the underlying shape and form of the data) and statistical error models (that describe the "errors" or differences between observations and structural model predictions) will be made. Several error models will be discussed, as will a useful modelling device, the indicator variable.
By individual data we usually mean data from a single individual (animal or human). One could also be concerned with data comprised of a pharmacokinetic response at just one time point from each of a number of individuals. Call this type of data single-response population data. This name comes from the fact that data such as these can, of course, be regarded as a particular instance of the more general data type, population data; i.e., data comprised of one or more pharmacokinetic responses at different time points from a number of individuals sampled from a population. Although one can discuss the treatment of single-response population data as population data, they are often treated just as are individual data.
A simple pharmacokinetic model for data from a single individual is the monoexponential ("one-compartment") model:
    A_j = DOSE·exp(-K·t_j)        (3.1)

This model describes the typical time course of the amount of drug in the body (A), as a function of the initial dose (DOSE), time (t), and a parameter, K. As we may be interested in A at several possible times, we explicitly note this by the subscript j, which indexes a list of times, t_1, t_2, ..., t_n.
A way to write a generic form for a structural model, omitting details of its structure, is

    y_j = f(x_j, θ)        (3.2)

where y stands for some "response" (dependent variable) of interest (A in (3.1)), the symbol f stands for the unspecified form of the model (a monoexponential such as in (3.1)), which is a function of known quantities, x (DOSE and t_j in (3.1)), and parameters, θ (K in (3.1)). The quantities in x are known because they are either measured or controlled, and they are therefore called fixed effects, in contrast to effects which are not known and are regarded as random (see below). The parameters in the parameter vector θ are called fixed effect parameters because they quantify the influence of the fixed effects on the dependent variable. Each one of an individual’s pharmacokinetic parameters is a particular type of fixed effect parameter. With NONMEM, the parameters comprising θ are (usually) fixed effect parameters, but these may or may not be an individual’s pharmacokinetic parameters (contrast figures 2.1 and 2.6). Here we shall use the symbol φ for the parameter vector comprised specifically of an individual’s pharmacokinetic parameters (although there will be some exception to this).
Aside from the fact that the values given by a structural model are usually not the values observed, due to measurement error or model misspecification, an amount of drug (A of (3.1)) is usually not itself observable. Instead, we may observe a concentration (C) of drug. We need an "observation scaling" model to describe the relationship between A and C. This might be

    C_j = A_j/V        (3.3)

where V is another parameter, Volume of Distribution. (We denote the concentration in model (3.3) by the symbol C, to distinguish it, the model-predicted value, from the actually observed value, y. This will soon be discussed further.) PREDPP assumes that there is always an observation scaling model like (3.3) that relates an amount of drug (in some compartment of the body) to the observation, and therefore always expects a parameter, S, that scales (i.e. divides) the predicted amount in the compartment. In the example above, S is simply V. In other examples, to be discussed later, S can be more complicated. If a value for S is not specified, it is taken to be 1. For the rest of this discussion, it is convenient to assume that φ itself includes a scaling parameter (if such is needed, and even though such a parameter is not usually regarded as one of an individual’s pharmacokinetic parameters) and that f actually includes observational scaling. Note, considering the example of (3.3), that φ = (K, V) and x_j = (DOSE, t_j). Thus φ and x of (3.1) and (3.3) are in general lists of things (vectors), not single things (scalars).
PREDPP implements a
number of pharmacokinetic models, such as the
one-compartment model (3.1), (3.3). These will be discussed
more fully in Chapter 7. There is no need for further
general discussion of kinetic models, as we assume the
readers of this document are familiar with pharmacokinetics.
However, two modelling features deserve further comment: alternative parameterizations and the special parameter, S.
Recall the phenobarbital example of Chapter 2. For the first run, the input contained, among other things, some lines of code defining the variables CL and V, and then the line

   K = CL/V

This line was needed because PREDPP expects the one-compartment model to be parameterized using the parameter K, the rate constant of elimination, not clearance and volume of distribution. However, we chose to estimate typical population values for CL and V, so we had to relate these parameters to THETA and then relate K to CL and V. This is an example of reparameterization of a model so that the pharmacokinetic parameters used are those of primary interest to the modeler. In fact, we may use any parameterization we wish, so long as we are willing to include the reparameterization line(s) that translate our parameters into those expected by PREDPP. (Chapter 7 discusses the parameters PREDPP expects for the various models it implements.) However, there are translation routines, collectively called TRANS, which do this automatically. Different versions of TRANS exist in the PREDPP Library and correspond to translations of different parameterizations into the one expected by PREDPP.
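To make the reparameterization concrete, here is a minimal sketch of $PK abbreviated code for a one-compartment model. The choice of ADVAN1 (so that the scale parameter for the central compartment is S1) and the assignment of clearance and volume to THETA(1) and THETA(2) are illustrative assumptions, not requirements.

   $PK
      CL = THETA(1)        ; clearance, a parameter of primary interest
      V  = THETA(2)        ; volume of distribution
      K  = CL/V            ; translation to the rate constant expected by PREDPP
      S1 = V               ; scale: the predicted amount divided by S1 is a concentration

With the library routine TRANS2, which parameterizes the one-compartment model directly in terms of CL and V, the line defining K would not be needed.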
Usually, observations are concentrations. So, as in model (3.3), S will usually be set identical to V. However, S is not always simply V. Some examples should clarify this point. (In the discussion below, we avoid subscripting S by compartment number, and simply use S to refer to the scale term for the amount in the compartment in which concentrations are being measured.)
This almost trivial case occurs when one wishes to match the units of predicted responses to those of the data. For example, suppose DOSE is in milligrams, but concentrations are in ng/ml. If no scaling is done, the units of V will be kiloliters (i.e., V = 1 corresponds to 1000 liters). To avoid this, one might choose the model

   S = V/1000

thereby converting the units of A into micrograms, and since mcg/L = ng/ml, the units of V become liters. Of course, one could recode one’s data, dividing all concentrations by 1000 (or multiplying the dose by 1000) and avoid this, but that may not be convenient.
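In $PK abbreviated code this is a single line; the sketch below assumes a one-compartment model whose central-compartment scale parameter is S1.

      S1 = V/1000          ; dose in mg, concentrations in ng/ml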
Later in this chapter we will discuss a model used when the data arise from two different assays (call them assay 1 and assay 2). In such a case, there may be a systematic (multiplicative) bias of one assay relative to the other. If we wish to allow for this possibility, we might need a model such as

   S = V              (for observations from assay 1)
   S = V·(1 + θ_3)    (for observations from assay 2)

where θ_3 is a new parameter that measures the proportional bias of the assays (i.e., bias causes the apparent volume of distribution to be different for data from the two assays). The parameter θ_3 is not really a pharmacokinetic parameter, but for the purpose of this discussion it can be included in φ.
Later in this chapter we will describe a model useful when two kinds of responses are measured, plasma and urine concentrations. In the case of urine concentrations, the predicted total drug in the urine during a time period (available from an "output" compartment present in all models implemented by PREDPP; see Chapter 7) would have to be scaled by the actual urine volume during that time period. This volume would be an element of x, and S would be set equal to it.
One does not, in fact, ever observe the predicted plasma concentration (or any other predicted response). What one observes is a measured value which differs from the predicted value by some (usually small) amount called a residual error (also called intra-individual error). We regard this error as a random quantity (see below). We will want NONMEM to fit our model to our data, and in so doing, provide us with estimates of the model parameters. The way NONMEM’s fit follows the data is determined largely by what we tell it about the nature of the errors (see Chapter 5). We must therefore provide NONMEM with another model, an error model.
There are many reasons that the actual observation may not correspond to the predicted value (e.g. C_j as given by the right side of (3.3)). The structural model may only be approximate, or the quantities in x may have been measured with error, or, as is always true, pharmacokinetic responses may be measured with some error (assay error). It is too difficult to model all these sources of error separately, so we usually make the simplifying assumption that each difference between an observation and its prediction (i.e. each error) is a randomly occurring number. When the data are from a single individual, and the error model is the Additive error model (see Section 3.1, below), the error is denoted by η herein, by ETA in NONMEM output, and by ETA or ERR in NM-TRAN input. (When data are from a population, and the same error model is used, this error will be denoted ε; see Chapter 4.) Therefore a model for the jth observation, y_j, could be written

    y_j = f(x_j, θ) + η_j        (3.4)
Implicit in using the symbol η in this way is the assumption that all residual errors come from probability distributions with mean zero and the same (usually unknown) variance. (The error variance is the mean squared error.) More complicated error models involving η can be written (see below). A schematic of model (3.4) is shown for the structural model of (3.1), (3.3) in figure 3.1. Because this model describes the influence of both fixed effects (x) and random effects (η), it is called a Mixed Effects Model (hence the name, NONMEM: NONlinear Mixed Effects Model). Mixed effects models, in general, may have more than one random effect, and more than one type of random effect (Chapter 4); (3.4) is only a particularly simple example.
Figure 3.1. y vs t for a monoexponential model. The solid line is f(x_j, θ); the circles are the observed data points. An error is indicated.
Even though errors are
unpredictable random quantities, some information about them
is usually assumed, and some can be estimated. First, it is
assumed that the mean error is zero. This simply means that
were the true values for the parameters in θ known, the model would have no systematic overall bias
(e.g., be systematically below or above the data points, on
average).
A second aspect of the
error, one that can be estimated by NONMEM, is its typical
size. Since errors may be positive or negative, their
typical size is not given by their mean (which is zero), but
by their standard deviation, the square root of their
variance. One can always simply convert the variance into
the standard deviation, and conversely. NONMEM output gives
estimates of the error variance. With individual data this
variance is denoted in this text by ω², and by OMEGA in NONMEM input and output. The standard deviation (SD) of the error is denoted ω herein. The reason that OMEGA, rather than, for example, OMEGA SQ, stands for ω² in NONMEM input and output will be discussed in Section 3.8. (We will see, in Chapter 4, that when the error is symbolized by ε, not η, its variance will be denoted σ² in this text, and SIGMA, not OMEGA, in NONMEM input and output.) Here, the parameter ω² quantifies the influence of the random effect, η, on the observations, y. It is therefore called a random effects parameter.
The symbol η is always used to denote a random quantity whose probability distribution has mean zero and variance ω². Model (3.4) says that the errors themselves can be so regarded, and since an observation equals its prediction (under the structural model) plus an error, model (3.4) is called the Additive error model. This model is illustrated in figure 3.2.
Figure 3.2. y vs t for a monoexponential model. The middle line is f(x_j, θ); the outer lines give the approximate "envelope" for additive errors. Don’t be fooled by the apparent widening of the gap between the upper and lower curves as time increases: the vertical distance from the middle line to either outer line is everywhere the same.
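A minimal sketch of the Additive error model in $ERROR abbreviated code follows; F denotes the structural model prediction supplied by PREDPP, and, for individual data, ERR(1) plays the role of η (as noted above, ETA or ERR may be used in NM-TRAN input).

   $ERROR
      Y = F + ERR(1)       ; additive error: observation = prediction + eta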
NONMEM allows an error model which can be more complicated than that of (3.4). One such more complicated, but useful, model is the Constant Coefficient of Variation (CCV), or Proportional, error model:

    y_j = f(x_j, θ)·(1 + η_j)        (3.5)
A fractional error is an error expressed as a fraction of the corresponding prediction. The CCV model says that a fractional error can be written as an η, i.e. as a random quantity with mean zero and variance ω². Under this model, the variance of an error itself is proportional to the squared prediction, with ω² being the proportionality factor, and so is not constant over observations. Since, under this model, the standard deviation of the error, and also of y_j, is ω·f(x_j, θ), and since the mean of y_j is f(x_j, θ) (when θ assumes its true value), the coefficient of variation of y_j is just the constant ω (the coefficient of variation of a random quantity is defined as its standard deviation divided by its mean). This is the reason the CCV error model is so named. Also for this reason, ω² is dimensionless, in contrast to having units equal to those of the squared observation as with the Additive model. This error model is illustrated in figure 3.3.
Figure 3.3. y vs t for a monoexponential model. The middle line is f(x_j, θ); the outer lines give the approximate "envelope" for constant coefficient of variation errors.
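A corresponding sketch of the CCV error model in $ERROR abbreviated code:

   $ERROR
      Y = F*(1 + ERR(1))   ; proportional (constant coefficient of variation) error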
The exponential error model is

    y_j = f(x_j, θ)·exp(η_j)

This model is sometimes referred to as the log-normal model, because it is additive if logarithms are taken (and because η_j is assumed to be normally distributed):

    ln y_j = ln f(x_j, θ) + η_j

See Chapter 8, Section 3.2 for a discussion of this model.
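A sketch of the exponential error model in $ERROR abbreviated code:

   $ERROR
      Y = F*EXP(ERR(1))    ; exponential error; additive on the log scale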
When most observations obey the CCV model but some observations may be near the lower limit of detection of an assay, a model which may be useful is one which is a combination of both the Additive and CCV error models:

    y_j = f(x_j, θ)·(1 + η_1j) + η_2j        (3.6)
Here there are two types of η’s, η_1 and η_2. The first has variance ω_1²; the second has a possibly different variance, ω_2². NONMEM permits several types of η’s. Under this model, the variance of the error portion of the model is f(x_j, θ)²·ω_1² + ω_2². When the prediction is near zero, the variance is approximately constant, namely ω_2². This is the smallest variance possible and corresponds, perhaps, to the limit of assay precision. When the prediction is considerably greater than zero, the variance is approximately proportional to the squared prediction.
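A sketch of the combined Additive and CCV error model in $ERROR abbreviated code; the two random effects have variances given by two separate diagonal elements of OMEGA.

   $ERROR
      Y = F*(1 + ERR(1)) + ERR(2)   ; proportional plus additive error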
A model that has both the Additive and the CCV error models as special cases, and smoothly interpolates between them in other cases, is the Power Function model:

    y_j = f(x_j, θ) + f(x_j, θ)^θ_p·η_j        (3.7)
Here f(x_j, θ) is raised to the θ_p power in the error model, rather than the zeroth power (Additive error model; note z⁰ = 1 for any number z) or the first power (CCV model). The parameter θ_p is a fixed effects parameter, even though its role in the overall model is to modify the variance model, not the structural model. With NONMEM, all fixed effect parameters must be elements of the general parameter vector θ. If we want the Power Function model to interpolate between the Additive and CCV models, θ_p must be constrained to lie between 0 and 1. NONMEM allows this (see Chapter 9). While one might be tempted to combine the Power Function model with the Additive model, much as the CCV and Additive models were combined above, such a combination can lead to identifiability difficulties, and for this reason it should be avoided.
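A sketch of the Power Function error model in $ERROR abbreviated code; the use of THETA(3) for the exponent is an arbitrary illustrative choice.

   $ERROR
      W = F**THETA(3)      ; prediction raised to the power THETA(3)
      Y = F + W*ERR(1)     ; additive when THETA(3)=0, CCV when THETA(3)=1

On the $THETA record, the entry for this exponent would be given bounds, e.g. (0, 0.5, 1), so that it is constrained to lie between 0 and 1.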
Another more complicated error model can arise when more than one type of measurement is made. Suppose, for the sake of illustration, that the observations are drug concentrations, but that they are measured with two different assays. One assay may be more precise than the other, which is equivalent to saying that one assay has a smaller ω² than the other. We would like to be able to take this into account in the analysis (i.e., not pay as much attention to the less precise observations), and perhaps (if we have enough data) estimate the relative precision of the assays as well. To do this in the notation we have introduced, an independent variable indicating which observations are obtained with which assay is needed: we call such an independent variable an indicator variable.
Let one of the data items (an element of x_j) be labeled Q_j, and let Q_j take the value 1 if the assay used for y_j was of the first type, and the value 0 if it was of the second type. The variable Q_j is an indicator variable, and it allows us to write an additive type error model, say, as

    y_j = f(x_j, θ) + Q_j·η_1j + (1 - Q_j)·η_2j        (3.8)
Here there are two types of η’s, η_1 and η_2. The first applies to the first type of assay, and has variance ω_1²; the second applies to the second type of assay, and has a possibly different variance, ω_2². NONMEM permits several types of η’s. Different types of η’s can be correlated, and NONMEM can allow for this. However, this is something we would only need to consider in the example at hand if the same blood sample were measured by both assays. We will not emphasize this possibility in this introductory guide. (This possibility also applies to the random variables describing unexplained interindividual differences with population data; see Chapter 4.)
When the assay is done by the first method, Q_j will be unity, and (3.8) becomes

    y_j = f(x_j, θ) + η_1j

so that the variance of the error is ω_1². When the assay is done by the second method, Q_j will be zero, and (3.8) becomes

    y_j = f(x_j, θ) + η_2j

so that the variance of the error is now ω_2². Both ω_1² and ω_2² are random effect parameters.
An equivalent form of the model, which can be implemented easily in NM-TRAN abbreviated code, is sketched below.
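The sketch assumes that the indicator is read from the data file as a data item named Q (1 for the first assay, 0 for the second); the name is an illustrative choice.

   $ERROR
      Y = F + Q*ERR(1) + (1-Q)*ERR(2)   ; ERR(1) applies to assay 1, ERR(2) to assay 2

Each ERR has its own variance, given by a separate diagonal element of OMEGA.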
The same need for separate scales for different measurements can arise when more than one type of observation is made. Suppose both plasma concentrations (C) and urine concentrations (C_u) are measured. The structural model for C might be (3.1), (3.3). If we assume that urine is collected between each observation of C, then the structural model for C_u,j, the drug concentration in the urine collected between time t_(j-1) and time t_j, might be

    C_u,j = F_e·(A_(j-1) - A_j)/VOL_j        (3.9)

where A_(j-1) and A_j are the amounts of drug in the body at times t_(j-1) and t_j, as given by (3.1), F_e is the fraction of drug excreted unchanged (a parameter), and VOL_j is the urine volume collected between time t_(j-1) and t_j (a data item)†.
----------
† With all PREDPP pharmacokinetic models there is an output compartment for which the total amount of drug leaving the system is computed automatically. The concentration in the urine is then obtained by setting the scaling parameter for the output compartment to VOL_j/F_e.
----------
Assuming again, for the sake of the example, that we want to use an additive type error model for the observations, the problem is that urine concentrations can be orders of magnitude larger than plasma concentrations, so that, while an additive error model might be appropriate for either type of observation alone, the two types of observations must have different typical error magnitudes; i.e., different variances (ω²’s).
An indicator variable can again be used. Let the indicator variable Q_j be unity if the jth observation is a C, and 0 if it is a C_u. We now need to use it for both the structural and error models, so that:

    y_j = Q_j·(C_j + η_1j) + (1 - Q_j)·(C_u,j + η_2j)        (3.10)
A little thought shows that the indicator variable selects the correct prediction (C_j or C_u,j) and the correct error term (η_1j or η_2j) for each observation.
An equivalent form of the model, which can be implemented easily in NM-TRAN abbreviated code, is sketched below.
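The sketch assumes that the indicator is a data item named Q (1 for plasma observations, 0 for urine observations) and that PREDPP already supplies, in F, the scaled prediction appropriate to the compartment being observed.

   $ERROR
      Y = F + Q*ERR(1) + (1-Q)*ERR(2)   ; separate error magnitudes for plasma and urine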
Of course, there could be three types of assays, or more, and similarly, more than two types of observations. One usually needs one less indicator variable than the number of types of things to be distinguished. So, if there were three assays, one would define Q_1j and Q_2j. Q_1j would be 1 if the assay were of the first type, and zero otherwise; Q_2j would be 1 if the assay were of the second type, and zero otherwise. The error model for the data would require three types of η’s, η_1, η_2, and η_3, and could be written

    y_j = f(x_j, θ) + Q_1j·η_1j + Q_2j·η_2j + (1 - Q_1j - Q_2j)·η_3j        (3.11)
Equation (3.11) results in the following: the variance of the error is ω_1² for observations measured with the first assay, ω_2² for those measured with the second, and ω_3² for those measured with the third.
An equivalent form of the model, which can be implemented easily in NM-TRAN abbreviated code, is sketched below.
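The sketch assumes two indicator data items named Q1 and Q2, coded as described above.

   $ERROR
      Y = F + Q1*ERR(1) + Q2*ERR(2) + (1-Q1-Q2)*ERR(3)   ; one error term per assay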
We have just seen examples of more complicated error models than the simple Additive model. We here give a mathematical form for the most general mixed effects model considered within the scope of this document:

    y_j = f(x_j, θ) + h(x_j, θ)′·η_j

where h is a vector-valued function of x_j and the parameters θ (where the latter is interpreted broadly to contain parameters such as θ_p of (3.7)), and η_j is a vector of η’s of the different types. The prime (′) denotes vector transpose. When there is more than one η type, there will be several ω²’s, one for each type. The collection of these is denoted Ω and is labeled OMEGA in NONMEM input and output. This collection is regarded as a diagonal matrix (diagonal for now; but see Chapter 4), rather than as a vector. We will use the symbols ω_k² and ω_kk interchangeably in this text to denote the (diagonal) element of this matrix found in position (k,k).
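In the control stream, the diagonal elements of OMEGA are given initial estimates on the $OMEGA record, one value per η type; the values below are purely illustrative.

   $OMEGA  0.04  25       ; initial estimates of the variances of the first and second eta types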