The term ‘demographic models’ can have two meanings, one broad and one narrow. In its broad meaning, the term refers to all mathematical, statistical, forecasting, and microsimulation models that are applied to studies of demographic phenomena. In its narrow meaning, it refers to models of empirical regularities in age patterns of demographic events. Demographic models in the broad definition can be found in various entries related to demography (see, among others, Demographic Analysis: Probabilistic Approach; Multistate Transition Models in Demography; Event History Analysis: Applications; Population Dynamics: Theory of Stable Populations; Population Forecasts; Microsimulation in Demographic Research). This article is concerned with demographic models in the narrow definition.
1. Age Patterns of Demographic Events
In a classic statement, Hauser and Duncan (1959) defined demography as ‘the study of the size, territorial distribution, and composition of population, changes therein, and the components of such changes’ (p. 2). In the Hauser-Duncan definition of demography, the study of population changes goes hand in hand with the study of the components of population changes. This is necessitated by the need to decompose a population into components and then study changes in the components before arriving at an overall understanding of the changes in the population. The most elementary, and also the most important, form of population decomposition is by sex and age. Since men are only indirectly involved in reproduction, demographic analysis is often simplified by focussing on women. Such simplification is called ‘one-sex’ modeling (see Population Dynamics: Two-sex Demographic Models), as is commonly seen in fertility and nuptiality models. In mortality models, however, men and women are always kept distinct, and there are substantial mortality differentials in favor of women.
The treatment of age is of significant concern in demographic research. Without exception, the occurrence of all demographic events is age-dependent. Here, the correct interpretation of age-dependency is one of life-course, that is, the likelihood of the occurrence of an event changes as a person (or a cohort) ages. This is true even though most demographic methods and models use cross-sectional data, capitalizing on age-gradients of vital rates in any given population. The use of period-based data usually is necessitated by the lack of cohort-based data.
Demographic models of age schedules are developed on the observation that age patterns of demographic events often show some regularity. Two cautionary notes are in order. First, age regularity is not universal, either across space or over time. Second, all that is assumed is empirical regularity; theoretical reasons behind such regularities are typically neither well established nor well understood. Nonetheless, demographic models capitalizing on empirical regularities are very useful in practice and may provide the basis for theoretical work. A brief discussion of the main uses of demographic models is given as follows.
2. Use of Demographic Models
Demographic models are intended to summarize empirical regularities in age patterns of demographic events, ideally in simple mathematical formulas. Such models can prove very useful in demographic research.
The first type of use concerns data quality. For example, demographic models may be used to detect and correct faulty data, impute missing data, and allow researchers to draw inferences from partial data. Not surprisingly, this use of demographic models is often found in historical demography and in research on less developed countries, where quality data are scarce. The second type of use is substantive and is commonly found in research with a comparative focus, be it over time (trends) or across societies, regions, or subpopulations. Age-schedule models are also useful for actuarial calculations of life-insurance premiums and premium reserves and for the computation of national or regional population forecasts (e.g., Lee and Tuljapurkar 1994). In demographic applications, cross-group variations in age patterns are typically parameterized as functions of two components: a shape component to capture age effects and a modification component reflecting group membership. Ideally, the second component should be as parsimonious as possible. When demographic models are fully parametric, the two components are integrated, with the functional form being the shape component and the parameter values being the modification component. However, parametric models are not the norm in demography. Parametric models and semiparametric models will be discussed separately below, with concrete examples taken from research on mortality, nuptiality, and fertility. Coale and Trussell (1996) give a more detailed account of some of the models.
3. Parametric Models
Gompertz (1825) is credited with the discovery that mortality rates in human populations increase nearly as an exponential function of age. This regularity holds true only after early adulthood (ages 25-30). According to ‘Gompertz’s law,’ the force of mortality (or hazard) is a parsimonious loglinear function of age:
$$\mu_x = A e^{Bx} \tag{1}$$
where $\mu_x$ denotes the force of mortality at age x. Naturally, A is the parameter characterizing the level of mortality at early ages and B is the parameter characterizing the rate of increase in mortality with age. While very parsimonious, the Gompertz model does not always fit empirical data well. Other researchers have modified the model either by adding terms or by changing the functional form. For example, Makeham (1867) altered the simple loglinear relationship of Eqn. (1) by adding a constant term that does not vary with age, intended to capture cause-specific ‘partial forces of mortality.’
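As a minimal sketch of how the two Gompertz parameters might be estimated, the Python snippet below assumes the common parameterization in which the force of mortality equals A·exp(Bx), so that its logarithm is linear in age, and fits A and B by ordinary least squares on log rates at ages 30 and above. The function name `fit_gompertz`, the age cutoff, and the synthetic rates are illustrative assumptions and are not taken from the sources cited here.

```python
import numpy as np

def fit_gompertz(ages, rates, min_age=30):
    """Estimate (A, B) in mu_x = A * exp(B * x) by OLS on log rates.

    Only ages >= min_age are used, since the exponential pattern is
    described as holding only after early adulthood.
    """
    ages = np.asarray(ages, dtype=float)
    rates = np.asarray(rates, dtype=float)
    keep = ages >= min_age
    # log mu_x = log A + B * x, so a degree-1 fit returns (B, log A).
    slope, intercept = np.polyfit(ages[keep], np.log(rates[keep]), 1)
    return float(np.exp(intercept)), float(slope)

# Synthetic (hypothetical) rates generated from known parameters.
ages = np.arange(0, 91, 5)
true_A, true_B = 5e-5, 0.09
rates = true_A * np.exp(true_B * ages)
A_hat, B_hat = fit_gompertz(ages, rates)
```

With real data the fit would be imperfect, and the size of the residuals on the log scale gives a rough indication of how well the loglinear form holds over the chosen age range.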
To fit observed age patterns of fertility, Brass (1960) proposed a model using polynomial functions. Although Brass’ parametric model is very flexible, it requires the estimation of four unknown parameters. Since fertility rates typically are given in a limited number of 5-year intervals, this method leaves very few degrees of freedom for evaluating goodness-of-fit. An evaluation of various parametric models based on empirical data is given by Hoem et al. (1981).
There are three notable problems with parametric models. First, parametric models often do not fit observed phenomena. Demographers have dealt with such empirical deviations through (a) restricting the applicable age range and (b) allowing for further parameterization. Both solutions have surfaced in research using the Gompertz model.
The second major disadvantage associated with parametric models is the lack of behavioral interpretations for key parameters. This problem arises from the fact that almost all parametric models have resulted from exercises in curve-fitting. Parametric models may reproduce observed age patterns of demographic events, but the theoretical interpretation of the parameters involved is often unclear.
Finally, parametric models are not always convenient to use for comparing populations or subpopulations even though this was one of the motivations for developing them in the first place. This problem is apparent, for example, in the case of a polynomial model. When several parameters in a polynomial function differ between two populations, it is difficult to characterize one population as having higher or lower rates than the other population.
4. Semiparametric Models
In response to these problems with parametric models, semiparametric models have been developed. Semiparametric models are similar to parametric models in specifying parsimonious mathematical functions but differ from them in allowing age-dependency to be unconstrained and subject to empirical estimation. That is, semiparametric models do not impose any global constraint limiting the age pattern to the rigidity of a parametric mathematical function. Instead, the age pattern is estimated freely and empirically from observed data or calculated from external sources. One manifestation of the semiparametric approach is the use of model schedules or model tables. While allowing flexibility in a common age function, model tables place constraints on the variations in the age pattern across populations or subpopulations. Such constraints are often motivated by substantive knowledge of demographic phenomena.
In the area of mortality studies, for example, model life tables have been in wide use, primarily as a tool for correcting faulty data and estimating missing data. In essence, a model life table allows for a typical age schedule of mortality shared by a set of populations that differ mainly in their levels of mortality. The age pattern is flexible and empirically determined over all ages but constrained across populations; and cross-population variability lies in the overall level of mortality. The earliest model life table was developed by the United Nations (1955) for all national populations. Subsequently, Coale and Demeny (1966) added more flexibility by identifying four regional model life tables based on distinctive age patterns of mortality and refined the technique for constructing model life tables.
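The idea of a freely estimated age pattern combined with a single level parameter per population can be sketched in Python as follows. The sketch assumes an additive structure on the log scale, with the log rate at age x in population i equal to a shared age term plus a population level. This is an illustrative simplification, not the procedure used to construct the United Nations or Coale-Demeny tables, and the function name `fit_level_model` and the example rates are hypothetical.

```python
import numpy as np

def fit_level_model(log_rates):
    """Decompose log rates into a shared age pattern plus population levels.

    log_rates: array of shape (n_ages, n_pops).
    Returns (age_pattern, levels) with levels summing to zero, so that
    log_rates[x, i] is approximated by age_pattern[x] + levels[i].
    """
    log_rates = np.asarray(log_rates, dtype=float)
    # Row means give the unconstrained age pattern (one value per age).
    age_pattern = log_rates.mean(axis=1)
    # Centered column means give one level parameter per population.
    levels = log_rates.mean(axis=0) - log_rates.mean()
    return age_pattern, levels

# Hypothetical example: three populations sharing one age shape.
shape = np.log([0.020, 0.002, 0.001, 0.003, 0.010, 0.040, 0.150])
true_levels = np.array([-0.3, 0.0, 0.3])
log_rates = shape[:, None] + true_levels[None, :]
age_pattern, levels = fit_level_model(log_rates)
fitted = age_pattern[:, None] + levels[None, :]
```

Here the age pattern takes one free value per age group, which is the nonparametric part, while each population contributes only one number, which is the parsimonious modification component.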
5. Statistical Demography vs. Mathematical Demography
The construction and derivation of parametric and semiparametric demographic model age schedules have been accomplished through mostly mathematical and sometimes graphical methods. From the outset demographic modeling has been regarded as part of mathematical demography. Since the 1980s, however, a mainline statistical approach has played an increasingly important role in the development of demographic models (e.g., Clogg and Eliason 1988, Hoem 1987, Xie 1990, Xie and Pimentel 1992; see also Demographic Analysis: Probabilistic Approach) for a number of good reasons.
First, the advancement of demography has brought with it more, richer, and better data in the form of sample data, and the use of sample data requires statistical tools. Treating sample data as exactly known quantities induces the danger of contamination by sampling errors. Second, while the method of disaggregation commonly used in mathematical demography may easily lead to inaccurate estimation due to small group sizes, statistical methods more efficiently utilize covariates to examine group differences. Third, because observed data may sometimes be irregular or simply missing, statistical models can help smooth or impute data. Conversely, the strength of empirical techniques developed around demographic models (e.g., indirect estimation, age patterns of fertility, model life tables, etc.) is that they provide descriptions of age patterns that can be utilized to improve statistical analysis.
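The danger of treating sample-based rates as exact can be illustrated with a short Python sketch. It assumes that the number of events in an age group follows a Poisson distribution with exposure treated as fixed, so that the standard error of an occurrence/exposure rate is approximately the square root of the event count divided by exposure. The function name `rate_with_interval` and the counts are hypothetical.

```python
import math

def rate_with_interval(events, exposure, z=1.96):
    """Occurrence/exposure rate with an approximate 95% interval.

    Assumes the event count is Poisson with exposure treated as fixed,
    so the standard error of the rate is sqrt(events) / exposure.
    """
    rate = events / exposure
    se = math.sqrt(events) / exposure
    return rate, (max(rate - z * se, 0.0), rate + z * se)

# Hypothetical sample: 25 births over 400 woman-years in one age group.
rate, (low, high) = rate_with_interval(25, 400.0)
```

In this example the interval runs from roughly 0.038 to 0.087 around a rate of 0.0625, which suggests how easily irregularities in an observed age schedule can reflect sampling error rather than real differences.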