 Research article
 Open Access
Healing, surviving, or dying? – projecting the German future disease burden using a Markov illness-death model
BMC Public Health volume 21, Article number: 123 (2021)
Abstract
Background
In view of the upcoming demographic transition, there is still no clear evidence on how increasing life expectancy will affect future disease burden, especially regarding specific diseases. In our study, we project the future development of Germany’s ten most common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure) until 2060 in a Markov illness-death model with recovery.
Methods
The disease-specific input data stem from a consistent data set of a major sickness fund covering about four million people, the demographic components from official population statistics. Using six different scenarios concerning an expansion and a compression of morbidity as well as increasing recovery and effective prevention, we can show the possible future range of disease burden and, by disentangling the effects, reveal the significant differences between the various diseases in interaction with the demographic components.
Results
Our results indicate that, although strongly age-related diseases like dementia or heart failure show the highest relative increase rates, diseases of the musculoskeletal system, such as dorsal pain and arthrosis, will still be responsible for the majority of the German population’s future disease burden in 2060, with about 25–27 and 13–15 million patients, respectively. Most importantly, for almost all considered diseases a significant increase in burden of disease can be expected even in case of a compression of morbidity.
Conclusion
A massive caseload is emerging for the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point of view.
Background
The development of future patient numbers is an important concern for many stakeholders in health systems. Rational decisions about the planning of hospital capacities, pharmaceutical investments, career choices of (future) healthcare professionals as well as the development of future health care expenditures themselves depend on precise knowledge of the future development of specific diseases.
Germany is one of the fastest ageing countries in the world due to constantly low fertility rates since the 1970s and a continuously increasing life expectancy. In the literature there are rival theories and hypotheses on how an increasing life expectancy will affect the disease burden and the related health care expenditure. Gruenberg (1977) [1] and Verbrugge (1984) [2] hypothesise that a rising longevity goes hand in hand with an increase in years spent in illness and therefore with an expansion of morbidity in older age groups. In contrast, Fries (1980) [3] assumes that an increasing life expectancy leads to a compression of morbidity. Given these somewhat contradictory hypotheses, the influence of proximity to death and treatment spending as a function of remaining life expectancy are controversially discussed among health economists [4,5,6,7].
However, even less evidence exists today concerning the (more epidemiological) question of specific diseases’ future development in the light of the different hypotheses. A systematic literature review on PubMed searching for projections (or synonyms) in the context of demography and using the keywords prevalence, incidence or burden of disease for specific or chronic non-infectious diseases in general yields 160 relevant publications. The studies fall into three categories by projection methodology: trend extrapolations (99/160), multi-state models (57/160) and studies using both methodologies (4/160). In 54 of the 103 studies that use trend extrapolation (alone or in combination), current prevalence or incidence rates are simply transferred to population projections, which precludes a specific modelling of the various hypotheses. This so-called status quo analysis is also commonly used in projections of health expenditures^{Footnote 1}. Of the 61 studies that use multi-state modelling (alone or in combination), 17 (17/61) are based on the classical structure of an illness-death model (even if only 7 explicitly define it that way). However, only nine of the studies (9/61) focus on an explicit modelling of a compression of morbidity, eight of them (8/9) related to dementia. Furthermore, just seven studies (7/61) compare the development of more than two different diseases, and only one of them models compression scenarios [9] (see the appendix for more detailed information and results on the systematic database search).
In our paper, we present projections for ten common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure). The selected diseases represent the intersection between the most common and most expensive disease patterns in Germany [10]. For the projections we use a time-discrete Markov illness-death model with recovery. Our model allows us to consider the different hypotheses in the context of demographic transition and to quantify the influence of potentially changing variables (disease-specific survival, incidence and recovery rate) on the future frequency of diseases. In addition, we show the influence of successful prevention on the long-term prevalence of the different diseases.
The population-related components used for modelling stem from Destatis, the German Federal Statistical Office, whereas the disease-specific components are computed on the data of a major sickness fund covering approximately four million insureds during the period from 2009 to 2017. Our data set is unique as we calculated the input data ourselves using disease-specific validation criteria selected for this purpose (shown in section Dataset). Hence, our study is one of the few that use insurance data (7/160), although the resulting treatment prevalence is of particular importance for decision makers and payers in the health care system. Data sources from other studies of the systematic literature review are surveys or other epidemiological studies (61/160), a literature review for the different input factors (34/160), registries (28/160) or mixed data sources (30/160).
The paper is organised as follows: we start with the presentation of our time-discrete Markov illness-death model with recovery as well as our data set. Then, we show our results for the future development of the ten diseases (average prevalence rates and number of patients) in different populations and scenarios, also considering the results of other publications. This is followed by a discussion of the results in view of the current state of research and the limitations, finishing with a concluding summary.
Methods
Markov illness-death model with recovery
We will calculate the future number of patients and the future average prevalence rates for the total population from 2018 to 2060^{Footnote 2} using a time-discrete Markov illness-death model with recovery. The model is based on the cohort-component method [11], which is widely used for (official) population projections. Regarding epidemiologic modelling, it can be traced back to the work of Fix & Neyman (1951) [12] and is closely related to the models of Manton et al. (1984), Brookmeyer et al. (1998), Brinks et al. (2012), and Andersson et al. (2015) [13,14,15,16], but differs in the level of detail of the rich routine data set used. The specific cohort data by age and gender with corresponding detailed diagnoses allow us to vary different variables over time (future development of the disease-specific survival rate, incidence rate and recovery rate). In contrast to most other studies using an illness-death approach (16/17), including the work of Milan & Fetzer (2019) [17], on which our modelling is based, the model also includes the possibility of recovery.
The starting point of our model is the number of patients P_{a,g,T} (differentiated by age a between 0 and 100 and gender g, male or female) in our starting year T. It results from the prevalence rate p_{a,g,T} multiplied by the cohort size K_{a,g,T}.
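Written out with the symbols defined in this paragraph, the starting relation reads as follows (a reconstruction from the prose, since the typeset equation is not reproduced here):

```latex
P_{a,g,T} = p_{a,g,T} \cdot K_{a,g,T}
```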
In models extrapolating current prevalence rates (status quo analysis) p_{a,g,T} is assumed to be constant over time and only the future cohort sizes determine the future development of patients. In contrast to this, for all following years, age- and gender-specific incidence and recovery rates as well as the mortality rates of patients are used in our model to calculate the (future) number of patients P_{a,g,T + t}. At this point we distinguish between two groups of patients: the surviving patients of the previous year \( {\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) and the newly diseased patients I_{a,g,T + t}.
In order to calculate the surviving patients of the previous year \( {\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) we use the disease-specific mortality difference md_{a − 1,g,T + t − 1}, which is subtracted from the survival rate of each cohort sr_{a − 1,g,T + t − 1}^{Footnote 3}. We also consider disease-specific recovery rates r_{a − 1,g,T + t − 1} as follows^{Footnote 4}:
To determine the number of new patients I_{a, g, T + t}, the number of surviving non-diseased from the previous year is calculated as follows in a first step:
In a second step the number of new patients I_{a,g,T + t} is obtained by multiplying the age- and gender-specific incidence rate i_{a,g,T + t} by the surviving non-diseased from the previous year:
The total number of patients P_{T + t} in all years T + t is finally calculated as:
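The four relations described in the preceding steps can be sketched as follows. This is a reconstruction from the prose, not the original typeset equations. In particular, the multiplicative recovery term (1 − r) is an assumption; the text only states that recovery rates are "considered".

```latex
\begin{align}
D_{a,g,T+t}^{T+t-1} &= P_{a-1,g,T+t-1}\,\bigl(sr_{a-1,g,T+t-1} - md_{a-1,g,T+t-1}\bigr)\,\bigl(1 - r_{a-1,g,T+t-1}\bigr) \\
ND_{a,g,T+t}^{T+t-1} &= K_{a-1,g,T+t-1}\, sr_{a-1,g,T+t-1} - D_{a,g,T+t}^{T+t-1} \\
I_{a,g,T+t} &= i_{a,g,T+t}\, ND_{a,g,T+t}^{T+t-1} \\
P_{T+t} &= \sum_{g}\sum_{a}\bigl(D_{a,g,T+t}^{T+t-1} + I_{a,g,T+t}\bigr)
\end{align}
```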
In our model, for all years T + t the future cohort sizes K_{a,g,T + t} as well as the future survival rates sr_{a,g,T + t} of the total population are derived from a population projection, which we calculate via the cohort-component method. Within this framework we consider the disease-specific components. The calculation of the survival rate of the patients as the difference sr_{a − 1,g,T + t − 1} − md_{a − 1,g,T + t − 1} and of the surviving non-diseased \( {\boldsymbol{ND}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) as the difference between all survivors of the cohort and the surviving patients from the previous period finally merges the population projection with the epidemiological developments. Thus, the design of our model also allows the use of input data from any other population projection and/or disease-specific statistic. This time-discrete approach is also more intuitive to understand for a broader audience, such as policy makers and health care decision makers.
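A minimal sketch of one projection step for a single age and gender cohort may help to make the mechanics concrete. The class and function names are illustrative, the numbers in the usage note are invented, and the multiplicative recovery term is an assumption inferred from the prose rather than a confirmed detail of the model.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """State of one age/gender cohort in a given year (illustrative names)."""
    size: float      # cohort size K
    patients: float  # number of patients P within the cohort


def project_one_year(cohort, sr, md, r, i):
    """Advance one cohort by one year.

    sr: survival rate of the whole cohort
    md: disease-specific mortality difference (patients survive with sr - md)
    r:  recovery rate
    i:  incidence rate among the surviving non-diseased

    ASSUMPTION: recovery acts multiplicatively on surviving patients,
    i.e. D = P * (sr - md) * (1 - r).
    """
    survivors = cohort.size * sr                    # all survivors of the cohort
    d = cohort.patients * (sr - md) * (1.0 - r)     # surviving, non-recovered patients
    nd = survivors - d                              # surviving non-diseased (incl. recovered)
    new_patients = i * nd                           # newly diseased patients
    return Cohort(size=survivors, patients=d + new_patients)
```

For example, `project_one_year(Cohort(1000.0, 100.0), sr=0.98, md=0.03, r=0.05, i=0.02)` moves a hypothetical cohort of 1000 people with 100 patients one year forward. Summing `patients` over all cohorts would give the total number of patients in that year.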
Dividing the total number of patients by the total number of the population results in the average prevalence rate of the total population, apr, which we will present in addition to the total number of patients in the results section. Obviously, the apr highly depends on the share of the elderly and diseased within the total population. As the German demographic transition leads to an increasing proportion of elderly cohorts, we call this the cohort effect, which can also be observed in models extrapolating current prevalence rates using the status quo analysis.
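The cohort effect can be illustrated with a small sketch in which the age-specific prevalence rates are held constant and only the age structure changes. The function name and all numbers are hypothetical and chosen purely for illustration.

```python
def average_prevalence_rate(cohort_sizes, prevalence_rates):
    """apr = total number of patients / total population."""
    patients = sum(k * p for k, p in zip(cohort_sizes, prevalence_rates))
    return patients / sum(cohort_sizes)


# Hypothetical prevalence rates for a "young" and an "old" age group,
# held constant as in a status quo analysis.
rates = [0.05, 0.30]

apr_today = average_prevalence_rate([800.0, 200.0], rates)  # younger population
apr_aged = average_prevalence_rate([600.0, 400.0], rates)   # older population
```

With identical age-specific rates, the older population has the higher apr, which is the cohort effect in isolation.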
As for the further effects of our model, we will take a closer look at the future age- and gender-related prevalence rate p_{T + 1}, which can be obtained by dividing the number of patients (eqs. 2 to 5) by the total corresponding cohort K_{a,g,T + t} = K_{a − 1,g,T + t − 1}sr_{a − 1,g,T + t − 1} and therefore is independent of future cohort sizes:
For reasons of simplicity we use time-independent incidence, recovery and mortality rates and omit the indices of age and gender in eq. (7). The total derivative can be used to determine the impact of changing incidence, recovery and mortality rates on the prevalence in year T + 1.
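Under these simplifications, the one-step prevalence relation can be reconstructed from the verbal description of the model. The sketch assumes that recovery enters multiplicatively as (1 − r); the original typeset equation may be arranged differently. Surviving patients number p_T (sr − md)(1 − r) per person of the previous cohort, new patients are the share i of the remaining survivors, and dividing by the surviving cohort share sr gives:

```latex
p_{T+1} = \frac{p_T\,(sr - md)\,(1 - r)\,(1 - i)}{sr} + i
```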
In our model specification, the variables p_{T}, sr, md, r and i can take on values between 0 and 1 and the disease-specific mortality difference md is less than (or, in theory, equal to) the survival rate of the entire population sr. As eq. (8) shows, a higher prevalence rate p in year T leads to a higher prevalence rate in year T + 1. The theoretical one-to-one impact of this effect is lowered by the degree of the incidence and recovery rate as well as the disease-specific mortality difference.
An increase of the survival rate sr initially leads to an increase in both the diseased and the non-diseased population. In conjunction with the incidence rate i, a positive impact on the prevalence rate in year T + 1 can be observed as the rising survival rate leads to a higher “at-risk” population. In contrast to this, a higher mortality difference md leads to a decline in the prevalence rate in year T + 1. Both effects combined can be interpreted as follows: The smaller the difference in mortality between the diseased and non-diseased, the higher the positive impact of an increasing survival rate.
The influence of the recovery rate is negative and linked to the life expectancy of the patients. The more patients survive until the following year, the more can recover again. However, the higher the incidence rate and thus the proportion of new patients, the lower the proportion of persons who could potentially recover, which mitigates the negative effect of the recovery rate.
Considering the impact of increasing incidence rates also offers a connection between the incidence and the recovery rate. A higher proportion of recovered people leads to a higher “at-risk” population. The opposite effect results from a higher prevalence rate in year T which comes along with a lower “at-risk” population.
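The direction of these effects can be checked numerically with finite differences. The sketch below uses a reconstructed one-step prevalence relation that assumes multiplicative recovery. The function names and the base values are hypothetical, so only the signs of the results are meaningful.

```python
def next_prevalence(p, sr, md, r, i):
    """One-step prevalence rate (reconstruction; ASSUMES multiplicative recovery)."""
    return p * (sr - md) * (1.0 - r) * (1.0 - i) / sr + i


def sensitivities(base, eps=1e-6):
    """Central finite differences of next_prevalence with respect to each input."""
    result = {}
    for name in base:
        up = dict(base)
        up[name] += eps
        down = dict(base)
        down[name] -= eps
        result[name] = (next_prevalence(**up) - next_prevalence(**down)) / (2 * eps)
    return result


# Hypothetical base values, not taken from the data set.
base = {"p": 0.10, "sr": 0.98, "md": 0.03, "r": 0.05, "i": 0.02}
s = sensitivities(base)
```

In this sketch, the entries of `s` for `p`, `sr` and `i` are positive and those for `md` and `r` are negative, and the effect of the previous prevalence stays below one-to-one.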
Scenarios
Regarding the effects outlined above, a change of one variable will always affect the future prevalence in interaction with the other components. To illustrate these effects and the sensitivity of the model, we model six scenarios of changing disease-specific variables md_{a,g}, i_{a,g} and r_{a,g} for each of the ten diseases up to 2060, especially regarding the different hypotheses of expansion and compression of morbidity (see Table 1). In all scenarios we assume increasing survival rates sr_{a,g} according to the moderately increasing life expectancy scenario L2 [18].
In the first scenario, we hold all disease-specific variables constant over the time horizon. However, the assumption of a constant mortality difference and rising survival rates (sr_{a,g,T + t} > sr_{a,g,T + t − 1}) leads to an increase in life expectancy of both the non-diseased and the diseased. In conjunction with constant incidence rates (i_{a,g} = const), this results in an increasing duration of disease. Thus, the scenario Expansion 1 can be interpreted as a type of expansion of morbidity hypothesis. This scenario serves as our baseline scenario in the following. The scenario Expansion 2 is a more extreme scenario of the expansion of morbidity hypothesis, assuming an additional 30% increase in incidence rates until 2060 (i_{a,g,T + t} > i_{a,g,T + t − 1}).
The compression of morbidity hypothesis is considered in two different scenarios: In the scenario Compression 1 only the healthy population benefits from the increasing life expectancy (\( {\boldsymbol{sr}}_{\boldsymbol{a},\boldsymbol{g}}^{\boldsymbol{D}}=\boldsymbol{const} \)) which leads to a continuous increase in the mortality difference between the diseased and the healthy population. In the scenario Compression 2 a shift of diseased cases in relation to increasing life expectancy is modelled which is in line with the “traditional” compression of morbidity hypothesis and leads to continuously decreasing incidence rates (i_{a,g,T + t} < i_{a,g,T + t − 1}).
To highlight the long-term impact of effective prevention programmes, a scenario Prevention is modelled with temporarily decreasing incidence rates (i_{a,g,T + t} < i_{a,g,T + t − 1}), falling by up to 30% until 2035. In order to simulate possible effects of better medical care, e.g. due to disease management programmes, the scenario Extended Recovery assumes recovery rates that increase by up to 50% until the year 2060 (r_{a,g,T + t} > r_{a,g,T + t − 1}).
Interestingly (and as discussed in the section on the total differential of the prevalence rate), the total effect of the scenarios Compression 1 and 2 as well as of the scenarios Extended Recovery and Prevention on the future (age- and gender-related) prevalence rate is not defined a priori and depends on the numerical ratio of disease-related input data and the increase of survival rates.
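The rate paths of the scenarios Expansion 2, Prevention and Extended Recovery can be sketched with a simple helper. The linear phase-in, the start year 2018 and the constant level after the end year are assumptions made for illustration; the text only gives the size of the change and its target year. The function name and base rates are hypothetical.

```python
def scaled_rate(base_rate, year, start_year, end_year, total_change):
    """Phase in a relative change linearly between start_year and end_year.

    ASSUMPTION: linear path, and the rate stays at its final level after end_year.
    total_change is relative, e.g. 0.30 for +30% or -0.30 for -30%.
    """
    if year <= start_year:
        return base_rate
    progress = min(1.0, (year - start_year) / (end_year - start_year))
    return base_rate * (1.0 + total_change * progress)


# Hypothetical base rates for one cohort.
base_incidence = 0.02
base_recovery = 0.10

expansion2_2060 = scaled_rate(base_incidence, 2060, 2018, 2060, 0.30)   # incidence +30% by 2060
prevention_2035 = scaled_rate(base_incidence, 2035, 2018, 2035, -0.30)  # incidence -30% by 2035
prevention_2060 = scaled_rate(base_incidence, 2060, 2018, 2035, -0.30)  # held constant afterwards
recovery_2039 = scaled_rate(base_recovery, 2039, 2018, 2060, 0.50)      # halfway to +50%
```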
Dataset
The average disease-specific input data for each cohort and gender^{Footnote 5} derive from a routine dataset of around four million insureds of the AOK Baden-Württemberg from 2009 to 2017^{Footnote 6}. Due to this large number of people insured by the AOK in Baden-Württemberg, this population is approximately representative of the German population regarding the disease rates within the age cohorts. Table 2 shows the specific selection criteria for each of the ten diseases. Since there are no coding guidelines for outpatient diagnoses in Germany, we use the criteria of the AOK Research Institute published in various reports [19,20,21,22]. The M2Q^{Footnote 7}/M3Q criterion, for instance, only defines patients as diseased if they have a confirmed diagnosis in at least two or three out of four quarters of the year, respectively. Inpatient primary and secondary diagnoses are included without additional validation criteria. We complete missing data by the following procedure: If the selection criteria are satisfied the year before and the year after, insureds are classified as patients also in the incompletely coded year. Patients are classified as “new patients” when they fail to fulfil the prevalence criteria in any of the four previous years. The days of insurance of the patients identified by diagnosis are then set in relation to those of all insureds to calculate period prevalence p_{a,g} and cumulative incidence i_{a,g} for the years 2015 to 2017 [24]. For pulmonary cancer we use a five-year pre-observation period for the derivation of the incidence. To take into account the periodic character of depression, we use additional selection criteria for new cases and divergent diagnoses to determine prevalence and incidence.^{Footnote 8}
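The classification rules described above can be sketched as three small functions. This is an illustration under simplifying assumptions, not the actual selection logic: the function names are hypothetical, diagnoses are reduced to one boolean per quarter or year, and the disease-specific criteria of Table 2 are not modelled.

```python
def meets_mq_criterion(confirmed_quarters, min_quarters=2):
    """M2Q (min_quarters=2) or M3Q (min_quarters=3): confirmed diagnosis
    in at least min_quarters of the four quarters of a year."""
    return sum(bool(q) for q in confirmed_quarters) >= min_quarters


def fill_single_year_gaps(yearly_flags):
    """Classify a year as diseased if the criteria are met in the year
    before and the year after (based on the original, unfilled flags)."""
    filled = list(yearly_flags)
    for y in range(1, len(yearly_flags) - 1):
        if not yearly_flags[y] and yearly_flags[y - 1] and yearly_flags[y + 1]:
            filled[y] = True
    return filled


def is_new_patient(yearly_flags, year_index, lookback=4):
    """New patient: criteria met in this year but in none of the
    `lookback` previous years."""
    if year_index < lookback:
        raise ValueError("not enough pre-observation years")
    previous = yearly_flags[year_index - lookback:year_index]
    return bool(yearly_flags[year_index]) and not any(previous)
```

For pulmonary cancer, the five-year pre-observation period would correspond to `lookback=5` in this sketch.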
For the calculation of recovery rates r_{a,g} all surviving patients without a coded diagnosis in the following years are set in relation to the total of all surviving patients. For the definition of recovery we use a four-year follow-up period for diseases with realistic cure probabilities (dorsal pain, depression and CVD) and a five-year follow-up period for pulmonary cancer. The maximum follow-up period of 8 years is used for all other diseases since there are still no cure possibilities available for their most common manifestations. Since dementia is (as of yet) characterized by an irreversible disease progression, no recovery rates are considered in these calculations^{Footnote 9}. For chronic diseases, the recovery rates are to be interpreted as being symptom-free. A recurrence of the disease after years of asymptomatic illness is taken into account by the incidence rate. For each cohort, we calculate mortality differences md_{a,g} as the difference between the 1-year survival rates of the diseased and all insureds in a given year and subtract them from the German population’s survival probability sr_{a,g} as described above^{Footnote 10}. Table 3 shows the population-weighted input data as average values for different age groups and as the overall average in the base year 2018 for each disease, with gender-specific values (female vs. male) in parentheses. In addition, Table 4 illustrates the demographic characteristics of the study population as average values of all years analyzed in millions and as percentage compared to those of the entire German population in 2018.^{Footnote 11}
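The derivation of the recovery rate and the mortality difference can be sketched as follows. The function names and the numbers in the usage note are hypothetical, and the sign convention (survival of all insureds minus survival of the diseased, so that md is positive when patients die more often) is an assumption inferred from the way md is subtracted from sr.

```python
def recovery_rate(surviving_without_diagnosis, surviving_total):
    """Share of surviving patients with no coded diagnosis in the follow-up period."""
    return surviving_without_diagnosis / surviving_total


def mortality_difference(survival_all_insured, survival_diseased):
    """md: 1-year survival of all insureds minus that of the diseased.

    ASSUMPTION: sign chosen so that md > 0 when patients have lower survival.
    """
    return survival_all_insured - survival_diseased


def patient_survival(sr_population, md):
    """Survival rate applied to patients: population survival minus md."""
    return sr_population - md
```

For instance, with a hypothetical survival of 0.97 among all insureds and 0.93 among the diseased, `mortality_difference(0.97, 0.93)` is about 0.04, and a population survival rate of 0.98 would then give a patient survival of about 0.94.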
In order to derive the (future) cohort sizes K_{a,g} and survival rates sr_{a,g}, we build different population projections based on input data from Destatis and statistics of mortality.org. Our starting point is a Stationary Population with constant absolute births and constant life expectancy, which separates the effects resulting from disease-specific (epidemiological) components from the effects of the composition of future cohort sizes on the apr. In our second population projection, Population (LE constant), we abstract from a further increase in life expectancy. This projection is based on the German population in 2018 under the assumption of a fertility rate of 1.55 children per woman of fertile age. For our third population projection, Standard Population (LE increasing), we further assume an increase of life expectancy from 83.3 to 88.1 years at birth for women and 78.5 to 84.4 at birth for men according to the moderate increase scenario L2 of the 14th population projection [18]. Migration is not taken into account, as too little is known about whether disease rates of the German population are transferable to migrants [27, 28]. Hence, the Standard Population (LE increasing) represents an absolute decline in population from 83.0 to 66.2 million by 2060, accompanied by an increase in the old-age dependency ratio from 35.9 to 69.7%.^{Footnote 12} However, for reasons of comparability with other studies, we build a fourth population projection, Population (Migration), where future migration is integrated according to the scenario W2 of the 14th population projection [18].^{Footnote 13} In this case the total population is 79.1 million people in 2060 and the old-age dependency ratio is 58.8%.
Results
The presentation of our results starts in Table 4 with a comparison of the average prevalence rates apr (i.e. the total number of patients divided by the total number of the population) in the years 2018 and 2060 under the assumption of constant disease-specific variables over the time horizon. We use the three different population projections Stationary Population, Population (LE constant) and Standard Population (LE increasing) to separate the effects resulting from disease-specific (epidemiological) components and those occurring from the demographic components (initial population structure and increasing life expectancy). The values resulting from Standard Population (LE increasing) correspond to the baseline scenario Expansion 1.
The results show a high increase in the apr for strongly age-related diseases like dementia, heart failure or CVD, with the ageing of the German population due to its current structure (Population (LE constant)) and rising life expectancy being the key factors driving the large growth rates. The ratio of people with dementia could more than double by 2060 within the Standard Population (LE increasing). In contrast, the increase of the apr of dorsal pain is mainly driven by the epidemiological effect. Regarding arthrosis and COPD, the increase of apr can be attributed to both the epidemiological and the demographic effects. The smallest increase of apr emerges for diabetes and depression. For both, the epidemiological effect is comparatively low. However, an increase in the average prevalence rate is to be expected for all diseases given the baseline scenario Expansion 1. Even when abstracting from an increasing life expectancy, the ageing of the German population in conjunction with the epidemiological effects will lead to a substantial increase of all diseases.
Figure 1 presents the results for the apr in the year 2060 that occur under the different model scenarios (see Table 1) as well as under a simple extrapolation of age- and gender-related prevalence rates for the population of 2060 (status quo (SQ) principle). For this purpose, we use the Standard Population (LE increasing). The y-axis of Fig. 1 shows the relative change of the apr between 2018 and 2060 whereas the x-axis displays the value of the apr for the different scenarios in 2060. Additionally, the x-axis depicts the values of the apr in 2018.
As a first result, Fig. 1 illustrates that the ranking of the ten diseases with respect to the value of the apr in 2060 is the same as in 2018, even though the relative change of the apr differs significantly between the ten diseases. That means that dorsal pain and arthrosis are expected to be the two major diagnoses in 2060, although e.g. dementia shows a significantly higher change in the apr in all scenarios.
Second, the results show a different impact of the rival hypotheses regarding the consequences of increasing life expectancy on future disease burden: The expansion of morbidity scenarios Expansion 1 and 2 lead to a steep increase of all diseases compared to the other scenarios. Especially the scenario Expansion 2 (with an assumed increase of the incidence rate by 30% until 2060) shows a strong increase of the apr. For strongly age-related diseases such as dementia, CVD or HF, the Compression 2 scenario (shifting the incidence to higher age groups) has a stronger impact on the apr than the Compression 1 scenario, in which the life expectancy for patients is constant over time and only the healthy population benefits from the increasing life expectancy. Yet even in the compression of morbidity scenarios, an increase in all the common diseases can be expected. In other words: The increase in burden of disease due to increasing life expectancy and high incidence rates in older age groups can be mitigated but not fully compensated by a compression.
The assumption of continuously rising recovery rates (scenario Extended Recovery) has an even smaller impact on future apr, although this is also attributable to the low chances of recovery for the considered diseases in general. Only for depression would an increasing recovery rate lead to a constant prevalence rate in the long term. A diminishing effect on future long-term prevalence for all diseases can only be seen in the scenario Prevention. For diabetes and depression, the Prevention scenario even leads to a small decline in the apr. This highlights the importance of effective prevention regarding the upcoming demographic transition.
At first glance, a (simple) extrapolation of current prevalence rates should range between the expansion and compression scenarios; however, our results show that this is not true for all diseases. In particular, for dorsal pain, arthrosis, COPD, and cancer the status quo principle leads to an apr in 2060 which is smaller than in the Prevention scenario. Hence, our results show a wide range of future developments of the different diseases depending on the chosen parameters for modelling.
Table 5 shows the absolute results of the projection for 2040 and 2060. As the Standard Population (LE increasing) neglects future migration, the total number of people in Germany will decline between 2040 and 2060. Thus, for most scenarios and diseases the total numbers of patients are higher in 2040 than in 2060. However, the results given in parentheses for the projection Population (Migration) show the opposite effect. Here we assume identical disease-related input data for migrants.
All in all, our calculations show that all of the ten diseases are expected to increase up until 2060: Diseases of the musculoskeletal system like dorsal pain and arthrosis will be responsible for the majority of the future disease burden within the German population, possibly affecting about 25–27 and 13–15 million people, respectively, by 2060. Diabetes, which is closely related to other diseases like CHD, is expected to impact at least 9.5 million patients in case of expanding morbidity. With up to 7.4 million people affected in 2060, CHD will continue to be the most common cardiovascular disease. The growth of primarily age-related diseases such as CVD or HF is also steep in absolute terms. Only if prevention strategies are successful could the significant increase in the number of patients be alleviated in the long run.
Our results can be compared with other recent studies for Germany. Of the 16 (16/160) studies for Germany in our literature review (concerning our ten most common non-infectious diseases) only six (6/16) were published in the last 5 years, focusing on cancer (3/6), dementia (2/6) or diabetes (1/6)^{Footnote 14}. For diabetes, Tönnies et al. (2019) [29] calculate with the help of an illness-death model and under the assumption of constant incidence rates a higher number of 11.0 million patients for 2040. The discrepancy to our projection (10.3 million) for 2040 is probably due to their older input data, which stem from 2010. The most recent study on dementia by Alzheimer Europe (2020) [30] projects 2.7 million patients for 2050 with a status quo projection, which lies within the interval of our forecast of 2.5 to 3.0 million people affected. Milan & Fetzer (2019) [17] project 2.6 to 3.3 million dementia patients for 2060 by using the same model. The slight differences to their results are attributable to more recent population statistics and disease-specific input data. A comparison of our results with the three studies focusing on cancer is difficult as two of them consider the disease pattern of lung cancer and take a short-term perspective (up to the year 2020), whereas the third focuses on a trend projection of incidence rates (see Fig. 2, Fig. 3 and Table 6 in the appendix for more detailed information and results on the systematic database search).
Discussion
A projection of ten common non-infectious diseases in concurrent scenarios based on a rich and consistent data set expands the literature on the development of future disease burden in light of the demographic transition. In this context, ours is one of the few studies using an illness-death approach with recovery and modelling compression of morbidity and prevention scenarios. Furthermore, due to its time-discrete specification, our model could be directly linked to any (official) population projection, and therefore adapted by institutions in the field of policy consulting.
In contrast to a naïve extrapolation (status quo principle), our analysis highlights the importance of focusing on the interdependence between demographic and disease-specific components in projecting future disease burden. Based on six different scenarios we show the possible future range of disease burden and reveal the large differences between the various diseases in interaction with the demographic components. Considering these differences, it becomes clear that the extrapolation of prevalence rates can only reflect the cohort effect caused by population structure and not epidemiologically induced changes in the burden of disease, as observed e.g. for dorsal pain. In contrast, for CHD the status quo projection ranges, as expected, between the compression and expansion scenarios due to minor epidemiological influences.
With regard to the probability of the different hypotheses on future disease burden, the evidence remains inconclusive. Chatterji et al. (2015) [31] show with their detailed review of studies across the world how much the results vary for observed compression or expansion in recent years. However, looking only at the prevalence of chronic diseases (and not, e.g., at quality of life) resulted more frequently in an expansion. Considering diseases very similar to those in our study in connection with proximity to death, Beltrán-Sánchez et al. (2016) [32] show for the United States that those who died in recent times had a higher prevalence of chronic diseases in periods far from death, especially of those chronic diseases with low mortality and high frequency.
Interestingly, even in international studies there are only a few projections for the two major common diseases dorsal pain and arthrosis (1/160 dorsal pain, 10/160 arthrosis or joint replacement procedures), although these diseases are expected to increase the most in total numbers of patients according to our calculations. Our results can be compared with those of Kingston et al. (2018) [33], who use a population sample to model multimorbidity and prevalence of similar diseases for over-65-year-olds in England until 2035. In line with our findings, they predict a significant increase for all diseases considered except depression, but with the largest increases for cancer, diabetes and respiratory diseases. Similarly, the only study that also compares different compression scenarios, albeit with regard to disability due to similar diseases in the UK, by Jagger et al. (2006) [9], concludes that improvements in population health cannot fully compensate the effect of population ageing and that there will still be an increase in the number of older people with disabilities.
Of course, our results are also subject to limitations. The Markov assumption of the illness-death model implies that the transition probabilities depend only on the current state and are not influenced by past events. Relaxing this assumption would require complex long-term studies, e.g. on the probability of relapse after a successful recovery, which are not available for such a large number of insured persons. However, regarding the fit with observed incidence or prevalence rates, multistate models used in a retrospective analysis of epidemiological study data (in contrast to regression models) score well [34, 35].
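To make the Markov assumption concrete, the following is a minimal sketch of a one-year step for a single cohort in a three-state illness-death model with recovery. It is a simplified illustration, not the calculation used in the study: the function and parameter names, the order of events within the year, the use of the diseased survival rate for new cases, and all numbers are assumptions. Only the treatment of recovered persons (diseased survival rate in the year of recovery) follows Note 4.

```python
def step(healthy, diseased, incidence, recovery, sr_healthy, sr_diseased):
    """Advance one cohort by one year; returns (healthy, diseased, deaths).

    All transition probabilities depend only on the current state, never on
    how long or how often a person has been ill (Markov assumption).
    """
    new_cases = healthy * incidence   # healthy -> diseased
    recovered = diseased * recovery   # diseased -> healthy
    # Assumption: state changes happen first, survival is applied afterwards.
    # New cases are assumed to face the diseased survival rate as well.
    next_diseased = (diseased - recovered + new_cases) * sr_diseased
    # Recovered persons still face the diseased survival rate this year.
    next_healthy = (healthy - new_cases) * sr_healthy + recovered * sr_diseased
    deaths = healthy + diseased - next_healthy - next_diseased
    return next_healthy, next_diseased, deaths


# Hypothetical cohort of 1,000 persons (illustration only).
h, d, deaths = step(healthy=900.0, diseased=100.0, incidence=0.02,
                    recovery=0.10, sr_healthy=0.99, sr_diseased=0.95)
print(round(h, 2), round(d, 2), round(deaths, 2))
```

A person who recovers and later falls ill again is treated exactly like a first-time case in this setup, which is the simplification the limitation above refers to.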
Even if our discrete model has certain advantages, modelling in discrete time may overestimate epidemiological effects. By comparing the results of a discrete-time model with those of a continuous model, Brinks & Landwehr (2014) [36] show that a projection in discrete time can overestimate future prevalence. However, the authors also state that smaller projection intervals lead to smaller deviations. Our chosen one-year interval leads to an overestimation of about 10% in their model.
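The direction of this effect can be illustrated with a deliberately simple toy example. It assumes a constant incidence hazard, no mortality and no recovery, so it is neither the model of Brinks & Landwehr nor ours, and it does not reproduce the 10% figure. In this toy setting, stepping forward in discrete intervals overstates prevalence relative to the continuous-time solution, and the deviation shrinks as the interval gets shorter. The hazard of 0.05 per year and the ten-year horizon are hypothetical.

```python
import math


def exact_prevalence(hazard, years):
    """Continuous-time solution of dp/dt = hazard * (1 - p) with p(0) = 0."""
    return 1.0 - math.exp(-hazard * years)


def discrete_prevalence(hazard, years, step):
    """Same process advanced in fixed steps of length `step` (in years)."""
    p = 0.0
    for _ in range(round(years / step)):
        p += hazard * step * (1.0 - p)
    return p


hazard, years = 0.05, 10  # hypothetical values, illustration only
exact = exact_prevalence(hazard, years)
errors = {step: discrete_prevalence(hazard, years, step) - exact
          for step in (1.0, 0.5, 0.25)}

for step, err in errors.items():
    print(f"step={step:<4} overestimation={err:.5f}")
```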
Nonetheless, this overestimation might be partly offset by the conservative estimates that result from using insurance data, which constitutes another limitation of our approach. Insurance or routine data is primarily collected for invoicing medical services when patients visit a physician. Thus, the resulting prevalence and incidence rates can only be interpreted as treatment rates and are usually slightly lower than those obtained from surveys. In conjunction with the required validation procedures, the actual population incidence could be underestimated. Because coding is incomplete for some diseases, it is also questionable whether the documented onset of illness corresponds to the real date of incidence.
A third limitation could be our data set: the rates determined from the AOK Baden-Württemberg might differ from those of the total German population. However, regarding gender-specific differences or frequencies in older cohorts, which are particularly relevant for this analysis, various studies indicate that large AOK data sets are representative [37,38,39].
Further insights could be obtained by including multimorbidity in our model^{Footnote 15}. Comorbidity analyses could also provide more detailed insights into the causes of mortality differences, which would help to narrow the range of possible future scenarios. Despite the limitations mentioned, our results can offer an important guide to rational decisions in health care, especially because of the timeliness and level of detail of the data used. Although strongly age-related diseases such as dementia or heart failure show the highest relative increase rates, the enormous prevalence of musculoskeletal diseases and depression should not be ignored. Most importantly, for almost all diseases considered, a significant increase in the burden of disease can be expected even in the case of a compression of morbidity.
Conclusion
We think that our approach is useful for advising health care professionals and politicians in preparing for the upcoming pressure on health care capacities. As the current COVID-19 crisis is showing, health care capacities are quite scarce. Even in our most optimistic scenario, we would face the same pressure from chronic diseases, at least in numbers, as is currently experienced during the pandemic. The lesson from our analysis is clear: a massive caseload is approaching the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point of view.
Availability of data and materials
Population data is available from Destatis, Germany’s official statistical office, and from mortality.org. The aggregate claims data from the German sickness fund is available upon request, subject to the permission of the data donor.
Notes
 1.
See for example the Ageing Report published by the European Commission [8].
 2.
We chose the year 2060 as the end point of the projection as the official population projection of the German Federal Statistical Office also ends in 2060.
 3.
This mortality difference can be interpreted as the difference between the mortality rates of the diseased persons \( mr_{a,g}^{D} \) and the population \( mr_{a,g} \) or as the (reverse) difference between the corresponding survival rates \( sr_{a,g} \) and \( sr_{a,g}^{D} \).
 4.
In the respective year under consideration, we still assume the survival rate of the diseased for the recovered persons before they are transferred to the healthy population in the following year.
 5.
The exception is the age cohorts between 95 and 100 years, whose disease rates were determined in groups because of the relatively few data points.
 6.
The disease-specific input data is determined in the pseudonymised database environment of the AOK Baden-Württemberg via SQL scripts, so that only anonymised rates are used for the model calculations. Further calculations are executed using Microsoft Excel.
 7.
In Germany, this methodology is also used for allocating insured persons to risk groups as part of the morbidity-based risk-adjustment scheme in the Statutory Health Insurance [23].
 8.
Insured persons with the single diagnoses F34.1 or F38.1 (short depressive episodes) or an isolated outpatient diagnosis in the previous year are not excluded from the incidence calculation, in order to identify new cases with a documented beginning depressive episode in the pre-observation year.
 9.
However, for dementia we assume emerging recovery rates in the scenario Extended Recovery for reasons of comparability with the other diseases.
 10.
 11.
The group of 0–17-year-olds is left out because the considered diseases are very rare in these cohorts.
 12.
Age a is limited to between 0 and 100 years, and with regard to gender, a distinction is made between male and female cohorts. We model our own population projection as Destatis does not publish a scenario without a future shift in migration. For this purpose, we use the data of mortality.org to model the survival rate for persons older than 100 years and calibrate the data on the life tables published by Destatis for the L2 scenario. In a last step, we aggregate the numbers for all persons older than 100 years, as our disease-specific input data has only a few data points for cohorts of age 100 and older.
 13.
In line with the W2 scenario published by Destatis, we assume an average positive net migration of 220,000 persons and consider their composition of age groups published by Destatis.
 14.
The calculations of the studies mentioned must be compared with the results of the scenario Expansion 1 with consideration of migration, because in all studies the disease rates are also transferred to migrants.
 15.
For example, Kingston et al. (2018) [33] use a dynamic microsimulation model to project not only prevalence but also the number of diseases per patient, and predict an increase in complex multimorbidity with more than four diseases over the next 20 years.
Abbreviations
 CA:

Pulmonary, bronchial and tracheal cancer
 CHD:

Coronary heart disease
 COPD:

Chronic obstructive pulmonary disease
 CVD:

Cerebrovascular diseases
 HF:

Heart failure
 SQ:

Status quo
References
 1.
Gruenberg EM. The failures of success. The Milbank Memorial Fund quarterly. Health and Society. 1977;55:3–24. https://0doiorg.brum.beds.ac.uk/10.2307/3349592.
 2.
Verbrugge LM. Longer life but worsening health? Trends in health and mortality of middle-aged and older persons. The Milbank Memorial Fund quarterly. Health and Society. 1984;62:475–591. https://0doiorg.brum.beds.ac.uk/10.2307/3349861.
 3.
Fries JF. Aging, natural death, and the compression of morbidity. N Engl J Med. 1980;303:130–5. https://0doiorg.brum.beds.ac.uk/10.1056/NEJM198007173030304.
 4.
Fuchs VR. "though much is taken": reflections on aging, health, and medical care. The Milbank Memorial Fund quarterly. Health and Society. 1984;62:143–66.
 5.
Seshamani M, Gray AM. A longitudinal study of the effects of age and time to death on hospital costs. J Health Econ. 2004;23:217–35. https://0doiorg.brum.beds.ac.uk/10.1016/j.jhealeco.2003.08.004.
 6.
Zweifel P, Felder S, Meiers M. Ageing of population and health care expenditure: a red herring? Health Econ. 1999;8:485–96. https://0doiorg.brum.beds.ac.uk/10.1002/(sici)10991050(199909)8:6<485::aidhec461>3.0.co;24.
 7.
Breyer F, Lorenz N. The "Red Herring" after 20 Years: Ageing and Health Care Expenditures. Munich: CESifo Group; 2019.
 8.
European Commission. The 2018 ageing report: economic & budgetary projections for the 28 EU member states (2016–2070). Luxembourg: Publications Office; 2018.
 9.
Jagger C, Matthews R, Spiers N, Brayne C, Comas-Herrera A, Robinson T. Compression or expansion of disability?: forecasting future disability levels under changing patterns of diseases. London; 2006.
 10.
Federal Statistical Office. Federal Health Reporting. 2019. http://www.gbebund.de/gbe10/pkg_isgbe5.prc_isgbe?p_uid=gastg&p_aid=97182139&p_sprache=E. Accessed 15 Mar 2019.
 11.
Whelpton PK. An empirical model of calculating future population. J Am Stat Assoc. 1936;31:457–73. https://0doiorg.brum.beds.ac.uk/10.1080/01621459.1936.10503346.
 12.
Fix E, Neyman J. A simple stochastic model of recovery, relapse, death and loss of patients. Hum Biol. 1951:205–41.
 13.
Manton KG, Liu K. Projecting chronic disease prevalence. Med Care. 1984;22:511–26. https://0doiorg.brum.beds.ac.uk/10.1097/0000565019840600000002.
 14.
Brookmeyer R, Gray S, Kawas C. Projections of Alzheimer's disease in the United States and the public health impact of delaying disease onset. Am J Public Health. 1998;88:1337–42. https://0doiorg.brum.beds.ac.uk/10.2105/ajph.88.9.1337.
 15.
Brinks R, Tamayo T, Kowall B, Rathmann W. Prevalence of type 2 diabetes in Germany in 2040: estimates from an epidemiological model. Eur J Epidemiol. 2012;27:791–7. https://0doiorg.brum.beds.ac.uk/10.1007/s1065401297262.
 16.
Andersson T, Ahlbom A, Carlsson S. Diabetes prevalence in Sweden at present and projections for year 2050. PLoS One. 2015;10:e0143084. https://0doiorg.brum.beds.ac.uk/10.1371/journal.pone.0143084.
 17.
Milan V, Fetzer S. Die zukünftige Entwicklung von Demenzerkrankungen in Deutschland – ein Vergleich unterschiedlicher Prognosemodelle. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2019;62:993–1003. https://0doiorg.brum.beds.ac.uk/10.1007/s00103019029813.
 18.
Destatis. Bevölkerung Deutschlands bis 2060: Ergebnisse der 14. koordinierten Bevölkerungsvorausberechnung. Wiesbaden; 2019.
 19.
Günster C, Altenhofen L, editors. Versorgungs-Report 2011: Schwerpunkt: Chronische Erkrankungen. Stuttgart: Schattauer GmbH; 2011.
 20.
Günster C, Klose J, Schmacke N, editors. Versorgungs-Report 2012. Stuttgart: Schattauer GmbH; 2012.
 21.
Klauber J, Günster C, Gerste B, Robra BP, Schmacke N, editors. Versorgungs-Report 2013/2014: Schwerpunkt: Depression. Stuttgart: Schattauer GmbH; 2014.
 22.
Günster C, Klauber J, Robra BP, Schmacke N, Schmuker C, editors. Versorgungs-Report Früherkennung. Berlin: Medizinisch Wissenschaftliche Verlagsgesellschaft; 2019.
 23.
Drösler S, Garbe E, Hasford J, Schubert I, Ulrich V, van de Ven W, et al. Sondergutachten zu den Wirkungen des morbiditätsorientierten Risikostrukturausgleichs. Bonn; 2011.
 24.
Swart E. Health care utilization research using secondary data. In: Janssen C, Swart E, von Lengerke T, editors. Health care utilization in Germany: theory, methodology, and results. New York: Springer; 2014. p. 63–86.
 25.
Veronese N, Cereda E, Maggi S, Luchini C, Solmi M, Smith T, et al. Osteoarthritis and mortality: a prospective cohort study and systematic review with meta-analysis. Semin Arthritis Rheum. 2016;46:160–7. https://0doiorg.brum.beds.ac.uk/10.1016/j.semarthrit.2016.04.002.
 26.
Kuperman EF, Schweizer M, Joy P, Gu X, Fang MM. The effects of advanced age on primary total knee arthroplasty: a meta-analysis and systematic review. BMC Geriatr. 2016. https://0doiorg.brum.beds.ac.uk/10.1186/s1287701602154.
 27.
Schouler-Ocak M, Aichberger MC. Versorgung von Migranten. Psychother Psychosom Med Psychol. 2015;65:476–85; quiz 485. https://0doiorg.brum.beds.ac.uk/10.1055/s00341399972.
 28.
Robert Koch-Institut. Migration und Gesundheit: Schwerpunktbericht der Gesundheitsberichterstattung des Bundes. Berlin; 2008.
 29.
Tönnies T, Röckl S, Hoyer A, Heidemann C, Baumert J, Du Y, et al. Projected number of people with diagnosed type 2 diabetes in Germany in 2040. Diabet Med. 2019. https://0doiorg.brum.beds.ac.uk/10.1111/dme.13902.
 30.
Alzheimer Europe. Dementia in Europe Yearbook 2019: Estimating the prevalence of dementia in Europe; 2020.
 31.
Chatterji S, Byles J, Cutler D, Seeman T, Verdes E. Health, functioning, and disability in older adults—present status and future implications. Lancet. 2015;385:563–75. https://0doiorg.brum.beds.ac.uk/10.1016/S01406736(14)614628.
 32.
Beltrán-Sánchez H, Jiménez MP, Subramanian SV. Assessing morbidity compression in two cohorts from the health and retirement study. J Epidemiol Community Health. 2016;70:1011–6. https://0doiorg.brum.beds.ac.uk/10.1136/jech2015206722.
 33.
Kingston A, Robinson L, Booth H, Knapp M, Jagger C. Projections of multimorbidity in the older population in England to 2035: estimates from the population ageing and care simulation (PACSim) model. Age Ageing. 2018;47:374–80. https://0doiorg.brum.beds.ac.uk/10.1093/ageing/afx201.
 34.
Barendregt JJ, Ott A. Consistency of epidemiologic estimates. Eur J Epidemiol. 2005;20:827–32. https://0doiorg.brum.beds.ac.uk/10.1007/s1065400522279.
 35.
Binder N, Balmford J, Schumacher M. A multistate model based reanalysis of the Framingham heart study: is dementia incidence really declining? Eur J Epidemiol. 2019;34:1075–83. https://0doiorg.brum.beds.ac.uk/10.1007/s10654019005676.
 36.
Brinks R, Landwehr S. Age- and time-dependent model of the prevalence of non-communicable diseases and application to dementia in Germany. Theor Popul Biol. 2014;92:62–8. https://0doiorg.brum.beds.ac.uk/10.1016/j.tpb.2013.11.006.
 37.
Geyer S, Kowalski C. GKV-Routinedaten in der onkologischen Versorgungsforschung. Onkologie heute. 2018;2018:X70–2.
 38.
Jaunzeme J, Eberhard S, Geyer S. Wie "repräsentativ" sind GKV-Daten? Demografische und soziale Unterschiede und Ähnlichkeiten zwischen einer GKV-Versichertenpopulation, der Bevölkerung Niedersachsens sowie der Bundesrepublik am Beispiel der AOK Niedersachsen. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2013;56:447–54. https://0doiorg.brum.beds.ac.uk/10.1007/s0010301216269.
 39.
Hartmann J, Weidmann C, Biehle R. Validierung von GKV-Routinedaten am Beispiel von geschlechtsspezifischen Diagnosen. Gesundheitswesen. 2016;78:e53–8. https://0doiorg.brum.beds.ac.uk/10.1055/s00351565072.
Acknowledgements
We thank Jana Wolf (Hochschule Aalen) for the detailed language review and the two anonymous reviewers for their valuable comments to improve the quality of our manuscript.
Funding
This research did not receive any grant or external funding. Open Access funding enabled and organized by Projekt DEAL.
Author information
Contributions
VM was responsible for selecting the disease-specific input data and calculating the scenarios; SF and CH supervised the calculations. Furthermore, VM conducted the literature survey according to the criteria chosen by all three authors. SF provided the population projection data. All authors were involved in developing the model and drafting the manuscript. The final version was read and approved by all authors.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
Valeska Milan is an employee of the AOK Baden-Württemberg, the donor of the data set. The theses and opinions shared do not represent those of the AOK Baden-Württemberg, but solely those of the authors.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Milan, V., Fetzer, S. & Hagist, C. Healing, surviving, or dying? – projecting the German future disease burden using a Markov illness-death model. BMC Public Health 21, 123 (2021). https://0doiorg.brum.beds.ac.uk/10.1186/s12889020099416
DOI: https://0doiorg.brum.beds.ac.uk/10.1186/s12889020099416
Keywords
 Demography
 Projection
 Markov illness-death model
 Chronic diseases
 Compression of morbidity