
The causal relation between human papillomavirus and cervical cancer
  1. F X Bosch1,
  2. A Lorincz2,
  3. N Muñoz3,
  4. C J L M Meijer4,
  5. K V Shah5
  1. 1Institut Català d'Oncologia, Servei d'Epidemiologia i Registre del Càncer, Gran Via Km 2.7 s/n 08907 L'Hospitalet de Llobregat, 08907 Barcelona, Spain
  2. 2Digene Corporation, 1201 Clopper Road, Gaithersburg, MD 20878, USA
  3. 3International Agency for Research on Cancer (IARC), Unit of Field and Intervention Studies, 150 Cours Albert-Thomas, F-69372 Lyon, Cedex 08, France
  4. 4Vrije Universiteit Medical Center, Department of Pathology, De Boelelaan 1117, PO Box 7057–1007, 1081 HV Amsterdam, The Netherlands
  5. 5The Johns Hopkins School of Public Health, Department of Molecular Microbiology and Immunology, 615 North Wolfe Street, Baltimore, MD 21205, USA
  1. Correspondence to:
 Dr F X Bosch, Epidemiology and Cancer Registry Unit (SERC), Catalan Institute of Oncology (ICO), Hospital Duran i Reynals, Gran Via km 2,7, L'Hospitalet de Llobregat, 08907 Barcelona, Spain


The causal role of human papillomavirus infections in cervical cancer has been documented beyond reasonable doubt. The association is present in virtually all cervical cancer cases worldwide. It is the right time for medical societies and public health regulators to consider this evidence and to define its preventive and clinical implications. A comprehensive review of key studies and results is presented.

  • human papillomavirus
  • cervical cancer
  • causality
  • review
  • AF, attributable fraction
  • ASCUS, atypical squamous cells of undetermined significance
  • CI, confidence interval
  • CIN, cervical intraepithelial neoplasia
  • EBV, Epstein-Barr virus
  • GP, general primer
  • HCV, hepatitis C virus
  • HIV, human immunodeficiency virus
  • HPV, human papillomavirus
  • HSIL, high grade squamous intraepithelial lesion
  • HSV-2, type 2 herpes simplex virus
  • IARC, International Agency for Research on Cancer
  • IBSCC, international biological study on cervical cancer
  • OC, oral contraceptive
  • OR, odds ratio
  • ORF, open reading frame
  • Pap, Papanicolaou
  • PF, protective fraction
  • PCR, polymerase chain reaction
  • RB, retinoblastoma
  • RR, relative risk
  • STD, sexually transmitted disease
  • Th1, T helper cell type 1


During the 1990s, epidemiological studies, supported by molecular technology, provided evidence on the causal role of some human papillomavirus (HPV) infections in the development of cervical cancer. This association has been evaluated under all proposed sets of causality criteria and endorsed by the scientific community and major review institutes. The finding is universally consistent, and to date there are no documented alternative hypotheses for the aetiology of cervical cancer.

HPV has been proposed as the first “necessary cause” of a human cancer ever identified. In practical terms, the concept of a necessary cause implies that cervical cancer does not and will not develop in the absence of the persistent presence of HPV DNA.

Cervical cancer is still the second most common cancer in women worldwide, although it is a theoretically preventable disease.

In developed parts of the world, and in populations where cytology based programmes are established, it would be beneficial to add HPV testing to the screening protocol. HPV testing was shown by several studies, including one randomised trial, to be of help in solving the ambiguous cases generated by cytology reading.

In populations where cytology programmes are either not in place or are not efficient, HPV testing should now be considered and evaluated as an alternative test for primary screening.

Prevention of exposure to high risk HPV types by vaccination may prove to be the most efficient and logistically feasible preventive intervention for cervical cancer.

At this stage of development, regulatory agencies are requested to evaluate the scientific evidence and weigh its implications in relation to costs, public health investments, and policy. This is a subjective evaluation that could be guided by a careful description of the most relevant studies and findings.


A major discovery in human cancer aetiology has been the recognition that cervical cancer is a rare consequence of an infection by some mucosatropic types of HPV. In public health terms, this finding is as important as the discovery of the association between cigarette smoking and lung cancer, or between chronic infections with hepatitis B virus (HBV) or hepatitis C virus and the risk of liver cancer. Moreover, as in the HBV disease model, intense efforts are currently going into the development and testing of vaccines that may prevent the relevant HPV infections, and presumably, cervical cancer.

By the year 2000, the epidemiological evidence included a large and consistent body of studies indicating, beyond any reasonable doubt, strong and specific associations relating HPV infections to cervical cancer. The observations have been reported from all countries where investigations have taken place. Studies include prevalence surveys, natural history investigations, case–control studies and, more recently, a randomised intervention trial. Natural history and follow up studies have clearly shown that HPV infection preceded the development of cervical cancer by several years and confirmed that sexual transmission is the predominant mode of HPV acquisition. These studies satisfied, in biological terms, the long known clinical and epidemiological observations that cervical cancer displayed the profile of a sexually transmitted disease (STD). Case–control studies, case series, and prevalence surveys have unequivocally shown that HPV DNA can be detected in adequate specimens of cervical cancer in 90–100% of cases, compared with a prevalence of 5–20% in cervical specimens from women identified as suitable epidemiological controls.

The association has been recognised as causal in nature by several international review parties since the early 1990s, and the claim has been made that this is the first necessary cause of a human cancer ever identified.

The implications of the recognition that, in the absence of viral DNA, cervical cancer does not develop, are of considerable practical importance. On the one hand, the concept of risk groups comes into focus. High risk women can now be sharply redefined as the group of persistent HPV carriers. Operatively, this represents substantial progress from previous versions of the high risk group that identified women by their exposure to a constellation of ill defined factors (low socioeconomic status, high number of sexual partners, smoking, use of oral contraceptives, history of STDs, and any combination of the above). Most of these factors are now viewed either as surrogates of HPV exposure or as relevant cofactors given the presence of HPV DNA. On the other hand, if indeed HPV is a necessary cause of cervical cancer, the implication is that specific preventive practices targeting some putative non-HPV related cervical cancer cases are no longer justified. Finally, technology is now available to screen HPV DNA positive women in the general population. Therefore, the final consideration on the nature of the association between HPV and cervical cancer is of considerable public health relevance. Research at the population level has largely accomplished its task by providing an exhaustive body of evidence. It is now time for public health institutions to evaluate these achievements, consider the costs and benefits involved, and apply this knowledge to their guidelines, recommendations, and policy.


Research in relation to the aetiology of cervical cancer has made substantial progress in the past two decades, both in scientific and operational terms. For decades, the epidemiological profile of women with cervical cancer was recognised as suggestive of a sexually transmitted process, and several infectious agents were proposed over the years including syphilis, gonorrhea, and type 2 herpes simplex virus (HSV-2).

The development of technology to test for the presence of HPV DNA in cellular specimens in the early 1980s and the multidisciplinary collaboration within the field made possible the establishment of a definite aetiological role for HPV in cervical cancer. Evidence is also accumulating for HPV involvement in a considerable proportion of cancers of the vulva, vagina, anal canal, perianal skin, and penis. The association of HPV with cervical cancer has provided the background and the justification for improving screening programmes and for developing HPV vaccines.

Figure 1 is a schematic view of the time scale of this dynamic process. It includes an indication of the results obtained as technology evolved in sensitivity expressed as the per cent of cervical cancer cases that were found to contain viral DNA. The figure also indicates the types of HPV tests that were predominantly used and an estimate of the periods in which the key types of studies were initiated.

Figure 1

Evolution of epidemiological research on human papillomavirus (HPV) and cervical cancer in the past two decades. FISH, filter in situ hybridisation; GP-PCR, general primer PCR; HC I–II, hybrid capture first and second generation; PCR, polymerase chain reaction; SH, Southern blot hybridisation; TS-PCR, type specific PCR.

Figure 2 displays the approximate number of scientific papers identified by Medline searches on HPV and on HPV and cervical cancer, and the number of research abstracts presented at the major annual papillomavirus conferences. The 1980s generated a rapidly increasing number of publications on HPV DNA prevalence in cervical cancer and reports on validation of the available detection methods. The 1990s produced the key results of case–control and cohort studies, and beginning in the late 1990s there was an increasing number of publications on the clinical uses of HPV testing in screening and triage.

Figure 2

Scientific publications on human papillomavirus (HPV) identified by Medline and the number of research abstracts presented at the annual Papillomavirus International Conferences.


Epidemiological studies are essential to establish the association between risk factors and cancer and to qualify the nature of the association. Traditionally, these include case series, case–control studies, cohort studies, and intervention studies.

Comparisons of exposure between patients with cervical cancer and their relevant controls were initially established using questionnaires. Most studies conducted before the availability of HPV DNA detection systems identified as key risk factors several variables related to the sexual behaviour of the women and of their sexual partners. The most frequently reported risk factors included the number of sexual partners, an early age at first intercourse, or any previous STD.1

Once the relevant biomarkers were validated, in this case the presence of HPV DNA in exfoliated cervical cells, it became possible to advance over questionnaire based studies and establish biologically sound comparisons between patients and controls. In epidemiological terms, these comparisons would analyse cervical cells from women with cervical cancer and from otherwise comparable women without cervical cancer (case–control studies), or from cohorts of women tested for viral DNA (cohort studies). To characterise the link between HPV and invasive cervical cancer, case–control studies proved to be the key study design and the only ones ethically acceptable in human populations. Typically in such a study and at the time of fieldwork, controls are selected to match the age distribution of the cases and, as much as possible, the general characteristics of the cases (place of residence, socioeconomic status, health plan, etc). To characterise the association, all study participants are requested to comply with a questionnaire to assess their individual exposure to any known or suspected risk factor for the disease. The information is then used to estimate the odds ratios (ORs) of disease related to any given exposure. Multivariate analyses have the ability to compare (through statistical adjustment) strictly equivalent groups of women in relation to any of the exposures of interest. The adjusted differences (ratios) in the prevalence in HPV markers between cases and controls are then obtained after having eliminated the effects of any other differences in exposure. Likewise, comparisons of cases and controls in relation to other variables of interest will provide estimates of the relevance of other factors (oral contraceptives (OCs) or smoking) and identify the variables that merely reflect HPV exposure (surrogate variables).
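As a minimal sketch of how an odds ratio is estimated from case–control data, the following Python snippet computes a crude OR and a Woolf (log scale) 95% confidence interval from a 2×2 table. The counts and the function name `odds_ratio_with_ci` are hypothetical illustrations, not data from the studies reviewed here, and the calculation is unadjusted: it does not reproduce the multivariate adjustment described above.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    z=1.96 gives an approximate 95% interval.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        # The Woolf interval is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical table: 190 of 200 cases and 30 of 200 controls
# are HPV DNA positive.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(190, 10, 30, 170)
print(f"OR = {or_estimate:.1f} (95% CI, {ci_lower:.1f} to {ci_upper:.1f})")
```

A table with an empty cell (for example, no HPV positive controls) has no finite OR, which is why the sketch rejects it rather than returning a value.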

When the technology to detect HPV DNA in samples of DNA extracted from exfoliated cervical cells became available, it was relatively easy to show that most of the sexual behaviour variables were in fact surrogate measures of HPV exposure, reflecting the predominant pathway of acquisition of HPV. As methods became more sensitive, the parameters that merely expressed the probability of HPV (or any other STD) infection, such as number of sexual partners, became statistically irrelevant.2–5

Causality in public health requires a judgment based on scientific evidence from human and experimental (animal) observations. As such, only the latter may benefit from the most stringent criteria of causality; that is, the repeated induction of the disease by exposure to the relevant agent(s) compared with the “spontaneous” occurrence of the same disease in unexposed and yet comparable groups of animals. All causal associations of human cancers have been recognised based on educated judgment of the results of epidemiological studies at the level already available for HPV and cervical cancer. Final proof can only be confirmed by intervention (preventive) trials, in which a reduction of the disease burden (incidence or mortality) is observed following the introduction of a preventive practice in strictly controlled conditions. These studies typically include as controls populations to whom the existing standard of preventive care is being offered.

Table 1 displays some of the criteria that have been proposed to evaluate the nature of the associations encountered by epidemiological studies. This is particularly relevant when causality is being proposed because, as a consequence, preventive or clinical recommendations are made.

Table 1

Epidemiological considerations important for causal inference

In addition to the criteria listed in table 1, some additional contributions might be worth discussing. In 1976, Evans reviewed the history of the causality criteria in infectious disease models and adapted the early postulates of Henle-Koch to both the viral origin of acute diseases and to the relation between viral infections and cancer.14 The human models that inspired most of the latter included two examples: Epstein-Barr virus (EBV) infections and Burkitt's lymphoma, and HSV-2 viral infections and cervical cancer. The technology that was discussed was largely based on antibody detection and the studies involved were seroepidemiological surveys and case–control studies. Antibody measurements were the methods of choice for the assessment of exposure. Evans proposed a unified scheme for causation that included most of the criteria mentioned in table 1.14 In 1976, Rothman15 introduced the concepts of “necessary and sufficient causes”. This model is useful to accommodate the growing evidence of the multifactorial origin of human cancer in many instances. Finally, several authors have defined criteria to evaluate the findings of molecular technology that provided the basis of the studies of HPV and cervical cancer.16,17

Because of its wider acceptance, we will discuss in detail the criteria proposed by Hill, and its version adopted by the International Agency for Research on Cancer (IARC) monograph programme, in addition to the model on necessary and sufficient causes proposed by Rothman in 1995.18

In brief, the criteria proposed by Hill8 as summarised by Rothman19 include the following:

Hill suggested that the following aspects of an association should be considered when attempting to distinguish causal from non-causal associations: (1) strength, (2) consistency, (3) specificity, (4) temporality, (5) biological gradient, (6) plausibility, (7) coherence, (8) experimental evidence, and (9) analogy.


Strength

By “strength of association”, Hill means the magnitude of the ratio of incidence rates. Hill's argument is essentially that strong associations are more likely to be causal than weak associations because if they were the result of confounding or some other bias, the biasing association would have to be even stronger and would therefore presumably be evident. Weak associations, on the other hand, are more likely to be explained by undetected biases. Nevertheless, the fact that an association is weak does not rule out a causal connection.


Consistency

Consistency refers to the repeated observation of an association in different populations under different circumstances.


Specificity

The criterion of specificity requires that a cause should lead to a single effect, not multiple effects. However, causes of a given effect cannot be expected to be without other effects on any logical grounds. In fact, everyday experience teaches us repeatedly that single events may have many effects.


Temporality

Temporality refers to the necessity that the cause should precede the effect in time. The temporality of an association is a sine qua non: if the “cause” does not precede the effect, that is indisputable evidence that the association is not causal.

Biological gradient

Biological gradient refers to the presence of a dose–response curve. If the response is taken as an epidemiological measure of effect, measured as a function of comparative disease incidence, then this condition will ordinarily be met.


Plausibility

Plausibility refers to the biological plausibility of the hypothesis, an important concern but one that may be difficult to judge.


Coherence

Taken from the Surgeon General's report on Smoking and Health (1964)9: “The term coherence implies that a cause and effect interpretation for an association does not conflict with what is known of the natural history and biology of the disease.”

Experimental evidence

Such evidence is seldom available for human populations. In human data, the experimental criterion takes the form of preventive interventions and explores whether there is evidence that a reduction in exposure to the agent is associated with a reduction in risk.


Analogy

The insight derived from analogy seems to be handicapped by the inventive imagination of scientists, who can find analogies everywhere. Nevertheless, the simple analogies that Hill offers (if one drug can cause birth defects, perhaps another can also) could conceivably enhance the credibility that an association is causal.

As is evident, these nine aspects of epidemiological evidence offered by Hill to judge whether an association is causal are saddled with reservations and exceptions; some may be wrong (specificity) or occasionally irrelevant (experimental evidence and perhaps analogy). Hill admitted that: “none of my nine viewpoints can bring indisputable evidence for or against the cause and effect hypothesis and none (except temporality) can be required as a sine qua non”.

The IARC in its monograph programme largely adopted the causality criteria proposed by Hill and established rules to decide on the carcinogenicity of a given exposure, particularly when human data are scarce and must be combined with experimental data. However, the final qualification of the carcinogenicity of any given substance being evaluated is taken by vote of the external (non-IARC) participants.

The monograph programme and its criteria have been reviewed and accepted by most scientists in the field of human carcinogenesis. To date, 77 monographs have been published, of which five involve biological agents such as HPV.12 In its preamble, the monograph programme establishes guidelines to qualify an epidemiological observation as causal, and also defines rules to be followed when human data suggest lack of carcinogenicity potential. These criteria are useful to challenge any aetiological hypothesis when the epidemiological studies are inconsistent or when only weak associations are reported.

Finally, another useful way of examining the nature of an association was provided by a model system that proposed that any given disease would occur as a consequence of human exposure to a “sufficient cause”.18 A sufficient cause is described, in its simplest model, as the concurrence in a given individual of a constellation of factors (called the components of the sufficient cause), following which the disease will develop. Each given disease will have its own sufficient cause or sets of sufficient causes (lung cancer may have a sufficient cause that involves cigarette smoking, but another sufficient cause that does not include smoking, such as intense radon exposure in non-smokers). According to the model, a necessary cause is described as a component of a sufficient cause that is part of all the sufficient causes described. To prevent disease it is not necessary to identify all the components of a sufficient cause, or to remove them all: it is sufficient to remove one component from each sufficient cause, that is to remove, if it exists, the necessary cause.
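The sufficient and necessary cause model can be sketched with sets, as a rough illustration only. Each sufficient cause is represented as a set of component factors, and a necessary cause is any component shared by every sufficient cause. The function names and the cofactor labels ("cofactor A", "cofactor X", and so on) are hypothetical placeholders introduced for this example; only the smoking, radon, and HPV components come from the text.

```python
def necessary_causes(sufficient_causes):
    """Return the components present in every sufficient cause."""
    component_sets = [frozenset(cause) for cause in sufficient_causes]
    if not component_sets:
        return frozenset()
    return frozenset.intersection(*component_sets)


def disease_prevented(sufficient_causes, removed_components):
    """True if at least one component is removed from every sufficient cause."""
    removed = set(removed_components)
    return all(removed & set(cause) for cause in sufficient_causes)


# Lung cancer: two sufficient causes with no shared component,
# so the model yields no necessary cause.
lung_cancer = [
    {"cigarette smoking", "cofactor A"},   # cofactor labels are placeholders
    {"intense radon exposure", "cofactor B"},
]

# Cervical cancer under the hypothesis discussed in this review:
# persistent HPV appears in every sufficient cause.
cervical_cancer = [
    {"persistent HPV", "cofactor X"},
    {"persistent HPV", "cofactor Y"},
]

print(necessary_causes(lung_cancer))       # empty frozenset
print(necessary_causes(cervical_cancer))   # contains only "persistent HPV"
print(disease_prevented(cervical_cancer, {"persistent HPV"}))
```

In this representation, removing the necessary cause breaks every sufficient cause at once, whereas removing smoking alone leaves the radon pathway to lung cancer intact.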


In the following sections, we will review several studies that have provided evidence of the association between HPV and cervical cancer, using the criteria outlined in table 1. For purposes of clarity, we shall concentrate the discussion on the criteria that have proved to be of greater value in the evaluation of human carcinogens and on the studies that focused on invasive cervical cancer.

Strength of the association

This criterion is usually discussed by examining the magnitude of the relative risk (RR), or the OR, which is the estimate of the RR in case–control studies. We shall use as the primary example the results of the IARC multicentre case–control study on invasive cervical cancer, as presented at international scientific meetings, and either published or at different stages of preparation for publication. In brief, this project included nine case–control studies in different parts of the world, mostly in high risk countries. HPV DNA testing was done in two central research laboratories using the MYO9/1120 and the general primer (GP) GP5+/6+21,22 polymerase chain reaction (PCR) testing systems. The published results have reported ORs for cervical cancer in the range of 50 to 100 fold for HPV DNA. ORs for specific associations (such as HPV-16 and squamous cell cancer and HPV-18 and cervical adenocarcinomas) range between 100 and 900. These estimates lead to calculations of attributable fractions (AF) for the entire study greater than 95%.23
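To make the link between the OR and the attributable fraction concrete, here is a minimal sketch using Miettinen's case-based formula, AF = p × (OR − 1) / OR, where p is the proportion of cases exposed. Whether the study authors used exactly this formula is an assumption. The OR of 83.3 is the pooled estimate for squamous cell carcinoma quoted later in this review; the case prevalence of 0.97 and the function name `attributable_fraction` are hypothetical choices made for illustration.

```python
def attributable_fraction(case_exposure_prevalence, odds_ratio):
    """Population attributable fraction by Miettinen's formula.

    AF = p * (OR - 1) / OR, with p the proportion of cases exposed
    and the OR used as the estimate of the relative risk.
    """
    p = case_exposure_prevalence
    if not 0.0 <= p <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return p * (odds_ratio - 1.0) / odds_ratio


# Assumed inputs: 97% of cases HPV DNA positive (hypothetical),
# OR = 83.3 (pooled squamous cell estimate from the text).
af = attributable_fraction(0.97, 83.3)
print(f"AF = {af:.1%}")
```

With an OR this large, the factor (OR − 1) / OR is close to 1, so the attributable fraction is driven almost entirely by the proportion of cases that are HPV DNA positive.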

Table 2 shows the size of the multicentre case–control study and the prevalence of HPV DNA in each relevant group. Figure 3 displays the HPV DNA prevalence in eight countries in cervical cancer cases and controls. It is noteworthy that the first two studies conducted in Spain and Colombia (fig 3) used early versions of the MYO9/11 PCR system that identified HPV DNA in approximately 75% of the cases. The rest of the studies were analysed using the GP5+/6+ PCR system and its modifications, which resulted in an almost 20% increase in the HPV DNA detection rate.

Table 2

Size of the IARC multicentre case–control study and human papillomavirus (HPV) DNA prevalence

Figure 3

Prevalence of human papillomavirus (HPV) DNA in cases and controls in the IARC multicentre case–control study.24–30

Table 3 shows the corresponding estimates of the RR (OR and 95% confidence interval (CI)). Results are presented separately for squamous cell carcinomas and adenocarcinomas of the cervix. Given the case–control design of the study, these very high ORs reflect the risk in relation to existing HPV DNA in cervical cells (HPV DNA point prevalence), not in relation to “ever” being infected with HPV (cumulative lifetime exposure). Furthermore, if HPV shedding was intermittent among controls, their corresponding HPV prevalence would have been underestimated, resulting in an inflation of the ORs observed. It is usually interpreted that the HPV DNA point prevalence at advanced age (over 40 years of age) reflects viral persistency. However, much research is still devoted to defining viral persistency and its prognosis accurately, a crucial definition for the clarification of the uses of HPV testing in screening and patient management.31
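The effect of underestimating HPV prevalence among controls can be illustrated with a short calculation. This is a sketch under assumed numbers: the prevalences below and the function names are hypothetical, chosen only to show the direction of the bias described above.

```python
def odds(prevalence):
    """Convert a prevalence strictly between 0 and 1 into odds."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return prevalence / (1.0 - prevalence)


def prevalence_odds_ratio(case_prevalence, control_prevalence):
    """Crude OR from HPV DNA point prevalence in cases and controls."""
    return odds(case_prevalence) / odds(control_prevalence)


case_prevalence = 0.95           # hypothetical
true_control_prevalence = 0.15   # hypothetical: if every shedder were detected
observed_control_prevalence = 0.10  # hypothetical: intermittent shedding missed

or_true = prevalence_odds_ratio(case_prevalence, true_control_prevalence)
or_observed = prevalence_odds_ratio(case_prevalence, observed_control_prevalence)
print(f"OR with full detection: {or_true:.0f}")
print(f"OR with missed infections in controls: {or_observed:.0f}")
```

Because the control odds sit in the denominator, any shortfall in detected control prevalence pushes the observed OR upwards, even when the case prevalence is measured correctly.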

Table 3

Odds ratio for the association of human papillomavirus (HPV) DNA and cervical cancer in the IARC multicentre case–control study: preliminary data23 32 33

Most of the discussion in the text uses HPV DNA as a generic marker that includes any positive result for several HPV types. It is now possible to provide estimates of the RR for at least 10 different HPV types showing that there are no significant differences in the risk of cervical cancer in relation to the HPV types most commonly found in these lesions. The preliminary results of the IARC multicentre case–control study were pooled and summarised by Muñoz et al in 2000,32 at the HPV 2000 Papillomavirus Conference. These analyses indicated that for squamous cell carcinomas, the age and centre adjusted OR was 83.3 (95% CI, 54.9 to 105.3). The prevalence of the four most common HPV types and their ORs among 1545 cases with single infections were: HPV-16, 59% (OR = 182); HPV-18, 12% (OR = 231); HPV-45, 4.8% (OR = 148); and HPV-31, 3.7% (OR = 71.5). Other less common HPV types showing equally high ORs were: HPV-33, OR = 77.6; HPV-35, OR = 34.8; HPV-51, OR = 42.7; HPV-52, OR = 145.7; HPV-58, OR = 78.9; and HPV-59, OR = 347.3.

The most common types among cases were also the most common types among HPV positive control women: HPV-16, 30.3%; HPV-18, 8.2%; HPV-31, 4.8%; and HPV-45, 3.9%. These findings indicate that in addition to HPV-16 and HPV-18, HPV types 31, 33, 35, 45, 51, 52, 58, and 59 should be considered as human carcinogens.

The HPV type distribution in the population and in patients with cervical cancer shows a seemingly modest geographical variability that has not been fully described (J Kornegay, personal communication, 2001).34–36 The description and the implications of such variability for HPV testing and HPV vaccination are to be determined.

The results of the multicentre study are consistent with findings from other countries that have generated recent data on invasive cervical cancer and preinvasive disease in Costa Rica,37 Thailand,38 Norway,39 Denmark,40 and virtually all other countries in which these studies have been conducted.

Multiple HPV types were detected in the multicentric study on average in 7.3% of the cases and 1.9% of the controls, and did not show a significantly increased risk (OR = 54.5; 95% CI, 35.5 to 83.6) over women positive for only one HPV type (OR = 86.6; 95% CI, 68.2 to 110).

The proportion of multiple types in a given specimen varies across studies and particularly in relation to the HPV detection method used. Table 4 provides an indication of the proportion of specimens from cases and from the general population that showed multiple types. The table suggests that populations at high risk of cervical cancer and with high rates of human immunodeficiency virus (HIV) positivity tend to show higher proportions of multiple types than do populations not belonging to these risk groups. Longitudinal studies have suggested that the one time, cross sectional detection of type specific HPV may underestimate the cumulative lifetime diversity of exposure to HPV.31 However, in all studies of invasive carcinoma, the risk linked to multiple HPV types does not vary significantly from the risk linked to single HPV types.

Table 4

Prevalence of multiple human papillomavirus (HPV) types in patients with cervical cancer and women without cervical cancer

The similarity in the prognostic value of detection of any of the 10 high risk HPV types, in addition to any combination of them, clearly indicates that group testing for high risk HPVs would be sufficient in the context of clinical and screening protocols.

Figure 4 shows, for comparison purposes, some estimates of the strength of associations between environmental factors and human cancer that were recognised as causal in nature by epidemiological studies and subsequently proved in human populations by intervention studies. The figure includes risk (RRs or ORs) as the measurement of the strength of the associations and AFs representing the proportion of disease that is attributable to (caused by) the exposure. Below the reference line the risk column displays its reverse estimate as a less than one (protective) OR or RR and the protective fractions (PF%) in the right hand column show results that have already been achieved in disease reduction after specific exposure reduction interventions.

Figure 4

Selected examples of the strength of the associations (RR/OR) between risk factors and human cancer; estimates of the attributable fraction (AF%) and of the protective fraction (PF%). Refs: the Philippines,28 Costa Rica,37 Bangkok,5 Taiwan,43 Greece,44 Italy,45 UK,46 Korea,47 and Taiwan.48

Strength of association. Evaluation

The association between HPV DNA in cervical specimens and cervical cancer is one of the strongest ever observed for a human cancer. HPV-16 accounts for almost 50% of the types identified in cervical cancer. The cancer risk for any one of at least 10 HPV types or for any combination of HPV types does not differ significantly.


Consistency

There is a striking consistency between the results of the multicentre case–control study and over 50 other studies conducted in other countries, under different protocols and HPV DNA testing systems. Figures 5, 6, and 7 summarise the results of studies that compared the prevalence of HPV DNA in patients with cervical cancer and controls. Some of the studies used the prevalence of HPV-16 DNA to calculate ORs and some reported results for HPV DNA (all types combined). Some studies focused on invasive cervical cancer, whereas others used preinvasive lesions as the definition of cases. When indicated, separate analyses are presented for squamous cell carcinomas and for adenocarcinomas. Studies that have compared risk factors for cervical intraepithelial neoplasia stage 3 (CIN 3) and invasive cancer have not reported any significant differences in their associations with HPV or with their epidemiological profile.38,49

Figure 5

Odds ratios (OR) and 95% confidence intervals for associations found in case–control studies using PCR methods between human papillomavirus 16 (HPV-16) (or its nearest surrogate) and invasive cervical cancers. *The OR estimate is ∞ owing to the absence of HPV positive controls. Adapted from IARC monograph 64, 1995.12

Figure 6

Odds ratios (OR) and 95% confidence intervals for associations found in case–control studies using non-PCR methods between human papillomavirus 16 (HPV-16) (or its nearest surrogate) and invasive cervical cancers. *The OR estimate is ∞ owing to the absence of HPV positive controls. Adapted from IARC monograph 64, 1995.12

Figure 7

Odds ratios (OR) and 95% confidence intervals for associations found in case–control studies after the year 2000. HPV, human papillomavirus.

Apart from confirming the high ORs shown in figs 5 and 6, fig 7 also demonstrates the consistency of results between squamous cell carcinomas and adenocarcinomas, the consistency of findings between preinvasive disease and invasive cancer, and the consistency of findings between risk estimates for HPV DNA (all types considered) and risk estimates restricted to high risk types.

Consistency. Evaluation

The association between HPV DNA in cervical specimens and cervical cancer is consistent in a large number of investigations in different countries and populations. There are no published studies with observations challenging the central hypothesis on causality.


Specificity

Specificity, as defined by Hill, tended to be relegated to a secondary level for cancer causality evaluation once it became clear that carcinogenic exposures are usually complex (for example, cigarette smoke) and can induce cancer in different organs and even cancers of different histological profile in the same organ.

In the case of HPV, the complexity of the association is still being unveiled. The HPV family includes over 100 HPV types, of which 30–40 are mucosotropic and at least 15 types have been clearly linked to cervical cancer. In addition, some of these types are also related to other cancers of the genital tract (vulvar cancer, vaginal cancer, and cancers of the anal canal, perianal skin, and the penis) and perhaps to cancers of other organs (such as oropharyngeal and skin cancer).

To examine the association of HPV and human cancer in light of the specificity criteria, we shall widen the original scope (one exposure/one disease) to verify whether a more complex model involving multiple HPV types and several cancer sites seems to occur with frequencies suggesting a consistent departure from a random model.

  1. About 15 HPV types are involved in over 95% of the cervical cancer cases. HPV-16 and HPV-18 are the most common types identified and represent 50% and 10%, respectively, of the viral types involved in invasive cancer. Figure 8 shows the cumulative prevalence of five HPV types in cervical carcinomas by histological type in 2400 cases included in the multicentre case–control study. It clearly shows that these five HPV types comprise 80–95% of the viral types identified in carcinomas.

  2. Adenocarcinomas and adenosquamous cell carcinomas are more closely related to HPV-18 and its phylogenetically related family (HPV types 39, 45, and 59) than are squamous cell carcinomas, which in turn are closely linked to HPV-16 and its phylogenetically related family (HPV types 31, 35, and 52).34,87 The reasons for such specificity are unknown.

  3. Cancers of the vulva and vagina are closely related to HPV-16. Approximately 40–50% of vulvar cancers show HPV DNA, and in several series HPV-16 is by far the predominant type, in more than 80% of cases.88–90

  4. Cancer of the tonsil is closely related to HPV-16, whereas other cancers of the oral cavity show inconsistent and lower prevalences of HPV DNA.91–94

  5. Skin cancers related to the epidermodysplasia verruciformis condition are related to a restricted number of dermatotropic HPV types. These are also recovered from basal cell carcinomas and squamous cell carcinomas of the skin in immunosuppressed and immunocompetent individuals.95

  6. Other associations, reported in a small number of cases, seem to occur with some specificity, for example, HPV-16 with cancers of the conjunctiva96 and HPV-16 with cancers of the ungual (nail) bed.

  7. Studies on HPV variants (variation within HPV types at the single nucleotide level) are beginning to unveil risk differences.97–99 The geographical distribution of HPV variants and its relevance for HPV testing and for vaccine development are still uncertain.

  8. HPV has been excluded as a likely cause or even as a risk factor for other human cancers. A large number of investigations (largely unpublished) have not provided support to the hypothesis of the involvement of these viruses in the causation of cancers of the endometrium, ovary, prostate, or other sites (reviewed by Shah and Howley16 and Syrjänen and Syrjänen100).

Figure 8

Cumulative prevalence of human papillomavirus (HPV) types in cervical cancer. Taken from the IARC multicentre case–control study; preliminary data.23

Specificity. Evaluation

The association of type specific HPV DNA and cervical cancer is significantly different from random. Systematic patterns of HPV type and cervical cancer histology suggest a fair degree of specificity. Patterns are also observed when the scope of HPV and cancer expands to include the full spectrum of HPV types and the large number of additional cancer sites that have been investigated.

In conclusion, although the specificity criteria can be viewed as of secondary applicability, the global picture indicates that HPV types are not randomly associated with human cancer. A fair degree of specificity is consistently reported, even if the complexities of the type specific viral properties and of the organ/cell susceptibility have not been fully disclosed.


Temporality

Of the criteria outlined by Hill and repeatedly endorsed by the IARC monograph programme and other bodies, the demonstration that exposure has occurred before the diagnosis is considered a “sine qua non” condition for a risk factor and for establishing causality. Five groups of studies have contributed data relevant to the temporality criterion.

Descriptive data

Cross sectional studies have repeatedly reported that subclinical HPV infections are highly prevalent in young individuals, whereas invasive cervical cancer typically develops in the third decade and later (fig 9). The cross sectional prevalence of HPV DNA decreases spontaneously to a background level of 2–8% in most populations in groups that are 40 years old and above. In countries where intensive screening of young women takes place, part of the HPV prevalence reduction could be attributable to aggressive treatment of HPV related cervical lesions. Women who remain chronic HPV carriers are currently described as the true high risk group for cervical cancer. In some populations, a second mode of HPV DNA prevalence has been observed for older women (50 years and above), with uncertain relevance in relation to the risk of cervical cancer.36,37,101 In all settings investigated, the point prevalence of HPV DNA in the young age groups is strongly related to the sexual behaviour patterns that are dominant in each population.102–107

Figure 9

Age specific prevalence (%) of high risk (HR) human papillomavirus (HPV) DNA in 3700 women entering a screening programme and age specific incidence rate (×10⁵) (ASIR) of cervical cancer in the Netherlands. Adapted from Jacobs et al and Parkin et al.106,108

These population studies provide support for the concept that HPV infections precede the development of cervical cancer by some decades. In fact, from most cancer registries, including the USA based registries, it is well established that the age specific incidence of cervical cancer has a rising trend in the age interval 20–40, and shows a plateau or continues to increase smoothly after that age. Only occasionally do cases of invasive disease occur at earlier ages. Figure 9 shows the age specific, cross sectional prevalence of high risk HPV DNA in a screening programme in the Netherlands, and the corresponding age specific incidence rates of cervical cancer in that country. The distributions shown in fig 9 are highly reproducible in studies in other settings in high and low risk countries.3,24,106,108 However, the age specific incidence rates of invasive cervical cancer are strongly influenced by the local impact of screening programmes in each country.3,24,106,108

Follow up studies

For cervical cancer, compliance with the temporality criteria has been established by numerous cohort studies that monitored women from cytological normalcy to the stage of high grade cervical intraepithelial neoplasia (high grade squamous intraepithelial lesions (HSIL) or CIN 2/3). Monitoring of women to invasive disease is not acceptable on ethical grounds and thus that information is not available.

Repeated sampling of women being followed for viral persistence and cervical abnormalities has shown that the median duration of the infections is around eight months for high risk HPV types, compared with 4.8 months for the low risk HPV types. In two unrelated studies, the time estimates were fairly consistent. In one study in a high risk population in Brazil, the mean duration of HPV detection was 13.5 months for high risk HPV types and 8.2 months for the non-oncogenic types. HPV-16 tended to persist longer than the average for high risk types other than HPV-16.109 The results were remarkably similar in a student population in the USA and in the UK.31,110 The self limiting course of most HPV infections is consistent with the cross sectional profile displayed in fig 9. However, the currently observed time intervals may still suffer from imprecision in the estimates of time at first exposure, from the variability in the endpoint definition, and from censoring as a result of treatment of the early lesions.

Follow up studies of women with and without cervical abnormalities have indicated that the continuous presence of HR-HPV is necessary for the development, maintenance, and progression of CIN disease.110–114 A substantial fraction (15–30%) of women with HR-HPV DNA who are cytomorphologically normal at recruitment will develop CIN 2 or CIN 3 within the subsequent four year interval.111,115,116 Conversely, among women found to be HR-HPV DNA negative and cytologically identified as either atypical squamous cells of undetermined significance (ASCUS) or borderline or mild dysplasia, CIN 2/3 is unlikely to develop during a follow up of two years, and their cytology is likely to return to normal.117,118 Women found positive for low risk HPVs rarely become persistent carriers and their probability of progression to CIN 2/3 is extremely low.117,119

As ongoing cohorts expand their follow up time, more precise estimates are being provided on the predictive value of viral persistence as defined by repeated measurements of viral types and variants. One such cohort in São Paulo has shown that the incidence of cervical lesions in women who were HPV negative twice was 0.73/1000 women months. The corresponding incidence among women with repeated HPV-16 or HPV-18 positive results was 8.68/1000 women months, a 12 fold increase. The OR for HPV persistence among women who were twice HPV positive for the same oncogenic types was 41.2 (95% CI, 10.7 to 158.3).120 Retrospective assessment of HPV status using archival smears from cases of cervical cancer and controls has provided evidence that HPV DNA preceded the development of invasive disease, and showed its value in signalling false negative smears.117 An interesting observation from the same group suggests that the clearance of HR-HPV in otherwise established cytological lesions is a marker associated with the regression of CIN lesions.118,121 Finally, the persistence of HPV DNA after treatment for CIN 2/3 is an accurate predictor of relapse, and is at least as sensitive as repeated vaginal cytology.122
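
The arithmetic behind the cohort figures quoted above can be checked with a short sketch. It uses only the numbers given in the text. The assumption that the confidence interval was constructed on the log scale is ours and is not stated in the source, and the function name is hypothetical.

```python
import math

# Figures quoted in the text (incidence per 1000 women months).
rate_hpv_negative_twice = 0.73
rate_hpv16_18_positive_twice = 8.68
rate_ratio = rate_hpv16_18_positive_twice / rate_hpv_negative_twice


def log_scale_summary(lower, upper, z=1.96):
    """Geometric midpoint and log scale standard error implied by a CI.

    Assumes the interval was built as exp(log(OR) +/- z * SE).
    """
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, se


# Reported interval for the persistence OR: 10.7 to 158.3.
midpoint, se_log_or = log_scale_summary(10.7, 158.3)
print(f"incidence rate ratio: {rate_ratio:.1f}")
print(f"geometric midpoint of CI: {midpoint:.1f}, SE of log OR: {se_log_or:.2f}")
```

The ratio of the two incidence rates rounds to the 12 fold increase reported, and the geometric midpoint of the interval lies close to the reported OR, which is what a log scale interval would produce.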

These results are useful in defining the clinical role of HPV testing. However, most observations on preinvasive disease have limitations for making inferences on cervical cancer causality. This is because even in controlled settings, observations are not allowed to continue beyond the stage of HSIL/CIN 3 or carcinoma in situ.

Retrospective cohorts

A particularly interesting approach to conducting follow up studies of invasive cancer (as opposed to studies of CIN 3) without ethical and time constraints is provided by so called “nested case–control studies”. These are studies initiated several years in the past that assembled and stored large banks of biological specimens from healthy individuals. Linkage studies can then identify cases of cervical cancer (or any other condition) that have occurred in the interval and the original specimens can then be analysed for the presence of HPV biomarkers. HPV DNA prevalence can then be compared with the corresponding prevalence in specimens of epidemiologically sound controls (individuals from the same cohort who did not develop the condition under otherwise equivalent exposures). These studies have documented the existence of HPV exposure years before the development of the disease, thus reproducing the conditions of a longitudinal study. With this approach, a RR estimate of 16.4 (95% CI, 4.4 to 75.1) was seen for invasive cervical cancer in Sweden using DNA extracted from stored Papanicolaou (Pap) smears123 and a RR of 32 (95% CI, 6.8 to 153) was seen in the Netherlands.117 In a similar study design, an OR of 2.4 (95% CI, 1.6 to 3.7) was obtained using serological markers of HPV exposure.124

Preventive interventions

Since the late 1980s, multiple studies have evaluated HPV testing as an adjunct to cytology in screening programmes. These have considered HPV testing either as a triage test in cases of mild abnormalities125–127 or as a primary screening test.128–130 It is not the purpose of this paper to review this literature and excellent summaries are being regularly produced and updated (see later). In brief, triage studies have shown that HPV testing is more sensitive than repeated cytology in identifying underlying high grade lesions in women with ASCUS.114,119,121,131,132 Studies that reflect primary screening conditions (in the absence of fully randomised trials) have shown that the sensitivity of HPV tests is higher than standard cytology in detecting high grade lesions, whereas the specificity is age dependent. HPV tests show lower specificity than cytology in younger women, accounting for the bulk of transient infections, whereas in older women (ages 30–35 and above) specificities tend to be similar for both tests.107,133,134

In terms of causality assessment, these studies showed that it is possible to predict the concurrent presence of neoplastic disease (usually HSIL, CIN 2–3, or severe dyskaryosis), or the risk of developing it, by means of HPV DNA detection. This property of the HPV test offers an indirect measurement of the strength of the association and of the temporal sequence of the events.
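
For readers less familiar with the screening terms used above, the following sketch shows how sensitivity and specificity are obtained from a test result cross classified against the presence of a high grade lesion. The function name and all counts are hypothetical. They were chosen only to illustrate how specificity can differ between age groups while sensitivity stays the same, and they do not come from the cited studies.

```python
def sensitivity_specificity(true_pos, false_neg, false_pos, true_neg):
    """Return (sensitivity, specificity) for a screening test.

    sensitivity = test positives among women with a high grade lesion
    specificity = test negatives among women without one
    """
    diseased = true_pos + false_neg
    healthy = false_pos + true_neg
    if diseased == 0 or healthy == 0:
        raise ValueError("need at least one woman with and one without disease")
    return true_pos / diseased, true_neg / healthy


# Hypothetical counts: many transient infections in younger women
# produce more false positives, lowering specificity.
younger = sensitivity_specificity(true_pos=18, false_neg=2, false_pos=160, true_neg=640)
older = sensitivity_specificity(true_pos=27, false_neg=3, false_pos=35, true_neg=665)
print("younger women:", younger)
print("older women:", older)
```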

Determinants of HPV infection

Epidemiological studies investigating risk factors for HPV infection have clearly and consistently shown that the key determinants among women are the number of sexual partners, the age at which sexual intercourse was initiated, and the likelihood that each sexual partner was an HPV carrier.103,105,135–141 These are lifelong behavioural traits, thus clearly preceding the development of cervical cancer.

The role of men as possible vectors of HPV was measured in the early epidemiological studies by questionnaires that asked about the sexual behaviour of the husbands or sexual partners of patients with cervical cancer and controls. In addition, more recent studies had the ability to measure HPV DNA in exfoliated cells from the penile shaft, the coronal sulcus, and the distal urethra.142–146

These and other studies consistently showed that the risk of cervical cancer for a given woman can be predicted by the sexual behaviour of her husband as much as her own sexual behaviour. In populations where female monogamy is dominant, the population of female sex workers plays an important role in the maintenance and transmission of HPV infections. Moreover, the probability that a woman is an HPV carrier and her risk of developing cervical cancer have been shown to be related to the presence of HPV DNA in the penis or the urethra of her husband or sexual partner.104,147–149 More recently, it has been possible to confirm that male circumcision protected men from being HPV carriers and their wives from developing cervical cancer.150 These observations confirmed, in terms of HPV infections, observations made over a century ago151 and a scientific hypothesis formulated almost 30 years ago that male sexual behaviour is a central determinant of the incidence of cervical cancer.152,153

In conclusion, the natural history studies of HPV infections satisfy in biological terms most of the observations that were historically linked to cervical cancer. In the past two decades, the cervical cancer puzzle has become a coherent description that includes the identification of HPV as the sexually transmitted aetiological agent and the characterisation of the major determinants of HPV acquisition.154

Temporality. Evaluation

HPV infections precede cervical precancerous lesions and cervical cancer by a substantial number of years. The epidemiology and the dynamics of HPV infection in populations satisfy previous observations that related cervical cancer to a sexually transmitted disease.

Biological gradient

This refers to the presence of a dose–response curve indicating that the magnitude of the exposure is related to the risk of disease. This requirement, largely supported by chemically induced models of carcinogenesis, is difficult to apply in models of viruses and cancer. For HPV DNA, it is difficult to measure viral load in relation to the DNA of the cancer cells in the specimen, although early studies tended to show a correlation between HPV DNA amount and disease status.127 Some recent publications have provided relevant evidence using real time PCR methods. A study that used a nested case–control design found that cases consistently had higher viral loads for HPV-16 than controls, and that high viral loads could be detected up to 13 years before the diagnosis of cervical cancer.155 Women with high viral loads for HPV-16 had a 30 fold greater risk of developing cervical cancer than did HPV negative women. This also applied to women under the age of 25. A related paper using the same population showed that the 20% of the population with the highest viral loads for HPV-16 had a 60 fold higher risk of developing carcinoma in situ when compared with HPV negative women.85 Of importance for clinical and screening purposes, another study confirmed that high viral loads predicted cervical lesions and, more interestingly, that the reduction of viral load or clearance of viral DNA in repeated visits predicted regression of CIN lesions to normalcy.156 These studies suggest that measuring viral load, at least of HPV-16, may distinguish between clinically relevant infections and those that are unlikely to progress. However, in contrast to the above results, one large prospective study in Portland, USA, using quantitative hybrid capture, did not find viral load to be a determinant of risk of future CIN 3 (A Lorincz et al, unpublished data, 2002). More research is needed to validate these methods and the results need to be extended and confirmed in clinical studies.157
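
A biological gradient of this kind is usually examined by computing a risk estimate for each exposure stratum against a common unexposed reference group and checking whether the estimates rise with exposure. The sketch below illustrates the idea with odds ratios by viral load stratum relative to HPV negative women. The function names, stratum labels, and counts are all hypothetical and do not reproduce the data of the studies cited.

```python
def gradient_odds_ratios(ref_cases, ref_controls, strata):
    """Odds ratio of each exposure stratum against the reference group.

    strata: list of (label, cases, controls), ordered from lowest to
    highest viral load. All counts must be positive.
    """
    results = []
    for label, cases, controls in strata:
        estimate = (cases * ref_controls) / (controls * ref_cases)
        results.append((label, estimate))
    return results


def is_strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical counts, for illustration only.
# Reference group: HPV negative women (20 cases, 200 controls).
strata = [("low load", 10, 50), ("medium load", 20, 40), ("high load", 30, 20)]
ors = gradient_odds_ratios(ref_cases=20, ref_controls=200, strata=strata)
has_gradient = is_strictly_increasing([estimate for _, estimate in ors])
```

A strictly increasing sequence is only a descriptive check. A formal assessment would add confidence intervals and a test for trend.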

Biological gradient. Evaluation

The risk of cervical cancer may be related to estimates of viral load. The technology to estimate viral load is being developed and compliance with the biological gradient requirement needs to be further validated.

Biological plausibility and coherence

The mechanisms by which HPV induces cancer in humans and the molecular genetics of the process are being investigated intensively and excellent reviews are readily available.12,16,17,158–161 These investigations provide additional evidence on the causality of the association by describing viral and host interactions leading to cell transformation and malignancy. Of the criteria outlined in table 1, both the “biological plausibility” and understanding of the “mechanisms” are in rapid expansion as a consequence of developments in molecular methods and technology.

Figure 10 shows in a schematic manner some of the major components of the transition from HPV infection to cervical cancer. Whereas transient infections are largely subclinical, progression is closely related to the persistence of viral DNA. This process is frequently accompanied by viral disruption in the E1/E2 regions and integration into the cellular DNA. E2 disruption releases the viral promoters of E6 and E7 and increases the expression of these transforming genes. The E6 and E7 viral proteins are capable of selectively degrading the p53 and retinoblastoma gene (RB) products, respectively, leading to inactivation of two important cellular negative regulatory proteins.

Figure 10

Mechanisms of human papillomavirus (HPV) carcinogenesis. HSIL, high grade squamous intraepithelial lesion; LSIL, low grade squamous intraepithelial lesion; RB, retinoblastoma gene.

Some characteristics that provide support for the role of HPV in the induction of cervical cancer were recently outlined.17 Accordingly, the causal nature of this association is indicated by: (1) the regular presence of HPV DNA in the neoplastic cells of tumour biopsy specimens; (2) the demonstration of viral oncogene expression (E6 and E7) in tumour material (but not in stromal cells); (3) the transforming properties of these genes (E6 and E7); (4) the requirement for E6 and E7 expression to maintain the malignant phenotype of cervical carcinoma cell lines; (5) interaction of viral oncoproteins with growth regulating host cell proteins; and (6) epidemiological studies pointing at these HPV infections as the major risk factors for cervical cancer development.

In their review, Shah and Howley16 provided references to some of the key experiments that exemplify most of the requirements indicated by zur Hausen,17 namely: (1) The genomes of HPV-16 and HPV-18 are capable of immortalising human keratinocytes in cell culture, whereas the DNA of the low risk HPV types (6/11) does not.162 (2) In raft cultures, the oncogenes of the high risk HPV types induce morphological changes that closely resemble preinvasive cervical lesions.163,164 (3) In HPV associated lesions, the viral genome is present in every cell and is always transcriptionally active.165 (4) The viral genome is present in the original tumour and in metastases.166 (5) Most of the cell lines established from cervical cancer contain either HPV-16 or HPV-18 genomes.167 (6) The pattern of transcription changes as the lesion increases in severity. All open reading frames (ORFs) are expressed in early lesions but the expression of ORFs E4 and E5 is not found in many invasive cancers.165 (7) The E6 and E7 ORFs contain the transforming ability of HPV. These are always intact and are consistently expressed in cervical cancer cell lines, in cells transformed by HPV, and in HPV associated cancer tissue. They are transcribed at higher levels in high grade lesions than in low grade lesions.165,168 (8) In most cell lines and in many HPV associated cancers, the HPV DNA is integrated into the cellular DNA. HPV-18 is nearly always integrated, whereas HPV-16 can be found episomally or in the integrated form.76,169,170

In reviewing work on the molecular genetics of cervical carcinoma, Lazo indicated different mechanisms of cancer induction. The effects of E6 and E7 on host regulatory proteins can be considered to be HPV related mechanisms. An additional effect could be expected from the consequences of viral integration and the specific impact on the integration sites. The third mechanism, which may or may not be related to HPV, is the accumulation of the cellular genetic damage needed for tumour development. The existence of this mechanism is strongly suggested by the observations of recurrent losses of heterozygosity and by recurrent amplifications in a large fraction of cervical carcinomas.160 The role of non-identified tumour suppressor genes is also suggested by experiments showing that the tumorigenicity of HeLa cells could be suppressed by fusion with normal fibroblasts or keratinocytes, and that the tumorigenicity of SiHa cells was suppressed by the introduction of chromosome 11 via microcell transfer technology.171–173 Similarly, the immortality of HeLa and SiHa cells was suppressed by the introduction of chromosomes 3, 4, and 6.174,175

Although a review of the field is far from the purposes of this discussion, it seems quite clear that the biology of cervical cancer in relation to HPV has become a paradigm of viral mediated oncogenesis. The work being regularly published has clearly shown that the viral DNA detected by epidemiological studies is not a passenger infection of the cancerous tissue, but a biologically meaningful association.

Biological plausibility and coherence. Evaluation

The association of HPV DNA in cervical specimens and cervical cancer is plausible and coherent with previous knowledge. This includes in vitro experiments, animal experiments, and observations in humans. Novel criteria of causality are being proposed and tested as molecular technology develops and is introduced into epidemiological research protocols.

Biological mechanisms of HPV carcinogenesis

In previous decades, our understanding of cancer pathways was rudimentary and often incorrect. In the face of such uncertainty, arguments based on assumptions of molecular biology were not particularly convincing. However, with the large body of work now available it is possible to develop a reasonable understanding of the ways in which cancer may develop and ways in which HPV infection can drive the process. Thus, we can say with some confidence that it is plausible for HPV to caus