The clinical relevance of detection of minimal residual disease in childhood acute lymphoblastic leukaemia
  1. J Moppett,
  2. G A A Burke,
  3. C G Steward,
  4. A Oakhill,
  5. N J Goulden
  1. Department of Paediatric Oncology and Haematology, Bristol Royal Hospital for Children, Bristol BS2 8JD, UK
  1. Correspondence to:
 Dr N J Goulden, Department of Paediatric Oncology and Haematology, Bristol Royal Hospital for Children, Upper Maudlin Street, Bristol BS2 8JD, UK; 
 nick.goulden{at}ubht.swest.nhs.uk

Abstract

Risk directed treatment forms a central component of modern protocols for childhood acute lymphoblastic leukaemia (ALL). A review of recent studies of minimal residual disease (MRD) analysis shows that it is a powerful prognostic factor in both first line and relapse treatment. However, the value of MRD analysis is both time point and protocol specific, and the threshold for MRD detection of the technique used impacts upon the results obtained. MRD analysis does have a useful role to play in the risk directed treatment of childhood ALL, and this is currently being investigated in large prospective studies.

  • children
  • lymphoblastic leukaemia
  • minimal residual disease
  • ALL, acute lymphoblastic leukaemia
  • BFM, Berlin–Frankfurt–Munster
  • BMT, bone marrow transplantation
  • CR2, second complete remission
  • EORTC, European Organisation for Research and Treatment of Cancer
  • EFS, event free survival
  • FACS, fluorescence activated cell sorter
  • MRD, minimal residual disease
  • PCR, polymerase chain reaction
  • RFS, relapse free survival
  • WCC, white blood cell count

With modern treatment, over 95% of children suffering from acute lymphoblastic leukaemia (ALL) achieve a remission defined by light microscopy as the presence of less than 5% blasts in the bone marrow.1 Unfortunately, approximately 25% of patients will subsequently relapse.1 In almost all cases, relapse is mediated by a clone that is either identical or related to the disease present at diagnosis,2 demonstrating that the leukaemia has persisted through treatment at levels below the detection limit of the light microscope. This is not surprising when one considers that at the time of diagnosis the theoretical leukaemia burden is in the order of 10^12 cells,3 so that there may be up to 10^10 residual leukaemic blasts in a remission bone marrow. This submicroscopic disease is commonly termed minimal residual disease (MRD).

This review will focus primarily on the use of MRD measurement as a prognostic tool in ALL. A review of current (non-MRD based) approaches to risk directed treatment will be followed by a discussion of clinically relevant technical issues. Results of clinical studies will then be presented. The final section of this article discusses how MRD assessment is being integrated into new trials of treatment for ALL and highlights areas for future development of MRD research.

RISK DIRECTED TREATMENT FOR ALL

The stepwise improvement in the overall prognosis of childhood ALL seen over the past 40 years has been paralleled by the use of increasingly intensive treatment.1,4 As a consequence, many of those cured with modern treatment are being over treated.5–7 In contrast, up to 25% of children relapse and approximately half of those will go on to die of disease: ALL is still the most common cause of death from cancer in childhood. The search for reliable prognostic factors that will allow the greatest number of children to be cured with the minimum toxicity (commonly termed risk directed treatment) has been a holy grail of leukaemia trials. Traditionally, risk directed protocols have relied on simple readily measurable factors such as presenting white blood cell count (WCC), age, sex, bulk of disease, and cytogenetic abnormalities.5–7 Most modern protocols combine this with the morphological assessment of early response to treatment.6

There is now wide consensus that male sex, WCC > 50 × 10^9/litre at diagnosis, age < 1 or > 10 years, and the presence of the Philadelphia chromosome or true hypodiploidy are associated with a high risk of relapse.5–7 In addition, almost all studies have shown that for children with otherwise homogeneous features receiving identical treatment there is a strong correlation between slow early response and a higher risk of relapse. Crucially, the Children’s Cancer Group has recently demonstrated that the prognosis of a minority of such slow responders can be improved by switching to a more intensive regimen.8 Unfortunately, most children destined to relapse lack high risk features at diagnosis and have a good early response as defined by morphology. Moreover, it is a recurrent feature of successive trials that the outlook for previously distinct subgroups converges with increasingly intensive treatment. Finally, it is important to note that for any given subgroup any small improvement in overall prognosis engendered by a more aggressive regimen is accompanied by over treatment of most of those who are cured.

MRD analysis is simply a tool to extend the well known correlation between prognosis and disease response. It is reasonable to think of MRD measurement as a molecular or immunophenotypical microscope. We and others believe that the introduction of sensitive reproducible techniques for the measurement of MRD could revolutionise the approach to risk directed treatment. Several studies have now shown that MRD analysis can define prognosis with more precision than the factors discussed above.9,10 Early response as assessed by MRD analysis is most important.11,12 Augmentation (or reduction) of treatment on the basis of MRD assessment is the next challenge for the treatment of ALL. In addition, MRD assessment, unlike all of the other risk factors identified, has prognostic relevance in the setting of relapse.13,14

TECHNICAL CONSIDERATIONS

In our previous review for this journal,15 we suggested that clinically useful markers of MRD must be widely applicable, stable during the course of the disease, specific, and sufficiently sensitive to predict outcome. As MRD moves into the clinical arena, we should now also specify that all techniques must be amenable to quality assurance and should be economically viable. Currently, polymerase chain reaction (PCR) analysis of antigen receptor gene rearrangements and flow cytometry (reviewed in Campana and Coustan-Smith16) represent the most clinically useful targets for MRD analysis (table 1).

Table 1

Comparison of clinically useful methods for minimal residual disease analysis

Gene rearrangement PCR relies on the identification of leukaemia specific rearrangements in a remission bone marrow. The simplest method defines the presence of MRD on the basis of a band of the same size as that seen at diagnosis after PCR of bone marrow DNA. Sensitivity can be improved by the use of fluorescent primers. This so called fluorescent gene scanning is capable of detecting one leukaemic cell in 1000 normal cells (10^−3). It is a simple, rapid method that is particularly useful for the identification of the persistence of a relatively high degree of MRD early in treatment and is currently used to define children at very high risk of relapse by the European Organisation for Research and Treatment of Cancer (EORTC).17 This method is not truly quantitative and it is therefore not possible to standardise sensitivity.

Allele specific approaches rely on detection of the leukaemia specific junctional sequence in the DNA of remission bone marrow. Until the advent of real time PCR, most groups preferred to use junction specific radiolabelled probes rather than allele specific primers.10 More recently, the development of Taqman and fluorescence resonance energy transfer technologies has led to widespread adoption of allele specific priming.18 Real time quantitative (RQ) antigen receptor PCR is routinely capable of detecting one leukaemic cell in 10 000 normal cells (10^−4). Moreover, because the method is truly quantitative, sensitivity can be standardised and levels of disease correlated with prognosis defined. These complex allele specific approaches are particularly relevant to protocols that aim to reduce the amount of treatment for those children defined as having had a very good early response to treatment, as in the current BFM study.

Flow cytometry relies on the detection of qualitative and quantitative differences of antigen expression between leukaemic cells and their normal counterparts. Results are available within a few hours of receipt of the sample and are quantitative. It is important to note that methods of quantitation are not yet standardised. The simplest methods use a limited panel of antibodies and three colour fluorescence activated cell sorter (FACS) analyses. In general, these methods19 are capable of a sensitivity of 10^−3. Coustan-Smith et al have reported a routine sensitivity of 10^−4 using four colour flow cytometry.20 To date, only one study has directly compared the results generated by PCR of gene rearrangements and flow cytometry.21 Here, Neale et al showed an excellent correlation between allele specific immunoglobulin heavy chain PCR and the four colour flow system.

Concerns about the instability of MRD markers leading to false negative results were first raised more than a decade ago.2 There is now good evidence that if multiple targets are used (that is, two antigen receptor loci or several immunophenotypic combinations) the risk of false negative results can be reduced to less than 5%.22 It is also important to note that even with these very sensitive tests a negative result does not imply the clearance of all residual disease. Indeed, a whole body tumour load of 10 million cells (10^7) is undetectable by current technology.

Perhaps the most encouraging development over the past decade has been the development of programmes designed to standardise methods for the detection of MRD. This has been led by the Biomed consortium.21 More recently the formation of the European study group on MRD in ALL has involved 20 laboratories in seven different countries in quality assurance rounds and technical workshops. This standardisation is fundamental to successful inclusion of MRD in clinical protocols.

CLINICAL RELEVANCE OF MRD IN ALL

Current evidence of the usefulness of MRD analysis

Early studies investigating the clinical relevance of MRD in ALL produced conflicting results. This was largely because of poor study design and is a caveat that should be borne in mind when the clinical validity of any new technology is being explored. The publication of prospective blinded studies of homogeneous groups of patients has provided much greater consensus.9,10,23 In addition, these studies have confirmed that the clinical relevance of MRD is a function of the technique used to measure MRD, the timing of measurement, and the treatment protocol.

First line treatment

Several prospective studies have now shown that MRD analysis during the first months of treatment can predict outcome within groups of children with homogeneous clinical risk features receiving identical chemotherapy.9–11,23 Perhaps the most widely discussed study in Europe is that published by Van Dongen et al.10 Here, PCR of antigen receptor genes and subsequent allele specific oligoprobing were used to measure MRD in a cohort of children treated according to the BFM 90 protocol. This risk directed protocol stratified treatment according to leukaemic cell mass, immunophenotype, the presence of the Philadelphia translocation, and the response to seven days of prednisolone. Bone marrow from a cohort of 129 patients was examined for the presence of MRD at both one and three months after diagnosis. The distribution of clinically assigned risk groups and outcome did not differ significantly between children in whom MRD was assessed and the overall population receiving treatment. In each case, at least one marker of MRD with a minimum sensitivity of detection of one leukaemic cell in 10 000 normal ones was used. Three MRD based prognostic groups could be identified. A sizeable low risk group of 55 patients (43%) who were MRD negative (< 10^−4) at both time points had a three year relapse rate of 2%. These were drawn from the clinical standard risk (22%) and medium risk (78%) groups. In contrast, in the 19 high risk patients, MRD remained detectable at or above 10^−3 at both time points. These children, who were drawn in roughly equal numbers from the clinical medium and high risk groups, had a 75% relapse rate at three years. An intermediate group comprising all other patients had an intermediate prognosis, with a three year event free survival (EFS) of 77%. However, this group could be further subdivided according to MRD status at one year of treatment into those who were MRD negative (< 10^−4) and those who were MRD positive, with three year relapse free survival (RFS) rates of 90% and 39%, respectively.10 As can be seen from table 2, in the BFM 90 study MRD based analysis was of more predictive value than clinical risk allocation.
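The three group rule described above can be written down as a short sketch. This is an illustration only, not the study's own software: the function name, the enum, and the convention of passing MRD as a fraction of leukaemic cells (with 0.0 standing for "undetectable") are assumptions made for the example. Only the two cut-offs, 10^−4 and 10^−3, come from the text.

```python
from enum import Enum


class MrdRisk(Enum):
    LOW = "low"
    INTERMEDIATE = "intermediate"
    HIGH = "high"


# Cut-offs taken from the text: "MRD negative" means below 1e-4,
# and the high risk group had MRD at or above 1e-3.
NEGATIVE_BELOW = 1e-4
HIGH_LEVEL_FROM = 1e-3


def classify(mrd_month1: float, mrd_month3: float) -> MrdRisk:
    """Assign an MRD based risk group from two bone marrow time points.

    Both arguments are hypothetical inputs: the MRD level expressed as
    leukaemic cells per normal cell (0.0 when nothing was detected).
    """
    levels = (mrd_month1, mrd_month3)
    if all(level < NEGATIVE_BELOW for level in levels):
        return MrdRisk.LOW           # negative at both time points
    if all(level >= HIGH_LEVEL_FROM for level in levels):
        return MrdRisk.HIGH          # high level MRD at both time points
    return MrdRisk.INTERMEDIATE      # every other combination


if __name__ == "__main__":
    print(classify(0.0, 0.0).value)    # negative twice
    print(classify(1e-2, 2e-3).value)  # high level twice
    print(classify(1e-2, 0.0).value)   # cleared by month three
```

The sketch deliberately leaves out the later subdivision of the intermediate group at one year, which would need a third measurement.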

Table 2

Comparison of clinical and MRD based risk groups in the study by Van Dongen and colleagues10

Other studies have concentrated on the predictive value of MRD analysis at the end of induction.9 Cave et al measured MRD at the end of induction in 178 children on the EORTC 58881 protocol using a competitive antigen receptor gene allele specific PCR (sensitivity 5 × 10^−5). In this study, the MRD negative low risk group had a 92% four year RFS, compared with a 60% four year RFS for the MRD positive patients. Within the MRD positive group, there was a clear tendency to increased relapse risk with increased disease level.9 Nyvold et al have recently reported antigen receptor gene allele specific PCR MRD assessment in the Nordic Organisation for Paediatric Haematology and Oncology ALL MRD-95 study.24 Of the 100 children who had MRD analysis at day 29 (end of induction), 40 had MRD < 10^−4. None of these children has relapsed to date, compared with a seven year EFS of 52% for those with MRD > 10^−4.

Immunophenotypical MRD analysis has also been shown to provide independent prognostic information.20,23 Initially, reports of four colour FACS by Coustan-Smith et al suggested that single time point MRD analysis at week 14 (equivalent to time point 2 in the study by Van Dongen et al) was the most predictive. Here, the RFS of the MRD negative group (< 10^−4) at three years was 93.4%, and for the MRD positive group 57.9%. When the study was extended and re-reported in 2000, a dual time point analysis similar to that published by Van Dongen et al showed a 32% RFS for their high risk group (MRD positive at both time points), compared with a 90% three year RFS for their low risk group (MRD negative by week 14). Dworzak et al, using a (three colour) immunophenotypical method of MRD analysis, but reporting results in terms of absolute blast count, showed that the presence of blasts detectable at > 10 blasts/μl at day 33 and > 1 blast/μl at week 12 is associated with a 0% RFS compared with a 94% RFS for all others.19

However, the broad similarity of all of the above results (molecular and immunophenotypical) does hide some important differences. For example, the relapse risk for the MRD based low risk group in the papers by Coustan-Smith et al is five times that of the comparable group in the study by Van Dongen et al.10,20,23 Relapses from the low risk group represent 4% of all relapses in the study by Van Dongen et al, compared with 43% in the papers by Coustan-Smith et al. There are two plausible reasons why such discrepancies may arise.

First, assuming that biologically similar disease is being assessed at an identical time point, it may be that a true treatment based difference in the levels of disease reduction achieved is being seen. This treatment effect is exemplified by the studies of Zur Stadt et al and Gruhn et al.25,26 In the studies by Cave et al and Van Dongen et al,9,10 high level MRD (⩾ 10^−2) was found at the end (week 5) of a four drug induction in 15 of 133 (11%) and 27 of 169 (16%) patients, respectively. This level of MRD positivity was associated with relapse rates of 73% and 74%, respectively. However, the study reported by Zur Stadt et al found high level MRD (⩾ 10^−2) in more children (20 of 76) treated with a three drug induction (omitting asparaginase), only five of whom relapsed.25 In contrast, the study reported by Gruhn et al, on the St Jude’s experience with Total therapy XII and XIIIA, showed that with an intensive six drug induction regimen, only seven of 26 patients had detectable (low level) MRD at the end of induction (day 43 bone marrow).26 Four of these seven, who all had disease measured at > 2 × 10^−5, suffered leukaemic relapse, whereas the other three with disease < 2 × 10^−5 remained in continuous complete remission. These studies confirm that it is previous treatment that determines the proportions of patients who will be found with a particular level of MRD, but that it is the entire therapeutic protocol that dictates the ultimate prognosis.
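The counts quoted in the paragraph above are easier to compare once they are all turned into percentages. The short sketch below does only that arithmetic. The helper name and the labels are illustrative, and every number is one of the counts given in the text.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# (patients with the finding, patients assessed), as quoted in the text
high_level_mrd_after_induction = {
    "Cave et al, four drug induction": (15, 133),
    "Van Dongen et al, four drug induction": (27, 169),
    "Zur Stadt et al, three drug induction": (20, 76),
}
for study, (positive, assessed) in high_level_mrd_after_induction.items():
    print(f"{study}: {percent(positive, assessed)}% with MRD >= 10^-2")

# Relapses among the Zur Stadt patients with high level MRD (5 of 20)
print(f"Zur Stadt et al relapse rate: {percent(5, 20)}%")

# Gruhn et al: any detectable MRD after a six drug induction (7 of 26),
# and relapses among those seven patients (4 of 7)
print(f"Gruhn et al detectable MRD: {percent(7, 26)}%")
print(f"Gruhn et al relapses among MRD positive: {percent(4, 7)}%")
```

On these figures, high level MRD was more common after the three drug induction, yet about a quarter of those patients relapsed, against roughly three quarters in the four drug studies. That is the protocol dependence the paragraph describes.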

However, it is plausible that subtle differences in the sensitivity of the MRD analysis performed also contribute to the discrepancies seen. The combined data from all the above studies suggest a positively skewed, bell shaped distribution of disease level once treatment has begun (fig 1). A minority of patients with persistent high level disease represent the positively skewed tail. The bulk of patients have a narrow distribution of disease level between 10^−3 and 10^−5 (the lower limit of assay sensitivity). Within this large body of patients, a change in assay sensitivity too small to detect (sensitivities are reported as whole log integers) will have large effects on the proportion of patients detected and the relapse risk identified. A scenario where the technique used by Van Dongen et al is only slightly more sensitive overall than that of Coustan-Smith et al (for example, 10^−4.3 versus 10^−4.0) would lead to exactly the results seen.10,23 The technique used by Van Dongen et al will exclude some children with low level MRD positivity from the low risk group who would be included by the method used by Coustan-Smith et al. This would lead to a smaller low risk group, with a better RFS (Coustan-Smith’s low risk group: 70% of patients, RFS 90%, represents 43% of relapses; Van Dongen’s low risk group: 42.5% of patients, 98% RFS, represents 4% of relapses).

Figure 1

A representative (and approximate) distribution of disease levels at the end of induction. It can be seen that small changes in assay threshold have large effects on the proportion of patients below that threshold. Percentages represent approximate proportions of the population at risk under two thresholds, 10^−4 and 10^−4.3.
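A small numerical sketch can make the threshold effect concrete. It assumes, purely for illustration, that log10 of the MRD level at the end of induction follows a normal distribution with mean −4.2 and standard deviation 0.42. These parameters are hypothetical, are not taken from any of the studies, and ignore the positive skew described in the text. The sketch only shows how much the MRD negative fraction can move when the threshold shifts by 0.3 log.

```python
from statistics import NormalDist

# Hypothetical distribution of log10(MRD level) at the end of induction.
# Mean and standard deviation are illustrative assumptions, not study data.
ASSUMED_LOG10_MRD = NormalDist(mu=-4.2, sigma=0.42)


def fraction_below(log10_threshold: float) -> float:
    """Share of patients whose MRD falls below the assay threshold."""
    return ASSUMED_LOG10_MRD.cdf(log10_threshold)


for log10_threshold in (-4.0, -4.3):
    share = fraction_below(log10_threshold)
    print(f"threshold 10^{log10_threshold}: {share:.0%} classed as MRD negative")
```

Because the bulk of the assumed distribution sits close to the threshold, a shift of 0.3 log moves a large share of patients from one side to the other. A shift that small would be hidden when sensitivities are reported as whole log values.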

Children undergoing bone marrow transplantation for ALL

Bone marrow transplantation (BMT) is reserved for the treatment of a minority of children in first remission and many of those who suffer a bone marrow relapse. Three studies have highlighted the prognostic value of MRD measurement immediately before conditioning for BMT.13,27,28 Critics may argue that the comparison of the results from these is invalidated by the notorious difficulty caused by variable transplant protocols. We would counter that the reverse is true: the application of standardised MRD measurement in this situation may finally provide an insight into the relative value of each approach.

Table 3 details three publications containing data on 80 patients transplanted in second complete remission (CR2) for ALL.13,27,28 Knechtli et al report a uniform group of children undergoing T cell depleted allogeneic transplantation.13 For CR2 patients in that study, the detection of high level MRD (> 10^−3) before transplantation (nine of 39) was universally associated with relapse, whereas the RFS for those who were low level (< 10^−3 to 10^−5) MRD negative (24 of 39) was 73%. In contrast, Uzunel et al (reporting a mixed group of children and adults) and Bader et al report a majority of T replete transplants (which may confer a lower risk of relapse, although not better overall survival). In the papers by Uzunel et al and Bader et al, those who survived in complete clinical remission despite high level MRD before transplantation all suffered either graft versus host disease or, in the case of one child, were treated with pre-emptive immunotherapy for progressive mixed chimaerism.

Table 3

A comparison of studies examining the relation between pre-BMT MRD levels and subsequent relapse

Caveats in the interpretation of clinical studies

Although it is our belief that the introduction of standardised MRD measurement will radically improve the accuracy of risk directed treatment, others have pointed out shortcomings in the design of the studies discussed above. Even the largest has only examined a proportion of the patients enrolled on the relevant parallel clinical study. Consequently, none has given a true representation of the clinical usefulness of MRD analysis. In contrast, the population based study of Levett et al highlights the logistic difficulties encountered in truly inclusive studies.29 An additional concern is that none of the clinically significant thresholds determined in these studies has been tested in another data set separate from that in which the hypothesis was generated. On the basis of this, Donadieu and Hill have questioned whether, in the absence of large, blinded, prospective studies where MRD methodology is standardised and quality controlled, and hypotheses and detection thresholds determined in advance, MRD analysis should be used at all.30

Ultimately, the drive to include MRD studies in the stratification of treatment for ALL is derived from an understandable desire to optimise risk directed treatment as soon as possible. Thus, although purists may argue that inclusive blinded studies should confirm the findings of small pilots before interventional studies, it is unlikely that this will happen. Instead, such non-interventional analysis will be derived from the control arms of randomised interventional protocols, such as that currently being conducted by the BFM, in which hypotheses and thresholds have been set in advance.

INTEGRATION OF MRD ANALYSIS INTO CLINICAL TRIALS AND FUTURE PERSPECTIVES

Although the technology for measurement of MRD has been available for some time, the uptake into clinical trials has not been rapid. This reflects the need to establish the relevance of MRD results for each protocol and to standardise methodology. Several MRD based risk stratification trials are now running in Europe and the USA. MRD based stratification of treatment is also planned for the next UK study. There are important logistical and economic implications (reviewed in Goulden and colleagues4).

Although it is hoped that the early detection of a suboptimal response will allow intensified treatment to lead to improved survival, no child has yet been cured because of an MRD test. Indeed, MRD is simply providing a stimulus to the development of novel treatments and greater collaboration. To this end, it will be important to establish the correlation between MRD and well established prognostic factors, in addition to newer ones, such as microarrays,31 drug sensitivity, blast glutathione content,32 and methylation status of the p21(CIP1/WAF1/SDI1) gene.33

Beyond this, the future of MRD research must explore the biology of the residual leukaemia. In this regard, flow cytometry with the possibility of sorting MRD clones offers obvious advantages over other methods of MRD detection. The difficulties in obtaining sufficient cells from small clinical samples must be overcome and a new level of partnership between physicians, scientists, children with ALL, and their families must be established to allow more sampling in “remission”, with the aim of understanding the mechanisms by which MRD persists.

Take home messages

  • Several large retrospective studies have shown that minimal residual disease (MRD) analysis is a powerful prognostic factor in both first line and relapse treatment for childhood acute lymphoblastic leukaemia

  • The predictive value of an MRD test is both protocol and time point specific

  • Different levels of MRD have different prognostic values

  • Prospective studies are now under way to investigate the efficacy of treatment directed according to relapse risk as predicted by MRD analysis

To date, MRD detection has yielded a wealth of vital clinically relevant information that is now beginning to impact on therapeutic decisions in both first line and salvage treatment. In the future, the application of rigorously controlled MRD analysis should guide the development of new treatments to improve the outcome for the 20% of children with ALL who do not yet feel the benefit of the advances made in the cure of their disease. Just as importantly, MRD may also be used to reverse the trend of the past 40 years and explore disintensification of treatment for those with good risk.

REFERENCES
