Proteomic characterisations of ulcerative colitis endoscopic biopsies associate with clinically relevant histological measurements of disease severity
  1. Aaron M Gruver1,
  2. Matt D Westfall2,
  3. Bradley L Ackermann1,
  4. Salisha Hill2,
  5. Ryan D Morrison2,
  6. Juraj Bodo3,
  7. Keith K Lai3,
  8. David C Gemperline1,
  9. Eric D Hsi3,
  10. Daniel C Liebler2,
  11. Jochen Schmitz1,
  12. Robert J Benschop1
  1. 1Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana, USA
  2. 2Protypia, Inc, Nashville, Tennessee, USA
  3. 3Department of Laboratory Medicine, Cleveland Clinic, Cleveland, Ohio, USA
  1. Correspondence to Dr Aaron M Gruver, Eli Lilly and Company, Indianapolis, Indiana, USA; gruver_aaron_m{at}


Aims and methods Accurate protein measurements using formalin-fixed biopsies are needed to improve disease characterisation. This feasibility study used targeted and global mass spectrometry (MS) to interrogate a spectrum of disease severities using 19 ulcerative colitis (UC) biopsies.

Results Targeted assays for CD8, CD19, CD132 (interleukin-2 receptor subunit gamma/common cytokine receptor gamma chain), FOXP3 (forkhead box P3) and IL17RA (interleukin 17 receptor A) were successful; however, assays for IL17A (interleukin 17A), IL23 (p19) (interleukin 23, alpha subunit p19) and IL23R (interleukin 23 receptor) did not permit target detection. Global proteome analysis (4200 total proteins) was performed to identify pathways associated with UC progression. Positive correlation was observed between histological scores indicating active colitis and neutrophil-related measurements (R²=0.42–0.72); inverse relationships were detected with cell junction targets (R²=0.49–0.71) and β-catenin (R²=0.51–0.55) attributed to crypt disruption. An exploratory accuracy assessment with Geboes Score and Robarts Histopathology Index cut-offs produced sensitivities/specificities of 72.7%/75.0% and 100.0%/81.8%, respectively.

Conclusions Pathologist-guided MS assessments provide a complementary approach to histological scoring systems. Additional studies are indicated to verify the utility of this novel approach.

  • colitis
  • proteins
  • inflammatory bowel diseases

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Ulcerative colitis (UC) is a chronic disease of unknown cause characterised by inflammation of the colon. Subjects experience periods of remission interspersed with intermittent disease flares which reduce overall quality of life. Many patients with UC experience a severe clinical course and approximately 30% require colectomy within 10 years of diagnosis.1 Endoscopic surveillance of the gastrointestinal (GI) tract is part of the standard of care, and during these procedures, biopsies are routinely taken for histopathological evaluation. Morphology-based measurements that characterise the extent of disease activity have been developed as tools to assess response to targeted therapies in clinical trials.2 3

Various biologic therapies that target specific immunological pathways have been studied as potential therapeutics for UC including those that block the interleukin-23 (IL-23)/T helper 17/IL-17 immune axis.4 5 Mechanisms involving neutrophil migration, epithelial barrier restitution and tight junction protein regulation have been hypothesised as contributing to the pathogenesis of UC.6 7 Assays measuring related targets, such as claudin RNA levels, have been proposed in support of therapeutic studies.8 However, gene expression studies may not be a reliable predictor of protein levels since transcript and protein do not always correlate.9

New technical approaches to simultaneously assess multiple protein targets from formalin-fixed, paraffin-embedded (FFPE) GI biopsies are needed to maximise development of UC therapies and to enhance understanding of disease pathogenesis. Although proteomics has previously been used to characterise UC biopsy specimens,10 most studies have not attempted to correlate observed protein abundances to accepted histopathological scoring methods of UC disease assessment. Except for a recent publication, previous methods have used fresh-frozen tissue rather than FFPE, which limits adoption for retrospective clinical studies where formalin-fixed tissues have been collected.11 Here, we evaluate a proteomic-based approach to characterise routine endoscopic tissue collections and determine how these results compare to established pathological assessments. To our knowledge, this pilot study is the first to compare the correlation of mass spectrometry (MS) measurements directly obtained from FFPE biopsies to relevant histological measurements of UC disease severity.


Case selection and histological assessments

Cases diagnosed as UC (2015–2018) were retrieved from the Cleveland Clinic pathology archives. Twenty-two endoscopic biopsies were identified. Pre-existing H&E slides and prior diagnoses were reviewed by pathologists (AMG, EDH) to confirm case suitability. No information regarding Robarts Histopathology Index (RHI) and Geboes Score (GS) was available during case selection. Cases with insufficient tissue remaining to perform additional tests were excluded. A third pathologist, specialising in GI and hepatobiliary pathology (KKL), independently assessed the H&E slides without knowledge of the overall experimental design or associated proteomic results. RHI and GS were performed as described.2 3

Whole slide imaging and biopsy area assessments

H&E tissue sections were scanned using an Aperio slide scanner (Leica Biosystems). Pathologist-assisted manual annotations were used to identify biopsy boundaries. Tissue areas were calculated using Aperio ImageScope V. (Leica Biosystems).

Proteomic analysis and data visualisation

FFPE tissue samples were randomised and provided for analysis without knowledge of associated histological assessments. Fifteen 5 µm tissue sections were provided per sample for protein extraction, except for 3 samples that contained insufficient tissue. Tissue sections were deparaffinised and rehydrated as reported previously.12 Subsequent targeted and global proteome analyses were performed essentially as described, respectively.13 14 Identified proteins were filtered to examine biologically relevant associations. Data were visualised using R (V.4.0.3) with the libraries ggplot2 and tidyHeatmap.15

Statistical analysis

Correlations and p values were calculated using the rcorr function from the Hmisc library in R. P values were adjusted for multiplicity using the Benjamini-Hochberg method.16 For plotting purposes, the R function lm was used to display a line of best fit for correlation. Categorical myeloperoxidase (MPO) values (positive result: >log2−0.5; negative result: <log2−0.5) were compared using a 2×2 test with assigned GS (positive result: ≥4.1; negative result: <4.1) and RHI (positive result: ≥16; negative result: <16) to produce sensitivity and specificity calculations.
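The correlation and multiplicity steps described above can be illustrated with a minimal sketch. The study used R (rcorr and the Benjamini-Hochberg method); the Python below is a swapped-in illustration rather than the authors' code, the function names pearson_r and benjamini_hochberg are hypothetical, and all input values are invented for demonstration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical values for illustration only (not data from the study).
rhi = [0, 4, 9, 14, 23, 30]
mpo_log2 = [-2.1, -1.4, -0.9, -0.2, 0.8, 1.5]
r_squared = pearson_r(rhi, mpo_log2) ** 2

# Hypothetical raw p values, one per protein tested.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Squaring the Pearson coefficient gives the R² of a simple linear fit, which is the quantity reported alongside the fitted lines in the figures.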


Visual qualitative assessments of the H&E-stained tissues were performed to generate a cohort comprising a spectrum of UC disease activity (figure 1). Of the original 22 biopsies selected, 19 (86.4%) produced adequate amounts of protein for global protein analysis. Biopsy areas, measured from H&E-stained section images, ranged from 4.5 to 23.2 mm² (average=12.1 mm²; SD=5.3 mm²). Comparative analysis revealed that 100% of specimens containing areas >7.0 mm² produced adequate protein for proteomic analysis. Independent GS and RHI assessments were performed without prior knowledge of the experimental design or MS data collection (table 1). Because higher GS and RHI scores reflect acute inflammation severity, initial comparisons focused on correlating these with global proteomic MPO measurements (figure 2). While both scoring systems demonstrated positive correlation, linear correlation of ordinal GS measurements was not as strong as the continuous RHI values (R²=0.530 vs 0.718, respectively; false discovery rate adjusted p value=0.002 vs 0.014, respectively). An exploratory accuracy assessment, using categorical MPO values with assigned GS and RHI cut-offs, produced sensitivities and specificities of 72.7%/75.0% for GS and 100.0%/81.8% for RHI, respectively.
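The 2×2 accuracy assessment can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' analysis: the MPO cut-off is read here as a log2 value of −0.5, the function names are hypothetical, the paired values are invented, and the 2×2 counts are hypothetical figures chosen only because they reproduce the reported percentages for 19 biopsies (the underlying tables are not given in the text).

```python
MPO_CUTOFF_LOG2 = -0.5  # assumed reading of the ">log2-0.5" threshold
RHI_CUTOFF = 16         # RHI positive result: >= 16


def tally(test_positive, reference_positive):
    """Count (TP, FN, TN, FP) from paired boolean calls."""
    pairs = list(zip(test_positive, reference_positive))
    tp = sum(t and r for t, r in pairs)
    fn = sum((not t) and r for t, r in pairs)
    tn = sum((not t) and (not r) for t, r in pairs)
    fp = sum(t and (not r) for t, r in pairs)
    return tp, fn, tn, fp


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired measurements for illustration only.
mpo_log2 = [-1.2, 0.3, 0.9, -0.7]
rhi = [2, 20, 14, 18]
counts = tally([m > MPO_CUTOFF_LOG2 for m in mpo_log2],
               [s >= RHI_CUTOFF for s in rhi])

# Hypothetical counts that reproduce the reported percentages (n = 19).
gs_sens, gs_spec = sensitivity_specificity(tp=8, fn=3, tn=6, fp=2)
rhi_sens, rhi_spec = sensitivity_specificity(tp=8, fn=0, tn=9, fp=2)
print(f"GS:  {gs_sens:.1%} / {gs_spec:.1%}")
print(f"RHI: {rhi_sens:.1%} / {rhi_spec:.1%}")
```

With only 19 specimens, a single reclassified biopsy shifts these estimates by roughly 9 to 13 percentage points, which is consistent with the exploratory framing of the assessment.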

Figure 1

Representative images captured from biopsy samples displaying a range of inflammatory change: (A) GS 5.4, RHI 30; (B) GS 5.1, RHI 23; (C) GS 5.2, RHI 14; (D) GS 0.1, RHI 0. H&E stain; original magnification ×100. GS, Geboes Score; RHI, Robarts Histopathology Index.

Table 1

Histological characterisation of the UC cohort

Figure 2

Results of global assessment. (A) Heatmap showing expression of multiple markers indicative of the level of neutrophil infiltration and activity: Multiple neutrophil markers with rows mapped for relative log2 expression compared with other patients for the same protein. Each row is independently sorted by RHI. _SUMSCORE_ is the total (log2) for each column of MS data. (B) Correlation between MPO and RHI and Geboes Score. (C) Correlation between ELANE and RHI and Geboes Score. CD44, CD44 molecule (Indian blood group); CD55, CD55 molecule (Cromer blood group); CYBA, cytochrome b-245 alpha chain; CYBB, cytochrome b-245 beta chain; ELANE, elastase, neutrophil expressed; ICAM1, intercellular adhesion molecule 1; ITGAM, integrin subunit alpha M; ITGB2, integrin subunit beta 2; MMP9, matrix metallopeptidase 9; MPO, myeloperoxidase; MS, mass spectrometry; RHI, Robarts Histopathology Index; UC, ulcerative colitis.

To assess the utility of MS for evaluating UC FFPE specimens, targeted MS assays were performed on a small subset of proteins involved in immune function and UC biology (table 2). Analysis revealed that nearly 100% of samples showed detectable levels of FOXP3 (forkhead box P3) (Treg cells) and CD8 (effector T-cells), while a subset of approximately 30% of patients had detectable levels of CD19 (B-cells). Receptors for critical cytokine signalling in T-cell and autoimmune biology, CD132 (interleukin-2 receptor subunit gamma/common cytokine receptor gamma chain) and IL17RA (interleukin 17 receptor A), were detected in all specimens. However, cytokines IL23 (p19) (interleukin 23, alpha subunit p19), IL17A (interleukin 17A) and cytokine receptor IL23R (interleukin 23 receptor) were not detected. While no clear correlations were observed between this limited panel of targeted assays and disease severity, positive correlations were observed between MS measurements and immunohistochemistry evaluations illustrating the ability to quantitatively measure specific protein targets in FFPE UC specimens when abundant target is present (data not shown).

Table 2

Targeted MS results for specific immune cell and select cytokine-related markers in ulcerative colitis

In addition to the targeted MS measurements, further assessment of global proteomic data revealed multiple biological correlations within the dataset when compared with RHI and GS histological assessments. Markers indicative of neutrophil infiltration and activity (eg, MPO, elastase (ELANE)) showed distinct clustering across the cohort with two clusters that were either neutrophil marker high or neutrophil marker low (figure 2). Scatterplots of selected proteins MPO and ELANE showed good correlation with RHI and GS (figure 2). Not surprisingly, several cell junction regulating protein families were also detected in the UC specimens (eg, cadherins, claudins, occludins and tight junction proteins) (figure 3). These factors were also compared with RHI and GS with the same samples clustering together as they did with neutrophil markers. Interestingly, all cell junction-related proteins examined showed good correlation with RHI or GS but with an inverse relationship with neutrophil markers (figure 3 vs figure 2).
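The heatmap rows and the _SUMSCORE_ annotation described in the figure legends can be approximated with a short sketch. This is a Python illustration and not the authors' R workflow. It assumes that "relative log2 expression" means each protein's values are centred on that protein's mean across samples, and that the sum score is the per-sample total of those centred values; the text confirms neither detail, and all names and numbers are hypothetical.

```python
def center_rows(log2_by_protein):
    """Subtract each protein's mean so a row shows values relative to other samples."""
    centered = {}
    for protein, values in log2_by_protein.items():
        mean = sum(values) / len(values)
        centered[protein] = [v - mean for v in values]
    return centered


def column_sums(log2_by_protein):
    """Total the values for each sample (column) across all proteins (rows)."""
    return [sum(column) for column in zip(*log2_by_protein.values())]


# Hypothetical log2 abundances for three samples (not data from the study).
neutrophil_markers = {
    "MPO": [1.0, 2.0, 3.0],
    "ELANE": [0.0, 2.0, 4.0],
}

relative = center_rows(neutrophil_markers)
sum_score = column_sums(relative)

# Rank samples from lowest to highest summed marker signal.
ranking = sorted(range(len(sum_score)), key=lambda i: sum_score[i])
```

Samples with a high summed neutrophil signal would fall into the "neutrophil marker high" cluster under this reading, and the same calculation could be applied to the cell junction panel.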

Figure 3

Results of global assessment. (A) Heatmap showing expression of multiple markers of cell junction regulating proteins: Multiple ‘cell junction’ markers with rows mapped for relative log2 expression compared with other patients for the same protein. Each row is independently sorted by RHI. _SUMSCORE_ is the total (log2) for each column of MS data. (B) Correlation between CDH1 and histological scores. (C) Correlation between CLDN4 and histological scores. (D) Correlation between OCLN and histological scores. CDH1, cadherin 1; CDH13, cadherin 13; CDH17, cadherin 17; CGN, cingulin; CLDN3, claudin 3; CLDN4, claudin 4; MS, mass spectrometry; OCLN, occludin; PCDH1, protocadherin 1; RHI, Robarts Histopathology Index; SYMPK, symplekin; TJP1, tight junction protein 1; TJP2, tight junction protein 2; TJP3, tight junction protein 3; UC, ulcerative colitis.

Multiple other biological pathways also correlated with the RHI and GS clinical scores. The heatmaps in figure 4 show a similar clustering of histologic data for Wnt-β-catenin (CTNNB1) and the NK (natural killer) cell marker (NCAM-1/CD56) as seen with cell junction proteins. As with the neutrophil and cell junction heatmaps, specimens clustered into two distinct populations that correlated with RHI and GS with different biological and therapeutic implications based on association of the proteins with neutrophil infiltration or loss of cell junction integrity. For instance, β-catenin correlated with higher levels of cell junction proteins, suggesting a relation with intact tight junctions, likely reflecting the regulation of mucosal WNT (Wingless and Int-1) signalling by inflammatory cytokines in these samples.

Figure 4

Results of global assessment. (A) Heatmap showing expression of NK (natural killer) cell marker and Wnt-β catenin: NK marker CD56 with rows mapped for relative log2 expression compared with other patients for the same protein. Each row is independently sorted by RHI. _SUMSCORE_ is the total (log2) for each column of MS data. (B) Correlation between CTNNB1 and histological scores. (C) Correlation between NCAM1 and histological scores. CTNNB1, Wnt-β-catenin; MS, mass spectrometry; NCAM1 (CD56), neural cell adhesion molecule 1; RHI, Robarts Histopathology Index; UC, ulcerative colitis.


While proteomic analysis of frozen UC biopsy samples has been described,10 17 archival FFPE biopsies are a rich and largely untapped proteomic resource for investigative pathologists involved in the study of UC and the care of affected patients. Histology-based measurements that characterise the extent of disease activity have been developed as tools to assess treatment response; however, observer variability can limit their effectiveness.2 3 Data presented here suggest an MS-based approach to characterise routine endoscopic FFPE tissue collections is feasible, correlative to established histological approaches, and advantageous for tissue stewardship when judging specimen input requirements against the number of biomarkers simultaneously assessed. The successful deployment of targeted assays for CD8, CD19, CD132, FOXP3 and IL17RA suggests the assays to detect IL17A, IL23 (p19) and IL23R were unsuccessful due to low target abundance or a failure of these targets to crosslink to the cellular matrix during fixation, which limited adequate recovery. Global MS assessments of the 4200 proteins identified uncovered several expected associations with UC biology (eg, neutrophil activity association with disease severity) and revealed potential insights into more complex biology (eg, CTNNB1 association with intact tight junctions), indicating the usefulness of this approach for informing disease status and discovering new pathological processes that could provide treatment ideas for novel therapeutic approaches in UC.

Differences between ordinal (GS) and continuous (RHI) representations of histological scoring systems may affect the correlations observed with proteomic measurements. To our knowledge, while no GS or RHI cut-offs have been firmly established that predict therapeutic outcome, several studies have explored their use.18 For example, GS grade ≥3.1 has been independently associated with risk of clinical relapse,19 and a modified GS ≤12 or RHI scores ≤9 (with subscores of 0 for neutrophils in the epithelium and without erosions or ulcers) have been used to measure histological response in correlation with faecal calprotectin levels.20 When using a GS that corresponds to a brisk acute inflammatory response leading to epithelial crypt destruction (GS ≥4.1), the sensitivity and specificity of measuring MPO by MS were 72.7% and 75.0%, respectively, in this dataset. When applying a score that represents the upper half of the RHI scale (RHI ≥16), the sensitivity and specificity were 100.0% and 81.8%, respectively. These results suggest that FFPE tissue-based proteomic approaches warrant further investigation for exploration of UC biomarkers and association with patient outcomes. A quantitative MS method, capable of precisely assessing multiple protein targets simultaneously, may eventually provide a more comprehensive and statistically robust approach for support of prospective clinical studies.

Despite limitations associated with pilot studies using a small population representative of a range of disease activity, these data catalyse several potential directions for future research. The observation that proteomic data segregated the RHI-defined cohort into subpopulations suggests a customised targeted MS panel of disease-specific biomarkers could be created to test a validation cohort of UC specimens and further interrogate the relationship between RHI score, neutrophil activity, and epithelial barrier function, or other novel targets. Such assessments have potential to provide new insights into UC biology that could ultimately assist pathologists in assessing prognosis and therapeutic response. Lastly, the technical success of this effort suggests proteomic characterisations, using both targeted and global MS analyses, could be explored in other inflammatory diseases and autoimmune conditions where FFPE tissue samples are collected as part of routine patient care.

Ethics statements

Ethics approval

Eligible, deidentified, archival samples were obtained according to protocols approved by the Institutional Review Board of Cleveland Clinic (IRB reference # 20-913).


The authors wish to thank Mary Zuniga for logistical and operational assistance throughout the course of these studies.



  • Handling editor Runjan Chetty.

  • Contributors AMG, JS, RJB and BLA conceived the work and designed the study. AMG, EDH, KKL and JB performed tissue analyses. MDW, SH, RDM and DCL performed proteomic analyses. DCG provided bioinformatics and statistical support. All authors wrote, edited and critically reviewed the manuscript.

  • Funding Funding provided by Eli Lilly and Company.

  • Competing interests AMG, BLA, DCG, JS and RJB declare they were employees of Eli Lilly and Company during this work. MDW, SH, RDM and DCL are employees of Protypia. JB, KKL and EDH have no relevant financial competing interests to disclose.

  • Provenance and peer review Not commissioned; internally peer reviewed.
