We summarize here our opinions as stated in a recent White Paper.
CCP, the cell cycle progression score, is a methodology proposed to provide
prognostic measures for prostate cancer (PCa). We take no position in this
opinion paper regarding the efficacy of CCP as applied to PCa but we examine the
original assertions in some detail. Conceptually the approach makes sense. It is
as follows:
1. A handful of genes, if overexpressed, can, when combined with
other metrics, provide fairly accurate prognostic measures of PCa.
2. The genes can be selected in a variety of ways, ranging from
logical and clear pathway control genes such as PTEN to a broad-based
sampling in which the results have statistically powerful predictive
value.
3. The level of expression is measured in some manner, and the
measurements are combined in a reasonable fashion to determine a broad-based
metric.
4. The gene expression metric is combined with other variables
to ascertain a stronger overall metric.
The CCP work to date has been focused somewhat on these
objectives.
Let us now briefly update the work as detailed in the
industry press. As indicated in a recent posting:[1]
.....initially measured the levels
of expression of a total of 31 genes involved in CCP. They used these data to
develop a predefined CCP “score” and then they set out to evaluate the value of
the CCP score in predicting risk for progressive disease in the men who had
undergone an RP or risk of prostate cancer-specific mortality in the men who
had been diagnosed by a TURP and managed by watchful waiting.
Thus there seems to be a strong belief in the use of CCP,
especially when combined with other measures such as PSA.
The CCP test has been commercialized as Prolaris by Myriad. In
a Medscape posting they state[2]:
"PSA retained a fair amount of its predictive value,
but the predictive value of the Gleason score "diminished" against
the CCP score." he said. "Once you add the CCP score, there is little
addition from the Gleason score, although there is some."
"Overall, the CCP score was a highly significant
predictor of outcome in all of the studies,...it was
the dominant predictor in all but 1 of the studies in the multivariate
analyses, and typically a unit change in the score was associated with a
remarkably similar 2- to 3-fold increase in either death from prostate cancer
or biochemical recurrence, indicating that this is a very robust predictor, and
seems to work in a whole range of circumstances."
Thus there is some belief that CCP when combined with other
metrics has strong prognostic value.
In this
analysis we use CCP as both an end and a means to an end. CCP is one of many
possible metrics to ascertain prognostic values. There is a wealth of them. We
thus start with the selection of genes. We first consider general issues and
then apply them to the CCP approach. This is the area where we have the
majority of our problems.
Let us first examine how they obtained the data. We shall
follow the text of the 2011 paper and then comment accordingly.
1. Extract RNA
2. Treat the RNA with an enzyme to generate cDNA
3. Collect the cDNA and confirm the generation of key
entities.
4. Amplify the cDNA
5. Pre-amplify the cDNA prior to measuring in an array.
7. Record the levels of expression in the arrays
Clearly there may be many sources of noise or error in this
approach, especially in recording the level of fluorescent intensity. The
problem, however, is that every step introduces the possibility of measurement
bias or error. These errors become additive and can substantially alter the
results.
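The way step errors add up can be illustrated with a minimal sketch. It assumes, purely for illustration, that each step contributes an independent, zero-mean Gaussian error on a log2 expression scale. The step names and standard deviations below are hypothetical and are not taken from the 2011 paper. Under that assumption the variances add, so the total standard deviation is the square root of the sum of squares.

```python
import math
import random

# Hypothetical per-step standard deviations (log2 expression units).
# These values are illustrative assumptions, not measured quantities.
STEP_SIGMAS = {
    "rna_extraction": 0.10,
    "cdna_synthesis": 0.15,
    "pre_amplification": 0.20,
    "amplification": 0.20,
    "fluorescence_readout": 0.30,
}


def total_sigma(step_sigmas):
    """Standard deviation of the sum of independent step errors."""
    return math.sqrt(sum(s * s for s in step_sigmas.values()))


def simulate_measurement(true_value, step_sigmas, rng):
    """One noisy measurement: the true value plus one error draw per step."""
    return true_value + sum(rng.gauss(0.0, s) for s in step_sigmas.values())


def empirical_sigma(true_value, step_sigmas, n_trials=20000, seed=0):
    """Estimate the overall standard deviation by simulation."""
    rng = random.Random(seed)
    samples = [simulate_measurement(true_value, step_sigmas, rng)
               for _ in range(n_trials)]
    mean = sum(samples) / n_trials
    var = sum((x - mean) ** 2 for x in samples) / (n_trials - 1)
    return math.sqrt(var)
```

With these made-up values the fluorescence readout alone has a standard deviation of 0.30, while the full chain comes to 0.45, so the earlier steps are far from negligible.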
In this section we consider the calculations needed to
develop a reliable classifier. This is a long standing and classic problem.
Simply stated:
“Assume you have N gene expression levels, G(i), and you
desire to find some function g(G(1),…,G(N)) such that this function g divides
the space created by the Gs into two regions, one with no disease progression
and one with disease progression.”
Alternatively we could ask for a function f(G(1),…,G(N)) such
that the probability of disease progression, or an end point of death in a
defined period, is f or some function derived therefrom.
Let us begin with general classifiers. First let us review
the process of collecting data. The general steps are above. We start with a
specimen and we end up with N measurements of gene expression. In the CCP case
we have some 31 genes we are examining and ascertaining their relative excess
expression. Now as we had posed the problem above we are seeking a classifier
to determine a function f or g as above which would either bifurcate the space
of N genes or a function f from which we could ascertain survival based upon
the N gene expression measurements.
Now from classic classifier analysis we can develop the two
metrics: a simple bifurcating classifier and a probability estimator. The
simple classifier generates a separation point, a line or plane as shown below,
for which being below is benign and being above is problematic. This is akin to
the simple PSA test of being above or below 4.0. However we all know that this
has its problems. Thus there may be some validity in the approach for
prognostic purposes. Clearly a high value indicates a significant chance for
mortality, one assumes directly related to this disease.
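The two kinds of metric can be sketched as follows. This is a generic illustration, not the CCP calculation: it assumes a linear combination of expression levels, and the weights, bias, and threshold are hypothetical placeholders that would have to be fitted to outcome data. The logistic form used for the probability estimator is likewise only one common choice.

```python
import math


def linear_score(expression, weights, bias=0.0):
    """Weighted sum of the N expression levels G(1)..G(N)."""
    if len(expression) != len(weights):
        raise ValueError("expression and weights must have the same length")
    return bias + sum(w * g for w, g in zip(weights, expression))


def classify(expression, weights, bias=0.0, threshold=0.0):
    """Bifurcating classifier g: True means the 'progression' side of the plane."""
    return linear_score(expression, weights, bias) > threshold


def progression_probability(expression, weights, bias=0.0):
    """Probability estimator f: logistic transform of the same linear score."""
    return 1.0 / (1.0 + math.exp(-linear_score(expression, weights, bias)))
```

The classifier discards everything except which side of the threshold a patient falls on, whereas the estimator keeps the distance from it, which is why the second form is the more natural one for prognosis.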
Let us now examine the CCP index calculation in some detail.
We use the 2011 paper as the source. The subsequent papers refer
back to this and thus we rely upon what little is presented here. The approach
we take herein is to use what the original paper stated and then line by line
establish a mathematical model, and where concerns or ambiguities arise we point
them out for subsequent resolution. In our opinion the presentation of the
quantitative model is seriously flawed in terms of its explanation and we shall
show the basis of our opinion below.
We have provided a detailed examination in our recent White
Paper. In our opinion there is a lack of transparency and reproducibility in
the 2011 paper and thus one cannot utilize what is presented.
This area of investigation is of interest but in my
opinion it raises more questions than it answers. First is the issue of the
calculation itself and its reproducibility. Second is the issue of the
substantial noise inherent in the capture of the data.
1. Pathway Implications: Is this just another list of Genes?
The first concern is the fact that we know a great deal
about ligands, receptors, pathway elements, and transcription factors. Why, one
wonders, do we seem to totally neglect that source of information?
2. Noise Factors: The number of genes and the uncertainties in measurements raise serious concerns as to stability of outcomes.
Noise can be a severe detractor from the usefulness of the
measurement. There are many sources of such noise especially in measuring the
fluorescent intensity. One wonders how they factor into the analysis. Many
other sources are also present, from the PCR process and copy numbers to the
very sampling and tissue integrity factors.
3. Severity of Prognosis and Basis: For a measurement which is predicting patient death one would expect total transparency.
The CCP discriminant argues for the most severe
prognostication. Namely it dictates death based upon specific discriminant
values. However as we have just noted, measurement noise can and most likely
will provide significant uncertainty in the “true” value of the metric.
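A rough sense of what such noise implies can be had from the figure quoted earlier, a 2- to 3-fold increase in risk per unit change in the score. The sketch below assumes a proportional-hazards reading, in which a score error of d multiplies the implied relative risk by HR^d. The error size of 0.5 score units used in the note that follows is a hypothetical value chosen only for illustration.

```python
def risk_multiplier(score_error, hazard_ratio_per_unit):
    """Factor by which the implied relative risk changes for a given score error."""
    if hazard_ratio_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return hazard_ratio_per_unit ** score_error


def risk_band(score_error, hazard_ratio_per_unit):
    """(low, high) multipliers for errors of -score_error and +score_error."""
    return (risk_multiplier(-score_error, hazard_ratio_per_unit),
            risk_multiplier(score_error, hazard_ratio_per_unit))
```

Under these assumptions, an error of half a unit at a hazard ratio of 2 shifts the implied risk by a factor of about 1.4 in either direction, and at a hazard ratio of 3 by a factor of about 1.7.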
4. Flaws in the Calculation Process: Independent of the lack of apparent transparency, there appear in my opinion to be multiple points of confusion in the exposition of the methodology.
In our opinion, there are multiple deficiencies in the
presentation of the desired calculation of the metric proposed which make it
impossible to reproduce it. We detail them in our White Paper.
5. Discriminants, Classifiers, Probability Estimators: What are they really trying to do?
The classic question when one has N independent genes and
when one can measure relative expression is how does one take that data and
determine a discriminant function. All too often the intent is to determine a
linear one dimensional discriminant. At the other extreme is a multidimensional
non-linear discriminant. This is always the critical issue that has been a part
of classifiers since the early 1950s. In the case considered herein there is little if any description of or justification of the method employed. One could assume that the authors are trying to obtain an estimate of the following:
P[Death in M months] = g(G1,...,GN)
where Gk is the level of expression of one of the 31 genes. One would immediately ask: why and how? In fact we would be asked to estimate a Bayesian measure:
P[Death in M months | G1,...,GN]
which states that we want the conditional probability. We know how to do this for systems but this appears at best to be some observational measure. This in my opinion is one of the weak points.
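For reference, the conditional probability can be written out with Bayes' rule. Here D denotes death within M months and G = (G1,...,GN) the vector of expression levels. The class-conditional densities p(G | D) and p(G | not D) and the prior P[D] are notation introduced here for illustration, and each would have to be estimated from data.

```latex
\[
P[D \mid G] \;=\;
\frac{p(G \mid D)\,P[D]}
     {p(G \mid D)\,P[D] \;+\; p(G \mid \bar{D})\,\bigl(1 - P[D]\bigr)},
\qquad G = (G_1,\dots,G_N).
\]
```

Written this way, it is clear that a usable estimate requires stating how the densities and the prior were obtained, which is the transparency at issue.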
6. Causal Genes, where are they?
One of the major concerns is that one gene's expression may be
caused by another gene. In this case of 31 genes there may be some causality
among them, and this may often skew results.
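One way dependence among genes can skew a combined metric is sketched below. It assumes, hypothetically, that a score is a simple average of N expression measurements with equal variance and a common pairwise correlation rho. Nothing in the source says the CCP score is computed this way. Under that assumption the variance of the average is sigma^2 (1 + (N - 1) rho) / N.

```python
def variance_of_average(sigma2, n, rho):
    """Variance of the mean of n equally correlated measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch only covers 0 <= rho <= 1")
    return sigma2 * (1.0 + (n - 1) * rho) / n


def effective_gene_count(n, rho):
    """Number of independent genes that would give the same variance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch only covers 0 <= rho <= 1")
    return n / (1.0 + (n - 1) * rho)
```

With 31 genes and a hypothetical rho of 0.5, the effective count is just under 2, so strongly linked genes add far less independent information than their number suggests.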
7. Which Cell?
One of the classic problems is measuring the right cell. Do
we want the stem cells? If so, how are they found? Do we want metastatic cells?
Then from where do we get them? Do we want just local biopsy cells? If so,
perhaps they understate the facts.
8. Why this when we have so many others?
We have PSA, albeit with issues, we have SNPs, we have ligands, receptors, pathway elements, transcription factors, miRNAs and the list goes on. What is truly causal?
Basically this approach has possible merit. The problem, in my
opinion, is the lack of transparency in the description of the test metric.
The inherently noisy data are also a concern in my opinion. Moreover one wonders why there is so much press.
1. Cooperberg, M., et al, Validation of a Cell-Cycle Progression Gene Panel to Improve Risk Stratification in a Contemporary Prostatectomy Cohort, https://s3.amazonaws.com/myriad-library/Prolaris/UCSF+ASCO+GU.pdf
2. Cooperberg, M., et al, Validation of a Cell-Cycle Progression Gene Panel to Improve Risk Stratification in a Contemporary Prostatectomy Cohort, Journal of Clinical Oncology, 2012.
3. Cuzick, J., et al, Prognostic value of a cell cycle progression signature for prostate cancer death in a conservatively managed needle biopsy cohort, British Journal of Cancer (2012) 106, 1095 – 1099.
4. Cuzick, J., et al, Prognostic value of an RNA expression signature derived from cell cycle proliferation genes for recurrence and death from prostate cancer: A retrospective study in two cohorts, Lancet Oncol. 2011 March; 12(3): 245–255.
5. Duda, R., et al, Pattern Classification, Wiley (New York) 2001.
6. McGarty, T., Prostate Cancer Genomics, Draft 2, 2013, http://www.telmarc.com/Documents/Books/Prostate%20Cancer%20Systems%20Approach%2003.pdf
7. Theodoridis, S., K. Koutroumbas, Pattern Recognition, AP (New York) 2009.