Friday, May 18, 2012

Cancer Models: Prediction and Control

We now consider the essential elements for modeling cancers. The first step is to re-establish the goals of a model and then its structure. Finally we lead into the interrelationship between a model and the data used to justify it. This work is detailed in a recent White Paper.

Many authors have developed models concerning pathways and also cancer. The books by Klipp et al and by Szallasi et al are excellent overviews of the area with significant detail. The Klipp et al book is a truly superb discussion regarding pathways and modeling alternatives. The books by Bellomo et al and Wang are directed specifically at cancer modeling but unfortunately they lack adequate pathway dynamics to be of substantial use. Yet they are the only books available within the focused area.

At the core, we want a model which reflects the following qualities:

1. Based Upon Reality: The model must at its core be based upon the known reality. It must conform with what we currently know and understand. Namely, it must reflect the elements which we consider critical and the temporal and spatial dynamics of those elements. The model must be based upon a spatiotemporal system of measurable quantities, linked in some kinetic manner using reasonably well understood processes.

2. Predictability: Any modeling must, if it is to have any credibility, have the ability to predict, to say what will happen, and then to have that prediction validated. Although the ability may be statistical in nature the statistical confidence must be justifiable. We know all too well that many things are correlated, yet not causal, and not predictable.

3. Measurable: One must be able to measure and then predict the quantities which make up the model. Many of the modeling systems include proteins but they react in some zero-one format. We know in reality that we have concentrations, or better yet specific numbers of proteins, produced in a cell. Yet we cannot currently measure the number of each of these proteins. All too often we can at best measure their presence or absence. However, is it not the case that it is the excess or the low density of some set of proteins which shifts reactions, and that reactions are often concentration dependent?

4. Modellable: We want a system which can be modeled. It must reflect the measurable quantities in space and time and their spatiotemporal dynamics, using techniques that we can then use for prediction and validation.

In this paper we examine and analyze several models of cancer. Specifically we look at intracellular, extracellular and full body models. We attempt to establish a linkage between all of them. Many researchers have looked at the gene level, the pathway level and the gross flow of cancer cell level, namely whole body. Connecting them has been complex to say the least.

But herein we look at the pathway level and a whole body level and demonstrate the nexus, physically, and from this we argue that one can construct both prognostic tools as well as methodologies to deal with metastasis.

The following graphic lays out the flow of development and its implications as we detail them herein.

The key question we ask is: just what is it we are modeling in cancer cell dynamics? Let us consider some options:

This type of model focuses on the genes, and their behavior. It is basically one where we examine the gene type and its product.

This type of model falls into several subclasses. All begin with protein pathways and the “dynamics” of such pathways. But we have two major subclasses: protein measures and temporal measures. By the former we mean that we can look at the proteins as being on or off, there or not there, or at the other extreme we can look at the total number of proteins of a specific type generated and present at a specific time. By the latter, namely the temporal state, we can look at the proteins in some static sense, namely there or not there at some average snapshot instance, or we can look at the details over time, the detailed dynamics. In all cases we look at the intracellular dynamics only.

Let us consider the two approaches.

i. On-Off: In this approach the intracellular relationships are depicted as activators or inhibitors; namely, if present they allow or block an element in a pathway. PTEN is a typical example: if present it blocks Akt, and if absent it allows Akt to proceed and the cell to enter mitosis. p53 is another example, for if present we have apoptosis and if absent we fail to have apoptosis. This is a highly simplistic view, but it does align with the understanding available, say, with limited microarray techniques. It is an example of the data collection defining what the model is or should be.
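The on-off view can be written down in a few lines. What follows is a minimal sketch in Python, assuming only the two rules stated above (PTEN present blocks Akt; p53 present permits apoptosis). The function names are hypothetical and are introduced here purely for illustration; this is not a validated pathway model.

```python
# On-off (Boolean) sketch of the two examples in the text.
# Each protein is either present (True) or absent (False); no counts, no time.

def akt_active(pten_present: bool) -> bool:
    """PTEN, if present, blocks Akt; if absent, Akt proceeds."""
    return not pten_present


def apoptosis_possible(p53_present: bool) -> bool:
    """p53, if present, allows apoptosis; if absent, apoptosis fails."""
    return p53_present


def cell_state(pten_present: bool, p53_present: bool) -> dict:
    """Combine the two rules into a single on-off description of the cell."""
    return {
        "akt_active": akt_active(pten_present),
        "apoptosis": apoptosis_possible(p53_present),
    }


# Enumerate all four presence/absence combinations.
for pten in (True, False):
    for p53 in (True, False):
        print(f"PTEN={pten}, p53={p53} ->", cell_state(pten, p53))
```

The sketch makes the limitation visible: nothing in it can represent a partial excess or shortage of a protein, which is what the density approach tries to capture.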

ii. Density: This is a more complex model and it does reflect what we would see as reality. The underlying assumptions here are:

a. Genes are continually producing proteins via transcription and translation.

b. Transcription and translation are affected at most by proteins from other genes acting as repressors or activators. There are no other elements affecting the process of transcription and translation. Note that this precludes any miRNA, methylation, or other secondary factors. We shall consider them later. In fact they may often be the controlling factors.

c. The kinetics of protein production can be determined. Namely we know the rate at which transcription and translation occur in a normal cell or even in a variant. That is we know that the production rate of proteins can be given by a typical creation differential equation.

Here we have production rates dependent on the concentration of other proteins. The processes related to consumption are not totally understood (see Martinez-Vicente et al). We understand cell growth, as distinct from mitotic duplication, but the growth of a cell is merely the expansion of what was already in the cell at the end of its mitotic creation. We understand apoptosis, the total destruction of the cell, and we also understand that certain proteins flow outside the cell or may be used as cell surface receptors, but the consumption of these is not fully understood. Yet we can postulate a similar destruction differential equation.

This is based upon the work of Martinez-Vicente et al, which states[1]:

All intracellular proteins undergo continuous synthesis and degradation. This constant protein turnover, among other functions, helps reduce, to a minimum, the time a particular protein is exposed to the hazardous cellular environment, and consequently, the probability of being damaged or altered. At a first sight, this constant renewal of cellular components before they lose functionality may appear a tremendous waste of cellular resources.

However, it is well justified considering the detrimental consequences that the accumulation of damaged intracellular components has on cell function and survival. Furthermore, protein degradation rather than mere destruction is indeed a recycling process, as the constituent amino acids of the degraded protein are reutilized for the synthesis of new proteins.

The rates at which different proteins are synthesized and degraded inside cells are different and can change in response to different stimuli or under different conditions. This balance between protein synthesis and degradation also allows cells to rapidly modify intracellular levels of proteins to adapt to changes in the extracellular environment. Proper protein degradation is also essential for cell survival under conditions resulting in extensive cellular damage. In fact, activation of the intracellular proteolytic systems occurs frequently as part of the cellular response to stress. In this role as ‘quality control’ systems, the proteolytic systems are assisted by molecular chaperones, which ultimately determine the fate of the damaged/unfolded protein.

Damaged proteins are first recognized by molecular chaperones, which facilitate protein refolding/repairing. If the damage is too extensive, or under conditions unfavorable for protein repair, damaged proteins are targeted for degradation. Protein degradation is also essential during major cellular remodeling (i.e. embryogenesis, morphogenesis, cell differentiation), and as a defensive mechanism against harmful agents.

We have also discussed this process with regards to the function of ubiquitin, which marks proteins for elimination. As Goldberg states[2]:

Proteins within cells are continually being degraded to amino acids and replaced by newly synthesized proteins. This process is highly selective and precisely regulated, and individual proteins are destroyed at widely different rates, with half-lives ranging from several minutes to many days. In eukaryotic cells, most proteins destined for degradation are labelled first by ubiquitin in an energy requiring process and then digested to small peptides by the large proteolytic complex, the 26S proteasome.

Indicative of the complexity and importance of this system is the large number of gene products (perhaps a thousand) that function in the degradation of different proteins in mammalian cells. In the past decade, there has been an explosion of interest in the ubiquitin–proteasome pathway, due largely to the general recognition of its importance in the regulation of cell division, gene expression and other key processes. However, the cell’s degradative machinery must have evolved initially to serve a more fundamental homeostatic function — to serve as a quality-control system that rapidly eliminates misfolded or damaged proteins whose accumulation would interfere with normal cell function and viability.

Also we refer to the recent review work of Ciechanover which details the evolution of this understanding[3].

In contrast the proteins are consumed and thus the negative sign. In toto we have a combined equation as a total balance of proteins. This assumes we have a production mechanism for each of the proteins, namely their genes and the activators and repressors as required.
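The creation, destruction, and combined equations referred to in this item do not appear in this text. As a sketch only, they might take the following form. The notation is an assumption introduced here and is not taken from the White Paper: n_i(t) is the number of molecules of protein i at time t, P_i is its production rate, and C_i is its consumption rate, each allowed to depend on the counts of the other proteins.

```latex
% Assumed notation (not from the source):
%   n_i(t) : count of protein i at time t
%   P_i    : production rate of protein i
%   C_i    : consumption (destruction) rate of protein i
\begin{align}
\text{creation:}      \quad & \frac{dn_i(t)}{dt} = P_i\big(n_1(t),\dots,n_N(t)\big) \\
\text{destruction:}   \quad & \frac{dn_i(t)}{dt} = -\,C_i\big(n_1(t),\dots,n_N(t)\big) \\
\text{total balance:} \quad & \frac{dn_i(t)}{dt} = P_i\big(n_1(t),\dots,n_N(t)\big) - C_i\big(n_1(t),\dots,n_N(t)\big)
\end{align}
```

Under the further assumption that consumption is first order, C_i = \lambda_i n_i, a stasis level for protein i is any count at which P_i = \lambda_i n_i.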

d. Pathway Dynamics must be meaningful. Let us consider the pathway as shown below. This is a typical melanoma pathway we have shown before.

Now let us consider PTEN blocking BRAF and Akt. Physically, one molecule of PTEN is needed for each molecule of BRAF and PI3K. But what if we have the following: n(PTEN) < n(BRAF) and n(PTEN) > n(PI3K).

Here we have PTEN blocking some but not all the BRAF and PTEN blocking all the PI3K, at least at time t. Do we have an internal mechanism which then produces even more PTEN? One must see here that we are looking at the actual numbers of PTEN, real numbers reflecting the production and destruction rates. We know for example that if we have a mutated BRAF then, no matter how much PTEN we have, we have an unregulated pathway.

Now it is also important to note that this “model” and approach is distinct in ways from classic kinetics, since the classic model assumes a large volume and concentrations in determining kinetic reaction rates of catalytic processes. Here we assume a protein binds one on one with another protein to facilitate a pathway.

Thus knowing the dynamics of individual proteins, and knowing the pathways of the proteins, namely the temporary adhesion of a protein, we can determine several factors:

1.     The number of free proteins by type
2.     The pathways activated or blocked
3.     The resultant cellular dynamics based on activated pathways.
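The counting argument above can be sketched directly. The following Python fragment is a minimal illustration, assuming strict one-to-one binding and treating a pathway as active whenever any target molecule is left unblocked. The function name and the molecule counts are hypothetical; the counts are chosen only so that n(PTEN) < n(BRAF) and n(PTEN) > n(PI3K).

```python
def bind_one_to_one(n_blocker: int, n_target: int) -> dict:
    """One blocker molecule binds exactly one target molecule (assumed 1:1)."""
    bound = min(n_blocker, n_target)
    free_target = n_target - bound
    return {
        "bound": bound,
        "free_blocker": n_blocker - bound,
        "free_target": free_target,
        # Assumption: any unblocked target molecule leaves the pathway active.
        "pathway_active": free_target > 0,
    }


# Hypothetical counts at some time t, not measured values.
counts = {"PTEN": 80, "BRAF": 100, "PI3K": 50}

# Simplification: each target is compared against the full PTEN count,
# so competition between targets for the same PTEN molecules is ignored.
for target in ("BRAF", "PI3K"):
    result = bind_one_to_one(counts["PTEN"], counts[target])
    print(target, result)
```

With these counts some BRAF remains free while all PI3K is blocked, which is the situation described in the text. A fuller model would update the counts over time from the production and destruction rates.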

It should be noted that we see pathways being turned on and off as we produce and destroy proteins. There is a dynamic process ongoing and it all depends on what would be a stasis level of proteins by type. The question is: are cells in stasis or are they in a continual mode of regaining a temporary stasis?

This also raises the question: if, as we have argued, cancer is a loss of stasis due to pathway malfunction, can this be a process of instability in the course of a normal cell? Namely, are there, in the dynamics of cell protein counts, unstable oscillator-type modes resulting in uncontrolled mitotic behavior? Can an otherwise normal cell get locked into an unstable state and start reproducing itself in that state?

e. Total intracellular dynamics can be modeled yet the underlying processes are still not understood and the required measurements are yet to be determined.

Here we look at the intercellular dynamics as well, not just the cell as a stand-alone model. By this methodology we look at intercellular communications by ligand binding and the resulting activation of the intracellular pathways. We must consider the intercellular signalling not only between like cells but also between unlike cells, such as white cells, perhaps acting as growth factor inhibitors and the like. We also must then consider the spatiodynamics, namely the “movement” of the cells, or in effect the lack of fixedness or specificity of function. This becomes a quite complex problem.

There are two functions we examine here:

a. Intercellular binding or adhesion: E cadherin is one example that we see in melanocytes. Pathway breakdown may result in the malfunctioning of E cadherin.

The above demonstrates E cadherin in melanocyte-keratinocyte localization. The bonds are strong and this stabilizes the melanocyte in the basal layer. If however the E cadherin is compromised then the bond is broken, or materially weakened, and the melanocyte starts to wander. Movement above the bottom of the basal layer and upwards, for example, is pathognomonic of melanoma in situ. Wandering downward to the dermis becomes a melanoma. Thus the pathway activating E cadherin production is one pathway essential in the intercellular dynamics.

b. Ligand production and receptor production: Here we have cells producing ligands, proteins which venture out of the cell and become signalling elements in the intercellular world. We have the receptor production as well, where we have on the surface of cells, various receptors, also composed of cell generated proteins, which allow for binding sites of the ligands and result in pathway activation of some type. For example various Growth Factors, GF proteins, find their way to receptors, which in turn activate the pathways. Wnt is an example of one of these ligands which we have shown above.

It can also be argued that as ligands are produced and as they “flow” throughout the intercellular matrix, we can obtain effects similar to those in the Turing tessellation models. Namely, a single ligand may be present everywhere but the density of ligands may vary in a somewhat complex but determinable manner, namely in a wavelike fashion.

This is akin to the Turing model used in patterning of plants and animals[4]. Namely the concentration of a ligand, and in turn its effect, may be controlled by

In this case we would want a model which reflects the total body spatiotemporal dynamics. This type of model is an ideal which may or may not be achievable. In a simple sense it is akin to diffusion dynamics, viewing the cancer cells as one type of particle and the remaining body cells as another type. The cancer cells have intercellular characteristics specific to cancer and the body cells have functionally specific characteristics. Thus we could ask questions regarding the “diffusion” of cancer cells from a local point to distant points based upon the media in between. The “rate” of such diffusion could be dependent upon the local cells and, for example, their ability to nourish the cancer cells as well. In this model we could define an average concentration of cancer cells at some position x and time t and we would have some dynamic process as well.

This is a diffusion like equation and is a whole body equation. Perhaps knowing what the rate of diffusion is on a cell by cell basis may allow one to determine the most likely diffusion path for the malignancy, and in turn direct treatment as well.

This is of course pure speculation since there has not been, to my knowledge, any study in this area. Yet one could imagine a system akin to PET scans and the like which would use as input the surface markers from a malignancy and then the body diffusion rates to plot out in space and time the most likely flow of malignant cells, and thus plan out treatment strategies. Although this model is speculative we shall return to it in a final review of such models since it does present a powerful alternative.

This concept of total cellular dynamics is in contradistinction to the intercellular transport. In the total cellular dynamics model we regard the model as one considering the flow of altered cells across an existing body of stable differentiated cells.

We may then ask, what factors drive cancer cells to what locations? One may putatively state that cancer cells will follow the path of least resistance and/or will proceed along “flow lines” consistent with what propagation dynamics they may be influenced by.

The concept of a model of Total Cellular Dynamics is somewhat innovative. It focuses on the movement of the cancer cells throughout the body. We will consider three possibilities:

1. No Stem Cells

2. Stem Cells but Fixed at Initial Location

3. Stem Cells which are mobile.

In Case 1 all malignant cells are clones of each other at least at the start. As the malignant cells continue through mitosis additional mutations are likely so that after a broad set of mitotic divisions we have a somewhat heterogeneous set of malignant cells, some more aggressive than others. As with most such cancer cells they also produce ligand growth factors which stimulate each other and result in the cascade of unlimited growth and duplication.

In Case 2 we assume that there was a single cell which mutated and that this becomes the cancer stem cell, CSC. The CSC replicates, producing one CSC for self-replication and TICs which migrate. We assume that the CSC may from time to time actually double, but not at the mitosis rate of the base. Furthermore we assume the CSC sends out growth factors, GF, to the TICs. The GF flow outward in a wavelike manner from the somewhat position-stabilized CSCs to the TICs, which are mobile and both diffuse and flow throughout the body. The GF must find the TICs which become a distant metastasis.

In Case 3 in contrast to Case 2, we assume mobile CSC and thus the CSCs also flow according to some set of rules.

Now depending on the case we assume we can model the flow of cancer cells according to some simple dynamic distributed models[5]. Thus we could have[6] a partial differential equation of the type found in McGarty (see White Paper).

This provides diffusion, flow, and rate elements. The rate term, the F term, is a rate of change in time specific to a certain location and time. It is the duplication rate at that specific location due to the normal mitotic change. The last term may be both pathway and environment driven.
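The equation itself is given in the White Paper and does not appear in this text. As a sketch only, a one-dimensional equation with a diffusion term, a flow term, and a rate term could be written as follows. The symbols p(x,t), D, E, and F follow the usage in this post; the specific form and the sign conventions are assumptions.

```latex
% Assumed form: diffusion (D), flow (E), and growth (F) acting on the
% average concentration of cancer cells p(x,t).
\frac{\partial p(x,t)}{\partial t}
  = D\,\frac{\partial^{2} p(x,t)}{\partial x^{2}}
  - E\,\frac{\partial p(x,t)}{\partial x}
  + F\,p(x,t)
```

Here a positive E carries cells in the direction of increasing x, and a positive F gives local growth proportional to the number of cells already present.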

Now this description has certain physical realities. Here we describe the three factors in terms of their effects and their causes. The three elements of the equation, namely diffusion, flow, and growth, are the three ways in which cancer cells move. We can summarize these as below:

Diffusion
Physical Effect: Cancer cells begin to diffuse due to concentration effects.
Genetic Driver: Movement is due to the loss of location restrictors such as E cadherin, which is found in melanocytes and restricts their movement.
Slow migration in local areas.

Flow
Physical Effect: Cancer cells are “forced” to move by a flow mechanism driving them in a direction along flow lines.
Genetic Driver: Flow lines may be developed by means of metabolic needs of the cell in search of the nutrients required for growth. This may be a combination of angiogenesis as well as a Warburg like effect.
Cells have lost functionality and move to maximize their nutrition input to facilitate growth.

Growth
Physical Effect: Cancer cells begin to go through mitosis and cell growth.
Genetic Driver: Growth factor ligands attach to the surface of the cell. Flow of such ligands and their production may be influenced by a Turing flow effect, thus accounting for complexity of location of growth.
Cancer cells may find optimal areas for proliferation based upon factors related to ligand density.

Now consider the following graphic as a human body,

We have a D, E, F, for each gross portion of the body. We also have a model as specifically below in the Table:


The above numbers are purely speculative. But if we can ascertain them then we get a solution of p(x,t) in time. Note that here we have a two dimensional space. Thus we have the above constants applying only to this artifactual spatial model. Distance is measured in terms of movement across the interfaces. For simplicity we assume that all other space is impenetrable by any means. Thus we have production, flow and diffusion in each area.
Note that in the above we have laid out the x and y coordinates such that we have blood flow in the center, namely the metastasis flows via blood and then enters organs as shown. The “locations” of the organs are expressed as distances. Note also that the origin of the malignancy is at (0,0).
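To illustrate how a solution p(x,t) could be obtained once region-specific constants are known, here is a minimal finite-difference sketch in Python. It makes several assumptions that are not in the original: space is reduced to one dimension, the equation is taken to have a diffusion term (D), a flow term (E), and a growth term (F) proportional to p, the outer boundary is impenetrable, and the three regions and all numerical values are invented for illustration only.

```python
def step(p, D, E, F, dx, dt):
    """Advance the cell density p by one explicit time step (flux form)."""
    n = len(p)
    # flux[i] is the flux across the face between cell i-1 and cell i.
    # flux[0] and flux[n] stay zero: the outer boundary is impenetrable.
    flux = [0.0] * (n + 1)
    for i in range(1, n):
        d_face = 0.5 * (D[i - 1] + D[i])  # diffusion constant at the face
        e_face = 0.5 * (E[i - 1] + E[i])  # flow velocity at the face
        upwind = p[i - 1] if e_face >= 0 else p[i]  # upwind value for the flow term
        flux[i] = -d_face * (p[i] - p[i - 1]) / dx + e_face * upwind
    # Update: divergence of the flux plus local growth F * p.
    return [
        p[i] - dt / dx * (flux[i + 1] - flux[i]) + dt * F[i] * p[i]
        for i in range(n)
    ]


def simulate(p0, D, E, F, dx, dt, steps):
    """Repeat the explicit step; dt must be small enough for stability."""
    p = list(p0)
    for _ in range(steps):
        p = step(p, D, E, F, dx, dt)
    return p


# Purely illustrative constants for three regions of 20 grid cells each:
# origin tissue, blood, distant organ. None of these values is measured.
N = 20
D = [0.1] * N + [0.5] * N + [0.05] * N   # diffusion
E = [0.0] * N + [1.0] * N + [0.0] * N    # flow (only in blood)
F = [0.02] * N + [0.0] * N + [0.05] * N  # growth rate

p0 = [0.0] * (3 * N)
p0[0] = 1.0  # the malignancy starts at the origin

p = simulate(p0, D, E, F, dx=1.0, dt=0.2, steps=500)
print("total in origin tissue:", sum(p[:N]))
print("total in blood:        ", sum(p[N:2 * N]))
print("total in distant organ:", sum(p[2 * N:]))
```

The flux form is chosen so that, with F set to zero, the total number of cells is conserved; any change in the total then comes only from the growth term. The time step here satisfies the usual explicit stability limits for these illustrative constants and would need to be rechecked for other values.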

Now we can relate the constants to the pathway distortions which are part of the malignancy as well.

The question is: how do we determine these constants so that we may verify the model? Let us assume we can do so via examination of prior malignancy, not an obvious task but one we shall demonstrate. One must also be cautious to include in the determination the pathway factors for each malignancy and its state and stage. Thus the three constants will be highly dependent upon the specific genetic makeup of the initial malignancy.

Turing Tessellation

In 1952 Alan Turing, in the last two years of his life, was focusing on biological models and moving away from his seminal efforts in cryptanalysis and computers. It was Turing who in the Second World War managed to break many of the German Enigma codes, yielding the intelligence known as Ultra, and who also created the paradigm for computers which we use today. In his last efforts before his untimely suicide Turing looked at the problem of patterning in plants and animals. This was done at the same time Watson and Crick were working on the gene and DNA. Turing had no detailed model to work with and no gene; he had just a gestalt, if you will, to model this issue. Today we have the details to fill in the gaps in the Turing model.

The Turing model was quite simple. It stated that there was some chemical, and a concentration of that chemical, call it C, which was the determinant of a color. Consider the case of a zebra and its hair. If C were above a certain level the hair was black and if below that level the hair was white. As Turing states in the abstract of the paper:

"It is suggested that a system of chemical substances, called morphogens, reacting together and diffusing through a tissue, is adequate to account for the main phenomena of morphogenesis. Such a system, although it may originally be quite homogeneous, may later develop a pattern or structure due to an instability of the homogeneous equilibrium, which is triggered off by random disturbances. Such reaction-diffusion systems are considered in some detail in the case of an isolated ring of cells, a mathematically convenient, though biologically unusual system.

The investigation is chiefly concerned with the onset of instability. It is found that there are six essentially different forms which this may take. In the most interesting form stationary waves appear on the ring. It is suggested that this might account, for instance, for the tentacle patterns on Hydra and for whorled leaves. A system of reactions and diffusion on a sphere is also considered. Such a system appears to account for gastrulation. Another reaction system in two dimensions gives rise to patterns reminiscent of dappling. It is also suggested that stationary waves in two dimensions could account for the phenomena of phyllotaxis.

The purpose of this paper is to discuss a possible mechanism by which the genes of a zygote may determine the anatomical structure of the resulting organism. The theory does not make any new hypotheses; it merely suggests that certain well-known physical laws are sufficient to account for many of the facts. The full understanding of the paper requires a good knowledge of mathematics, some biology, and some elementary chemistry. Since readers cannot be expected to be experts in all of these subjects, a number of elementary facts are explained, which can be found in text-books, but whose omission would make the paper difficult reading."

Now, Turing reasoned that this chemical, what he called the morphogen, could be generated and could flow out to other cells and in from other cells. Thus focusing on one cell he could create a model across space and time to lay out the concentration of this chemical. He simply postulated that the rate of change of this chemical in time was equal to two factors: first, the use of the chemical in the cell, such as a catalyst in a reaction or even part of the reaction, and second, the flow in or out of the cell. The following equation is a statement of Turing's observation.

It allows one to solve for a concentration, C, as a function of time and space. It requires two things. First is the diffusion coefficient to and from cells and second the functional relationship which shows how the chemical is used within a cell.
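The equation does not appear in this text. A sketch consistent with the verbal description, a rate of change equal to use within the cell plus flow in or out of the cell, is the standard reaction-diffusion form below. The symbol f for the in-cell usage is an assumption introduced here; D in this equation is the diffusion coefficient of the chemical between cells, not the diffusion constant of the cancer cells themselves.

```latex
% Single-morphogen reaction-diffusion sketch.
%   C(x,t) : concentration of the chemical
%   f(C)   : assumed name for the rate of use or production within a cell
%   D      : diffusion coefficient of the chemical to and from cells
\frac{\partial C(x,t)}{\partial t} = f\big(C(x,t)\big) + D\,\nabla^{2} C(x,t)
```

Turing's own analysis uses systems of interacting morphogens; the single-chemical form above is only the simplest statement of the two factors named in the text.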

The question now is how one links the coefficients in the models. For example, if we believe that diffusion D depends on E cadherin concentration, namely that as E cadherin decreases D increases, we may postulate a simple linear relationship between diffusion constants and protein concentrations, where the constants are to be determined. We know that the more E cadherin, the stickier the cell and the less diffusion that occurs. Thus such a relationship is at the least a first order approximation. In a similar manner we can relate F to PTEN and p53.

This is merely suppositional. But we do know the following:

1. The genes which are expressed for adhesion and replication are known.

2. We know the pathways for these genes.

3. We know the intracellular models controlling these genes.

4. We know that functionally an excess or paucity of a gene has a certain effect.

5. We know that in general in small amounts the world is linear.

6. We know that we can use regression techniques based upon collected data to determine coefficients in a general sense.

Thus we have a fundamental basis to express the relationships for all gross constants in terms of linearized versions of the protein concentrations.
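As an illustration of the regression step, the following Python sketch fits a straight line relating the diffusion constant D to E cadherin concentration by ordinary least squares. The linear form is the first order approximation discussed above; the data points are synthetic and noise free, invented only to exercise the fit, and the function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic values, not measurements: more E cadherin, less diffusion.
e_cadherin = [0.2, 0.4, 0.6, 0.8, 1.0]  # relative concentration (hypothetical)
diffusion = [0.9, 0.7, 0.5, 0.3, 0.1]   # diffusion constant D (hypothetical)

d0, slope = fit_line(e_cadherin, diffusion)
print(f"D = {d0:.3f} + ({slope:.3f}) * [E cadherin]")
```

A negative fitted slope corresponds to the statement that the stickier cell diffuses less. Relating F to PTEN and p53 would follow the same pattern with two explanatory variables instead of one.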

Now we have related intracellular concentrations, which themselves may be temporally and spatially dependent, to the total parameter values for the flow of cells throughout the body. We may also want to relate these to organ specific parameters as well.

Thus what we have achieved is as follows:

1. Model relating intracellular and whole body.

2. Methodology to determine the constants.

3. Methodology to go from patient data to prognostic data.

4. Methodology to establish possible treatment strategies, namely what gene controls will result in what whole body reactions.

We can now summarize the models we have considered. First we should emphasize that, for the most part, those working in the field have developed pathway models which are non-temporal, that is, steady state, and which assume a protein to protein connection, as if there were a single protein molecule produced and the interacting proteins were either there or not. Part of the simplicity of the models is determined by the limits of what can be measured. We have herein attempted not to limit the results by what can be accomplished currently but have extended the model to levels which assist in a fuller representation of reality. However even here we may very well be falling short.

For we have deliberately neglected such things as miRNA, methylation, and the stem cell paradigm just to name a few.

We combine all four methods in a graphic below. We summarize the key differences and differentiators. Currently most of the analytical models focus on pathways. This can generally be supported by means of microarray technology and even rough estimates of relative concentrations may be inferred by such an approach.

The risk we see even in the above models is the absence of exogenous epigenetic factors and of a stem cell model. The latter issue is one of major concern. For example, if we have true cancer stem cells, CSC, then we have a proliferation of differing cell types. The use of microarrays is for the most part an averaging methodology, not a cell by cell methodology. If we collect cells from, say, a melanoma tumor, how much of that is CSC and how much TIC? And frankly, should we identify CSCs only and perform our analysis on those cells alone?

[1] Martinez-Vicente, M., et al, Protein degradation and aging, Experimental Gerontology 40 (2005) 622–633.
[2] Goldberg, A., Protein degradation and protection against misfolded or damaged proteins, Nature, Vol 426, 18/25 December 2003.
[3] Ciechanover, A., Intracellular Protein Degradation: From a Vague Idea through the Lysosome and the Ubiquitin-Proteasome System and onto Human Diseases and Drug Targeting, Rambam Maimonides Medical Journal, January 2012, Volume 3, Issue 1.
[4] Turing, A., The Chemical Basis of Morphogenesis, Phil Trans Royal Soc London B 237, pp 37-72, 1952.
[5] See Andersen, p 277 of Bellomo et al, for a variant on what we are proposing here. The Andersen model is somewhat similar but lacks the detail we present herein. Also there is in the same volume a paper by Pepper and Lolas focusing on the dynamics of the lymphatic cancer system, p 255. Bellomo, N., et al, Selected Topics in Cancer Modeling, Birkhauser (Boston) 2008.
[6] McGarty, T., Stochastic Systems and State Estimation, Wiley (New York) 1974.

Szallasi, Z., System Modeling in Cellular Biology: From Concepts to Nuts and Bolts, MIT Press (Cambridge) 2006.
Klipp, E., et al, Systems Biology, Wiley (Weinheim, Germany) 2009.