Friday, February 5, 2010

Cancer, Genomics and a New Paradigm

While reading a paper by Bob Gallager, one of the professors on my doctoral oral exams, I find myself coming back to his comments time and again. For example, in his presentation at Tom Kailath's 70th birthday celebration, he recounted the essence of Shannon's famous paper developing information theory as we now know it. It was a simple paper because it presented a simple paradigm, the binary symmetric channel.

Shannon's paradigm or example was as shown below. Inputs, outputs and errors.
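To make the binary symmetric channel concrete, here is a minimal sketch in Python. It is our own illustration, not something taken from Gallager's talk or Shannon's paper. The function names and the crossover probability `p` are assumptions chosen for the example; the capacity formula C = 1 - H(p) is the standard result for this channel.

```python
import math
import random

def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity in bits per use of a binary symmetric channel
    whose crossover (error) probability is p."""
    return 1.0 - binary_entropy(p)

def bsc_transmit(bits, p, rng):
    """Pass bits through the channel: each bit is flipped
    independently with probability p."""
    return [b ^ (rng.random() < p) for b in bits]

# Illustrative run: inputs, outputs and errors.
rng = random.Random(0)
sent = [rng.randint(0, 1) for _ in range(10000)]
received = bsc_transmit(sent, 0.1, rng)
error_rate = sum(s != r for s, r in zip(sent, received)) / len(sent)
print("observed error rate:", error_rate)
print("capacity at p = 0.1:", bsc_capacity(0.1))
```

The whole model has a single parameter, the error probability, which is exactly the kind of simplicity the paradigm is meant to convey.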

From this simple token comes everything we now know about communications and information theory. Gallager remarks:

Information theory has prospered because of 4 major ingredients: 1) There is a rich and elegant mathematical structure based on probability. 2) There are many toy problems that are fun and simple, but which can be extended to approach reality. 3) The application field is digital communication, which has rapidly grown in importance. 4) The culture is to attack new problems in a discipline oriented fashion.

Namely, with this simple model, all the other distractions could be put aside for a while and then added back one at a time to the elegant simplicity to create a functioning world view.

Gallager then continues:

It is equally important to constantly simplify the structure. Detail must be abstracted away. Simple but generalizable examples (and counterexamples) are critical. Human minds do not evolve on technological time scales, and theories that are not accessible to human minds are not much use. As data expands, the importance of simple structure becomes essential.

The last statement is of driving importance. Let me take another step and that is from Shannon to Wiener. Wiener developed cybernetics, which in simple terms is the application of the paradigm of feedback control to life. The Wiener paradigm is shown below:

This paradigm of feeding back was used by Wiener in areas from control, to signal processing, biophysics and society in general. Again a simple example.
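The feedback idea can be sketched in a few lines. The following is a hypothetical discrete-time proportional loop of our own choosing, not Wiener's formulation: the output is compared with a reference, and the difference is fed back to correct the next output. The names `setpoint` and `gain` are assumptions for illustration.

```python
def feedback_loop(setpoint, x0, gain, steps):
    """Simulate a simple proportional feedback loop.

    At each step the output x is compared with the setpoint and
    corrected in proportion to the error. For 0 < gain < 2 the
    error shrinks geometrically by a factor of |1 - gain| per step.
    """
    x = x0
    history = [x]
    for _ in range(steps):
        error = setpoint - x   # compare the output with the reference
        x = x + gain * error   # feed the error back as a correction
        history.append(x)
    return history

# Illustrative run: start at 0 and track a setpoint of 1.
trajectory = feedback_loop(setpoint=1.0, x0=0.0, gain=0.5, steps=20)
print(trajectory[-1])
```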

The third is the Watson and Crick paradigm of the cell. This we show below:

DNA to RNA to protein: simple. Yet, as we have learned over the past almost sixty years, not so simple. There are feedback loops, errors, and complex control mechanisms.

In the past twenty years we have come to understand the cell protein interactions and we show them as follows:

These loops are massively complex and we show links of activators and suppressors all over. In some ways this is akin to what Shannon faced when he drew the simple diagram since there were modulators, demodulators, antennas, receivers, transmitters, and even more. Shannon reduced the problem to a simple paradigm. In that paradigm, that example, he found the essence of the development of information and communications theory.

The question then is: can we go back to one of the simple paradigms again in the area of cell dynamics? Can we do what we have seen in Shannon or Wiener or even in Watson and Crick? Can we restart the paradigm, for what we have here is unwieldy? Can we re-look at the chart above and from it distill a "system" which is both analyzable and projectable?

Dougherty, in a paper entitled On the Epistemological Crisis in Genomics, hints at this goal we have defined. We have been applying this in the analysis of secondary pathways in plants and humans and have been developing a similar simple paradigm for cell dynamics, a model which accommodates the pathways yet does so in a dynamic and controllable manner. The answer is to use the systems models that we know all too well and then combine them with system identification procedures, which we also know well, to determine cell dynamics.

Thus I propose a model of the following type, one of course which will require some simplification:

In the above we have a complex system of genes, controlled by their products in a feedback manner, with the controls being randomly hit by genetic alterations from time to time. The result is the control of pathways, secondary as well as primary, and this model gives structure to what we have seen before as just an interconnected collection of proteins. Moreover, this approach applies a dynamic to the model as well as a way to assess the control, namely management, of model aberrations. This gives a paradigm to work with.
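One way to read this model is as a small dynamical system. The sketch below is a toy under stated assumptions, not the actual model: expression levels follow linear dynamics dx_i/dt = basal_i + sum_j A[i][j] * x_j - decay * x_i, where a positive A[i][j] stands for an activator and a negative one for a suppressor, and a "genetic alteration" is represented by knocking out one regulatory link at random. The linear form, the names `A`, `basal`, `decay` and `hit_prob`, and the knockout rule are all illustrative choices.

```python
import random

def step(x, A, basal, decay, dt=0.1):
    """One Euler step of dx_i/dt = basal_i + sum_j A[i][j]*x_j - decay*x_i.
    Expression levels are clipped at zero."""
    n = len(x)
    new_x = []
    for i in range(n):
        regulation = sum(A[i][j] * x[j] for j in range(n))
        dx = basal[i] + regulation - decay * x[i]
        new_x.append(max(0.0, x[i] + dt * dx))
    return new_x

def random_hit(A, rng):
    """Return a copy of A with one randomly chosen regulatory link
    set to zero (an assumed stand-in for a genetic alteration)."""
    i = rng.randrange(len(A))
    j = rng.randrange(len(A))
    hit = [row[:] for row in A]
    hit[i][j] = 0.0
    return hit

def simulate(x0, A, basal, decay, steps, hit_prob, rng):
    """Run the feedback dynamics; at each step the control matrix
    is hit by a random alteration with probability hit_prob."""
    x = list(x0)
    trajectory = [list(x0)]
    for _ in range(steps):
        if rng.random() < hit_prob:
            A = random_hit(A, rng)
        x = step(x, A, basal, decay)
        trajectory.append(x)
    return trajectory

# Two genes: gene 1 suppresses gene 0, gene 0 activates gene 1.
A = [[0.0, -0.5], [0.5, 0.0]]
healthy = simulate([0.0, 0.0], A, [1.0, 0.0], 1.0, 500, 0.0, random.Random(0))
altered = simulate([0.0, 0.0], A, [1.0, 0.0], 1.0, 500, 0.01, random.Random(0))
print("no alterations:  ", healthy[-1])
print("with alterations:", altered[-1])
```

Comparing the two trajectories shows the point of the paradigm: the same feedback structure settles to a different state once a control link is lost.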

Now how do we go from the map of proteins to the model of the life of a cell? Simply, if one follows both what we have done in the above-mentioned work and what Goodarzi et al. did in their paper Revealing Global Regulatory Perturbations across Human Cancers. We use microarray data by the truckload and then, using the model, the paradigm of cell dynamics, determine the constants by standard system identification methods.
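In its simplest form, identifying the constants means fitting model parameters to measured trajectories by least squares. The sketch below assumes a hypothetical one-gene, first-order model x[k+1] = a * x[k] + b and a synthetic, noise-free time series. Real microarray data would involve many genes, noise, and possibly no time ordering at all, so this only illustrates the principle and is not the procedure of Goodarzi et al.

```python
def simulate_first_order(a, b, x0, steps):
    """Generate a synthetic expression series from x[k+1] = a*x[k] + b."""
    series = [x0]
    for _ in range(steps):
        series.append(a * series[-1] + b)
    return series

def fit_first_order(series):
    """Ordinary least-squares estimate of (a, b) in x[k+1] = a*x[k] + b
    from a single time series (the series must not be constant)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Illustrative run: recover the constants used to generate the data.
data = simulate_first_order(a=0.9, b=0.5, x0=0.0, steps=30)
a_hat, b_hat = fit_first_order(data)
print(a_hat, b_hat)
```

With noisy data the estimates would only approximate the true constants, and a multi-gene model would replace the scalar fit with a matrix regression.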

To summarize:

1. Develop a simple model or paradigm, so that the elements focused upon are a few and are recognizable.

2. Use the massive amounts of data, incorporated into the model, to obtain a good data fit. Remember that this is NOT econometrics: we have a model of reality here. It reflects reality, it is reality, just as Shannon's model was.

3. Use the model to look at the temporal dynamics. The temporal dynamics are what one must look at in cancer. The cells grow uncontrolled, but that is the final step. The key is understanding the intermediate dynamics. If we can, we can recognize them when present and control them so we do not have an adverse outcome.

4. Generalize the model across a wide base of cancers.

Just a thought, we are slowly filling in the details.