In lecture 15, I'm going to continue describing case studies of the different kinds of experiments and modeling that are being integrated in systems biology studies. The purpose is really to give you a sense both of the different kinds of computations that can be applied to different biological problems, and of the different levels of data sets and organization involved. Okay, let's get started. I'm going to discuss three papers in lecture 15. The first one compares signaling networks in normal and transformed liver cells. This is a paper from the laboratories of Peter Sorger and Doug Lauffenburger. In contrast to typical signaling pathway models, which are largely deterministic ODE- or PDE-type models like the ones I've shown you many times over, this one analyzes signaling network dynamics using discrete logic. So this is like a Boolean-type model, with some variation in the logical specifications, followed by a machine-learning-type approach where you have a data set that you train with, and then you can make predictions and analyze the model. The problem they're addressing is this: there are normal liver cells and transformed, cancerous liver cell cultures, and they want to know whether there are differences in signaling network topology between the two. To answer this question they do a lot of very complex experiments that involve a set of extracellular signals and drugs. To analyze this complex data, they start by building a network, which they call the prior knowledge network, which has 72 nodes and 112 edges.
They run Boolean dynamics on this network to capture signal flow, and they compress the network — this is where the logical specifications come in — to retain the nodes whose activities were experimentally measured and reduce the model to logical interactions. So they come up with a model that has 32 nodes instead of the original 72, with 128 possible logical interactions. This still results in a really large set of candidate models, something like 10^38 models, against which the data can be evaluated for training. For training, they search the model space 50 to 100 times using a standard genetic algorithm, which allows them to figure out what the states of the various nodes are and how they can be connected. So this kind of Boolean-logic-based modeling approach can be used to identify changes in network configuration in transformed cells. That's the take-home message. If they hadn't found anything, they probably wouldn't have published the paper and we wouldn't be discussing it — so you're going to see that they're successful at the end. The experimental design is a pretty complex one. They have primary human hepatocytes, which are normal liver cells, and four different cancer cell lines. Each of these is treated with five different ligands and three different drugs, and sometimes the drugs are used in combination. So it's a pretty complex experiment. They treat these cells with all these various ligands and drugs, and look at the activity state, read out by phosphorylation, of various cellular signaling markers. They have some 16 target proteins whose activities they measure by immunoblotting with phospho-specific antibodies. Then they can look at the various ligands against the various signaling components.
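To make the discrete-logic idea concrete, here is a minimal sketch of synchronous Boolean network dynamics run to a logical steady state. The network, node names, and rules are invented for illustration; they are not taken from the paper's actual 32-node liver-cell model.

```python
def step(state, rules):
    """One synchronous update: each node's next value is its
    logic rule applied to the current state."""
    return {node: rule(state) for node, rule in rules.items()}

def run_to_fixpoint(state, rules, max_steps=100):
    """Iterate until the state stops changing (a logical steady state)."""
    for _ in range(max_steps):
        nxt = step(state, rules)
        if nxt == state:
            return state
        state = nxt
    return state

# Hypothetical three-node cascade: ligand -> receptor -> kinase,
# with the kinase blocked by a drug (inputs are held fixed).
rules = {
    "ligand":   lambda s: s["ligand"],                      # input
    "receptor": lambda s: s["ligand"],                      # activated by ligand
    "kinase":   lambda s: s["receptor"] and not s["drug"],  # blocked by drug
    "drug":     lambda s: s["drug"],                        # input
}

ss = run_to_fixpoint(
    {"ligand": True, "receptor": False, "kinase": False, "drug": False}, rules)
```

In the actual study, a genetic algorithm searches over which logic rules to keep so that steady states like `ss` reproduce the measured phosphorylation responses.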
They record whether they got a significant signal — that is, did the ligand change the phosphorylation state of the readout or not? — and they get this matrix. This is for the control set. They use an even more complicated scheme for the cancer cell lines. They have multiple cancer cell lines, and for each one they can ask the question: does a certain ligand stimulate a certain intracellular signaling protein in a certain cell type? What is plotted here, in different colors, is the effect when one of four, two of four, three of four, or four of four of the cell lines respond. So you can see which ligand stimulates which intracellular signaling components in how many of these cancer cell lines. When they do all of this, they use the training data sets to figure out which network topologies match what was found in the experimental data. What they find is that they can come up with a topology showing differences between the normal liver cells and the average of the four cancer cell lines. The red edges, I think, are the cancer cells, and the blue edges are the normal ones. The black ones occur at the same frequency in both normal and cancer cells, so they don't provide any power in separating the topology of the cancer cell lines from that of the normal cells. The line thickness shows the frequency of occurrence in the trained models. Remember, there were on the order of 10^38 candidate models; if a connection occurs in more of the trained models, it is more likely to be an edge that is operative in one or more of the cell lines, and that's what they find here.
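The signal/no-signal matrix described above amounts to thresholding fold-changes in phosphorylation. Here is an illustrative sketch of that call; the ligand names, readouts, numbers, and the 1.5-fold threshold are all made up for the example, not taken from the paper.

```python
def call_signal(fold_change, threshold=1.5):
    """A readout counts as responsive if the ligand changed its
    phosphorylation by more than the threshold, in either direction."""
    return fold_change >= threshold or fold_change <= 1.0 / threshold

# Hypothetical (ligand, readout) -> phosphorylation fold-change measurements
measurements = {
    ("TGFa", "ERK"):   3.2,  # strong activation
    ("TGFa", "JNK"):   1.1,  # essentially unchanged
    ("IL6",  "STAT3"): 4.0,  # strong activation
}

# Binary signal matrix of the kind shown in the lecture's figure
matrix = {pair: call_signal(fc) for pair, fc in measurements.items()}
```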
You can clearly see, even without looking at the details, that the red and the blue lines don't always coincide. There are some blue-only pathways and some red-only pathways, like here, for instance. So you can really separate the signaling topology of the normal cells from the topology of the cancer cells. You can calculate distinct overall topologies for each cell line; this is done by computing the mean difference in the edges, and then you can put them all together and come up with a score. When you do this, what you find is that all of the transformed cell lines cluster up here, and all of them are quite far away from the normal cell line, which is down here. So the topology of the transformed cell lines' signaling networks is different from the topology of the normal cell line's network, and the different transformed cell lines are closer to each other than any one of them is to the normal. And of course there is the canonical topology, which sits somewhere out here. You can also plot this as a hierarchical clustering graph, which is shown here: here are the four different cancer cell lines, and here is the normal one at the bottom. You can use this type of analysis to identify key molecular differences using the logic-based models. They identified a couple of them: alterations in the level of Hsp27, which is a chaperone; changes in the control of the transcription factor NF-κB by extracellular ligands; and changes in the phosphorylation of certain signaling proteins in saline-treated cancer cells but not in normal cells. These kinds of observations, which come from the differential analysis of these models, allow them to identify potentially new drug targets and also to figure out how drugs work.
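The topology-distance idea above can be sketched very simply: summarize each cell line by the frequency with which each edge appears across its trained models, and take the mean absolute difference of those frequencies as the distance between two cell lines. The edge names and frequencies below are invented for illustration; they are chosen only to mirror the qualitative result that transformed lines sit closer to each other than to the normal line.

```python
# Hypothetical per-cell-line edge frequencies across trained models
edge_freqs = {
    "normal":  {"A->B": 0.9, "B->C": 0.1, "C->D": 0.8},
    "cancer1": {"A->B": 0.2, "B->C": 0.9, "C->D": 0.8},
    "cancer2": {"A->B": 0.3, "B->C": 0.8, "C->D": 0.7},
}

def topology_distance(f1, f2):
    """Mean absolute difference in edge frequency over all edges."""
    edges = set(f1) | set(f2)
    return sum(abs(f1.get(e, 0) - f2.get(e, 0)) for e in edges) / len(edges)

d_cancer_cancer = topology_distance(edge_freqs["cancer1"], edge_freqs["cancer2"])
d_normal_cancer = topology_distance(edge_freqs["normal"], edge_freqs["cancer1"])
```

A matrix of such pairwise distances is what feeds the hierarchical clustering shown in the figure.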
Because if you go back, you will see that they were using several drugs to block the effects of multiple intracellular signaling components. So what the study allowed them to do was to use logic-based Boolean models, and these can be a good approach for extracting knowledge from complex combinatorial experiments. Otherwise you'd have to model this entire network, with all the variations, the stimulations, and the drug inhibitions, in an ODE-type model, and there would just be way too many unconstrained parameters — you'll remember the famous story about the spherical cow. This approach actually looks pretty good to me. It is successful in identifying differences in topologies under a variety of perturbations — different cell types, multiple ligands, different drugs — so overall the approach has the potential to be versatile in capturing the dynamics of signaling topologies in cells of various sorts. As I told you with the drug targets, they were able to catch the distinct topologies that are operational when drugs are used, and therefore these kinds of network analyses can be used to understand drug action as well as to identify potential new drug targets. Of course, the problem with this kind of approach is that you start with a prior knowledge network, so the approach is necessarily constrained by prior knowledge, both at the experimental level and at the computational level. It becomes hard to discover new nodes in the system: if something not in the network is contributing to or driving the behavior, one would not be able to see it with this kind of analysis, because what you start off with is a canonical pathway. The next study I would like to discuss with you is a computational model of a whole cell — a model of a bacterial cell. This is Mycoplasma genitalium, a bacterium that has 525 genes.
So it's not a very big model system, but it has complexity in terms of the various types of subcellular processes that these people set out to model. I did a cut-and-paste here of what they wanted to do. It says: our model attempts to (1) describe the life cycle of a single cell from the level of individual molecules and their interactions, (2) account for the specific function of every annotated gene product, and (3) accurately predict a wide range of observable cell behaviors. That sounds good. So what these people did was to model the different subcellular functions as modules — they had 28 of these functions — and to model each module separately using the appropriate modeling approach. For instance, for metabolism they used flux balance analysis; for protein degradation they used a Poisson-distribution-based model. They then combined the various models to run whole-cell simulations, tracking 16 cell-state variables. Essentially, they ran the simulation with each module according to how that module was modeled, within an overarching scheme where the activity of each module at a certain time step depended on what happened in the other modules that could affect it in the previous time step. So here you have these various functions and modules, and you can make this sort of bipartite graph and look at the flow from one function to another. As you can see, this is really rather an ingenious way of approaching the whole-cell model: constraining the parts of the model in a way that is likely to be correct. So what can you do with a model of this type? One of the questions they asked arises from the observation that there was more cell-to-cell variation in the subprocesses than in, for instance, the duration of the entire cell cycle.
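The time-stepped, module-to-module coupling described above can be sketched as a toy loop: each subprocess is a function that reads and updates a shared cell state, and the simulator advances all modules one step at a time, so each module sees the others' output from the previous step. The two modules here (a metabolism stand-in and a stochastic degradation stand-in) and all the numbers are invented; the real model has 28 modules, with metabolism handled by flux balance analysis.

```python
import random
random.seed(0)  # make the stochastic module reproducible

def metabolism(state):
    """Crude stand-in for an FBA-derived ATP production rate."""
    state["atp"] += 10

def degradation(state):
    """Stochastic number of degradation events per step (binomial here,
    approximately Poisson at this small per-protein rate), echoing the
    paper's Poisson-based degradation module."""
    events = sum(1 for _ in range(state["protein"]) if random.random() < 0.01)
    state["protein"] -= events
    state["atp"] -= events  # degradation costs energy

def simulate(state, modules, steps):
    """Advance every module once per time step over the shared state."""
    for _ in range(steps):
        for module in modules:
            module(state)
    return state

final = simulate({"atp": 100, "protein": 1000}, [metabolism, degradation], steps=50)
```

The key design point mirrored here is that modules never call each other directly; they communicate only through the shared state, one time step apart.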
For instance, you can see in this first one that the distribution for the initiation of DNA replication is much broader than that for the full cell cycle. They were trying to explain why this happens. To understand it, they ran a whole set of 128 simulations and looked at the relationship between the cell cycle and the subprocesses, and at the molecular entities that participate. What they found is actually very interesting. After a rather exhaustive analysis of everything, which I won't describe here, they found that deoxynucleotide triphosphate (dNTP) levels control the balance between initiation of DNA synthesis and the replication phase of DNA synthesis. Both of these phases are independently controlled by dNTP levels, and the two balance each other off. Additionally, the novel observation that it wasn't the structure of the DNA or the enzyme activities, but rather the levels of the dNTPs, that mattered indicates that energy might be playing a major role. So essentially what they claim is that they found an emergent biochemical, or metabolic, control of the cell cycle. That's actually an interesting observation. Another thing whole-cell models can do is provide you with a global view of a type of subcellular function you might be interested in. In this case, the example they give is a global view of energy usage. You can see that during the cell cycle, the rates of synthesis of ATP, GTP, and NADPH are all pretty stable. However, ATP and GTP are produced at a much higher rate than NADPH. They then ascribe, from the modeling, how much energy each part of the cell cycle utilizes. Clearly translation uses a lot of energy, and transcription also uses a lot of energy. But quite surprisingly, nearly 44% of the energy that's produced is unaccounted for.
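The energy-accounting computation behind that 44% figure is just a budget: sum the energy each subprocess is predicted to consume, compare with total production, and report the unaccounted fraction. The process names and numbers below are invented to mirror the qualitative result quoted above (translation dominant, a large share unaccounted); they are not the model's actual output.

```python
# Total NTP equivalents produced over the cell cycle (arbitrary units)
produced = 100.0

# Hypothetical predicted consumption by subprocess
consumed_by = {
    "translation":     30.0,
    "transcription":   12.0,
    "replication":      5.0,
    "other_processes":  9.0,
}

accounted = sum(consumed_by.values())
unaccounted_fraction = (produced - accounted) / produced  # 0.44 in this toy budget
```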
This gives you the usage by the various subcellular processes, and you can see most of them use very little of the cell's energy. So the question becomes — and this is a really interesting question — why is such a large fraction, 44%, of the energy that is produced not utilized in the cell cycle? What is this energy for? Perhaps it reflects redundancy or robustness in the energy-producing and energy-consuming systems. Alternatively, it may be that we still don't truly understand all of the functions of this bacterium, and there are some hidden, energy-utilizing functions we don't know about, which is why the bacterium appears to set this energy aside. This will have to be resolved by further studies. Okay, so that was at a global level; let me contrast it with something that happens at a really molecular, or even sub-molecular, level. In this part of the study, they combined the simulations of the full bacterial cell with gene-disruption experiments. They disrupted certain genes and asked what the behavior of the cell would be, both in control, wild-type cells and in the mutated cells; the genes they disrupted allowed the cell to participate in energy and growth processes under a variety of conditions. So they compared the growth of these cells in both wild-type and disrupted strains, and from this comparison they were able to match the model — or, in places, fail to match the model — with the experiments. The discrepancy allowed them to identify the range in which the kcat for these enzymes would have to lie for the model to be operational, both in the wild type and in the disrupted strains, to allow the usage of these processes in cell growth.
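The kcat-inference idea can be sketched as a parameter scan: try candidate kcat values in a growth model and keep the range that reproduces the observed growth rate within tolerance. The toy growth model and all the numbers here are invented; the real study compared full whole-cell simulations of wild-type and disruption strains against measured growth rates.

```python
def growth_rate(kcat, enzyme_conc=1.0, substrate_saturation=0.8):
    """Toy model: growth is proportional to flux through the enzyme."""
    return 0.01 * kcat * enzyme_conc * substrate_saturation

observed = 0.4   # hypothetical measured growth rate
tolerance = 0.05 # hypothetical measurement uncertainty

# Scan candidate kcat values; keep those consistent with the observation.
feasible = [k for k in range(1, 101)
            if abs(growth_rate(k) - observed) <= tolerance]
```

The surviving interval of `feasible` values plays the role of the predicted kcat range for an enzyme that has not been purified and measured directly.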
Once they know these limitations on the usage of the enzymes in cell growth, they can identify the range of kcat values over which the enzymes would have to be operating to give rise to the observed whole-cell behavior. In this manner, by comparing the growth rates, they were able to make predictions for the kcats of enzymes that have not yet been purified, and these kcats, as they claim in the paper, agree very nicely with known kcats for other bacterial species that are closely related to the one they are studying. So the conclusions from this kind of whole-cell modeling are the following. A whole-cell modeling approach is a reasonable one for integrating multiple subcellular processes to simulate whole-cell behavior. From a simulation perspective, the technological advance that the study demonstrates is quite substantial — it's a significant advance. They do make a number of interesting predictions that can be tested experimentally, and I suspect that is what they will be doing in the future: testing some of these higher-order predictions. But what is not certain is this: yes, this model works for a very simple organism, but will the approach work if it has to be scaled to an organism with, say, 10,000 or 25,000 genes? This we don't know as yet.