Systems Biology

This essay was written by Richard Goldstein and was first published in the 2005 Mill Hill Essays (ISBN 0-9546302-3-8).

Modern science generally proceeds from the idea of ‘taking something apart and seeing how it works’. The basic principle has been reductionist – first figure out how each of the parts works before trying to fit the parts together. In recent years biologists have been concentrating on the first aspect, the investigation of individual parts, but we are now increasingly addressing the second aspect. How do we use our knowledge of the individual components and how they interact to understand how a living being works? How do we use that understanding to improve our health and our environment? This is the emerging area of ‘systems biology’.

The questions addressed by systems biology have been with us for centuries. The circulatory system, for example, was mapped by William Harvey in the 17th century, and even earlier by Ibn Nafis in the 13th century. For decades, scientists have investigated the regulatory system and how it turns genes on and off. So what is it about systems biology that is new? What new challenges does it bring? And how is it going to change the way that we do science?

Let us look first at why science has changed, at recent advances in experimental techniques, computational tools, and theoretical understanding. These advances have allowed us to address old questions in new ways, and to ask some new questions as well.

Firstly, new experimental methods have given us unprecedented understanding about the constituent parts. The human genome programme, combined with other related research programmes, has provided us with extensive inventories of the components of biological systems – the genes, the proteins and nucleic acid fragments that the genes encode, as well as the multitude of chemicals involved in the functioning of a living system. This has been helped by the rise of ‘industrial science’. Robotics and automation have helped researchers to acquire data at an unprecedented pace. Where once a scientist might have spent a career studying one particular protein, increasingly we can study hundreds of components in parallel. Especially important, the availability of such information for different types of organisms, from bacteria to humans, allows us to make detailed comparisons between species.

Secondly, we have developed techniques for mapping out the interactions of these component parts. Increasingly, we can use sophisticated experimental and computational methods to observe or predict which proteins and other biologically important molecules communicate with each other, and how they are affected by these interactions.

These two aspects are still firmly focused on the study of the component parts but a new type of experiment looks at the state of the system in a holistic manner. We can, for instance, measure how much of a given molecule the cell is producing and how this quantity depends upon what the cell is doing at that time. And we can do this for thousands of different molecules simultaneously. We can also look at how proteins are modified by the cell after they have been produced. These methods allow us to take a ‘snapshot’ of the state of the whole cell. The power of these methods has been increased by the ability to genetically modify organisms, for instance, deleting every single gene in yeast, one gene at a time, and seeing how the organism is affected.

This deluge of information will only help us to understand if it can be managed efficiently. The data have to be collected, curated and made available to the scientists in the field. That is where the development of computers has been important. It is now possible to store gigantic databases, allowing access to researchers from around the globe. In addition to information storage and retrieval our growing computer power is also allowing us to simulate large and complicated systems, from metabolic and regulatory circuits, to cells, to whole organs. The rise in our ability to model systems biology has come about because of our ability to use systems of computers, to link computers together to perform complex calculations.

Finally, it is becoming clear that the investigation of a system of interacting components may require different ways of thinking than the study of the individual components, just as the flow of traffic cannot be understood solely by considering automobile mechanics. Biological systems are complicated, involving different forms of control and feedback. Consequently, the results of a change to the system can be unpredictable and counter-intuitive, just as the widening of a road can sometimes increase traffic congestion. We may not be able to go easily from the properties of individual system elements to the characteristics of the whole; rather, new ideas may be needed. This conceptual aspect has been helped by the rise of theoretical advances in computer science, physics, and control theory, which have provided us with new tools for understanding complex, regulated systems.
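
As a small illustration of how feedback can blunt the effect of a change, here is a minimal sketch of a toy model in Python. The model is an assumption made for illustration and does not come from any particular biological system: a product represses its own production, and the parameter names (`production`, `K`, `decay`) are hypothetical. It is a far simpler effect than the traffic example, but it shows a response that is not proportional to the change made.

```python
# Toy model of negative feedback (an illustrative assumption, not from the essay):
# a product x represses its own production, so
#     dx/dt = production / (1 + x / K) - decay * x


def steady_state(production: float, K: float = 1.0, decay: float = 1.0,
                 dt: float = 0.01, steps: int = 5000) -> float:
    """Integrate the toy model with forward Euler until it settles."""
    x = 0.0
    for _ in range(steps):
        dxdt = production / (1.0 + x / K) - decay * x
        x += dt * dxdt
    return x


baseline = steady_state(production=10.0)
doubled = steady_state(production=20.0)
print(f"baseline level: {baseline:.2f}")
print(f"after doubling production: {doubled:.2f} ({doubled / baseline:.2f}x)")
```

In this toy model, doubling the production rate raises the settled level by only about half, because the feedback pushes back against the change.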

These developing aspects – the abundance of information on specific components, the ability to measure the overall state of a system, and increases in our ability to model and understand complex systems – point out what is different and new about modern systems biology. There has been much research into understanding the molecules of life (what could be considered a bottom-up approach), as there has been into describing the large-scale anatomical features (what could be considered a top-down approach). But we now have a chance to make a connection between the two, to understand how the large-scale features emerge from the molecular description, how the molecules function in the larger biological context, and how we can understand biology, both healthy and diseased, on both these levels. The essence of systems biology is making a connection between the various size scales, from the molecular to the organismal.

There are real challenges in bridging the gap between the molecular description and the description of cells, tissues, organs, and whole organisms; or between the time-scales of molecular motions (billionths of a second) and the decades over which some living systems exist, as well as the longer changes over which species and ecosystems evolve. These challenges are primarily of three types – practical, conceptual, and cultural.

One practical difficulty is the number of small individual parts involved in a large complex system. If we want to model the human body on a cellular level, we would have to model approximately one hundred trillion different cells. A typical cell would contain approximately ten thousand different kinds of proteins. Even with the fastest computers, we are still a long way from performing realistic simulations at this level. The problem is that what happens at the molecular level matters on these larger scales. The difference between drinking ethanol and drinking methanol is the difference between drunkenness, and blindness or even death, though the two molecules differ only by a few atoms. Changing just one DNA base out of the 3 billion in our genome can cause incurable disease. The action of drugs is also necessarily molecular. If we need to include information at the molecular level, but cannot include a complete molecular description, then we have to use a reduced description – including what is important, and ignoring what is not. But how do we know what is important?
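
To give a rough sense of the scale involved, the following back-of-the-envelope sketch multiplies the two figures quoted above. The choice of one 8-byte number per protein kind per cell is purely an illustrative assumption; a realistic simulation would need far more than a single value per protein.

```python
# Back-of-the-envelope estimate of the state needed for a cell-level body model.
# The two counts come from the text; BYTES_PER_VALUE is an illustrative assumption.
CELLS_IN_BODY = 100 * 10**12        # approximately one hundred trillion cells
PROTEIN_KINDS_PER_CELL = 10_000     # approximately ten thousand kinds of protein
BYTES_PER_VALUE = 8                 # assumed: one 64-bit number per protein kind


def state_variables(cells: int, kinds_per_cell: int) -> int:
    """Number of values needed to store one abundance per protein kind per cell."""
    return cells * kinds_per_cell


def storage_bytes(n_values: int, bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Memory needed to hold a single snapshot of those values."""
    return n_values * bytes_per_value


n_values = state_variables(CELLS_IN_BODY, PROTEIN_KINDS_PER_CELL)
snapshot_exabytes = storage_bytes(n_values) / 10**18
print(f"{n_values:.0e} values, about {snapshot_exabytes:g} exabytes per snapshot")
```

Under these assumptions, even a single static snapshot would take about 8 exabytes, before any of the dynamics are simulated.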

An additional practical problem is the combination of different types of data. We have to find a way of manipulating sequences of letters (such as the genome), structural data, visual images, networks, graphs, and distributions to give an integrated view of the biological system. Each form of data has its own properties, its own strengths, its own inaccuracies. The essence of systems biology is finding a genuine way to compare apples with oranges.

As mentioned before, it is likely that new concepts will be needed to understand biological systems. These concepts may arise from biology, from other fields such as cybernetics and control theory, or they may need to be created. Two important concepts in systems biology are the notions of ‘emergence’ and ‘self-organisation’. Emergence refers to the process by which large-scale phenomena occur as a result of interactions at a smaller scale. The same phenomenon, such as nerve impulses in the brain, can be very different on the microscopic level – ion flow through membrane channels – and on the macroscopic level – thoughts, perceptions, and consciousness. Related to this is the process of ‘self-organisation’, how it is possible for small, simple units spontaneously to organise themselves into more complicated structures. Both of these concepts suggest that it will be difficult to go smoothly from our knowledge of the molecular to an understanding at the systems level. One possibility is that there are similarities between biological systems and other large-scale integrated complicated systems, such as transport systems, the internet, and economics. If so, and provided there are no specific properties of biological systems that invalidate such comparisons, then it may be possible for systems biology to use the concepts that have been developed for these other areas.
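
The idea of a large-scale pattern arising from purely local interactions can be sketched with a one-dimensional cellular automaton. This is a standard toy from computer science (the rule known as Rule 90), used here only as an analogy and not as a model of any biological system; the function names `step` and `run` are chosen for this illustration.

```python
def step(cells):
    """Apply Rule 90: each cell becomes the XOR of its two neighbours."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]


def run(width=31, generations=16):
    """Start from a single live cell and return every generation."""
    row = [0] * width
    row[width // 2] = 1
    history = [row]
    for _ in range(generations - 1):
        row = step(row)
        history.append(row)
    return history


# Print each generation as a row of text, one character per cell.
for row in run():
    print("".join("#" if cell else "." for cell in row))
```

Every cell follows the same rule and sees only its two neighbours, yet the printed rows form a nested triangular pattern that no individual cell encodes.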

The diverse forms of data, the demands of different types of analysis, and the combination of experimental, computational, and theoretical techniques mean that systems biology must be an inherently interdisciplinary field. This study will require integrated teams of scientists from across disciplines, including theoretical and computational biologists and wet-lab molecular biologists, those working on the properties of individual molecular components along with scientists working on larger, more complex, integrated phenomena. This will require different groups of people, all with their own specific languages and mindsets, to learn from each other in a productive way.

In particular, with the rise of modern experimental and theoretical techniques, there are now two very different approaches in science. One approach is the standard ‘hypothesis-driven’ approach. A question is asked, a hypothesis is constructed, predictions are made based on that hypothesis, and experiments are designed to test if these predictions are correct. Increasingly, however, scientists are performing what is called ‘discovery research’. In this process, data are accumulated and then analysed for patterns, and the patterns are used to create hypotheses. In this way, experiments are performed in the absence of a hypothesis to be tested. If hypothesis-driven research is a torch, being directed to a particular aspect, discovery research is a sieve, panning for interesting combinations of units that assemble in non-random ways. This sieving process cannot be random, but must rest on what has been achieved through the hypothesis-driven methods. The hypotheses so generated must then be tested through this standard approach. It is therefore important that these two groups of scientists, approaching science from fundamentally different perspectives, be able to work together productively.
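
The interplay of the two approaches can be sketched in a few lines of Python. Everything here is synthetic and hypothetical: the four ‘molecules’ A to D, the sample sizes, and the use of a simple correlation scan as the pattern-finding step are all assumptions made for illustration. A survey dataset is sieved for the strongest pattern, and the resulting hypothesis is then tested on fresh data.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def measure(rng, n_samples):
    """Synthetic 'measurements' of four hypothetical molecules A-D.

    By construction, B follows A; C and D are unrelated noise.
    """
    a = [rng.gauss(0, 1) for _ in range(n_samples)]
    b = [x + rng.gauss(0, 0.5) for x in a]
    c = [rng.gauss(0, 1) for _ in range(n_samples)]
    d = [rng.gauss(0, 1) for _ in range(n_samples)]
    return {"A": a, "B": b, "C": c, "D": d}


def strongest_pair(data):
    """Discovery step: scan every pair and return the most correlated one."""
    return max(itertools.combinations(sorted(data), 2),
               key=lambda pair: abs(pearson(data[pair[0]], data[pair[1]])))


rng = random.Random(0)
survey = measure(rng, 200)            # data gathered without a hypothesis
hypothesis = strongest_pair(survey)   # pattern found -> candidate hypothesis

follow_up = measure(rng, 200)         # fresh data to test the hypothesis
r = pearson(follow_up[hypothesis[0]], follow_up[hypothesis[1]])
print(f"hypothesis: {hypothesis[0]} and {hypothesis[1]} vary together; "
      f"follow-up correlation r = {r:.2f}")
```

Testing on a separate dataset matters: with enough variables, a scan will always find some pair that looks related by chance, and only the follow-up experiment can tell a real pattern from an accident of the survey.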

Systems biology has developed its own approaches to addressing these challenges, one of which is to place theoretical and mathematical modelling at the centre of the endeavour. Understanding a biological system is now considered to mean that a theoretical model can be constructed that is able to make correct predictions. This means that such models must be validated with new experimental results. In addition, these models can highlight what is important, where our most profound ignorance lies, and suggest future experimental work. But theoretical models also have to be based on what we know about the components, how they interact, and how the global patterns emerge. Theoretical models have to be based on our experimental understanding. The result is an iterative approach. New data and new results are assembled and used to create mathematical models of the biological system.

These models are then used to make predictions. These predictions can then be tested by looking at other forms of data, by accumulating new data, or through direct experimental testing. The results of this experimental process are then used to construct better models.

An example of this process is the development of models of the heart. The heart is an attractive area of study as there are experimental data at a range of different scales, from the expression of genes in different cells in different locations, to the flow of ions through various membrane channels, to the electrical responses observable on an electrocardiogram and measurements of muscle contraction, blood pressure and blood flow. During the past 40 years, the iterative process of experimental observations and mathematical models has developed models for the different types of cardiac cells, as well as more mechanical models of the overall structure. Work is now progressing to connect these two types of models, so that what occurs at the organ level both reflects and determines the cellular processes. The resulting models are now sufficiently accurate to be used by the US Food and Drug Administration to assess the action of drugs developed for the heart. Models of other organs, such as the liver, kidney, lung, and pancreas, are in development.

Much of modern drug discovery involves identifying how diseased organisms are different from non-diseased organisms, and trying to decipher how to make the former more resemble the latter. As we are able to investigate the differences between disease and health more deeply, we obtain increased understanding of where these differences are, and how they can be addressed by drugs. Unfortunately, the hazards in the development of new drugs are many. Any chemical can have a wide range of impacts on an individual, in a way that may depend on the individual. It is difficult to understand why some drugs work and some do not, why some have negative side-effects, and why some drugs work for some people but not for others.

In this manner, drug research is like a maintenance employee surveying a flooded room. It is obvious that the difference between a flooded and dry room is the amount of water, so pumps are brought in to bring down the water level. But sometimes that is not sufficient – water keeps flooding in. It is necessary to identify which water main has broken, and to turn it off. But how do we know which water main is responsible for the flooding? How do we know which valve to close? How can we predict what else will happen if we close a given valve? How much does the choice of correct valve depend upon the particular room that is being flooded? The ability to look beyond the water level at the parts of the system causing the flooding greatly increases our ability to deal with the situation.

Similarly, understanding the various components and how they are interacting to produce the disease can greatly increase the range of possible drug targets. Understanding how each component works in the context of the other […] Simulations might be able to replace some fraction of currently-needed animal research. Finally, by understanding the role of the various components, we can see which components are different for different people, and thereby determine whether a given drug is effective, ineffective, or harmful. This would greatly accelerate the drug-development process, as it would be possible to test the drug on those people who are most likely to respond and least likely to have side effects. The ultimate goal is personalised medicine, where the particular combination of drugs would be decided for an individual patient based upon, for instance, their specific genome.

Finally, there are potentially useful non-medical applications for systems biology. If we can understand the inner workings of organisms, we can potentially modify such organisms to do our bidding. Could we design bacteria that gobble up oil spills? Or bacteria that are able to attack and kill cancer cells, but do no harm to normal cells?

Systems biology is essentially about making connections: connecting small-scale molecular events with large-scale physiological consequences; connecting actions that take place in billionths of a second with growth, development and ageing; connecting experimental, theoretical, and computational approaches; connecting scientists with different backgrounds and languages and cultures.

The pressures driving systems biology – the flood of information about components, their interactions, and how they work in larger systems, the growing power of experimental and computational techniques, the insights of new theoretical approaches – show no sign of slackening. It will be increasingly difficult to think about large-scale biological systems without thinking about their components, and correspondingly difficult to consider these components except in the context of how they work together with others. For these reasons, systems biology is likely to become a dominant force in the medical sciences, even if it ceases at some point to be called ‘systems biology’. With these changes will come difficult challenges – cultural as well as scientific – but also unprecedented opportunities. Systems biology promises to allow us to better understand living systems and to modify their behaviour, preventing and curing disease and providing for many biotechnological applications.
