On Making the Genome Whole - Part I

Part 1: Twilight of the Double Helix
Steve Talbott
When Francis Crick and James Watson announced in 1953 that they had discovered the double-helical "secret of life", they bequeathed to our imaginations an image combining the cool, efficient, geometrically precise beauty of a crystal, with the compelling logic of a computer program.
The logic, as it would be pieced together over the next few years, was simple and elegant. Four chemical groups — nucleotide bases, the distinctive "letters" of a life-engendering code — were strung by the millions along both spiraling strands of the double helix. Each successive group of three letters lying along a strand was a code naming a particular amino acid, and a sequence of many such codes represented, in proper order, all the amino acids making up a single protein. Thousands of such proteins, so constructed, were the primary workhorses of every cell, forming many of its various structures and mediating its countless chemical interactions. And so the double helix (otherwise known as DNA), with its carefully sequenced letters, was the instruction book for assembling a living organism.
The salient facts of organism assembly in this early picture were likewise straightforward. The forty-six chromosomes in a human cell consisted, most essentially, of double helical DNA, and this DNA was divided into numerous genes, each of which in turn coded for one protein. By a process known as "transcription" and facilitated by an enzyme, an individual gene gave rise to a kind of mirror image of itself in the form of a molecule known as "messenger RNA" (mRNA). This molecule, also containing a sequence of nucleotide bases, preserved the gene's coding for a protein. Then another kind of RNA, called "transfer RNA" (tRNA), came into play: in conjunction with some specialized machinery at the site of protein synthesis, tRNA read the code imprinted upon the mRNA and used it to assemble amino acids into the specified protein. This latter process was called "translation".
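The two steps of this early picture can be made concrete with a toy sketch. The Python below is an illustration only: the function names `transcribe` and `translate` are invented for this example, the codon table is a five-entry excerpt of the standard genetic code, and the enzymes, tRNA, and machinery of protein synthesis are left out altogether.

```python
# Toy model of the early picture: DNA -> mRNA -> protein.
# All names are illustrative; nothing here comes from a bioinformatics library.

# Base pairing used when a DNA template strand is copied into mRNA.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# A tiny excerpt of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met",  # also the usual start signal
    "GCC": "Ala",
    "UUU": "Phe",
    "GGA": "Gly",
    "UAA": None,   # stop codon: names no amino acid
}


def transcribe(template_strand: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_strand)


def translate(mrna: str) -> list[str]:
    """Read the mRNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("TACCGGAAAATT")
print(mrna)             # AUGGCCUUUUAA
print(translate(mrna))  # ['Met', 'Ala', 'Phe']
```

The sketch captures exactly what the early story claimed: a lookup table applied in one direction, with nothing in the cell able to alter the outcome.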
Perhaps the most compelling detail in this picture was the fact that when a mistake occurred — when a letter of the DNA code was transcribed into the wrong letter of mRNA — an error-correction machinery zeroed in on the mistake and fixed it. Nothing could have illustrated more vividly the directed, computer-like efficacy of the entire process. The scheme was both satisfyingly logical and causally effective. The DNA codes that named a protein simultaneously constituted the master template and initiating machinery for constructing it. The master DNA instruction manual was passed from parent to offspring with remarkable fidelity, and its instructions were executed in such a way that information and control always flowed in a single direction. "DNA makes RNA, and RNA makes protein", as the saying went. Within the individual organism, DNA was a kind of First Cause or Unmoved Mover. As Nobel laureate Max Delbrück put it, DNA "acts, creates form and development, and is not changed in the process" (1971).
In fact, the story was so neat — and, for most researchers, so entirely convincing — that one heard occasional murmurings of regret about the unfortunate lot of future biologists. Wouldn't they be left with the not very stirring task of working out the subordinate details? If the overall logic and the governing causal pathways were already known, at least in principle, what could remain except for nitpicking at ever lower levels of analysis?

A Disturbing Context

And yet, as we now know, the story was crushingly false to life. Biologists need not have fretted over their sources of career satisfaction — nor over their employment prospects. It was some forty years after the discovery of the double helix that one of the most massively funded research projects in the history of science mobilized genetic laboratories around the world to tease out the complete and definitive text of the genomic "Book of Life". It is true that this Human Genome Project, which many hoped would lead directly to the final solution of life, again raised questions about a meaningful future for biologists. But any such worry has again been set aside. For the project was scarcely completed when the realization struck that the "solution" was still enigmatically encoded as a raw, undeciphered text. The key to interpretation, many decided, was an even more ambitious project, the elucidation of the "proteome" — the tens of thousands of proteins in the body, with their complex folding patterns and endlessly diverse functioning. That effort is now under way.
Meanwhile the envisioned keys to life themselves have been growing ever more diverse, each speaking its own distinctive language, and each looking like part of a puzzle that keeps growing in scope and complexity faster than our identification of individual pieces. Take, for example, the actual stuff of chromosomes, called chromatin, which consists not only of DNA but, even more extensively, of proteins that give their own form and structure to the chromosome. For many years this unruly protein setting was largely ignored as geneticists focused on the controlling wizardry of the coded genes. But now numerous laboratories are uncovering how the continual and intricately choreographed modification of chromatin affects the activity of genes. Although the researchers' first impulse was to find another "simple code", it now appears, according to geneticist Shelley Berger of Philadelphia's Wistar Institute, that "a more likely model is of a sophisticated, nuanced chromatin 'language' in which different combinations of basic building blocks yield dynamic functional outcomes" (Berger 2007, p. 407). And so the chromatin interpretation industry has become one of the largest enterprises within molecular biology.
But chromatin is hardly the end of it. New and strange names point to multiplying decipherment challenges. We hear of the "methylome" and "membranome", the "histone code" and "RNA-interference code". And, most encompassing of all, there is the "epigenome", consisting of all the varied cellular processes that bear on the activity of genes — processes that not only influence whether or not a gene is transcribed, but can even alter the effective sequence of genetic letters. These epigenetic ("extra-genetic") processes seem to determine the genetic code at least as much as they are determined by it. But if this is true, then what has become of the master controller?
Every biologist today will grant the inadequacy of the story of the 1950s. Many would probably add that it's perfectly natural for our understanding to grow more complex with time — so why harp on the inevitable limitations of those early pioneers who made great discoveries?
But this, I'm convinced, is to miss the dramatic significance of the current revision of our understanding of the living organism. Exactly how that earlier story was false, and with what seismic implications for the foundations of biology, is still scarcely appreciated by the general public — or even by many of those scientists who have been pronouncing the end of the era of the gene.
What's at stake is the nature of biological explanation — our understanding of understanding itself. Particularly at issue are the distortions introduced by a one-sidedly logical-causal habit of thinking — distortions worsened by the continuing failure to enter into the more organic sort of understanding that so many have hoped for over the years and even centuries.
I will have a great deal to say about the character of both logical-causal thinking and organicism. But first we need to ground ourselves in some of the striking revelations stemming from the ongoing research in epigenetics — research that has now, by force of its paradigm-subverting potential, assumed a position front and center in the consciousness of molecular biologists.

Two Problems

Already at the beginning of the double helix era a troubling question bedeviled all discussions of the DNA sequence as the Master Logic and First Cause of life. Every human being begins life as a single cell containing an entire genome. But over the course of development this cell becomes many radically different kinds of cell in all the various tissues of the body. From muscle to nerve, from retina to kidney, from skin to brain, every cell contains the same chromosomes with the same DNA sequences¹. If these sequences are what determine traits, how do we account for the dramatically different kinds of cell?
Imagine the situation concretely. You have a single, undifferentiated cell, and then this cell divides and the two daughter cells enter upon pathways leading to different tissues. During cell division, the chromosomes are faithfully replicated, so that each daughter cell receives the same "instruction set". How is it that these identical instruction sets proceed to direct the cells along divergent paths, so that offspring of the one eventually gain the ability to expand and contract as part of a muscle, while offspring of the other take on a rigid form together with a specialized ability to transmit electrical signals? It seems that a cell of the heart muscle must possess a "self-understanding" decisively different from that of a brain cell, and this understanding cannot derive solely from its DNA.
Actually, this problem was raised by many observers long before the genomic era. For example, the developmental biologist F. R. Lillie, remarking in 1927 on the contrast between "genes which remain the same throughout life" and the developmental process, "which never stands still from germ to old age", asserted that "Those who desire to make genetics the basis of physiology of development will have to explain how an unchanging complex can direct the course of an ordered developmental stream" (Lillie 1927, pp. 367-8).
Fundamental though it was, the objection received little attention for several decades. Meanwhile, a central result of the Human Genome Project posed a second problem. Instead of the expected hundred thousand or more genes in the human genome, there turned out to be only twenty-five thousand or so — roughly the number possessed, for example, by a simple, one-millimeter-long, transparent roundworm, Caenorhabditis elegans. If it really is genes that account for the organism in all its complexity, how can it be that a human being and a primitive worm can be accounted for by a similar number of genes? "As far as protein-coding genes are concerned", writes Ulrich Technau, a developmental biologist from the University of Vienna, "the repertoire of a sea anemone . . . is almost as complex as that of a human" (Technau 2008, p. 1184).
The answer increasingly proposed by biologists is that genes are far from the whole story if you want to understand the organism. Some ninety-nine percent of human DNA does not consist of genes — that is, does not code for proteins. Most of this noncoding DNA was long referred to as "junk" and was assumed to be an evolutionary accumulation of meaningless genetic detritus. As it happens, though, an intriguing pattern has emerged: noncoding DNA accounts for only 10% of the DNA of a one-celled prokaryote, 32% in yeast, 75% in roundworms, 83% in insects, 91% in a pufferfish, and 98% in a chicken (Costa 2008, p. 12). In other words, the more complex the organism, the greater the amount of junk!
The obvious thing to do was to look more closely at this neglected DNA. And after several years of looking, the reversal of thought has been both radical and ironic: the "junk" is now hailed as a primary measure of our evolutionary progress. In concert with the cell as a whole, it helps to provide the sophisticated coordination of genomic resources distinguishing the higher organisms from the lower.
This same junk is also thought to contain part of the answer to our first problem — organ differentiation in the presence of a fixed genetic code. The power of differentiation lies, not in the genes, but in the management of them. The junk, it turns out, has a lot to do with this management. Furthermore — and this is where the currently flourishing discipline of epigenetics comes to full flower — the resources for management are found, not only in noncoding DNA, but in processes broadly distributed throughout the cell.
We will look at some of these processes after briefly noting the kind of experimental result that has encouraged researchers to begin exploring the epigenome.

Reckoning with the Environment

In the mammalian genome, chromosomes normally come in pairs, one inherited from the mother and the other from the father. Any given gene occurs twice, with one version ("allele") located on the first chromosome of a pair and the other on the second. When the two alleles are identical, the organism is said to be homozygous for that gene; when the alleles are different, the organism is heterozygous. For example, there are mice who, in their natural ("wildtype") state, are dark-colored — a color that is partly dependent on a gene known as Kit. The mice are normally homozygous for this gene. When, however, one of the Kit alleles is replaced with a certain mutant gene, the now heterozygous mouse shows white feet and a white tail tip.
That result was perfectly natural (if you call such artificial gene manipulations "natural"). But it is also where the story becomes interesting. Scientists at the University of Nice-Sophia Antipolis in France took some of the mutant, white-spotted mice and bred them together (Rassoulzadegan et al. 2006). In the normal course of things, some of the offspring were again wildtype homozygous animals — neither of their Kit alleles was mutant. However, to the researchers' surprise, these "normal", wildtype offspring maintained, to a variable extent, the same white spots characteristic of the mutants. It was an apparent violation of Mendel's law of inheritance: while the genes themselves were sorted between generations properly, their effects did not follow the "rules". A trait was displayed despite the absence of its corresponding gene. Apparently something in addition to the genes themselves — something epigenetic — figured in the inheritance of the mice offspring, producing the distinctive coloration.
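The Mendelian expectation that this result appears to violate can be spelled out in a few lines. The following Python sketch is a textbook single-gene cross and nothing more: the allele labels `+` and `m` and the function name `cross` are chosen for this illustration, and the sketch says nothing about viability or about the actual breeding data in the study.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> Counter:
    """Count offspring genotypes when each parent passes on one of its two alleles."""
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sort so that "+m" and "m+" are counted as the same genotype.
        genotype = "".join(sorted(allele_a + allele_b))
        offspring[genotype] += 1
    return offspring


# "+" = wildtype Kit allele, "m" = mutant allele (labels assumed for this sketch).
result = cross("+m", "+m")
total = sum(result.values())
for genotype, count in sorted(result.items()):
    print(genotype, f"{count}/{total}")
```

On this classical accounting, the one-in-four `++` offspring carry no mutant allele and should look entirely wildtype. The surprise reported above is that they did not.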
Another group of researchers, led by Michael Skinner at Washington State University, looked at the effects of the fungicide vinclozolin on laboratory rats (Anway et al. 2006; Crews et al. 2007). Banned in Scandinavia and Europe but allowed on some crops in the U.S., vinclozolin is an endocrine-disrupting chemical. If pregnant female rats are exposed to it while their embryos are undergoing sexual organ differentiation, the male offspring develop serious problems as adults — death of sperm-generating cells, lowered sperm count and motility and, later, immune abnormalities and various diseases including cancer. The remarkable thing is that the effects were found to be transmitted over four generations without weakening. That is, acquired characteristics — deficiencies in embryos brought on by fungicide exposure — were inherited by offspring who were not subject to the same exposure. This led Skinner to ask a troubling question: "How much of the disease we see in our society today is transgenerational and more due to exposures early in life than anything else?" (quoted in Brown 2008).
The whole business looks rather like vindication for the long-dismissed Lamarckian doctrine of the inheritance of acquired characteristics, a doctrine that has indeed been making a comeback of late. But inheritance aside, puzzling results such as these put the question, "Are genes equivalent to destiny?" in a new light. In 2007 a team of researchers at Duke University reported that exposure of pregnant mice to bisphenol A (a chemical used in many common plastics such as baby bottles and dental composites) "is associated [in the offspring] with higher body weight, increased breast and prostate cancer, and altered reproductive function". The exposure also shifted the coat color of the mice toward yellow — a change again found to be transmitted across generations despite its not being linked to a gene mutation. But more to the present point: the changes brought on by the chemical were negated when the researchers supplemented the maternal diet with folic acid, a B vitamin (Dolinoy et al. 2007).
And so an epigenome that responds to the environment can respond to healthy as well as unhealthy influences. As another illustration of this: researchers at McGill University in Montreal looked at the consequences of two kinds of maternal behavior in rats. Some mother rats patiently lick and groom their newborns, while others generally neglect their pups. The difference turns out to be reflected in the lives of the offspring: those who are licked grow up (by the usual measures) to be relatively confident and content, whereas the neglected ones show depression-like symptoms and tend to be fearful when placed in new situations.
This difference is correlated with different levels of activity in particular genes in the hippocampus of the rats' brains. Not that the genes themselves are changed; what the researchers found was various epigenetic modifications of the hippocampus that alter the way the genes work (Weaver et al. 2004). Other investigations have pointed toward similar changes in the brains of human suicide victims who were abused as children (Poulter et al. 2009).
Perhaps even more surprisingly, mouse embryos grown by means of in vitro fertilization (IVF), and so spending their first several days in a petri dish, showed epigenetic changes resulting in altered gene "expression" (transcription). And now there are reports that humans conceived through IVF have an increased risk of several birth defects. The main suspect is again the epigenome (Kolata 2009).

A Broader Picture of Gene Function

So what is going on?
All the examples just given show how the environment can play into the organism's genetic performance. They suggest that genes do not bear a fixed meaning, independent of their context. And one aspect of this context currently receiving intense scrutiny has to do with RNA.
Far from simply carrying out orders for the production of proteins, RNA seems to be involved in wide-ranging cellular functions. Humans possess only about twenty-one thousand protein-coding genes — genes that give rise to mRNA that in turn yields protein — and these constitute about 1.2% of our DNA. Yet by one estimate 93% of the genome produces RNA transcripts — transcripts that, except for a tiny percentage, are not templates for proteins (Zimmer 2008, p. D5). If they are not engaged in producing proteins, what are these noncoding RNAs doing?
They seem to be doing a great deal, although scientists have barely begun to unravel the story. Take, for example, the mice who retained white spots on paws and tail despite the loss of the corresponding mutant gene. When the researchers extracted all the RNA — but not the DNA — from cells of mutant mice and then injected this RNA into the fertilized eggs of normal mice, the eggs developed into adults with the mutant characteristics. It appears, then, that RNA has something to do with the epigenetic inheritance of the white spots.
But there are numerous different kinds of RNA, and there are even more roles they play in the organism. Further, they are only one kind of element in the overall epigenetic landscape — a landscape whose complexity makes any summary presentation extremely misleading. Nevertheless, here are a few pointers into that complexity:

DNA methylation. Every cell "tags" or "marks" various sites along a DNA molecule with a small chemical group known as a "methyl group". These marks, or their absence, can dramatically alter the expression, or transcription, of nearby genes, often shutting them down or "silencing" them. Researchers investigating those mice exposed as embryos to bisphenol A found, among other things, decreased methylation near a key gene affecting coat color. In humans, distinctive patterns of DNA methylation are associated with Rett syndrome (a form of autism) and various forms of mental retardation. Stephen Baylin, a geneticist at Johns Hopkins School of Medicine, says that the silencing, via DNA methylation, of tumor suppressor genes is "probably playing a fundamental role in the onset and progression of cancer. Every cancer that's been examined so far, that I'm aware of, has this [pattern of] methylation" (quoted in Brown 2008).
It's not only the local gene that can be affected by methyl marks, however. The larger pattern of methylation can play a role in orchestrating gene expression over extended stretches of a chromosome. This is connected with chromatin remodeling, discussed below. And, as we will also see shortly, noncoding RNAs figure in DNA methylation.
While some epigenetic changes are heritable through the germ line, many are not — and necessarily so. You wouldn't want the epigenome of a heart cell or kidney cell — or, more relevantly, a gonad cell — to find its way unchanged into the fertilized egg. The slate upon which all the developmental processes of the adult have been written needs to be wiped clean in order to clear a space for the next generation. (Or relatively clean — heritable epigenetic marks are somehow preserved.) As part of this slate-cleaning, a wave of demethylation passes along each chromosome shortly after fertilization and is completed by the time of implantation in the uterus. Immediately following this, a new methylation occurs, appropriate for the embryo and giving it a fresh epigenetic start. When, in mammals, the stage of embryonic methylation is blocked artificially, the organism quickly dies².

Histone modification and chromatin remodeling. You will recall that there is more protein than DNA in a human chromosome. The two together constitute chromatin, an intricately formed, ever-changing substance whose physical, chemical, and electrical qualities figure greatly in gene activity. Among the key proteins are histones, eight of which join together to form something like a spool. Such spools occur along the entire length of the chromosome, with the double helical DNA wrapping 1.67 times around each spool and then extending, string-like, a short distance before wrapping around the next spool. (The DNA-histone complex is called a "nucleosome".) But normally not much of the DNA is "strung out" in this way. The nucleosomes commonly pack themselves into dense, three-dimensional arrangements, upon which are superimposed yet further levels of condensation.
All this is intimately bound up with the transcription of DNA into mRNA — that is, with the expression of genes. Wherever the chromosome is densely packed, the enzymes and other substances participating in transcription do not have easy access to the genes, and therefore gene expression is reduced. And this is where methylation enters the picture again. Methyl groups can attach not only to DNA, but also to the histones — and particularly to the long, filamentary "tails" extending out from the histones. The methyl groups here, too, affect the expression of local genes. They do this in part by mobilizing various proteins, which then become associated with the chromatin and alter its conformation. Some of these chemical complexes seem to work with each other while others work against each other. The net result is a "chromatin remodeling" that may proceed, wave-like, down long stretches of the chromosome, rendering genes either less or more available for transcription.
And, for good measure, the whole remodeling process can be facilitated by DNA methylation. "Thus modification at one level, in this case methylation on the genomic DNA, may have pronounced effects at other levels of organization of the chromatin, a theme of growing importance in the field" (Feil 2008, p. 2).
Other chemical groups besides methyl — groups such as phosphate, acetyl, and ubiquitin — can also attach to the histones, each with its distinctive and as yet scarcely traced interactions and effects. But there are few simple rules. While histone acetylation is generally associated with higher transcription rates, both methylation and ubiquitylation may either repress or activate transcription. Similarly, the phosphorylation of a particular histone site can correlate either with opening up of the chromatin structure and activated transcription, or (during cell division) with the closing and condensation of chromatin — thereby illustrating "the importance of genomic context" (Berger 2007, p. 408). In general, where a methyl, ubiquitin, or other group attaches to a histone tail, and how the group associates with other molecules, shapes its role in gene transcription. Such histone modifications — not only local modifications, but their global pattern — can be correlated with cancer and can even aid in predicting the clinical outcomes of cancer treatments (Seligson et al. 2005).
Chromatin remodeling, however, affects more than gene expression within the genome of an existing organism. It also helps to shape the possibilities for future genomes. It does this by influencing the location and rates of mutation throughout the genome. New evidence suggests that "the physical structure of the genome can directly influence the rate of mutation down to the single-nucleotide level, with far-reaching implications for genome evolution" (Semple and Taylor 2009). This is one of the ways the long-reigning doctrine of random variation is currently being undermined — that is, the doctrine that chance is the supplier of the stuff from which organisms are fashioned.

RNA interference and micro-RNA. Various lines of research during the 1990s led to the discovery of extremely short RNA molecules with an extraordinary ability: they could, with great efficiency, silence particular genes. The frenzy of investigation triggered by this discovery of "RNA interference" (RNAi) has already yielded what geneticists are unabashedly referring to as a "revolution" in their field.
The central molecular players here go by the name of "small interfering RNA" (siRNA). They are derived from the disassembly of long, double-stranded RNA — often from incoming viruses. They are truly small — only about 21-25 nucleotides long — but their short sequences are nevertheless long enough to provide a match with just one particular mRNA and thereby to target that mRNA. The siRNA, after becoming part of a larger protein complex called a "RISC", repeatedly locates its target mRNA, whereupon one of the RISC proteins cleaves the mRNA to pieces. Or else, depending on how perfect the complementarity of sequences between the siRNA and mRNA turns out to be, the latter may simply be disabled from translation rather than sliced up. In neither case is the relevant gene directly silenced, but the mRNA resulting from it is repressed. This is known as "post-transcriptional silencing".
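The sequence matching at the heart of this description can be sketched in code. The Python below is a minimal illustration under stated assumptions: the function names and all sequences are invented, the target site is taken to be the exact reverse complement of the siRNA, and the RISC proteins, the cleavage step, and the partial-match case are ignored.

```python
# Watson-Crick pairing between two RNA strands.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs with `rna` when the two strands run antiparallel."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(sirna: str, mrna: str) -> list[int]:
    """Return every position in the mRNA where the siRNA pairs perfectly."""
    site = reverse_complement(sirna)
    return [
        i for i in range(len(mrna) - len(site) + 1)
        if mrna[i:i + len(site)] == site
    ]


# A hypothetical 21-letter site, and the siRNA that would pair with it.
target_site = "AUGGCUAGCUUACGGAUCCGA"
sirna = reverse_complement(target_site)

mrna_with_site = "GGGAAA" + target_site + "CCCUUU"
mrna_without_site = "GGGAAACCCUUUGGGAAACCCUUUGGGAAACCC"

print(find_target_sites(sirna, mrna_with_site))     # [6]
print(find_target_sites(sirna, mrna_without_site))  # []
```

With four possible letters at each of 21 positions there are 4^21 (about 4.4 trillion) distinct sequences, which is why so short a molecule can still single out one particular mRNA.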
The process, however, is far from being as neat as this description might suggest. For example, an entire drama plays out in the production of siRNA from viruses or, sometimes, from other, endogenously produced molecules. And, of course, the question of overall function arises: what significance is there in the selection of mRNAs for silencing, and how is this selection managed? There are complications at the target end of the process as well. A given mRNA can be masked from the siRNA by virtue of attached proteins, preventing its destruction. Or, conversely, those proteins may lay it bare for destruction by unfolding it and exposing it to the siRNA's complementary nucleotide sequence.
The still rapidly unfolding story of RNA interference is taking on ever wider significance. To begin with, it's not only in the cell of origin that siRNA plays a role. It can migrate to other parts of the body — and its migration to germ cells might explain some cases of epigenetic inheritance. That is, its presence in the germ cell could have much the same result as the loss or mutation of a gene.
It's also been found that siRNAs do not act only post-transcriptionally; they can cooperate with other players in directly silencing genes. They do this by participating in various DNA methylation and chromatin remodeling processes. It appears that, by means of their own short nucleotide sequences, they target specific regions of the chromosome for structural modification (Moazed 2009), with implications for gene expression in those regions.
And, in yet another surprise, researchers have discovered a role for siRNA in what they are calling "small RNA-induced gene activation" — the very opposite of silencing. By targeting a "promoter" site close to a particular gene, the siRNA can powerfully increase expression of the gene.
This last point illustrates an important truth of the living organism: we dare not assume that the meaning of any substance or any process remains constant in all contexts. What the discoveries in epigenetics are telling us is that this is true even of those foremost symbols of immovable constancy, the genes.
The dramatic significance of RNA interference is indicated by the excitement of those researchers wishing to put it to use. For example, they are already using RNA interference to silence the genes that help speed the deterioration of ripe tomatoes on your kitchen shelf. Involving as it does short, easily synthesized molecules, RNAi "has provided scientists with an incredibly powerful tool . . . . it is possible to selectively inactivate virtually any gene, simply by introducing an appropriate synthetic RNA into the cell" (Jablonka and Lamb 2005, p. 136). Of course, if the entire story of epigenetics tells us anything at all, it is that the word "simply" in this enthusiastic endorsement will not fully justify itself. But hope springs eternal.
There is another class of very short RNA not always clearly distinguished from siRNA in the technical literature. It is not derived from viruses, but only (by various elaborate pathways) from double-stranded RNA encoded in the genome. Its final processing occurs outside the nucleus in the cell cytoplasm. Like siRNA, it becomes associated with a multiprotein RISC, locates mRNA molecules, and then disables them in one way or another — evidently not so much by cleaving them as by preventing their translation. And, like siRNA, this "micro-RNA" (miRNA) identifies the target mRNA based on a complementation between its own sequence of letters (nucleotide bases) and that of the target — usually near one end of the target. However, unlike with siRNA, this match of sequences need not be very exact, so that a single micro-RNA can prevent translation of many different mRNA molecules, effectively silencing many genes.
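Why a looser match widens the range of targets can be shown with a small sketch. The Python below is a crude stand-in, not a model of real micro-RNA targeting: it simply counts unpaired positions and accepts a site when the count stays under a threshold. The name `pairs_with`, the `max_mismatches` parameter, and every sequence are assumptions made for illustration, and the small RNA is shortened to eight letters for readability.

```python
# Watson-Crick pairing between two RNA strands.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairs_with(small_rna: str, mrna: str, max_mismatches: int) -> bool:
    """True if some stretch of the mRNA pairs with the small RNA,
    tolerating up to `max_mismatches` unpaired positions."""
    site = "".join(RNA_COMPLEMENT[base] for base in reversed(small_rna))
    for start in range(len(mrna) - len(site) + 1):
        window = mrna[start:start + len(site)]
        mismatches = sum(1 for x, y in zip(window, site) if x != y)
        if mismatches <= max_mismatches:
            return True
    return False


small_rna = "UCGGAUCC"  # its perfect pairing site is GGAUCCGA

mrnas = {
    "mRNA-1": "AAAGGAUCCGAAAA",  # perfect site
    "mRNA-2": "AAAGGAUCGGAAAA",  # site with one mismatch
    "mRNA-3": "AAAGCAUCCUAAAA",  # site with two mismatches
    "mRNA-4": "AAAAAAAAAAAAAA",  # no plausible site
}

strict = [name for name, seq in mrnas.items() if pairs_with(small_rna, seq, 0)]
loose = [name for name, seq in mrnas.items() if pairs_with(small_rna, seq, 2)]

print(strict)  # ['mRNA-1']
print(loose)   # ['mRNA-1', 'mRNA-2', 'mRNA-3']
```

Demanding a perfect match singles out one mRNA. Relaxing the requirement by two positions brings three of the four within reach.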
There are at least several hundred micro-RNAs in the human genome, each of which might in this way regulate the activity of hundreds of genes. All together, micro-RNAs, siRNAs, and other classes of small RNAs not discussed here "have the potential to regulate the expression of almost all human genes" (Siomi and Siomi 2009, p. 403). They can serve to activate as well as repress gene activity, and some of them are associated with cancer, while others seem to help prevent it. In the opinion of Whitehead Institute molecular biologist David Bartel, "It's going to be very difficult to find a developmental process or disease that isn't influenced by micro-RNAs" (quoted in Pollack 2008, p. D3).
If we were to look a little further, we would find that not only do small RNAs regulate gene expression, but they in turn are regulated by yet further systems of "control". For example, proteins can block the formation of small RNAs from their precursors, or else be required as assistants in this formation. It can even happen that, through a kind of mimicry, an mRNA "fools" a RISC into binding to it, but because of the way the mRNA differs from the normal target mRNA, the RISC cannot disable it. In this way the mRNA takes the micro-RNA out of action, resulting in elevated expression of the actual target mRNA.
The idea of target mimicry introduces unanticipated complexity into the network of RNA-regulatory interactions and raises the possibility that a large number of mRNA-like non-coding RNAs recently identified in humans could be attenuators of the regulation [by small-RNA-protein complexes]. (Siomi and Siomi 2009, p. 403)
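The bookkeeping behind target mimicry can be sketched with a deliberately crude count in Python. The function name and the numbers are hypothetical, and the model rests on loud assumptions: each RISC binds exactly one transcript, decoys are bound first and never released, and every bound target is silenced. Nothing in the cell is this tidy; the sketch only shows the direction of the effect.

```python
def translated_targets(target_mrnas: int, risc_complexes: int, decoy_mrnas: int) -> int:
    """Count target mRNAs left free for translation in a toy one-to-one binding model.

    Assumptions (illustrative only): one RISC per transcript, decoys are
    bound before targets, and a decoy-bound RISC stays out of action.
    """
    sequestered = min(risc_complexes, decoy_mrnas)   # RISCs "fooled" by decoys
    free_risc = risc_complexes - sequestered         # RISCs still able to silence
    silenced = min(target_mrnas, free_risc)
    return target_mrnas - silenced


print(translated_targets(100, 80, 0))    # 20: no decoys, most targets silenced
print(translated_targets(100, 80, 50))   # 70: decoys tie up 50 RISCs
print(translated_targets(100, 80, 200))  # 100: every RISC is taken out of action
```

As the decoy count rises, expression of the actual target rises with it, even though nothing about the target or the micro-RNA has changed.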

Intersecting "networks of regulation" is how this sort of thing is commonly described. One might begin to suspect that, one way or another, almost everything is involved in the regulation of almost everything else — not a very useful observation, perhaps, except so far as it lends pointedness and poignancy to the question, If everything is doing the regulating, what is left to be regulated? Or, if there is no clear distinction between regulator and regulated, maybe we're just not using the right language at all.

Transcription factors, RNA editing, and much more. Even before researchers shifted their attention to the epigenome over the past decade, certain well-established findings were powerfully nudging them toward a less linear-logical, more contextual understanding of the gene. The simplistic early schema — DNA > RNA > protein — has been under the stress of ramifying complications for a long while.
To begin with, there was not only the curious fact that the supremacy of the logically neat gene required a substantial part of the genome to be dismissed as junk; a good part of the real estate within protein-coding genes also had to be dismissed. That is, the cell as a whole does a great deal of picking and choosing when it comes to deciding what really constitutes a gene. The parts of the traditionally defined gene that survive this process are called "exons", while the segments cast aside are "introns".
The separation of the exon sheep from the intron goats occurs only after the gene is transcribed into an initial form of mRNA known as "precursor mRNA". Through a splicing process influenced by complex signaling within the cell, the introns within this precursor are culled, and the remaining sections are knitted together.
But none of this is cut-and-dried. The same precursor mRNA can undergo different splicing patterns ("alternative splicing"), so that particular protein-coding regions of DNA produce, by one estimate, an average of 5.7 different final transcripts (Zimmer 2008, p. D5). At least 86% of human genes, it is thought, are subject to alternative splicing (Muers 2008). An extreme case is a gene active in the inner ear of chickens (with an assumed analog in humans): it has 576 alternatively spliced variants. These variants
code for a protein that has a role in determining the sound frequency to which inner ear cells respond, and the variations in the protein sequence parallel variations in the frequencies to which different cells respond. It seems that having so many versions of the protein enables the chicken to tune its cells and distinguish between the sounds it hears. (Jablonka and Lamb 2005, p. 67)
There's an awful lot of significant management going on here, and it's not all being orchestrated by genes.
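How one precursor can yield several final transcripts is easy to see in a small enumeration. The Python sketch below is a simplification under stated assumptions: the segment names are hypothetical, and it allows only one kind of choice (an "optional" exon that is either kept or skipped), whereas real splicing patterns are more varied and are governed by signaling the code does not model.

```python
from itertools import product

# Hypothetical precursor mRNA as an ordered list of (segment name, kind).
PRECURSOR = [
    ("E1", "exon"), ("I1", "intron"),
    ("E2", "optional_exon"), ("I2", "intron"),
    ("E3", "optional_exon"), ("I3", "intron"),
    ("E4", "exon"),
]


def splice_variants(precursor):
    """List every transcript obtainable by dropping introns and keeping or skipping optional exons."""
    optional = [name for name, kind in precursor if kind == "optional_exon"]
    variants = []
    for choices in product([True, False], repeat=len(optional)):
        keep = dict(zip(optional, choices))
        transcript = [
            name for name, kind in precursor
            if kind == "exon" or (kind == "optional_exon" and keep[name])
        ]
        variants.append("-".join(transcript))
    return variants


for variant in splice_variants(PRECURSOR):
    print(variant)
# E1-E2-E3-E4
# E1-E2-E4
# E1-E3-E4
# E1-E4
```

Two optional exons already give four transcripts from one stretch of DNA, and each further independent choice doubles the count. Which variant actually appears is not written in the list itself; in the sketch, as in the cell, that decision comes from outside the sequence.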
Even more contrary to expectation, some of the exons composing the final mRNA may come from other genes and even other chromosomes. More radically still, entirely different RNA transcripts from different parts of the genome are sometimes spliced together ("trans-splicing"). And, quite apart from the various types of splicing, there is mRNA "editing" whereby specific letters of the code are removed and replaced with different letters not corresponding to the original DNA sequence. Both the editing and splicing suffered by particular gene transcripts may systematically differ in different types of cell, despite the identical DNA sequences in those cells.
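The point about editing can be illustrated with a few lines of Python. This is a sketch under assumptions: the transcript, the cell-type names, and the edit table are all invented, and an edit is modeled simply as swapping one letter at a known position.

```python
def edit_transcript(mrna: str, edits: dict) -> str:
    """Apply single-letter replacements given as {position: (expected, replacement)}."""
    bases = list(mrna)
    for position, (expected, replacement) in edits.items():
        if bases[position] != expected:
            raise ValueError(f"expected {expected} at position {position}")
        bases[position] = replacement
    return "".join(bases)


TRANSCRIPT = "AUGCAGGCA"  # hypothetical transcript, identical in every cell

# Hypothetical editing patterns that differ by cell type.
EDITS_BY_CELL_TYPE = {
    "cell_type_a": {3: ("C", "U")},
    "cell_type_b": {},
}

for cell_type, edits in EDITS_BY_CELL_TYPE.items():
    print(cell_type, edit_transcript(TRANSCRIPT, edits))
# cell_type_a AUGUAGGCA
# cell_type_b AUGCAGGCA
```

The same starting sequence ends up different in the two hypothetical cell types, and the edited version no longer corresponds letter-for-letter to the DNA from which it was transcribed.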
Nor is that the end of it. Once the splicing and editing are completed, the same mature mRNA can be translated into many different proteins; the same protein can go through countless modifications by being cut up or having any number of chemical groups added to it; and the resulting proteins, whatever they may be, can fold in various ways, which radically alters their function. This folding in turn can be influenced by, among other things, the character of nearby molecules. In other words, the protein end result — or, rather, the vast range of possible end results — of a particular DNA sequence can hardly be thought of as determined by a single cause, genetic or otherwise. Given the endlessly interwoven processes at work, there is no possible way to say less than this: the cell as a whole has the final say about what a gene means.
Coming back, finally, to the DNA that was supposed to be masterminding the entire show: near many genes (or sometimes remote from them) there are various "control" regions that help to regulate the expression of the gene. Of course, something, or many things, must participate in the regulating. It turns out that some 2600 proteins in the human body can, by virtue of their form, bind themselves to DNA — often at regulation sites. Some of these sites, called "promoters", are close to the regulated gene, and the proteins binding to them are called "transcription factors". Depending on the protein, its presence may either encourage or discourage gene transcription. Other proteins — repressors and activators — may help recruit transcription factors, sometimes by acting at points along the chromosome considerably distant from the affected gene, and these repressors and activators are affected in turn by co-repressors and co-activators. These complexes can intensify or diminish the role of any particular transcription factor. In a seemingly boundless tapestry of shifting patterns, many proteins act in concert, so that their effect upon DNA is a subtle integral of their separate "causal" potentials.
In all these processes, DNA itself, of course, plays its crucial role. The point is only that there's no one point-of-origin and no causal chain of command, however circuitous, that by itself provides, or could even conceivably provide, an adequate and understandable picture of what is going on. Understanding, as we will see later, requires something more than logical-causal thinking.

Overcoming fragmentation

To itemize distinct "mechanisms" in the way I have just now done is to encourage exactly the sort of isolating perspective that needs to be overcome. None of these factors and influences can be cleanly separated from the others. According to Aaron Goldberg and his colleagues at Rockefeller University's Laboratory of Chromatin Biology, "It is becoming clear that significant crosstalk exists between different epigenetic pathways". For example, small RNAs "often act in concert with various components of the cell's chromatin and DNA methylation machinery to achieve stable silencing". There's much to sort out, they say, but "the emerging dialectic of epigenetics, including the marks, writers, presenters, readers, and erasers, promises to be a rich conversation" (Goldberg et al. 2007, pp. 637-8; never mind the authors' strange juxtaposition of "machinery" and "conversation").
In a similar vein, geneticist Shelley Berger speaks at some length about the methylation of a particular histone. Originally the mark was simply thought to have positive effects on transcription. But ongoing research has revealed a dizzying array of outward-rippling interactions between this methylated site and various other activators, repressors, co-repressors, and so on. "How", she asks, "can the binding of so many complexes to one [type of methylated histone site] be explained?" Compelled toward rather nontechnical language to capture the situation, she says "it may be that there is an intricate 'dance' of associations, with these changing places over time". There is a kind of rhythm between positive- and negative-acting complexes, where "the entire chromatin context [of the methylated histone] would dictate the overall outcome".
Thus a useful analogy may be that the modifications [of chromatin] constitute a nuanced language, in which the individual marks (the "words") become meaningful only once they are assembled and viewed within their unit array, such as a transcription unit (a "sentence"). To put it simply, the genomic and regulatory context must be considered for the biological meaning to be understood. (Berger 2007, p. 409)

But, as we have seen, the regulatory context seems to extend outward without limit. Nothing less than the dynamics of cell, whole organism, and environment can make sense of any particular tract of DNA — can interpret it and turn it into a fitting expression of its larger context. The genome, perhaps we could say, is not so much an instruction manual as a dictionary of words and phrases together with a set of grammatical constraints. And then, from conception through maturity, the developing organism continually plays over this dictionary epigenetically, constructing the story of its destiny from the available textual (genetic) resources.

Next: "The Riddle of Dynamic Form in the Organism"

Notes
1. Actually, while this is the usual way of stating the matter, there are many cases throughout the animal kingdom where chromosomes are not the same in every cell. The human immune system provides one striking example:
During the maturation of lymphocytes (the white blood cells that produce the antibodies needed to fight infection and destroy foreign cells), DNA sequences in the antibody genes are moved from one place to another, and are cut, joined, and altered in various ways to produce new DNA sequences. Because there are so many different ways of joining and altering the bits of DNA, vast numbers of different sequences, each coding for a different antibody, are generated. Consequently, the DNA of one lymphocyte is different from that of most other lymphocytes, as well as from that of other cells in the body. (Jablonka and Lamb 2005, p. 68)
For many other examples, see pp. 68-70 of the cited work. However, the fact that so many different tissues and organs do have the same DNA still raises the question discussed in the main text.
2. Early stages of this slate-cleaning and management of methylation have already begun in the undeveloped egg cells present in the gonads of the female embryo.

References

Anway, Matthew D., Charles Leathers, and Michael K. Skinner (2006). "Endocrine Disruptor Vinclozolin Induced Epigenetic Transgenerational Adult-Onset Disease", Endocrinology vol. 147, no. 12, pp. 5515-23. Available online at http://endo.endojournals.org.
Berger, Shelley L. (2007). "The Complex Language of Chromatin Regulation During Transcription", Nature vol. 447 (May 24), pp. 407-12.
Brown, Valerie (2008). "Environment Becomes Heredity" (July 14), available online at http://www.miller-mccune.com/article/environment-becomes-heredity.
Costa, Fabricio F. (2008). "Non-coding RNAs, Epigenetics and Complexity", Gene vol. 410, pp. 9-17.
Crews, David, Andrea C. Gore, Timothy S. Hsu, et al. (2007). "Transgenerational Epigenetic Imprints on Mate Preference", PNAS vol. 104 (Apr. 3), pp. 5942-6. Available online at http://www.pnas.org/content/104/14/5942.
Delbrück, M. (1971). "Aristotle-totle-totle", in Of Microbes and Life, edited by Jacques Monod and Ernest Borek. New York: Columbia University Press, pp. 50-5.
Dolinoy, Dana C., Dale Huang, and Randy L. Jirtle (2007). "Maternal Nutrient Supplementation Counteracts Bisphenol A-induced DNA Hypomethylation in Early Development", PNAS vol. 104, no. 32 (Aug. 7), pp. 13056-61.
Feil, R. (2008). "Epigenetics, an Emerging Discipline with Broad Implications", C. R. Biologies, doi:10.1016/j.crvi.2008.07.027.
Jablonka, Eva and Marion J. Lamb (2005). Evolution in Four Dimensions: Genetic, Epigenetic, Behavioral, and Symbolic Variation in the History of Life. Cambridge MA: MIT Press.
Kolata, Gina (2009). "Picture Emerging on Genetic Risks of IVF", New York Times (Feb. 17). Available online at http://www.nytimes.com/2009/02/17/health/17ivf.html.
Moazed, Danesh (2009). "Small RNAs in Transcriptional Gene Silencing and Genome Defence", Nature vol. 457 (Jan. 22), pp. 413-20.
Pollack, Andrew (2008). "The Promise and Power of RNA", New York Times (Nov. 11), pp. D1, D3.
Poulter, M., L. Du, I. Weaver, et al. (2009). "GABAA Receptor Promoter Hypermethylation in Suicide Brain: Implications for the Involvement of Epigenetic Processes", Biological Psychiatry vol. 64, no. 8, pp. 645-52.
Rassoulzadegan, Minoo, Valérie Grandjean, Pierre Gounon, et al. (2006). "RNA-mediated Non-Mendelian Inheritance of an Epigenetic Change in the Mouse", Nature vol. 441 (May 25), pp. 469-74.
Seligson, David B., Steve Horvath, Tao Shi, et al. (2005). "Global Histone Modification Patterns Predict Risk of Prostate Cancer Recurrence", Nature vol. 435 (June 30), pp. 1262-6.
Semple, Colin A. M. and Martin S. Taylor (2009). "The Structure of Change", Science vol. 323 (Jan. 16), pp. 347-8.
Siomi, Haruhiko and Mikiko C. Siomi (2009). "On the Road to Reading the RNA-interference Code", Nature vol. 457 (Jan. 22), pp. 396-404.
Technau, Ulrich (2008). "Small Regulatory RNAs Pitch In", Nature vol. 455 (Oct. 30), pp. 1184-5.
Weaver, I. C., et al. (2004). "Epigenetic Programming by Maternal Behavior", Nature Neuroscience vol. 7, pp. 847-54.
Zimmer, Carl (2008). "Now: The Rest of the Genome", New York Times (Nov. 11), pp. D1, D5.

ABOUT THIS NEWSLETTER

NetFuture, a freely distributed electronic newsletter, is published and copyrighted by The Nature Institute. The editor is Steve Talbott, author of Devices of the Soul: Battling for Our Selves in the Age of Machines (http://natureinstitute.org/txt/st). You may redistribute this newsletter for noncommercial purposes. You may also redistribute individual articles in their entirety, provided the NetFuture URL and this paragraph are attached.
NetFuture is supported by freely given reader contributions, and could not survive without them. For details and special offers, see http://netfuture.org/support.html.
Current and past issues of NetFuture are available on the Web:
http://netfuture.org

To subscribe or unsubscribe, go to
http://netfuture.org/subscribe.html.
