The blueprint for the construction, development and maintenance of our bodies is encoded in the DNA of our cells, nearly all of which, remarkably, contain the same units of information, or genes. In the simplest terms, a gene is a template for building a protein that has a specific function in our bodies. For example, traits such as eye color are determined by genes. In the case of brown eyes, genes provide templates for proteins that produce a dark pigment, which is deposited in the iris of the eye. The complete set of DNA in our cells, including all of our genes, is known as the genome, and the combined actions of all the proteins encoded in our genome, along with environmental factors, determine every aspect of our bodies from our sex to our hair color.
Understanding the relationship between the genetic code and the human body has been a priority for biologists and geneticists for many decades. Their approach has been to determine the sequence of base-pair units in the DNA molecules and then decode this sequence to identify the underlying genes. This is a daunting task: the DNA molecules in a single human cell measure approximately 6.5 feet when stretched out and laid end to end. However, recent technological advances have brought an enormous increase in our capacity to sequence DNA. It is now possible to sequence entire genomes in days, and computer technologies have advanced to a point where massive data sets can be readily analyzed. More importantly, the cost of sequencing an individual's entire genome has decreased dramatically. These advances have opened up a new era in genomics with great promise for understanding the basis of genetics in general and genetic-based diseases in particular.
A major question in genetics that has intrigued biologists for decades has now been answered by these developments: how many genes are present in the human genome or, put another way, how many genes are needed to encode all the proteins in our bodies? Scientists had originally assumed that there must be well over 100,000 genes to account for the complexity of modern humans, but the recent sequencing of the human genome has revealed only about 21,000 classical genes, a surprisingly small number. Moreover, given the amount of DNA in each cell, there are apparently huge stretches of DNA that don't encode genes at all. These intervening sequences have been referred to as junk DNA, and it has been speculated that "junk" builds up through the inefficient process of evolution. For example, genes can inadvertently become duplicated, resulting in an extra inactivated copy that simply hangs around doing nothing.
However, it is now becoming apparent that much of this "junk" DNA is not junk after all. In the last few weeks, 30 research papers have been published stemming from a major research project called ENCODE (Encyclopedia of DNA Elements). The ENCODE program was initiated by the National Human Genome Research Institute in Bethesda, Md., in 2003 and comprises 32 research teams and 440 scientists from around the world. Each research team analyzed billions of DNA base pairs in several different cell types, and together they carefully cataloged 15 trillion bytes of raw data. Their data now reveal that many sequences in the DNA that are not classical genes are nevertheless transcribed into products that play major regulatory roles, for example by influencing and controlling the production of proteins encoded by classical genes. It turns out that there are about 4 million so-called gene switches: sites in the DNA where regulatory proteins called transcription factors bind to control when our genes turn on and off and how much protein they make. In fact, approximately three quarters of our DNA is now thought to be actively involved in building and maintaining our bodies. These findings change our view of the genome from one that is centered on specific genes to one that is much more holistic. The new map of genetic switches that has been developed by ENCODE will now allow scientists to better understand gene regulation and, it is hoped, develop new approaches for treating diseases in which that regulation has gone awry. These include common chronic diseases, such as diabetes and heart disease, which result from the disruption of multiple factors, not just single genes.
Thomas Edison once said, "To invent, you need a good imagination and a pile of junk." While the human species may have started out that way, it's heartening to know that we no longer carry around a lot of "junk" after all. Evolution has turned out to be remarkably efficient and an exemplary recycler.
David L. "Woody" Woodland, Ph.D., is the chief scientific officer of Silverthorne-based Keystone Symposia on Molecular and Cellular Biology, a nonprofit dedicated to accelerating life science discovery by convening internationally renowned research conferences in Summit County and worldwide. Woody can be reached at (970) 262-1230 ext. 131 or firstname.lastname@example.org.