Scientists ponder sequence of genes
Eric Lander was riding in a taxi during the week in February when government and private scientists published a nearly complete sequence of human genes. Not knowing that Lander, of the Whitehead Institute in Cambridge, played a major role in that effort, the driver explained that the first map of all our genes – the blueprint of life – was now a reality. Lander remembers thinking, “Then why am I going back to work?”
Contrary to what many people believe, genes have not been “mapped.” No one knows for sure how many genes humans possess, or what they do exactly. What scientists have is a sequence of 3 billion pairs of letters, the equivalent of 75,500 pages of The New York Times filled with combinations of four letters, A, T, C, and G, which stand for the chemical bases of which all genes are made. Thousands of these combinations strung together make up one gene. But where does a gene start and where does it end?
Lander compares the present state of the human genome to a list of parts for a Boeing 777. “Having a list of 100,000 parts doesn’t tell you how to put it together or how it flies,” he says.
Last week, some of the best minds that wrestle with the problem of making sense of the human genome met at Harvard University to find some answers to the question, “What’s next?” Attended by about 1,000 people, the “Genomics 2001” symposium was organized by the University’s Department of Molecular and Cellular Biology and its Center for Genomic Research (CGR), and sponsored by NetGenics, a Cleveland genetics company.
CGR director Andrew Murray, Lander, and the other 15 speakers kept referring to the present human genome sequence as a “working draft.” About 5 percent of the sequence has yet to be filled in. That comes to about 150 million of those letter pairs. Several speakers said that the sequence should be complete in less than two years, or before the end of 2002.
Missing genes
Much has been made of the so-called missing genes. Six months ago most scientists thought that humans boasted about 100,000 genes in their chromosomes. Some estimates ran as high as 120,000, even 140,000. But the actual number came as a big surprise: between 30,000 and 40,000. Lander guesses 32,000-33,000.
It turns out that 100,000 was just an educated guess to begin with. Walter Gilbert, Carl M. Loeb University Professor, a gene-sequencing pioneer and Nobel laureate, made a rough calculation about 15 years ago and came up with that number. “It was just a back-of-the-envelope calculation,” Lander says. “But it made its way into the textbooks. Now I have 10 years of students to write to.”
At one research center in Cold Spring Harbor, N.Y., scientists run an open betting pool on the final number of genes that will be identified. The cost of placing a bet is steadily getting higher as knowledge continues to increase.
Not all of the 3 billion letter-pairs that make up our genetic heritage spell out useful words. There are lots of blank spaces in the book of life. The educated guess is that only 1 to 1 1/2 percent of the base pairs are parts of working genes. The rest is, to use the scientific term, “junk.”
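To put that educated guess in absolute terms, here is a minimal sketch that converts a share of 1 to 1 1/2 percent into letter-pairs. It uses only the article's round figure of 3 billion; the function name `share_of_genome` is made up for illustration.

```python
from fractions import Fraction

GENOME_BASE_PAIRS = 3_000_000_000  # total letter-pairs cited in the article


def share_of_genome(percent):
    """Return how many base pairs a given percentage of the genome represents."""
    return int(GENOME_BASE_PAIRS * Fraction(percent) / 100)


low = share_of_genome(Fraction(1))      # 1 percent
high = share_of_genome(Fraction(3, 2))  # 1 1/2 percent
print(f"{low:,} to {high:,} base pairs fall inside working genes")
```

By this rough arithmetic, the working genes take up only about 30 million to 45 million of the 3 billion letter-pairs.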
Genes and junk are arranged along threadlike chromosomes curled up inside the core, or nucleus, of every cell. In chromosomes, long gaps exist without any genes at all. In other locations, genes are packed together like pearls on a necklace. Geneticists give this structure the lofty name “lumpy.” No one knows why the human genome is so lumpy, or carries so many useless letters.
Researchers aren’t finding much variation in human genes despite the great diversity of people who populate our planet. Modern humans are thought to have spread from Africa to Europe and Asia about 100,000 years ago. “That’s only about 5,000 generations,” notes Lander, “the blink of an eye in 3 billion years of evolution. Humans have much the same genetic makeup as they did when they walked out of Africa.”
There are variations, of course. These variations account for such things as an individual’s risk for heart problems and Alzheimer’s disease, resistance to the AIDS virus, and whether a person will be born with type 1 diabetes, multiple sclerosis, or any of a long list of other genetic maladies.
Several speakers at the symposium mentioned a goal of putting together a catalog, or table, matching variations in human genes with the afflictions they underlie. Once such precise information becomes available, scientists would know what targets they must hit with drugs and other treatments.
That kind of knowledge, notes Chris Sander of the Massachusetts Institute of Technology, “would have a profound impact on the evolution of life on Earth.”
Getting down to the chips
Much of the symposium was devoted to techniques for reading the book of life and applying that knowledge to cure diseases. Letter combinations in a gene make up a code that specifies what kind of protein a gene will produce when it’s turned on. Proteins catalyze the activity of life. They turn genes on and off, protect against sickness, send thoughts through our brain and energy through our muscles. Abnormal proteins lead to everything from brain degeneration to bowel inflammation.
But there’s not a one-to-one correspondence between genes and proteins. Variations in the code of each gene can produce two, three, or four different proteins. Three variants in the spelling of one protein, for example, account for much of the risk for Alzheimer’s disease. However, not all protein variants are harmful.
“There are probably between 100,000 and 300,000 different proteins,” Chris Sander estimates, “so we have a lot of work ahead of us.”
Fifteen years ago, most scientists never dreamed of sequencing all the genes of humans, mice, fish, and bacteria. Legions of graduate students and technicians worked manually to unravel the cloth of life thread by thread. Today, most of these workers have been replaced by computers, robots, and other technologies. “Working together, laboratories worldwide can now sequence a total of 1,000 letters in one second,” Lander says. “It’s been great to live through a time like this.”
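For a sense of scale, here is a back-of-the-envelope sketch of what 1,000 letters a second means for a genome of 3 billion letters. It assumes a single pass at a constant rate, which is a simplification: in practice each stretch is read several times, so the figure is a floor rather than a forecast. The function name `days_to_read` is hypothetical.

```python
GENOME_LETTERS = 3_000_000_000  # size of the human sequence, per the article
LETTERS_PER_SECOND = 1_000      # combined worldwide rate quoted by Lander
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_read(letters, rate_per_second):
    """Days needed to read `letters` once at a constant rate (single pass assumed)."""
    seconds = letters / rate_per_second
    return seconds / SECONDS_PER_DAY


days = days_to_read(GENOME_LETTERS, LETTERS_PER_SECOND)
print(f"About {days:.1f} days for one pass through the sequence")
```

At that pace a single read-through of the sequence would take roughly five weeks.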
Symposium attendees expressed little doubt that technology will enable them to take the next giant step, cataloging all the genes and proteins that make us what we are. Several speakers described glass chips the size of a dime that can tease gene identities out of the 3 billion-letter sequence. Similar protein chips can reveal the identities of the proteins those genes produce.
Sander described an effort under way to tie together laboratories all over the globe in a project that would determine the structure of all proteins. “I think it can be done in five years, certainly less than 10,” he says.
Gene chips are expected to be able to find the variants, or mutations, responsible for various cancers. One chip in the testing stage has correctly identified mutations in the BRCA1 gene, which are the major cause of hereditary breast cancer. In the near future, chips should be able to detect which genes are turned on when an individual suffers from breast, prostate, lung, or any other cancer. That’s a major step toward both early diagnosis and design of new drugs to snuff out these malignancies.
Stephen Chanock of the National Cancer Institute described a project to catalog all genes and variants involved in cancers. There are 850 suspects on the list so far. “We hope to eventually describe the susceptibility of every individual to cancer,” he told the audience. “All the necessary information will be made available free of charge on the Internet.”
Several speakers told the audience about projects to sequence the genes of other animals and plants, particularly those involved in causing diseases. Joseph DeRisi of the University of California, San Francisco, for example, described an effort to find all the genes in the major parasite responsible for malaria. That bug, known as Plasmodium falciparum, kills about 3 million people a year, most of them children in Africa, and sickens another 300 million or more. No effective vaccine exists, and P. falciparum continually develops resistance to drugs designed to defeat it.
For researchers to get the genetic material they need, the bug has to be grown in human blood. “We’re always looking for new graduate and postdoctoral students to ‘work’ on this project,” DeRisi quipped. “It will take another year to finish it, and we hope the information will lead to the design of a final drug or vaccine solution.”
Everyone at the meeting got the point that a tremendous amount of work lies ahead before medical researchers get to pick fruit from the gene-sequence tree. But you also could detect a lot of optimism.
Richard Durbin of the Sanger Centre in England caught that optimism in a cartoon. It shows two men seated before a mountain of jigsaw puzzle pieces. One of them holds up a piece and cheerfully exclaims, “I think I found a corner piece.”